Essays on Online Advertising Markets
by
Shijie Lu
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(BUSINESS ADMINISTRATION)
May 2015
Dissertation Committee:
Sha Yang (Chair)
Anthony Dukes
Matthew Shum
Botao Yang
Abstract
My dissertation examines novel interactions between consumers and advertisers enabled by
Internet platforms with new targeting technologies in online advertising markets. As paid-search
and display have become the two most prevalent forms of online advertising, this dissertation empirically
investigates the consumer and advertiser interactions in these two online advertising markets.
In my first essay, I examine the determinants of competition and its impact on
click-volume and cost-per-click in paid-search advertising. I regard each keyword as a market and
measure the competition by the number of ads on the paid-search listings. I build an integrative
model of the number of entrant advertisers and the realized click-volume and cost-per-click of each
entrant. The proposed model is applied to data of keywords associated with digital camera/video
and accessories. Results indicate that the number of competing ads has a significant impact on
baseline click-volume, decay factor, and value-per-click. These findings help search advertisers
assess the impact of competition on their entry decisions and advertising profitability. The
proposed framework can also provide profit implications to the search host regarding two policies:
raising the decay factor by encouraging consumers to engage in more in-depth search/click-
through, and providing coupons to advertisers.
As Internet advertising infomediaries now provide rich competition-related information,
search advertisers are becoming more strategic in their keyword decisions. In the second essay, I
explore whether positive or negative spillover effects occur in advertisers’ keyword entry
decisions, which lead to assimilation or differentiation in their keyword choices. I develop a
model of advertisers’ keyword decisions based on the incomplete-information and simultaneous-
move game with two novel extensions: (i) I allow the strategic interactions to vary with
advertisement positions to reflect consumers’ top-down search pattern; and (ii) I infer potential
entrants of a keyword by modeling the advertisers’ keyword consideration process to capture
their limited capacity in analyzing all existing keywords. Using a panel dataset of laptop-related
keywords mainly used by 28 manufacturers, retailers, and comparison websites that advertise on
Google, I find both assimilation and differentiation tendencies, which vary across firm types and
the expected ranking of competing firms. A counterfactual simulation suggests that the more
accurate competition information provided by infomediaries leads to a market-expansion effect.
Behavioral targeting, displaying personalized advertisements based on consumers’ past
online behaviors, has become a popular practice in the online advertising industry. Yet, empirical
research on behavioral targeting remains relatively nascent. The final essay studies the impact of
targeting level on three key players (users, advertisers, and the advertising host) in behaviorally
targeted display advertising. The targeting level is defined as an inverse scale of the number of a
consumer’s recently activated interests used in the user-ad match. Intuitively, a high targeting
level can increase the relevance of the served ads and consequently the users’ click-through rate,
but it might also evoke users’ negative reactions by triggering privacy concerns and/or
information satiation, lowering the click-through rate. Besides the mixed reaction from users,
advertisers may also respond to a high targeting level by either raising bids in anticipation of a
higher value-per-click, or lowering bids due to the reduced competition as the result of a high
targeting level. To understand the impact of the targeting level, I develop a model to
simultaneously capture users’ reactions to behaviorally targeted ads, the advertising host’s
decision on ad serving, and advertisers’ bid decisions. I apply the model to a novel dataset
obtained from a leading Internet advertising platform. Results suggest that although the
advertisers’ profits increase with the targeting level, both the consumers’ click-through rate and
the advertising host’s revenue have an inverted-U relationship with the targeting level.
Acknowledgements
First and foremost, I would like to express my deepest gratitude to my advisor Dr. Sha Yang,
whose guidance, support and encouragement have been invaluable throughout my five-year
study in the program. This dissertation would have been impossible without her advice and
inspirations. I am also grateful to my committee members Dr. Anthony Dukes, Dr. Matthew
Shum, and Dr. Botao Yang, who have patiently provided insightful comments and suggestions
on my dissertation.
I feel fortunate to have been studying at Marshall School of Business, where I learned a
lot and received generous help from the faculty members and fellow students. I would like to
especially thank Dina Mayzlin, Matthew Selove, Sivaramakrishnan Siddarth, Yi Zhu, Lan Luo,
and Gerard Tellis, for their kind support.
I am also grateful to my coauthor Xianghua Lu for her help on my first chapter. I owe
special thanks to my friends and colleagues at Alibaba, who helped me learn the institutional
background of my dissertation. I also appreciate the financial support that I received from
Marketing Science Institute on the first two chapters of my dissertation.
Finally, I would like to dedicate this dissertation to my family. I am truly indebted to my
parents, Shiyu Lu and Jufen Fan, for their unconditional love, care, and support during this
process. My gratitude to my beloved wife, Wenlin Liu, is beyond words. Without her
understanding and encouragement, I would not have started this Ph.D. in the first place, and
more importantly, would not have been able to reach the finish line.
Contents
Abstract
Acknowledgements
List of Tables
List of Figures
1. Modeling Competition and Its Impact on Paid-Search Advertising
1.1 Introduction
1.2 Relevant Literature
1.3 Data and Proposed Model
1.3.1 Description of the Data
1.3.2 Model Setup
1.3.3 Modeling Click-Volume
1.3.4 Modeling Cost-Per-Clicks
1.3.5 Modeling Number of Entrants
1.4 Estimation Results
1.4.1 Identification
1.4.2 Model Estimation
1.4.3 Empirical Findings
1.5 Managerial Implications
1.5.1 Implications for Paid-Search Advertisers
1.5.2 Implications for the Paid-Search Host
1.6 Conclusion
2. Assimilation or Differentiation? Investigating the Effect of Competition on Sponsored Search Advertisers’ Keyword Decisions
2.1 Introduction
2.2 Related Literature
2.3 Empirical Context
2.4 Model
2.4.1 Model Setup
2.4.2 Modeling Ad Positions
2.4.3 Modeling Average Click-Volume and Cost-Per-Click
2.4.4 Modeling Keyword Entry
2.4.5 Modeling Keyword Consideration
2.5 Estimation Strategy
2.5.1 An Overview of the Estimation Method
2.5.2 A Two-Step Estimation Approach
2.5.3 Identification
2.5.4 Simulation Studies
2.6 Results and Counterfactuals
2.6.1 Post-Entry Outcomes: Ad Position, Average Click-Volume, and Average CPC
2.6.2 Keyword Choice or Entry
2.6.3 Keyword Consideration
2.6.4 Counterfactual Simulations
2.7 Conclusion
3. A Two-Sided Market Analysis of Behaviorally Targeted Display Advertising
3.1 Introduction
3.2 Related Literature
3.3 Empirical Context
3.3.1 Data on Consumers’ Advertisement Responses
3.3.2 Data on Consumers’ Tracked Behaviors
3.3.3 Data on Advertisers’ Bids
3.4 Model
3.4.1 Modeling Consumers’ Click Decisions
3.4.2 Modeling the Advertising Host’s Rule of Ad Serving
3.4.3 Modeling Advertisers’ Bid Decisions
3.4.4 Estimation Strategy
3.5 Results
3.5.1 Consumers’ Click Behaviors
3.5.2 Ad-serving Algorithm: Interest Score and Quality Score
3.5.3 Advertisers’ Value-Per-Click
3.6 Counterfactual Experiments
3.7 Conclusion
Bibliography
Appendices
Appendices for Chapter One
Appendix A: Derivation of the CPC Equation and Proof of Equilibrium Properties
Appendix B: Expected Revenue Function of Search Advertisers
Appendix C: The MCMC Algorithm and Simulation Studies
Appendices for Chapter Two
Appendix A: Estimation Algorithms for Ad Position, Average Click-Volume and Average CPC
Appendix B. Estimation Algorithm for Keyword Entry and Consideration Models
Appendices for Chapter Three
Appendix A. Derivation of Advertisers’ Expected Profit Function
Appendix B. The MCMC Algorithm
List of Tables
Table 1. Summary Statistics of Keyword Characteristics
Table 2. Summary Statistics of Click-volume and CPC across Keywords
Table 3. Estimation Results of the Reduced-Form Model
Table 4. Variance Covariance Estimates from the Reduced-Form Model
Table 5. Estimation Results from the Proposed Structural Model
Table 6. Variance Covariance Estimates from the Proposed Structural Model
Table 7. Counterfactual Results of Adjusting Baseline Mean Decay Factor
Table 8. Counterfactual Results of Providing Entry Coupons
Table 9. Summary Statistics of Time-Invariant Keyword Attributes
Table 10. Summary Statistics of Time-Variant Keyword Attributes
Table 11. Simulation Results of Entry and Consideration Models
Table 12. Estimation Results of Position Model
Table 13. Estimation Results of Click-Volume and CPC Models
Table 14. Variance Covariance Estimates of Click-Volume and CPC Models
Table 15. Estimation Results of Keyword Entry Model
Table 16. Estimation Results of Keyword Consideration Model
Table 17. Counterfactual Results of Advertisers’ Keyword Choices and Search Engine Revenue
Table 18. Shares of Impressions, Categories, and Advertisements across Top-Line Categories
Table 19. Summary Statistics of Consumer and Advertisement Characteristics
Table 20. Estimation Results of the Click Model
Table 21. Unobserved Consumer Heterogeneity at the Consideration Stage
Table 22. Unobserved Consumer Heterogeneity at the Click Stage
Table 23. Estimation Results of the Models of Interest Score and Quality Score
Table 24. Variance-Covariance Estimates in the Models of Click, Interest Score and Quality Score
Table 25. Estimation Results of the Model of Value-Per-Click
Table 26. Counterfactual Results of Clicks, CPC, Profits, and Revenues
Table 27. Simulation Results of the Proposed Structural Model
List of Figures
Figure 1. Histogram of the Posterior Mean of Four Key Parameters across Keywords
Figure 2. The Relationship between Number of Entrants and Four Key Advertising Metrics in Three Keywords
Figure 3. Consumer Decision Process
Figure 4. Histogram of the Estimated Value-Per-Click
Figure 5. Histogram of the Ratio between Bid and Value-Per-Click
Figure 6. The Impact of Targeting Level in Online Advertising Market
Figure 7. A Demonstration of Expected Advertising Profit Function
1. Modeling Competition and Its Impact on Paid-Search Advertising
1.1 Introduction
Paid (or sponsored) search has grown rapidly during the last decade, driving the Internet to
become the second-largest medium for advertising spending in the US. This type of advertising
format not only provides the largest source of revenue to traditional search engines like Google,
but also has been extended to other business platforms such as online retailers (e.g., Amazon)
and online market makers (e.g., eBay, Priceline) serving as a host for paid-search advertising.
The paid-search host plays the role of directing users to relevant sponsored ads based on user-
generated queries. When an Internet user enters a query, she receives search results containing
both the organic links and paid links. If a user clicks on a paid link, she is directed to the
advertiser’s site, and the advertiser pays the search host a fee (i.e., cost-per-click or CPC) for
sending a potential customer.
As paid search becomes the mainstream platform for online advertising today,
competition further intensifies. In paid-search advertising, a keyword is often regarded as a
market reflecting a unique pattern of demand and supply. The demand captures the click-volume
of each ad on the sponsored search listings for that keyword, whereas the supply captures
advertisers’ decision of whether to enter the market (by purchasing the keyword) and how much
to bid for their ads. Intuitively, the number of entrants or competing ads appearing on the paid-
search listings for a given keyword affects consumer search and buying behavior, which will in
turn influence advertisers’ expected click-volume, value-per-click, and CPC for entering such a
market. At the same time, the number of entrants is related to advertisers’ entry probability
which is often determined by the expected profit from paid-search advertising.
The main objective of this research is two-fold. First, we want to understand, in the
context of paid-search advertising, the effects of competition (number of competing ads) on three
key latent constructs that determine click-volume and CPCs of paid-search ads: baseline click-
volume, decay factor, and value-per-click, where the decay factor can be interpreted as the
conditional probability for consumers to click on the next ad on the paid-search listings. Second,
we are interested in understanding the determinants of competition, that is, how various demand
and supply factors affect the entry probability of firms and consequently the total number of
entrants for a keyword.
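
To make the decay-factor interpretation concrete, the following minimal sketch computes expected click-volume by ad position. It assumes, purely for illustration, that the decay factor is constant across positions, so that clicks fall geometrically down the listing; the function name and the numbers are hypothetical and are not taken from the data.

```python
def click_volumes(baseline, decay, n_ads):
    """Expected clicks for positions 1..n_ads.

    Assumption (illustrative): the ad in position 1 receives `baseline`
    clicks, and each further position keeps a share `decay` of the clicks
    of the position directly above it.
    """
    if not 0.0 <= decay <= 1.0:
        raise ValueError("decay must lie in [0, 1]")
    return [baseline * decay ** k for k in range(n_ads)]


# Four ads, 100 clicks on the top ad, half of the clicks carried to each next ad.
print(click_volumes(baseline=100.0, decay=0.5, n_ads=4))  # [100.0, 50.0, 25.0, 12.5]
```

Under this assumption, a higher decay factor raises clicks at every position below the first, which is why it matters for lower-ranked entrants.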
The understanding of competition and its impact is important to both advertisers and the
paid-search host. On the one hand, it helps advertisers more precisely evaluate the impact of
competition on their expected profit, and thus make better decisions on keyword choices.
Furthermore, studying the separate effects of competition on the three key constructs can help
improve paid-search advertisers’ bidding effectiveness in the Generalized Second-Price (GSP)
auction. For example, advertisers should adjust their bids based on the number of competitors if
competition affects click-volume decay factor and/or value-per-click. This is because these two
parameters play a major role in determining the equilibrium CPCs in the GSP auction (Edelman,
Ostrovsky and Schwarz 2007, referred to as EOS hereafter). On the other hand, this analysis
helps the paid-search host better understand how competition affects its own profit. If the paid-
search host does benefit from competition, what policy changes can be made to improve its profit
by influencing the competition? Our research provides a general framework to help the search
host address this important question.
Utilizing a unique dataset with full information on competition, we propose in this paper
a structural framework to characterize competition and analyze its impact on click-volume and
CPCs. Specifically, our integrative modeling framework has three major components: i) We
model the realized click-volume of each entrant as a function of the baseline click-volume and
the decay factor, ii) We model the vector of realized CPCs of those entrants as a function of the
decay factor and the order statistics of the value-per-clicks at an equilibrium condition of the
GSP auction, and iii) We model the number of entrants as the product of the number of
potential entrants and the entry probability, and the entry probability is determined by the
expected revenue (a function of expected click-volume, CPC, and value-per-click) and the entry
cost at the equilibrium condition of an incomplete information game.
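
The third component can be illustrated with a small sketch of a symmetric entry equilibrium. The sketch is not the specification estimated in this chapter: it assumes a logistic private shock, a hypothetical expected-revenue function `revenue_fn` that does not increase with the expected number of rivals, and moderate payoff magnitudes (so that `math.exp` does not overflow). Under these assumptions the entry probability is the unique fixed point of a decreasing map and can be found by bisection.

```python
import math


def entry_probability(n_potential, revenue_fn, entry_cost, tol=1e-10):
    """Solve p = logistic(revenue_fn(expected rivals) - entry_cost) on [0, 1].

    Assumption (illustrative): revenue_fn is non-increasing in the expected
    number of rivals, so logistic(...) - p is strictly decreasing in p and
    crosses zero exactly once.
    """
    def gap(p):
        expected_rivals = (n_potential - 1) * p
        payoff = revenue_fn(expected_rivals) - entry_cost
        return 1.0 / (1.0 + math.exp(-payoff)) - p

    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid  # entry is still too attractive: the root lies above mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical keyword: 21 potential entrants, revenue falling with rivals.
p = entry_probability(21, lambda rivals: 2.0 - 0.1 * rivals, entry_cost=1.0)
expected_entrants = 21 * p  # number of potential entrants times entry probability
print(round(p, 6), round(expected_entrants, 6))
```

In this example the fixed point is approximately 0.5, so the expected number of entrants is approximately 10.5.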
We make several modeling contributions. Unlike most previous studies where click-
volume and CPC are modeled in a reduced-form fashion, we structurally model these two
variables to allow inferences of the key underlying structural parameters. We also structurally
model the number of competing ads as a reflection of both demand and supply conditions. The
proposed modeling framework entails several econometric challenges. Specifically, order
statistics stemming from the unobserved value-per-clicks induce cross-position correlation in the
CPCs of those entrants. Furthermore, the occurrence of the decay factor in both models of
click-volume and CPCs requires a joint estimation. Finally, the number of entrants takes an implicit
functional form, requiring numerical calculation of the Jacobian to simulate the likelihood
function. To cope with these challenges, we develop a Bayesian estimation approach to make
model inferences.
Several key findings emerge from our analysis: i) Number of entrants (ads) positively
affects the baseline click-volume; ii) Number of entrants has an inverted-U relationship with the
mean decay factor; iii) Number of entrants has a negative and convex relationship with the mean
value-per-click of a keyword; iv) Competition generally hurts advertisers but benefits the paid-
search host.
Our structural analysis provides the paid-search host with some guidelines to improve its
profitability. We conduct two counterfactual analyses as a demonstration. First, we show that the
paid-search host could raise the decay factor by encouraging users to engage in more in-depth
search/click-through on paid-search listings, and such a policy change could help the paid-search
host increase profit. Second, our analysis can also help the search host determine the optimal
face-value of entry coupons distributed to advertisers in order to increase the search host’s profit.
The rest of the paper proceeds as follows. Section 1.2 reviews the relevant literature and
positions our study in relation to previous studies. Section 1.3 describes the data and background
information and develops the model. Section 1.4 provides an empirical application where we
apply the model to real-world data collected from a large paid-search advertising host and
discuss findings. Section 1.5 provides managerial implications and presents two counterfactual
analyses. Section 1.6 concludes the paper.
1.2 Relevant Literature
Our work is based on the growing literature on paid-search advertising. A series of papers have
analytically examined advertisers’ bidding behavior in the GSP auction. EOS and Varian (2007)
are the first two papers characterizing the bidding equilibrium in the GSP auction. EOS proved
that the GSP auction is incentive incompatible, that is, bidding one’s true value is not optimal.
By studying a corresponding generalized English auction, they found that there exists a unique
envy-free Bayes Nash equilibrium. Moreover, the ex post bids corresponding to this generalized
English auction also satisfy the Nash equilibrium conditions of the GSP auction. Varian (2007)
independently derived a similar equilibrium condition, which shows that the vector of
equilibrium bids of the GSP auction can be expressed in a recursive form. The theoretical study
on the GSP auction is further developed by Katona and Sarvary (2010), who extended the model
to account for the heterogeneity of click-through rate (i.e., the ratio of actual clicks to the number
of impressions) across competing ads and built the link between sponsored ads and organic ads.
Athey and Nekipelov (2012) recently introduced advertisers’ uncertainty in quality scores to the
GSP auction and presented theoretical conditions for the existence of a unique Nash equilibrium.
They also proposed a computation algorithm to infer the bounds of bidders’ valuations and
applied their method to the historical data of several keywords as a demonstration.
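
As an illustration of the recursive form, the sketch below computes equilibrium CPCs from the indifference condition in EOS, under which the advertiser in position k bids b_k = v_k - (a_k / a_{k-1}) (v_k - b_{k+1}) and the advertiser in position k - 1 pays b_k per click. Two simplifications are our own assumptions: the click ratio a_k / a_{k-1} equals a constant decay factor, and the last displayed ad pays a hypothetical reserve price.

```python
def gsp_equilibrium_cpcs(values, decay, reserve=0.0):
    """CPCs by position in the lowest-revenue envy-free equilibrium of a GSP auction.

    Assumptions (illustrative): advertisers are ranked by value-per-click,
    the click ratio between adjacent positions is the constant `decay`,
    and `reserve` stands in for the bid below the last position.
    """
    ranked = sorted(values, reverse=True)
    n = len(ranked)
    # bids[k] is the bid of the advertiser in position k + 1; bids[n] is the reserve.
    # bids[0] is not pinned down by the recursion (any bid above bids[1] works)
    # and is never used as a price, so it is left at 0.0.
    bids = [0.0] * (n + 1)
    bids[n] = reserve
    for k in range(n - 1, 0, -1):
        v = ranked[k]
        bids[k] = v - decay * (v - bids[k + 1])
    # Each position pays the bid of the advertiser ranked directly below it.
    return [bids[k + 1] for k in range(n)]


# Three advertisers with values 10, 8, and 4, decay 0.5, reserve price 1.
print(gsp_equilibrium_cpcs([4.0, 10.0, 8.0], decay=0.5, reserve=1.0))  # [5.25, 2.5, 1.0]
```

Because each CPC depends on the values of all lower-ranked advertisers, the realized CPCs of entrants in the same keyword are correlated across positions.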
These studies have provided important theoretical foundations to modeling advertisers’
bidding strategies in the GSP auction. However, the theoretical results regarding paid-search
advertisers’ bidding behavior have not been empirically investigated. In this paper, we
characterize the CPC formation based on the equilibrium condition provided by EOS and Varian
(2007). We jointly estimate the CPC, click-volume, and number of entrant advertisers, and infer
the distributions of baseline click-volume, mean value-per-click, and mean decay factor. We also
identify keyword characteristics that affect these three parameters. Our proposed structural
model allows us to derive insights which cannot be obtained from a reduced-form model and fits
the data better.
On the empirical side, several papers have examined marketing-related issues in the
context of paid-search advertising. For example, Ghose and Yang (2009) simultaneously
modeled click-through, conversion, CPC, and ad position using keyword-specific data from one
retailer. Yang and Ghose (2010) extended their previous work by analyzing consumer click-
through and conversion on both sponsored search listings and organic search listings for the
same keyword. Rutz and Bucklin (2011) explored the potential spillover effects between
activities associated with generic and branded keywords in paid-search advertising, using data
from a hotel chain. Goldfarb and Tucker (2010) empirically studied the price variation on paid-
search ads related to a legal service on Google, and found evidence of substitution of online
advertising for offline advertising. These aforementioned studies generally employed a reduced-
form approach and focused on predicting paid-search ad performances.
Few studies have empirically examined the underlying competition of paid-search
advertising. Two important pieces of work need to be mentioned here. Chan and Park (2013)
studied the influence of sequential search behaviors of consumers on the value of click-throughs
in sponsored search advertising. They modeled the position competition in the context of a first-
price auction with a buy-it-now option, which allows advertisers to acquire a position without
submitting a bid. Since a unique equilibrium cannot be obtained in such an auction mechanism,
they used the moment-inequality estimation approach to avoid imposing restrictive assumptions
on equilibrium selection and infer advertisers’ value-per-click from the observed ad positions.
Yao and Mela (2011) also modeled the position competition in the first-price auction with a
sorting/filtering function available to users. They emphasized the dynamics in forward-looking
advertisers’ bidding strategies and used the Markov perfect equilibrium to characterize
advertisers’ bids. They estimated the model by applying the two-step estimators developed by
Bajari et al. (2007), assuming the existence and uniqueness of the equilibrium.
Our paper is different from this line of work in several ways. First, our paper focuses on
the effect of competition (number of competing ads) on click-volume and CPC through three
latent variables: baseline click-volume, decay factor, and value-per-click, whereas previous
studies have not examined the separate effects of competition on these variables. Second, we
develop an integrative model of click-volume, CPC, and number of entrants. However, the
previous two studies have not considered the entry decisions of advertisers and therefore have
not modeled the number of entrants for a given keyword. Third, unlike the previous two studies
which looked at paid-search advertising in the first-price auction with a small number of ad
positions, we study the GSP auction without capacity constraint, which is the most popular type
of paid-search mechanism. Fourth, the characterization of CPC in our paper is built upon the
Nash equilibrium condition derived and proved by EOS. This equilibrium
condition enables us to more closely examine the realized CPCs of advertisers and estimate the
distribution of value-per-click. We develop a Bayesian estimation algorithm to cope with the
econometric challenges of the CPC model based on this equilibrium condition.
Because the number of entrants not only affects click-volume and CPC but also reflects
the demand and supply conditions associated with a keyword, we model the number of entrants as an
aggregate outcome of entry decisions made by potential entrants. Since we model advertisers’
simultaneous entry decisions, our paper is related to the literature on simultaneous-move games.
The pioneering work in this line of research includes Bresnahan and Reiss (1990, 1991) and
Berry (1992). The first two find that the multiple-equilibrium issue is prevalent in
simultaneous-move games, and they suggest that one way to bypass the multiplicity of equilibria is to
focus on the total number of entrants in a market rather than the vector of individual entry
decisions. One important finding in Berry (1992) is that the number of entrants is uniquely
determined if the profit function strictly decreases with the number of entrants. Recently, more
complicated simultaneous-move entry models have been developed by endogenizing either firms’
product differentiation (Mazzeo 2002, Seim 2006), or spatial differentiation (Zhu and Singh
2009), or both (Datta and Sudhir 2011), accounting for the spillover effect within a market
(Vitorino 2012) or across markets (Jia 2008), and incorporating the effect of zoning regulations
into market structure (Datta and Sudhir 2013).
Due to the large number of potential entrants in our empirical context, we follow the
previous literature to model the entry decisions of advertisers as a simultaneous-move game with
incomplete information (Seim 2006, Zhu and Singh 2009, Datta and Sudhir 2011, Vitorino 2012).
In other words, the profitability of each advertiser in a keyword is private information and only
its distribution is common knowledge among competitors. We further assume advertisers to be
symmetrical for the following reasons. First, since one of our main objectives is to study
the impact of the number of entrants on the three constructs underlying click-volume and CPC, it is
reasonable to make this assumption to build an internally consistent model. Second, by following
Berry (1992) in assuming that an advertiser’s expected profit is a decreasing function of the number of
entrants, we can prove the existence of a unique equilibrium number of entrants. Finally, due
to the heavy computational burden, current methods can only handle a small number of
heterogeneous players (e.g., Datta and Sudhir 2011, Vitorino 2012). However, since there are a
large number of potential entrants in our empirical context, it is infeasible to adopt the same
modeling approach.
1.3 Data and Proposed Model
1.3.1 Description of the Data
We obtain data from a leading online market maker outside the U.S. who hosts paid-search
advertising. We regard each keyword as a market, and consequently an advertiser is named an
entrant to a keyword if she decides to advertise her product through this keyword. For this paid-
search host, ad positions are auctioned in the second-price fashion and are entirely determined by
the rank of bids submitted by entered advertisers. Each advertiser then pays the highest bid
among all bids below hers for each click. In other words, the auction mechanism used in our data
is exactly the same as the GSP auction defined in EOS.
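The ranking and payment rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `gsp_outcome`, the `min_bid` floor charged to the lowest-ranked advertiser, and the example bids are assumptions made here, not details supplied by the data provider.

```python
def gsp_outcome(bids, min_bid=0.0):
    """Rank advertisers by bid and charge each the highest bid below hers."""
    # Sort advertiser indices by bid, highest first (ties keep index order).
    order = sorted(range(len(bids)), key=lambda i: -bids[i])
    outcome = []
    for pos, adv in enumerate(order, start=1):
        # The next-highest bid sets the price; the last advertiser pays the floor.
        price = bids[order[pos]] if pos < len(order) else min_bid
        outcome.append((pos, adv, price))
    return outcome


# Hypothetical bids for three advertisers (indices 0, 1, 2).
example = gsp_outcome([0.30, 0.50, 0.10], min_bid=0.05)
```

Each tuple is (position, advertiser index, price per click), so the advertiser with the highest bid takes position 1 and pays the second-highest bid.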
Our data includes aggregate information on 1573 keywords of digital camera/video
products and related accessories in June of 2010. There are 359 advertisers that advertised
through a subset of these 1573 keywords. According to the paid-search host, advertisers in that
market often review their keyword lists and make purchase decisions monthly. These 1573
keywords are further classified into three main categories: digital camera, digital video and
accessory. In addition to this primary categorization, each keyword also belongs to one or several
of 44 subcategories.¹
We create several keyword attributes. First, we define three variables DV, Accessory, and
Coverage based on the categorical information of a keyword. The variable Coverage measures
the number of subcategories a keyword belongs to, which indicates the market breadth of the
keyword. Second, we define several keyword attributes based on the product-related information:
Brand (whether the keyword contains a brand name), General (whether the keyword includes a
general feature that could apply to different products), and Specific (whether the keyword
includes a specific feature such as model/series number that exclusively refers to a product). In
addition, we also have the length information for each keyword. The variable Length indicates
the total number of characters included in the keyword. Finally, we create a dummy variable
Promotional that indicates whether a keyword includes promotional terms. For example, the keyword
“Nikon D700 HD Cheap” includes a brand name (Nikon), a specific word (D700), a general
feature (HD, standing for High Definition), and a promotional term (Cheap). Table 1 reports the
summary statistics of keyword characteristics.
¹ Each subcategory can be regarded as a refined classification of the main category. For example, the keyword
“Canon camera” belongs to two subcategories, “ordinary digital camera” and “professional SLR (single-lens
reflex) camera”.
For each keyword, the paid-search host informed us of the composition of its competition
set (i.e., the set of potential advertisers). The selection is mainly based on two criteria. First, for
keywords which link to a specific product/brand, the set of potential entrants includes advertisers
that carry this product/brand in their stores associated with this paid-search host. Second, for a
keyword which does not link to a specific product/brand, the set of potential entrants includes
those who bought other keywords that share similar subcategories (e.g., professional digital
camera, single lens reflex, etc.) with it. As shown in Table 1, the number of entrants ranges from
5 to 25 and the number of potential entrants ranges from 5 to 306. On average, each keyword
belongs to 2 subcategories. For each keyword, we have information on the aggregate click-volume,
average CPC and average positions for each entered advertiser.² Table 2 reports the
summary statistics of the average click-volume and CPC across keywords.
Variable       Mean     Std. Dev.   Min   Max
DV             0.11     0.31        0     1
Accessory      0.42     0.49        0     1
Coverage       2.04     1.91        1     17
Length         5.90     2.40        1     18
Brand          0.47     0.50        0     1
General        0.22     0.42        0     1
Specific       0.53     0.49        0     1
Promotional    0.08     0.27        0     1
n              9.10     4.40        5     25
N_Potential    119.00   61.00       5     306
Table 1. Summary Statistics of Keyword Characteristics³
² We were informed by the data provider that there is very small fluctuation on these measures within the data
period.
³ All keyword characteristics including n are mean-centered in our empirical implementation. The variables n and n²
are scaled by 10 and 100 respectively in estimation. Estimates that are significant at 95% are bolded in Tables 3 – 6.
Variable                                   Mean     Std. Dev.   Min     Max
Total Click-volume                         195.79   464.46      10      8246
Average Click-volume across Positions      17.88    33.28       1.25    515.38
Average CPC across Positions (in cents)    14.53    5.36        5.27    53.40
Table 2. Summary Statistics of Click-volume and CPC across Keywords
1.3.2 Model Setup
To fix the context, we have I advertisers and K keywords. Potential entrants/advertisers of each
keyword are indexed by $i$ and keywords are indexed by $k$. Let $C_k$ stand for the set of potential
entrants of keyword k. To be consistent with the industry practice that advertisers choose the set
of keywords on a monthly basis and then optimize their bids after entry, we model advertisers’
keyword selections and bid decisions as the following two-stage sequential process.
In the first stage, potential entrants $i \in C_k$ decide whether to purchase a specific keyword
k. We model the entry of advertisers as a simultaneous-move game with incomplete information,
in which advertisers possess private information about their own profitability. Since potential
entrants do not observe realized click-volume, CPCs, and value-per-clicks before entry, they are
assumed to form expectations on these variables as well as the entry decisions of others. Based
on these expected values, each potential entrant will form expectations on the advertising
revenue and entry cost. We further assume that potential entrants are symmetrical, and the entry
probability is determined by the expected revenue and the entry cost in a keyword. This micro-
level process determines the total number of entrants for a keyword at the aggregate level. The
equilibrium number of entrants is modeled as the number of potential entrants multiplied by the
entry probability.
In the second stage after entry decisions have already been made (i.e., number of entrants
n has been realized and becomes common knowledge), entrant advertisers now determine how
much to bid. The equilibrium CPCs and ad positions are then determined by realizations of
value-per-clicks. The value-per-click of each entrant advertiser is assumed to be private
information. In the same setup, EOS proved that there exists a unique Bayes Nash equilibrium.
In this equilibrium, ad positions are determined by the descending order of value-per-clicks, and
the associated CPCs are shown to be a recursive function of value-per-clicks. Click-volume at
each position is then realized as users’ responses to paid-search ads.
We next model the three main equilibrium outcomes of this game backwards. We first
present the model of click-volume conditional on the rank of advertisers. Then we discuss how to
model advertisers’ equilibrium CPCs conditional on the number of entrants and their value-per-
clicks. Finally, we present the model of number of entrants.
1.3.3 Modeling Click-Volume
Let $n_k$ stand for the number of entrants in keyword $k$ and $Q_{ki}$ stand for the realized click-volume
for advertiser $i$ displayed in the paid-search results of keyword $k$ at its realized position $j_{ki}$. A
smaller $j_{ki}$ corresponds to a higher position (i.e., more towards the top). Following Feng et al.
(2007), we assume that the expected click-volume of an ad decreases exponentially with its
position. We model the click-volume as
$$Q_{ki} = \begin{cases} B_k \exp(\epsilon_{ki}), & \text{if } j_{ki} = 1 \\ B_k \prod_{l=1}^{j_{ki}-1} \delta_{kl} \exp(\epsilon_{ki}), & \text{if } j_{ki} \ge 2 \end{cases} \qquad (1)$$
where $B_k$ stands for the baseline click-volume at the top position for keyword $k$, $\delta_{kj}$ is the decay
factor, which stands for the ratio of click-volume between position $j+1$ and position $j$ for keyword
$k$ ($0 < \delta_{kj} < 1$), and $\epsilon_{ki}$ is the noise component, normally distributed with mean zero and
variance $\sigma_q^2$. Here $\epsilon_{ki}$ is a measurement error between the expected and realized log click-volume
and is unknown to advertisers.⁴ As for $\delta_{kj}$, we assume that these position-specific decay
factors are common knowledge to entrants because each entrant of a keyword can easily learn
$\delta_{kj}$ by experimenting with bids to change positions. Furthermore, since the realized decay factors
depend on consumers’ search behavior given the search listings, we assume that $\delta_{kj}$ is observed
by advertisers only after their entry. After taking logs on both sides of equation (1), we obtain
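$$\ln(Q_{ki}) = \begin{cases} b_k + \epsilon_{ki}, & \text{if } j_{ki} = 1 \\ b_k + \sum_{l=1}^{j_{ki}-1} \ln(\delta_{kl}) + \epsilon_{ki}, & \text{if } j_{ki} \ge 2 \end{cases} \qquad (2)$$
$$\epsilon_{ki} \sim N(0, \sigma_q^2) \qquad (3)$$
$$b_k = \gamma_1 X_k + \eta_k^1 \qquad (4)$$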
where $b_k = \ln(B_k)$, $X_k$ is a vector of keyword characteristics whose first element is 1 and one of whose
covariates is the number of entrants, and $\eta_k^1$ is a keyword-specific shock that is common
knowledge to all potential entrants. $\eta_k^1$ accounts for the unobserved heterogeneity in keyword-specific
log baseline click-volume. Its distribution will be specified later along
with the other keyword-specific error terms. We parameterize the decay factor $\delta_{kj}$ with a logit
transformation to ensure $0 < \delta_{kj} < 1$,
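$$\delta_{kj} = \frac{\exp(\lambda_{kj})}{1 + \exp(\lambda_{kj})} \qquad (5a)$$
$$\lambda_{kj} \sim N(\lambda_k, \sigma_d^2) \qquad (5b)$$
$$\lambda_k = \gamma_2 X_k + \eta_k^2 \qquad (6)$$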
where $\lambda_{kj}$ is the transformed decay factor, which is assumed to be normally distributed, and $\lambda_k$
stands for the mean transformed decay factor, which is commonly known to all potential entrants.
Similar to $\eta_k^1$ in equation (4), $\eta_k^2$ captures the unobserved heterogeneity in $\lambda_k$. The mean
(transformed) decay factor $\lambda_k$ is a key parameter that governs the click-volume across positions.
Generally speaking, the larger the mean (transformed) decay factor, the smaller the variation of
click-volume across two adjacent positions, and the less important the ad position is in attracting
click-volume, holding everything else constant.
⁴ All error terms specified in our model are unknown or unobservable to researchers. Therefore, we only explain
whether an error term is known to advertisers, and if yes, at what stage advertisers know about it.
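To make the role of the decay factors concrete, the following minimal sketch computes the expected log click-volume at each position from equation (2), using the logit transformation in equation (5a). The function name `expected_log_clicks` and the numerical inputs are hypothetical, and the noise term is set to zero for illustration.

```python
import math


def expected_log_clicks(b_k, lam):
    """Expected log click-volume by position, per equations (2) and (5a).

    b_k : log baseline click-volume at the top position
    lam : transformed decay factors lambda_kj, one per adjacent pair of positions
    """
    log_clicks = [b_k]  # position 1 receives the baseline
    for lam_kj in lam:
        delta = 1.0 / (1.0 + math.exp(-lam_kj))  # logit transform keeps 0 < delta < 1
        # Each further position adds ln(delta) of the position above it.
        log_clicks.append(log_clicks[-1] + math.log(delta))
    return log_clicks


# Hypothetical keyword: 100 expected clicks at the top and lambda = 0 (delta = 0.5).
clicks = [math.exp(v) for v in expected_log_clicks(math.log(100.0), [0.0, 0.0])]
```

With a transformed decay factor of zero, each position receives half the clicks of the position above it; a larger value flattens the profile, which is the sense in which ad position matters less.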
1.3.4 Modeling Cost-Per-Clicks
Conditional on the total number of entrants $n_k$, advertisers submit bids. These submitted bids
determine the rank of ads as well as CPCs of advertisers according to the rule of the GSP auction.
We assume that each advertiser’s value-per-click is private information. As discussed in EOS,
the unique envy-free Bayes Nash equilibrium derived in the corresponding generalized English
auction provides a closed-form solution. More importantly, the realized bids from this
generalized English auction also satisfy the Nash equilibrium conditions of the GSP auction and
are independent of advertisers’ beliefs on others’ valuations. As EOS pointed out in the paper,
“This equilibrium has some notable properties. The bid functions have explicit analytic formulas,
which, combined with equilibrium uniqueness, make our results a useful starting point for
empirical analysis”.
Following EOS, we derive the equilibrium CPC in a recursive form by characterizing
CPC at each position as a weighted sum of the CPC and the advertiser’s value-per-click at the
adjacent position below. We provide a detailed derivation of CPC equation and prove several
equilibrium properties in Appendix A. Since the theoretical proposition indicates that the rank of
ads is in line with advertisers’ value-per-clicks at equilibrium, we model advertisers’ value-per-clicks
as order statistics generated from a distribution. Let $CPC_{ki}$ stand for the CPC advertiser $i$
pays at position $j_{ki}$ for keyword $k$. Then the equilibrium condition can be written as
$$CPC_{ki} = \delta_{k j_{ki}} CPC_{ki'} + (1 - \delta_{k j_{ki}}) S_{ki'} \qquad (7)$$
$$\{S_{ki}\}_{j_{ki}=2}^{n_k} \text{ are the } 1, 2, \ldots, n_k - 1 \text{ descending order statistics of } \{\tilde{S}_{ki}\}_{i=1}^{n_k-1} \qquad (8)$$
where $i'$ refers to the advertiser who stays right below advertiser $i$ (i.e., $j_{ki'} = j_{ki} + 1$), and
$\{\tilde{S}_{ki}\}_{i=1}^{n_k-1}$ are i.i.d. log-normally distributed. Since the CPC of the advertiser at the top position is
independent of her value-per-click, we cannot infer the first advertiser’s value-per-click from
observed CPC data. We also treat the CPC at the bottom position as exogenous since CPCs at
the bottom are 5 cents in our data, which is the minimum bid set by the search host. Similar to $\delta_{kj}$,
an advertiser’s value-per-click $S_{ki}$ is assumed to be known after entry because the realized value-per-click
for an advertiser depends on the conversion rate of her ad. Such demand-side
information can only be accurately learned by advertisers after entering the keyword market.
However, unlike $\delta_{kj}$, $S_{ki}$ is assumed to be private information to advertiser $i$ and is known to
others only up to the distribution. As each advertiser’s value-per-click is always positive, we take
a log transformation of $\tilde{S}_{ki}$ and incorporate the observed heterogeneity as follows,
$$\ln(\tilde{S}_{ki}) \sim N(V_k, \sigma_{vpc}^2) \qquad (9)$$
$$V_k = \gamma_3 X_k + \eta_k^3 \qquad (10)$$
where $V_k$ stands for the mean log value-per-click of keyword $k$ and $\eta_k^3$ accounts for unobserved
heterogeneity in $V_k$. Similar to $\eta_k^1$ and $\eta_k^2$ above, $\eta_k^3$ is assumed to be publicly known to all
potential entrants. Note that $\{\tilde{S}_{ki}\}$ are latent variables which need to be inferred.
As shown in equations (2) and (7), decay factors $\delta_{kj}$ are the key parameters bridging the
click-volume and CPC equations. Intuitively, the decay factor at position $j$ can be interpreted as
the probability for a consumer to further click on an ad conditional on her click on the $j$th ad. This
implies that the decay factor at each position is determined by the marginal search benefit and
cost of a consumer. The decay factor is higher when the consumer obtains a higher marginal return
from search. The click-volume decay factor also affects the distribution of CPCs across positions.
Consider two extreme cases where $\delta_{kj} = 1$ and $\delta_{kj} = 0$ for all $j = 1, \ldots, n_k - 1$. When $\delta_{kj} = 1$, no
advertiser cares about the result of the auction since each position is equally profitable. Hence,
advertisers will all bid zero in the GSP auction. When $\delta_{kj} = 0$, only the top position receives
clicks and all advertisers will bid their true value as predicted in our model (equation 7). To some
extent, a vector of small decay factors drives advertisers to bid more aggressively for the top
position.
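The recursion in equation (7) and the two extreme cases can be checked with a short sketch. It assumes the value-per-clicks are already sorted in descending order of position and that the bottom CPC is given exogenously, as in the text; the function name `equilibrium_cpcs` and all numbers are hypothetical.

```python
def equilibrium_cpcs(values_desc, deltas, bottom_cpc):
    """CPC at each position from the recursion in equation (7).

    values_desc : value-per-clicks ordered from the top position down
    deltas      : decay factors, deltas[j] linking position j+1 to position j+2
    bottom_cpc  : exogenous CPC at the bottom position (e.g., the minimum bid)
    """
    n = len(values_desc)
    if len(deltas) != n - 1:
        raise ValueError("need one decay factor per adjacent pair of positions")
    cpc = [0.0] * n
    cpc[-1] = bottom_cpc
    # Work upward: each CPC mixes the CPC and the value of the advertiser just below.
    for j in range(n - 2, -1, -1):
        cpc[j] = deltas[j] * cpc[j + 1] + (1.0 - deltas[j]) * values_desc[j + 1]
    return cpc


# Illustrative numbers only (in cents); they are not taken from the data.
values = [40.0, 30.0, 20.0]
cpcs = equilibrium_cpcs(values, [0.5, 0.5], bottom_cpc=5.0)
```

Setting every decay factor to zero makes each advertiser pay the full value of the advertiser below her, while setting every decay factor to one collapses all CPCs to the bottom CPC. The top advertiser's own value never enters the recursion, which is why it cannot be inferred from observed CPCs.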
1.3.5 Modeling Number of Entrants
We focus on the keyword entry of advertisers at the aggregate level by modeling the number of
entrants for a number of reasons. First, recall that one of the main objectives of this research is to
understand, in the context of paid-search advertising, the effect of competition (number of
entrants) on three key latent constructs that determine click-volume and CPC: baseline click-
volume, decay factor and value-per-click. It makes sense for us to model the number of entrants
to control for the potential endogeneity.
Second, given that we have a large number of potential entrants, it is rather difficult to
treat and model their entry decisions individually due to the multiple-equilibrium problem and
computational burden. To the best of our knowledge, the extant literature on entry games can
handle only a handful of heterogeneous players (e.g., Datta and Sudhir 2011, Vitorino
2012). Therefore, we assume potential entrants to be symmetrical. We further assume that the
expected profit function of an advertiser decreases with the number of entrants. Under these two
assumptions, we show later that there exists a unique equilibrium number of entrants. We
next outline our proposed model of the number of entrants.
We model advertisers’ entry decisions as a simultaneous-move game with incomplete
information. The advertiser’s utility of purchasing a keyword consists of three parts: the expected
revenue from search advertising, the management/entry cost that covers different types of cost
associated with managing a keyword (e.g., cost of designing/revising ad title, ad copy, and
landing page), and an idiosyncratic shock of entry cost which is private information for each
advertiser.
$$U_{ki} = \beta [\Pi_k(n_k, \theta_k) - F_k] + \xi_{ki} \qquad (11)$$
$$\xi_{ki} \sim \text{Extreme value}(0, 1) \qquad (12)$$
$$F_k = \gamma_4 X_k^F + \eta_k^4 \qquad (13)$$
where $\Pi_k$ is the expected revenue of advertising through keyword $k$, evaluated at the keyword-specific
parameters $\theta_k = (b_k, \lambda_k, V_k)$, and $F_k$ is the keyword-specific
entry cost, which depends on a vector of keyword attributes $X_k^F$. $X_k^F$ is the same as $X_k$ except that
it does not include the number of entrants $n_k$ in our empirical implementation. $\eta_k^4$ captures the
unobserved heterogeneity in $F_k$ and is assumed to be common knowledge to potential entrants.
$\xi_{ki}$ is the private shock of entry cost known by advertiser $i$ in keyword $k$, and is assumed to
follow the Type I Extreme Value distribution. The parameter $\beta$ measures how an advertiser’s expected
profit (in monetary units) affects entry probability.
The expected advertising revenue $\Pi_k$ is formed based on the expectation of click-volume,
value-per-click, and CPC. These values further depend on three keyword-specific parameters:
log baseline click-volume ($b_k$), mean transformed decay factor ($\lambda_k$), and mean log value-per-click
($V_k$). We assume that all potential entrants of a keyword know these three keyword-specific
parameters ($b_k, \lambda_k, V_k$) at the stage of entry. We believe this is a reasonable assumption because
advertisers can learn these parameters from their previous advertising experience and keyword
statistics provided by the search host.⁵ Thus, the expected advertising revenue for a potential
entrant is:
$$\Pi_k(n_k) = E_{Q,\lambda,S}\{Q(b_k(n_k), \lambda_k(n_k)) [S(V_k(n_k)) - CPC(S_k, \lambda_k(n_k))]\} \qquad (14)$$
where $E$ is the expectation with respect to the vector of value-per-clicks of all entrant advertisers,
the vector of decay factors across positions and the vector of click-volume across positions. In
other words, the expectation in equation (14) is taken with respect to the position- or advertiser-specific
errors in equations (3), (5b) and (9), but not with respect to the keyword-specific errors
($\eta_k^1, \eta_k^2, \eta_k^3$) associated with ($b_k, \lambda_k, V_k$). More details are provided in Appendix B on how to
obtain an advertiser’s expected revenue.
Assuming $\xi_{ki}$ follows the Type I Extreme Value distribution, the entry probability of a
representative potential entrant is:
$$P_k(n_k, F_k) = \frac{\exp[\beta (\Pi_k(n_k) - F_k)]}{1 + \exp[\beta (\Pi_k(n_k) - F_k)]} \qquad (15)$$
Following the previous literature (Seim 2006, Datta and Sudhir 2011), we assume that the
unobserved keyword-specific cost parameter $F_k$ is adjusted to equate the observed number of
entrants and the expected number of entrants predicted by the model. This assumption indicates
that the entry cost $F_k$ is just low enough for the $n_k$ entrants observed in the data. Therefore, $n_k$ is
implicitly determined by the following equation.
$$n_k = N_k P_k(n_k, F_k) \qquad (16)$$
where $N_k$ is the number of potential entrants for keyword $k$. Under the assumption that the
expected revenue of advertisers decreases with competition (i.e., $\Pi_k$ decreases with $n_k$), the
entry probability specified in equation (15) also decreases with $n_k$. This suggests that the left-hand
side of equation (16) increases with $n_k$ while the right-hand side decreases with $n_k$. Thus,
we have shown that equation (16) leads to a unique number of entrants $n_k$ at equilibrium.
⁵ Most large search hosts such as Google and Bing provide advertisers free tools to obtain aggregate information on
click-volume and CPC for various keywords. The search host in our data offers a similar service to advertisers.
By substituting equation (15) into equation (16), we have
$$n_k = \frac{N_k \exp[\beta (\Pi_k(n_k) - F_k)]}{1 + \exp[\beta (\Pi_k(n_k) - F_k)]},$$
from which $F_k$ can be expressed below as a function of the observed number of entrants and the number of
potential entrants
$$F_k = \Pi_k(n_k) + \frac{1}{\beta} [\ln(N_k - n_k) - \ln(n_k)] \qquad (17)$$
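Equations (15) to (17) can be illustrated numerically. The sketch below treats the number of entrants as a continuous quantity and uses a made-up decreasing revenue function; `revenue`, the parameter values, and the bisection solver are assumptions for illustration only, not the estimation routine used in this study.

```python
import math


def entry_probability(n, F, beta, revenue):
    """Equation (15): logit entry probability of a representative potential entrant."""
    return 1.0 / (1.0 + math.exp(-beta * (revenue(n) - F)))


def equilibrium_entrants(N, F, beta, revenue, tol=1e-10):
    """Solve n = N * P(n, F), equation (16), by bisection on [0, N]."""
    lo, hi = 0.0, float(N)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # N * P(n) - n is strictly decreasing in n when revenue decreases in n,
        # so it changes sign exactly once on [0, N].
        if N * entry_probability(mid, F, beta, revenue) - mid > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def implied_entry_cost(n, N, beta, revenue):
    """Equation (17): the entry cost that rationalizes n observed entrants."""
    return revenue(n) + (math.log(N - n) - math.log(n)) / beta


def revenue(n):
    """Hypothetical expected revenue, decreasing in the number of entrants."""
    return 10.0 / (1.0 + n)


F_k = implied_entry_cost(9.0, 100, 0.5, revenue)
n_star = equilibrium_entrants(100, F_k, 0.5, revenue)
```

Feeding the entry cost implied by equation (17) back into equation (16) returns the number of entrants it was computed from, which is the sense in which the entry cost is adjusted to match the observed number of entrants.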
Number of entrants can be endogenous in both click-volume and CPC models since the
keyword entry decisions of advertisers are based on their expected profits which depend on the
expected click-volume, value-per-click, and CPC. To control for the potential endogeneity of $n_k$
in click and CPC models, we need to simultaneously estimate the models of click-volume, CPC,
and number of entrants. We also assume the four keyword-specific errors in log baseline click-volume,
mean transformed decay factor, mean log value-per-click, and entry cost to be normally
distributed and potentially correlated. Thus, the conditional likelihood of number of entrants can
be derived as the conditional likelihood of entry cost multiplied by the Jacobian.
$$(\eta_k^1, \eta_k^2, \eta_k^3, \eta_k^4) \sim MVN(0, \Omega) \qquad (18)$$
$$\Pr(n_k \mid b_k, \lambda_k, V_k, \Omega) = \Pr(F_k \mid b_k, \lambda_k, V_k, \Omega) \left| \frac{\partial F_k}{\partial n_k} \right| \qquad (19)$$
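For reference, the Jacobian term in equation (19) can be written out by differentiating equation (17) with respect to the number of entrants. The expression below is a sketch that treats the number of entrants as continuous and assumes the expected revenue is differentiable in it; both are assumptions made here for illustration.

```latex
\frac{\partial F_k}{\partial n_k}
  = \frac{\partial \Pi_k(n_k)}{\partial n_k}
  - \frac{1}{\beta}\left[\frac{1}{N_k - n_k} + \frac{1}{n_k}\right]
```

If the entry-probability coefficient is positive and expected revenue decreases with the number of entrants, both terms are negative, so the derivative is strictly negative and its absolute value in equation (19) is nonzero.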
1.4 Estimation Results
1.4.1 Identification
We first show the theoretical identification of our model. We perform a simulation study in
Appendix C by estimating the proposed model based on simulated data. The estimation results
suggest that all parameters in our proposed model are identifiable and can be recovered within 95%
confidence intervals of their true values. We next discuss the empirical identification of our
model parameters.
A unique aspect of our data is that we observe both keyword entry decisions, and post-
entry information including CPCs and the realized click-volume for each entrant advertiser.
Using an analogy, CPC and click-volume correspond to the price and sales data in the retailing
context. These two pieces of information substantially help researchers identify the separate
effects of number of competing firms on baseline click-volume, mean decay factor, and mean
value-per-click in search advertising context. Specifically, the competition effect on baseline
click-volume is identified through the variation in log click-volume across positions and across
keywords. We also observe variation in number of entrants in different keywords. Furthermore,
as the decay factor appears in both click-volume and CPC equations, the click ratio between
adjacent positions and the correlation between consecutive CPCs help us identify the decay
factor. Given the decay factor, we can then identify the mean value-per-click.
We next illustrate the empirical identification of the equation of number of entrants. Note
that we are not estimating equation (11), which is only used to derive the entry probability.
Instead, we estimate equation (17) in which $F_k$ is a function of observed number of entrants and
number of potential entrants. Note that $F_k$ is also a linear function of $X_k^F$ as specified in equation
(13). As a demonstration, we assume here the error term $\eta_k^4$ to be independent of other
keyword-specific errors $\eta_k^1, \eta_k^2, \eta_k^3$, while we allow these four errors to be correlated in the
model.
$$\eta_k^4 \sim N(0, \sigma_4^2) \qquad (20)$$
Based on equations (13) and (20), equation (17) can be rewritten as:
$$[\ln(N_k - n_k) - \ln(n_k)] = \beta \gamma_4 X_k^F - \beta \Pi_k + \beta \eta_k^4 \qquad (21)$$
which is equivalent to the equation below after reparameterization,
$$[\ln(N_k - n_k) - \ln(n_k)] = \alpha X_k^F - \beta \Pi_k + e_k \qquad (22)$$
where $\alpha = \beta \gamma_4$ and $e_k = \beta \eta_k^4$.
Note that the expected entry profit $\Pi_k(X_k, \theta_k)$ depends on both keyword attributes $X_k$
and keyword-specific parameters $\theta_k$ (i.e., $\theta_k = (b_k, \lambda_k, V_k)$) estimated from the click-volume
and CPC equations, in a highly non-linear fashion. Given that $\theta_k$ can be uniquely identified from
the click-volume and CPC equations, $\Pi_k$ can then be constructed. Because $\Pi_k$ includes post-entry
information in addition to keyword attributes, its coefficient $\beta$ in equation (22) can then be
identified based on the empirical relationship between $[\ln(N_k - n_k) - \ln(n_k)]$ and $\Pi_k$. Once $\beta$
is identified, $\gamma_4$ and $\sigma_4^2$ can be identified accordingly as in a regression. Intuitively, $\beta$ should be
positive, because as $\Pi_k$ increases, we should expect more entrants and fewer non-entrants; that is,
the ratio between non-entrants and entrants will go down. In other words, $[\ln(N_k - n_k) - \ln(n_k)]$
and $\Pi_k$ are expected to have a negative relationship. We will show later that our estimate of $\beta$ is
positive, and this gives some empirical face validity to the model identification.
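The regression logic behind equation (22) can be illustrated with simulated data. The sketch below is a simplification under stated assumptions: the expected revenue is treated as an already constructed regressor that does not depend on the number of entrants, there is a single hypothetical attribute plus an intercept, and all parameter values are invented. It is not the Bayesian procedure used for the actual estimates.

```python
import random


def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[a] * row[b] for row in X) for b in range(p)]
         + [sum(row[a] * yi for row, yi in zip(X, y))] for a in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [u - f * v for u, v in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]


def recover_entry_parameters(lhs, x_f, pi):
    """Fit equation (22): lhs = alpha0 + alpha1 * x_f - beta * pi + e."""
    X = [[1.0, x, -p] for x, p in zip(x_f, pi)]
    alpha0, alpha1, beta = ols(X, lhs)
    # gamma_4 = alpha / beta, from the reparameterization below equation (22).
    return beta, (alpha0 / beta, alpha1 / beta)


# Simulated keywords with invented parameters (beta = 1.5, gamma_4 = (0.4, -0.6)).
random.seed(0)
K = 2000
x_f = [random.gauss(0, 1) for _ in range(K)]
pi = [random.gauss(0, 1) for _ in range(K)]
lhs = [1.5 * (0.4 - 0.6 * x) - 1.5 * p + random.gauss(0, 0.5)
       for x, p in zip(x_f, pi)]
beta_hat, gamma_hat = recover_entry_parameters(lhs, x_f, pi)
```

Because the regressor enters with a negative sign, a positive estimate of the coefficient corresponds to the negative relationship between the log ratio of non-entrants to entrants and expected revenue described above.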
The discussion above aims to provide the basic idea of empirical identification of our
model. The actual model is more complicated given that the expected profit $\Pi_k$ is also a function
of $n_k$. In that case, it is difficult to derive the explicit function of $n_k$, and therefore we
numerically compute the Jacobian when deriving the likelihood function. To ensure that the
proposed identification strategy also works for such a more complicated model, we conduct a
simulation study to confirm the theoretical identification of our model.
In sum, the identification strategy for competition effects in our study is not the same as
the identification strategy applied in previous studies in which researchers only observe firms’
entry or location choices (e.g., Zhu and Singh 2009, Vitorino 2012). Those studies typically rely
only on the existence of exclusion restrictions to exploit the variation in number of rivals across
markets to identify strategic interaction effects (see Bajari et al. 2010 for a detailed discussion).
In our study, we identify the competition effects not only from the observed keyword entry, but
also from the variation in subsequently realized click-volume and CPCs across positions and
across keywords. This follows the identification strategy adopted by Datta and Sudhir (2011),
who first articulated that in addition to the market entry data, the post-entry revenue and price
data are essential for the identification of different types of spillover effects.
1.4.2 Model Estimation
Several challenges arise in model estimation. First, value-per-clicks are unobserved. On top of
that, they are based on order statistics, which induces unobserved correlations across CPCs at
different positions. Second, decay factors appear in both models of click-volume and CPC. Third,
since parameters associated with the baseline click-volume, mean decay factor, and mean value-
per-click affect advertisers’ entry decisions, we need to simultaneously estimate models of click-
volume, CPC, and number of entrants. Finally, number of entrants takes an implicit functional
form, requiring numerical calculation of the Jacobian to simulate the likelihood function.
To cope with these challenges, we adopt the Bayesian estimation approach to make
model inference. The key advantage of the Bayesian approach lies in its convenience in handling
complex models. As shown, the model involves a large number of latent variables. Moreover, the
Bayesian estimation approach can help obtain keyword-specific estimates for parameters
$\{b_k, \lambda_k, V_k, F_k\}$ as a by-product of the Markov chain Monte Carlo (MCMC) algorithm. These
keyword-level estimates are essential for the paid-search host to evaluate the impact of policy
changes on its revenue.
We estimate the proposed model by implementing MCMC methods. We iteratively
generate 100,000 draws of the model parameters and inspect the time series of
model parameters to assess convergence. We use the first 50,000 draws as the burn-in and keep
every 100th draw of the remaining 50,000 draws to estimate the posterior means and standard
deviations. Appendix C provides details of the MCMC algorithm (e.g., prior and posterior
distributions).
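The burn-in and thinning scheme described above leaves 50,000 / 100 = 500 retained draws per parameter. A minimal sketch of the post-processing step follows; the function name `summarize_chain` and the placeholder chain are hypothetical, standing in for the actual sampler output.

```python
import statistics


def summarize_chain(draws, burn_in=50_000, thin=100):
    """Discard the burn-in, keep every `thin`-th remaining draw, and summarize."""
    kept = draws[burn_in::thin]
    return statistics.mean(kept), statistics.stdev(kept), len(kept)


# Placeholder chain: a real run would hold 100,000 MCMC draws of one parameter.
chain = [float(t) for t in range(100_000)]
post_mean, post_sd, n_kept = summarize_chain(chain)
```

The returned mean and standard deviation play the role of the posterior mean and posterior standard deviation of the parameter.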
1.4.3 Empirical Findings
In order to validate our proposed model on click-volume, CPC, and number of entrants, we
compare it with a benchmark model where all these three variables are modeled in a reduced
form with the same set of information. In the benchmark model, the log-transformed click-
volume is modeled as a function of rank and keyword specific attributes, and the rank coefficient
(i.e., decay factor) is also modeled as a function of keyword attributes. CPC is modeled in a
similar way except for the inclusion of an auto-regressive term. We allow unobserved
heterogeneity across keywords and correlation on all random coefficients in both the click-
volume model and the CPC model. We model number of entrants for keyword k as a linear
regression:
$$n_k = \gamma_5 W_k + \eta_k^5 \qquad (23)$$
where $W_k$ includes keyword-specific attributes ($X_k^F$), average click-volume and average CPC in a
keyword, and two instrumental variables. The first instrumental variable is the number of potential
entrants (N_Potential) and the second one is the average number of entrants in similar keywords
(N_Similar) for a given keyword $k$. Keywords are defined to be similar to each other if they
belong to the same subcategory. We find that these two variables significantly predict $n_k$. On the
other hand, when Internet users make click decisions, we believe that they view only the
information from those entered advertisers and therefore N_Potential is less likely to affect their
clicking behavior. Furthermore, we believe that the user click intention is mainly influenced by
the information on the keyword level rather than by the information on the category level.
Therefore, the second instrument we present is also unlikely to be correlated with the error term
in the click equation. A similar argument can be applied to the CPC equation. By adopting
equation (23) and allowing $\eta_k^{5}$ to be correlated with the unobserved heterogeneity error terms in
equations of click-volume and CPC, we control for the potential endogeneity of 𝑛 𝑘 in the
benchmark model using the limited information approach (Vilas-Boas and Winer 1999, Yang,
Chen and Allenby 2003).
Our analysis suggests that our proposed model (log-marginal = 31943) outperforms the
reduced-form model (log-marginal = 57083) substantially. We report the coefficient estimates
from the reduced-form benchmark model in Tables 3 and 4. In the reduced model, number of
competing ads affects click-volume and CPC for an individual advertiser directly in the form of a
standard regression. As shown in Table 3, number of entrants positively affects baseline click-
volume and CPC, and negatively affects the click-volume decay factor.
                    ln(Click-volume) Decay Factor     CPC              Number of Entrants
Keyword Attributes
Intercept            2.293 (.040)    −0.091 (.009)     4.044 (.299)    −0.092 (.011)
Rank                                                  −0.472 (.032)
LagCPC                                                 1.050 (.014)
DV                  −0.193 (.111)    −0.016 (.020)     0.643 (.189)    −0.038 (.036)
Accessory           −0.140 (.089)    −0.004 (.015)     0.030 (.186)    −0.012 (.033)
Coverage            −0.335 (.172)     0.027 (.028)    −0.621 (.358)     0.242 (.067)
Length              −1.337 (.186)     0.065 (.031)     0.153 (.402)     0.120 (.057)
Brand               −0.074 (.079)     0.014 (.013)    −0.359 (.129)     0.068 (.024)
General             −0.395 (.094)    −0.003 (.016)    −0.215 (.166)     0.065 (.027)
Specific             0.758 (.095)    −0.049 (.017)     0.130 (.183)    −0.170 (.028)
Promotional         −0.121 (.127)    −0.016 (.024)     0.339 (.229)    −0.056 (.040)
n/10                 0.622 (.133)    −0.091 (.026)     1.806 (.317)
n^2/100              0.101 (.096)    −0.005 (.021)    −0.287 (.364)
Avg_Click                                                               0.005 (.000)
Avg_CPC                                                                 0.025 (.002)
N_Potential                                                             0.049 (.023)
N_Similar                                                               0.048 (.012)
Table 3. Estimation Results of the Reduced-Form Model
ln(Click): Std. dev. of error    1.098 (.007)
CPC: Std. dev. of error          3.936 (.027)

Variance-covariance matrix (upper triangle)
                              1              2              3              4              5              6
1 Intercept in ln(Click)       1.207 (.096)  −0.071 (.010)   0.130 (.059)  −0.076 (.020)   0.047 (.009)  −0.358 (.024)
2 Intercept in decay factor                   0.022 (.001)  −0.008 (.004)   0.004 (.002)  −0.004 (.001)   0.016 (.004)
3 Intercept in CPC                                           0.041 (.013)  −0.008 (.006)   0.003 (.003)  −0.041 (.018)
4 Rank coef. in CPC                                                         0.075 (.007)  −0.034 (.003)   0.036 (.006)
5 LagCPC coef. in CPC                                                                      0.041 (.002)  −0.026 (.004)
6 Error term in n_k                                                                                       0.167 (.006)
Table 4. Variance Covariance Estimates from the Reduced-Form Model
Tables 5 and 6 report the coefficient estimates from the proposed model. Our proposed
structural model differs from the reduced-form model in three ways. First, the click-volume decay
factor 𝛿 𝑘𝑗 appears in both the click-volume equation and the CPC equation, bridging the two
elements that jointly determine an advertiser’s expected revenue from a keyword. Second, it
allows us to examine the different mechanisms under which number of entrants affects an
individual advertiser’s CPC. For example, number of entrants can affect an advertiser’s CPC via
either the mean transformed decay factor (𝜆 𝑘 ), or the mean log value-per-click (𝑉 𝑘 ), or both.
Third, explicitly modeling number of entrants (𝑛 𝑘 ) based on the revenue-cost approach enables
us to infer the keyword-specific management cost, which cannot be inferred from the reduced-
form approach. Due to these differences, we cannot directly compare the estimates from these
two models. However, all significant effects of keyword attributes on baseline click-volume and
decay factor reported from the reduced-form model are also significant and have the same signs
in Table 5. Furthermore, we find positive effects of number of entrants on baseline click-volume
in both models. The negative squared-term effect of number of entrants on mean decay
factor suggests that as the number of entrants becomes large, the overall effect of 𝑛 𝑘 turns negative,
consistent with what is found in the reduced-form model. As for the effect of 𝑛 𝑘 on CPC, we show later that the
simulated CPC based on estimation results from the proposed model is positively correlated with
𝑛 𝑘 . These findings support the internal validity of our proposed model.
                    Log Baseline       Mean Transformed   Mean Log           Management Cost
                    Click-Volume       Decay Factor       Value-Per-Click    (in 100 cents)
                    (b_k)              (λ_k)              (V_k)              (F_k)
Keyword Attributes
Intercept            2.326 (.039)       2.682 (.078)       3.596 (.041)       4.465 (.105)
DV                  −0.327 (.060)       0.044 (.049)       0.014 (.031)      −0.965 (.230)
Accessory           −0.443 (.034)       0.186 (.052)      −0.005 (.032)      −1.523 (.162)
Coverage             0.713 (.092)      −0.260 (.048)      −0.349 (.044)       0.979 (.345)
Length              −0.934 (.076)       0.254 (.059)       0.238 (.040)      −2.437 (.265)
Brand               −0.056 (.041)       0.130 (.032)      −0.106 (.028)      −0.445 (.135)
General             −0.457 (.041)       0.173 (.060)       0.004 (.032)      −1.417 (.200)
Specific             0.290 (.048)      −0.101 (.031)      −0.055 (.020)       0.853 (.202)
Promotional         −0.387 (.053)       0.282 (.068)       0.074 (.040)      −0.939 (.254)
n/10                 0.368 (.079)       0.449 (.054)      −0.330 (.035)
n^2/100              0.054 (.035)      −0.315 (.042)       0.097 (.041)
Table 5. Estimation Results from the Proposed Structural Model
Click-volume: Std. dev. of error (q)              1.135 (.010)
Decay factor: Std. dev. of error (d)              1.299 (.009)
Value-per-click: Std. dev. of ln(S_ki) (vpc)      0.464 (.005)
Profit coef. in utility of entry                 12.493 (2.158)

Variance-covariance matrix Ω (upper triangle)
     1                2                3                4
1     0.441 (.028)    −0.219 (.023)    −0.061 (.013)     1.145 (.078)
2                      0.218 (.020)     0.043 (.010)    −0.727 (.055)
3                                       0.075 (.007)    −0.070 (.034)
4                                                        4.189 (0.320)
Table 6. Variance Covariance Estimates from the Proposed Structural Model
We next discuss our findings based on the proposed structural model on click-volume,
CPC, and number of entrants. Since one of the main objectives of this research is to understand
how competition affects click-volume and CPC through three underlying constructs: baseline
click-volume, mean decay factor, and mean value-per-click, we begin with a summary of
findings on these relationships. i) Number of entrants positively affects baseline click-volume. ii)
Number of entrants has an inverse-U relationship with mean decay factor. iii) Number of
entrants has a negative and convex relationship with the mean value-per-click of a keyword.
We first explain the effects of competition on baseline click-volume, which measures the
level of consumer interest in clicking on ads related to a keyword. Intuitively, a larger number of
paid-search ads on the result page is likely to draw consumers’ attention to the search
listings and therefore encourage consumers to start clicking on ads. In some sense, this logic is
similar to the effect of assortment size on consumers’ purchase intention in a retailing setting.
Consistent with this conjecture, we find a positive competition effect on baseline click-volume
from the estimation results of both the reduced-form and structural models. Next we explain the
competition effect on mean decay factor. When more ads are displayed on the result page, the
higher marginal search benefit from a larger 𝑛 𝑘 encourages consumers to engage in a more in-
depth search behavior and thus leads to a larger mean decay factor. However, when the number
of ads becomes too large, it is costly for consumers to click through multiple ads from top to
bottom. Thus, consumers are less likely to reach ads at lower positions on the paid-search listings
and the mean decay factor declines accordingly. Finally, the intuitive explanation for the
negative relationship between 𝑛 𝑘 and mean value-per-click is straightforward: when more
competitors choose to advertise through the same keyword, an advertiser may expect a lower
conversion probability and therefore a lower value from a click. In addition, the small but positive
squared-term effect implies that the magnitude of the negative effect of 𝑛 𝑘 on mean value-per-click
diminishes as 𝑛 𝑘 grows.
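As a rough numerical illustration of these quadratic effects, the sketch below plugs the posterior means of the n/10 and n^2/100 coefficients from Table 5 into the implied quadratic and solves for the number of entrants at which each effect changes direction. It treats the posterior means as point estimates and ignores their uncertainty; the function names are ours and are not part of the estimation procedure.

```python
# Posterior means of the coefficients on n/10 and n^2/100 (Table 5).
COEFS = {
    "log baseline click-volume": (0.368, 0.054),
    "mean transformed decay factor": (0.449, -0.315),
    "mean log value-per-click": (-0.330, 0.097),
}


def competition_effect(n, linear, quadratic):
    """Contribution of n entrants: linear * n/10 + quadratic * n^2/100."""
    return linear * n / 10 + quadratic * n ** 2 / 100


def turning_point(linear, quadratic):
    """Number of entrants at which the derivative of the quadratic is zero."""
    return -10 * linear / (2 * quadratic)


for name, (lin, quad) in COEFS.items():
    n_star = turning_point(lin, quad)
    print(f"{name}: turning point at n = {n_star:.1f}, "
          f"effect at n = 10 is {competition_effect(10, lin, quad):+.3f}")
```

Under these point estimates, the mean transformed decay factor peaks at roughly 7 entrants, and the negative effect on mean log value-per-click bottoms out at roughly 17 entrants. The turning point for baseline click-volume is negative, so that effect rises over all positive values of n.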
We next report the effects of keyword characteristics on the four parameters we are
particularly interested in, which are baseline click-volume, mean decay factor, mean value-per-
click, and management cost. We start with our findings on the baseline click-volume: i) The
baseline click-volume is higher for keywords affiliated with more subcategories and negatively
associated with keyword length. ii) Keywords containing general descriptions tend to receive
less baseline click-volume, while keywords pointing to a specific product tend to receive more
baseline click-volume. One possible explanation is that most consumers shopping through this
online market maker are knowledgeable about the general product features of cameras.
Therefore, they are less likely to search for these general keywords, leading to less baseline
click-volume. On the other hand, specific keywords tend to generate more demand. iii)
Keywords including promotional terms tend to receive less baseline click-volume. This might be
because the segment of price-sensitive consumers is relatively small in the category of digital-related
products.
We next summarize our findings on the mean decay factor: i) The mean decay factor is
lower for keywords belonging to more subcategories and higher for longer keywords. ii)
Keywords with brand or general descriptions are associated with a higher mean decay factor,
whereas keywords including words that are exclusive to the product tend to have a lower mean
decay factor. This is consistent with our previous explanation for the finding on baseline click-
volume: consumers who are less knowledgeable about camera-related products tend to search
more. iii) Keywords with promotional terms tend to have a higher mean decay factor. One
possible reason is that price-sensitive consumers are likely to search deeper along the ad listings
to compare more products.
As for the effect of keyword characteristics on advertisers’ mean value-per-click and
management cost, we find that short keywords or keywords affiliated with multiple
subcategories not only provide less per-click value to advertisers but also are associated with
higher management cost. On the other hand, keywords containing fewer specific terms are
associated with higher mean value-per-click and lower management cost.
In addition to the observed heterogeneity, we also find significant unobserved
heterogeneity in these key model parameters. Figure 1 provides histograms of the posterior mean
estimates of four key parameters across keywords. As shown, there is substantial variation
among keywords, suggesting different demand patterns, CPC levels, and entry costs across
keywords. The covariance estimates suggest two things. First, there are significant
unobserved co-variations among baseline click-volume, mean decay factor, and mean value-per-click.
For example, our results suggest that keywords with higher baseline click-volume tend to
have lower mean decay factor and mean value-per-click. Second, the negative correlation
between keyword-specific unobserved terms in entry cost and mean value-per-click suggests a
positive correlation between number of entrants and the mean value-per-click across keywords.
This finding is consistent with the intuition that keywords with higher mean value-per-click may
attract more advertisers to enter. It also provides evidence that the number of entrants is
endogenous in the CPC equation.
Figure 1. Histogram of the Posterior Mean of Four Key Parameters across Keywords
1.5 Managerial Implications
1.5.1 Implications for Paid-Search Advertisers
It is important for paid-search advertisers to understand how competition influences their own
expected revenue from a keyword. For both advertisers who are already competing in a
keyword-specific market and those who can potentially enter, the decisions of whether to
continue the ad campaign and whether to start advertising through a keyword both depend on the
evaluation of advertising revenue as a result of the number of entrants. Since an advertiser’s
expected revenue depends on expected click-volume, CPC, and value-per-click, one advantage
of our proposed structural model is that it can be used to infer the mean value-per-click of a
keyword and therefore it can help advertisers predict how the expected revenue changes with
competition.
We simulate an advertiser’s expected click-volume, CPC, and advertising revenue in
three keywords in Figure 2 based on the posterior distribution of keyword-specific baseline
click-volume, mean decay factor, and mean value-per-click. The three keywords “Casio h10”,
“Sony w390”, and “Fujifilm j38” are chosen based on the estimates of their mean decay factor,
baseline click-volume, and mean log value-per-click. After controlling for the effect of
competition, the three keywords share similar baseline click-volume and mean log value-per-click,
whereas the posterior means of their mean transformed decay factors are 1.227 (Casio h10), 1.978
(Sony w390), and 2.527 (Fujifilm j38), respectively.
The implications for advertisers based on the simulation results in Figure 2 can be
summarized as follows: i) Advertisers’ expected revenue is generally a decreasing and convex
function of the number of entrants. This negative effect of competition on the expected revenue
of advertisers is consistent with the assumption made in our model setup. ii) Advertisers’
expected click-volume is also generally a decreasing function of the number of entrants. iii)
Advertisers’ expected CPC has a positive and concave relationship with the number of entrants.
iv) Conditional on the competition, an advertiser’s expected revenue is positively associated with
the mean decay factor since the expected click-volume is higher, whereas the expected CPC is
lower in keywords with a larger mean decay factor.
Figure 2. The Relationship between Number of Entrants and Four Key Advertising Metrics in
Three Keywords
Notes. The red points in these figures refer to the realized number of entrants in each keyword.
1.5.2 Implications for the Paid-Search Host
The structural nature of our proposed model can provide many insights for the paid-search host.
As shown in Figure 2, in contrast to the advertiser, the paid-search host is interested in the
market expansion for each keyword as its expected revenue is positively related to the number of
entrants. Although the paid-search host does not have direct control over the number of entrants,
our proposed model suggests two approaches for the paid-search host to attract advertisers. We
conduct two counterfactual experiments to illustrate.
Counterfactual Experiment 1: We study how the paid-search host can improve profit by
changing the decay factor in all keywords. Specifically, we allow the intercept of the mean
transformed decay factor, that is, the baseline mean transformed decay factor, to change from
its current level by {−20%, −15%, −10%, −5%, 0%, 5%, 10%}.⁶ Since the mean decay factor is
determined by consumer search patterns, the paid-search host is able to influence it by
changing the page design. For example, the paid-search host can increase the mean decay factor
by allowing advertisers to upload more content in their ads, which increases consumers’
search benefit, or by setting a tighter space constraint for all ads shown on the paid-search results,
which lowers consumers’ search cost.
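To give a sense of scale for these policy changes, the sketch below maps each percentage change in the intercept of the mean transformed decay factor back to a decay factor. The transformation is not restated in this section, so the inverse-logit mapping used here is an assumption; it is at least consistent with the intercept of 2.682 in Table 5 and the baseline mean decay factor of 0.93 reported in the footnote to this experiment. The function and variable names are illustrative.

```python
import math

# Intercept of the mean transformed decay factor (Table 5).
BASELINE_INTERCEPT = 2.682

# Policy changes considered in Counterfactual Experiment 1.
CHANGES = [-0.20, -0.15, -0.10, -0.05, 0.0, 0.05, 0.10]


def inverse_logit(x):
    """Assumed mapping from the transformed decay factor to the (0, 1) scale."""
    return 1.0 / (1.0 + math.exp(-x))


for change in CHANGES:
    intercept = BASELINE_INTERCEPT * (1 + change)
    print(f"{change:+.0%}: intercept {intercept:.3f} "
          f"-> baseline decay factor {inverse_logit(intercept):.3f}")
```

These values describe only the baseline (intercept) level. The realized mean decay factor in a keyword also depends on keyword attributes and on the equilibrium number of entrants, which this sketch does not model.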
Our proposed model suggests that conditional on the competition (i.e., number of
entrants), the click-volume decay factor influences the revenue of the paid-search host in two
opposite ways: i) A larger mean decay factor tends to increase the total click-volume to the
search host. ii) A larger mean decay factor tends to lower average CPC paid by advertisers.
Therefore, it is not straightforward whether the paid-search host should increase or decrease the
baseline mean decay factor to improve its profit. Moreover, this question is further complicated
by the fact that a change in the baseline mean decay factor also affects the equilibrium number of
entrants, since the mean decay factor influences the expected revenue of an advertiser.
Therefore, it is difficult for the paid-search host to anticipate the overall impact of a change in
the baseline mean decay factor on its profit. Our proposed model can help the paid-search host
identify the right direction in which to move the baseline mean decay factor through appropriate
website design.
⁶ Given that the estimated baseline mean decay factor is already large (0.93), we restrict the upper bound of the change
in the baseline mean transformed decay factor to 10% in this counterfactual experiment.
We next outline how we predict the paid-search host’s revenue at equilibrium under
different values of baseline mean decay factor. As the baseline mean decay factor changes, the
advertiser’s expected revenue of entry will change. Thus, the equilibrium number of entrants
determined by equation (16) will also change. After the equilibrium number of entrants is
adjusted, the realized mean decay factor in each keyword is determined. We then predict click-
volume and CPC under this new equilibrium. Finally the paid-search host’s expected revenue in
each keyword can be computed at this new equilibrium.
We report the paid-search host’s average expected revenue as well as the average number
of entrants across keywords under each policy change over the baseline decay factor in Table 7.
We compare the counterfactual results with the paid-search host’s average revenue and the
average number of entrants at current stage based on the real data. Our result shows that the paid-
search host’s revenue can be improved substantially with an increase in baseline mean decay
factor. For example, the expected revenue of the search host increases by 51% if the baseline
mean transformed decay factor is 10% higher than its current level.⁷ Moreover, such a policy
change increases the average number of entrants in keywords by 39%. Thus, the gain from
additional entrants more than offsets the potential loss from a smaller average CPC due to
a larger mean decay factor, generating a net benefit to the paid-search host. Our result suggests
that the paid-search host can improve its revenue by shifting advertisers’ revenue function upward
through an increase in the baseline mean decay factor, which in turn intensifies the competition
among advertisers.
⁷ We assume that the cost of changing the page design for the paid-search host is negligible.
One caveat of this counterfactual analysis is that a design change could reduce consumers’
repeated usage of the paid-search host, which will have a negative effect on the search host's
revenue. However, this particular counterfactual experiment is still meaningful since it provides
an illustration of the effect of changing mean decay factor (via change of search listing designs)
on the paid-search revenue. The paid-search host just needs to identify the boundary conditions
under which a design change will not significantly affect consumer loyalty, and select the
appropriate design (associated with a certain baseline mean decay factor) within this range to
maximize its profit.
Percentage Change of Baseline Mean Transformed
Decay Factor (Intercept in λ_k)                   −20%    −15%    −10%     −5%       0      5%     10%
Average Expected n across Keywords                6.37    6.95    7.67    8.55    9.10   10.55   12.66
Average Expected Revenue of Search Host
across Keywords (in 100 cents)                   17.42   19.35   21.60   24.44   27.09   33.48   40.94
Table 7. Counterfactual Results of Adjusting Baseline Mean Decay Factor
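The percentage changes quoted in the text can be recovered directly from Table 7. The short sketch below recomputes them relative to the 0% column; it only reuses the reported averages, and the helper name is ours.

```python
# Reported averages from Table 7, ordered by policy change.
CHANGES = ["-20%", "-15%", "-10%", "-5%", "0", "5%", "10%"]
AVG_N = [6.37, 6.95, 7.67, 8.55, 9.10, 10.55, 12.66]
AVG_REVENUE = [17.42, 19.35, 21.60, 24.44, 27.09, 33.48, 40.94]


def pct_change_vs_baseline(values, baseline_index):
    """Percentage change of each entry relative to the baseline entry."""
    base = values[baseline_index]
    return [100.0 * (v - base) / base for v in values]


baseline = CHANGES.index("0")
n_change = pct_change_vs_baseline(AVG_N, baseline)
revenue_change = pct_change_vs_baseline(AVG_REVENUE, baseline)

for label, dn, dr in zip(CHANGES, n_change, revenue_change):
    print(f"{label:>5}: entrants {dn:+6.1f}%, host revenue {dr:+6.1f}%")
```

For the 10% column this gives an increase of about 39% in the average number of entrants and about 51% in the host's average expected revenue.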
Counterfactual Experiment 2: We next demonstrate how our study helps the paid-search
host select the right face-value of coupons to draw more advertisers into search advertising and
improve profit. Specifically, the paid-search host provides a coupon with a common face-value to
advertisers when they launch ad campaigns in any keyword for the first time.⁸ The goal of
offering such coupons is to encourage more advertisers to enter the market. Each advertiser can
redeem only one coupon for each keyword. In practice, Google at one point provided a similar
coupon system for new customers when they signed up with Google AdWords. Our experiment
selects the optimal entry coupon from five face-values, {10%, 20%, 30%, 40%, 50%} of the
average management cost of advertisers across keywords. The mechanism works in the following
way. Given a face-value of the coupon, we predict the equilibrium number of entrants in each
keyword. The number of entrants at this new equilibrium then affects the baseline click-volume,
mean decay factor, and mean value-per-click, through which the realized click-volume
and CPC also change. In a manner similar to that used in the first counterfactual experiment,
we can then calculate the paid-search host’s expected revenue at the new equilibrium induced by
the policy change.
⁸ Although the paid-search host could design coupons with different values for different keywords or
advertisers, doing so is less practical than a uniform coupon due to the potential discrimination issue. We therefore focus
on the simple common-value coupon in this paper.
Table 8 reports findings from this counterfactual experiment. We find that an appropriate
selection of the face-value of entry coupons (from 10% to 30% of average management cost) can
increase the profit of the paid-search host from the current level. In the optimal situation, the
paid-search host can improve its profit by 25% if it uses the entry coupon with a face-value of 20%
of average management cost. In comparison with the alternative of influencing the click-volume
decay factor through page design, the coupon system is under full control of the paid-search host
and therefore is more practical for the search host to implement.
Value of Coupon in Percentage to the Average
Management Cost                                      0     10%     20%     30%     40%     50%
Average Expected n across Keywords                9.10   13.12   16.76   19.72   21.91   23.60
Average Expected Revenue of Search Host
across Keywords (in 100 cents)                   26.98   38.79   48.90   57.94   66.13   76.03
Average Expected Cost of Entry Coupons
across Keywords (in 100 cents)                    0.00    5.86   14.97   26.42   39.14   52.68
Average Expected Profit of Search Host
across Keywords (in 100 cents)                   27.09   32.93   33.93   31.52   26.99   23.35
Table 8. Counterfactual Results of Providing Entry Coupons
Notes. The average management cost across keywords is 446.50 cents.
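The accounting behind Table 8 can be approximated with a few lines of arithmetic. The sketch below assumes that the coupon cost per keyword equals the face-value times the number of entrants (one coupon per entrant per keyword) and that profit is revenue minus coupon cost. Applying this to the reported averages rather than keyword by keyword is a simplification, and the names are illustrative. Inputs are copied from the table as printed.

```python
# Average management cost: 446.50 cents, expressed in units of 100 cents.
AVG_MANAGEMENT_COST = 4.465

# Reported averages from Table 8, ordered by coupon face-value.
FACE_VALUES = [0.0, 0.10, 0.20, 0.30, 0.40, 0.50]
AVG_N = [9.10, 13.12, 16.76, 19.72, 21.91, 23.60]
AVG_REVENUE = [26.98, 38.79, 48.90, 57.94, 66.13, 76.03]


def coupon_cost(face_value, n_entrants):
    """Assumed cost: every entrant redeems one coupon for the keyword."""
    return face_value * AVG_MANAGEMENT_COST * n_entrants


profits = [
    revenue - coupon_cost(face_value, n)
    for face_value, n, revenue in zip(FACE_VALUES, AVG_N, AVG_REVENUE)
]
best = max(range(len(FACE_VALUES)), key=profits.__getitem__)

for face_value, profit in zip(FACE_VALUES, profits):
    print(f"coupon {face_value:.0%}: approximate profit {profit:.2f}")
print(f"best face-value among those tested: {FACE_VALUES[best]:.0%}")
```

With these inputs the approximation matches the reported coupon costs and profits for the five positive face-values to within rounding, and it selects the 20% coupon.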
1.6 Conclusion
In order for paid-search advertisers to survive in a highly competitive market, it is crucial for
them to understand the nature of competition and how it influences their profitability. As such,
most paid-search information providers collect competition-related information, and a very
important piece of this information is the number of competing ads on the same paid-search
listings. Although advertisers have been using this measure as an indicator of competition
intensity, they need more insights into how such a measure affects certain key factors
determining their click-volume and CPC. From the paid-search host’s perspective, an in-depth
understanding of the key determinants of competition among paid-search advertisers can help the
paid-search host improve its profit through appropriate interventions and policy changes.
In this paper, we propose a structural framework to characterize competition and analyze
its impact on click-volume and CPCs through three key latent constructs: baseline click-volume,
decay factor, and value-per-click. We regard each keyword as a market and establish an
integrative model of realized click-volume, CPCs, and number of entrants. One of our key
modeling contributions is that we link demand (i.e., click-volume) and supply (i.e., CPCs)
through the decay factor in a structural manner. We also structurally model the number of
entrants as an aggregate equilibrium outcome based on a micro-level process.
One of the key findings from our analysis is that the number of entrants has significant
impacts on baseline click-volume, decay factor, and value-per-click, which determine the
expected click-volume, CPC, and revenue of search advertisers in a keyword. Therefore, our
results suggest that advertisers should account for these effects in adjusting their bidding and
keyword portfolio strategies with respect to competition. As for the paid-search host, our
counterfactual analysis shows that the paid-search host can improve its profit by manipulating
the click-volume decay factor or distributing coupons with the right face-value to intensify the
competition among advertisers.
Our paper has limitations, offering opportunities for future research. First, our modeling
framework can be extended to a dynamic setting. A dynamic modeling framework will be
especially powerful, for example, in examining advertisers’ keyword portfolio composition and
competition strategies over time. However, due to the relatively short time-span of our data
where advertisers’ keyword portfolio composition does not change, we are not able to model the
dynamic component of advertisers’ keyword management strategy. Should longer-span data be
available, researchers can enrich our modeling framework by incorporating dynamics. Second,
our proposed model is applied to only one product category, and the generalizability of our
empirical findings needs to be validated with more product categories. Third, since we do not
observe how advertisers exactly make the keyword entry decisions, we assume the keyword
entry of advertisers to be independent across keywords. Extending our model to allow joint
keyword entry will be an interesting area for future research, even though we do not find strong
evidence of such interdependent entry decisions across keywords in our empirical context.⁹
Fourth, field experiments will be valuable to empirically test the effectiveness of our proposed
policy changes in the real world. Notwithstanding these limitations, we hope that this study will
generate further interest in exploring this important emerging area in marketing.
⁹ We tested the independence assumption in our empirical context with a reduced-form model, following the idea
of testing spatial autocorrelation (Anselin 1988). In the first step, we regress the log of the
entry probability for keyword k (defined as the number of entrants divided by the number of potential
entrants) on keyword attributes. In the second step, we save the residuals from the first step. In the third
step, we construct Moran’s I statistic based on the residuals and a spatial contiguity matrix W (where the
diagonal elements of W are set to zero, and the ij-th element equals the total number of common attributes shared by
keyword i and keyword j). The Moran’s I test statistic has a p-value of 0.157, which suggests that there is no
significant unobservable interdependence in entry across keywords after controlling for the keyword attributes.
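The third step of this test can be sketched as follows. The code builds a contiguity matrix from binary attribute indicators and evaluates the standard Moran's I formula, I = (N / S0) * (e'We) / (e'e), on demeaned residuals, where S0 is the sum of all weights. Treating "common attributes" as attributes for which both keywords have an indicator of 1 is an assumption, the attribute matrix and residuals in the usage lines are made up for illustration, and the sketch does not compute the p-value.

```python
def common_attribute_weights(attributes):
    """W[i][j] = number of binary attributes shared by keywords i and j.

    Diagonal elements are set to zero, as described in the footnote.
    """
    n = len(attributes)
    return [
        [
            0 if i == j else sum(a * b for a, b in zip(attributes[i], attributes[j]))
            for j in range(n)
        ]
        for i in range(n)
    ]


def morans_i(residuals, weights):
    """Moran's I for the given residuals and spatial weight matrix."""
    n = len(residuals)
    mean = sum(residuals) / n
    e = [r - mean for r in residuals]
    s0 = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * e[i] * e[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(x * x for x in e)


# Hypothetical inputs: four keywords, three binary attributes, first-step residuals.
attributes = [[1, 0, 1], [1, 1, 0], [0, 1, 1], [1, 1, 1]]
residuals = [0.3, -0.1, 0.2, -0.4]
print(morans_i(residuals, common_attribute_weights(attributes)))
```

Values of I near zero are in line with the absence of interdependence in the residuals; judging significance requires the sampling distribution of the statistic, which is omitted here.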
2. Assimilation or Differentiation? Investigating the Effect of
Competition on Sponsored Search Advertisers’ Keyword Decisions
2.1 Introduction
Sponsored search remains the most salient online advertising format today. According to a 2013 report
from the Interactive Advertising Bureau, sponsored-search advertising has grown to be an
$18B industry, accounting for 43% of the annual online advertising spending in the United States.
This fast market expansion brings one noticeable difference in comparison to the past: more than
just a few sponsored ads are often listed under the same search query, competing for user
attention and transaction opportunity. As competition escalates, advertisers constantly find
themselves in rivalry with many other similar or different types of advertisers. For example, the
keyword “Lenovo Thinkpad X220” draws in PC manufacturers, retailers, and comparison sites as
sponsored advertisers.
Keyword choice is often highly strategic and a key element in search engine marketing.
From a search advertiser’s perspective, “entering” a particular keyword market means exposing
its advertisement to users who share similar product interests and search intent, along with other
entrant advertisers who also see value in this market. Intuitively, sponsored ads displayed for the
same keyword, although placed in an order, affect the consumer’s consideration set and the
subsequent purchase, if any. Therefore, what else appears on the sponsored search results and
whether these advertisements are ranked above or below will affect the click-through and
conversion probabilities on a particular search advertisement and, consequently, the advertiser’s
expected payoff and entry probability.
Advertisers’ strategic keyword selection is further facilitated by the competition
information provided by third-party keyword infomediaries, who periodically track and record
all search advertisements associated with thousands of keywords on major search engines. With
the surge of these keyword infomediaries, sponsored search advertisers now can readily observe
who else they were competing with for a given keyword, and who were ranked above or below
their own advertisements in previous periods. Observing such detailed competition information
from the past enables advertisers to more accurately forecast competitors’ entry probabilities and
their ad positions if entering, so they can decide whether to enter a particular keyword market in
the current period. Therefore, we anticipate a strong spillover effect in advertisers’ keyword entry
decisions. In the case of a positive spillover, we anticipate “assimilation” in such a way that the
likelihood of a competitor’s entry tends to increase an individual advertiser’s entry probability.
In the case of a negative spillover, we expect a “differentiation” type of behavior such that
potential advertisers intentionally buy different keywords in order to avoid direct competition.
Although keyword choice is an important decision for search engine marketers, there has
been very limited published research concerning this phenomenon and the underlying strategic
interactions among sponsored search advertisers. In this research, we aim to fill in this gap by
developing a modeling framework to quantify the nature and magnitude of such spillover effects
in advertisers’ keyword decisions.
Our model builds on the setup of a simultaneous-move game with incomplete
information and extends it in two important ways. First, considering that sponsored search results
are ordered and users often search from the top to the bottom, we separately model the strategic
interactions with those advertisements ranked above the focal advertiser and with those ranked
below. For the former, we specifically model the spillover effects between each pair of the three
types of advertisers: manufacturer, retailer, and comparison site. For example, if the advertiser is
a manufacturer, we estimate the effects of the expected number of each of the three types of
above-ranked advertisements on the manufacturer’s expected entry payoff and entry probability
respectively. Our analysis of rank-specific strategic interactions can help sponsored search
advertisers better understand and predict their competition landscape in the equilibrium and
design more effective search advertisements in response to the competition.
Second, unlike the standard approach where potential entrants are assumed based on
certain classification rules, we characterize the set of potential entrants for a given keyword by
modeling advertisers’ keyword consideration sets. This extension is crucial because forming a
keyword consideration set is a common practice in the industry due to advertisers’ limited
knowledge about all available keywords. More importantly, historical entry information provided
by the keyword infomediary allows advertisers to revise their keyword consideration sets over
time based on the previous keyword choices of competitors. By modeling keyword consideration,
and thus the set of potential entrants for a keyword, we capture the indirect effect of competition
information on advertisers’ keyword entry decisions. This is a useful new approach to integrate
previous competition outcomes into the empirical analysis of a static discrete game. Furthermore,
modeling keyword consideration as affected by previous competitive entries can help search
engines evaluate the historical competition information released by the keyword infomediary.
We apply the model to a novel dataset obtained from a leading keyword infomediary in
the United States, which contains advertisers’ entry information and the corresponding ad
ranking on 1,252 popular keywords related to laptops and accessories on Google’s advertising
platform from September 2011 to April 2012. Our analysis is carried out on 28 major advertisers
including PC producers, retailers, and comparison websites. To overcome the multiple-
equilibrium problem from simultaneous entry decisions and the high dimensionality of parameter
space from unobserved keyword consideration sets, we use a two-step estimation approach in
conjunction with the Bayesian method to make model inferences.
Several key findings emerge from our analysis and counterfactual simulations. First, all
three types of advertisers are more likely to enter if they expect a larger number of
advertisements ranked below them, leading to the assimilation effect. Second, the expected
number of advertisements ranked above can have either a positive or negative spillover effect on
an advertiser’s entry probability. More specifically, (i) retailers are the most aggressive and have
a strong tendency to assimilate with those who are expected to rank above them; (ii)
manufacturers tend to assimilate with other manufacturers but differentiate themselves from
comparison sites; and (iii) comparison sites are the least aggressive and tend to differentiate from
other comparison sites and manufacturers. Third, manufacturers and retailers mainly use
historical entry information to improve their keyword relevance, while comparison sites use this
information for keyword discovery. Finally, our counterfactual simulations demonstrate that
more accurate competition information provided by keyword infomediaries tends to expand the
market by increasing the average number of advertisements per keyword and the average number
of keywords per advertiser, which in turn improves the search engine’s revenue by about 4.4%.
The rest of this paper proceeds as follows. We begin with a literature review, outlining
some major prior work related to sponsored search advertising and competitive entry. We then
describe the empirical context and data, and the model we developed to capture the spillover
effects in advertisers’ keyword decisions. In the next section, we discuss the estimation methods
and empirical identification before presenting our empirical findings and managerial implications
based on counterfactual simulations. We conclude with a summary of this study.
2.2 Related Literature
This article builds on the burgeoning literature on search advertising. As most search engines
allocate and sell advertisement spaces via auction, the theoretical literature of search advertising
has mainly focused on understanding advertisers’ bidding reactions to different sales
mechanisms (e.g., Athey and Ellison 2012, Edelman et al. 2007, Varian 2007). In contrast,
previous empirical work has focused on measuring the effectiveness of search advertising.
This line of research is exemplified by studies of position effects on various user response
metrics (Agarwal et al. 2011, Ghose and Yang 2009, Narayanan and Kalyanam 2012), the
spillover effect from generic to branded keywords (Rutz and Bucklin 2011), the interaction
between search and organic results (Yang and Ghose 2010), the substitution between online and
offline advertising (Goldfarb and Tucker 2011a, Joo et al. 2014), and the role of keyword
popularity on consumer clicking behavior (Jerath et al. 2014). In addition to measuring search
advertising effectiveness, several papers have developed novel empirical methods to infer
advertisers’ bidding valuations in different auction mechanisms (e.g., Yang et al. 2014, Yao and
Mela 2011).
A relatively under-researched yet highly important issue is advertisers’ keyword
decisions. Only a handful of papers have examined this phenomenon. For example, researchers
in computer science have studied keyword choices by developing more efficient
recommendation algorithms (e.g., Abhishek and Hosanagar 2007, Chen et al. 2008, Fuxman et al.
2008). The primary focus of our paper is on the strategic interactions among advertisers’
keyword decisions, which has been largely ignored in the computer science literature.
A few marketing papers have examined this phenomenon. Desai et al. (2014) built a
game-theoretical model to analytically explain advertisers’ purchasing of competitor branded
keywords, while taking into account the impact of this tactic on the firms’ price competition on
the advertised product. One interesting finding from this study is that advertisers may purchase
their own branded keywords to offset the potential negative effect if competitors bought them
instead. In another study, Sayedi et al. (2014) also examined the “piggybacking” practice in
sponsored search advertising but from the perspective of budget allocation between online and
offline advertising. Our study differs from and extends this strand of literature in several ways.
First, our study is empirical and examines advertisers’ choices on both branded and generic
keywords. Second, we infer the strategic interaction via the observed keyword choice decisions
and provide additional insights on keyword competition among different types of advertisers
including manufacturers, retailers, and comparison sites. Finally, we adopt a structural modeling
framework by assuming that advertisers’ keyword entry decisions are affected by their
expectations on the post-entry ranking outcomes, and this aspect has not been incorporated in
previous studies.
Another relevant paper is Yang et al. (2014), which examined the impact of the number
of competing ads on click-volume and cost-per-click (CPC), and the formation of a competition
set. Our work differs from theirs in several major ways. First, unlike Yang et al. (2014) who
treated advertisers as symmetric firms, our model includes different types of advertisers. This
general framework allows us to flexibly capture the strategic interactions among advertisers.
Second, the asymmetric assumption also enables us to study the effect of advertisements placed
above and below, which has important implications for advertisers. Finally, we extend the
simultaneous-move entry literature by modeling the set of potential entrants for each keyword. In
this way, we can more accurately quantify the strategic interaction effects in advertisers’
keyword decisions.
From the methodological perspective, our paper is related to the following two bodies of
literature. First, we model advertisers’ keyword choices as an incomplete-information
simultaneous-move game. This framework has been widely adopted in the marketing and
economics literature on discrete entry games (e.g., Datta and Sudhir 2013, Seim 2006, Vitorino
2012, Zhu and Singh 2009). Previous studies typically inferred the potential entrants based on
the characteristics of the players and the market in a deterministic way. However, since many of
these characteristics (such as distance from a firm to a market) are not applicable in the context
of sponsored search advertising, it is rather difficult to identify the set of potential entrants a
priori. Our paper extends the discrete game literature by modeling the set of potential entrants of
a keyword as latently determined by a keyword-consideration process of advertisers. In a
simulation study, we document that an incorrect specification of the set of potential entrants
significantly contaminates the estimates on strategic interaction terms and causes a downward
bias.
Second, we model advertisers’ keyword consideration sets to be consistent with the
industry’s practices and we capture advertisers’ limited capacity in analyzing all existing
keywords. There is a rich body of marketing literature on modeling consideration sets as
exemplified by Andrews and Srinivasan (1995), Bronnenberg and Vanhonacker (1996), Chiang
et al. (1999), Gilbride and Allenby (2004), and van Nierop et al. (2010). Two methods have been
developed to model the consideration of J options. The first method is to model the probability
distribution of all $2^J - 1$ possible consideration sets (Chiang et al. 1999). However, this
approach is not feasible in our context because the number of keyword options is large. The
second method breaks down the curse of dimensionality by modeling the marginal distribution of
each option being considered (Bronnenberg and Vanhonacker 1996, van Nierop et al. 2010). As
shown in van Nierop et al. (2010), the consideration sets retrieved by the second method
accurately correspond to the actual consideration sets. We adopt the second approach to model
advertisers’ keyword considerations. Our study contributes to the consideration modeling
literature by extending the concept of consideration sets to firms’ decision making, i.e.,
advertisers’ keyword choices.
2.3 Empirical Context
We obtain data from a leading search-advertising keyword infomediary in the U.S. market. This
company uses a screen-scraper technology to track all search advertisements associated with
thousands of keywords on Google AdWords on a monthly basis. By analyzing the domain
information related to each advertisement, the infomediary can identify the advertisers who
purchased the corresponding keyword, i.e., entered the keyword market, and the corresponding
positions of their advertisements. Our data include the identity and the rank of search
advertisements on Google that are associated with 1,252 keywords related to laptop products and
accessories from September 2011 to April 2012. These keywords are the popular ones associated
with this product category, and each keyword has been used by at least one advertiser during the
data period. We regard each keyword as a market because a search keyword often reflects a
certain type of user interest and purchase intention. Accordingly, each advertiser who bought a
keyword is called an entrant for that keyword market.
We focus on 28 major search advertisers who account for 72% of the exposures of these
1,252 keywords during the data period. The 28 search advertisers are classified into three types:
manufacturers (M), retailers (R), and comparison sites (C). The comparison sites are also called
comparison engines and they do not sell products to consumers directly. Instead, they list
products carried by retailers and charge these retailers a certain fee for each click that transfers
potential consumers to the retailer’s site. To some extent, comparison sites can be regarded as
second-tier search engines mainly relying on the payments from retailers. For each type, we list
below the 28 focal advertisers based on their entry frequencies from high to low. Our data
provider confirmed with us that these 28 firms all have access to competition information
provided by infomediaries.
8 manufacturers: Toshiba, Dell, HP, Sony, Apple, Acer, Samsung, Lenovo
12 retailers: eBay, Microsoft Store, Amazon, Newegg, Best Buy, Walmart, Staples,
Officemax, Tigerdirect, Target, Sears, Office Depot
8 comparison sites: Buycheapr, Bizrate, Nextag, Shopzilla, Pronto, Beso, Pricegrabber,
Smarter
We construct several keyword-specific variables based on the information conveyed by a
keyword: Length (number of words included in a keyword), Specific (number of words referring
to the brand/model/serial number of a product), and Promotional (number of promotional terms).
For example, a keyword of “Dell Latitude sale” includes two specific terms and one promotional
term. In addition to these keyword-specific attributes, we also construct a keyword- and
advertiser-specific dummy variable Match to capture the relevance of a keyword to a
manufacturer. We define Match as being one if a keyword contains the brand name of the
manufacturer or the keyword includes specific terms that are exclusively associated with this
manufacturer. Table 9 presents the summary statistics of keyword attributes. For example, 20%
of the keywords are highly relevant to Toshiba and only 2% are related to Samsung’s products.
Variable                            Mean     Std. Dev.   Min   Max
Keyword-Specific
  Length                            2.802    0.686       1     5
  Specific                          1.132    1.005       0     3
  Promotional                       0.138    0.350       0     2
Keyword- and Advertiser-Specific
  Match_Toshiba                     0.200    0.400       0     1
  Match_Dell                        0.109    0.312       0     1
  Match_HP                          0.081    0.274       0     1
  Match_Sony                        0.056    0.230       0     1
  Match_Apple                       0.097    0.295       0     1
  Match_Acer                        0.069    0.253       0     1
  Match_Samsung                     0.019    0.137       0     1
  Match_Lenovo                      0.090    0.287       0     1
Table 9. Summary Statistics of Time-Invariant Keyword Attributes
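The attribute definitions above can be made concrete with a short sketch. The term lists below (`SPECIFIC_TERMS`, `PROMOTIONAL_TERMS`, `BRAND_TERMS`) are hypothetical placeholders, since the dictionaries actually used to code the 1,252 keywords are not reported here; only the counting logic follows the text.

```python
# Hypothetical term lists for illustration only; the actual coding
# dictionaries behind Table 9 are not reported in the text.
SPECIFIC_TERMS = {"dell", "latitude", "toshiba", "satellite"}
PROMOTIONAL_TERMS = {"sale", "deal", "coupon", "cheap"}
BRAND_TERMS = {
    "Dell": {"dell", "latitude"},
    "Toshiba": {"toshiba", "satellite"},
}


def keyword_attributes(keyword):
    """Count Length, Specific, and Promotional terms in a keyword."""
    words = keyword.lower().split()
    return {
        "Length": len(words),
        "Specific": sum(w in SPECIFIC_TERMS for w in words),
        "Promotional": sum(w in PROMOTIONAL_TERMS for w in words),
    }


def match(keyword, manufacturer):
    """Return 1 if the keyword contains a term tied to the manufacturer."""
    words = set(keyword.lower().split())
    return int(bool(words & BRAND_TERMS[manufacturer]))


attrs = keyword_attributes("Dell Latitude sale")
```

Under these assumed lists, the keyword “Dell Latitude sale” has Length 3, Specific 2, and Promotional 1, in line with the example in the text, and Match equals one for Dell and zero for Toshiba.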
In addition to data on advertisers’ keyword choices and ranks, we also have monthly
aggregate data on search volume, click-volume, and CPC for each keyword. We do not observe
click-volume or CPC for individual advertisers. We report the summary statistics of these
time-variant variables in Table 10. The average number of ads per keyword is about 6. The
average CPC is $1.19 and, on average, each advertisement receives about 16 clicks per day.
Variable                            Mean      Std. Dev.   Min     Max
N                                   5.898     3.735       1       17
N(M)                                1.762     1.781       0       8
N(R)                                2.491     2.052       0       10
N(C)                                1.646     1.455       0       7
CPC ($)                             1.191     1.171       0.080   22.48
Daily Search Volume (1k)            2.282     20.130      0.002   370
Daily Clicks per Ad                 16.160    109.057     0.050   2716
Table 10. Summary Statistics of Time-Variant Keyword Attributes
2.4 Model
2.4.1 Model Setup
We consider I advertisers and K keywords, indexed by i (i = 1, ..., I) and k (k = 1, ..., K),
respectively. Time is discrete and indexed by t (t = 0, ..., S), where t = 0 refers to the initial
period of the data. We let $I_{kt}$ stand for the set of potential entrants of keyword k at time t.
Next we present a three-stage process to describe advertisers’ keyword choices and several
post-entry observations including the rank of ads, average CPC, and average click volume for a keyword.
In the first stage, each advertiser forms a consideration set at time t defined as the set of
keywords he or she might purchase. This is largely consistent with a common practice in the
sponsored search advertising industry where big advertisers typically form a keyword dictionary
for each product category and then choose keywords to buy from this dictionary (Stokes 2008).
This keyword dictionary contains all terms that a firm regards as relevant to its products and is
updated from time to time. We interchangeably use keyword consideration set and keyword
dictionary here.
In the second stage, all advertisers who have considered a keyword k at time t compose
the set of potential entrants $I_{kt}$ and decide whether to purchase the keyword, i.e., enter that
market. We model the entry of advertisers as a simultaneous-move game with incomplete information.
Building on the incomplete-information framework proposed by Seim (2006), we assume that
each potential entrant $i \in I_{kt}$ receives a privately known profit shock when making the entry
decision.
In the last stage, after the set of entrants denoted by $E_{kt}$ is determined, post-entry
outcomes are realized. The position of an ad is determined via an auction mechanism, in which
the search engine ranks all ads related to a keyword based on advertisers’ weighted bids. Because
we do not observe the bidding data for individual advertisers, we model ad positions as an
ordering outcome of the vector of latent weighted bids for all entrant advertisers. Furthermore,
we model the time series of the average CPC and the average click-volume for a keyword as a
first-order autoregressive or AR1 process. We next present the details of our model.
2.4.2 Modeling Ad Positions
Conditional on keyword entry, advertisers need to submit bids in an auction. The search engine
then determines the rank of ads based on these bids weighted by a quality score. For each entrant
advertiser $i \in E_{kt}$, let $R_{kit}$ stand for advertiser i’s weighted bid, which is unobserved
in our data. Then the rank of advertiser i’s ad, $\mathit{Rank}_{kit}$, is a discrete value determined
by the vector $\{R_{ki't}\}_{i' \in E_{kt}}$ in descending order, that is,
$\mathit{Rank}_{kit} < \mathit{Rank}_{kjt}$ iff $R_{kit} > R_{kjt}$. Here a smaller
Rank refers to a position further towards the top. We model the latent weighted bid for each
entrant advertiser as follows:
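\[
R_{kit} = \alpha_i + \alpha_{1T} X_k + \alpha_{2T} \ln(\mathit{SV}_{kt}) + \alpha_{3T} \mathit{Match}_{ki}
+ \sum_{s=1}^{S} \alpha_{4Ts} D_s(t) + \sum_{l=0}^{t-1} \alpha_{5lT} W_{kil} + \epsilon_{kit} \tag{1}
\]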
where $X_k$ are time-invariant keyword attributes, $\mathit{SV}_{kt}$ represents average daily search
volume for a keyword, and $\mathit{Match}_{ki}$ is a dummy variable measuring the relevance between a
keyword and a manufacturer-type advertiser. We account for seasonal effects by including a series of
monthly dummies $D_s(t)$, which equals one when s = t. Here s denotes each month in our data period and
the first month is used as the benchmark. The parameter $\alpha_i$ captures an advertiser-specific fixed
effect and the coefficients of other variables are all type-specific. Here, $\{W_{kil}\}_{l=0}^{t-1}$
includes two sets of information that pertain to an advertiser: an indicator of his or her previous
entry and the rank of his or her advertisement in previous months (i.e.,
$W_{kit} = \{a_{kit}, \ln(\mathit{Rank}_{kit} + 1)\}$). We assume that the error term $\epsilon_{kit}$
follows a standard normal distribution. We normalize the variance of $\epsilon_{kit}$ to be one because
the rank of advertisements is independent of the scale of $R_{kit}$. In addition, we normalize the
coefficients $\alpha_{1T}$, $\alpha_{2T}$ and $\alpha_{4Ts}$ to be zero for manufacturers for
identification purposes because the associated covariates of these parameters do not vary across
firms. Based on the specifications described above, the probability for advertiser i’s
advertisement to rank above advertiser j’s can be expressed as follows.
\[
P(\mathit{Rank}_{kit} < \mathit{Rank}_{kjt}) = P(R_{kit} > R_{kjt})
= \Phi\!\left( \frac{E(R_{kit}) - E(R_{kjt})}{\sqrt{2}} \right) \tag{2}
\]
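As a numerical illustration of Equation (2), the following sketch evaluates the pairwise ranking probability from two expected weighted bids. The function names are illustrative, and the expected bids passed in are made-up values rather than estimates from the data. The division by the square root of 2 reflects that the difference of two independent standard normal errors has variance 2.

```python
from math import erf, sqrt


def std_normal_cdf(x):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def prob_ranks_above(mean_bid_i, mean_bid_j):
    """P(Rank_i < Rank_j) = Phi((E[R_i] - E[R_j]) / sqrt(2)), as in Equation (2)."""
    return std_normal_cdf((mean_bid_i - mean_bid_j) / sqrt(2.0))


# Illustrative values: advertiser i's expected weighted bid exceeds j's by one unit.
p_i_above_j = prob_ranks_above(1.5, 0.5)
```

With a one-unit gap in expected weighted bids, advertiser i ranks above advertiser j with probability of about 0.76, and two advertisers with equal expected bids are ranked in either order with probability 0.5.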
2.4.3 Modeling Average Click-Volume and Cost-Per-Click
It is crucial for us to model the average click-volume and the average CPC for a keyword to
provide the implications of advertisers’ keyword choices on the search engine’s revenue. We
follow the previous literature in describing the evolution of these two outcomes as an AR1
process (Rutz and Bucklin 2011). To capture the competition effect on the click-volume and the
CPC found in Yang et al. (2014), we allow both to be a function of the number of entrants from
different types. These factors lead us to characterize the average click-volume and the CPC of a
keyword as follows:
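\[
\ln(\mathit{Click}_{kt}) = \beta_1^{Q} X_k + \beta_2^{Q} \ln(\mathit{SV}_{kt}) + \sum_{s=1}^{S} \beta_3^{Q} D_s(t)
+ \beta_4^{Q} \ln(\mathit{Click}_{k,t-1}) + \sum_{T \in \{M,R,C\}} \beta_{5T}^{Q} \ln[N_{kt}(T) + 1]
+ \eta_k^{Q} + \upsilon_{kt}^{Q} \tag{3}
\]
\[
\ln(\mathit{CPC}_{kt}) = \beta_1^{CPC} X_k + \beta_2^{CPC} \ln(\mathit{SV}_{kt}) + \sum_{s=1}^{S} \beta_3^{CPC} D_s(t)
+ \beta_4^{CPC} \ln(\mathit{CPC}_{k,t-1}) + \sum_{T \in \{M,R,C\}} \beta_{5T}^{CPC} \ln[N_{kt}(T) + 1]
+ \eta_k^{CPC} + \upsilon_{kt}^{CPC} \tag{4}
\]
\[
(\eta_k^{Q}, \eta_k^{CPC}) \sim \mathit{MVN}(0, \Omega) \tag{5}
\]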
where $N_{kt}(T)$ is the number of type-T advertisers who bought keyword k at time t,
$\upsilon_{kt}^{Q}$ and $\upsilon_{kt}^{CPC}$ are measurement errors, and the parameters $\eta_k^{Q}$
and $\eta_k^{CPC}$ capture the unobserved keyword heterogeneity in average click-volume and CPC. We
assume $\eta_k^{Q}$ and $\eta_k^{CPC}$ to be normally distributed and correlated. Here
$\beta_{5T}^{Q}$ and $\beta_{5T}^{CPC}$ measure the impact of the number of type-T
advertisers on the average click-volume and CPC of a keyword.
The number of entrants $N_{kt}(T)$ can be endogenous in models of average click-volume
and CPC because the keyword entry decisions may result from advertisers’ expectations about
these two metrics. We control for the potential endogeneity of $N_{kt}(T)$ in Equations (3) and (4)
by using the lagged number of entrants as an instrumental variable (IV) and allowing the error terms
in average click-volume, CPC, and log of the number of manufacturers, retailers, and comparison sites
to be correlated.
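\[
\ln(N_{kt}(M) + 1) = \beta_1^{M} X_k + \beta_2^{M} \ln(\mathit{SV}_{kt}) + \sum_{s=1}^{S} \beta_3^{M} D_s(t)
+ \beta_4^{M} \ln(N_{k,t-1}(M) + 1) + \upsilon_{kt}^{M} \tag{6a}
\]
\[
\ln(N_{kt}(R) + 1) = \beta_1^{R} X_k + \beta_2^{R} \ln(\mathit{SV}_{kt}) + \sum_{s=1}^{S} \beta_3^{R} D_s(t)
+ \beta_4^{R} \ln(N_{k,t-1}(R) + 1) + \upsilon_{kt}^{R} \tag{6b}
\]
\[
\ln(N_{kt}(C) + 1) = \beta_1^{C} X_k + \beta_2^{C} \ln(\mathit{SV}_{kt}) + \sum_{s=1}^{S} \beta_3^{C} D_s(t)
+ \beta_4^{C} \ln(N_{k,t-1}(C) + 1) + \upsilon_{kt}^{C} \tag{6c}
\]
\[
(\upsilon_{kt}^{Q}, \upsilon_{kt}^{CPC}, \upsilon_{kt}^{M}, \upsilon_{kt}^{R}, \upsilon_{kt}^{C}) \sim \mathit{MVN}(0, \Phi) \tag{7}
\]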
2.4.4 Modeling Keyword Entry
We model the entry decisions made by potential entrants of a keyword as a simultaneous-move
game with incomplete information. We define the set of potential entrants as those advertisers
who have considered this keyword at the stage of entry. Previous studies on market entry
typically define the set of potential entrants in an exogenous way because they either focus on an
oligopoly or even a duopoly industry with only a few players—e.g., the airline industry in Berry
(1992) and Ciliberto and Tamer (2009); the discount retail industry in Jia (2008) and Zhu and
Singh (2009); department stores in Vitorino (2012)—or they are able to pin down the set of
potential entrants using a geographical proximity assumption: e.g., Datta and Sudhir (2013) and
Seim (2006). However, in the context of search advertising, there are a lot more advertisers, and
there are no well-defined rules that we can use to determine potential entrants for a keyword.
Thus, we treat the set of potential entrants as latent and model the keyword consideration process.
Following Seim’s (2006) incomplete information framework, we assume that each
potential entrant $i \in I_{kt}$ forms a rational expectation on others’ entry probability. Let
$a_{kit}$ be a binary indicator of advertiser i’s use of keyword k at period t. Then the set of
entrants is formally defined as $E_{kt} = \{i \mid i \in I_{kt} \text{ and } a_{kit} = 1\}$. We
characterize advertiser i’s expected payoff of purchasing a keyword as:
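\[
\begin{aligned}
U_{kit} = {} & \gamma_i + \gamma_{1T} X_k + \gamma_{2T} \ln(\mathit{SV}_{kt}) + \gamma_{3T} \mathit{Match}_{ki}
+ \sum_{s=1}^{S} \gamma_{4Ts} D_s(t) + \gamma_{5T} \mathit{Elag}_{kit} \\
& + \gamma_{6T} \ln(\mathit{Rlag}_{kit} + 1) + \sum_{T' \in \{M,R,C\}} \gamma_{7TT'} \ln[N_{kit}^{a}(T') + 1]
+ \gamma_{8T} \ln[N_{kit}^{b} + 1] + \varsigma_{kit}^{U}
\end{aligned} \tag{8}
\]
\[
\mathit{Elag}_{kit} = \frac{1}{t} \left( \sum_{l=0}^{t-1} a_{kil} \right) \tag{9}
\]
\[
\mathit{Rlag}_{kit} = \frac{1}{t} \left( \sum_{l=0}^{t-1} \mathit{Rank}_{kil} \right) \tag{10}
\]
\[
N_{kit}^{a}(T) = \sum_{j \neq i,\, j \in I_{kt}} \left[ P(a_{kjt} = 1)\, P(\mathit{Rank}_{kjt} < \mathit{Rank}_{kit})\, 1\{j \in T\} \right] \tag{11}
\]
\[
N_{kit}^{b} = \sum_{j \neq i,\, j \in I_{kt}} \left[ P(a_{kjt} = 1)\, P(\mathit{Rank}_{kjt} \geq \mathit{Rank}_{kit}) \right] \tag{12}
\]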
where $\gamma_i$ captures advertiser i’s fixed effect, and $\varsigma_{kit}^{U}$ is an idiosyncratic
profit shock known by advertiser i but not observed by its competitors and researchers. We assume
$\varsigma_{kit}^{U}$ to be i.i.d. normally distributed with unit variance for identification purposes.
To keep the model parsimonious, we construct two variables $\mathit{Elag}_{kit}$ and
$\mathit{Rlag}_{kit}$ to capture the state-dependence effect on entry probability.
$\mathit{Elag}_{kit}$ measures the average monthly entry probability of advertiser
i in the past (as in Equation 9) and $\mathit{Rlag}_{kit}$ measures the average rank of advertiser i’s
advertisement before time t (as in Equation 10). Here we define $\mathit{Rank}_{kit}$ to be 30 if
advertiser i did not buy keyword k at time t. Furthermore, $N_{kit}^{a}(T)$ represents the expected
number of type-T advertisers who rank above i and $N_{kit}^{b}$ represents the expected number of
advertisers who rank below i. Equations (11) and (12) specify how these two variables are constructed
based on the model of ad positions. Finally, to complete the entry model, we normalize the outside
profit to be zero so that an advertiser enters a keyword market iff $U_{kit} > 0$.
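The constructed covariates in Equations (9) to (12) can be sketched as follows. This is a minimal illustration rather than the estimation code: the function names, the dictionary layout of `entry_prob` and `prob_above`, and all numerical inputs are assumptions made for the example. The only values taken from the text are the rank of 30 assigned to periods without entry and the identity P(Rank_j ≥ Rank_i) = 1 − P(Rank_j < Rank_i).

```python
NO_ENTRY_RANK = 30  # rank assigned in the text when the keyword was not bought


def entry_lag(past_entries):
    """Equation (9): share of past periods 0..t-1 with an entry (list of 0/1)."""
    return sum(past_entries) / len(past_entries)


def rank_lag(past_ranks):
    """Equation (10): mean past rank; None marks a period without entry."""
    filled = [NO_ENTRY_RANK if r is None else r for r in past_ranks]
    return sum(filled) / len(filled)


def expected_rivals(i, entry_prob, prob_above, adv_type):
    """Equations (11) and (12) for focal advertiser i.

    entry_prob[j]    : P(a_j = 1) for each potential entrant j
    prob_above[j][i] : P(Rank_j < Rank_i), e.g. from Equation (2)
    adv_type[j]      : "M", "R", or "C"
    Returns (dict of N^a by rival type, N^b).
    """
    n_above = {"M": 0.0, "R": 0.0, "C": 0.0}
    n_below = 0.0
    for j, p_enter in entry_prob.items():
        if j == i:
            continue
        p_j_above = prob_above[j][i]
        n_above[adv_type[j]] += p_enter * p_j_above
        n_below += p_enter * (1.0 - p_j_above)
    return n_above, n_below


# Made-up inputs: one manufacturer "a" facing a retailer "b" and a comparison site "c".
n_above, n_below = expected_rivals(
    "a",
    {"a": 0.5, "b": 0.8, "c": 0.4},
    {"b": {"a": 0.75}, "c": {"a": 0.25}},
    {"a": "M", "b": "R", "c": "C"},
)
```

In this toy case the manufacturer expects 0.6 retailers and 0.1 comparison sites above it and 0.5 advertisers below it, which are the quantities that enter Equation (8) in logs.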
We allow the strategic interaction effects from competitors’ entry on an advertiser’s
payoff to vary with the expected order of ad positions. Most users browse and click on search ads
in a sequential top-down manner, which makes the advertisements ranked above more likely than
those ranked below to influence an advertiser’s expected advertising payoff. Hence, to keep the
model parsimonious, we only allow the strategic interaction effects of advertisements ranked
above, $\gamma_{7TT'}$, to be advertiser-type specific. Under these assumptions, we can characterize the
Bayes-Nash equilibrium (BNE) of the entry probability for potential entrants as follows:
\[
P^{*}(a_{kit} = 1) = \Phi\!\left[ E\!\left( U\!\left( P^{*}(a_{-i}), X; \gamma \right) \right) \right], \quad \forall i \in I_{kt} \tag{13}
\]
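The fixed-point structure of Equation (13) can be illustrated with a deliberately stripped-down game. In the sketch below, each potential entrant's expected payoff is assumed to be a base term plus a single coefficient `gamma` times the log of the expected number of other entrants plus one. This ignores the rank-specific terms of Equation (8), and the function name, payoff form, and numbers are all assumptions made for illustration. The routine iterates best responses until the entry probabilities reproduce themselves. It locates one fixed point and says nothing about whether other equilibria exist.

```python
from math import erf, log, sqrt


def std_normal_cdf(x):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def solve_entry_probabilities(base, gamma, tol=1e-10, max_iter=1000):
    """Iterate p_i = Phi(base_i + gamma * ln(sum_{j != i} p_j + 1)) to a fixed point."""
    probs = [0.5] * len(base)  # arbitrary starting beliefs
    for _ in range(max_iter):
        total = sum(probs)
        updated = [
            std_normal_cdf(b + gamma * log(total - p + 1.0))
            for b, p in zip(base, probs)
        ]
        if max(abs(u - p) for u, p in zip(updated, probs)) < tol:
            return updated
        probs = updated
    raise RuntimeError("best-response iteration did not converge")


# Three hypothetical potential entrants with a negative competition effect.
equilibrium = solve_entry_probabilities([0.2, 0.0, -0.3], gamma=-0.5)
```

Setting `gamma` to zero removes the strategic interaction, and the solution collapses to an ordinary probit probability for each advertiser.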
2.4.5 Modeling Keyword Consideration
We characterize the set of potential entrants $I_{kt}$ by modeling advertisers’ keyword consideration
sets. By definition, an advertiser i is a potential entrant of keyword k if and only if keyword k
belongs to its consideration set. We set $c_{kit} = 1$ if advertiser i considers keyword k at
time t. We model a firm’s consideration decision in a similar way to the consideration model
proposed in Bronnenberg and Vanhonacker (1996) and as further developed in van Nierop et al.
(2010). We model an advertiser’s consideration decision by describing the latent consideration
intensity $V_{kit}$, which indicates a consideration when $V_{kit}$ exceeds a threshold that is
normalized to be zero.
We assume advertisers to be non-strategic at the consideration stage for several reasons.
First, unlike the keyword entry game where a firm’s advertising payoff is affected by others’
entry decisions through position competition, there is no clear economic rationale for firms’
strategic interaction at the consideration stage. Second, a firm’s consideration of a keyword
sometimes merely reflects a firm’s awareness of a keyword rather than an active decision made
from a strategic perspective. Finally, since we do not observe advertisers’ keyword
considerations, we are unable to empirically identify such strategic interactions, even if they
exist, at the consideration stage.
According to Stokes (2008), advertisers usually establish and update their keyword
consideration set or keyword dictionary based on three sources of information: (i) advertisers’
own keyword history; (ii) keyword-recommendation tools relying on proximity-based algorithms
(e.g., the free Keyword Tool provided by Google AdWords); (iii) competitors’ keyword histories
as observed by infomediaries. To be consistent with these industry practices, we allow an
advertiser’s keyword consideration set to be influenced by three sets of variables: keyword
loyalty, keyword similarity, and keyword popularity, the definitions of which are elaborated
below. We model an advertiser’s consideration intensity $V_{kit}$ of a keyword as follows:
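\[
V_{kit} = \lambda_i + \lambda_{1T} \mathit{Elag}_{kit} + \lambda_{2T} \mathit{KS}_{kit}
+ \sum_{T' \in \{M,R,C\}} \lambda_{3TT'} \ln[\mathit{Nlag}_{kit}(T') + 1] + \varsigma_{kit}^{V} \tag{14}
\]
\[
\mathit{KS}_{kit} = \left| X_k - X_{i,t-1}^{R} \right| \tag{15}
\]
\[
X_{i,t-1}^{R} = \frac{\sum_{\forall k,\, a_{ki,t-1} = 1} \mathit{Rank}_{ki,t-1}^{-1} X_k}{\sum_{\forall k,\, a_{ki,t-1} = 1} \mathit{Rank}_{ki,t-1}^{-1}} \tag{16}
\]
\[
\mathit{Nlag}_{kit}(T) = \frac{1}{t} \left[ \sum_{l=0}^{t-1} N_{kil}(T) \right] \tag{17}
\]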
where $\lambda_i$ accounts for the advertiser’s fixed effect and $\varsigma_{kit}^{V}$ is an unknown
error term that follows an i.i.d. standard normal distribution. Similar to $\varsigma_{kit}^{U}$ in
entry utility, the variance of $\varsigma_{kit}^{V}$ is set to be one for identification purposes.
We use the advertiser’s past average monthly entry frequency $\mathit{Elag}_{kit}$ to measure its
keyword loyalty. We create a vector $\mathit{KS}_{kit}$ to measure the similarity between keyword k
and a representative keyword purchased by advertiser i in the last period. We let $X_{i,t-1}^{R}$
denote the vector of attributes of a representative keyword used at time t-1, which is defined as the
average of attribute values of all purchased keywords weighted by the inverse of advertiser i’s rank
(which measures the importance of that keyword). Then, $\mathit{KS}_{kit}$ is the vector of absolute
differences between keyword k and the representative keyword in the last period across all
dimensions of keyword attributes. Finally, keyword popularity is captured by
$\mathit{Nlag}_{kit}(T)$, which is the average number of type-T competing advertisers who bought
keyword k in previous periods. The parameters $\lambda_{3TT'}$ indicate how advertisers use previous
competition information to construct their keyword consideration sets.
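A small sketch may help fix ideas about Equations (15) and (16). The attribute vectors are ordered as (Length, Specific, Promotional), and the keywords, ranks, and function names are hypothetical examples, not values from the data.

```python
def representative_keyword(attrs, ranks):
    """Equation (16): inverse-rank-weighted mean of purchased keywords' attributes.

    attrs : dict mapping keyword -> list of attribute values
    ranks : dict mapping each keyword bought at t-1 -> advertiser's rank (1 = top)
    """
    weights = {k: 1.0 / r for k, r in ranks.items()}
    total = sum(weights.values())
    dim = len(next(iter(attrs.values())))
    return [sum(weights[k] * attrs[k][d] for k in ranks) / total for d in range(dim)]


def keyword_similarity(x_k, x_rep):
    """Equation (15): element-wise absolute difference from the representative keyword."""
    return [abs(a - b) for a, b in zip(x_k, x_rep)]


# Hypothetical purchases at t-1, attributes ordered (Length, Specific, Promotional).
attrs = {"dell laptop": [2, 1, 0], "dell latitude sale": [3, 2, 1]}
ranks = {"dell laptop": 1, "dell latitude sale": 2}
x_rep = representative_keyword(attrs, ranks)
ks = keyword_similarity([3, 1, 0], x_rep)
```

Because the top-ranked keyword receives twice the weight of the keyword ranked second, the representative keyword sits closer to “dell laptop”. Smaller entries in `ks` mean the candidate keyword is closer to what the advertiser already buys.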
We have now completed the model of keyword selection, which leads to the following
likelihood function of parameters 𝛾 and 𝜆 in keyword entry and consideration models, given the
observations of advertisers’ keyword entry.
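\[
L(\gamma, \lambda \mid E) = \prod_{t=1}^{S} \prod_{k=1}^{K} \left[ \sum_{\forall I_{kt}} P(I_{kt})\, P(E_{kt} \mid I_{kt}) \right] \tag{18}
\]
\[
P(I_{kt}) = \prod_{i=1}^{I} \left[ P(V_{kit} > 0)^{1\{i \in I_{kt}\}}\, P(V_{kit} \leq 0)^{1\{i \notin I_{kt}\}} \right] \tag{19}
\]
\[
P(E_{kt} \mid I_{kt}) = 1\{E_{kt} \subseteq I_{kt}\} \prod_{i \in I_{kt}} \left[ P(U_{kit} > 0 \mid I_{kt})^{1\{i \in E_{kt}\}}\, P(U_{kit} \leq 0 \mid I_{kt})^{1\{i \notin E_{kt}\}} \right] \tag{20}
\]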
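For a handful of advertisers, the inner sum of Equation (18) can be evaluated by brute force, which makes the structure of Equations (19) and (20) explicit. The sketch below is illustrative only: `p_consider` and `p_enter` are placeholders for the model-implied probabilities P(V > 0) and P(U > 0 | I), the advertiser labels and numbers are invented, and the loop also visits the empty set, which contributes only when no advertiser entered.

```python
from itertools import combinations


def keyword_likelihood(advertisers, entrants, p_consider, p_enter):
    """Inner sum of Equation (18) for one keyword-period, by full enumeration.

    p_consider[i]         : P(V_i > 0), probability that i considers the keyword
    p_enter(i, potential) : P(U_i > 0 | I), entry probability given the set I
    """
    entrants = frozenset(entrants)
    total = 0.0
    for size in range(len(advertisers) + 1):
        for subset in combinations(advertisers, size):
            potential = frozenset(subset)
            if not entrants <= potential:  # indicator 1{E subset of I} in Equation (20)
                continue
            p_set = 1.0  # Equation (19)
            for i in advertisers:
                p_set *= p_consider[i] if i in potential else 1.0 - p_consider[i]
            p_entry = 1.0  # Equation (20)
            for i in potential:
                p = p_enter(i, potential)
                p_entry *= p if i in entrants else 1.0 - p
            total += p_set * p_entry
    return total


# Toy inputs with entry probabilities that do not depend on the potential-entrant set.
consider = {"a": 0.9, "b": 0.5, "c": 0.2}
enter = {"a": 0.6, "b": 0.3, "c": 0.7}
lik = keyword_likelihood(["a", "b", "c"], {"a"}, consider, lambda i, pot: enter[i])
```

The enumeration visits every subset of advertisers, so its cost doubles with each additional advertiser. With 28 advertisers there are 2**28 - 1 = 268,435,455 non-empty sets per keyword-period, which is why this direct approach does not scale.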
2.5 Estimation Strategy
2.5.1 An Overview of the Estimation Method
We present an overview of our estimation strategy and discuss several econometric challenges.
We first estimate the models of three post-entry outcomes: ad positions, average click-volume,
and average CPC. We jointly estimate the equations of average click-volume and average CPC
along with the equations of the number of three types of advertisers to deal with the potential
endogeneity problem. We adopt a Bayesian estimation approach implemented via the Markov
chain Monte Carlo (MCMC) algorithm to make model inferences. We generate 10,000 iterations
for each model and use every 10th of the last 5,000 draws for the inferences of parameter
estimates. The iteration plots are monitored and inspected to determine convergence of the
sampler. The detailed algorithm for this part appears in Appendix A.1. Next we describe the
estimation strategies for the models of keyword entry and consideration.
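The burn-in and thinning rule described above can be written in a few lines. The sketch assumes the draws are stored in iteration order in a Python list. Which offset defines “every 10th” is not stated in the text, so keeping the 10th, 20th, and so on of the retained draws is an assumption.

```python
def thin_posterior_draws(draws, keep_last=5000, step=10):
    """Drop the burn-in, then keep every `step`-th of the remaining draws."""
    kept = draws[-keep_last:]
    return kept[step - 1::step]


# Iteration numbers 1..10,000 stand in for the actual parameter draws.
retained = thin_posterior_draws(list(range(1, 10001)))
```

With 10,000 iterations this rule leaves 500 draws for posterior summaries.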
Our integrated model of entry and consideration equations presents two main challenges
for estimation. The first problem is the curse of dimensionality. As noted in Equation 18, the
likelihood function of keyword choice requires the summation across all possible sets of
potential entrants for each keyword, the number of which can be as large as
$2^I - 1 \approx 268{,}000{,}000$ in our empirical context (I = 28). Therefore, it is infeasible to
integrate out all sets of $I_{kt}$ to compute the unconditional likelihood function. Second,
conditional on $I_{kt}$, the vector of entry
probabilities for potential entrants is implicitly determined under an equilibrium condition.
However, the possibility of multiple equilibria prevents us from deriving the conditional
likelihood of entry observations expressed in Equation 20.
We adopt a Bayesian estimation approach to cope with the first challenge. We follow van
Nierop et al. (2010) in treating entry and consideration decisions as indicators of latent utilities,
whose posteriors are either normally or truncated normally distributed. By augmenting the model
with these latent variables, we have a deterministic set of potential entrants for each keyword
during the MCMC iteration. Thus, the Bayesian method helps us bypass the necessity of
integrating out all sets of potential entrants.
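The data-augmentation idea can be illustrated for a single binary decision. The sketch below draws a latent utility from a unit-variance normal distribution truncated to the positive or non-positive half-line, depending on the observed choice, using the inverse-CDF method. It is a generic probit augmentation step under assumed inputs, not the sampler in Appendix A.2, and the function name and the clamping constant are illustrative choices.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def draw_latent_utility(mean, chosen, rng=random):
    """Draw z ~ N(mean, 1) truncated to z >= 0 if chosen, else z <= 0."""
    p_nonpositive = _STD_NORMAL.cdf(-mean)  # P(z <= 0)
    u = rng.random()
    if chosen:
        q = p_nonpositive + u * (1.0 - p_nonpositive)
    else:
        q = u * p_nonpositive
    # Keep q strictly inside (0, 1) so that inv_cdf is defined.
    q = min(max(q, 1e-12), 1.0 - 1e-12)
    z = mean + _STD_NORMAL.inv_cdf(q)
    # Guard the sign constraint against rounding in the far tails.
    return max(z, 0.0) if chosen else min(z, 0.0)


random.seed(0)
positive_draws = [draw_latent_utility(0.3, True) for _ in range(1000)]
negative_draws = [draw_latent_utility(0.3, False) for _ in range(1000)]
```

Once latent consideration intensities are drawn this way, the sign of each draw fixes who is a potential entrant within that iteration, so no sum over sets is needed. The clamping step is only an approximation when the mean is many standard deviations from zero.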
For the second challenge, previous literature suggested three approaches to tackle
multiple equilibria in discrete games with incomplete information (see Ellickson and Misra 2011
for a more detailed discussion). The first method is the two-step estimator, which was originally
introduced by Hotz and Miller (1993) in the dynamic discrete choice context. Recently, Bajari et
al. (2010) proved the consistency of this estimator for a static discrete game and provided the
identification conditions. The first stage is to find consistent estimates for those conditional
choice probabilities (CCPs) such as the entry probability in Equation 13. Then in the second
stage, researchers can use these predicted CCPs as covariates to estimate the parameters of
strategic interactions. The second method is the nested pseudo likelihood (NPL) approach
introduced by Aguirregabiria and Mira (2007). This method relies on repetitive procedures of
best response iteration and pseudo likelihood maximization. The final estimation method is the
mathematical programming with equilibrium constraints (MPEC) approach proposed by Su and Judd
(2012), which treats the likelihood-maximization problem as a constrained optimization problem
with equilibrium constraints. Both of the last two methods involve an optimization procedure,
which encounters the first challenge we mentioned above. Therefore, we follow Bajari et al.
(2010) and adopt the two-step estimation approach.
2.5.2 A Two-Step Estimation Approach
The goal of this section is to elaborate our modified two-step estimation method. We first discuss
why the standard two-step method cannot be directly applied to our empirical context. In a
discrete game where the set of players is observed, researchers typically estimate the CCPs based
on observed state variables using either a non-parametric method or a flexible parametric
specification in the first step. Here state variables refer to the whole set of covariates that can
affect a player’s choice probability. Unfortunately, since we intend to be agnostic about the set of
potential entrants in this study, we are unable to infer the CCPs based on the relationship
between observed state variables and observed choice decisions. Next we discuss how we adapt
the two-step method to estimate the proposed model. The details regarding the estimation
algorithm for each stage are in Appendix A.2.
Step 1: The first step aims to predict the equilibrium entry probability $P^*(a_{kit} \mid I_{kt})$ in Equation 13 for all possible $I_{kt}$. For each $I_{kt}$, we denote the vector of state variables by $S_{kt}(I_{kt})$, which includes all relevant keyword characteristics and the previous entry and rank information of all potential entrants. We assume that the equilibrium entry probability $P^*(a_{kit} \mid I_{kt})$ can be approximated by a flexible probit specification with all $S_{kt}(I_{kt})$ as covariates, the number of which is over 300 in our empirical context. We then employ a Bayesian approach to jointly estimate this probit entry model and the consideration model described in Equation 14. Given the estimation results, we use the predicted $\hat{P}^*(a_{kit} \mid I_{kt})$ to construct the expected number of entrants ranked above, $\hat{N}^a_{kit}(I_{kt})$, and ranked below, $\hat{N}^b_{kit}(I_{kt})$, both of which appear on the right-hand side of the entry utility in Equation 8.
Step 2: In the second stage, we obtain consistent estimates of the strategic interaction parameters. Specifically, we replace $N^a_{kit}(I_{kt})$ and $N^b_{kit}(I_{kt})$ in Equation 8 with the predicted $\hat{N}^a_{kit}(I_{kt})$ and $\hat{N}^b_{kit}(I_{kt})$ generated from the first-stage estimation. We then estimate the entry and consideration models together (i.e., Equations 8 and 14) in a similar way to the first-step estimation.
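The construction of the second-stage regressors can be sketched in a few lines. The sketch assumes, purely for illustration, that each rival of a focal advertiser has already been classified as ranking above or below it; the actual rule is given in Appendix A.2 and is not reproduced here. It sums first-stage probabilities to obtain expected counts and applies the ln(N + 1) transform that appears in Table 15. The names `expected_counts`, `log_regressors`, and `above_flags` are hypothetical.

```python
import math

def expected_counts(entry_probs, above_flags):
    """Expected number of rivals entering above and below a focal advertiser.

    entry_probs: first-stage predicted entry probabilities, one per rival.
    above_flags: True if the rival would rank above the focal advertiser
                 (an illustrative assumption, not the paper's ranking rule).
    """
    n_above = sum(p for p, above in zip(entry_probs, above_flags) if above)
    n_below = sum(p for p, above in zip(entry_probs, above_flags) if not above)
    return n_above, n_below

def log_regressors(entry_probs, above_flags):
    """Second-stage regressors ln(N_hat + 1) for ads above and ads below."""
    n_above, n_below = expected_counts(entry_probs, above_flags)
    return math.log1p(n_above), math.log1p(n_below)

# Three hypothetical rivals: two would rank above the focal ad, one below.
probs = [0.5, 0.25, 0.25]
flags = [True, True, False]
print(expected_counts(probs, flags))
print(log_regressors(probs, flags))
```

By linearity of expectation, the expected number of entrants in a group is the sum of their entry probabilities, which is why no simulation over entry outcomes is needed at this step.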
2.5.3 Identification
We next discuss the identification of parameters in keyword entry and consideration equations.
As in standard discrete choice models augmented with consideration decisions, the identification
of non-strategic parameters relies on the covariation between the explanatory variables and the
revealed keyword selections. Note that the sets of covariates in the entry model and the consideration model overlap but are not identical. The covariates of keyword popularity, measured by $Nlag_{kit}$, and keyword similarity, $KS_{kit}$, appear only in the consideration equation.
Thus, the identification of parameters in the consideration model is achieved through the
correlation between advertisers’ keyword choices and those variables that only affect advertisers’
keyword considerations.
As established by Bajari et al. (2010), the identification of the structural parameters in the keyword entry model (i.e., $\gamma_{7TT'}$ and $\gamma_{8T}$ in Equation 8) rests on exclusion restrictions, that is, variables that directly affect the payoff of an individual player but not the payoffs of other players. The intuition for why exclusion restrictions are needed is as follows. Suppose $X_2$ serves as an exclusion restriction variable that affects only firm 2’s entry utility but not firm 1’s. Then the strategic effect of firm 2’s action on firm 1’s entry probability is identifiable because part of the variation in firm 1’s entry observations is driven by the exclusion restriction $X_2$, which does not shift firm 1’s entry profit directly but only indirectly through firm 2’s entry. The exclusion restrictions used here are the previous entry decisions and previous ranks of each advertiser in a keyword. We believe that these two variables are valid exclusion restrictions because they influence an advertiser’s current advertising payoff via state dependence but do not directly affect competitors’ payoffs in the current period.
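The intuition can be written out for a two-firm case. The display below is an illustrative sketch rather than Equation 8 itself: it assumes standard normal private shocks, a single strategic coefficient $\gamma$, and scalar covariates, with $\Phi$ and $\phi$ denoting the standard normal distribution function and density.

```latex
\begin{aligned}
P_1 &= \Phi\bigl(\beta_1 X_1 + \gamma P_2\bigr), \qquad
P_2 = \Phi\bigl(\beta_2 X_2 + \gamma P_1\bigr), \\
\frac{\partial P_1}{\partial X_2}
  &= \phi\bigl(\beta_1 X_1 + \gamma P_2\bigr)\,\gamma\,
     \frac{\partial P_2}{\partial X_2}.
\end{aligned}
```

Because $X_2$ is excluded from firm 1’s payoff, any response of $P_1$ to $X_2$ must run through the term $\gamma P_2$ and vanishes when $\gamma = 0$. If $X_2$ also entered firm 1’s payoff directly, the derivative would carry an additional direct term that could not be separated from the strategic one.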
2.5.4 Simulation Studies
We perform a simulation exercise with two objectives. First, we want to test the theoretical
identification of our proposed joint model of entry and consideration. Second, we hope to
demonstrate the potential bias in the estimate of strategic interaction effect when researchers
incorrectly regard potential entrants as the entire set of players.
We consider a simplified version of our entry and consideration model by assuming that there are three players who might be interested in entering a market. Each of the three firms has some probability of being a potential entrant in the consideration stage. The firms that are potential entrants then compete in a simultaneous-move entry game. If only one firm considers entry, the game reduces to an individual decision-making problem. We include an intercept and two covariates $(X_k, X_{ki})$ in the entry utility, where $X_{ki}$ serves as the exclusion restriction. For the consideration intensity, we include a new covariate $W_{ki}$ in addition to the intercept to ensure model identification. We generate $X_k$, $X_{ki}$, and $W_{ki}$ as i.i.d. draws from a standard normal distribution. We also allow a firm’s entry utility to be affected by the number of competitors that enter when more than two firms are considering market entry.
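A data-generating process of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the dissertation: the consideration stage is taken to be a probit in $W_{ki}$, the strategic term is the sum of rivals’ equilibrium entry probabilities, equilibrium probabilities are found by simple best-response iteration, and the default parameter values mirror the true values listed in Table 11. All function names are hypothetical.

```python
import math
import random

def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def equilibrium_probs(base_utils, gamma, n_iter=500):
    """Iterate best responses P_i = Phi(base_i + gamma * sum_{j != i} P_j)."""
    probs = [0.5] * len(base_utils)
    for _ in range(n_iter):
        total = sum(probs)
        probs = [norm_cdf(b + gamma * (total - p))
                 for b, p in zip(base_utils, probs)]
    return probs

def simulate_market(rng, beta_consider=(1.0, 1.0), beta_entry=(1.0, 1.0, 1.0),
                    gamma=-1.0, n_firms=3):
    """Draw one market: consideration stage first, then the entry game."""
    x_k = rng.gauss(0.0, 1.0)                             # market-level covariate
    x_ki = [rng.gauss(0.0, 1.0) for _ in range(n_firms)]  # exclusion restriction
    w_ki = [rng.gauss(0.0, 1.0) for _ in range(n_firms)]  # consideration shifter
    # Consideration stage: a probit in W_ki (assumed functional form).
    potential = [i for i in range(n_firms)
                 if beta_consider[0] + beta_consider[1] * w_ki[i]
                 + rng.gauss(0.0, 1.0) > 0]
    # Entry stage: only potential entrants play the game.
    base = [beta_entry[0] + beta_entry[1] * x_k + beta_entry[2] * x_ki[i]
            for i in potential]
    probs = equilibrium_probs(base, gamma)
    entrants = [i for i, p in zip(potential, probs) if rng.random() < p]
    return {"x_k": x_k, "x_ki": x_ki, "w_ki": w_ki,
            "potential": potential, "entrants": entrants}

rng = random.Random(0)
markets = [simulate_market(rng) for _ in range(1000)]
print(sum(len(m["entrants"]) for m in markets) / len(markets))
```

With three firms and a strategic coefficient of magnitude one, the best-response map is a contraction (its slope is bounded by 2 × 0.399 in absolute value), so the iteration settles on a single set of entry probabilities regardless of the starting point.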
We simulate the data for 1,000 markets and estimate the model with two different approaches. First, we estimate the model using the proposed two-step method outlined in Section 2.5.2. Second, we estimate the model with a standard two-step method without the consideration stage. Specifically, we assume that all three firms are potential entrants and use $(1, X_k, X_{ki}, W_{ki})$ as covariates in the specification of the entry payoff. We report two sets of simulation results below, one for the model with a negative strategic interaction effect and one for the model with a positive effect.
| Parameter | Negative effect: true values | Negative effect: estimates | Positive effect: true values | Positive effect: estimates |
|---|---|---|---|---|
| With consideration (true model) | | | | |
| $\beta_1$ (Consideration) | 1, 1 | 1.008 (.103), 1.006 (.082) | 1, 1 | 0.994 (.041), 0.953 (.047) |
| $\beta_2$ (Entry) | 1, 1, 1 | 1.050 (.113), 1.054 (.067), 0.966 (.056) | 1, 1, 1 | 0.981 (.162), 1.062 (.093), 1.048 (.087) |
| $\gamma$ (Strategic interaction) | −1 | −1.042 (.106) | 1 | 1.171 (.121) |
| Without consideration | | | | |
| $\beta_3$ (Entry) | N.A. | 0.250 (.115), 0.387 (.046), 0.687 (.060), 0.599 (.043) | N.A. | 0.039 (.100), 0.635 (.031), 0.265 (.029), 0.256 (.027) |
| $\gamma$ (Strategic interaction) | −1 | −0.580 (.127) | 1 | 0.404 (.072) |

Table 11. Simulation Results of Entry and Consideration Models, by Sign of the Strategic Interaction Effect
Notes. Posterior means and standard deviations (in parentheses) are reported, and estimates that are significant at 95% are bolded in Tables 11-16.
The estimation results reported in Table 11 suggest that all parameters in our models of keyword entry and keyword consideration are identifiable and can be recovered within their 95% posterior credible intervals. Furthermore, we find that the estimate of the strategic interaction effect suffers a severe downward bias in magnitude if researchers ignore the consideration decision and treat the entire set of players as potential entrants. The reasoning for this bias can be briefly described as follows. Assume the entry utility of a firm is $U_{ki} = \beta X_{ki} + \gamma \sum_{j \neq i} P(a_j = 1) + \epsilon_{ki}$, where $X_{ki}$ includes the exclusion restriction and $\epsilon_{ki}$ stands for the private profit shock. When the consideration stage is not accounted for, the $\hat{P}(a_j = 1)$ predicted from the first step of a two-step method will be biased upward. Therefore, the magnitude of the estimated strategic interaction effect $\gamma$ obtained from the second step tends to be smaller than its true value.
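A stylized scaling argument, offered only as a heuristic, makes the direction of the bias concrete. Suppose, as an assumption for illustration, that ignoring consideration inflates the first-step prediction proportionally, so that $\hat{P}(a_j = 1) = \kappa P(a_j = 1)$ with $\kappa > 1$.

```latex
\gamma \sum_{j \neq i} P(a_j = 1)
  = \frac{\gamma}{\kappa} \sum_{j \neq i} \hat{P}(a_j = 1),
\qquad
\kappa > 1 \;\Longrightarrow\; \left|\frac{\gamma}{\kappa}\right| < |\gamma| .
```

The coefficient attached to the inflated regressor is then $\gamma/\kappa$, which is shrunk toward zero for either sign of $\gamma$. This matches the pattern in Table 11, where the estimates without consideration are −0.580 and 0.404 against true values of −1 and 1. The actual bias need not be exactly proportional.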
2.6 Results and Counterfactuals
To validate our proposed joint model of entry and consideration, we compare it with a
benchmark model in which all advertisers are exogenously treated as potential entrants. In other
words, there is no consideration stage in the benchmark model. The entry utility of this
benchmark case is modeled in a similar way as in Equation 8 except for the inclusion of
additional covariates that occur in our proposed consideration models. By doing this, we ensure
that the total sets of covariates used in these two models are exactly the same. We estimate the
two models with a two-step approach. We find that our proposed entry model with a
consideration stage (log-marginal = 78412) outperforms the entry model without a consideration
stage (log-marginal = 82461). This suggests the importance of accounting for firms’ keyword
consideration in the context of sponsored search advertising.
2.6.1 Post-Entry Outcomes: Ad Position, Average Click-Volume, and Average CPC
Table 12 reports the estimation results of the advertisement position model. Consistent with
Google’s practice of using relevance between advertisement and keyword as an important
determinant of the ad’s quality, our results indicate that a manufacturer obtains a higher position
in a matched keyword. Regarding the effects of keyword characteristics, we find that retailers
tend to rank higher than manufacturers in specific, promotional, and high-traffic keywords,
whereas comparison sites tend to enjoy position advantages over manufacturers in lengthy and
specific keywords. We also find significant state-dependence effects of previous entry and rank.
As expected, an advertiser who bought the keyword or achieved higher ranks in previous periods
is more likely to maintain a higher position. Finally, the estimates of seasonal effects show that
in comparison to the benchmark month of October 2011, the difference of bidding efforts
between manufacturers and retailers/comparison sites is highest in January 2012. One possible
explanation is that retailers/comparison sites have more incentive than manufacturers to stay at a
higher position to maintain the sales momentum even after the holiday season.
We summarize our findings on the average click-volume and the average CPC in Table
13 and Table 14. We first note that these two advertising metrics vary across keywords with
different attributes. For example, keywords with more promotional terms are associated with
higher average click-volume and CPCs. Our results also show that both the average click-volume
and CPC are lowest in January, perhaps due to consumers’ lowered shopping interests in the
post-holiday season.
The variance-covariance estimates in Table 14 provide some evidence for the
endogeneity of the number of retailers and the number of comparison sites in the CPC equation.
After controlling for the endogeneity, we find that the average click volume per ad is negatively
associated with the number of manufacturers and positively associated with the number of
retailers. Meanwhile, a keyword with more manufacturers and fewer comparison sites tends to
have a higher CPC. As the laptop has entered the maturity stage of its product life cycle in the
U.S. market, most manufacturers mainly use search advertising to reinforce consumers’ brand
awareness while retailers exploit search advertising as a tool to boost sales. This may have
caused the discrepancy in consumers’ clicking intention on ads from these two types of
advertisers. For the latter finding about the effect of the number of ads on CPC, we conjecture
that it arises because manufacturers and comparison sites typically have higher and lower weighted
bids than average, respectively. This is consistent with our data, in which the average ad position
is 4.16, 8.08, and 6.42 for manufacturers, comparison sites, and the full sample, respectively.
| Variable | Manufacturers | Retailers (diff) | Comparison Sites (diff) |
|---|---|---|---|
| Keyword attributes | | | |
| Length | N.A. | −0.002 (.025) | 0.061 (.027) |
| Specific | N.A. | 0.579 (.022) | 0.484 (.025) |
| Promo | N.A. | 0.075 (.036) | −0.031 (.042) |
| ln(SV) | N.A. | 0.028 (.007) | 0.001 (.009) |
| Match | 2.339 (.054) | N.A. | N.A. |
| State-dependence effects | | | |
| $Entry_{t-1}$ | 1.070 (.045) | 0.022 (.063) | 0.361 (.079) |
| $Entry_{t-2}$ | 0.591 (.050) | 0.003 (.074) | 0.237 (.095) |
| $Entry_{t-3}$ | 0.772 (.059) | −0.164 (.078) | −0.160 (.100) |
| $Entry_{t-4}$ | 0.819 (.064) | −0.220 (.087) | −0.344 (.113) |
| $Entry_{t-5}$ | 0.034 (.080) | 0.178 (.105) | −0.047 (.145) |
| $Entry_{t-6}$ | −0.130 (.107) | −0.033 (.155) | −0.160 (.203) |
| $\ln(Rank_{t-1}+1)$ | −0.515 (.025) | 0.123 (.033) | −0.024 (.039) |
| $\ln(Rank_{t-2}+1)$ | −0.225 (.029) | −0.045 (.040) | −0.077 (.045) |
| $\ln(Rank_{t-3}+1)$ | −0.298 (.033) | 0.049 (.043) | 0.039 (.050) |
| $\ln(Rank_{t-4}+1)$ | −0.228 (.034) | 0.017 (.045) | −0.003 (.054) |
| $\ln(Rank_{t-5}+1)$ | −0.066 (.047) | 0.035 (.058) | 0.058 (.072) |
| $\ln(Rank_{t-6}+1)$ | 0.056 (.062) | −0.008 (.083) | 0.062 (.101) |
| Seasonal effects | | | |
| Nov | N.A. | 0.373 (.063) | 0.369 (.071) |
| Dec | N.A. | 0.839 (.059) | 0.585 (.063) |
| Jan | N.A. | 1.319 (.081) | 1.569 (.088) |
| Feb | N.A. | 0.852 (.072) | 0.964 (.078) |
| Mar | N.A. | 0.213 (.066) | 0.741 (.073) |
| Apr | N.A. | 0.763 (.074) | 0.862 (.080) |
| Advertiser fixed effects | | | |
| Avg fixed effect | 0.267 | 0.852 | −0.013 |

Table 12. Estimation Results of Position Model
| Variable | ln(Click) | ln(CPC) |
|---|---|---|
| Keyword attributes | | |
| Intercept | −0.514 (.091) | −0.149 (.041) |
| Length | 0.074 (.034) | 0.077 (.014) |
| Specific | 0.025 (.036) | −0.098 (.012) |
| Promo | 0.136 (.028) | 0.098 (.021) |
| ln(SV) | 0.164 (.053) | −0.001 (.004) |
| State-dependence effects | | |
| $\ln(CPC_{t-1})$ | N.A. | 0.463 (.011) |
| $\ln(Click_{t-1})$ | 0.735 (.050) | N.A. |
| Seasonal effects | | |
| Nov | 0.622 (.046) | 0.380 (.017) |
| Dec | 0.586 (.041) | 0.232 (.018) |
| Jan | −0.056 (.066) | −0.447 (.027) |
| Feb | 0.309 (.048) | −0.125 (.021) |
| Mar | 0.198 (.041) | −0.249 (.018) |
| Apr | 0.229 (.046) | −0.220 (.020) |
| Effects of no. of entrants | | |
| $\ln(N(M)+1)$ | −0.098 (.048) | 0.093 (.020) |
| $\ln(N(R)+1)$ | 0.232 (.068) | −0.043 (.029) |
| $\ln(N(C)+1)$ | −0.100 (.061) | −0.084 (.026) |

Table 13. Estimation Results of Click-Volume and CPC Models
| Variance-covariance matrix | $\phi_1$ | $\phi_2$ | $\phi_3$ | $\phi_4$ | $\phi_5$ |
|---|---|---|---|---|---|
| Error term in ln(Click) ($\phi_1$) | 0.954 (.039) | 0.063 (.005) | 0.007 (.007) | −0.025 (.015) | 0.007 (.012) |
| Error term in ln(CPC) ($\phi_2$) | | 0.160 (.003) | −0.002 (.003) | 0.014 (.006) | 0.019 (.005) |
| Error term in $\ln(N(M)+1)$ ($\phi_3$) | | | 0.159 (.002) | 0.046 (.002) | 0.029 (.002) |
| Error term in $\ln(N(R)+1)$ ($\phi_4$) | | | | 0.232 (.003) | 0.087 (.003) |
| Error term in $\ln(N(C)+1)$ ($\phi_5$) | | | | | 0.217 (.003) |

| Variance-covariance matrix | $\omega_1$ | $\omega_2$ |
|---|---|---|
| Keyword-specific error term in ln(Click) ($\omega_1$) | 0.138 (.025) | 0.006 (.004) |
| Keyword-specific error term in ln(CPC) ($\omega_2$) | | 0.039 (.002) |

Table 14. Variance Covariance Estimates of Click-Volume and CPC Models
2.6.2 Keyword Choice or Entry
Table 15 reports parameter estimates in the keyword entry equation.[10] We begin with a
discussion on the effects of keyword attributes on advertisers’ keyword choices. We find that all
three types of advertisers have a preference for promotional keywords. In addition,
manufacturers and retailers are more likely to use generic and high-traffic keywords while
comparison sites are more likely to use specific and low-traffic keywords. One possible
explanation is that comparison sites often have smaller advertising budgets than manufacturers
and retailers, and therefore they favor niche keywords that are searched by users with higher
purchase intention. We also find that retailers are in favor of longer keywords while comparison
sites prefer shorter ones.
Our results indicate a significant state dependence in advertisers’ keyword decisions. All
three types of advertisers are more likely to advertise through a keyword that was frequently
used or associated with a high rank in the past. These findings also support the validity of our use
of average entry frequency (Elag) and the average ad rank (Rlag) in previous time periods as
exclusion variables.
Effect of Expected Number of Ads Ranked Above. We start with the strategic interaction between advertisers of the same type. The expected number of ads ranked above, denoted by $n$, affects an advertiser’s entry probability in two ways. On one hand, a larger $n$ could lower the click volume on advertiser $i$’s link simply because the advertisement is ranked lower, which directly lowers the advertiser’s expected payoff from advertising, $U(a_i = 1)$, since fewer clicks lead to fewer conversions. This suggests a direct negative effect of $n$ on an advertiser’s entry probability via the influence on the advertiser’s immediate profits. On the other hand, even if advertiser $i$ decides not to advertise, its payoff from the non-advertising option, $U(a_i = 0)$, may also decrease with $n$ because a large $n$ suggests a higher likelihood that potential consumers will buy from those competitors rather than from advertiser $i$ when they purchase subsequently. This suggests an indirect positive effect of $n$ on an advertiser’s entry probability via the influence on advertisers’ incentives to be included in consumers’ consideration sets. Such an indirect effect can occur if most advertisers are well known and the main objective of advertising is to strengthen consumers’ brand awareness. The net impact of $n$ on an advertiser’s entry probability depends on the relative magnitude of the two forces.
[10] Because the first-step estimation results are used to approximate equilibrium entry probabilities, we only report the estimation results from the second step in this paper.
Since the primary advertising objective for manufacturers is often brand building and
reminding, the indirect positive effect is likely to outweigh the direct negative effect for
manufacturers, which explains why manufacturers tend to assimilate with manufacturers ranked
above; that is, the manufacturer’s entry probability increases as the expected number of other
above-ranked manufacturers increases. In contrast, comparison sites mainly use search
advertising to boost conversions and they are often more homogenous and not as well known to
consumers as manufacturers. Hence, the indirect positive effect due to consumers’ decreased
likelihood of consideration is likely to be smaller than the direct negative effect for comparison
sites. This explains why a comparison site’s entry probability decreases as the expected number
of other comparison sites ranked above increases, i.e., a differentiation effect. As for retailers,
whose advertising objective is usually a combination of brand reminding and conversion
enhancement, our results indicate that they also tend to assimilate with other retailers ranked
above, even though the magnitude of the assimilation effect is much smaller than that for
manufacturers.
We next discuss the strategic interaction between different types of advertisers. We find
that the estimates for the manufacturer-comparison pair and the comparison-manufacturer pair are
both negative (−0.264 and −0.233). This suggests a negative spillover in the sense that the expected
number of manufacturers (comparison sites) ranked above decreases the entry probability of the
comparison site (manufacturer). It further implies that manufacturers and comparison sites are
trying to differentiate from each other, likely because manufacturers often attract loyal customers
whereas comparison sites often attract switchers. Interestingly, the estimates for both the
manufacturer-retailer pair and the comparison-retailer pair are positive (0.206 and 0.308). This
suggests a positive spillover in the sense that the expected number of manufacturers and comparison
sites ranked above increases the entry probability of the retailer. One possible explanation is
that the retailers’ advertisements can play the role of call to action after consumers’ shopping
interest is raised by the information obtained from manufacturers’ and comparison sites’ search
advertisements.
Effect of Expected Number of Ads Ranked Below. In general, we find a positive effect of
advertisements ranked below on an advertiser’s entry probability, suggesting that given an
advertiser’s ranking, the larger the total number of paid-search ads showing for a keyword, the
higher the advertiser’s entry probability. We propose two possible explanations. First, more
search advertisements ranked below may increase the relative ranking of the focal advertiser,
which in turn increases the click-through rate of the advertisement. Second, a larger number of
below-ranked advertisements may signal a higher demand for the keyword market and
consequently increase the advertiser’s entry probability.
| Variable | Manufacturers | Retailers | Comparison Sites |
|---|---|---|---|
| Keyword attributes | | | |
| Length | −0.024 (.024) | 0.210 (.015) | −0.044 (.015) |
| Specific | −0.079 (.034) | −0.150 (.014) | 0.046 (.013) |
| Promo | 0.494 (.053) | 0.117 (.029) | 0.156 (.030) |
| ln(SV) | 0.055 (.008) | 0.045 (.004) | −0.016 (.005) |
| Match | 1.656 (.108) | N.A. | N.A. |
| State-dependence effects | | | |
| Elag | 0.607 (.082) | 0.341 (.047) | 1.374 (.046) |
| ln(Rlag+1) | −0.322 (.030) | −0.231 (.013) | −0.272 (.023) |
| Seasonal effects | | | |
| Nov | 0.543 (.056) | −0.439 (.032) | −0.040 (.031) |
| Dec | 0.542 (.063) | 0.119 (.038) | −0.063 (.033) |
| Jan | −0.036 (.078) | −1.017 (.040) | −1.203 (.044) |
| Feb | 0.996 (.078) | −0.305 (.036) | 0.067 (.037) |
| Mar | 0.183 (.076) | 1.239 (.047) | 0.629 (.040) |
| Apr | 2.020 (.078) | 1.054 (.039) | 0.803 (.042) |
| Competition effects of ads above | | | |
| $\ln(N^a(M)+1)$ | 0.752 (.052) | 0.206 (.035) | −0.233 (.031) |
| $\ln(N^a(R)+1)$ | −0.032 (.079) | 0.141 (.048) | 0.028 (.054) |
| $\ln(N^a(C)+1)$ | −0.264 (.089) | 0.308 (.056) | −0.345 (.056) |
| Competition effects of ads below | | | |
| $\ln(N^b+1)$ | 1.761 (.061) | 1.216 (.044) | 2.192 (.054) |
| Intercept and advertiser fixed effects | | | |
| Intercept | −1.833 (.121) | | |
| Avg fixed effect | −1.144 | −1.622 | −1.483 |

Table 15. Estimation Results of Keyword Entry Model
2.6.3 Keyword Consideration
We have two main findings on how advertisers use competition information to update their
keyword dictionaries or keyword consideration sets. First, both manufacturers and retailers are
more likely to consider keywords that have been used by other advertisers of the same type
and less likely to consider keywords that have been used by advertisers of different types in
the past. Second, comparison sites are more likely to consider keywords that have been
used by other comparison sites and retailers in the past.
We offer explanations for these two findings. Since manufacturers and retailers are
intimately involved in producing or distributing the product, they are often more knowledgeable
than comparison sites about the appropriate keywords for their business. Therefore,
manufacturers and retailers are likely to be at a more advanced stage in optimizing the keyword
dictionary by improving the relevance of keywords, and comparison sites are more likely to still
be at a discovery stage in expanding the keyword dictionary. This may explain why
manufacturers and retailers are inclined to consider those keywords that were exclusively
popular among same-type advertisers but not others while comparison sites expand the list of
keywords based on past keyword choices by other comparison sites and retailers.
Finally, our analysis suggests that keyword loyalty is a significant factor in keyword
consideration for both manufacturers and retailers. We also find that manufacturers and retailers
are likely to consider terms that are similar to their previous keyword portfolio. Putting these
findings together, we see that comparison sites are more likely to alternate the emphasis of their
keyword consideration strategies, whereas manufacturers and retailers show more persistence
and inertia. This is also consistent with our hypothesis that manufacturers and retailers are in the
“maintaining” mode whereas comparison sites are in the “discovering” mode.
| Variable | Manufacturers | Retailers | Comparison Sites |
|---|---|---|---|
| Keyword similarity | | | |
| Length_dist | −0.062 (.018) | −0.038 (.014) | −0.033 (.027) |
| Specific_dist | −0.070 (.025) | −0.259 (.018) | 0.132 (.035) |
| Promo_dist | −0.488 (.041) | −0.149 (.035) | 0.585 (.114) |
| Keyword loyalty | | | |
| Elag | 2.084 (.048) | 1.360 (.039) | 0.208 (.124) |
| Keyword popularity | | | |
| $\ln(Nlag(M)+1)$ | 0.400 (.028) | −0.121 (.030) | 0.043 (.044) |
| $\ln(Nlag(R)+1)$ | −0.305 (.032) | 0.206 (.022) | 0.233 (.043) |
| $\ln(Nlag(C)+1)$ | −0.074 (.029) | −0.052 (.024) | 0.185 (.058) |
| Intercept and advertiser fixed effect | | | |
| Intercept | −0.163 (.063) | | |
| Avg fixed effect | 0.539 | 0.658 | 0.864 |

Table 16. Estimation Results of Keyword Consideration Model
2.6.4 Counterfactual Simulations
The structural nature of our model can help search engines evaluate the revenue implications of
third-party keyword infomediaries. We consider a counterfactual scenario in which advertisers
are no longer able to acquire detailed competition information for each keyword. This can
happen if search engines such as Google start to restrict the commercial use of competitive
intelligence by third-party infomediaries. We compare the search engine’s expected revenue
under two scenarios: (i) When infomediaries do not exist, advertisers only know the average
number of each type of competitors across keywords; and (ii) When infomediaries do exist, they
provide detailed competition information for each keyword. For each scenario, we simulate
advertisers’ keyword choices in the last month of our data period by constructing advertisers’
keyword considerations and then solving the simultaneous-move game of keyword entry. We
then use the average click-volume and CPC under the new equilibrium to compute the search
engine’s advertising revenue. To alleviate the possibility of multiple equilibria of the entry game,
we randomize the starting value for the vector of entry probability in each keyword during each
of our 1,000 simulations.
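The role of randomized starting values can be illustrated with a deliberately simplified symmetric game. The sketch below is not the counterfactual code used in this study: it assumes identical players, so that a single entry probability $p$ solves $p = \Phi(b + \gamma m p)$ with $m$ rivals, and it collects the limits reached by fixed-point iteration from many random starts. The function names and parameter values are hypothetical.

```python
import math
import random

def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def solve_symmetric(base, gamma, n_rivals, start, n_iter=300):
    """Iterate p <- Phi(base + gamma * n_rivals * p) from a given start."""
    p = start
    for _ in range(n_iter):
        p = norm_cdf(base + gamma * n_rivals * p)
    return p

def find_equilibria(base, gamma, n_rivals, n_starts=200, seed=0):
    """Collect the distinct limits reached from random starting values."""
    rng = random.Random(seed)
    found = set()
    for _ in range(n_starts):
        limit = solve_symmetric(base, gamma, n_rivals, rng.random())
        found.add(round(limit, 6))
    return sorted(found)

# Strong positive spillover: different starts reach different equilibria.
print(find_equilibria(base=-2.0, gamma=2.0, n_rivals=2))
# Negative spillover: the map is a contraction, so every start agrees.
print(find_equilibria(base=0.5, gamma=-0.5, n_rivals=2))
```

When the strategic effect is positive and strong, a single starting value would report only one of the stable equilibria, so averaging outcomes over randomized starts guards against conditioning the counterfactual on an arbitrary equilibrium.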
Table 17 reports findings from this counterfactual experiment. We find that as keyword
infomediaries provide more accurate competition information, all three types of advertisers are
likely to buy more keywords (with an increase ranging from 2.6% to 3.6%). This keyword
expansion effect is the largest for comparison sites, consistent with the notion that comparison
sites use infomediaries as a tool for keyword discovery. Our results also indicate that the search
engine will expect to increase its advertising revenue by 4.4% if the keyword-level competition
information is accessible to advertisers. Taken together, our findings suggest that infomediaries
create value for both sponsored search advertisers and the search engine.
| | Manufacturers | Retailers | Comparison Sites | Search Engine |
|---|---|---|---|---|
| Average number of ads per keyword | | | | |
| With average competition information | 1.96 | 3.06 | 1.68 | N.A. |
| With accurate competition information | 2.01 | 3.14 | 1.74 | N.A. |
| % change | 2.7% | 2.6% | 3.6% | N.A. |
| Average number of keywords purchased per advertiser | | | | |
| With average competition information | 306.28 | 319.19 | 262.98 | N.A. |
| With accurate competition information | 314.49 | 327.47 | 272.13 | N.A. |
| % change | 2.7% | 2.6% | 3.6% | N.A. |
| Search engine revenue per keyword per day | | | | |
| With average competition information | N.A. | N.A. | N.A. | $36.88 |
| With accurate competition information | N.A. | N.A. | N.A. | $38.51 |
| % change | N.A. | N.A. | N.A. | 4.4% |

Table 17. Counterfactual Results of Advertisers’ Keyword Choices and Search Engine Revenue
2.7 Conclusion
Although keyword management is an essential component in the practice of search engine
marketing, it has not received enough attention from marketing researchers. As keyword
infomediaries now provide detailed competition information to sponsored search advertisers,
investigating the effect of competition on sponsored search advertisers’ keyword choices has
important implications for both advertisers and the search engine in terms of fully understanding
the nature of the competition in a keyword market, better predicting the level of the competition
in an equilibrium, and assessing the optimal granularity of competition information to release.
In this study, we develop a structural model to study sponsored search advertisers’
keyword choice decisions. Our model is built upon the incomplete-information simultaneous
entry framework, with several novel extensions to handle the complexities that arise in the
sponsored search advertising context. We allow the strategic interactions to be advertiser-type-
specific and dependent on the advertisements’ relative positions. We also formally model the
construction of advertisers’ keyword dictionaries and allow them to vary with the observable
historical competition knowledge for the same keyword. Through modeling the consideration or
the set of potential entrants for a keyword, we capture the indirect effect of competition
information made available from the keyword infomediary on advertisers’ keyword entry
decisions. This is a useful new approach to integrate previous competition outcomes into the
empirical analysis of a static discrete game. Finally, we also make econometric developments to
adapt to the two-step estimation approach and to cope with the issue of keyword consideration
being unobservable to researchers.
On keyword market entry, we find a downward assimilation effect for all three types of
advertisers, suggesting that an advertiser’s entry probability increases with the expected number
of search advertisements ranked below. For the strategic interaction with advertisements ranked
above, we find both upward assimilation and upward differentiation depending on the advertiser
type combinations. In particular, retailers tend to upward assimilate with all three types of
advertisers, comparison sites tend to differentiate from other comparison sites and manufacturers,
and manufacturers tend to assimilate with other manufacturers but differentiate from comparison
sites. As for keyword consideration, we find that both manufacturers and retailers are more likely
to learn from other advertisers of the same type, while comparison sites are more likely to learn
from other comparison sites and retailers.
This study has limitations, which open doors for future research. First, jointly modeling
advertisers’ bidding and keyword choice decisions can provide a more comprehensive
understanding of the role of competition in search engine marketing. Unfortunately, we are
unable to examine this issue because of lack of data on individual entrant advertisers’ bids.
Second, while we focus on strategic interactions among advertisers for the same keyword,
spillover effects could exist among different keywords for the same advertiser. Recovering the
keyword consideration set is a first step in that direction. Third, our research can
be extended to account for strategic collusion among channel members such as
manufacturers and retailers. In that case, the expected advertising payoff would need to be modeled
more structurally by allowing advertisers’ payoff functions to depend on each other.
Finally, while this research has studied strategic interactions among advertisers on the same
search engine, future work could examine advertisers’ keyword competition across different
search platforms. Overall, we hope this study stimulates further interest in advertisers’ keyword
strategies, as sponsored search advertising continues to thrive and grow.
3. A Two-Sided Market Analysis of Behaviorally Targeted Display Advertising
3.1 Introduction
Behavioral targeting is a widely adopted technology in the online advertising industry, allowing
marketers to serve personalized advertisements to individuals based on their historical online
behaviors. According to eMarketer, behaviorally targeted advertising is already a multibillion-
dollar business, and it is projected that nearly one in five dollars spent on display advertising will
involve behavioral targeting by 2014.[11] Despite the increasing popularity of behavioral
targeting, advertising hosts are facing challenges in the user-ad match when implementing this
technology to serve online display advertisements. This paper provides an empirical framework
and analytics to help address these complexities.
The complexity arises in the ad-serving task of matching consumers with advertisements.
In most other forms of online advertising, the rules used for such a user-ad matching task are
straightforward. Examples include matching users with advertisements via keyword query as in
sponsored search advertising, via the most recently browsed product or website as in retargeted
display advertising, and via the content theme of the landing webpage as in contextual display
advertising. However, the matching task as required in behavioral targeting is much more
complicated because the advertising host needs to determine the targeting level, which is
operationalized by choosing how many and what interests inferred from the user’s past behavior
should be used in generating the set of candidate advertisements. In general, a high level of
targeting implies a more accurate inference on an individual’s current preference, leading to the
use of a smaller set of recently activated interests from the individual, and consequently a smaller
set of candidate advertisements to serve to the individual. Thus, the targeting level is an inverse
scale of the number of recent interests that are used to match the user with candidate
advertisements in behavioral targeting.
[11] http://www.emarketer.com/Article.aspx?R=1007489
The advancement in the user-tracking technology allows online advertising hosts to
access richer information on users’ behaviors and accordingly to infer a larger set of users’
interests. This urges advertising hosts to choose the optimal targeting level in the user-ad match
rather than rely on simple heuristics such as frequency- or recency-based thresholds. Since most
advertising hosts sell display advertising spaces via real-time auctions and charge advertisers for
each click, advertising hosts need to understand the impact of the targeting level on both users’
clicks and advertisers’ bids in order to improve the revenue.
Targeting level can affect user click-through rate through complex interactions of
multiple effects. On one hand, since a high level of targeting considers the individual consumer’s
most active interests inferred from his or her recent online behavior, it can better serve the user’s
current preferences and consequently increase the click-through rate. We term this the positive
relevance effect of a high targeting level. On the other hand, even with the improved
information relevance that comes from behavioral targeting, consumers may not react to
behaviorally targeted advertisements favorably, for two possible reasons. The first is users’ growing
concern about privacy. A national survey found that over 60% of American adults did
not like behaviorally targeted advertisements because of the possibility of privacy intrusion
(Turow et al. 2009). As consumers are exposed to more behavior-based ads when browsing
online websites, they are more likely to notice that they are being tracked and then become less
responsive to targeted ads. The second possible reason is the information satiation that results
from the lack of novelty in behaviorally targeted advertisements. McCann, an advertising
research firm, recently reported that in a global survey of 10,000 people, 57% of respondents
expressed their worry about the lack of new information conveyed by behaviorally targeted
ads.[12] This evidence suggests that there might be a negative resistance effect of the targeting
level on consumers’ clicks.
The impact of the targeting level on advertisers’ bids or payments is also ambiguous.
Intuitively, a high level of targeting helps advertisers reach a group of relevant users who are
generally more interested in the advertising content. These relevant users are more likely to
convert upon clicks and consequently raise the value-per-click, which can incentivize advertisers
to bid higher in auctions. We term this the positive valuation effect of the targeting level on bids.
On the other hand, a high targeting level often results in fewer advertisers participating in the
auction, and such reduced competition intensity may lower advertisers’ bidding incentives. This
lowers the cost-per-click (CPC), leading to a negative competition effect as the result of a high
targeting level.
Due to these complex relationships mentioned above, a comprehensive understanding of
the interactions between consumer reactions to behaviorally targeted advertisements and
advertisers’ bidding decisions is crucial to help the advertising host delicately balance several
opposing forces and choose the optimal targeting level when implementing behaviorally targeted
advertising. The primary objective of this research is therefore to develop an empirical modeling
framework to help advertising hosts improve the ad-serving mechanism and the effectiveness of
behaviorally targeted advertising.
We propose an integrated model for the behavioral targeting ecosystem, which includes
the click decision of consumers, the bid decision of advertisers, and the ad-serving rule adopted
by the advertising host. We apply the proposed model to a novel dataset obtained from a leading
Internet advertising platform that consists of the complete records of 620 consumers’ online
activities during six weeks and the corresponding advertisements displayed for each user’s
impression and advertisers’ bid amounts. We first obtain demand-side estimates by adopting a
Bayesian method to jointly estimate the consumer’s click model and the advertising host’s
ad-serving model, controlling for the potential sample selection problem; i.e., the set of displayed
ads was not randomly chosen. Then we infer the advertisers’ valuation per click-through from
the observed bids based on the assumption of advertisers’ profit maximization. The profit
function is simulated based on the behavioral targeting mechanism and demand-side estimates.
[12] http://blogs.wsj.com/cmo/2014/05/16/shoppers-know-brands-are-watching-them/
Our results show that consumers are more likely to click on ads from product categories
that have been recently and frequently clicked but not yet purchased. These advertisements also
show generally higher probability of being served by the advertising host, indicating a positive
relevance effect on click-through rate. Our results also provide evidence for the negative
resistance effect of the targeting level: we find that a consumer’s likelihood of engaging in clicks
decreases when the stock of the overlap between behaviorally targeted ads and his or her recent
behaviors increases. On the advertiser’s side, we find that advertisers’ bids are about 20% below
their true value-per-click. We do not find significant evidence of a positive valuation effect.
Since the competition effect results from the structural nature of the auction mechanism, we are
unable to empirically identify this effect without directly observing the advertisers’ valuation in
auctions. Based on these estimation results, we conduct a counterfactual experiment to
investigate the impact of the level of targeting on each entity’s profits. The findings indicate that
the average CPC decreases and the advertiser’s profit increases with the targeting level, and both
the click-through rate and the advertising host’s revenue are in an inverted-U relationship with
the targeting level, suggesting the existence of a “sweet spot” of the targeting level in serving
ads. In our empirical context, the advertising host’s revenue can be improved by at least 2% by
optimizing the targeting level.
This paper makes several contributions. To the best of our knowledge, we present the first
empirical analysis of the effects of ad-relevance attributes on click metrics in behaviorally
targeted online advertising. Also, our counterfactual experiment offers important managerial
implications to advertising hosts to effectively improve the targeting mechanism by finding the
optimal targeting level in matching consumers with advertisements. Finally, although we cast our
model in the context of online display advertising, our empirical framework and the proposed
counterfactuals can be readily applied to other interactive, information-rich media that employ
behavioral targeting such as the rapidly growing market for mobile advertising and mobile
applications.
What follows is a discussion of the related literature, a description of the empirical
context and data, and an introduction of the model and estimation strategies. We then present our
empirical findings, and discuss the counterfactual analyses before concluding with several future
directions for related research.
3.2 Related Literature
Behavioral targeting is a relatively new research topic. Some early work used field experiments
to study the effectiveness of behaviorally targeted advertisements (e.g., Chen et al. 2009, Farahat
and Bailey 2012, Yan et al. 2009). For example, Farahat and Bailey (2012) showed that
behavioral targeting leads to an 89% increase in click-through rates. Recently, a few papers from
the economics literature analytically investigated the welfare implications of behavioral targeting
and found that behavioral targeting can affect advertisers’ and publishers’ revenue favorably and
unfavorably, depending on market conditions such as consumer heterogeneity, competition
intensity, and targeting capability (Chen and Stallaert 2014, Jiang and Kumar 2014). These
studies provided important insights to our modeling framework of consumers’ and advertisers’
behaviors. Our paper complements this line of work by focusing on how to improve the
behavioral targeting mechanism from the advertising host’s perspective.
Several papers in the behavioral targeting literature have also discussed users’ privacy
concerns from different perspectives such as the impact of privacy regulation on advertising
effectiveness (Goldfarb and Tucker 2011b), the financial impact of a privacy policy in the online
advertising industry (Johnson 2013), and the alleviation of consumers’ privacy concerns by
providing users with personal-information controls (Tucker 2014). In contrast to this strand of
research, our paper does not attempt to address the privacy issue in behavioral targeting.
Nevertheless, the negative resistance effect on users’ clicks found in our study echoes the
concerns about personal privacy claimed in previous literature.
Another relevant paper is Lambrecht and Tucker (2013), which studied the specificity of
consumers’ interests used for retargeted online advertising. Their empirical results suggested that
retargeted ads are more effective when delivering generic products from the same category than
showing consumers the exact product that they have viewed before. Our work differs from
Lambrecht and Tucker (2013) in two major ways. On the advertiser’s side, our context is
behavioral targeting in which multiple advertisers compete in auctions for each impression,
whereas there is no competition in retargeted advertising because only the most recently visited
advertiser is exposed. On the consumer’s side, unlike retargeting where the match between
consumers and advertisements only hinges on the latest interest learned from a consumer’s past
behavior, behavioral targeting involves multiple interests from a consumer. Therefore, our
research provides new implications to online advertising hosts by shedding light on the selection
of consumer interests in developing the ad-serving matching mechanism, given the information
specificity set at the product category level.
Our paper also contributes to the broader literature on online advertising. Previous
research has examined the effect of advertisement attributes on consumers’ responses in different
contexts including non-targeted banner ads (Chatterjee et al. 2003, Manchanda et al. 2006),
sponsored search advertising (Ghose and Yang 2009, Rutz and Bucklin 2011), and retargeted
display advertising (Lambrecht and Tucker 2013). However, to the best of our knowledge, no published
research has examined consumers’ reactions to more broadly defined behaviorally targeted
advertisements. We extend this strand of literature by providing the first empirical investigation
into how various associations between advertisements and consumers’ past behaviors influence
click-through rates in the behavioral targeting context.
We view behaviorally targeted advertising as a two-sided market wherein the behaviors
of consumers and advertisers are interdependent on the platform facilitated by the advertising
host (Rochet and Tirole 2006). The two-sided model has been demonstrated to be a powerful
framework to investigate the interplay between consumers and advertisers in both the traditional
television advertising market (Wilbur 2008) and the online search advertising market (Yang et al.
2014, Yao and Mela 2011). We add to this line of literature by applying the two-sided model to
the online display advertising market with the behavioral targeting technology.
3.3 Empirical Context
We acquired a proprietary dataset from a leading online market maker outside the U.S. that hosts
behaviorally targeted display advertising. The advertising host relies on both consumers’ account
names and IP addresses to track each individual’s past behaviors, and it builds profiles for each
consumer’s shopping interests. Advertisers on this platform are mainly small- and medium-sized
merchants in various product categories who are seeking to reach potential customers through
behaviorally targeted display advertisements. For each request of ad impression, the advertising
host launches an auction to choose advertisements for display from a set of potential advertisers
who match the consumer’s interest profile. Each selected advertiser or winner then pays a
quality-adjusted second highest bid for a click on the advertisement. The full dataset consists of
three parts: (i) a panel dataset of 620 consumers’ click responses to behaviorally targeted
advertisements; (ii) daily records of consumers’ tracked behaviors aggregated to 1,029 product
categories; (iii) records of daily bids and quality metrics of 56,512 advertisers.
3.3.1 Data on Consumers’ Advertisement Responses
The advertising host provides the log of all 38,615 impressions of behaviorally targeted
advertisements viewed by a sample of 620 consumers during the first two weeks of November
2013.[13] These advertisements cover all advertising spaces that use behavioral targeting
technology at partner websites across the advertising host’s network. For each ad impression, the
advertising host launches an auction to display eight advertisements to consumers
simultaneously shown in a 2×4 matrix. Each data observation indicates a click decision made by
the consumer. Therefore, our data include 308,920 observations of ad-clicking behaviors. On
average, the click-through rate is 3.52% and each consumer views 35.59 ads and makes 1.25
clicks per day. We use the number of daily impressions as an indicator of the consumer’s
familiarity with the advertising host. A consumer with more frequent visits per day is regarded as
a more experienced user.
[13] We use the data from Nov 1 to 9 and Nov 13 to 17. We exclude the three-day window from Nov 10 to Nov 12
because Nov 11 is a Chinese e-commerce holiday and consumers might spend more time on the Internet around this
holiday. This leaves us with 14 days of observations used in model estimation.
Our data also include the product price conveyed in each ad copy. The 56,512 unique ads
are classified into 1,029 product categories, which range widely from conventional
categories such as clothes, personal care products, and electronics to more specialized categories
such as chemical ingredients and antiques. These 1,029 product categories can be further
classified into 14 top-line categories, which are listed in Table 18. To ensure the price
comparability of items from different product categories, we define the relative price Rprice of
an item as the log of its listed price minus the mean log price of all items belonging to
that category. Table 19 reports the overall summary statistics for advertisements and consumers.
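The relative price definition can be illustrated with a minimal sketch. The function name, the input layout (a list of category and price pairs), and the example values are hypothetical and are not taken from the dataset.

```python
import math
from collections import defaultdict

def relative_log_prices(items):
    """items: list of (category, listed_price) pairs.
    Returns Rprice for each item: log price minus the mean log price of its category."""
    logs_by_cat = defaultdict(list)
    for cat, price in items:
        logs_by_cat[cat].append(math.log(price))
    # Mean log price within each product category.
    mean_log = {cat: sum(v) / len(v) for cat, v in logs_by_cat.items()}
    return [math.log(price) - mean_log[cat] for cat, price in items]

# Hypothetical ads: two items in one category, one item in another.
ads = [("shoes", 100.0), ("shoes", 400.0), ("antiques", 5000.0)]
rprice = relative_log_prices(ads)
```

Under this definition, an item priced at its category's geometric mean has a relative price of zero, so items from cheap and expensive categories become comparable.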
Top-Line Categories         % of Impressions   % of Categories   % of Ads
Women’s Clothing                       56.17              3.50      37.92
Men’s Clothing                          4.73              1.36       4.74
Health & Beauty                         1.01              4.08       1.81
Appliances                              2.93             13.41       5.71
Home Improvement                        3.87             18.46       7.49
Food & Grocery                          1.64              9.62       3.14
Kids & Baby                             6.40             10.59       8.70
Underwear & Accessories                 4.29              3.98       6.55
Household Product                       0.78              7.00       1.58
Electronics                             1.70              7.97       3.38
Entertainment                           0.44              4.28       1.01
Shoes & Luggage                        11.36              0.78       9.51
Sports & Outdoor                        3.24             11.69       4.54
Others                                  1.46              4.28       3.93
Table 18. Shares of Impressions, Categories, and Advertisements across Top-Line Categories
Variable                                            Mean   Std. dev.     Min      Max
Activities
  Click                                            0.035        0.18       0        1
  Bid                                               2.67        1.87    0.06       69
Advertisement characteristics
 Ad-specific
  ln(Price)                                         6.00        1.26   -4.61    18.30
  Click score                                      72.55       34.29       2      150
 Category-specific
  Freq_view                                       210.73      358.58       0     6307
  Freq_click                                        5.52        8.27       0      162
  Freq_buy                                         0.003       0.056       0        2
  Rcc_view                                          3.93        8.08       1       43
  Rcc_click                                         2.49        2.81       1       14
  Rcc_buy                                          36.45        4.45       1       43
  No. of displayed ads from the same cat (n)        4.04        2.10       1        8
  No. of candidate ads from the same cat (N)      125.98      122.24       1      528
 Impression-specific
  Similarity                                        0.35        0.24       0        1
  No. of displayed categories                       2.87        1.01       1        8
  No. of candidate categories                      20.30        7.90       2       59
Consumer characteristics
  No. of daily impressions (Z)                      4.45        2.01    0.93     8.93
Table 19. Summary Statistics of Consumer and Advertisement Characteristics
3.3.2 Data on Consumers’ Tracked Behaviors
We observe the number of daily views, clicks, and purchases on each product category for each
of the 620 consumers during the six weeks from October 4 to November 17, 2013. Our data
represent the full trajectory of consumers’ online behaviors within the advertising host’s entire
network, which reflect user behavior not only on those behaviorally targeted advertisements but
also on other contents such as organic product listings. The advertising host creates an interest
profile for each consumer by including all product categories that have been clicked by the
consumer during the previous two weeks. Hence, we have sufficiently long data to reconstruct
consumers’ interest profiles at the time of their ad impressions. Table 19 shows that on average,
the number of product categories of interest to a consumer per impression is about 20, of which
almost 3 categories are displayed during an impression.
We create several advertisement characteristics to measure the relevance between the
advertised product and the consumers. For each advertised item, we measure the frequency and
the recency of three types of consumer actions on the associated product category: viewing,
clicking, and purchasing. The frequency is aggregated across incidences in the previous two
weeks to be consistent with industry practice. We also create a variable Similarity (sim) to
gauge the overall proximity between a consumer’s recent online behavior and the exposed
behaviorally targeted ads. For each ad impression, we define similarity as the Jaccard similarity
index between the set of product categories ($A$) of eight displayed ads and the set of product
categories ($B$) that a consumer clicked on the day before (i.e., similarity $= |A \cap B| / |A \cup B|$).[14]
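As a minimal sketch of the similarity measure, the following computes the Jaccard index between the categories of the displayed ads and the categories clicked on the previous day. The category labels are hypothetical, and returning zero when both sets are empty is a convention assumed here, not one stated in the text.

```python
def jaccard(displayed_categories, clicked_categories):
    """Jaccard index |A ∩ B| / |A ∪ B| between two collections of category labels."""
    a, b = set(displayed_categories), set(clicked_categories)
    union = a | b
    if not union:
        return 0.0  # assumed convention for two empty sets
    return len(a & b) / len(union)

# Hypothetical impression: eight displayed ads drawn from three categories.
displayed = ["dresses"] * 4 + ["boots"] * 2 + ["handbags"] * 2
clicked_yesterday = {"dresses", "boots", "tents", "lamps"}
sim = jaccard(displayed, clicked_yesterday)
```

In this example the two sets share two categories out of five distinct ones, so the similarity is 0.4.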
3.3.3 Data on Advertisers’ Bids
The platform’s ad allocation mechanism consists of two steps. The first step is to determine the
number of advertisements from each relevant product category for display, based on the
individual’s interest score and the total number of competing advertisers in each product
category. In the second step, given the ad quota for each product category, the winning
advertisements within a category are selected based on advertisers’ submitted bids weighted by
quality scores. Finally, the eight winning advertisements are pooled together and displayed to the
consumer. Each advertiser pays a CPC to the advertising host as determined in a generalized
second-price auction (Edelman et al. 2007, Varian 2007), i.e., the weighted bid of the advertiser
ranked immediately below divided by his or her own quality score.
[14] The Jaccard index is a heuristic similarity metric commonly used in statistics and computer science, which
measures the relative overlap between two sets of elements.
We observe advertisers’ daily bids during the sample period, which vary from 0.06 to 69
with a mean of 2.67. Because of confidentiality concerns, our data provider did not directly
disclose advertisers’ quality scores to us. Instead, we were offered a variable Click Score (CS),
which represents the historical click performance of an ad and is a major ingredient of the quality
score. As the advertising host updates the quality scores on a daily basis, we assume that most
advertisers review their bid decisions daily.
3.4 Model
This section presents an integrated model of the consumers’ ad clicking, the advertising host’s ad
serving, and the advertisers’ bidding decisions.
3.4.1 Modeling Consumers’ Click Decisions
After being exposed to behaviorally targeted ads, consumers decide whether to click. We model
consumers’ click decisions as a two-step process illustrated in Figure 3. First, consumers decide
whether to consider clicking on these ads. Contingent on engaging in a click mode, consumers
then choose whether to actually click on each of eight displayed ads.
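Figure 3. Consumer Decision Process
[Figure: decision tree. On encountering a set of display ads, the consumer either does not consider
clicking or considers clicking; in the latter case, the consumer decides for each of Ad1 through Ad8
whether to click.]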
We model consumers’ click-consideration decisions by characterizing the consideration
intensity $C_{it}$ of consumer i at the incidence of impression t. This approach is consistent with the
model of consumers’ consideration sets in previous literature on choice models (e.g.,
Bronnenberg and Vanhonacker 1996, van Nierop et al. 2010). Let $c_{it}$ indicate a consideration
decision, which equals one if $C_{it} > 0$ and zero otherwise. To describe whether a consumer
considers clicking on ads, we assume the following:
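$$C_{it} = \alpha_{i0} + \alpha_{i1} AS_{it} + \alpha_{i2} AS_{it}^{2} + \alpha_{i3} SS_{it} + \alpha_{i4} SS_{it}^{2} + \alpha_{i5} \overline{Rprice}_{it} + \epsilon_{it}^{C} \tag{1}$$
$$AS_{it} = \sum_{h \le t} \rho_i^{\,d(t)-d(h)} a_{ih} \tag{2}$$
$$SS_{it} = \sum_{h \le t} \rho_i^{\,d(t)-d(h)} sim_{ih} \tag{3}$$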
We allow the consideration intensity $C_{it}$ to be dependent on three sets of variables at the
impression level: advertising stock $AS_{it}$, similarity stock $SS_{it}$, and the average relative price of the eight
displayed products $\overline{Rprice}_{it}$. Intuitively, a consumer’s response to behaviorally targeted ads
provided by a particular advertising host is likely to be affected by his or her cumulative
encounters with the same type of ads and the overall relevance of ad contents, which are
represented by $AS_{it}$ and $SS_{it}$, respectively. We follow the classic Guadagni and Little (1983) to
define the stock variables in an exponentially decaying fashion in Equations 2 and 3, where $a_{it}$ is
the indicator of incidence of ad impression and $sim_{it}$ measures the proximity of the set of
display ads in impression t and consumer i’s activity on the previous day. The parameter $\rho_i$
reflects the daily carryover effect of advertising and similarity stock for consumer i and $d(t)$ is
the date function for impression t. We reparameterize the carryover parameter in Equation 4 to
ensure $\rho_i \in (0,1)$.
$$\rho_i = \frac{\exp(\delta_i)}{1+\exp(\delta_i)} \tag{4}$$
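The stock variables and the carryover reparameterization can be sketched as follows. This is a minimal illustration of Equations 2 to 4 for one consumer; the function names and the example impression history are hypothetical, and the sketch assumes impressions are ordered by date.

```python
import math

def carryover(delta):
    """Logistic transform of Equation 4: maps any real delta into (0, 1)."""
    return math.exp(delta) / (1.0 + math.exp(delta))

def stock(days, values, rho):
    """Exponentially decayed stock at each impression t:
    sum over h <= t of rho ** (d(t) - d(h)) * values[h].
    days[t] is the date d(t) of impression t (assumed non-decreasing)."""
    out = []
    for t in range(len(days)):
        out.append(sum(rho ** (days[t] - days[h]) * values[h] for h in range(t + 1)))
    return out

rho = carryover(0.0)            # delta = 0 gives rho = 0.5
days = [1, 1, 2, 4]             # hypothetical dates of four impressions
a = [1, 1, 1, 1]                # impression indicators
sim = [0.2, 0.4, 0.0, 1.0]      # hypothetical similarity per impression
ad_stock = stock(days, a, rho)      # AS_it
sim_stock = stock(days, sim, rho)   # SS_it
```

Impressions on the same day carry full weight, while older impressions are discounted by the carryover parameter once per elapsed day.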
Finally, we employ a standard hierarchical formulation to capture consumers’
heterogeneity in Equations 5 and 6, where $Z_i$ includes an intercept and the consumer’s average
daily impressions. We also assume the measurement error $\epsilon_{it}^{C}$ in Equation 1 to be normally
distributed and normalize its variance to one for identification purposes.
$$\alpha_i = \Delta^{C} Z_i + \mu_i^{C}, \quad \mu_i^{C} \sim MVN(0, \Sigma_{C}) \tag{5}$$
$$\delta_i = \Delta^{D} Z_i + \mu_i^{D}, \quad \mu_i^{D} \sim N(0, \sigma_{D}^{2}) \tag{6}$$
Conditioned on the trigger of the consumer’s clicking mode (i.e., $c_{it} = 1$), we use a binary
probit model to characterize consumers’ click decisions. Let $U_{ikt}$ denote consumer i’s utility
of clicking on ad k at impression t and let $q_{ikt}$ denote a click decision. We normalize the no-click
utility to zero so that $q_{ikt} = 1$ iff $U_{ikt} > 0$. We describe the clicking utility as follows:
$$U_{ikt} = \beta_{i0} + \beta_{i1} \ln(Freq_{igt} + 1) + \beta_{i2} \ln(Rcc_{igt}) + \beta_{i3} Rprice_{k} + \beta_{i4} n_{igt} + f_m^{U} + \xi_k^{U} + \eta_{ik}^{U} + \epsilon_{ikt}^{U} \tag{7}$$
$$\eta_{ik}^{U} = \eta_{im}^{U}, \quad \forall k \in m \tag{8}$$
$$\beta_i = \Delta^{U} Z_i + \mu_i^{U}, \quad \mu_i^{U} \sim MVN(0, \Sigma_{U}) \tag{9}$$
where $Freq_{igt}$ and $Rcc_{igt}$ stand for the frequency and the recency of consumer i’s viewing,
clicking, and purchasing behaviors in category $g$ associated with ad k.[15] Here the variable
$Rprice$ refers to the relative price defined in Section 3.3. We include the number of displayed ads
from the same category, denoted by $n_{igt}$, to capture the potential substitutive effect on the
consumer’s click probability. We also include dummy variables for 13 top-line categories $f_m^{U}$ to
control for additional observed heterogeneity, where the top-line category “Women’s Clothing” is
the benchmark.
[15] We do not take the log of the frequency of purchases because it only varies from 0 to 2. We also apply the same rule
to Equation 14 below.
We decompose the error term in the consumer’s clicking utility into three parts: $\xi_k^{U}$ stands for
ad-specific unobserved heterogeneity such as the quality of advertisement design; $\eta_{ik}^{U}$ stands for
the underlying click preference of user i towards ad k that is not observed by researchers; and
$\epsilon_{ikt}^{U}$ is an individual-specific idiosyncratic preference shock that follows i.i.d. N(0, 1). The
distribution assumptions for $\xi_k^{U}$ and $\eta_{ik}^{U}$ will be specified later. Because of the large number of
unique consumer-ad matching pairs, we do not have enough observations to identify consumer-
and ad-specific unobserved heterogeneity. Therefore, we make a simplifying assumption in
Equation 8 so that a consumer’s unobserved click preference is the same for advertised items
from the same top-line product category m. Finally, we capture consumers’ heterogeneity in click
preferences in a similar way as in the consideration model.
One econometric challenge for estimating the click model is the potential sample
selection problem embedded in the behavioral targeting mechanism: the set of advertised
products displayed in an impression, denoted by $D_{it}$, is deliberately selected by the advertising
host in order to match a consumer’s product interests. This implies that conditional on $D_{it}$, the
expectation of the consumer’s unobserved preference for displayed categories and products is
likely to be non-zero, which may violate the standard exogeneity assumption in regression
models. To correct for this potential sample selection bias, we need to explicitly model the ad-
serving algorithm used by the advertising host.
3.4.2 Modeling the Advertising Host’s Rule of Ad Serving
We model the rule of ad serving for two reasons. Recall that one main objective of this research
is to examine the revenue implication of the change in ad-serving algorithm for the advertising
host. This requires us to understand the current mechanism for ad delivery. Also, modeling the
rule of ad serving can help us overcome the potential sample selection issue mentioned above.
We here describe the ad-serving algorithm following industry practice. For each impression,
the advertising host determines the set of ads for display in three steps:
• Matching: Select a consumer’s shopping interests based on the individual’s historical
behavior, which determines the targeting level.
• Allocating: Allocate an ad quota to the set of matched product categories based on the
category-specific weighted interest score; that is, determine how many advertisements
from each relevant product category will be displayed to the individual.
• Ranking: Given the ad quota per category, select winning advertisements from each
product category based on the ad-specific weighted bid.
The advertising host employs a deterministic rule in the Matching procedure: any product
category that has been clicked by the consumer during the past two weeks is considered a
category of interest. Formally, the set of product categories that consumer i is interested in at
time t is defined as $Pot_{it} = \{ g \mid Freq_{igt}(click) > 0 \}$.
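The Matching rule can be illustrated with a minimal sketch. The click-log layout, the function name, and the choice to count only the 14 days strictly before the current day are assumptions made for illustration; the text specifies only that categories clicked during the past two weeks enter the interest profile.

```python
def interest_profile(click_log, today, window_days=14):
    """click_log: list of (day, category) click records for one consumer.
    Returns the set of categories clicked within window_days days before `today`
    (the current day itself is excluded, which is an assumption of this sketch)."""
    return {cat for day, cat in click_log if 0 < today - day <= window_days}

# Hypothetical click history, with days indexed as integers.
clicks = [(1, "tents"), (20, "boots"), (30, "dresses"), (31, "boots")]
pot = interest_profile(clicks, today=32)
```

Here the click on day 1 falls outside the two-week window, so only the two recently clicked categories remain in the profile.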
Given the set of categories of potential interest, the advertising host determines a quota of
advertisements for each product category, denoted by $n_{igt}$, in the Allocating step. The principle of
the algorithm for ad quota is to reward a product category with a higher purchase likelihood from
the consumer and a larger pool of candidate advertisements by displaying more advertised items
from that product category. Specifically, the advertising host first assigns an Interest Score,
denoted by $S_{igt}$, to each product category based on the consumer’s predicted purchase likelihood
of any item in that category in the next few days. After that, the vector of ad quota $\{n_{igt}\}$ is
determined by a multinomial distribution so that on average, the number of displayed
advertisements for each category is proportional to the interest score weighted by the number of
competing advertisements in that category. Following these rules, we describe the model of ad
quota below.
$$n_{igt} \sim Multi(n = 8,\ p = \{p_{igt}\}) \tag{10}$$
$$p_{igt} = \frac{WS_{igt}}{\sum_{h \in Pot_{it}} WS_{iht}} \tag{11}$$
$$WS_{igt} = S_{igt} (N_{gt})^{\gamma_0} \tag{12}$$
where $WS_{igt}$ stands for the weighted score and $N_{gt}$ represents the number of competing
advertisements in a product category, which is operationalized as the total number of advertisers
who belong to category $g$ and submit bids at time t. Taking the log of both sides of Equation 12,
we have:
$$\ln(WS_{igt}) = \ln(S_{igt}) + \gamma_0 \ln(N_{gt}) \tag{13}$$
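The Allocating step in Equations 10 to 12 can be sketched as follows. The interest scores, the numbers of competing advertisers, and the value of $\gamma_0$ below are hypothetical inputs chosen for illustration; in the model, the interest score is not observed and $\gamma_0$ is estimated.

```python
import random

def quota_probabilities(interest_scores, n_competing, gamma0):
    """Equations 11 and 12: p_g is proportional to WS_g = S_g * N_g ** gamma0."""
    ws = {g: interest_scores[g] * n_competing[g] ** gamma0 for g in interest_scores}
    total = sum(ws.values())
    return {g: w / total for g, w in ws.items()}

def draw_quota(probs, slots=8, rng=None):
    """Equation 10: one multinomial draw that allocates `slots` ads across categories."""
    rng = rng or random.Random()
    cats = list(probs)
    draws = rng.choices(cats, weights=[probs[g] for g in cats], k=slots)
    return {g: draws.count(g) for g in cats}

# Hypothetical consumer with three categories of interest.
scores = {"dresses": 2.0, "boots": 1.0, "tents": 1.0}
competing = {"dresses": 100, "boots": 400, "tents": 100}
p = quota_probabilities(scores, competing, gamma0=0.5)
quota = draw_quota(p, rng=random.Random(0))
```

With these inputs, a category with half the interest score but four times as many competing advertisers receives the same allocation probability, which shows how $\gamma_0$ trades off interest against the size of the candidate pool.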
The exact formula of the interest score $S_{igt}$ is not disclosed by the advertising host.
Nevertheless, we were informed of several core factors for the calibration of the interest
score. For example, the advertising host only uses a consumer’s active behavior associated with
the product category to compute the interest score. Based on this knowledge, we model the
interest score as follows:
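$$\ln(S_{igt}) = \gamma_1 \ln(Freq_{igt}^{S} + 1) + \gamma_2 \ln(Rcc_{igt}^{S}) + \gamma_3 \ln(Price_g) + f_m^{S} + \xi_g^{S} + \eta_{ig}^{S} + \epsilon_{igt}^{S} \tag{14}$$
$$\eta_{ig}^{S} = \eta_{im}^{S}, \quad \forall g \in m \tag{15}$$
$$\xi_g^{S} \sim N(0, \sigma_{S}^{2}) \tag{16}$$
$$(\eta_{im}^{U}, \eta_{im}^{S}) \sim MVN(0, \Sigma_{\eta}) \tag{17}$$
$$\epsilon_{igt}^{S} \sim N(0, \omega_{S}^{2}) \tag{18}$$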
where $Freq_{igt}^{S}$ and $Rcc_{igt}^{S}$ represent the frequency and the recency of consumer i’s clicks and
purchases, $Price_g$ stands for the average price of all items in category $g$, $f_m^{S}$ is the dummy for
top-line category m, $\xi_g^{S}$ stands for category-specific unobserved heterogeneity that is assumed to
be normally distributed, $\eta_{ig}^{S}$ stands for the advertising host’s knowledge about consumer i’s
interest in category $g$ that is unobserved by researchers, and $\epsilon_{igt}^{S}$ is an i.i.d. normally distributed
measurement error with zero mean and variance $\omega_{S}^{2}$. For a reason similar to that given for the
consumers’ click model, we assume $\eta_{ig}^{S}$ to be the same for all categories from the same
top-line category in Equation 15.
To control for the potential sample selection bias as previously mentioned, we assume in
Equation 17 that the consumer’s unobserved click preference of a top-line category is potentially
correlated with the corresponding unobserved component in the interest score determined by the
advertising host. Note that the number of competing ads in a product category $N_{gt}$ serves the role
of exclusion restriction because it directly affects the likelihood for advertisements from each
product category to be served but it does not influence the consumer’s click propensity.
The final task in the ad-serving algorithm is to select the winning $n_{igt}$ advertisements
from $N_{gt}$ competing ones in each product category $g \in Pot_{it}$ and determine each advertiser’s
CPC in the Ranking procedure. The advertising host adopts a modified generalized second-price
auction. For each product category $g$ with a positive ad quota, all competing advertisements
$\{k \mid k \in g\}$ are ranked by their submitted bids, denoted by $b_{kt}$, weighted by the quality score $QS_{kt}$
and a stochastic component $\epsilon_{ikt}^{B}$. The daily quality score is observed by advertisers but not by
researchers, and the stochastic element is inserted by the advertising host to ensure a positive
probability for each advertisement to be exposed. Let $WB_{ikt}$ denote the weighted bid, and we
have:
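$$WB_{ikt} = b_{kt}\, QS_{kt} \exp(\epsilon_{ikt}^{B}) \tag{19}$$
$$\ln(QS_{kt}) = \phi_1 \ln(CS_{kt}) + \phi_2 Rprice_{k} + f_m^{B} + \xi_k^{B} \tag{20}$$
$$(\xi_k^{U}, \xi_k^{B}) \sim MVN(0, \Sigma_{\xi}) \tag{21}$$
$$\epsilon_{ikt}^{B} \sim N(0, \omega_{B}^{2}) \tag{22}$$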
We model the quality score of an advertisement as a function of the click score, the
relative price of the advertised item, the dummies of top-line categories, and the unobserved ad
characteristics denoted by $\xi_k^{B}$. For example, $\xi_k^{B}$ may include the landing page quality of the
advertised item, which is not observed in our data. To further alleviate the endogeneity concern
that arises from the selection of ads, we assume $\xi_k^{B}$ to be potentially correlated with the
unobserved ad characteristics $\xi_k^{U}$ in the consumer’s click utility.
After choosing the winning $n_{igt}$ advertisements from each category $g \in Pot_{it}$ based on
$WB_{ikt}$, the advertising host pools these advertisements together and exposes them to the
consumer simultaneously. All these chosen advertisements from different categories are sorted
again based on their weighted bids. Each advertiser is then charged a payment per click, which
equals the below-ranked advertiser’s weighted bid divided by his or her quality score, that is,
$CPC_{ikt} = WB_{i'kt} / QS_{kt}$, where $i'$ refers to the advertiser ranked right below advertiser $i$.
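The ranking and pricing rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the advertising host's implementation: the function name `rank_and_price`, its argument names, and the treatment of the lowest-ranked winner (priced against the best losing ad, or at zero when no ad ranks below it) are hypothetical choices made for the example. One consumer and one day are held fixed, and the weighted bid follows Equation 19.

```python
import numpy as np

def rank_and_price(bids, quality, n_slots, omega_b=0.0, rng=None):
    """Rank ads by weighted bid (Eq. 19) and charge a second-price style CPC.

    bids, quality : per-ad bid b_kt and quality score QS_kt
    n_slots       : number of ads to display (the ad quota)
    omega_b       : std. dev. of the host's stochastic component (Eq. 22)
    """
    rng = np.random.default_rng() if rng is None else rng
    bids = np.asarray(bids, dtype=float)
    quality = np.asarray(quality, dtype=float)

    # Weighted bid: bid times quality score times exp(random shock).
    eps = rng.normal(0.0, omega_b, size=bids.shape)
    weighted = bids * quality * np.exp(eps)

    order = np.argsort(-weighted)          # ad indices, best weighted bid first
    winners = order[:n_slots]

    cpc = {}
    for pos, k in enumerate(winners):
        # Assumption: the ad ranked right below sets the price; 0 if none exists.
        below = weighted[order[pos + 1]] if pos + 1 < len(order) else 0.0
        cpc[int(k)] = below / quality[k]
    return winners, cpc

# Three ads, two slots, no randomness: weighted bids are [2.0, 3.0, 1.5].
winners, cpc = rank_and_price([2.0, 1.0, 3.0], [1.0, 3.0, 0.5], n_slots=2)
```

In this toy case each winner's CPC stays below its own bid, which is the usual property of a quality-weighted second-price rule when the stochastic component is switched off.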
3.4.3 Modeling Advertisers’ Bid Decisions
Advertisers select bids to maximize their daily profits from behaviorally targeted advertising. We
assume that an advertiser k’s daily value-per-click 𝑣 𝑘𝑡
and quality score 𝑄 𝑆 𝑘𝑡
are private
information, both of which are unknown to other advertisers and researchers. Although the
standard equilibrium concept in an incomplete information game is the Bayesian Nash
equilibrium (BNE), we argue that the BNE is implausible and intractable in our empirical context
for several reasons. First, since each advertiser is competing with thousands of other advertisers
from various product categories in the auction, it is infeasible for an individual advertiser to track
and accurately forecast the bid and quality metrics of each competitor. Second, unlike in
sponsored search advertising where advertisers can directly observe their competitors based on
the listings of sponsored advertisements associated with a keyword, it is rather difficult to
identify their competitors in online display advertising. Finally, solving the BNE with thousands
of players is technically formidable.
To address this complexity, we adopt the concept of Mean Field equilibrium (MFE) (e.g.,
Adlakha et al. 2013, Iyer et al. 2012, Lasry and Lions 2007). The MFE is an intuitive equilibrium
concept for studying a market with numerous bidders. It assumes that each bidder only forms a
belief on the stationary distribution of weighted bids for the whole population and optimizes his
or her bid as an individual decision-making problem based on such a belief. In equilibrium, the
distribution of optimized weighted bids coincides with bidders’ beliefs.
The underlying assumption of the MFE is that the empirical distribution of others’ actions
is hardly affected by an individual’s own action. This assumption is likely to hold in a market
with an enormous number of players such as our empirical context of behaviorally targeted
advertising. Furthermore, Iyer et al. (2012) proved that the MFE has some good properties such
as its existence under mild assumptions and the fact that it asymptotically captures the rational
behavior of players. They showed that if everyone else is playing the MFE, the benefit from the
deviation from the MFE converges to zero as the number of players increases. Because of these
advantages and the applicability in our empirical setting, we utilize the concept of the MFE to
describe an individual advertiser’s bid decision $b_{kt}$:
$$b_{kt} = \arg\max_b \Pi_{kt}[b; v_{kt}, QS_{kt}, F(\cdot)] \tag{23}$$
$$\Pi_{kt}(b) = \sum_i E\{ q_{ikt} \Pr(k \in D_{it})\, E[(v_{kt} - CPC_{ikt}) \mid k \in D_{it}] \} \tag{24}$$
$$b_{kt}\, QS_{kt} \sim F(\cdot) \tag{25}$$
where $F(\cdot)$ denotes the advertiser’s belief in the distribution of the Ad Rank, defined as the
product of competitors’ bids and quality scores. $\Pi_{kt}$ is advertiser $k$’s daily profit, which depends
on consumers’ click propensity, the probability of the ad being displayed, and the profit margin
per click. The expectation in Equation 24 is taken with respect to measurement errors in the
model of click and ad serving, as well as all competing advertisers’ actions represented by $F(\cdot)$.
In practice, advertisers can evaluate their expected daily profits based on demand estimates
including expected impressions, clicks, and CPCs through free bid simulators, which are
typically provided by major online advertising hosts such as Google and Facebook.
16
We derive
the detailed expression of an advertiser’s expected profit function in Appendix A, where we also
provide a numerical demonstration of the negative competition effect on the advertiser’s bid.
Here the competition effect is structurally captured by our model of advertisers’ bids.
Because each advertiser’s daily value-per-click is possibly contingent on the shopping
interest of a representative targeted consumer, we have included the average interest score of
matched consumers and other observed covariates into the model of $v_{kt}$.
$$\ln(v_{kt}) = \lambda_{k0} + \lambda_{k1}\, \overline{\ln(S_{gt})} + \lambda_{k2} \ln(N_{gt}) + f^V_m + \epsilon^V_{kt} \tag{26}$$
$$\overline{\ln(S_{gt})} = \frac{\sum_i 1\{g \in Pot_{it}\} \ln(S_{igt})}{\sum_i 1\{g \in Pot_{it}\}} \tag{27}$$
$$\lambda_k = \Delta^V W_k + \nu_k, \quad \nu_k \sim MVN(0, \Sigma_V) \tag{28}$$
$$\epsilon^V_{kt} \sim N(0, \omega_V^2) \tag{29}$$
where $\overline{\ln(S_{gt})}$ is the average log of interest score of targeted consumers who have clicked on
items from product category $g$ recently, $f^V_m$ stands for top-line category dummies, and $\epsilon^V_{kt}$ is an
i.i.d. normally distributed measurement error. Here we allow the observed heterogeneity among
advertisers to be dependent on $W_k = (1, \ln(Price_k))$.
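The averaging in Equation 27 can be illustrated with a short sketch. The function name `average_log_interest` and the array layout (rows are consumers, columns are product categories) are assumptions made for this example; in the dissertation the interest scores are latent and come from the posterior of the demand-side model, whereas here they are simply given numbers.

```python
import numpy as np

def average_log_interest(scores, in_pot):
    """Average log interest score per category over matched consumers (Eq. 27).

    scores : (n_consumers, n_categories) positive interest scores S_igt
    in_pot : boolean array of the same shape, True where g is in Pot_it
    Returns one value per category; NaN where no consumer is matched.
    """
    scores = np.asarray(scores, dtype=float)
    in_pot = np.asarray(in_pot, dtype=bool)

    # Replace unmatched entries by 1 so that log() is defined and contributes 0.
    safe = np.where(in_pot, scores, 1.0)
    numer = (np.log(safe) * in_pot).sum(axis=0)
    denom = in_pot.sum(axis=0)

    out = np.full(scores.shape[1], np.nan)
    matched = denom > 0
    out[matched] = numer[matched] / denom[matched]
    return out

# Two of three consumers are matched to category 0; nobody is matched to category 1.
avg = average_log_interest(
    [[np.e, 9.0], [np.e ** 3, 9.0], [7.0, 9.0]],
    [[True, False], [True, False], [False, False]],
)
```

Category 0 averages the logs of the two matched consumers only; the third consumer's score is ignored because the category is not in that consumer's candidate set.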
16
https://support.google.com/adwords/answer/2470105?hl=en; https://www.facebook.com/help/527780867299597/
3.4.4 Estimation Strategy
We encounter several challenges in our model estimation. First, the consumer interest score
associated with different product categories and the quality scores of advertisements are both
unobserved. Second, due to the potential correlation between unobserved components that affect
the consumer’s click decision and the advertising host’s ad-serving rule, we need to jointly
estimate models of click decision, realized ad quota per category, and the set of displayed ads.
Finally, each advertiser’s bid is an implicit function of the advertiser’s value-per-click
characterized by the first-order condition (FOC) of the advertiser’s profit maximization problem.
Since the profit function and the corresponding FOC do not have a closed form, we need to
numerically simulate these functions and solve the FOC to make inferences on value-per-clicks.
To cope with these challenges, we adopt a Bayesian estimation method to make
inferences for the demand-side models (i.e., the model of click and ad-serving rule). As shown,
the demand-side models involve a huge number of latent variables. One advantage of the
Bayesian approach is to help us circumvent the necessity of integration by augmenting the data
with these latent variables. Furthermore, the implementation of the Bayesian approach by the
Markov chain Monte Carlo (MCMC) method helps us obtain posterior samples of interest score
and quality score, which can be directly used to simulate the advertiser’s profit function
afterwards. We provide details of the MCMC algorithm in Appendix B.
Given the estimation results from the demand side, we then estimate the model of
advertisers’ bid decisions in two steps. First, we infer each advertiser’s daily value-per-click 𝑣 𝑘𝑡
by numerically solving the FOC of the optimization problem. We use the posterior means of
advertisers’ quality scores to compute their ad ranks; then we use their empirical distribution to
simulate advertisers’ profit function in Equation 24. Due to the computational constraint, we only
infer value-per-clicks for the daily bids of 5,000 advertisers that are randomly sampled. In the
second step, we estimate the model of value-per-clicks as described in Equation 26 using the
standard Bayesian approach executed by the MCMC algorithm, where we construct $\overline{\ln(S_{gt})}$
based on the posterior mean of interest scores obtained from the demand-side estimation.
3.5 Results
The main empirical findings from our analysis can be summarized as follows: (i) consumers are
more likely to click on advertised products from a category that they have recently and
frequently clicked, but not yet purchased; (ii) consumers’ likelihood of engaging in clicks is
negatively affected by the accumulative similarity between behaviorally targeted ads and their
recent online activities; (iii) the advertising host’s rule of ad serving is almost consistent with a
consumer’s click preference; (iv) advertisers’ value-per-clicks are not significantly influenced by
the average interest score of targeted consumers. These findings suggest that the targeting level
has both the positive relevance effect and the negative resistance effect on consumers’ clicks, and
it only has the negative competition effect but not the positive valuation effect on advertisers’
bids.
3.5.1 Consumers’ Click Behaviors
Table 20 reports the coefficient estimates from the click model.
17
As expected, we find that
consumers are more likely to consider clicking on ads when they encounter behaviorally targeted
ads more frequently in a given period. This positive effect of ad exposure on click consideration
is stronger for experienced consumers and it declines as the accumulative ad exposure increases.
17
Posterior means and standard deviations (in parentheses) are reported, and estimates that are significant at the 95%
level are bolded in Tables 20-24. The estimates of the fixed effects of top-line product categories are not reported
and can be provided by the authors upon request.
Surprisingly, our results indicate that consumers are less inclined to consider clicking on
advertisements if the advertised product categories are highly consistent with what they have
clicked on before, suggesting the negative resistance effect of the targeting level on click-through
rate. We propose two possible explanations for this negative effect. First, the similarity may
trigger consumers’ privacy concerns about their personal information being tracked, thereby
lowering their click intention. Another explanation is the satiation effect, such that consumers
experience fatigue with these advertisements because of the lack of novelty.
Advertisement characteristics        Intercept         Daily Impressions (Z)
Consideration stage
  Intercept                          -3.372 (.243)     -0.280 (.103)
  Transformed carryover parameter    -2.18 (.112)      -0.161 (.042)
  Impression-specific
    Advertising stock (AS)            2.176 (.169)      0.216 (.084)
    AS^2                             -0.045 (.021)      0.005 (.009)
    Similarity stock (SS)            -0.746 (.205)      0.057 (.096)
    SS^2                              0.152 (.086)     -0.028 (.027)
    Avg Rprice                       -0.088 (.124)      0.012 (.052)
Click stage
  Intercept                          -1.801 (.017)     -0.127 (.007)
  Category-specific
    ln(Freq_view+1)                   0.013 (.009)      0.003 (.005)
    ln(Freq_click+1)                  0.060 (.017)     -0.016 (.009)
    Freq_buy                         -0.730 (.211)      0.040 (.110)
    ln(Rcc_view)                     -0.022 (.015)     -0.001 (.008)
    ln(Rcc_click)                    -0.036 (.016)     -0.001 (.008)
    ln(Rcc_buy)                      -0.224 (.071)     -0.021 (.033)
    n                                -0.034 (.006)      0.000 (.003)
  Ad-specific
    Rprice                           -0.053 (.011)     -0.005 (.006)
Table 20. Estimation Results of the Click Model
The carryover coefficient for the advertising and similarity stocks is generally small
but varies substantially across consumers. The posterior means of $\rho_i$ average
0.22 across consumers, with a range of (0.02, 0.87). This means that the advertising and similarity
stocks decay by about 80% each day, suggesting that, on average, consumers have largely forgotten
their perceptions of behaviorally targeted ads that they encountered two days ago.
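As a quick check of this interpretation, assume the stocks carry over geometrically, so that a fraction $\rho_i^d$ of one day's exposure remains after $d$ days (the geometric form is an assumption made here for illustration). At the reported average and at the two ends of the reported range:

$$\rho = 0.22:\quad \rho^1 = 0.22,\qquad \rho^2 \approx 0.048$$
$$\rho = 0.02:\quad \rho^2 = 0.0004,\qquad\qquad \rho = 0.87:\quad \rho^2 \approx 0.757$$

At the average, roughly 78% of the stock is lost after one day and about 95% after two days, while a consumer at the upper end of the range still retains about three quarters of the stock after two days.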
In the click stage, we find that the click-through rate is generally higher for
advertisements that are associated with relatively cheaper items. Besides, the click-through rate
is higher for advertised products from a category that a consumer has recently and frequently
clicked but not yet purchased. Intuitively, a higher click frequency and a lower purchase
frequency are good indicators of a consumer’s shopping interest in a particular product category.
We also notice that given consumers’ previous purchases in a product category, they are more
likely to click on ads from the same category when the purchase is made recently. One possible
explanation is that in our empirical context, consumers are inclined to collect information from
advertised products within the purchased category for post-purchase evaluation. As Table 20
shows, the negative effect of the number of displayed ads from the same category on the click-
through rate suggests the substitutive effect among these ads. Our results also indicate that more
experienced consumers are less responsive to behaviorally targeted ads in both the consideration
and the click stages. In addition to the observed consumer heterogeneity, the variance-covariance
estimates in Table 21 and Table 22 show that a large number of unobserved factors also
explain the variation in consumers’ click preferences.
Covariance matrix ($\Sigma_C$)
                 1        2        3        4        5        6
Intercept      3.655
AS            -2.454    2.421
AS^2           0.189   -0.168    0.078
SS            -0.470    0.184   -0.026    1.061
SS^2           0.471   -0.215   -0.028   -0.562    0.537
Avg Rprice     0.124   -0.625    0.014   -0.018    0.119    0.317
Standard deviations
Error term in Carryover parameter ($\sigma_D$)    2.035 (.145)
Table 21. Unobserved Consumer Heterogeneity at the Consideration Stage
Covariance matrix ($\Sigma_U$)
                      1        2        3        4        5        6        7        8        9
Intercept           0.054
ln(Freq_view+1)     0.001    0.023
ln(Freq_click+1)   -0.001   -0.013    0.058
Freq_buy            0.004    0.023   -0.054    2.706
ln(Rcc_view)       -0.002    0.012   -0.006    0.035    0.047
ln(Rcc_click)      -0.002   -0.008    0.013    0.015   -0.005    0.057
ln(Rcc_buy)         0.001    0.016   -0.033    1.260    0.027    0.007    0.838
n                   0.002   -0.001   -0.001    0.002    0.000    0.000    0.001    0.034
Rprice              0.001   -0.000    0.001   -0.007   -0.001    0.003   -0.003   -0.000    0.013
Table 22. Unobserved Consumer Heterogeneity at the Click Stage
3.5.2 Ad-serving Algorithm: Interest Score and Quality Score
We report the estimation results for models of interest score and quality score in Table 23. Our
results show that the advertising host assigns a higher interest score to a product category that is
less expensive, and one that has been recently and frequently clicked but not purchased by the
consumer. Comparing the determinants of interest score with their effects on consumers’ click
behaviors, we conclude that the advertising host’s rule of ad serving is largely consistent with
consumers’ click preferences except that the interest score is higher for a distantly purchased
category, which might lead to fewer clicks. We suspect that this is because the interest score is
constructed by the advertising host to reflect users’ purchase intentions rather than click
intentions. Intuitively, even though a consumer is likely to click on advertised products from a
category he or she has recently bought, such a click hardly converts. Nevertheless, since only
0.3% of ad impressions involve consumers’ non-zero historical purchases, we can safely claim
that our results indicate the relevance effect of the targeting level on clicks, that is,
advertisements from the category with a higher interest score are more likely to be clicked by
consumers. The positive coefficient of N confirms the advertising host’s consideration of the
total number of competing advertisers in a product category when assigning the category-specific
advertisement quota. For the quality score, we find that an advertisement with a higher click
score and a higher price for the advertised product tends to receive a better quality score from the
advertising host.
We next summarize the findings regarding the potential sample selection problem in our
study. As shown in Table 24, the covariance estimate between a consumer’s unobserved click
preference for a top-line category and the corresponding unobserved component in the interest
score is positive but not significant. There are at least two possible explanations. First, our data
are not large enough to identify such a potential correlation. Second, the advertising host
constructs interest scores mainly based on those observed associations between the consumer and
the product category. Perhaps because of similar reasons, we also do not find evidence of sample
selection in the ranking procedure.
Interest score
  ln(Freq_click+1)     0.406 (.010)
  Freq_buy            -0.820 (.168)
  ln(Rcc_click)       -1.952 (.009)
  ln(Rcc_buy)          0.942 (.086)
  ln(Avgp)            -0.184 (.034)
  ln(N)                1.272 (.012)
Quality score
  ln(Click score)      1.136 (.007)
  Rprice               0.035 (.004)
Table 23. Estimation Results of the Models of Interest Score and Quality Score
Standard deviations
  Error term in interest score ($\omega_S$)                      1.446 (.008)
  Error term in weighted bid ($\omega_B$)                        1.109 (.008)
  Category-specific error term in interest score ($\sigma_S$)    1.138 (.034)
Covariance matrix ($\Sigma_\eta$)
                                                              $\eta^S_{im}$     $\eta^U_{im}$
  im-specific error term in interest score: $\eta^S_{im}$     1.720 (.044)
  im-specific error term in click utility: $\eta^U_{im}$      0.045 (.028)      0.098 (.008)
Covariance matrix ($\Sigma_\xi$)
                                                              $\xi^B_k$         $\xi^U_k$
  Ad-specific error term in quality score: $\xi^B_k$          0.113 (.002)
  Ad-specific error term in click utility: $\xi^U_k$          0.005 (.003)      0.059 (.005)
Table 24. Variance-Covariance Estimates in the Models of Click, Interest Score and Quality Score
3.5.3 Advertisers’ Value-Per-Click
The estimation results for the model of value-per-click are reported in Table 25. Our analysis
reveals that an advertiser’s value-per-click is positively correlated with the relative price of the
advertised item and negatively correlated with the number of competing advertisers from the
same product category. Furthermore, we do not find that advertisers adjust their value-per-click
according to the average interest score of the targeted audience. In other words, our results do not
support the positive valuation effect of targeting level on bids. One explanation could be that
advertisers do not directly observe consumers’ interests in a product category and therefore have
to infer from consumers’ past conversions. Since the daily conversion rate is small, it is rather
difficult for advertisers to acquire such knowledge on consumers’ shopping interests in each
product category.
Advertisement characteristics     Intercept         Relative Price (W)
  Intercept                        0.504 (.035)      0.161 (.012)
  ln(N)                           -0.046 (.017)     -0.017 (.013)
  ln(Avg interest score)          -0.024 (.034)     -0.008 (.011)
Standard deviations
  Error term in $\ln(v_{kt})$ ($\omega_V$)    0.513 (.003)
Covariance matrix ($\Sigma_V$)
                                   1                 2                 3
  Intercept                        0.449 (.020)
  ln(N)                           -0.043 (.014)      0.394 (.016)
  ln(Avg interest score)          -0.026 (.015)      0.027 (.016)      0.438 (.021)
Table 25. Estimation Results of the Model of Value-Per-Click
Figure 4 provides a histogram of inferred daily value-per-click, suggesting that there is a
substantial heterogeneity in valuation among our 5,000 sampled advertisers. Figure 5 shows that
the distribution of the ratio between an advertiser’s bid and valuation mainly has a support from
0.5 to 1, with a mean of 0.783. This indicates that on average, advertisers shade bids by about
22% below their value-per-click, suggesting an opportunity for the advertising host to encourage
advertisers to bid more truthfully in auctions.
Figure 4. Histogram of the Estimated Value-Per-Click
Figure 5. Histogram of the Ratio between Bid and Value-Per-Click
3.6 Counterfactual Experiments
Given the behavior of consumers and advertisers, we are able to predict how changes in the
advertising host’s ad-serving mechanism would affect expected click volume, CPC, and
revenues. In this section, we consider a counterfactual experiment that may help the advertising
host boost revenue by optimizing the targeting level in the behavioral targeting mechanism.
We examine the impact of the targeting level on each party’s profits. In our empirical
context, the targeting level is operationalized by the number of interests or categories used to
match consumers with potential advertisers in behavioral targeting. A higher level of targeting
corresponds to using fewer candidate product categories with higher interest scores in the
matching procedure. Our proposed model suggests that the level of targeting can influence the
advertising host’s revenue in several ways: (i) a high level of targeting tends to increase the
click-through rate by sending more relevant advertisements to consumers, i.e., the relevance
effect on clicks; (ii) a high level of targeting leads to ads that are more proximate to consumers’
past behaviors, which decreases the consumers’ likelihood of engaging in clicks, i.e., the
resistance effect on clicks; (iii) a high level of targeting implies a larger ad quota per category,
which can lower CPC due to lessened competition, i.e., the competition effect on bids. Because
of these offsetting effects, it is difficult for the advertising host to forecast revenue change with
respect to the change in the level of targeting. Our proposed framework can help the advertising
host fine-tune this important parameter to improve advertising revenues by using an appropriate
targeting level in the matching mechanism.
We change the number of interested product categories from the users’ past behaviors by
{50%, 25%, 0%, ‒25%, ‒50%} of the current level, where a smaller number represents a higher
targeting level. In the cases of ‒25% and ‒50%, we only use the top 75% and 50% of product
categories from the set of categories that have been clicked by the user in the past two weeks in
the matching task. These product categories are ranked by interest scores. In the cases of 25%
and 50%, we use a four-week time window to expand the set of candidate categories and select
the corresponding number of top product categories from this larger set.
For each scenario, we predict the equilibrium click volume and CPC as follows. We first
generate the set of consumers’ candidate product categories for each auction associated with an
impression based on either inferred or predicted interest scores. Then we predict the MFE of
advertisers’ bids using the following recursive algorithm:
1. Letting $G_0$ denote advertisers’ initial belief about the distribution of ad rank (i.e., quality
score times bid) of the whole population, we assume $G_0$ to be the cumulative distribution
function of a log-normal distribution and we estimate $G_0$ based on inferred ad ranks
from the data.
2. We update each advertiser’s bid by solving the FOC of Equation 24 based on the current belief
$G_{n-1}$ (starting from $G_0$) and the new profit function with the updated matching procedure
in behavioral targeting.
3. We estimate the new distribution of ad rank $G_n$ and compute the symmetric Kullback-
Leibler distance between $G_n$ and $G_{n-1}$:
$$Dist(G_n, G_{n-1}) = 0.5 \left[ \int_0^\infty \ln\!\left(\frac{g_{n-1}(x)}{g_n(x)}\right) g_{n-1}(x)\,dx + \int_0^\infty \ln\!\left(\frac{g_n(x)}{g_{n-1}(x)}\right) g_n(x)\,dx \right] \tag{30}$$
where $g_n$ and $g_{n-1}$ are the corresponding probability density functions of $G_n$ and $G_{n-1}$. We repeat
Steps 2 and 3 until $Dist(G_n, G_{n-1})$ is below a small threshold.
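The stopping rule in Steps 2 and 3 can be sketched as follows. This is an illustration under loud assumptions rather than the procedure used in the dissertation: every iterate is taken to be log-normal (the text only states this for $G_0$), and `toy_update` is a hypothetical stand-in for Step 2, since solving the FOC of Equation 24 requires the profit function from Appendix A. The sketch evaluates Equation 30 both by numerical integration and by the closed form available for two log-normal densities, then iterates until the distance falls below a tolerance.

```python
import numpy as np

def lognormal_pdf(x, mu, sigma):
    z = (np.log(x) - mu) / sigma
    return np.exp(-0.5 * z ** 2) / (x * sigma * np.sqrt(2.0 * np.pi))

def symmetric_kl(g_new, g_old, x):
    """Eq. 30 on a grid x, using a hand-written trapezoid rule."""
    integrand = np.log(g_old / g_new) * g_old + np.log(g_new / g_old) * g_new
    trapezoid = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(x))
    return 0.5 * trapezoid

def symmetric_kl_lognormal(mu_new, sigma_new, mu_old, sigma_old):
    """Closed form of Eq. 30 when both densities are log-normal."""
    def kl(m1, s1, m2, s2):
        return np.log(s2 / s1) + (s1 ** 2 + (m1 - m2) ** 2) / (2.0 * s2 ** 2) - 0.5
    return 0.5 * (kl(mu_new, sigma_new, mu_old, sigma_old)
                  + kl(mu_old, sigma_old, mu_new, sigma_new))

def iterate_beliefs(update, mu, sigma, tol=1e-10, max_iter=1000):
    """Repeat the belief update until the symmetric KL distance is below tol."""
    for n in range(1, max_iter + 1):
        mu_new, sigma_new = update(mu, sigma)
        dist = symmetric_kl_lognormal(mu_new, sigma_new, mu, sigma)
        mu, sigma = mu_new, sigma_new
        if dist < tol:
            return mu, sigma, n
    raise RuntimeError("belief iteration did not converge")

def toy_update(mu, sigma):
    # Hypothetical contraction standing in for "re-solve bids, re-fit ad ranks".
    return 0.5 * mu + 0.1, 0.5 * sigma + 0.5

x = np.exp(np.linspace(-8.0, 8.0, 20001))      # geometric grid on (0, inf)
numeric = symmetric_kl(lognormal_pdf(x, 0.2, 1.1), lognormal_pdf(x, 0.0, 1.0), x)
closed = symmetric_kl_lognormal(0.2, 1.1, 0.0, 1.0)
mu_star, sigma_star, n_iter = iterate_beliefs(toy_update, mu=0.0, sigma=2.0)
```

The symmetric form is convenient as a convergence criterion because it does not depend on which of the two beliefs is treated as the reference distribution.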
Given advertisers’ equilibrium bids, we simulate the auction outcomes for the 5,000
advertisers selected in the estimation procedure and compute the expected click volume, the
CPC, and the advertising revenue for both advertisers and the advertising host.
Table 26 reports findings from this counterfactual experiment. As shown in Figure 6, we
find that the expected click volume has an inverted-U relationship with the targeting level
because of the coexistence of the relevance and the resistance effects on consumers’ click
responses. As expected, the average CPC strictly decreases with the targeting level due to the
competition effect on advertisers’ bids. Interestingly, although advertisers’ profits always
increase with the targeting level, the advertising host’s revenue first increases and then decreases
with the targeting level. This is mainly because for advertisers, the benefit of a high targeting
level on their profit margin per click dominates the potential decrease in click volume. However,
from the advertising host’s perspective, both the click volume and the CPC drop when the
targeting level exceeds a certain threshold due to the resistance and competition effects.
                               Total daily          Average            Daily profit         Host’s daily
                               click volume         CPC                per ad               revenue
+50% (Low targeting level)     67.101 (-11.68%)     2.558 (3.65%)      0.0391 (-15.91%)     209.410 (-8.17%)
+25%                           71.199 (-6.29%)      2.538 (2.84%)      0.0417 (-10.32%)     221.526 (-2.86%)
Current rule                   75.978               2.468              0.0465               228.051
-25%                           79.256 (4.31%)       2.412 (-2.27%)     0.0501 (7.74%)       232.870 (2.11%)
-50% (High targeting level)    78.017 (2.68%)       2.384 (-3.40%)     0.0518 (11.40%)      225.426 (-1.15%)
Table 26. Counterfactual Results of Clicks, CPC, Profits, and Revenues
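The percentages in parentheses in Table 26 are changes relative to the current rule. The short sketch below recomputes them from the reported levels and picks the scenario with the highest host revenue; the dictionary keys and the helper name `pct_change` are illustrative choices, and the numbers are copied from the table.

```python
# Levels reported in Table 26 (scenario -> metric -> value).
baseline = {"clicks": 75.978, "cpc": 2.468, "profit_per_ad": 0.0465, "host_revenue": 228.051}
scenarios = {
    "+50%": {"clicks": 67.101, "cpc": 2.558, "profit_per_ad": 0.0391, "host_revenue": 209.410},
    "+25%": {"clicks": 71.199, "cpc": 2.538, "profit_per_ad": 0.0417, "host_revenue": 221.526},
    "-25%": {"clicks": 79.256, "cpc": 2.412, "profit_per_ad": 0.0501, "host_revenue": 232.870},
    "-50%": {"clicks": 78.017, "cpc": 2.384, "profit_per_ad": 0.0518, "host_revenue": 225.426},
}

def pct_change(value, base):
    """Percentage change of value relative to base."""
    return 100.0 * (value - base) / base

# Percentage change of every metric in every scenario, rounded as in the table.
changes = {
    name: {metric: round(pct_change(v, baseline[metric]), 2) for metric, v in vals.items()}
    for name, vals in scenarios.items()
}

# Scenario with the highest daily revenue for the advertising host.
best_for_host = max(scenarios, key=lambda s: scenarios[s]["host_revenue"])

for name, row in changes.items():
    print(name, row)
print("Highest host revenue:", best_for_host)
```

Among the four alternatives, host revenue peaks at the ‒25% scenario while profit per ad keeps rising through ‒50%, which is the misalignment between the two sides discussed in the text.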
These findings provide essential managerial implications to the advertising host in setting
the targeting level in behavioral targeting. Advertising hosts should set the targeting level based
on their current goals. If the advertising host’s objective is to attract more advertisers to use
behaviorally targeted advertising, the advertising host should choose a relatively high targeting
level to boost advertisers’ profits. On the other hand, if the advertising host’s objective is
consumer-oriented or focused on revenue maximization, the advertising host should recognize the
existence of an optimal targeting level. Our results indicate that the advertising host’s revenue can
be improved by at least 2% in our empirical context.
Figure 6. The Impact of Targeting Level in Online Advertising Market
[Figure 6 panels: Daily Click, Avg CPC, Daily Profit per Ad, and Platform's Daily Revenue, each plotted against the change in the number of candidate categories (50%, 25%, 0%, -25%, -50%).]
3.7 Conclusion
Behavioral targeting has been proven to be an effective technology for helping online advertisers
reach more relevant audiences and helping online users receive more relevant advertisements.
Compared to traditional banner ads, behaviorally targeted advertisements can improve the click-
through rate by 89% and the conversion rate by 143%, making behavioral targeting the “Holy
Grail” of online advertising and driving more publishers to monetize their online traffic by
hosting behaviorally targeted advertising (Beales 2010, Farahat and Bailey 2012).
18
Despite the
promise of behavioral targeting, no prior research has empirically examined the underlying
mechanism of this two-sided market and its implications.
In this paper, we develop an integrated model of decisions of all three parties involved in
behaviorally targeted advertising: consumers, advertisers, and the advertising host. Our structural
framework can help the advertising host improve revenues through changing the targeting level,
which is operationalized by the number of recent interests used in matching users and ads.
Our estimation results show that among various behavior-related advertising attributes,
the frequency and the recency of consumers’ previous clicks and purchases are the main factors
affecting an individual’s ad-clicking propensity. We also find evidence of a negative resistance
effect in behavioral targeting; that is, consumers are less likely to consider clicking on
behaviorally targeted advertisements if they have accumulatively viewed advertised products that
are highly associated with the products they have clicked on recently. Our results indicate that
the advertising host also considers the frequency and the recency of the user’s previous clicking
and purchasing behaviors in determining which advertisements to serve. The structural nature of
the advertisers’ bidding model allows us to infer advertisers’ valuations of clicks from observed
bids. We find that on average, advertisers bid about 80% of their value-per-click and advertisers
do not adjust their daily bids based on the variation in targeted audiences. Finally, the results
from our counterfactual experiment indicate that the advertisers’ and the advertising host’s
preferences for the targeting level are not aligned: advertisers always benefit from a high
18
For example, Amazon is now planning to launch its own online display advertising platform according to the Wall
Street Journal. http://online.wsj.com/articles/amazon-preps-a-challenge-to-googles-ad-business-1408747979?KEYWORDS=amazon
targeting level, while the advertising host’s revenue has an inverted-U relationship with the
targeting level.
This paper has limitations, bringing up opportunities for future research. First of all, we
do not model the advertisers’ participation decisions in behaviorally targeted advertising, mainly
because we do not observe such variation in our data. An extended model with advertisers’
participation decisions would be a powerful tool for providing a more comprehensive evaluation
of policy changes. Second, the competition between different advertising hosts (e.g., Google and
Yahoo) is not considered in the current study and therefore the possible multi-homing behavior
of online advertisers is not modeled. Although our data provider has a dominant role in serving
behaviorally targeted advertisements for online advertisers in our dataset, extending our model to
account for the competition among multiple advertising hosts could provide new insights on their
choices of the targeting level. Finally, field experiments will be especially valuable in testing the
effectiveness of our recommended policy changes. Overall, in spite of these limitations, we hope
this study will generate further interest in behavioral targeting as online display advertising
continues to grow.
Bibliography
Abhishek, V. and K. Hosanagar (2007), “Keyword Generation for Search Engine Advertising
Using Semantic Similarity between Terms,” Proceedings of the ninth international
conference on Electronic commerce, ACM.
Adlakha, S., R. Johari, and G. Y. Weintraub (2013), “Equilibria of Dynamic Games with Many
Players: Existence, Approximation, and Market Structure,” Journal of Economic Theory,
http://dx.doi.org/10.1016/j.jet.2013.07.002.
Agarwal, A., K. Hosanagar, and M. D. Smith (2011), “Location, Location and Location: An
Analysis of Profitability of Position in Online Advertising Markets,” Journal of Marketing
Research, 48(6), 1057-1073.
Aguirregabiria, V. and P. Mira (2007), “Sequential Estimation of Dynamic Discrete Games,”
Econometrica, 75(1), 1-53.
Andrews, R. L. and T. C. Srinivasan (1995), “Studying Consideration Effects in Empirical
Choice Models Using Scanner Panel Data,” Journal of Marketing Research, 32(February),
30-41.
Anselin, L. (1988), “Spatial Econometrics: Methods and Models,” (Vol. 4). Kluwer Academic
Pub.
Athey, S., and D. Nekipelov (2012), “A Structural Model of Sponsored Search Advertising
Auctions,” Working paper, Stanford University.
Bajari, P., C. L. Benkard, and J. Levin (2007), “Estimating Dynamic Models of Imperfect
Competition,” Econometrica, 75(5), 1331-1370.
Bajari, P., H. Hong, J. Krainer, and D. Nekipelov (2010), “Estimating Static Models of Strategic
Interactions,” Journal of Business & Economic Statistics, 28(4), 469-482.
Beales, H. (2010), “The Value of Behavioral Targeting,” Network Advertising Initiative.
Berry, S. T. (1992), “Estimation of a Model of Entry in the Airline Industry,” Econometrica,
60(4), 889-917.
Bresnahan, T. F. and P. C. Reiss (1991), “Entry and Competition in Concentrated Markets,” The
Journal of Political Economy, 99(5), 977-1009.
Bresnahan, T. F. and P. C. Reiss (1990), “Entry in Monopoly Markets,” The Review of Economic
Studies, 57(4), 531-553.
Bronnenberg, B. J. and W. R. Vanhonacker (1996), “Limited Choice Sets, Local Price Response,
and Implied Measures of Price Competition,” Journal of Marketing Research, 33(2), 163-
174.
Chan, T. Y. and Y.-H. Park (2013), “The Value of Consumer Search Activities for Sponsored
Search Advertisers,” Working paper.
Chatterjee, P., D. Hoffman, and T. Novak (2003), “Modeling the Clickstream: Implications for
Web-Based Advertising Efforts,” Marketing Science, 22(4), 520-541.
Chen, J. and J. Stallaert (2014), “An Economic Analysis of Online Advertising Using Behavioral
Targeting,” MIS Quarterly, 38(2), 429-449.
Chen, Y., D. Pavlov, and J. F. Canny (2009), “Large-Scale Behavioral Targeting,” Proceedings
of the 15th ACM SIGKDD international conference on Knowledge discovery and data
mining, ACM. 209-218.
Chen, Y., G. Xue, and Y. Yu (2008), “Advertising Keyword Suggestion Based on Concept
Hierarchy,” Proceedings of the 2008 international conference on web search and data
mining. ACM.
Chiang, J., S. Chib, and C. Narasimhan (1999), “Markov Chain Monte Carlo and Models of
Consideration Set and Parameter Heterogeneity,” Journal of Econometrics, 89(1-2), 223-248.
Ciliberto, F. and E. Tamer (2009), “Market Structure and Multiple Equilibria in Airline Markets,”
Econometrica, 77(6), 1791-1828.
Datta, S. and K. Sudhir (2011), “The Agglomeration-Differentiation Tradeoff in Spatial Location
Choice,” Working paper.
Datta, S. and K. Sudhir (2013), “Does Reducing Spatial Differentiation Increase Product
Differentiation? Effects of Zoning on Retail Entry and Format Variety,” Quantitative
Marketing and Economics, 11(1), 83-116.
Desai, P.S., W. Shin, and R. Staelin (2014), “The Company that You Keep: When to Buy a
Competitor’s Keyword,” Marketing Science, 33(4), 485-508.
Edelman, B., M. Ostrovsky, and M. Schwarz (2007), “Internet Advertising and the Generalized
Second-Price Auction: Selling Billions of Dollars Worth of Keywords,” The American
Economic Review, 97(1), 242-259.
Ellickson, P. B. and S. Misra (2011), “Estimating Discrete Games,” Marketing Science, 30(6),
997-1010.
Erdem, T., M. P. Keane, and B. Sun (2008), “A Dynamic Model of Brand Choice When Price
and Advertising Signal Product Quality,” Marketing Science, 27(6), 1111-1125.
Farahat, A. and M. C. Bailey (2012), “How Effective is Targeted Advertising?” Proceedings of
the 21st international conference on World Wide Web, ACM, 111-120.
Ghose, A. and S. Yang (2009), “An Empirical Analysis of Search Engine Advertising:
Sponsored Search in Electronic Markets,” Management Science, 55(10), 1605-1622.
Gilbride, T. J. and G. M. Allenby (2004), “A Choice Model with Conjunctive, Disjunctive, and
Compensatory Screening Rules,” Marketing Science, 23(3), 391-406.
Goldfarb, A. and C. Tucker (2010), “Search Engine Advertising: Channel Substitution when
Pricing Ads to Context,” Management Science, 57(3), 458-470.
Goldfarb, A. and C. Tucker (2011a), “Advertising Bans and the Substitutability of Online and
Offline Advertising,” Journal of Marketing Research, 48(2), 207-227.
Goldfarb, A. and C. Tucker (2011b), “Privacy Regulation and Online Advertising,” Management
Science, 57(1), 57-71.
Guadagni, P. M. and J. D. C. Little (1983), “A Logit Model of Brand Choice Calibrated on
Scanner Data,” Marketing Science, 2(3), 203-238.
Hotz, V. J. and R. A. Miller (1993), “Conditional Choice Probabilities and the Estimation of
Dynamic Models,” The Review of Economic Studies, 60(3), 497-529.
Iyer, K., R. Johari, and M. Sundararajan (2012), “Mean Field Equilibria of Dynamic Auctions
with Learning,” Working paper, Stanford University.
Jerath, K., L. Ma, and Y.-H. Park (2014), “Consumer Click Behavior at a Search Engine: The
Role of Keyword Popularity,” Journal of Marketing Research, 51(4), 480-486.
Jia, P. (2008), “What Happens When Wal-Mart Comes to Town: An Empirical Analysis of the
Discount Retailing Industry,” Econometrica, 76(6), 1263-1316.
Jiang, J. and N. Kumar (2014), “Behavioral Targeting,” Working paper, University of Texas at
Dallas.
Johnson, G. (2013), “The Impact of Privacy Policy on the Auction Market for Online Display
Advertising,” Working paper, University of Rochester.
Joo, M., K. C. Wilbur, B. Cowgill, and Y. Zhu (2014), “Television Advertising and Online
Search,” Management Science, 60(1), 56-73.
Juan F., H. K. Bhargava, and D. M. Pennock (2007), “Implementing Sponsored Search in Web
Search Engines: Computational Evaluation of Alternative Mechanisms,” INFORMS Journal on
Computing, 19(1), 137-148.
Katona, Z. and M. Sarvary (2010), “The Race for Sponsored Links: Bidding Patterns for Search
Advertising,” Marketing Science, 29(2), 199-215.
Lambrecht, A. and C. Tucker (2013), “When Does Retargeting Work? Information Specificity in
Online Advertising,” Journal of Marketing Research, 50(5), 561-576.
Lasry, J.-M. and P.-L. Lions (2007), “Mean Field Games,” Japanese Journal of Mathematics,
2(1), 229-260.
Manchanda, P., J.-P. Dubé, K. Y. Goh, and P. K. Chintagunta (2006), “The Effect of Banner
Advertising on Internet Purchasing,” Journal of Marketing Research, 43(1), 98-108.
Mazzeo, M. J. (2002), “Product Choice and Oligopoly Market Structure,” The RAND Journal of
Economics, 33(2), 221-242.
Nair, H. S., S. Misra, W. J. Hornbuckle IV, R. Mishra, and A. Acharya (2013), “Big Data and
Marketing Analytics in Gaming: Combining Empirical Models and Field Experimentation,”
Working paper, Stanford University.
Narayanan, S. and K. Kalyanam (2012), “Measuring Position Effects in Search Advertising: A
Regression Discontinuity Approach,” Working paper.
Rochet, J. C. and J. Tirole (2006), “Two-Sided Markets: A Progress Report,” The RAND Journal
of Economics, 37(3), 645-667.
Rutz, O. J. and R. E. Bucklin (2011), “From Generic to Branded: A Model of Spillover
Dynamics in Paid Search Advertising,” Journal of Marketing Research, 48(1), 87-102.
Sayedi, A., K. Jerath, and K. Srinivasan (2014), “Competitive Poaching in Sponsored Search
Advertising and Strategic Impact on Traditional Advertising,” Marketing Science, 33(4),
586-608.
Seim, K. (2006), “An Empirical Model of Firm Entry with Endogenous Product-Type Choices,”
The RAND Journal of Economics, 37(3), 619-640.
Stokes, R. (2008), “Mastering Search Advertising: How the Top 3% of Search Advertisers
Dominate Google AdWords,” iUniverse.
Su, C.-L. and K. L. Judd (2012), “Constrained Optimization Approaches to Estimation of
Structural Models,” Econometrica, 80(5), 2213-2230.
Terui, N., M. Ban, and G. M. Allenby (2011), “The Effect of Media Advertising on Brand
Consideration and Choice,” Marketing Science, 30(1), 74-91.
Tucker, C. (2014), “Social Networks, Personalized Advertising and Privacy Controls,” Journal
of Marketing Research, 51(5), 546-562.
Turow, J., J. King, C. J. Hoofnagle, A. Bleakley, and M. Hennessy (2009), “Americans Reject
Tailored Advertising and Three Activities that Enable It,” Working paper, University of
Pennsylvania.
Van Nierop, E., B. J. Bronnenberg, R. Paap, M. Wedel, and P. H. Franses (2010), “Retrieving
Unobserved Consideration Sets from Household Panel Data,” Journal of Marketing
Research, 47(1), 63-74.
Varian, H. R. (2007), “Position Auctions,” International Journal of Industrial Organization,
25(6), 1163-1178.
Villas-Boas, M. and R. Winer (1999), “Endogeneity in Brand Choice Models,” Management
Science, 45(10), 1324-1338.
Vitorino, M. A. (2012), “Empirical Entry Games with Complementarities: An Application to the
Shopping Center Industry,” Journal of Marketing Research, 49(2), 175-191.
Wilbur, K. C. (2008), “A Two-Sided, Empirical Model of Television Advertising and Viewing
Markets,” Marketing Science, 27(3), 356-378.
Wu, C. (2013), “Matching Markets in Online Advertising Networks: The Tao of Taobao and the
Sense of AdSense,” Working paper, University of British Columbia.
Yan, J., N. Liu, G. Wang, W. Zhang, Y. Jiang, and Z. Chen (2009), “How Much Can Behavioral
Targeting Help Online Advertising?” Proceedings of the 18th international conference on
World wide web, ACM, 261-270.
Yang, S., Y. Chen, and G. M. Allenby (2003), “Bayesian Analysis of Simultaneous Demand and
Supply,” Quantitative Marketing and Economics, 1(3), 251-275.
Yang, S. and A. Ghose (2010), “Analyzing the Relationship Between Organic and Sponsored
Search Advertising: Positive, Negative or Zero Interdependence?” Marketing Science,
29(4), 602-623.
Yang, S., S. Lu, and X. Lu (2014), “Modeling Competition and Its Impact on Paid-Search
Advertising,” Marketing Science, 33(1), 134-153.
Yao, S. and C. F. Mela (2011), “A Dynamic Model of Sponsored Search Advertising,”
Marketing Science, 30(3), 447-468.
Zhu, T. and V. Singh (2009), “Spatial Competition with Endogenous Location Choices: An
Application to Discount Retailing,” Quantitative Marketing and Economics, 7(1), 1-35.
Zhu, T., V. Singh, and M. Manuszak (2009), “Market Structure and Competition in the Retail
Discount Industry,” Journal of Marketing Research, 46(4), 453-466.
Appendices
Appendices for Chapter One
Appendix A: Derivation of the CPC Equation and Proof of Equilibrium Properties
EOS show that the generalized English auction corresponding to the GSP auction has a unique
Bayes-Nash equilibrium. More importantly, the realized bids of this generalized English
auction satisfy the Nash equilibrium condition of the GSP auction. We first describe the rules of
the generalized English auction. Suppose a price clock increases continuously from
zero, and each advertiser chooses the price at which to drop out (i.e., this price will be
her bid). The auction concludes when only one advertiser is left. The vector of these bids
determines the ranks and CPCs of ads under the same rules as the GSP auction. As for the
information structure, each advertiser knows her own value-per-click $S_{ki}$ but only the
distribution of other advertisers’ valuations after their entry into keyword k. The vector of click
ratios between adjacent positions $\{\delta_{kj}\}$ in a keyword is assumed to be common knowledge.
Theorem 2 of EOS indicates that in the unique Bayes-Nash equilibrium of this generalized
English auction, an advertiser i drops out at price
$$p_{ki} = S_{ki} - \delta_{k,\,j_{ki}-1}\,(S_{ki} - b_{ki'}) \tag{A1}$$
where $i'$ refers to the advertiser who drops out right before advertiser i and thus stays right below
advertiser i (i.e., $j_{ki'} = j_{ki} + 1$). If we replace i and $i'$ with $i'$ and $i''$ (where $i''$ is the
advertiser who stays right below $i'$), equation (A1) can be written as
$$p_{ki'} = S_{ki'} - \delta_{k j_{ki}}\,(S_{ki'} - b_{ki''}) \tag{A2}$$
EOS define an advertiser’s bid as the price at which she drops out of the auction, and
therefore we have $p_{ki'} = b_{ki'} = CPC_{ki}$ and $b_{ki''} = CPC_{ki'}$. Equation (A2) can then be written
as
$$CPC_{ki} = S_{ki'} - \delta_{k j_{ki}}\,(S_{ki'} - CPC_{ki'}) = \delta_{k j_{ki}}\,CPC_{ki'} + (1 - \delta_{k j_{ki}})\,S_{ki'} \tag{A3}$$
which is exactly the same as equation (7) in our model. We next formally show that this bidding
equilibrium characterized by EOS has three important properties.
Property 1. An advertiser with a higher value-per-click obtains a higher position.
Proof of property 1. Equation (A1) implies that, given the bid of the advertiser who dropped out
most recently, the price at which each advertiser drops out increases with her value-per-click.
Thus, among the remaining advertisers, the one with the lowest value-per-click drops out next
and obtains the lowest remaining position. This implies that at realized value-per-clicks, an
advertiser with a higher value-per-click gets a higher position.
Property 2. Equilibrium CPCs decline with positions and each advertiser’s CPC is below her
value-per-click by construction.
Proof of property 2. We remove the subscript k for expositional convenience. With a
slight abuse of notation, we denote advertiser i as the advertiser with the i-th highest
value-per-click among all n entrants. Property 1 implies that advertiser i stays at the i-th position at
equilibrium. We start the proof with the CPC of the advertiser at the next-to-last position:
$CPC_{n-1} = \delta_{n-1} CPC_n + (1 - \delta_{n-1}) S_n$, where $CPC_n$ equals the minimum bid set by the
search host and therefore $CPC_n < S_n$. Then we have
$CPC_{n-1} - CPC_n = (1 - \delta_{n-1})(S_n - CPC_n) > 0$. At the same time, we have
$S_{n-1} - CPC_{n-1} = (S_{n-1} - S_n) + \delta_{n-1}(S_n - CPC_n) > 0$. Next we prove
that as long as $CPC_i < S_i < S_{i-1}$, the relationship $CPC_i < CPC_{i-1} < S_{i-1}$ always holds. Similar
to the analysis for the case when i=n, these two inequalities hold because
$CPC_{i-1} - CPC_i = (1 - \delta_{i-1})(S_i - CPC_i) > 0$ and
$S_{i-1} - CPC_{i-1} = (S_{i-1} - S_i) + \delta_{i-1}(S_i - CPC_i) > 0$. Thus, by induction, we
have shown that $CPC_i < CPC_{i-1} < S_{i-1}$ holds for any i=2,..., n, which completes our proof.
Property 3. The unique Bayes-Nash equilibrium in the generalized English auction is stable in
the sense that the realized bids form a Nash equilibrium of the GSP auction. In other words, no
advertiser will regret her bid after she learns others’ value-per-clicks.
Proof of property 3. We follow the notation used in the proof of property 2. As EOS point
out, to prove that the realized bids form a Nash equilibrium, it is sufficient to show that the
vector of equilibrium bids is “locally envy-free.” “Locally envy-free” is a refined Nash
equilibrium concept defined by EOS, which requires that no advertiser has an incentive to exchange
bids with the advertiser ranked directly above her. We show that the vector of bids characterized by
equation (A1) satisfies this condition. For any advertiser i=2,..., n, the profit for advertiser i at the
i-th position is $\pi_i(b_i) = E(Q_i)(S_i - b_{i+1})$, and the profit for advertiser i at the (i−1)-th position
after exchanging bids with advertiser i−1 is $\pi_i(b_{i-1}) = E(Q_{i-1})(S_i - b_i)$. Since
$b_i = S_i - \delta_{i-1}(S_i - b_{i+1})$ from equation (A1) and $\delta_{i-1} = E(Q_i)/E(Q_{i-1})$ by definition, we have
$$\pi_i(b_i) = \frac{E(Q_i)}{\delta_{i-1}}\,(S_i - b_i) = \pi_i(b_{i-1}).$$
Thus, we have proved that the vector of realized bids is a stable
“locally envy-free” Nash equilibrium.
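The recursion in equation (A3) and Properties 2 and 3 can be checked numerically. The following is a minimal Python sketch rather than the code used in this dissertation; the function name `equilibrium_cpcs`, the example numbers, and the normalization of the top-position click-volume to one are assumptions made purely for illustration.

```python
def equilibrium_cpcs(values, deltas, cpc_bottom):
    """Apply CPC_j = delta_j * CPC_{j+1} + (1 - delta_j) * S_{j+1} from the bottom up.

    values: value-per-clicks S_1 > ... > S_n, indexed by position (Property 1).
    deltas: click ratios delta_1, ..., delta_{n-1} between adjacent positions.
    cpc_bottom: CPC at the last position (the minimum bid set by the search host).
    """
    n = len(values)
    cpcs = [cpc_bottom] * n
    for j in range(n - 2, -1, -1):  # zero-based index for positions n-1, ..., 1
        cpcs[j] = deltas[j] * cpcs[j + 1] + (1.0 - deltas[j]) * values[j + 1]
    return cpcs


# Hypothetical example with four entrants.
values = [4.0, 3.0, 2.0, 1.0]
deltas = [0.6, 0.7, 0.8]
cpcs = equilibrium_cpcs(values, deltas, cpc_bottom=0.1)

# Expected clicks by position, normalizing the top position to 1.
clicks = [1.0]
for d in deltas:
    clicks.append(clicks[-1] * d)

# Property 2: CPCs decline with position and stay below each value-per-click.
declining = all(cpcs[j] > cpcs[j + 1] for j in range(len(cpcs) - 1))
below_value = all(c < s for c, s in zip(cpcs, values))

# Property 3: the advertiser at position i earns the same profit where she is
# as she would after exchanging bids with the advertiser directly above her.
envy_gaps = [
    clicks[i] * (values[i] - cpcs[i]) - clicks[i - 1] * (values[i] - cpcs[i - 1])
    for i in range(1, len(values))
]
```

In this example every entry of `envy_gaps` is zero up to floating-point error, which is the equality $\pi_i(b_i) = \pi_i(b_{i-1})$ used in the proof of property 3.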
Appendix B: Expected Revenue Function of Search Advertisers
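Based on equations (2) and (14), we have
$$
\begin{aligned}
\Pi_k(n_k) &= E_{Q,S}\Big\{ Q_{ki}\big(b_k(n_k), \lambda_k(n_k)\big)\Big[ S_{ki}\big(V_k(n_k)\big) - CPC_{ki}\big(\{S_{km}\}_{m > j_{ki}}, \{\delta_{km}\}_{m \ge j_{ki}}\big)\Big]\Big\} \\
&= E_{Q,S}\Big\{ \exp\big(b_k + \varepsilon^{q}_{ki}\big) \prod_{l=1}^{j_{ki}-1} \delta_{kl}\, \Big[ S_{ki}(V_k) - CPC_{ki}\big(\{S_{km}\}, \{\delta_{km}\}\big)\Big]\Big\} \\
&= \exp\big(b_k + 0.5\,\sigma_q^2\big)\, E_{S}\Big\{ \prod_{l=1}^{j_{ki}-1} \delta_{kl}\, \Big[ S_{ki}(V_k) - CPC_{ki}\big(\{S_{km}\}, \{\delta_{km}\}\big)\Big]\Big\}
\end{aligned}
\tag{B1}
$$
where $\varepsilon^{q}_{ki}$ denotes the click-volume error with variance $\sigma_q^2$, $\delta_{kl} = \dfrac{\exp(\lambda_{kl})}{1 + \exp(\lambda_{kl})}$, and we define $\prod_{l=1}^{0} \delta_{kl} = 1$ with a slight abuse of notation.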
Under the assumption of symmetric advertisers, the post-entry ad position is uniformly
distributed, that is, $\Pr(j_{ki} = l) = 1/n_k$, $l = 1, \dots, n_k$. Thus, the expected revenue function can be
rewritten as
$$\Pi_k(n_k) = \frac{\exp\big(b_k + 0.5\,\sigma_q^2\big)}{n_k} \sum_{j=1}^{n_k} E_{S}\Big\{ \prod_{l=1}^{j-1} \delta_{kl}\, \Big[ S_{kj}(V_k) - CPC_{kj}\big(\{S_{km}\}_{m > j}, \{\delta_{km}\}_{m \ge j}\big)\Big]\Big\} \tag{B2}$$
where $S_{kj}$ and $CPC_{kj}$ stand for the value-per-click and CPC of the advertiser who stays at
position j at equilibrium.
Equation (7) suggests that the equilibrium CPC at each position can be expressed as a weighted
sum of the value-per-clicks of advertisers who are ranked below, which is
$$CPC_{kj} = \sum_{l=j+2}^{n_k} (1 - \delta_{k,l-1})\, S_{kl} \prod_{m=j}^{l-2} \delta_{km} \;+\; CPC_0 \prod_{l=j}^{n_k-1} \delta_{kl} \;+\; (1 - \delta_{kj})\, S_{k,j+1} \tag{B3}$$
for $j < n_k$, where $CPC_0$ stands for the CPC at the bottom position (so that $CPC_{k n_k} = CPC_0$), which is
treated as an exogenous variable. Equations (B2) and (B3) fully characterize the expected revenue function of advertisers.
Note that we are unable to analytically integrate out two sets of error terms, which are
$\{\delta_{kj}\}_{j=1}^{n_k-1}$ and $\{S_{kj}\}_{j=1}^{n_k}$. We therefore rely on a simulation-based approach. First, we create two sets of
random samples $\mathbf{e}_1$ and $\mathbf{e}_2$ generated from the i.i.d. standard normal distribution, where $\mathbf{e}_1$ is an
$(n_k - 1)$ by R matrix and $\mathbf{e}_2$ is an $n_k$ by R matrix. Here R stands for the number of random draws
used for integration. For each random draw r, we first use $e_1^{(r)}$ to generate the vector of decay
factors.
$$\delta_{kj}^{(r)} = \frac{\exp\big(\lambda_k + \sigma_d\, e_{1j}^{(r)}\big)}{1 + \exp\big(\lambda_k + \sigma_d\, e_{1j}^{(r)}\big)} \tag{B4}$$
As for the vector of value-per-clicks, we first sort the vector $e_2^{(r)}$ in descending order to
create $enew_2^{(r)}$. Then we construct the value-per-click of the j-th advertiser as follows.
$$S_{kj}^{(r)} = \exp\big(V_k + \sigma_{vpc}\, enew_{2j}^{(r)}\big) = \exp(V_k)\,\big[\exp\big(enew_{2j}^{(r)}\big)\big]^{\sigma_{vpc}} \tag{B5}$$
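Combining equations (B2), (B3), (B4), and (B5), we can fully specify the expected revenue
function of an advertiser as
$$\Pi_k(n_k) = \frac{\exp\big(b_k + 0.5\,\sigma_q^2\big)}{n_k R} \sum_{r=1}^{R} \sum_{j=1}^{n_k} \prod_{l=1}^{j-1} \delta_{kl}^{(r)} \left[ S_{kj}^{(r)} - \sum_{l=j+2}^{n_k} \big(1 - \delta_{k,l-1}^{(r)}\big) S_{kl}^{(r)} \prod_{m=j}^{l-2} \delta_{km}^{(r)} - CPC_0 \prod_{l=j}^{n_k-1} \delta_{kl}^{(r)} - \big(1 - \delta_{kj}^{(r)}\big) S_{k,j+1}^{(r)} \right] \tag{B6}$$
where, for the bottom position $j = n_k$, the bracketed term reduces to $S_{k n_k}^{(r)} - CPC_0$.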
Note that the keyword-specific parameters, including the log baseline click-volume ($b_k$), the mean
transformed decay factor ($\lambda_k$), and the mean log value-per-click ($V_k$), are all functions of the number of
entrants. Thus, the expected revenue of advertisers is ultimately a function of the following
parameters:
$$\Pi_k = f\big(n_k,\, b_k(n_k),\, \lambda_k(n_k),\, V_k(n_k),\, \sigma_q^2,\, \sigma_d^2,\, \sigma_{vpc}^2\big) \tag{B7}$$
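The simulation steps in equations (B4) to (B6) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the author's implementation: the function name `simulated_expected_revenue` and its argument names are hypothetical, the CPCs are obtained from the equivalent bottom-up recursion in equation (7) instead of the expanded sum in (B3), and the default of 10 draws follows the number of draws reported in Appendix C.

```python
import math
import random


def simulated_expected_revenue(n, b, lam, v, sigma_q, sigma_d, sigma_vpc,
                               cpc_bottom=0.0, draws=10, seed=0):
    """Monte Carlo approximation of an entrant's expected revenue with n entrants."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        # (B4): logistic transform of normally distributed decay shocks.
        deltas = [1.0 / (1.0 + math.exp(-(lam + sigma_d * rng.gauss(0.0, 1.0))))
                  for _ in range(n - 1)]
        # (B5): sort the shocks in descending order so values decline with position.
        shocks = sorted((rng.gauss(0.0, 1.0) for _ in range(n)), reverse=True)
        values = [math.exp(v + sigma_vpc * e) for e in shocks]
        # Equilibrium CPCs via CPC_j = delta_j CPC_{j+1} + (1 - delta_j) S_{j+1}.
        cpcs = [cpc_bottom] * n
        for j in range(n - 2, -1, -1):
            cpcs[j] = deltas[j] * cpcs[j + 1] + (1.0 - deltas[j]) * values[j + 1]
        # Sum over positions: cumulative decay times margin per click.
        click_share = 1.0
        for j in range(n):
            total += click_share * (values[j] - cpcs[j])
            if j < n - 1:
                click_share *= deltas[j]
    # Average over positions (uniform rank) and draws; scale by exp(b + 0.5 sigma_q^2).
    return math.exp(b + 0.5 * sigma_q ** 2) * total / (n * draws)


single = simulated_expected_revenue(1, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0)
triple = simulated_expected_revenue(3, 1.0, 0.5, 0.2, 0.5, 0.5, 0.5)
```

With a single entrant, no value-per-click noise, and a zero bottom CPC, the function returns $\exp(b + 0.5\sigma_q^2)\exp(V_k)$, which gives a quick sanity check of the scaling.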
Appendix C: The MCMC Algorithm and Simulation Studies
We estimate the model parameters using Bayesian inference. We run the Markov
chains for 100,000 iterations, where the first 50,000 draws are discarded as the initial “burn-in”.
We keep every 100th draw from the last 50,000 draws for inference. We use 10 draws of the
vector of value-per-clicks and the vector of decay factors to simulate advertisers’ expected
revenue in a keyword. Before we describe the prior and conditional posterior distribution for each parameter,
we first present the joint posterior distribution of all parameters.
$$\Pr(\Theta \mid Q, CPC, n) \propto \prod_{k=1}^{K} L_k^{q}\big(Q_k \mid \eta_{k1}, \gamma_1, \lambda_k, \sigma_q^2\big)\, L_k^{cpc}\big(CPC_k \mid \eta_{k3}, \gamma_3, \lambda_k, \sigma_{vpc}^2\big)\, L_k^{n}\big(n_k \mid \eta, \gamma, \beta, \sigma^2\big)\, \Pr\big(\lambda_k \mid \eta_{k2}, \gamma_2, \sigma_d^2\big)\, \Pr(\eta \mid \Omega) \tag{C1}$$
where $\Theta = \{\eta, \lambda, \gamma, \beta, \sigma^2\}$ is the set of all model parameters; $L_k^{q}(\cdot)$, $L_k^{cpc}(\cdot)$, and $L_k^{n}(\cdot)$ are the likelihood
functions of the click-volume vector, the vector of observed CPCs, and the number of entrants in
keyword k, respectively; $\Pr(\lambda)$ is the prior probability for the vector of decay factors
$\vec{\lambda}_k = (\lambda_{k1}, \dots, \lambda_{k,n_k-1})$; and $\Pr(\eta)$ is the marginal distribution of $(\eta_{k1}, \eta_{k2}, \eta_{k3})$.
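Next we characterize each of these functions. $L_k^{q}(\cdot)$ is the likelihood function of observing
the vector of click-volumes in keyword k,
$$L_k^{q} = \big(2\pi\sigma_q^2\big)^{-n_k/2} \exp\left\{ -\sum_{j=1}^{n_k} \frac{\Big[\ln Q_{kj} - \sum_{l=1}^{j-1} \ln \delta_{kl} - X_k\gamma_1 - \eta_{k1}\Big]^2}{2\sigma_q^2} \right\} \tag{C2}$$
where $\delta_{kj} = \dfrac{\exp(\lambda_{kj})}{1 + \exp(\lambda_{kj})}$ and we define $\sum_{l=1}^{0} \ln(\delta_{kl}) = 0$ with a slight abuse of notation.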
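For expositional convenience, we index advertisers in a keyword by their equilibrium
positions (i.e., $S_{kj}$ stands for the value-per-click of the advertiser at the j-th position). Then $L_k^{cpc}(\cdot)$
can be described as
$$L_k^{cpc} = I\big(S_{k2} > S_{k3} > \dots > S_{k n_k}\big) \prod_{j=2}^{n_k} \big(2\pi\sigma_{vpc}^2\big)^{-1/2} \exp\left\{ -\frac{\big(\ln S_{kj} - X_k\gamma_3 - \eta_{k3}\big)^2}{2\sigma_{vpc}^2} \right\} \left| \frac{\partial S_k}{\partial CPC_k} \right| \tag{C3}$$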
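where $S_{kj} = \dfrac{CPC_{k,j-1} - \delta_{k,j-1}\, CPC_{kj}}{1 - \delta_{k,j-1}}$ is derived from equation (7) and $I(S_{k2} > S_{k3} > \dots > S_{k n_k})$ is an
indicator function which equals one if the vector of value-per-clicks is in descending order.
$\partial S_k / \partial CPC_k$ is an $(n_k - 1)$ by $(n_k - 1)$ Jacobian matrix, which is an upper-triangular matrix with
$\big\{ 1/(1 - \delta_{k,j-1}) \big\}_{j=2}^{n_k}$ on the diagonal, so its determinant is the product of these diagonal
elements. Thus, the likelihood function of the CPC vector can be further
expressed as
$$L_k^{cpc} = I\big(S_{k2} > S_{k3} > \dots > S_{k n_k}\big) \prod_{j=2}^{n_k} \frac{\exp\left\{ -\big(\ln S_{kj} - X_k\gamma_3 - \eta_{k3}\big)^2 / \big(2\sigma_{vpc}^2\big) \right\}}{\big(2\pi\sigma_{vpc}^2\big)^{1/2}\, \big(1 - \delta_{k,j-1}\big)} \tag{C4}$$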
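The inversion from observed CPCs to implied value-per-clicks and the resulting likelihood in equation (C4) can be illustrated as follows. This is a hedged sketch: the names `implied_values` and `cpc_log_likelihood` are hypothetical, `mean_log_value` stands in for $X_k\gamma_3 + \eta_{k3}$, and the code works on the log scale for numerical stability.

```python
import math


def implied_values(cpcs, deltas):
    """Recover S_2, ..., S_n from S_j = (CPC_{j-1} - delta_{j-1} CPC_j) / (1 - delta_{j-1})."""
    return [
        (cpcs[j - 1] - deltas[j - 1] * cpcs[j]) / (1.0 - deltas[j - 1])
        for j in range(1, len(cpcs))
    ]


def cpc_log_likelihood(cpcs, deltas, mean_log_value, sigma_vpc):
    """Log of (C4); returns -inf when the implied values are not strictly descending."""
    values = implied_values(cpcs, deltas)
    ordered = all(a > b for a, b in zip(values, values[1:]))
    if not ordered or any(v <= 0.0 for v in values):
        return float("-inf")
    loglik = 0.0
    for value, delta in zip(values, deltas):  # pairs S_j with delta_{j-1}
        resid = math.log(value) - mean_log_value
        loglik += (-0.5 * math.log(2.0 * math.pi * sigma_vpc ** 2)
                   - resid ** 2 / (2.0 * sigma_vpc ** 2)
                   - math.log(1.0 - delta))  # Jacobian term 1 / (1 - delta_{j-1})
    return loglik


# Hypothetical CPCs generated from values (4, 3, 2, 1) and a bottom CPC of 0.1.
recovered = implied_values([1.6776, 0.796, 0.28, 0.1], [0.6, 0.7, 0.8])
ll_ok = cpc_log_likelihood([1.6776, 0.796, 0.28, 0.1], [0.6, 0.7, 0.8], 0.5, 1.0)
ll_bad = cpc_log_likelihood([1.0, 0.99, 0.0], [0.5, 0.1], 0.5, 1.0)
```

The second call returns negative infinity because the implied values are not in descending order, which mirrors the role of the indicator function in (C3) and (C4).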
Combining equations (17) and (19), we have,
$$L_k^{n}(n_k \mid \cdot) = \Pr\big(F_k \mid \eta_{k1}, \eta_{k2}, \eta_{k3}\big) \left| \frac{\partial F_k}{\partial n_k} \right| = \Pr\big(\eta_{k4} \mid \eta_{k1}, \eta_{k2}, \eta_{k3}\big) \left| \frac{\partial F_k}{\partial n_k} \right| \tag{C5}$$
where the prior distribution of $F_k$ is $F_k = \gamma_4 X_k^F + \eta_{k4}$ from equation (13) and the expression for
$F_k$ is $F_k = \Pi_k + \frac{1}{\beta}\big[\ln(N_k - n_k) - \ln(n_k)\big]$ from equation (17). As shown in equation (B7),
an advertiser’s expected revenue $\Pi_k$ depends on all parameters except for $\beta$. However, because $F_k$
is a function of both $\Pi_k$ and $\beta$, all model parameters enter into the likelihood function of the number
of entrants. Combining equation (C5) with the marginal distribution function
$\Pr(\eta_{k1}, \eta_{k2}, \eta_{k3} \mid \Omega)$, we have,
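$$L_k^{n}(n_k \mid \cdot)\, \Pr\big(\eta_{k1}, \eta_{k2}, \eta_{k3} \mid \Omega\big) = \Pr\big(\eta_{k4} \mid \eta_{k1}, \eta_{k2}, \eta_{k3}\big)\, \Pr\big(\eta_{k1}, \eta_{k2}, \eta_{k3} \mid \Omega\big) \left| \frac{\partial F_k}{\partial n_k} \right| \propto |\Omega|^{-1/2} \exp\left( -\tfrac{1}{2}\, \eta_k' \Omega^{-1} \eta_k \right) \left| \frac{\partial F_k}{\partial n_k} \right| \tag{C6}$$
where $\eta_k = (\eta_{k1}, \eta_{k2}, \eta_{k3}, \eta_{k4})'$. Finally, the prior of decay factors can be expressed below.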
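$$\Pr(\vec{\lambda}_k) = \big(2\pi\sigma_d^2\big)^{-(n_k-1)/2} \exp\left\{ -\sum_{j=1}^{n_k-1} \frac{\big(\lambda_{kj} - X_k\gamma_2 - \eta_{k2}\big)^2}{2\sigma_d^2} \right\} \tag{C7}$$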
1. Draw $\lambda_{kj}$ and update $S_{kj}$.
We use the Metropolis-Hastings algorithm with a random walk chain to generate draws (see Chib
and Greenberg 1995). Since the error terms in equations (2) and (7) are independent across
keywords, we draw the vector of $\lambda_{kj}$ for each keyword separately. The $\lambda_{kj}$ in the same keyword
k are updated simultaneously during each iteration. Recall that $\vec{\lambda}_k$ denotes the $(n_k - 1)$ by 1
vector $(\lambda_{k1}, \dots, \lambda_{k,n_k-1})$. Let $\vec{\lambda}_k^{(p)}$ denote the previous draw; then the next draw $\vec{\lambda}_k^{(n)}$ is given
by the following:
$$\vec{\lambda}_k^{(n)} = \vec{\lambda}_k^{(p)} + \Delta_1$$
with the acceptance probability given below,
$$\min\left\{ \frac{L_k^{q}\big(\vec{\lambda}_k^{(n)}\big)\, L_k^{cpc}\big(\vec{\lambda}_k^{(n)}\big)\, \Pr\big(\vec{\lambda}_k^{(n)}\big)}{L_k^{q}\big(\vec{\lambda}_k^{(p)}\big)\, L_k^{cpc}\big(\vec{\lambda}_k^{(p)}\big)\, \Pr\big(\vec{\lambda}_k^{(p)}\big)},\ 1 \right\}$$
where $\Delta_1$ is a draw from the density Normal(0, .0025I) and I is the identity matrix. To ensure the
ordered structure of value-per-clicks, we consider a candidate draw of decay factors only if the
vector of $S_{kj}$ defined by $S_{kj} = \dfrac{CPC_{k,j-1} - \delta_{k,j-1}\, CPC_{kj}}{1 - \delta_{k,j-1}}$ satisfies $S_{k2} > S_{k3} > \dots > S_{k n_k}$. Finally,
given an accepted vector of $\lambda_{kj}$, we update the vector of $S_{kj}$ accordingly.
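2. Draw $\theta = (\gamma_1, \gamma_2, \gamma_3, \sigma_q^2, \sigma_d^2, \sigma_{vpc}^2, \beta)$.
We use the Metropolis-Hastings algorithm with a random walk chain to generate draws of $\theta$. We
transform the variance parameters $\sigma_q^2, \sigma_d^2, \sigma_{vpc}^2$ to their log forms in the implementation of the
random walk chain. Let $\theta^{(p)}$ denote the previous draw; then the next draw $\theta^{(n)}$ is given by the
following:
$$\theta^{(n)} = \theta^{(p)} + \Delta_2$$
with the acceptance probability given below,
$$\min\left\{ \frac{\prod_{k=1}^{K} L_k^{q}\big(\theta^{(n)}\big)\, L_k^{cpc}\big(\theta^{(n)}\big)\, L_k^{n}\big(\theta^{(n)}\big)\, \Pr\big(\vec{\lambda}_k \mid \theta^{(n)}\big)}{\prod_{k=1}^{K} L_k^{q}\big(\theta^{(p)}\big)\, L_k^{cpc}\big(\theta^{(p)}\big)\, L_k^{n}\big(\theta^{(p)}\big)\, \Pr\big(\vec{\lambda}_k \mid \theta^{(p)}\big)},\ 1 \right\}$$
where $\Delta_2$ is a draw from the density Normal(0, .0001I).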
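3. Draw $\gamma_4$.
We first derive the conditional distribution of $\eta_{k4}$, given $\eta_{k1}, \eta_{k2}, \eta_{k3}$.
Denote $(\eta_{k4}, \eta_{k1}, \eta_{k2}, \eta_{k3})' \sim MVN\left(0, \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}\right)$, where $\Sigma_{11} = \Omega_{44}$, $\Sigma_{12} = (\Omega_{14}, \Omega_{24}, \Omega_{34})$,
$\Sigma_{21} = \Sigma_{12}'$, and
$$\Sigma_{22} = \begin{pmatrix} \Omega_{11} & \Omega_{12} & \Omega_{13} \\ \Omega_{21} & \Omega_{22} & \Omega_{23} \\ \Omega_{31} & \Omega_{32} & \Omega_{33} \end{pmatrix}.$$
Then we have
$$\eta_{k4} \mid \eta_{k1}, \eta_{k2}, \eta_{k3} \sim N\big(\bar{\mu}_{k4},\ \bar{\sigma}_4^2\big)$$
where $\bar{\mu}_{k4} = \Sigma_{12} \Sigma_{22}^{-1} (\eta_{k1}, \eta_{k2}, \eta_{k3})'$ and $\bar{\sigma}_4^2 = \Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21}$. We assume the prior distribution of
$\gamma_4$ is
$$\gamma_4 \sim MVN\big(\gamma_0,\ A_0^{-1}\big)$$
Since $(F_k - \gamma_4 X_k^F) \sim N(\bar{\mu}_{k4}, \bar{\sigma}_4^2)$, the posterior of $\gamma_4$ is given by the following,
$$\gamma_4 \mid F_k, X_k^F, \bar{\mu}_{k4}, \bar{\sigma}_4^2 \sim MVN\left( \tilde{\gamma}_4,\ \Big[ \bar{\sigma}_4^{-2} \sum_{k=1}^{K} X_k^{F\prime} X_k^F + A_0 \Big]^{-1} \right)$$
where
$$\tilde{\gamma}_4 = \Big[ \bar{\sigma}_4^{-2} \sum_{k=1}^{K} X_k^{F\prime} X_k^F + A_0 \Big]^{-1} \Big[ \bar{\sigma}_4^{-2} \sum_{k=1}^{K} X_k^{F\prime} \big(F_k - \bar{\mu}_{k4}\big) + A_0 \gamma_0 \Big],$$
$A_0 = .01I$ and $\gamma_0 = 0$. The entry cost
$F_k$ is calculated based on equation (17).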
4. Draw $\Omega$.
We assume the prior of $\Omega$ follows an inverted-Wishart distribution:
$$\Omega \sim IW(w_0, W_0)$$
Then, the posterior is given by the following:
$$\Omega \mid \eta_{k1}, \eta_{k2}, \eta_{k3}, \eta_{k4} \sim IW\Big( w_0 + K,\ W_0 + \sum_{k=1}^{K} \eta_k \eta_k' \Big)$$
where $\eta_k = (\eta_{k1}, \eta_{k2}, \eta_{k3}, \eta_{k4})'$, $w_0 = 10$ and $W_0 = 10I$.
5. Draw $\eta_{k1}$.
Since the error terms in our model are assumed to be independent across keywords, we can draw
$\eta_{k1}$ for each keyword k separately. In our model, $\eta_{k1}$ (through $b_k$) affects the vector of click-volume and
the number of entrants. Let $\eta_{k1}^{(p)}$ denote the previous draw; then the next draw $\eta_{k1}^{(n)}$ is given by
the following:
$$\eta_{k1}^{(n)} = \eta_{k1}^{(p)} + \Delta_3$$
with the acceptance probability given below,
$$\min\left\{ \frac{L_k^{q}\big(\eta_{k1}^{(n)}\big)\, L_k^{n}\big(\eta_{k1}^{(n)}\big)\, \Pr\big(\eta_{k1}^{(n)}, \eta_{k2}, \eta_{k3} \mid \Omega\big)}{L_k^{q}\big(\eta_{k1}^{(p)}\big)\, L_k^{n}\big(\eta_{k1}^{(p)}\big)\, \Pr\big(\eta_{k1}^{(p)}, \eta_{k2}, \eta_{k3} \mid \Omega\big)},\ 1 \right\}$$
where $\Delta_3$ is a draw from the density Normal(0, .25I).
6. Draw $\eta_{k2}$ similar to Step 5.
7. Draw $\eta_{k3}$ similar to Step 5.
We conduct a simulation study to test whether the proposed algorithm can recover the true
parameters in the full model of click-volume, CPC, and number of entrants. We jointly simulate
the data of click-volume, CPC, and number of entrants for 200 keywords. For convenience, we fix the
CPC at the lowest position to be zero. The keyword characteristics vector $X_k$ includes one
intercept, one keyword covariate generated from a standard normal distribution, and the number
of entrants ($n_k$) generated by searching for the unique solution of equation (16). The explanatory
variables ($X_k^F$) of the management cost in equation (13) use the same keyword covariates in $X_k$
except for the number of entrants. The true model parameters ($\gamma_1, \gamma_2, \gamma_3, \gamma_4, \sigma_q^2, \sigma_d^2, \sigma_{vpc}^2, \beta, \Omega$)
are reported in Table 27.
We run the Markov chains for 100,000 iterations and save every 100th draw for the parameters of
interest. We set the starting values of all elements of $\gamma_1, \gamma_2, \gamma_3, \gamma_4$ to zero, and the four diagonal elements
of $\Omega$ to 0.1. The starting values for $\sigma_q^2, \sigma_d^2, \sigma_{vpc}^2, \beta$ are also set to 0.1. We discard the first
50,000 draws and use the last 50,000 to calculate the posterior means and standard deviations of
the parameters.
The estimation results are reported in Table 27. They show that all parameters of interest are
identified and that their true values lie within the 95% confidence intervals of the posterior mean estimates.
We also conduct additional simulations and find that the estimation results are not sensitive to
the choice of starting values. Therefore, our simulation studies provide evidence of the validity
of the proposed method.
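The burn-in and thinning scheme described above can be written compactly. The helper below is only a sketch: the name `summarize_chain` is hypothetical, and it keeps the first post-burn-in draw and every `thin`-th draw after it, which is one of several equivalent ways to select "every 100th draw".

```python
from statistics import mean, stdev


def summarize_chain(draws, burn_in, thin):
    """Posterior mean and standard deviation from thinned post-burn-in draws."""
    kept = draws[burn_in::thin]  # discard burn-in, then keep every `thin`-th draw
    return mean(kept), stdev(kept)


# Toy chain of 10 scalar draws: discard the first 4, keep every 2nd of the rest.
toy_mean, toy_sd = summarize_chain([float(i) for i in range(10)], burn_in=4, thin=2)
```

With 100,000 iterations, `burn_in=50000`, and `thin=100`, this selection retains 500 draws per parameter.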
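| Parameter | True Values | Estimates (posterior standard deviations in parentheses) |
|---|---|---|
| $\gamma_1$ (Baseline Click) | 2, 1, −3 | 1.907 (.086), 0.930 (.061), −3.046 (.114) |
| $\gamma_2$ (Decay Factor) | 2, 1, −1 | 2.138 (.113), 1.012 (.064), −1.031 (.100) |
| $\gamma_3$ (Value-Per-Click) | 2, 1, −2 | 1.962 (.107), 0.965 (.068), −1.887 (.105) |
| $\gamma_4$ (Management Cost) | 2, 1 | 2.297 (.206), 0.961 (.106) |
| $\sigma_q^2$ | 0.5 | 0.498 (.015) |
| $\sigma_d^2$ | 0.5 | 0.489 (.014) |
| $\sigma_{vpc}^2$ | 0.5 | 0.504 (.018) |
| $\beta$ | 0.5 | 0.451 (.050) |
| $\Omega$, row 1 ($\Omega_{11}, \dots, \Omega_{14}$) | 0.5, 0.5, 0.5, 0.5 | 0.592 (.073), 0.404 (.081), 0.384 (.097), 0.454 (.146) |
| $\Omega$, row 2 ($\Omega_{22}, \Omega_{23}, \Omega_{24}$) | 1, 1, 1 | 1.079 (.147), 1.058 (.159), 1.245 (.234) |
| $\Omega$, row 3 ($\Omega_{33}, \Omega_{34}$) | 1.5, 1.5 | 1.547 (.198), 1.706 (.269) |
| $\Omega$, row 4 ($\Omega_{44}$) | 2 | 2.397 (.499) |

Table 27. Simulation Results of the Proposed Structural Model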
Appendices for Chapter Two
Appendix A: Estimation Algorithms for Ad Position, Average Click-Volume and Average
CPC
We run the MCMC chain for 10,000 iterations and use every 10th of the last 5,000 iterations to
compute the mean and standard deviation of the posterior distribution of model parameters. We
report below the MCMC algorithm for the model of ad positions and for the joint model of average
click-volume and average CPC.
A.1. Ad Position
1. Draw $R_{kjt}$
For expositional convenience, we index advertisers in a keyword by their ranks (i.e., $R_{kjt}$ stands
for the weighted bid of the advertiser at the j-th position in keyword k at time t). We describe the
posterior distribution of the augmented variables R as follows.
$$\Pr(R \mid Rank) \propto \prod_{t=1}^{S} \prod_{k=1}^{K} \left( 1\{R_{k1t} > R_{k2t} > \dots > R_{k N_{kt} t}\} \prod_{j=1}^{N_{kt}} \exp\left[ -\frac{\big(R_{kjt} - \bar{R}_{kjt}\big)^2}{2} \right] \right)$$
where $N_{kt}$ stands for the number of entrant advertisers, $1\{R_{k1t} > R_{k2t} > \dots > R_{k N_{kt} t}\}$ is an
indicator function which equals one if the vector of weighted bids is in descending order, and
$\bar{R}_{kjt}$ is the following:
$$\bar{R}_{kjt} = \alpha_j + \alpha_{1T} X_k + \alpha_{2T} \ln(SV_{kt}) + \alpha_{3T}\, Match_{kj} + \sum_{s=1}^{S} \alpha_{4Ts} D_s(t) + \sum_{l=0}^{t-1} \alpha_{5lT}\, W_{kjl}$$
The posterior conditional distribution of $R_{kjt}$ given $(R_{k1t}, \dots, R_{k,j-1,t}, R_{k,j+1,t}, \dots, R_{k N_{kt} t})$,
denoted by $R_{k,-j,t}$, is the truncated normal distribution expressed below.
$$R_{kjt} \mid R_{k,-j,t} \sim \begin{cases} N(\bar{R}_{kjt}, 1) \times I(R_{k,j+1,t},\ +\infty) & \text{if } j = 1 \\ N(\bar{R}_{kjt}, 1) \times I(R_{k,j+1,t},\ R_{k,j-1,t}) & \text{if } 2 \le j \le N_{kt} - 1 \\ N(\bar{R}_{kjt}, 1) \times I(-\infty,\ R_{k,j-1,t}) & \text{if } j = N_{kt} \end{cases}$$
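2. Draw $\alpha$
We rewrite the model of $R_{kjt}$ as $R_{kjt} = \alpha X_{kjt}^R + \epsilon_{kjt}$. We assume the prior distribution of $\alpha$ is
$$\alpha \sim MVN(\alpha_0, \Sigma_0)$$
Then the posterior distribution of $\alpha$ is also normal.
$$\alpha \mid R, X^R \sim MVN(A, B)$$
where $B = \big[ X^{R\prime} X^R + \Sigma_0^{-1} \big]^{-1}$, $A = B \big[ X^{R\prime} R + \Sigma_0^{-1} \alpha_0 \big]$, $\alpha_0 = 0$ and $\Sigma_0 = 100I$.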
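The truncated-normal update of the latent weighted bids in Step 1 can be sketched as a Gibbs sweep that preserves the observed ranking. This is an illustrative sketch using inverse-CDF sampling from Python's standard library; the function names, the clamping constants, and the example means are assumptions, not part of the original algorithm description.

```python
import random
from statistics import NormalDist


def draw_truncated_normal(mean, lower, upper, rng):
    """Draw from N(mean, 1) truncated to (lower, upper) by inverting the CDF."""
    dist = NormalDist(mean, 1.0)
    cdf_lo = 0.0 if lower == float("-inf") else dist.cdf(lower)
    cdf_hi = 1.0 if upper == float("inf") else dist.cdf(upper)
    u = cdf_lo + (cdf_hi - cdf_lo) * rng.random()
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep inv_cdf away from 0 and 1
    # Final clamp guards against rounding pushing the draw outside the bounds.
    return min(max(dist.inv_cdf(u), lower), upper)


def gibbs_sweep(latent, means, rng):
    """Update each latent bid given its neighbors so that the ordering is kept."""
    n = len(latent)
    for j in range(n):
        upper = latent[j - 1] if j > 0 else float("inf")       # bid ranked above
        lower = latent[j + 1] if j < n - 1 else float("-inf")  # bid ranked below
        latent[j] = draw_truncated_normal(means[j], lower, upper, rng)
    return latent


rng = random.Random(0)
latent_bids = [1.0, 0.0, -1.0]  # hypothetical starting values in rank order
mean_bids = [0.5, 0.0, -0.5]    # hypothetical values of the conditional means
for _ in range(50):
    gibbs_sweep(latent_bids, mean_bids, rng)
```

Because each draw is bounded by its current neighbors, the descending order required by the indicator function holds after every sweep.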
A.2. Average Click-Volume and Average CPC
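1. Draw $\beta^Q$
We rewrite the model of average click-volume as $\ln(Click_{kt}) = \beta^Q X_{kt}^Q + \eta_k^Q + \upsilon_{kt}^Q$. The
posterior distribution of $\beta^Q$ depends on the conditional distribution of $\upsilon_{kt}^Q$, given
$(\upsilon_{kt}^{CPC}, \upsilon_{kt}^{M}, \upsilon_{kt}^{R}, \upsilon_{kt}^{C})$. We define the following notation.
$$(\upsilon_{kt}^{Q}, \upsilon_{kt}^{CPC}, \upsilon_{kt}^{M}, \upsilon_{kt}^{R}, \upsilon_{kt}^{C})' \sim MVN\left( 0, \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix} \right)$$
where $\Sigma_{11} = \Phi_{11}$, $\Sigma_{12} = (\Phi_{12}, \Phi_{13}, \Phi_{14}, \Phi_{15})$, $\Sigma_{21} = \Sigma_{12}'$, and
$$\Sigma_{22} = \begin{bmatrix} \Phi_{22} & \cdots & \Phi_{25} \\ \vdots & \ddots & \vdots \\ \Phi_{52} & \cdots & \Phi_{55} \end{bmatrix}.$$
Then we have
$$\upsilon_{kt}^{Q} \mid \upsilon_{kt}^{CPC}, \upsilon_{kt}^{M}, \upsilon_{kt}^{R}, \upsilon_{kt}^{C} \sim N(\bar{\mu}, \bar{\sigma}^2)$$
where $\bar{\mu} = \Sigma_{12} \Sigma_{22}^{-1} (\upsilon_{kt}^{CPC}, \upsilon_{kt}^{M}, \upsilon_{kt}^{R}, \upsilon_{kt}^{C})'$ and $\bar{\sigma}^2 = \Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21}$. We assume the prior
distribution of $\beta^Q$ is the following:
$$\beta^Q \sim MVN(\beta_0, \Sigma_0)$$
Then $\beta^Q \mid Click, X^Q, \eta^Q, \bar{\mu}, \bar{\sigma}^2 \sim MVN(A, B)$, where $\beta_0 = 0$, $\Sigma_0 = 100I$,
$$B = \left[ \frac{X^{Q\prime} X^Q}{\bar{\sigma}^2} + \Sigma_0^{-1} \right]^{-1} \quad \text{and} \quad A = B \left[ \frac{X^{Q\prime} \big( \ln(Click) - \eta^Q - \bar{\mu} \big)}{\bar{\sigma}^2} + \Sigma_0^{-1} \beta_0 \right].$$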
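2. Draw $\eta_k^Q$
We update $\eta_k^Q$ for each keyword k separately. The conditional distribution of $\eta_k^Q$ given $\eta_k^{CPC}$ is
the following.
$$\eta_k^Q \mid \eta_k^{CPC} \sim N(\hat{\mu}, \hat{\sigma}^2)$$
where $\hat{\mu} = \Omega_{12} \Omega_{22}^{-1} \eta_k^{CPC}$ and $\hat{\sigma}^2 = \Omega_{11} - \Omega_{12} \Omega_{22}^{-1} \Omega_{21}$. We keep the notation $\bar{\mu}, \bar{\sigma}^2$ defined in
Step 1. The posterior distribution of $\eta_k^Q$ is given below.
$$\eta_k^Q \mid Click, X^Q, \beta^Q, \eta_k^{CPC}, \hat{\mu}, \hat{\sigma}^2, \bar{\mu}, \bar{\sigma}^2 \sim N(A, B)$$
where $B = \big( S \hat{\sigma}^2 + \bar{\sigma}^2 \big)^{-1} \hat{\sigma}^2 \bar{\sigma}^2$ and
$$A = B \left[ \frac{\sum_{t=1}^{S} \big( \ln(Click_{kt}) - \beta^Q X_{kt}^Q - \bar{\mu} \big)}{\bar{\sigma}^2} + \frac{\hat{\mu}}{\hat{\sigma}^2} \right].$$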
3. Draw $\Phi$
We assume that the prior of $\Phi$ follows an inverted-Wishart distribution.
$$\Phi \sim IW(w_0, W_0)$$
Then the posterior of $\Phi$ is the following.
$$\Phi \mid \upsilon_{kt}^{Q}, \upsilon_{kt}^{CPC}, \upsilon_{kt}^{M}, \upsilon_{kt}^{R}, \upsilon_{kt}^{C}, w_0, W_0 \sim IW\big( w_0 + S \cdot K,\ W_0 + \upsilon \upsilon' \big)$$
where $\upsilon = (\upsilon^{Q}, \upsilon^{CPC}, \upsilon^{M}, \upsilon^{R}, \upsilon^{C})$, S is the number of time periods, K is the number of keywords,
$w_0 = 10$ and $W_0 = 10I$.
4. Draw $\Omega$
We assume that the prior of $\Omega$ follows an inverted-Wishart distribution.
$$\Omega \sim IW(w_0, W_0)$$
Then the posterior is the following.
$$\Omega \mid \eta_k^Q, \eta_k^{CPC}, w_0, W_0 \sim IW\big( w_0 + K,\ W_0 + \eta \eta' \big)$$
where $\eta = (\eta^Q, \eta^{CPC})$, $w_0 = 10$ and $W_0 = 10I$.
5. Draw $\beta^{CPC}$ similar to Step 1.
6. Draw $\eta_k^{CPC}$ similar to Step 2.
7. Draw $\beta^M$ similar to Step 1.
8. Draw $\beta^R$ similar to Step 1.
9. Draw $\beta^S$ similar to Step 1.
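The keyword random-effect update in Step 2 is a conjugate normal-normal update whose prior comes from conditioning a bivariate normal. The sketch below shows the arithmetic for one keyword; the function names are hypothetical, `residuals` stands for the S values of $\ln(Click_{kt}) - \beta^Q X_{kt}^Q - \bar{\mu}$, and the numbers in the example are made up.

```python
def conditional_prior(omega, eta_cpc):
    """Mean and variance of eta^Q given eta^CPC for a 2x2 covariance matrix omega."""
    mu_hat = omega[0][1] / omega[1][1] * eta_cpc
    var_hat = omega[0][0] - omega[0][1] * omega[1][0] / omega[1][1]
    return mu_hat, var_hat


def eta_q_posterior(residuals, resid_var, mu_hat, var_hat):
    """Posterior mean A and variance B of eta^Q_k as defined in Step 2."""
    s = len(residuals)
    post_var = var_hat * resid_var / (s * var_hat + resid_var)            # B
    post_mean = post_var * (sum(residuals) / resid_var + mu_hat / var_hat)  # A
    return post_mean, post_var


# Hypothetical inputs: Omega = [[2, 1], [1, 1]], eta^CPC = 0.5, two time periods.
mu_hat, var_hat = conditional_prior([[2.0, 1.0], [1.0, 1.0]], 0.5)
post_mean, post_var = eta_q_posterior([1.0, 3.0], 1.0, mu_hat, var_hat)
```

The posterior precision is the sum of the data precision $S/\bar{\sigma}^2$ and the prior precision $1/\hat{\sigma}^2$, which is what the expression for B encodes.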
Appendix B. Estimation Algorithm for Keyword Entry and Consideration Models
We run the MCMC chain for 20,000 iterations and use every 10th of the last 10,000 iterations to
compute the mean and standard deviation of the posterior distribution of model parameters. We
report below the MCMC algorithm for the implementation of the two-step approach for the joint
model of keyword entry and consideration.
B.1. First Step
We use a flexible probit specification to describe an advertiser’s keyword entry decision, which
is expressed below.
$$
\begin{aligned}
U_{kit} ={}& \gamma_i + \gamma_{1T} X_k + \gamma_{2T} \ln(SV_{kt}) + \gamma_{3T}\, Match_{ki} + \sum_{s=1}^{S} \gamma_{4Ts} D_s(t) + \sum_{l=0}^{t-1} \gamma_{5lT}\, a_{kil} + \sum_{l=0}^{t-1} \gamma_{6lT} \ln(Rank_{kil}) \\
&+ \sum_{l=0}^{t-1} \Big( \sum_{T' \in \{M,R,C\}} \gamma_{7TT'}\, \bar{a}_{k,-i,l}(T', c_{kt}) \Big) + \sum_{l=0}^{t-1} \Big( \sum_{T' \in \{M,R,C\}} \gamma_{8TT'} \ln\big[ \overline{Rank}_{k,-i,l}(T', c_{kt}) \big] \Big) + \varsigma_{kit}^{U}
\end{aligned}
$$
$$\bar{a}_{k,-i,l}(T, c_{kt}) = \frac{\sum_{j \ne i,\, j \in T} a_{kjl}\, c_{kjt}}{\sum_{j \ne i,\, j \in T} c_{kjt}}$$
$$\overline{Rank}_{k,-i,l}(T, c_{kt}) = \frac{\sum_{j \ne i,\, j \in T} Rank_{kjl}\, c_{kjt}}{\sum_{j \ne i,\, j \in T} c_{kjt}}$$
where we allow advertiser i’s payoff to depend on three sets of variables: (i) keyword
characteristics, (ii) the focal advertiser’s entry and rank history, and (iii) the average entry and
rank of competing potential entrants from each type in previous periods. We also include
advertiser-specific and time-specific fixed effects. Let $c_{kt}$ denote the vector of advertisers’
consideration decisions, i.e., $c_{kt} = (c_{k1t}, \dots, c_{kIt})$. Then $\bar{a}_{k,-i,l}(T, c_{kt})$ and $\overline{Rank}_{k,-i,l}(T, c_{kt})$
stand for the average entry frequency and rank of type-T potential entrants in previous period l.
We simplify the notation by rewriting the utility of keyword entry and keyword
consideration as the following:
$$U_{kit} = \gamma X_{kit}^{U} + \varsigma_{kit}^{U}$$
$$V_{kit} = \lambda X_{kit}^{V} + \varsigma_{kit}^{V}$$
$$c_{kit} = 1\{V_{kit} > 0\}$$
$$X_{kit}^{U} = \big( X_{kit}^{U1},\ X_{kit}^{U2}(c_{kt}) \big)$$
Next we describe the MCMC algorithm.
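1. Draw $U_{kit}$
We update $U_{kit}$ given $a_{kit}$ and $c_{kit}$ as the following:
$$U_{kit} \mid a_{kit}, c_{kit} \sim \begin{cases} N(\gamma X_{kit}^{U}, 1) \times I(0, +\infty) & \text{if } a_{kit} = 1 \\ N(\gamma X_{kit}^{U}, 1) \times I(-\infty, 0) & \text{if } a_{kit} = 0 \text{ and } c_{kit} = 1 \\ N(\gamma X_{kit}^{U}, 1) & \text{if } a_{kit} = 0 \text{ and } c_{kit} = 0 \end{cases}$$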
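2. Draw $V_{kit}$ and update $c_{kit}$ and $X_{kit}^{U}$
We use the Metropolis-Hastings algorithm with a random walk chain to generate draws (see Chib
and Greenberg 1995). Let $V_{kit}^{(p)}$ denote the previous draw; then the next draw $V_{kit}^{(n)}$ is given by the
following:
$$V_{kit}^{(n)} = V_{kit}^{(p)} + \Delta$$
with the acceptance probability given below,
$$
Prob = \begin{cases}
\min\left\{ \dfrac{\exp\big[ -(V_{kit}^{(n)} - \lambda X_{kit}^{V})^2 / 2 \big]\, 1\{V_{kit}^{(n)} > 0\}}{\exp\big[ -(V_{kit}^{(p)} - \lambda X_{kit}^{V})^2 / 2 \big]},\ 1 \right\} & \text{if } a_{kit} = 1 \\[2ex]
\min\left\{ \dfrac{\exp\big[ -(V_{kit}^{(n)} - \lambda X_{kit}^{V})^2 / 2 \big]}{\exp\big[ -(V_{kit}^{(p)} - \lambda X_{kit}^{V})^2 / 2 \big]},\ 1 \right\} & \text{if } a_{kit} = 0 \text{ and } V_{kit}^{(n)} \cdot V_{kit}^{(p)} > 0 \\[2ex]
\min\left\{ \dfrac{\exp\big[ -(V_{kit}^{(n)} - \lambda X_{kit}^{V})^2 / 2 - \big(U_{kit} - \gamma X_{kit}^{U}(V_{kit}^{(n)})\big)^2 / 2 \big]}{\exp\big[ -(V_{kit}^{(p)} - \lambda X_{kit}^{V})^2 / 2 - \big(U_{kit} - \gamma X_{kit}^{U}(V_{kit}^{(p)})\big)^2 / 2 \big]},\ 1 \right\} & \text{if } a_{kit} = 0 \text{ and } V_{kit}^{(n)} \cdot V_{kit}^{(p)} \le 0
\end{cases}
$$
where $\Delta$ is a draw from the density $N(0, 2.25)$. Given an accepted $V_{kit}$, we update $c_{kit}$ and $X_{kit}^{U}$
accordingly.
3. Draw $\gamma$ similar to Step 2 in A.1.
4. Draw $\lambda$ similar to Step 3.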
B.2. Second Step
Given the vector of advertisers’ keyword consideration $c_{kt}$, we use the estimation results from the first step to construct the expected number of above-ranked advertisers of each type as follows:
$$\hat{N}^{a}_{kit}(T, c_{kt}) = \sum_{j \neq i,\, j \in T} \left[ \hat{P}(a_{kjt}=1 \mid c_{kt})\, \hat{P}(Rank_{kjt} < Rank_{kit}) \right]$$
where $\hat{P}(a_{kit}=1 \mid c_{kt}) = \sum_{r=1}^{R} \Phi\left[\gamma^{(r)} \left(X^{U1}_{kit}, X^{U2}_{kit}(c_{kt})\right)\right]$, $R=100$, $\gamma^{(r)}$ is every 100th of the last 10,000 MCMC iterations for the vector of coefficient parameters $\gamma$, and $\Phi$ is the cumulative distribution function of the standard normal distribution. $\hat{P}(Rank_{kjt} < Rank_{kit})$ is constructed based on Equation 2 using posterior draws from the estimation of the ad position model. Similarly, we can construct the expected number of below-ranked advertisers $\hat{N}^{b}_{kit}(c_{kt})$.
Finally, we use $\hat{N}^{a}_{kit}(T, c_{kt})$ and $\hat{N}^{b}_{kit}(c_{kt})$ as covariates and jointly estimate the proposed model of keyword entry and consideration in Equations 8 and 14 in a similar way as the algorithm outlined in B.1.
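The construction of the expected number of above-ranked advertisers can be sketched as follows. This is only an illustration: the helper names, the list-based data layout, and the matrix `prob_rank_above` (whose entry `[j][i]` stands for $\hat{P}(Rank_{kjt} < Rank_{kit})$) are hypothetical, and the sketch assumes that the probit probabilities are averaged over the $R$ retained draws so that the result lies between 0 and 1.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def entry_probability(gamma_draws, x):
    """Average of Phi(gamma'x) over the retained posterior draws of gamma."""
    total = 0.0
    for gamma in gamma_draws:
        index = sum(g * xi for g, xi in zip(gamma, x))
        total += PHI(index)
    return total / len(gamma_draws)


def expected_above(i, target_type, types, entry_probs, prob_rank_above):
    """Sum over j != i of type T of P(a_j = 1) * P(Rank_j < Rank_i)."""
    return sum(
        entry_probs[j] * prob_rank_above[j][i]
        for j in range(len(entry_probs))
        if j != i and types[j] == target_type
    )
```

The expected number of below-ranked advertisers would follow the same pattern with the rank inequality reversed.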
Appendices for Chapter Three
Appendix A. Derivation of Advertisers’ Expected Profit Function
We rewrite an advertiser’s expected advertising profit function in Equation 24 as the following:
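$$\Pi_k(b) = \sum_i E\Bigg\{ \underbrace{1\{g_k \in Pot_i\}}_{\text{Matching}}\, \underbrace{\sum_{\forall \{n_{ig}\}} \Pr(\{n_{ig}\} \mid S_{ig})}_{\text{Allocating}} * \Bigg( \underbrace{1\{c_i = 1 \mid \{n_{ig}\}\}\, 1\{q_{ik} = 1 \mid c_i = 1, \{n_{ig}\}\}}_{\text{Clicking}} * \sum_{j=1}^{n_{ig_k}} \underbrace{\Pr(rank_k = j \mid \{n_{ig}\})}_{\text{Ranking}}\, \underbrace{\left[v_k - E(CPC \mid j, \{n_{ig}\})\right]}_{\text{Margin per click}} \Bigg) \Bigg\} \tag{A1}$$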
where time t is omitted for expositional convenience. With a slight abuse of notation, the outermost summation in Equation A1 is over all impressions in one day, so some consumers may appear multiple times if they encountered multiple impressions on that day. Here the expectation is taken with respect to the four sets of measurement errors in the equations of consideration intensity, click utility, interest score, and weighted bid.
The simulation of the profit function in Equation A1 is extremely complicated because, in addition to the integration over $\{\epsilon^C, \epsilon^U, \epsilon^S, \epsilon^B\}$, we also need to integrate out all possible vectors of ad quota per category given the draws of their interest scores. To simplify the analysis, we replace the corresponding integration with the expected $\{n_{ig}\}$ as an approximation.
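$$\Pi_k(b) \doteq \sum_i E_{\{\epsilon, s\}}\Bigg\{ \underbrace{1\{g_k \in Pot_i\}}_{\text{Matching}}\, \underbrace{\Pr(U_{ik} > 0 \,\&\, C_i > 0 \mid \{E(n_{ig})\})}_{\text{Clicking}} * \sum_{j=1}^{E(n_{ig_k})} \underbrace{\Pr(rank_k = j \mid \{E(n_{ig})\})}_{\text{Ranking}}\, \underbrace{\left[v_k - E(CPC \mid j, \{E(n_{ig})\})\right]}_{\text{Margin per click}} \Bigg\} \tag{A2}$$
$$E(n_{ig}) = \lfloor 8p_{ig} \rfloor + \begin{cases} 1 & \text{if } Res_{ig} \text{ ranks top } L \\ 0 & \text{else} \end{cases} \tag{A3}$$
$$p_{ig} = \frac{S_{ig}(N_g)^{\alpha}}{\sum_{h \in Pot_i} S_{ih}(N_h)^{\alpha}} \tag{A4}$$
$$Res_{ig} = 8p_{ig} - \lfloor 8p_{ig} \rfloor \tag{A5}$$
$$L = \sum_{h \in Pot_i} Res_{ih} \tag{A6}$$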
where $E(n_{ig})$ stands for the expected ad quota for category $g$ given the multinomial distribution parameters determined by the vector of weighted interest scores. To ensure that $E(n_{ig})$ is an integer, we compute it in two steps. We first assign the floor of the theoretical mean of ad quota to each candidate category. Then we distribute the remaining $L$ ad quota to categories based on the discrepancy between their theoretical means and floors.
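The two-step integer allocation in Equations A3 to A6 is a largest-remainder rule. A minimal sketch follows, assuming eight ad slots per impression as in Equation A3; the function name, argument names, and the tie-breaking order among equal residuals are illustrative choices, not part of the original text.

```python
import math


def expected_quota(scores, n_ads, alpha, slots=8):
    """Integer ad quota per category via floors plus largest residuals (A3-A6)."""
    weights = [s * (n ** alpha) for s, n in zip(scores, n_ads)]
    total = sum(weights)
    shares = [w / total for w in weights]                         # A4
    floors = [math.floor(slots * p) for p in shares]              # first step
    residuals = [slots * p - f for p, f in zip(shares, floors)]   # A5
    # The residuals sum to the number of slots left after flooring (A6).
    leftover = slots - sum(floors)
    order = sorted(range(len(shares)), key=lambda g: residuals[g], reverse=True)
    quota = floors[:]
    for g in order[:leftover]:                                    # second step
        quota[g] += 1
    return quota
```

For example, with weighted scores proportional to 5, 3, and 2, the theoretical means are 4, 2.4, and 1.6, the floors are 4, 2, and 1, and the single remaining slot goes to the third category, whose residual (0.6) is the largest.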
Next we derive each component of the expected profit function in Equation A2. First, the
matching function is a deterministic indicator function characterized by the rule of matching.
Conditional on $\{E(n_{ig})\}$, the consumer’s click-through rate is
$$\Pr(U_{ik} > 0 \,\&\, C_i > 0 \mid \{E(n_{ig})\}) = \Phi[\alpha X^C]\, \Phi[\beta X^U + \xi^U_k + \eta^U_{ik}] \tag{A7}$$
where the similarity stock $SS_i$ in $X^C$ and the number of displayed ads from the same category $n_{ig}$ in $X^U$ both depend on $\{E(n_{ig})\}$, and $\Phi(\cdot)$ is the CDF of the standard normal distribution. We assume that a consumer’s belief on others’ ad ranks is i.i.d. log-normally distributed as given below:
$$F(b^{AR}) \sim LN(\mu_B, \sigma_B^2) \tag{A8}$$
where $b^{AR} = b * QS$ denotes the advertiser-specific ad rank and $(\mu_B, \sigma_B^2)$ are the mean and variance of $\ln(b^{AR})$.
Under such a belief and a given draw of advertiser k’s weighted bid $WB_{ik} = b^{AR}_k \exp(\epsilon^B_{ik})$, the probability for advertiser k to obtain a higher rank than a competitor l is
$$\Pr\left(\ln(b^{AR}_k) + \epsilon^B_{ik} > \ln(b^{AR}_l) + \epsilon^B_{il}\right) = \Phi\left(\frac{\ln(b^{AR}_k) - \mu_B + \epsilon^B_{ik}}{\sqrt{\sigma_B^2 + \omega_B^2}}\right) = \Phi\left(\frac{\ln(WB_{ik}) - \mu_B}{\sqrt{\sigma_B^2 + \omega_B^2}}\right) \tag{A9}$$
Then the probability for advertiser k to be ranked at the j-th position among $E(n_{ig})$ displayed ads in category $g$ is given below:
$$\Pr(rank_k = j \mid \{E(n_{ig})\}, WB_{ik}) = \binom{N_{ig}-1}{j-1} \left(1 - \Phi\left(\frac{\ln(WB_{ik}) - \mu_B}{\sqrt{\sigma_B^2 + \omega_B^2}}\right)\right)^{j-1} \Phi\left(\frac{\ln(WB_{ik}) - \mu_B}{\sqrt{\sigma_B^2 + \omega_B^2}}\right)^{N_{ig}-j} \tag{A10}$$
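Equations A9 and A10 can be evaluated directly: A9 gives the probability of out-ranking one competitor, and A10 is a binomial probability that exactly $j-1$ of the $N_{ig}-1$ competitors rank above advertiser k. The sketch below is illustrative only; the function and argument names are not from the original text.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def win_probability(weighted_bid, mu_b, sigma2_b, omega2_b):
    """Probability of out-ranking a single competitor (Equation A9)."""
    z = (math.log(weighted_bid) - mu_b) / math.sqrt(sigma2_b + omega2_b)
    return PHI(z)


def rank_probability(j, n_total, p_win):
    """Probability of holding position j among n_total advertisers (Equation A10)."""
    return (math.comb(n_total - 1, j - 1)
            * (1.0 - p_win) ** (j - 1)
            * p_win ** (n_total - j))
```

Summed over all positions from 1 to $N_{ig}$ these probabilities add up to one; the profit function only uses the positions up to the ad quota, so the sum over displayed positions is in general less than one.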
Given $WB_{ik}$ and $rank_k = j$, the $CPC_{ik}$ is the second highest weighted bid across all categories divided by advertiser k’s quality score:
$$CPC_{ik}(WB_{ik}) = \frac{1}{QS_k} \max\Bigg\{ \underbrace{\left( X_{(N_{ig}-j):N_{ig}} \,\middle|\, X_{(N_{ig}-j+1):N_{ig}} = WB_{ik} \right)}_{\text{Weighted bid right below in own category}},\; \min\Big\{ WB_{ik},\, \underbrace{\{X_{(N_{i,-g}-l+1):N_{i,-g}}\}_{l=1}^{n_{i,-g}}}_{\text{Weighted bids from all other categories}} \Big\} \Bigg\} \tag{A11}$$
$$\Pr\left(X_{(N_{ig}-j):N_{ig}} \le x \,\middle|\, X_{(N_{ig}-j+1):N_{ig}} = WB_{ik}\right) = \left(\frac{H(x)}{H(WB_{ik})}\right)^{j-1} \tag{A12}$$
where $X_{j:N}$ is the j-th order statistic of a sample of N random variables drawn i.i.d. from the distribution $H(WB) \sim LN(\mu_B, \sigma_B^2 + \omega_B^2)$, and $\{X_{(N_{i,-g}-l+1):N_{i,-g}}\}_{l=1}^{n_{i,-g}}$ stands for the top $n_{i,-g}$ impression-specific weighted bids from each of the other candidate categories. We derive the conditional distribution of the below-ranked weighted bid in category $g$ given $WB_{ik}$ and $Rank_k = j$ in Equation A12.
Finally, we integrate out the error term $\epsilon^B_{ik}$ in $WB_{ik}$ to obtain the expected profit function:
$$E_{\epsilon^B}\Bigg\{ \sum_{j=1}^{E(n_{ig_k})} \Pr(rank_k = j \mid \{E(n_{ig})\}) \left[v_k - E(CPC \mid j, \{E(n_{ig})\})\right] \Bigg\} = E_{\epsilon^B}\Bigg\{ \sum_{j=1}^{E(n_{ig_k})} \Pr(rank_k = j \mid \{E(n_{ig})\}, b^{AR}_k, \epsilon^B_{ik}) \left[v_k - E_{\epsilon^B_{i,-k}}(CPC \mid j, \{E(n_{ig})\}, b^{AR}_k, \epsilon^B_{ik})\right] \Bigg\} \tag{A13}$$
We simulate $\Pi_k(b)$ in Equation A2 using every 100th of the last 5,000 MCMC draws of the vector $\{S_{ig}\}$. Then, for each impression for a consumer i, we simulate advertiser k’s $CPC_{ik}$ based on Equation A11 using ten draws of random vectors. Specifically, given $WB_{ik} = b^{W}_k \exp(\epsilon^B_{ik})$, we simulate the next-highest weighted bid in category $g_k$ based on the CDF in Equation A12. After that, we draw the $n_{i,-g}$ highest order statistics in any other category in a similar way: we first draw the largest order statistic $X_{N_{i,-g}:N_{i,-g}}$ based on its CDF $H(x)^{N_{i,-g}}$; then we iteratively generate $X_{(N_{i,-g}-l):N_{i,-g}}$ given $X_{(N_{i,-g}-l+1):N_{i,-g}}$ for $l = 1, \dots, n_{i,-g}$ using the conditional CDF described in Equation A12. After we generate the simulated $CPC_{ik}(WB_{ik})$, we integrate out $\epsilon^B_{ik}$ in Equation A13 using the Gauss-Hermite quadrature method with 30 nodes.
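The top-down simulation of order statistics described above can be sketched by inverting the relevant CDFs. In the sketch below, the largest bid is drawn from $H(x)^{N}$, and each following bid is drawn as the maximum of the bids that remain below the previous one; taking the exponent to be the number of remaining bids is an assumption of this illustration, as are the function and argument names.

```python
import math
import random
from statistics import NormalDist


def draw_top_bids(n_total, n_top, mu, var, rng):
    """Draw the n_top largest of n_total i.i.d. LN(mu, var) bids, from the top down."""
    log_bid = NormalDist(mu, math.sqrt(var))   # distribution of the log weighted bid
    bids = []
    h_prev = 1.0                               # H at the previous (higher) bid
    for l in range(n_top):
        remaining = n_total - l                # bids not yet placed
        # Maximum of `remaining` draws restricted to H <= h_prev has
        # CDF (H(x) / h_prev) ** remaining, inverted below.
        h_new = h_prev * rng.random() ** (1.0 / remaining)
        h_new = min(max(h_new, 1e-12), 1.0 - 1e-12)   # keep inv_cdf well defined
        bids.append(math.exp(log_bid.inv_cdf(h_new)))
        h_prev = h_new
    return bids
```

The returned list is ordered from the highest bid downward, so the bid right below a given position can be read off by index.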
Given the simulated profit function $\Pi_k(b \mid v_k, QS_k)$, $v_k$ is an implicit function of $b_k$ determined by $\frac{\partial \Pi_k(b_k \mid v_k, QS_k)}{\partial b} = 0$. We numerically derive and solve the FOC to make inferences on $v_k$. We also randomly check the SOC $\frac{\partial^2 \Pi_k(b_k \mid v_k, QS_k)}{\partial b^2}$ at $b_k$, $v_k$ and $QS_k$ for 10% of our estimation sample and find that the SOCs are all negative.
It is difficult to analytically prove the concavity of an advertiser’s expected profit function. Instead, we conduct a numerical exercise to demonstrate that the simulated profit function is concave. We consider only one product category with $N = 20$ advertisers competing for $n = \{2, 4, 6, 8\}$ ad slots. We set the focal advertiser k’s value-per-click and quality score as $v_k = 5$, $QS_k = 100$, and assume that the distribution of others’ ad rank follows $F(b^{AR}) \sim LN(5, 1)$. We also set the variance of the measurement error term in the weighted bid as $\omega_B^2 = 1$. Figure 7 shows the relationship between an advertiser’s bid and his or her expected profit when the ad quota changes from 2 to 8. We notice that the profit function is inverted-U shaped and the advertiser’s optimal bid increases with the competition intensity, which is measured by the inverse of ad quota.
Figure 7. A Demonstration of Expected Advertising Profit Function
Notes. The red vertical line in each subplot points to the optimal bid in each case.
Appendix B. The MCMC Algorithm
We ran the MCMC chain for 20,000 iterations and used every 20th of the last 10,000 iterations to compute the mean and standard deviation of the posterior distribution of model parameters. We report below the MCMC algorithm for the joint models of click decision ($q_{ikt}$), ad quota per category ($n_{igt}$), and the set of displayed ads ($D_{it}$). The MCMC algorithm for the hierarchical Bayesian model of value-per-click is standard and is therefore omitted.
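The burn-in and thinning rule stated above (20,000 iterations, keeping every 20th of the last 10,000) can be sketched as follows for a single scalar parameter. The function name and the use of the sample standard deviation are illustrative assumptions.

```python
from statistics import mean, stdev


def posterior_summary(chain, keep_last=10_000, thin=20):
    """Posterior mean and standard deviation from the thinned tail of a chain."""
    tail = chain[-keep_last:]            # discard the burn-in iterations
    kept = tail[thin - 1::thin]          # every `thin`-th draw of the tail
    return mean(kept), stdev(kept), len(kept)
```

With the settings in the text this retains 500 draws per parameter.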
We derive below the posterior distribution of all parameters denoted by Θ. Given the
posterior distribution of the unknowns, we then draw different sets of parameters sequentially
until we achieve convergence.
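$$\begin{aligned} \Pr(\Theta \mid q, n, D) \propto{} & \prod_{i,t} \Bigg( \underbrace{\prod_{k \in D_{it}} \Big[ 1\{U_{ikt}, C_{it} > 0 \mid q_{ikt} = 1\}\, 1\{U_{ikt} \le 0 \text{ or } (U_{ikt} > 0 \,\&\, C_{it} \le 0) \mid q_{ikt} = 0\} * \Pr(C_{it} \mid \alpha_i, \delta_i, X^C) \Pr(U_{ikt} \mid \beta_i, X^U, f^U_m, \xi^U_k, \eta^U_{im}) \Big]}_{\text{Clicking}} \\ & * \underbrace{\prod_{g \in Pot_{it}} p_{igt}^{n_{igt}}(\{WS_{igt}\}) \Pr(WS_{igt} \mid \gamma, X^S, \xi^S_g, \eta^S_{im}, \omega_S^2)}_{\text{Allocating}} \\ & * \underbrace{\prod_{g \in Pot_{it}} \Big[ 1\{\min\{WB_{ikt} \mid k \in D_{it}, k \in g\} > \max\{WB_{ikt} \mid k \notin D_{it}, k \in g\}\} * \Pr(WB_{ikt} \mid \phi, X^B, \xi^B_k, \omega_B^2) \Big]}_{\text{Ranking}} \Bigg) \\ & * \prod_i \Pr(\alpha_i \mid \Delta_C, Z, \Sigma_C) \Pr(\beta_i \mid \Delta_U, Z, \Sigma_U) \Pr(\delta_i \mid \Delta_D, Z, \sigma_D^2) \prod_g \Pr(\xi^S_g \mid \sigma_S^2) \\ & * \prod_k \Pr(\xi^U_k, \xi^B_k \mid \Sigma_\xi) \prod_{i,m} \Pr(\eta^U_{im}, \eta^S_{im} \mid \Sigma_\eta) \end{aligned} \tag{B1}$$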
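1. Draw $U_{ikt} \mid q_{ikt}, C_{it}$
The posterior conditional distribution of $U_{ikt}$ given $q_{ikt}$ and $C_{it}$ is either normal or truncated normal, as expressed below.
$$U_{ikt} \mid q_{ikt}, C_{it} \sim \begin{cases} N(\bar{U}_{ikt}, 1) \times I(0, +\infty) & \text{if } q_{ikt} = 1 \\ N(\bar{U}_{ikt}, 1) \times I(-\infty, 0] & \text{if } q_{ikt} = 0 \text{ and } C_{it} > 0 \\ N(\bar{U}_{ikt}, 1) & \text{if } q_{ikt} = 0 \text{ and } C_{it} \le 0 \end{cases} \tag{B2}$$
where $\bar{U}_{ikt} = \beta_i X^U + f^U_m + \xi^U_k + \eta^U_{im}$.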
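2. Draw $\beta_i \mid U_{ikt}, X^U, f^U_m, \xi^U_k, \eta^U_{im}, \Delta_U, Z, \Sigma_U$
$$\beta_i \mid U_{ikt}, X^U, f^U_m, \xi^U_k, \eta^U_{im}, \Delta_U, Z, \Sigma_U \sim MVN(A, B) \tag{B3}$$
where $B = \left(X^{U\prime} X^U + \Sigma_U^{-1}\right)^{-1}$ and $A = B\left[X^{U\prime}(U - f^U - \xi^U - \eta^U) + \Sigma_U^{-1} \Delta_U Z\right]$.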
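3. Draw $\Delta_U, \Sigma_U \mid \beta_i, Z$
We assume the priors of $\Delta_U$ and $\Sigma_U$ as follows:
$$\Delta_U \sim MVN(\Delta_0, V_\Delta) \tag{B4}$$
$$\Sigma_U \sim IW(w_0, W_0) \tag{B5}$$
Then the posterior conditional distributions of $\Delta_U$ and $\Sigma_U$ are given below.
$$\Delta_U \mid \beta_i, Z, \Sigma_U, \Delta_0, V_\Delta \sim MVN(A, B) \tag{B6}$$
$$\Sigma_U \mid \beta_i, Z, \Delta_U, w_0, W_0 \sim IW\left(w_0 + N_I,\; W_0 + \sum_{i=1}^{N_I} (\beta_i - \Delta_U Z_i)^{\prime} (\beta_i - \Delta_U Z_i)\right) \tag{B7}$$
where $B = \left(\sum_{i=1}^{N_I} Z_i^{\prime} \Sigma_U^{-1} Z_i + V_\Delta^{-1}\right)^{-1}$, $A = B\left(\sum_{i=1}^{N_I} Z_i^{\prime} \Sigma_U^{-1} \beta_i + V_\Delta^{-1} \Delta_0\right)$, $N_I$ is the total number of consumers, $\Delta_0 = 0$, $V_\Delta = 100I$, $w_0 = 10$, $W_0 = 10I$, and $I$ is the identity matrix.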
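4. Draw $\xi^U_k \mid \xi^B_k, \Sigma_\xi, U_{ikt}, \beta_i, X^U, f^U_m, \eta^U_{im}$
We update $\xi^U_k$ for each ad k sequentially. The conditional distribution of $\xi^U_k$ given $\xi^B_k$ is the following:
$$\xi^U_k \mid \xi^B_k \sim N(\hat{\mu}, \hat{\sigma}^2) \tag{B8}$$
where $\hat{\mu} = \sigma_{12} \sigma_{22}^{-1} \xi^B_k$ and $\hat{\sigma}^2 = \sigma_{11} - \sigma_{12} \sigma_{22}^{-1} \sigma_{21}$, denoting $\Sigma_\xi = \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{pmatrix}$. The posterior distribution of $\xi^U_k$ is given below.
$$\xi^U_k \mid \xi^B_k, \Sigma_\xi, U_{ikt}, \beta_i, X^U, f^U_m, \eta^U_{im} \sim N(A, B) \tag{B9}$$
where $B = (N_k \hat{\sigma}^2 + 1)^{-1} \hat{\sigma}^2$, $A = B\left[\sum_{i,t,\, s.t.\, k \in D_{it}} \left(U_{ikt} - \beta_i X^U_{ikt} - f^U_m - \eta^U_{im}\right) + \frac{\hat{\mu}}{\hat{\sigma}^2}\right]$, and $N_k$ is the number of impressions s.t. $k \in D_{it}$.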
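The scalar update in Equations B8 and B9 combines the conditional normal implied by $\Sigma_\xi$ with the unit-variance residuals of the click utility. A minimal sketch is given below; the function name is hypothetical, `residuals` is assumed to hold the values $U_{ikt} - \beta_i X^U_{ikt} - f^U_m - \eta^U_{im}$ for all impressions in which ad k was displayed, and $\sigma_{21}$ is assumed equal to $\sigma_{12}$.

```python
import math
import random


def xi_u_posterior(residuals, xi_b, s11, s12, s22):
    """Posterior mean and variance of the ad-specific click effect (B8-B9)."""
    mu_hat = s12 / s22 * xi_b                # conditional mean in B8
    var_hat = s11 - s12 * s12 / s22          # conditional variance in B8
    n_k = len(residuals)
    post_var = var_hat / (n_k * var_hat + 1.0)                   # B in B9
    post_mean = post_var * (sum(residuals) + mu_hat / var_hat)   # A in B9
    return post_mean, post_var


def draw_xi_u(residuals, xi_b, s11, s12, s22, rng):
    """One Gibbs draw of the effect from its normal posterior."""
    post_mean, post_var = xi_u_posterior(residuals, xi_b, s11, s12, s22)
    return rng.gauss(post_mean, math.sqrt(post_var))
```

With no impressions the posterior collapses to the conditional prior in B8, which is a convenient sanity check for an implementation.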
5. Draw $\Sigma_\xi \mid \xi^U_k, \xi^B_k$
We assume that the prior of $\Sigma_\xi$ follows an inverted-Wishart distribution.
$$\Sigma_\xi \sim IW(w_0, W_0) \tag{B10}$$
Then the posterior of $\Sigma_\xi$ can be shown as follows.
$$\Sigma_\xi \mid \xi^U_k, \xi^B_k, w_0, W_0 \sim IW(w_0 + K,\; W_0 + \xi \xi^{\prime}) \tag{B11}$$
where $K$ is the number of ads, $\xi = (\xi^U, \xi^B)$, $w_0 = 10$, and $W_0 = I$.
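6. Draw $f^U_m \mid U_{ikt}, X^U, \xi^U_k, \eta^U_{im}$
This is the standard Bayesian regression with the variance of measurement error fixed at one.
7. Draw $C_{it} \mid q_{ikt}, U_{ikt}$
$$C_{it} \mid q_{ikt}, U_{ikt} \sim \begin{cases} N(\bar{C}_{it}, 1) \times I(0, +\infty) & \text{if } \max_{k \in D_{it}}\{q_{ikt}\} = 1 \\ N(\bar{C}_{it}, 1) \times I(-\infty, 0] & \text{if } \max_{k \in D_{it}}\{q_{ikt}\} = 0 \text{ and } \max_{k \in D_{it}}\{U_{ikt}\} > 0 \\ N(\bar{C}_{it}, 1) & \text{if } \max_{k \in D_{it}}\{q_{ikt}\} = 0 \text{ and } \max_{k \in D_{it}}\{U_{ikt}\} \le 0 \end{cases} \tag{B12}$$
where $\bar{C}_{it} = \alpha_i X^C$.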
8. Draw $\delta_i = \text{logit}(\rho_i) \mid C_{it}, \alpha_i, X^C$.
Recall that the two stock variables in $X^C$ are defined as follows:
$$AS_{it} = a_{it} + \rho_i^{d(t) - d(t-1)} AS_{i,t-1} \tag{B13}$$
$$SS_{it} = sim_{it} + \rho_i^{d(t) - d(t-1)} SS_{i,t-1} \tag{B14}$$
where $d(t)$ is the date for ad impression t. We follow Erdem et al. (2008) and Terui et al. (2011) to set the initial value of the advertising stock (and, analogously, of the similarity stock) as $AS_{i0} = \frac{\sum_{t=1}^{T} a_{it} / T}{1 - \rho_i}$, where T is the total number of days used in the estimation period. The vector of observed covariates $X^C_{it}$ can therefore be viewed as a function of $\delta_i$. We use the random-walk Metropolis-Hastings algorithm to generate draws for each transformed carryover parameter $\delta_i$. Let $\delta_i^{(p)}$ denote the previous draw; then the next draw, $\delta_i^{(n)}$, is given by the following:
$$\delta_i^{(n)} = \delta_i^{(p)} + \Psi_D \tag{B15}$$
with acceptance probability
$$\min\left\{ \frac{\prod_t \exp\left(-\frac{[C_{it} - \alpha_i X^C_{it}(\delta_i^{(n)})]^2}{2}\right) \exp\left(-\frac{[\delta_i^{(n)} - \Delta_D Z_i]^2}{2\sigma_D^2}\right)}{\prod_t \exp\left(-\frac{[C_{it} - \alpha_i X^C_{it}(\delta_i^{(p)})]^2}{2}\right) \exp\left(-\frac{[\delta_i^{(p)} - \Delta_D Z_i]^2}{2\sigma_D^2}\right)},\; 1 \right\} \tag{B16}$$
where $\Psi_D$ is a draw from the density $N(0, 0.01I)$.
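Because the covariates $X^C_{it}$ depend on $\delta_i$ only through the stock recursions in B13 and B14, each proposal requires rebuilding the stock path. The sketch below illustrates that rebuild for one consumer. The function names are hypothetical, and `start_date` (the date attached to the initial stock, needed to compute the first gap $d(1) - d(0)$) is an assumption of this sketch that the text does not specify.

```python
import math


def carryover_from_delta(delta):
    """Map the transformed parameter delta = logit(rho) back to rho in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-delta))


def initial_stock(exposures, n_days, rho):
    """Initial stock: average daily exposure divided by (1 - rho)."""
    return (sum(exposures) / n_days) / (1.0 - rho)


def stock_path(exposures, dates, rho, stock0, start_date):
    """Stock after each impression: a_t + rho ** (d(t) - d(t-1)) * previous stock."""
    stocks = []
    prev_stock, prev_date = stock0, start_date
    for a_t, d_t in zip(exposures, dates):
        prev_stock = a_t + rho ** (d_t - prev_date) * prev_stock
        prev_date = d_t
        stocks.append(prev_stock)
    return stocks
```

Two impressions on the same date have a gap of zero, so the earlier stock carries over in full; a gap of two days discounts it by $\rho_i^2$.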
9. Draw $WS_{igt} \mid n_{igt}, \gamma, X^S, \xi^S_g, \eta^S_{im}, \omega_S^2$
We use the random-walk Metropolis-Hastings algorithm to generate draws. For each impression, we simultaneously update the vector of log weighted interest scores for all candidate categories, which is denoted by $\overrightarrow{LWS}_{it} = \{\ln(WS_{igt})\}_{g \in Pot_{it}}$. Let $\overrightarrow{LWS}_{it}^{(p)}$ denote the previous draw; then the next draw, $\overrightarrow{LWS}_{it}^{(n)}$, is given by the following:
$$\overrightarrow{LWS}_{it}^{(n)} = \overrightarrow{LWS}_{it}^{(p)} + \Psi_S \tag{B17}$$
with acceptance probability
$$\min\left\{ \frac{\prod_{g \in Pot_{it}} \left( \frac{\exp(LWS_{igt}^{(n)})}{\sum_{h \in Pot_{it}} \exp(LWS_{iht}^{(n)})} \right)^{n_{igt}} \exp\left( -\frac{(LWS_{igt}^{(n)} - \gamma X^S - \xi^S_g - \eta^S_{im})^2}{2\omega_S^2} \right)}{\prod_{g \in Pot_{it}} \left( \frac{\exp(LWS_{igt}^{(p)})}{\sum_{h \in Pot_{it}} \exp(LWS_{iht}^{(p)})} \right)^{n_{igt}} \exp\left( -\frac{(LWS_{igt}^{(p)} - \gamma X^S - \xi^S_g - \eta^S_{im})^2}{2\omega_S^2} \right)},\; 1 \right\} \tag{B18}$$
where $\Psi_S$ is a draw from the density $N(0, 0.25I)$.
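The acceptance probability in B18 is most safely evaluated on the log scale, since it multiplies multinomial shares raised to the observed quotas by normal density kernels. The sketch below is an illustration for one impression; the function names are hypothetical, and `prior_means` stands for the values $\gamma X^S + \xi^S_g + \eta^S_{im}$ for each candidate category.

```python
import math


def log_kernel(lws, quotas, prior_means, omega2):
    """Log of the numerator (or denominator) of B18 for one impression."""
    top = max(lws)
    # Log of the sum of exponentials, shifted by the maximum for stability.
    log_denom = top + math.log(sum(math.exp(x - top) for x in lws))
    value = 0.0
    for x, n, mean in zip(lws, quotas, prior_means):
        value += n * (x - log_denom)                   # share raised to the quota
        value += -0.5 * (x - mean) ** 2 / omega2       # normal kernel of the log score
    return value


def acceptance_probability(lws_new, lws_prev, quotas, prior_means, omega2):
    """Metropolis-Hastings acceptance probability for a proposed score vector."""
    log_ratio = (log_kernel(lws_new, quotas, prior_means, omega2)
                 - log_kernel(lws_prev, quotas, prior_means, omega2))
    return math.exp(min(0.0, log_ratio))
```

Working with log ratios avoids underflow when the number of candidate categories or the quotas are large.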
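10. Draw $\gamma, \omega_S^2 \mid WS_{igt}, X^S, \xi^S_g, \eta^S_{im}$
We assume the priors of $\gamma$ and $\omega_S^2$ as follows:
$$\gamma \sim MVN(\gamma_0, \Sigma_0) \tag{B19}$$
$$\omega_S^2 \sim IG(a, b) \tag{B20}$$
The posterior distributions of $\gamma$ and $\omega_S^2$ are expressed below.
$$\gamma \mid WS_{igt}, X^S, \xi^S_g, \eta^S_{im}, \omega_S^2, \gamma_0, \Sigma_0 \sim MVN(A, B) \tag{B21}$$
$$\omega_S^2 \mid WS_{igt}, X^S, \xi^S_g, \eta^S_{im}, \gamma, a, b \sim IG\left( \frac{N_S}{2} + a,\; \left[ \frac{1}{b} + \frac{\sum_{i,g,t} \left(\ln(WS_{igt}) - \gamma X^S - \xi^S_g - \eta^S_{im}\right)^2}{2} \right] \right) \tag{B22}$$
where $B = \left[\frac{X^{S\prime} X^S}{\omega_S^2} + \Sigma_0^{-1}\right]^{-1}$, $A = B\left[X^{S\prime}(\ln(WS) - \xi^S - \eta^S) + \Sigma_0^{-1} \gamma_0\right]$, $N_S$ is the total number of candidate categories across all impressions, $\gamma_0 = 0$, $\Sigma_0 = 100I$, $a = 10$ and $b = 1$.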
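11. Draw $\xi^S_g \mid WS_{igt}, X^S, \eta^S_{im}, \gamma, \omega_S^2, \sigma_S^2$
$$\xi^S_g \mid WS_{igt}, X^S, \eta^S_{im}, \gamma, \omega_S^2, \sigma_S^2 \sim N(A, B) \tag{B23}$$
where $B = (N_g \sigma_S^2 + \omega_S^2)^{-1} \sigma_S^2 \omega_S^2$, $A = B\left[\frac{1}{\omega_S^2} \sum_{i,t,\, s.t.\, g \in Pot_{it}} \left(\ln(WS_{igt}) - \gamma X^S - \eta^S_{im}\right)\right]$, and $N_g$ is the number of impressions s.t. $g \in Pot_{it}$.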
12. Draw $\sigma_S^2 \mid \xi^S_g$
Assume that the prior of $\sigma_S^2$ follows an inverted-Gamma distribution.
$$\sigma_S^2 \sim IG(a, b) \tag{B24}$$
The posterior of $\sigma_S^2$ is given below.
$$\sigma_S^2 \mid \xi^S_g, a, b \sim IG\left( \frac{N_g}{2} + a,\; \left[ b^{-1} + \frac{\sum_{i,t,\, s.t.\, g \in Pot_{it}} (\xi^S_g)^2}{2} \right] \right) \tag{B25}$$
where $a = 10$ and $b = 1$.
13. Draw $WB_{ikt} \mid D_{it}, \phi, X^B, \xi^B_k, \omega_B^2$
We update the vector of $WB_{ikt}$ of all competing ads in the auction for each impression sequentially. Given the set of displayed ads $D_{it}$ for an impression, the posterior conditional distribution of $WB_{ikt}$ is truncated normal, as expressed below.
$$WB_{ikt} \mid D_{it} \sim \begin{cases} N(\phi X^B_{kt} + \xi^B_k, \omega_B^2) \times I(\max\{WB_{ikt} \mid k \notin D_{it}\}, +\infty) & \text{if } k \in D_{it} \\ N(\phi X^B_{kt} + \xi^B_k, \omega_B^2) \times I(-\infty, \min\{WB_{ikt} \mid k \in D_{it}\}) & \text{if } k \notin D_{it} \end{cases} \tag{B26}$$
14. Draw $\alpha_i$ similar to Step 2.
15. Draw $\Delta_C$, $\Sigma_C$ similar to Step 3.
16. Draw $\Delta_D$, $\sigma_D^2$ similar to Step 3.
17. Draw $\xi^B_k$ similar to Step 4.
18. Draw $\eta^U_{im}$ similar to Step 4.
19. Draw $\eta^S_{im}$ similar to Step 4.
20. Draw $\Sigma_\eta$ similar to Step 5.
21. Draw $\phi, \omega_B^2$ similar to Step 10.
Asset Metadata
Creator: Lu, Shijie (author)
Core Title: Essays on online advertising markets
School: Marshall School of Business
Degree: Doctor of Philosophy
Degree Program: Business Administration
Publication Date: 04/15/2015
Defense Date: 03/13/2015
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: Bayesian estimation, behavioral targeting, competition, entry, generalized second-price auction, incomplete-information game, infomediary, Internet marketing, OAI-PMH Harvest, online advertising, paid search
Format: application/pdf (imt)
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Yang, Sha (committee chair), Dukes, Anthony (committee member), Shum, Matthew (committee member), Yang, Botao (committee member)
Creator Email: cj.shijielu@gmail.com, shijielu@usc.edu
Permanent Link (DOI): https://doi.org/10.25549/usctheses-c3-548697
Unique identifier: UC11297760
Identifier: etd-LuShijie-3300.pdf (filename), usctheses-c3-548697 (legacy record id)
Legacy Identifier: etd-LuShijie-3300.pdf
Dmrecord: 548697
Document Type: Dissertation
Rights: Lu, Shijie
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA