A structural econometric analysis of network and social interaction models
(USC Thesis Other)
A STRUCTURAL ECONOMETRIC ANALYSIS OF NETWORK AND SOCIAL INTERACTION MODELS

by Shuyang Sheng

A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (ECONOMICS)

August 2013

Copyright 2013 Shuyang Sheng

To my parents and to Fanli Zhou.

Acknowledgements

I am deeply grateful to my advisor Geert Ridder for his valuable and insightful advice and guidance throughout the period of my dissertation research. Without his help and support this research could not have been finished. I am thankful to my committee members John Strauss, Roger Moon, Cheng Hsiao, Hesham Pesaran, and Sha Yang for their helpful comments and suggestions, which have substantially improved the dissertation. I want to thank Martin Weidner and Bo Kim, my officemates when the main part of this research was conducted, with whom I had a joyful time and helpful discussions. I also thank my friend Bo Zhou for the happiness she brought to me during my life at USC. My deepest thanks go to my parents, who taught me to be independent and strong, and to my husband Fanli Zhou, who gives me unconditional love and support and encourages me to achieve my childhood dreams.

Table of Contents

Acknowledgements
List of Tables
List of Figures
Abstract
Chapter One Introduction
Chapter Two Network Formation Models
2.1 Introduction
2.2 A Model of Network Formation
2.2.1 Setup
2.2.2 The existence of pairwise stable networks
2.3 Identification
2.3.1 Multiple equilibria
2.3.2 Partial identification
2.3.3 Subnetworks
2.4 Estimation
2.4.1 Graph isomorphism
2.4.2 Estimation and inference
2.4.3 Computation of the bound functions
2.5 Monte Carlo Simulations
2.6 Conclusions
Chapter Three Social Interaction Models
3.1 Introduction
3.2 A Model of Social Interactions Through Learning
3.3 Identification
3.3.1 When $\varepsilon$ is i.i.d.
3.3.2 When $\varepsilon$ has geographic-neighborhood heterogeneity
3.4 Conclusions
Chapter Four Conclusions
Appendix A Proofs
Bibliography

List of Tables

Table 1 The Collections of Nonadjacent Networks
Table 2 The $PS_{NT}$ Sets for Multiple Equilibria
Table 3 The $PS_{NT}$ Sets for Unique Equilibrium
Table 4 The $PS_{NT}$ Sets that Contain a Network with $G_{13} = 1$
Table 5 The $PS_{NT}$ Sets of Subnetworks
Table 6 Projections of the Estimated Identified Sets: $N = 3, 4$
Table 7 Projections of the Estimated Identified Sets: $N = 6$

List of Figures

Figure 1 An Example of a Closed Cycle
Figure 2 An Example of $PS_{NT}$ Sets
Figure 3 A Subnetwork and Its Neighborhood
Figure 4 An Example of the Labeling Problem
Figure 5 Estimated Identified Sets Projected to the $(\beta, v)$-Plane
Figure 6 Estimated Identified Sets Projected to the $(\beta, w)$-Plane
Figure 7 Estimated Identified Sets Projected to the $(v, w)$-Plane

Abstract

Social and economic networks play an important role in shaping individuals' behaviors. In this dissertation, we provide a structural econometric analysis of network-related models, including network formation models and social interaction models. In the analysis of network formation models, the goal is to identify and estimate the underlying utility parameters using observed data on network structure, i.e., who is linked with whom. We consider a game-theoretic model of network formation and use pairwise stability, introduced by Jackson and Wolinsky (1996), as the equilibrium condition. The parameters are not point identified when there are only multiple equilibria. We leave the equilibrium selection completely unrestricted and use partial identification. Following Ciliberto and Tamer (2009), we derive bounds on the probability of observing a network. These bounds, however, are computationally infeasible if networks are large. To overcome this computational problem, we propose a novel method based on subnetworks. A subnetwork is the restriction of a network to a subset of the individuals. We derive bounds on the probability of observing a subnetwork, considering only the pairwise stability of the subnetwork rather than the entire network. Under mild assumptions, these subnetwork bounds are computationally feasible as long as we consider only small subnetworks. As for the social interaction models, we focus on a special case where individuals interact because they can learn from their neighbors about a new technology. We follow the literature on nonparametric identification and provide conditions under which the structural functions and average learning effects in this model can be nonparametrically identified.
Keywords: network formation, pairwise stability, multiple equilibria, partial identification, subnetworks, simulation, social interactions, Bayesian learning, nonparametric identification, nonadditive index models

Chapter One
Introduction

Social and economic networks play an important role in shaping individuals' behaviors. Numerous studies have provided empirical evidence documenting that friends can influence an individual's outcomes, including educational achievement, employment, adoption of technology, smoking, and health.[1] In this research, we aim to provide a structural econometric analysis of economic models of networks. Unlike network models in mathematics, these models assume that individuals involved in social networks make optimal decisions based on utility maximization. The goal of this research is to recover the structural parameters in these models using observed data on networks.

[1] See, e.g., Sacerdote (2001), Calvó-Armengol, Patacchini, and Zenou (2009), de Giorgi, Pellizzari and Redaelli (2010), Topa (2001), Conley and Udry (2010), Nakajima (2007), and Christakis and Fowler (2007).

We consider two types of network models: one explaining the causes of networks, and the other explaining the consequences of networks. The first type of model is called a network formation model, which explores how individuals optimally form a network given their utilities in each possible network. The structural analysis of network formation models is presented in Chapter 2. In this chapter, we consider a game-theoretic model of network formation in which individuals choose their friends simultaneously and the network they form is an equilibrium outcome. The main problem in the identification of the structural parameters (i.e., the utility parameters) in this model is that there may be multiple equilibria (i.e., given the utilities there may be more than one equilibrium network).
In fact, if there are only multiple equilibria, the parameters are not point identified. Instead of imposing restrictions on equilibrium selection, we circumvent this nonidentifiability problem using partial identification and derive bounds on the probability of observing a network. These bounds, however, are computationally infeasible when networks are large. To overcome this computational problem, we propose a novel approach based on subnetworks in which we derive bounds on the probability of observing a subnetwork. These subnetwork-based bounds are computationally feasible even in large networks.

In Chapter 3, we examine the second type of network model, which is called a social interaction model since it explores the interactions of an individual and her friends in a network, i.e., how an individual's behavior is affected by the behaviors of her friends. We focus on a particular social interaction model where individuals interact because they can learn from their neighbors about a new agricultural technology. Our goal is to identify the structural parameters (e.g., the input demand and production functions) in this model nonparametrically (i.e., no parametric assumption is imposed on the structural functions) under the assumption that the network structure is exogenously determined. We apply the results in the literature on nonparametric identification and provide conditions under which the structural functions (and also average effects) can be nonparametrically identified from the observed data distribution.

Chapter Two
Network Formation Models

2.1 Introduction

In this chapter, we seek to develop a structural model of network formation. More precisely, suppose we have observed data on the network structure, i.e., who is linked with whom.[1] We want to identify and estimate the parameters in the model that governs the network formation.

[1] Due to cost constraints, most network datasets only record a subset of the relationships in a network. In this paper, we consider the case when this subset is a random sample of all the relationships.

We consider game-theoretic models of network formation, also known as strategic network formation models in the network literature (see Goyal, 2007; Jackson, 2008; de Martí & Zenou, 2009, for comprehensive surveys). These models assume that individuals make friends based on utility maximization and treat a network as the equilibrium outcome of strategic interactions among individuals. Compared with random graph models,[2] which assume that each link occurs at random, strategic models have solid microfoundations and thus are more useful for policy evaluation. The equilibrium concept we use is pairwise stability, introduced in a seminal paper by Jackson and Wolinsky (1996). A network is pairwise stable if no pair of individuals has incentives to create a new link, and no individual has incentives to sever an existing link. This widely used equilibrium concept is regarded as a necessary condition for any equilibrium concept in strategic network formation (Calvó-Armengol & İlkılıç, 2009). By imposing only the pairwise stability condition, we make the weakest possible assumption about individual behaviors.

We specify the utility of an individual in a network following the random utility model. We allow the utility to depend not only on one's own friends, but also on friends of friends and friends that one and her friends have in common. This is motivated by the evidence that these indirect friends affect link formation in networks like friendships, job contacts, favor exchanges, and risk sharing.[3]

Once the utility from a link depends on other links, the pairwise stability condition leads to an interdependent simultaneous discrete choice model where the decision to form a link depends on the choices of others.
The interdependence of the choices renders invalid a commonly used single-equation discrete choice model which treats each link as an independent choice of a pair, because other links in the network, which are endogenously determined, could be correlated with the error term. What further complicates the statistical inference of the model is that there may be multiple equilibria. Since we consider a weak equilibrium condition and allow for utility interdependence, it is not surprising that multiple equilibria will arise, as in a wide range of static games with complete and incomplete information (e.g., Berry & Tamer, 2006; Bisin, Moro, & Topa, 2011; Bresnahan & Reiss, 1991; de Paula & Tang, 2012).

[2] See, e.g., Bollobás (2001), Jackson and Rogers (2007), Chandrasekhar and Jackson (2012), and Watts and Strogatz (1998).

[3] See, e.g., Christakis, Fowler, Imbens, and Kalyanaraman (2010), Mele (2010), Calvó-Armengol (2004), Jackson, Barraquer, and Tan (2012), and Bramoullé and Kranton (2007).

Because the fraction of unique equilibria is close to zero, the model is generally not identified. A possible approach to achieving identifiability is to assume an equilibrium selection mechanism which specifies how each pairwise stable network is selected if there is more than one such network (e.g., Bajari, Hahn, Hong, & Ridder, 2011; Bajari, Hong, & Ryan, 2010; Bjorn & Vuong, 1985). In network formation, an implicit way to do this is to use a sequential model which assumes that individuals form each link in a random sequence (Christakis, Fowler, Imbens, & Kalyanaraman, 2010; Jackson & Watts, 2002; Mele, 2010). This sequential model selects certain networks from the equilibrium networks in the static model that we consider (Jackson & Watts, 2002; Young, 1993).

We do not want to impose any restrictions on the equilibrium selection mechanism. We leave this mechanism totally unspecified and seek to derive inequalities that can be used to partially identify the model.
Partial identification has been extensively applied to static games in part because of its advantage of allowing for any equilibrium selection (e.g., Andrews, Berry, & Jia, 2004; Beresteanu, Molchanov, & Molinari, 2011; Berry & Tamer, 2006; Ciliberto & Tamer, 2009; Galichon & Henry, 2011; Pakes, Porter, Ho, & Ishii, 2006). We follow the insight of Ciliberto and Tamer and derive upper and lower bounds on the probability of observing a network, using the fact that an equilibrium selection probability always lies between 0 and 1.

The above bound approach is computationally feasible only when networks are small: the computation of the bounds involves checking pairwise stability for all possible networks, the number of which increases exponentially with the number of individuals.[4] In fact, this computational problem arises in almost all the network formation models because of the need to handle, directly or indirectly, all possible networks. To overcome this problem, researchers have adopted Markov Chain Monte Carlo (MCMC) methods in random graph models (Snijders, 2002) and the aforementioned sequential models (Christakis et al., 2010; Mele, 2010). Leaving aside the issue that a Markov Chain may take exponential time to converge (Chandrasekhar & Jackson, 2012), this MCMC approach is not applicable when the model is only partially identified.

[4] For $N$ individuals, the number of all possible networks is $2^{N(N-1)/2}$.

To tackle the computational problem, we propose a novel method that is different from any existing methods in the literature. The idea is to use subnetworks. A subnetwork is the restriction of a network to a subset of the individuals. The definition of pairwise stability implies that if a network is pairwise stable, any subnetwork of it must also be pairwise stable. From this simple property we can derive upper and lower bounds on the probability of observing a subnetwork, considering only the pairwise stability of the subnetwork.
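
To make the scale of the computational problem concrete, the short sketch below tabulates the number of possible networks, $2^{N(N-1)/2}$, for a few values of $N$, and shows the restriction of a network to a subset of individuals. The function names `num_networks` and `subnetwork` and the example adjacency matrix are illustrative assumptions, not part of the dissertation.

```python
def num_networks(n):
    """Number of undirected networks on n individuals: 2^(n(n-1)/2)."""
    return 2 ** (n * (n - 1) // 2)


def subnetwork(G, members):
    """Restrict the adjacency matrix G (list of lists) to `members`.

    The result keeps only the rows and columns of the listed individuals,
    in the order given.
    """
    return [[G[i][j] for j in members] for i in members]


# Growth of the number of networks with the number of individuals.
for n in (3, 4, 6, 10, 20):
    print(n, num_networks(n))

# A hypothetical 4-person network with links 0-1, 0-2, and 1-3.
G = [
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
]
sub = subnetwork(G, [0, 1, 3])  # restriction to individuals 0, 1, and 3
print(sub)
```

Already at $N = 10$ there are $2^{45}$ networks, whereas a subnetwork of three individuals has only $2^3 = 8$ possible configurations regardless of $N$.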
If the utility interdependence is assumed to take place only within a neighborhood (i.e., friends of friends and friends in common), these new bounds, though not sharp, are computationally feasible no matter how large the network is, provided that we only consider small subnetworks and that the maximal number of friends an individual has is small.

The idea of subnetworks is also considered by Chandrasekhar and Jackson (2012) in order to achieve feasibility, but their approach is substantially different from ours. They develop a particular type of random graph models based on subnetwork statistics,[5] assuming that a network is built up from randomly generated subnetworks. Unlike their approach, our approach maintains the assumption of strategic behaviors (i.e., pairwise stability), so the subnetworks in a network can be arbitrarily correlated. In a rough sense, our approach can be understood as a microfoundation of theirs.

[5] For example, the numbers of links, triangles, and stars.

As for the estimation, we consider the asymptotics for a large number of networks. The size of the networks can vary across observations, another advantage of using subnetworks. Following the literature on partially identified models (e.g., Andrews & Jia, 2012; Andrews & Soares, 2010; Chernozhukov, Hong, & Tamer (hereafter CHT), 2007; Romano & Shaikh, 2010), we discuss how to consistently estimate the identified set, which is defined using the feasible moment inequalities derived from subnetworks, and how to construct a confidence region. When estimating the distribution of subnetworks, we may face two problems. First, we may not be able to uniquely label the individuals to represent a network if $X$ is discrete. Second, there may be more than one subnetwork in a network to choose from. The first problem is resolved if we consider the distribution of the equivalence classes of subnetworks.
As for the second problem, we show that the subnetworks of the same size are exchangeable, so it does not matter which subnetwork we choose. These results are proved using graph isomorphism in graph theory. As for the bounds, we compute them by simulation if they do not have closed forms.

We contribute to the literature in several aspects. First, we propose a utility function that allows for individual heterogeneity and ensures the existence of pairwise stable networks for all parameter values. These results are new to the network literature. We also contribute to the econometric literature by addressing the econometric issues if the existence of pairwise stable networks is not ensured. Second, we leave the equilibrium selection mechanism completely unrestricted and use partial identification to overcome the nonidentifiability problem. We are unaware of prior literature that uses partial identification to deal with multiple equilibria in network formation models. Third, we propose a novel method to tackle the computational problem. To our knowledge, this is the first study that provides a computationally feasible method for the estimation of partially identified network formation models. Fourth, we show how to use graph isomorphism to deal with labeling-related issues which often arise when we handle network data. This is also new to the literature.

Other related literature includes social interaction models (Brock & Durlauf, 2001; Blume, Brock, Durlauf, & Ioannides, 2011) and matching games (Choo & Siow, 2006; Echenique, Lee, & Shum, 2010; Fox, 2010a, 2010b; Galichon & Salanié, 2010). Social interaction models typically consider the effects of peers for a given network, which complements the focus of network formation games. Manski (1993) addresses the difficulties in the identification of social interactions, in particular, how to disentangle the effects of peers on outcomes from the effects of self selection of peers.
Network formation models may provide a useful perspective for the disentanglement of these effects. Matching games can be understood as "restricted" network formation games, where the number and types of parties to which an individual can be matched and the utility interdependence across matches are restricted. This implies that the econometric techniques developed for matching games are often not applicable to network formation games, unless we consider special models of the latter (Currarini, Jackson, & Pin, 2009).

The remainder of the chapter is organized as follows. Section 2.2 presents a model of network formation. Section 2.2.1 sets up the model. Section 2.2.2 establishes the existence of equilibrium. Section 2.3 presents the identification results. Section 2.3.1 addresses the problem of multiple equilibria. Section 2.3.2 derives inequalities from the pairwise stability condition. Section 2.3.3 introduces subnetworks and derives inequalities using subnetworks. Section 2.4 presents the estimation method. Section 2.4.1 discusses the network isomorphism. Section 2.4.2 discusses the estimation and inference of the identified set. Section 2.4.3 provides simulators for the bounds. Section 2.5 contains the results of Monte Carlo simulations, and Section 2.6 concludes the chapter.

2.2 A Model of Network Formation

2.2.1 Setup

Let $V = \{1, 2, \ldots, N\}$ be the set of individuals who can form links. The links are undirected in the sense that forming a link requires the consent of both individuals involved in the link, but severing a link can be unilateral. This is the natural setting in the context of friendship networks, and for that reason we call linked individuals friends. The links form a network, which is denoted by a graph $G = (V, E)$, where the vertex set $V$ contains the individuals, and the edge set $E$ contains the links.
Equivalently, the network can be represented by an $N \times N$ symmetric binary matrix $G$, where the $(i, j)$ entry represents the relationship between $i$ and $j$:

$$G_{ij} = \begin{cases} 1 & \text{if } i \text{ and } j \text{ are friends,} \\ 0 & \text{otherwise.} \end{cases}$$

Following the convention, we normalize $G_{ii} = 0$ for all $i \in V$. Let $\mathcal{G}$ be the set of all possible networks. For $N$ individuals, there are $2^{N(N-1)/2}$ possible networks, so $|\mathcal{G}| = 2^{N(N-1)/2}$.

Utility. Each individual $i$ has a $K \times 1$ vector of observed attributes $X_i$ (e.g., gender, age, race, parental education) and an $(N-1) \times 1$ vector of unobserved (to researchers) preferences $\varepsilon_i = (\varepsilon_{i1}, \ldots, \varepsilon_{i,i-1}, \varepsilon_{i,i+1}, \ldots, \varepsilon_{iN})'$, where $\varepsilon_{ij}$ is $i$'s preference for link $ij$. The utility of $i$ in network $G$ generally depends on the attributes and preferences of all the individuals:

$$U_i(G, X, \varepsilon), \tag{2.1}$$

where $X = (X_1, \ldots, X_N)'$ (an $N \times K$ matrix) and $\varepsilon = (\varepsilon_1', \ldots, \varepsilon_N')'$ (an $N(N-1) \times 1$ vector). This general form can allow $i$'s utility to depend on heterogeneous utilities from friends of friends. If we only allow for the dependence on heterogeneous utilities from direct friends, it reduces to $U_i(G, X, \varepsilon_i)$.

For any $i \neq j$, we write $G = (G_{ij}, G_{-ij})$, where $G_{-ij} \in \mathcal{G}_{-ij}$ is the network obtained from $G$ by removing link $ij$, with $|\mathcal{G}_{-ij}| = 2^{N(N-1)/2 - 1}$. The marginal utility that $i$ obtains from forming link $ij$ for a given $G_{-ij}$ is

$$\Delta_{ij} U_i(G_{-ij}, X, \varepsilon) = U_i(1, G_{-ij}, X, \varepsilon) - U_i(0, G_{-ij}, X, \varepsilon). \tag{2.2}$$

Because $G_{-ij}$ is symmetric, we use $\Delta_{ij} U_j(G_{-ij}, X, \varepsilon)$ to denote $j$'s marginal utility from link $ij$ given $G_{-ij}$.

In this chapter, we consider a particular utility function

$$U_i(G, X, \varepsilon_i) = \sum_{j=1}^{N} G_{ij} u_{ij} + \sum_{j=1}^{N} \sum_{\substack{k=1 \\ k \neq i}}^{N} G_{ij} G_{jk} v + \sum_{j=1}^{N} \sum_{k>j}^{N} G_{ij} G_{ik} G_{jk} w - C_i\left(\sum_{j=1}^{N} G_{ij}\right), \tag{2.3}$$

where $u_{ij} = \beta_1 + \beta_2' X_i - (X_i - X_j)'(X_i - X_j) + \varepsilon_{ij}$, and $C_i(\cdot)$ is a function that only depends on the number of $i$'s friends.
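
As a rough illustration of how a utility of the form (2.3) and the marginal utility defined in (2.2) can be evaluated, the sketch below works with a plain adjacency matrix. It is only a sketch under stated assumptions: the function names, the matrix `u` of direct utilities (taken as given rather than built from $X$ and $\varepsilon$), the cost function, and all numerical values are hypothetical and chosen for illustration.

```python
def utility(i, G, u, v, w, cost):
    """Utility of individual i in network G, following the form of (2.3).

    G    : symmetric 0/1 adjacency matrix (list of lists) with zero diagonal
    u    : u[i][j] is i's direct utility from friend j (taken as given here)
    v, w : weights on friends of friends and on friends in common
    cost : function of the number of i's friends
    """
    n = len(G)
    direct = sum(G[i][j] * u[i][j] for j in range(n))
    friends_of_friends = sum(
        G[i][j] * G[j][k] for j in range(n) for k in range(n) if k != i
    )
    friends_in_common = sum(
        G[i][j] * G[i][k] * G[j][k] for j in range(n) for k in range(j + 1, n)
    )
    return direct + v * friends_of_friends + w * friends_in_common - cost(sum(G[i]))


def set_link(G, i, j, value):
    """Return a copy of G in which link ij is set to `value` (0 or 1)."""
    H = [row[:] for row in G]
    H[i][j] = H[j][i] = value
    return H


def marginal_utility(i, j, G, u, v, w, cost):
    """Marginal utility of i from link ij as in (2.2): U_i(1, G_-ij) - U_i(0, G_-ij)."""
    with_link = utility(i, set_link(G, i, j, 1), u, v, w, cost)
    without_link = utility(i, set_link(G, i, j, 0), u, v, w, cost)
    return with_link - without_link


# Hypothetical 4-person network with links 0-2, 1-2, and 1-3.
G = [
    [0, 0, 1, 0],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
]
u = [[1.0] * 4 for _ in range(4)]  # illustrative direct utilities
v, w = -0.5, 0.3                   # illustrative parameter values


def linear_cost(n_friends):
    """Illustrative linear cost of having n_friends friends."""
    return 0.2 * n_friends


delta_01 = marginal_utility(0, 1, G, u, v, w, linear_cost)
print(delta_01)
```

Differencing the two utilities keeps the sketch tied directly to definition (2.2); for large networks one would instead evaluate only the terms that change when the link is added.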
Then $i$'s marginal utility from link $ij$ is

$$\Delta_{ij} U_i(G_{-ij}, X_i, X_j, \varepsilon_{ij}) = u_{ij} + \sum_{\substack{k=1 \\ k \neq i}}^{N} G_{jk} v + \sum_{k=1}^{N} G_{ik} G_{jk} w - \Delta C_i\left(\sum_{\substack{k=1 \\ k \neq j}}^{N} G_{ik}\right), \tag{2.4}$$

where $\Delta C_i(n) = C_i(n+1) - C_i(n)$. This marginal utility function consists of four terms. The first term $u_{ij}$ captures the direct utility from $j$, which depends on $i$ and $j$'s attributes ($X_i$ and $X_j$) and $i$'s preference for link $ij$ ($\varepsilon_{ij}$). The term $(X_i - X_j)'(X_i - X_j)$ in $u_{ij}$ is to capture the homophily effect, which says that people tend to make friends with those who are similar to them (Currarini, Jackson, & Pin, 2009; Christakis et al., 2010). The second and third terms capture the utility from $j$'s friends and the (additional) utility from $i$ and $j$'s friends in common, where $v$ and $w$ are assumed to be homogeneous for all $i$. Of these two terms, the former is to represent the effect of $j$'s popularity, which can be positive (e.g., better source of information) or negative (e.g., less time), and the latter is motivated by the clustering phenomenon found in social networks, which says that if $i$ and $j$ have friends in common, they are more likely to be friends than if links are formed randomly (Christakis et al., 2010; Jackson & Rogers, 2007; Jackson, 2008; Jackson et al., 2012). The last term captures the additional cost of having one more friend. In this chapter we consider a linear cost function, but let $C_i(n) = \infty$ if $n > b$ for some $b < \infty$ so that the number of one's direct friends is bounded by an upper limit (the reason is described later).

Our utility function follows closely the specification in Christakis et al. (2010). They also consider the effects of friends' friends and friends in common, but allow for nonlinearity in these effects. Our specification is a linear version of theirs. By virtue of the linear form, we can establish the existence of equilibrium (see Section 2.2.2), which is an open question for the specification in Christakis et al.
Other related specifications are proposed by Mele (2010), who considers a linear utility function without the effects of friends in common, and Goyal and Joshi (2006), who in addition assume that the direct friend effect is homogeneous. Our utility function is more general than these specifications.

Strategies. We assume that individuals have complete information, i.e., they know all $X_i$ and $\varepsilon_i$. To take into account the mutual consent requirement, we model the strategies of the individuals as in the link-announcement game (Myerson, 1991). The strategies are different in the cases with and without transfers. Without transfers, each individual $i$ simultaneously announces a set of intended links $s_i = (s_{i1}, \ldots, s_{i,i-1}, s_{i,i+1}, \ldots, s_{iN})$, where $s_{ij}$ equals 1 if $i$ intends to form a link with $j$ and 0 otherwise. With transfers, each $i$ simultaneously announces a set of intended transfers $t_i = (t_{i1}, \ldots, t_{i,i-1}, t_{i,i+1}, \ldots, t_{iN})$, where $t_{ij} \in \mathbb{R}$ is the amount of the transfer that $i$ is willing to pay in order to form link $ij$. Link $ij$ is formed if and only if $s_{ij} s_{ji} = 1$ in the nontransferable case and $t_{ij} + t_{ji} \geq 0$ in the transferable case.

Equilibrium. To close the model, we need an equilibrium condition that relates the optimal strategies to the marginal utilities. For undirected networks, equilibrium conditions differ in how much coordination individuals are allowed to have. If no coordination is allowed for, and each individual chooses the optimal strategy given the strategies of others, the network formed is a Nash equilibrium. Unfortunately, Nash equilibrium may fail to predict some links that are expected to form given the marginal utilities. For example, when forming link $ij$ is beneficial for both $i$ and $j$ (i.e., $\Delta_{ij} U_i > 0$ and $\Delta_{ij} U_j > 0$), it can still be optimal in the Nash sense that $i$ and $j$ do not intend to form the link (i.e., $s_{ij} = s_{ji} = 0$). This is because if $i$ rejects the link, it does not matter whether or not $j$ rejects it.
Then rejection is a (weakly) optimal choice for $j$. Moreover, given $j$'s rejection, it is also (weakly) optimal for $i$ to reject the link. Since Nash equilibrium fails to predict beneficial links, it is reasonable to assume that any two individuals can coordinate when considering their relationship so that they do not fail to form the link if that is beneficial for both. The equilibrium concept that allows for such pairwise coordination is pairwise stability, first proposed by Jackson and Wolinsky (1996) in the case without transfers and later extended to the case with transfers by Bloch and Jackson (2006, 2007). A network is pairwise stable if no pair of individuals has incentives to create a new link, and no individual has incentives to sever an existing link. The formal definitions, stated in the notation of this chapter, are as follows.

Definition 2.1 A network $G$ is pairwise stable without transfers ($PS_{NT}$) if
1. for all $G_{ij} = 1$, $\Delta_{ij} U_i(G_{-ij}, X_i, X_j, \varepsilon_{ij}) \geq 0$ and $\Delta_{ij} U_j(G_{-ij}, X_j, X_i, \varepsilon_{ji}) \geq 0$;
2. for all $G_{ij} = 0$, $\Delta_{ij} U_i(G_{-ij}, X_i, X_j, \varepsilon_{ij}) > 0 \implies \Delta_{ij} U_j(G_{-ij}, X_j, X_i, \varepsilon_{ji}) < 0$.

Definition 2.2 A network $G$ is pairwise stable with transfers ($PS_T$) if
1. for all $G_{ij} = 1$, $\Delta_{ij} U_i(G_{-ij}, X_i, X_j, \varepsilon_{ij}) + \Delta_{ij} U_j(G_{-ij}, X_j, X_i, \varepsilon_{ji}) \geq 0$;
2. for all $G_{ij} = 0$, $\Delta_{ij} U_i(G_{-ij}, X_i, X_j, \varepsilon_{ij}) + \Delta_{ij} U_j(G_{-ij}, X_j, X_i, \varepsilon_{ji}) \leq 0$.[6]

[6] The definition of $PS_T$ can be derived from that of $PS_{NT}$. To see this, note that the optimal transfers satisfy $t_{ij} + t_{ji} = 0$, $\forall i, j \in V$, because otherwise either $i$ or $j$ can strictly benefit from lowering their transfers without changing the network outcome. Then the marginal utilities of $i$ and $j$ from link $ij$ are $\Delta_{ij} U_i - t_{ij}$ and $\Delta_{ij} U_j + t_{ij}$, respectively. Hence, $\exists\, t_{ij} \in \mathbb{R}$ s.t. $\Delta_{ij} U_i - t_{ij} \geq 0$ and $\Delta_{ij} U_j + t_{ij} \geq 0$ is equivalent to $\Delta_{ij} U_i + \Delta_{ij} U_j \geq 0$. Moreover, $\Delta_{ij} U_i - t_{ij} > 0 \implies \Delta_{ij} U_j + t_{ij} < 0$, $\forall t_{ij} \in \mathbb{R}$, is equivalent to $\Delta_{ij} U_i + \Delta_{ij} U_j \leq 0$.

The difference between $PS_{NT}$ and $PS_T$ is that the former depends on the marginal utilities of the two individuals involved, whereas the latter depends only on the sum of the marginal utilities. As a consequence, the existence of a pairwise stable network holds under weaker conditions for $PS_T$ compared to $PS_{NT}$. We come back to this issue in Section 2.2.2. In the sequel we use the term "pairwise stability" (PS) to mean pairwise stability with or without transfers, depending on the context.

Since the absence of profitable one-link deviations is a reasonable condition for a network equilibrium, other equilibrium concepts in the literature also assume pairwise stability and refine it with additional restrictions, such as pairwise Nash equilibrium (Bloch & Jackson, 2006, 2007; Calvó-Armengol & İlkılıç, 2009),[7] bilateral equilibrium (Goyal & Vega-Redondo, 2007), and strong stability (Dutta & Mutuswami, 1997; Jackson & van den Nouweland, 2005). These equilibrium concepts usually predict fewer network outcomes due to assuming higher-level coordination (e.g., coordination on multiple links or among more than two individuals), but they may lead to biased estimates if the assumed behaviors are not part of the "true" data generating process. Therefore, they are not used in this chapter.

[7] A network is a pairwise Nash equilibrium if it is both Nash and pairwise stable.

Under pairwise stability, we obtain the following equations for network $G$, i.e.,

$$G_{ij} = 1\{\Delta_{ij} U_i(G_{-ij}, X_i, X_j, \varepsilon_{ij}) \geq 0,\ \Delta_{ij} U_j(G_{-ij}, X_j, X_i, \varepsilon_{ji}) \geq 0\}, \quad \forall i, j \in V, \tag{2.5}$$

if $G$ is $PS_{NT}$, and

$$G_{ij} = 1\{\Delta_{ij} U_i(G_{-ij}, X_i, X_j, \varepsilon_{ij}) + \Delta_{ij} U_j(G_{-ij}, X_j, X_i, \varepsilon_{ji}) \geq 0\}, \quad \forall i, j \in V, \tag{2.6}$$

if $G$ is $PS_T$. Note that equations (2.5) and (2.6) differ slightly from Definitions 2.1 and 2.2 in handling the indifference case,[8] but the difference is negligible when later $\varepsilon$ is assumed to have a continuous distribution.
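
The conditions in Definitions 2.1 and 2.2 can be checked mechanically for small networks. The sketch below does so by brute-force enumeration for three individuals. It is a minimal illustration under stated assumptions, not the dissertation's procedure: the function names are hypothetical, the marginal utility follows the form of (2.4) with the cost term set to zero, and the direct utilities `u` and the values of `v` and `w` are chosen only for the illustration.

```python
from itertools import combinations, product


def marginal_utility(i, j, G, u, v, w):
    """Marginal utility of i from link ij in the form of (2.4).

    The cost term is set to zero (an assumption made only for this sketch).
    The value does not depend on G[i][j] itself, only on the other links.
    """
    n = len(G)
    friends_of_j = sum(G[j][k] for k in range(n) if k != i)
    friends_in_common = sum(G[i][k] * G[j][k] for k in range(n))
    return u[i][j] + v * friends_of_j + w * friends_in_common


def is_ps_nt(G, u, v, w):
    """Definition 2.1: pairwise stability without transfers."""
    for i, j in combinations(range(len(G)), 2):
        d_i = marginal_utility(i, j, G, u, v, w)
        d_j = marginal_utility(j, i, G, u, v, w)
        if G[i][j] == 1:
            # An existing link must not hurt either individual.
            if d_i < 0 or d_j < 0:
                return False
        elif (d_i > 0 and d_j >= 0) or (d_j > 0 and d_i >= 0):
            # A missing link: if one side strictly gains, the other must lose.
            return False
    return True


def is_ps_t(G, u, v, w):
    """Definition 2.2: pairwise stability with transfers."""
    for i, j in combinations(range(len(G)), 2):
        total = marginal_utility(i, j, G, u, v, w) + marginal_utility(j, i, G, u, v, w)
        if (G[i][j] == 1 and total < 0) or (G[i][j] == 0 and total > 0):
            return False
    return True


def all_networks(n):
    """Yield every undirected network on n individuals as an adjacency matrix."""
    pairs = list(combinations(range(n), 2))
    for bits in product([0, 1], repeat=len(pairs)):
        G = [[0] * n for _ in range(n)]
        for (i, j), b in zip(pairs, bits):
            G[i][j] = G[j][i] = b
        yield G


# Illustrative values: individuals are indexed 0, 1, 2.
v, w = -1.0, 0.5
u = [
    [0.0, 0.25, 1.5],
    [1.5, 0.0, 0.25],
    [0.25, 1.5, 0.0],
]

ps_nt = [G for G in all_networks(3) if is_ps_nt(G, u, v, w)]
ps_t = [G for G in all_networks(3) if is_ps_t(G, u, v, w)]
print("PS without transfers:", ps_nt)
print("PS with transfers:", ps_t)
```

With these illustrative values no network satisfies Definition 2.1, while the complete network satisfies Definition 2.2, which is in line with the remark that existence holds under weaker conditions with transfers. The enumeration visits all $2^{N(N-1)/2}$ networks, so it is practical only for very small $N$.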
Equations in (2.5) or (2.6) form a simultaneous discrete choice model, where individual choices are interdependent due to the presence of $G_{-ij}$ in the marginal utilities. The interdependence of the choices renders invalid a commonly used single-equation discrete choice model which treats each link as an independent choice of a pair, because $G_{-ij}$, which is endogenously determined, is correlated with $\varepsilon_{ij}$ and $\varepsilon_{ji}$.

[8] Definitions 2.1 and 2.2 consider $G_{ij} = 0$ as an optimal choice when $\Delta_{ij} U_i = \Delta_{ij} U_j = 0$ and $\Delta_{ij} U_i + \Delta_{ij} U_j = 0$, respectively, while equations (2.5) and (2.6) do not.

Issues that further complicate the statistical inference of (2.5) and (2.6) are that (i) a network $G$ that solves (2.5) or (2.6) may not exist for some parameter values, and that (ii) there may be multiple solutions, i.e., multiple equilibria. We will discuss the former in the next section and the latter in Section 2.3.

2.2.2 The existence of pairwise stable networks

For certain $X$ and $\varepsilon$, there may be no $G$ that solves the equations in (2.5) or (2.6), i.e., there is no PS network. According to Jackson and Watts (2002, Lemma 1), for any utility function there is either a PS network or a closed cycle of networks. Thus if a PS network does not exist, there is a closed cycle. They proved the result in the case without transfers. We show in the appendix that the result also holds in the case with transfers. A closed cycle is a set of networks that satisfies the following two conditions: (i) for any two networks in the set there is an improving path from one to the other; and (ii) no improving path starting from a network in the set leads to a network outside the set. In the above conditions, an improving path is a sequence of networks such that each two consecutive networks differ by one link, and adding (or deleting) that link in the succeeding network is beneficial for the involved individuals. The precise definitions are presented in the appendix. Below we give an example of a closed cycle.
Example 2.1 Consider networks with $N = 3$. Suppose the utility function is as in (2.3) with $u_{ij} = \varepsilon_{ij}$, $v < 0$, $w > 0$, $v + w < 0$, and $C_i(\cdot) = 0$. Consider the case without transfers. When $\varepsilon_{21}, \varepsilon_{32}, \varepsilon_{13} > -v$ and $0 < \varepsilon_{12}, \varepsilon_{23}, \varepsilon_{31} < -v - w$, the 1-link and 2-link networks form a closed cycle as shown in Figure 1, where the bottom left, bottom right, and top nodes in each network represent individuals 1, 2, and 3. [9] Moreover, the 1-link and 2-link networks are preferred to the empty and complete networks, respectively. In this case, there is no $PS^{NT}$ network.

[Figure 1: An Example of a Closed Cycle]

[9] The values beside the nodes in each network are the utilities of the individuals in that network.

A closed cycle represents an unstable state in which individuals are switching constantly between forming and severing links. It is reasonable to assume that the observed networks are not part of a closed cycle. If under certain parameter values a PS network does not exist for some $X$ and $\varepsilon$ (as shown in Example 2.1), then given that a network is observed, the conditional support of $\varepsilon$ will depend on the parameter value and $X$. This parameter-dependent support property creates discontinuity in the econometric model, and the conventional asymptotic theory does not apply (e.g., Chernozhukov & Hong, 2004; Hirano & Porter, 2003). Moreover, the dependence of the support of $\varepsilon$ on $X$ leads to correlation between $X$ and $\varepsilon$. To avoid these problems, we need to restrict the utility function and/or the parameter space so that for all parameter values in the parameter space and for all $X$ and $\varepsilon$ there is a PS network. For example, the parameter values in Example 2.1 (i.e., $v < 0$, $w > 0$, $v + w < 0$, and $C_i(\cdot) = 0$) are not in the parameter space.

Discussion of Christakis et al. (2010)

One also needs to pay attention to closed cycles when using the sequential model as in Christakis et al. (2010). This model assumes that individuals form links in a sequence: in each period a pair of individuals is randomly selected to meet, and only that pair can update their relationship. Moreover, individuals are assumed to be myopic, so when they meet in a period, they form or sever the link merely based on whether it is beneficial in that period. [10] The idea of Christakis et al. is that if the sequence of meetings were known, the entire sequence of networks could be recovered. By integrating the likelihood of the complete data (including the meetings and networks) over the unobserved sequence of meetings they obtain the likelihood of the observed data. [11]

[10] For this reason, the model is not a typical dynamic game, which usually allows individuals to be forward looking and to take the strategies that maximize the discounted sum of future utilities.

[11] Christakis et al. (2010) consider a simple case where each pair meets exactly once. In this case, the sequence of networks is uniquely determined by the sequence of meetings and the final network, so only the sequence of meetings is unobserved. However, if a pair is allowed to meet multiple times, there may be multiple sequences of networks that are compatible with the sequence of meetings and the final network. Then networks in all but the last period are also unobserved and need to be integrated out as well.

A potential problem with this approach is that the region of $\varepsilon$ that is mapped to a network may include those $\varepsilon$ that correspond to a closed cycle. To illustrate this, let us fix $X$ and consider Example 2.1. Suppose that the individuals start with an empty network and each pair meets exactly once, as assumed in Christakis et al. (2010).
If the sequence of meetings is $\{13, 12, 23\}$, then for the parameter values $v < 0$, $w > 0$, $v + w < 0$, and $C_i(\cdot) = 0$, and for $\varepsilon$ that satisfy $\varepsilon_{21}, \varepsilon_{32}, \varepsilon_{13} > -v$ and $0 < \varepsilon_{12}, \varepsilon_{23}, \varepsilon_{31} < -v - w$, the networks formed in each period are

Period | Network $(g_{12}, g_{23}, g_{13})$
0 | (0, 0, 0)
1 | (0, 0, 1)
2 | (1, 0, 1)
3 | (1, 0, 1)

According to Christakis et al., this set of $\varepsilon$ is then considered to be consistent with the above networks and is included when we construct the likelihood of the latter. However, we already know that for the above parameter values and $\varepsilon$ these networks are only part of a closed cycle and should never be observed. To correctly specify the likelihood, we need to exclude this set of $\varepsilon$ from the support of $\varepsilon$ for the above parameter values. [12] This causes the same support problems as described before. To avoid the problems, we need a utility function such that for all parameter values in the parameter space and for all $X$ and $\varepsilon$ there is no closed cycle. Unfortunately, it is unknown whether the utility function specified in Christakis et al. has this property.

[12] In this case, we simply exclude this set of $\varepsilon$ from the support of $\varepsilon$ for all networks because none of them is PS. If for a certain set of $\varepsilon$ there is both a PS network and a closed cycle, we exclude the set of $\varepsilon$ only for the networks in the closed cycle.

The existence of PS networks is an ongoing topic in the network formation literature. Most of the existing results focus on special settings that do not allow for heterogeneous individuals and are thus not suitable for econometric analyses. [13] Notable exceptions are the work of Jackson and Watts (2001) and Hellmann (2013), which provide general conditions on the utility function that are sufficient to ensure the existence of PS networks. [14] We apply the results of Jackson and Watts to the utility function in (2.3) and show that a $PS^{T}$ network exists for any parameter values.
As for the case without transfers, we have seen in Example 2.1 that the specification in (2.3) does not ensure that there is a $PS^{NT}$ network. However, this is guaranteed under some parameter restrictions, which are found by applying the result of Hellmann.

[13] See, e.g., Belleflamme and Bloch (2004), Goyal and Joshi (2006).

[14] In noncooperative games, the existence problem is resolved by considering mixed strategies. This may be a promising approach to establish general results on the existence of pairwise stable networks. Unfortunately, how to clearly define a mixed strategy in network formation games is still an open question. See Groenert (2010) for work along this line.

Proposition 2.1 Suppose the utility function is as in (2.3). Then for all values of $u_{ij}$, $v$, $w$, and for all cost functions $C_i(\cdot)$, there is no closed cycle in the case with transfers, and thus a $PS^{T}$ network exists.

Proof. The proof is an application of Theorem 1 in Jackson and Watts (2001). They provide a sufficient condition that rules out cycles in the case without transfers, where a cycle is a set of networks such that for any two networks in the set there is an improving path from one to the other, i.e., requirement (i) of closed cycles. We show that the condition is also sufficient in the case with transfers. The condition is that a cycle does not exist if there is a function $\phi: \mathcal{G} \to \mathbb{R}$ such that for any $G$, $G'$ that differ by one link, $G'$ defeats $G$ (defined in the appendix) if and only if $\phi(G') > \phi(G)$. We find such a function for the utility function in (2.3), which then implies that there is no cycle and thus no closed cycle. The detailed proof is in the appendix.

Proposition 2.2 Suppose the utility function is as in (2.3). Then for all values of $u_{ij}$, $v \ge 0$, $w \ge 0$, and for cost function $C_i(n) = C_i n$ with $C_i \ge 0$, there is no closed cycle in the case without transfers, and thus a $PS^{NT}$ network exists.

Proof. Our proof is an application of Theorem 1 in Hellmann (2013).
He provides a sufficient condition that rules out closed cycles in the case without transfers. That is, if the utility function satisfies the properties of convexity in one's own links and strategic complements (Bloch & Jackson, 2007; Calvó-Armengol & İlkılıç, 2009; Goyal & Joshi, 2006), then there is no closed cycle. Convexity in one's own links means that each $i$ has nondecreasing marginal utility from adding links to others, and strategic complementarity means that each $i$ has nondecreasing marginal utility from adding links that do not involve $i$ directly. Formal definitions are presented in the appendix. We show that these properties are satisfied by the utility function in (2.3) under the stated restrictions. Hence there is no closed cycle. The detailed proof is in the appendix.

To understand Propositions 2.1 and 2.2, note that a closed cycle arises when the utility interdependence leads to conflicting choices among individuals. For example, if the addition of link $ij$ is beneficial when link $ik$ is formed, but the addition of link $ik$ is beneficial when link $ij$ is not formed, we may obtain the closed cycle "forming $ik$, forming $ij$, severing $ik$, severing $ij$, forming $ik$, ..." In the case with transfers, this is impossible if the utility interdependence is linear and homogeneous (i.e., constant $v$ and $w$) because the utility externality of link $ij$ on link $ik$ is equal to the utility externality of link $ik$ on link $ij$, so there is no conflict between the choice of link $ij$ and the choice of link $ik$. In the case without transfers, assuming linear and homogeneous interdependence is not enough because the conflict can arise within a pair if the utility externalities of other links influence the two individuals differently, as shown in Example 2.1. Nevertheless, this within-pair conflict disappears if both the externalities of one's own links and the externalities of other links are nonnegative because then the effects of other links on the two individuals in a pair will be in the same direction.
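The closed cycle of Example 2.1 and the meeting sequence $\{13, 12, 23\}$ discussed in this section can be reproduced by brute force over the eight networks on three individuals. The sketch below rests on several assumptions that are not taken from the text: the `utility` function is an assumed form of (2.3) chosen only so that it matches the node utilities shown in Figures 1 and 2 (direct terms $\varepsilon_{ij}$, $v$ per friend-of-friend link, $w$ per triangle, zero cost); the numerical values of $v$, $w$, and $\varepsilon$ are arbitrary picks inside the region of Example 2.1; and an improving step is taken to be any single link change that brings the link in line with (2.5), which ignores ties.

```python
from itertools import combinations

NODES = (1, 2, 3)
PAIRS = list(combinations(NODES, 2))  # (1, 2), (1, 3), (2, 3)


def utility(i, links, eps, v, w):
    """Assumed utility: direct terms, v per friend-of-friend link, w per triangle."""
    friends = [j for j in NODES if j != i and frozenset((i, j)) in links]
    direct = sum(eps[(i, j)] for j in friends)
    indirect = sum(1 for j in friends for k in NODES
                   if k not in (i, j) and frozenset((j, k)) in links)
    triangles = sum(1 for j, k in combinations(friends, 2)
                    if frozenset((j, k)) in links)
    return direct + v * indirect + w * triangles


def link_wanted(i, j, links, eps, v, w):
    """True if both i and j weakly gain from link ij (no transfers)."""
    ij = frozenset((i, j))
    on, off = links | {ij}, links - {ij}
    return all(utility(k, on, eps, v, w) >= utility(k, off, eps, v, w)
               for k in (i, j))


def improving_steps(links, eps, v, w):
    """Networks one link away that the pair involved would move to."""
    return [links ^ {frozenset((i, j))} for i, j in PAIRS
            if link_wanted(i, j, links, eps, v, w) != (frozenset((i, j)) in links)]


def reachable(start, eps, v, w):
    """All networks reachable from `start` by one or more improving steps."""
    seen, stack = set(), [start]
    while stack:
        for nxt in improving_steps(stack.pop(), eps, v, w):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def myopic_sequence(meetings, eps, v, w):
    """Networks formed when pairs meet once, in order, starting from empty."""
    g, path = frozenset(), []
    for i, j in meetings:
        ij = frozenset((i, j))
        g = g | {ij} if link_wanted(i, j, g, eps, v, w) else g - {ij}
        path.append(g)
    return path


# Illustrative values satisfying Example 2.1: v < 0, w > 0, v + w < 0,
# eps21, eps32, eps13 > -v and 0 < eps12, eps23, eps31 < -v - w.
v, w = -2.0, 1.0
eps = {(2, 1): 3.0, (3, 2): 3.0, (1, 3): 3.0,
       (1, 2): 0.5, (2, 3): 0.5, (3, 1): 0.5}

links = [frozenset(p) for p in PAIRS]
networks = [frozenset(c) for r in range(4) for c in combinations(links, r)]
stable = [g for g in networks if not improving_steps(g, eps, v, w)]
cycle = reachable(frozenset(), eps, v, w)
path = myopic_sequence([(1, 3), (1, 2), (2, 3)], eps, v, w)
```

Under these assumptions `stable` is empty, `cycle` consists of the three 1-link and three 2-link networks, each of which reaches all the others, and `path` ends in the network with links 12 and 13, which is the period-3 network $(1, 0, 1)$ in the discussion of Christakis et al. (2010).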
2.3 Identification

In this section, we examine the framework that we use to identify the model. We first discuss multiple equilibria, the main problem in identification. Then we show how much we can learn from the model without imposing any restrictions on the equilibrium selection.

To start, let us introduce the data generating process. Suppose there is an infinite population of individuals. We first generate an integer $N$ from a distribution with support $\{2, 3, 4, \ldots\}$. Given $N$, we then draw $N$ individuals at random from the population and label them $1, 2, \ldots, N$. Let $V = \{1, 2, \ldots, N\}$. Each individual $i \in V$ has a $K \times 1$ vector of attributes $X_i$ and an $(N-1) \times 1$ vector of preferences $\varepsilon_i = (\varepsilon_{i1}, \ldots, \varepsilon_{i,i-1}, \varepsilon_{i,i+1}, \ldots, \varepsilon_{iN})'$. We let these $N$ individuals form links, and a PS network emerges. We observe the network $G(N)$ (an $N \times N$ matrix) and the attributes $X(N) = (X_1, \ldots, X_N)'$ (an $N \times K$ matrix), but not the preferences $\varepsilon(N) = (\varepsilon_1', \ldots, \varepsilon_N')'$ (an $N(N-1) \times 1$ vector). Note that due to the sampling scheme $X_i$ are i.i.d. for $i \in V$.

The above procedure is repeated independently $T$ times, and we obtain the sample $(G_t(N_t), X_t(N_t), N_t)$, $t = 1, \ldots, T$. Let $V_t$ be $\{1, 2, \ldots, N_t\}$ and $\varepsilon_t(N_t)$ the unobserved preference vector of individuals in $V_t$. For notational simplicity, we suppress $N$ in $G(N)$, $X(N)$, and $\varepsilon(N)$ whenever there is no confusion. The following assumption is used throughout the chapter.

Assumption 2.1 (Data generating process) (i) We have a sample of observations $(G_t(N_t), X_t(N_t), N_t)$, $t = 1, \ldots, T$, that are generated as above and hence i.i.d. (ii) $X_t(N_t)$ and $\varepsilon_t(N_t)$ are jointly independent for $t = 1, \ldots, T$. (iii) $\varepsilon_{ij}$ are i.i.d. for $i, j \in V_t$, $i \ne j$, $t = 1, \ldots, T$. [15] (iv) $\varepsilon_t(N_t)$ ($t = 1, \ldots, T$) follow a distribution $F_{\varepsilon(N)}(\varepsilon(N) \mid \theta_\varepsilon)$ that is absolutely continuous w.r.t. Lebesgue measure and is known up to $\theta_\varepsilon \in \Theta_\varepsilon \subset \mathbb{R}^{d_\varepsilon}$.

In addition, we assume that the utility function $U_i(G, X, \varepsilon_i; \theta_u)$ is known up to $\theta_u \in \Theta_u \subset \mathbb{R}^{d_u}$.
The marginal utility of $i$ from link $ij$ takes the form $\Delta_{ij}U_i(G_{-ij}, X_i, X_j, \varepsilon_{ij}; \theta_u)$, which satisfies (2.4). The parameter of interest is $\theta = (\theta_u, \theta_\varepsilon) \in \Theta_u \times \Theta_\varepsilon = \Theta \subset \mathbb{R}^{d_u + d_\varepsilon}$. Pairwise stability is understood as defined by equations (2.5) and (2.6) rather than by Definitions 2.1 and 2.2.

[15] The i.i.d. assumption is not crucial to our analysis. In fact, we can introduce dependence by letting $\varepsilon_{ij} = \alpha_i + \eta_{ij}$, where $\alpha_i$ is the individual random effect, $\alpha_i$ and $\eta_{ij}$ are independent, and $\eta_{ij}, \eta_{ji}$ are i.i.d. In this case, $\varepsilon_{ij}$, $j = 1, \ldots, N$, are dependent through $\alpha_i$.

2.3.1 Multiple equilibria

Given $X$ and $\varepsilon$, if equations (2.5) (or (2.6)) have more than one solution, there are multiple PS networks, or simply multiple equilibria. The multiplicity results from the dependence of the marginal utilities on $G_{-ij}$. [16]

[16] If there is no such dependence, i.e., $\Delta_{ij}U_i(G_{-ij}, X_i, X_j, \varepsilon_{ij}; \theta_u) = \Delta_{ij}U_i(X_i, X_j, \varepsilon_{ij}; \theta_u)$, the $PS^{NT}$ and $PS^{T}$ networks are unique.

To see how multiple equilibria affect the identification of the model, let us fix $X$ and characterize the equilibrium/equilibria by regions of $\varepsilon$. For each $\varepsilon \in \mathbb{R}^{N(N-1)}$, we denote by $PS(U(X, \varepsilon; \theta_u))$ the PS set, i.e., the collection of all PS networks for $\varepsilon$, where

$$U(X, \varepsilon; \theta_u) := \left\{ \{\Delta_{ij}U_i(G_{-ij}, X_i, X_j, \varepsilon_{ij}; \theta_u)\}_{G_{-ij} \in \mathcal{G}_{-ij}} \right\}_{i,j \in V,\, i \ne j} \in \mathbb{R}^{N(N-1)|\mathcal{G}|/2}$$

is the marginal-utility profile of the individuals in $V$. For any collection of networks $\mathcal{H} \subset \mathcal{G}$, we define $\mathcal{E}(\mathcal{H})$ to be the collection of all $\varepsilon$ such that $\mathcal{H}$ is the PS set for $\varepsilon$, i.e., $\{\varepsilon \in \mathbb{R}^{N(N-1)} \mid \mathcal{H} = PS(U(X, \varepsilon; \theta_u))\}$.
More precisely, $\mathcal{E}(\mathcal{H})$ is given by

$$\left\{ \varepsilon \in \mathbb{R}^{N(N-1)} \;\middle|\; \begin{array}{l} \forall G \in \mathcal{H},\ \forall i,j \in V,\ G_{ij} = 1\{\Delta_{ij}U_i(G_{-ij}, X_i, X_j, \varepsilon_{ij}) \ge 0,\ \Delta_{ij}U_j(G_{-ij}, X_j, X_i, \varepsilon_{ji}) \ge 0\}; \\ \forall G \notin \mathcal{H},\ \exists i,j \in V,\ G_{ij} \ne 1\{\Delta_{ij}U_i(G_{-ij}, X_i, X_j, \varepsilon_{ij}) \ge 0,\ \Delta_{ij}U_j(G_{-ij}, X_j, X_i, \varepsilon_{ji}) \ge 0\} \end{array} \right\} \tag{2.7}$$

in the nontransferable case and

$$\left\{ \varepsilon \in \mathbb{R}^{N(N-1)} \;\middle|\; \begin{array}{l} \forall G \in \mathcal{H},\ \forall i,j \in V,\ G_{ij} = 1\{\Delta_{ij}U_i(G_{-ij}, X_i, X_j, \varepsilon_{ij}) + \Delta_{ij}U_j(G_{-ij}, X_j, X_i, \varepsilon_{ji}) \ge 0\}; \\ \forall G \notin \mathcal{H},\ \exists i,j \in V,\ G_{ij} \ne 1\{\Delta_{ij}U_i(G_{-ij}, X_i, X_j, \varepsilon_{ij}) + \Delta_{ij}U_j(G_{-ij}, X_j, X_i, \varepsilon_{ji}) \ge 0\} \end{array} \right\} \tag{2.8}$$

in the transferable case. For different $\mathcal{H}$ and $\mathcal{H}'$, sets $\mathcal{E}(\mathcal{H})$ and $\mathcal{E}(\mathcal{H}')$ are disjoint because for any value of $\varepsilon$ the PS set is unique. Note that for some $\mathcal{H}$ the set $\mathcal{E}(\mathcal{H})$ is empty, i.e., this $\mathcal{H}$ can never be a PS set. Given that a PS network always exists (recall Section 2.2.2), the nonempty $\mathcal{E}(\mathcal{H})$ form a partition of $\mathbb{R}^{N(N-1)}$, and the corresponding $\mathcal{H}$ represent all possible unique equilibria (for $|\mathcal{H}| = 1$) and multiple equilibria (for $|\mathcal{H}| \ge 2$).

What kind of nonsingleton $\mathcal{H}$ can have nonempty $\mathcal{E}(\mathcal{H})$, that is, can be the collection of multiple equilibria for certain $\varepsilon$? The lemma below says that such $\mathcal{H}$ must contain only nonadjacent networks, where two networks are called nonadjacent if they differ by at least two links.

Lemma 2.1 If $\mathcal{H} \subset \mathcal{G}$ is a $PS^{NT}$ (or $PS^{T}$) set, then the networks in $\mathcal{H}$ are mutually nonadjacent.

Proof. See the appendix.

The idea of Lemma 2.1 is simple. If two networks differ by only one link, the pair involved in the link must prefer one network over the other, and the latter network cannot be PS.
This result is useful because it implies that to find the PS sets, we only need to look at collections of nonadjacent networks. We illustrate the PS sets with a simple example which uses the utility function in (2.3) and considers networks with three individuals.

Example 2.2 Consider networks with $N = 3$ in the case without transfers. Suppose the utility function is as in (2.3) with $u_{ij} = \varepsilon_{ij}$, $v < 0$, $w > 0$, $v + w > 0$, $C_i(\cdot) = 0$. For simplicity we assume $\varepsilon_{ij} = \varepsilon_{ji}$, $i, j = 1, 2, 3$, so $\varepsilon = (\varepsilon_{12}, \varepsilon_{23}, \varepsilon_{13}) \in \mathbb{R}^3$. Note that the symmetry assumption ensures the existence of a $PS^{NT}$ network for all $\varepsilon$ and all parameter values, because under symmetric $u_{ij}$ the nontransferable case is similar to the transferable case, and Proposition 2.1 applies to $PS^{NT}$. The utilities of the individuals in each network are shown in Figure 2, where the bottom left, bottom right, and top nodes represent individuals 1, 2, and 3.

[Figure 2: An Example of $PS^{NT}$ Sets. The figure shows the eight networks $G_1, \ldots, G_8$ with each individual's utility beside its node: 0 for an individual without links, $\varepsilon_{ij}$ for each individual in a one-link network, $\varepsilon_{ij} + \varepsilon_{ik}$ for the middle individual and $\varepsilon_{ij} + v$ for the two end individuals in a two-link network, and $\varepsilon_{ij} + \varepsilon_{ik} + 2v + w$ in the complete network.]

We want to calculate all possible $PS^{NT}$ sets and the corresponding partition of $\mathbb{R}^3$. By Lemma 2.1, we start with the collections of nonadjacent networks. They are classified into 6 cases and listed in Table 1. Then for each collection $\mathcal{H}$ in the table, we compute the region of $(\varepsilon_{12}, \varepsilon_{23}, \varepsilon_{13})$ in which $\mathcal{H}$ is a $PS^{NT}$ set, i.e., set $\mathcal{E}(\mathcal{H})$ in (2.7). The $PS^{NT}$ sets and the corresponding $\mathcal{E}(\mathcal{H})$ are listed in Table 2. For example, $\mathcal{E}(\{G_2, G_3\})$ is the set of $\varepsilon$ such that $\{G_2, G_3\}$ is the collection of $PS^{NT}$ networks. For the region in $\mathbb{R}^3$ not covered in Table 2, there is a unique $PS^{NT}$ network. The $\mathcal{E}(\mathcal{H})$ corresponding to each unique $PS^{NT}$ network are listed in Table 3. All the $\mathcal{E}(\mathcal{H})$ in Tables 2 and 3 are mutually disjoint. They form a partition of $\mathbb{R}^3$.
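The $PS^{NT}$ sets of Example 2.2 can also be obtained by brute force for any given $\varepsilon$, which offers a quick check on individual entries of Tables 2 and 3 and on Lemma 2.1. This is only a sketch under stated assumptions: the `utility` function is an assumed three-individual form chosen to reproduce the node utilities of Figure 2, networks are coded as tuples $(g_{12}, g_{23}, g_{13})$, and the values $v = -2$, $w = 3$ are arbitrary picks satisfying $v < 0$, $w > 0$, $v + w > 0$.

```python
from itertools import combinations, product

PAIRS = [(1, 2), (2, 3), (1, 3)]  # order of (g12, g23, g13) and (eps12, eps23, eps13)


def utility(i, g, eps, v, w):
    """Assumed N = 3 utility reproducing the node utilities of Figure 2."""
    link = {frozenset(p): x for p, x in zip(PAIRS, g)}
    e = {frozenset(p): x for p, x in zip(PAIRS, eps)}
    j, k = [n for n in (1, 2, 3) if n != i]
    total = 0.0
    for a, b in ((j, k), (k, j)):
        if link[frozenset((i, a))]:
            # direct term, plus v if friend a is also linked to b
            total += e[frozenset((i, a))] + v * link[frozenset((a, b))]
    triangle = (link[frozenset((i, j))] * link[frozenset((i, k))]
                * link[frozenset((j, k))])
    return total + w * triangle


def is_ps_nt(g, eps, v, w):
    """Equation (2.5): each link is present iff both endpoints weakly gain."""
    for idx, (i, j) in enumerate(PAIRS):
        on = g[:idx] + (1,) + g[idx + 1:]
        off = g[:idx] + (0,) + g[idx + 1:]
        wanted = all(utility(n, on, eps, v, w) >= utility(n, off, eps, v, w)
                     for n in (i, j))
        if wanted != bool(g[idx]):
            return False
    return True


def ps_set(eps, v, w):
    """Brute-force PS^NT set over the 8 networks on three individuals."""
    return {g for g in product((0, 1), repeat=3) if is_ps_nt(g, eps, v, w)}


def mutually_nonadjacent(networks):
    """Lemma 2.1: any two networks in a PS set differ by at least two links."""
    return all(sum(a != b for a, b in zip(g, h)) >= 2
               for g, h in combinations(sorted(networks), 2))


v, w = -2.0, 3.0   # illustrative: -v = 2 and -v - w = -1
one_link = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}
print(ps_set((1.0, 1.0, 1.0), v, w) == one_link | {(1, 1, 1)})     # eps in [0, -v)^3
print(ps_set((-0.5, -0.5, -0.5), v, w) == {(0, 0, 0), (1, 1, 1)})  # eps in [-v-w, 0)^3
print(ps_set((3.0, -2.0, 3.0), v, w) == {(1, 0, 1)})               # unique two-link network
```

The three test points correspond to the rows $\{G_2, G_3, G_4, G_8\}$ and $\{G_1, G_8\}$ of Table 2 and to the row for $G_5$ in Table 3.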
Table 1: The Collections of Nonadjacent Networks

Case 1: the one-link networks
$\{G_2, G_3\}$, $\{G_3, G_4\}$, $\{G_2, G_4\}$, $\{G_2, G_3, G_4\}$
Case 2: the two-link networks
$\{G_5, G_6\}$, $\{G_6, G_7\}$, $\{G_5, G_7\}$, $\{G_5, G_6, G_7\}$
Case 3: the complete and one-link networks
$\{G_2, G_8\}$, $\{G_3, G_8\}$, $\{G_4, G_8\}$, $\{G_2, G_3, G_8\}$, $\{G_3, G_4, G_8\}$, $\{G_2, G_4, G_8\}$, $\{G_2, G_3, G_4, G_8\}$
Case 4: the empty and two-link networks
$\{G_1, G_5\}$, $\{G_1, G_6\}$, $\{G_1, G_7\}$, $\{G_1, G_5, G_6\}$, $\{G_1, G_6, G_7\}$, $\{G_1, G_5, G_7\}$, $\{G_1, G_5, G_6, G_7\}$
Case 5: the empty and complete networks
$\{G_1, G_8\}$
Case 6: one-link and two-link networks that are complementary
$\{G_2, G_7\}$, $\{G_3, G_6\}$, $\{G_4, G_5\}$

Table 2: The $PS^{NT}$ Sets for Multiple Equilibria

$PS^{NT}$ sets $\mathcal{H}$ | Sets $\mathcal{E}(\mathcal{H})$ of $(\varepsilon_{12}, \varepsilon_{23}, \varepsilon_{13})$
Case 1: the one-link networks
$\{G_2, G_3\}$ | $[0,-v) \times (-\infty,-v-w) \times [0,-v)$
$\{G_3, G_4\}$ | $[0,-v) \times [0,-v) \times (-\infty,-v-w)$
$\{G_2, G_4\}$ | $(-\infty,-v-w) \times [0,-v) \times [0,-v)$
Case 3: the complete and one-link networks
$\{G_2, G_8\}$ | $\{[-v-w,0) \times [-v-w,0) \times [0,-v)\} \cup \{[-v-w,-v) \times [-v-w,-v) \times [-v,\infty)\}$
$\{G_3, G_8\}$ | $\{[0,-v) \times [-v-w,0) \times [-v-w,0)\} \cup \{[-v,\infty) \times [-v-w,-v) \times [-v-w,-v)\}$
$\{G_4, G_8\}$ | $\{[-v-w,0) \times [0,-v) \times [-v-w,0)\} \cup \{[-v-w,-v) \times [-v,\infty) \times [-v-w,-v)\}$
$\{G_2, G_3, G_8\}$ | $[0,-v) \times [-v-w,0) \times [0,-v)$
$\{G_3, G_4, G_8\}$ | $[0,-v) \times [0,-v) \times [-v-w,0)$
$\{G_2, G_4, G_8\}$ | $[-v-w,0) \times [0,-v) \times [0,-v)$
$\{G_2, G_3, G_4, G_8\}$ | $[0,-v) \times [0,-v) \times [0,-v)$
Case 5: the empty and complete networks
$\{G_1, G_8\}$ | $[-v-w,0) \times [-v-w,0) \times [-v-w,0)$

Table 3: The $PS^{NT}$ Sets for Unique Equilibrium

$PS^{NT}$ sets $\mathcal{H}$ | Sets $\mathcal{E}(\mathcal{H})$ of $(\varepsilon_{12}, \varepsilon_{23}, \varepsilon_{13})$
$\{G_1\}$ | $\{(-\infty,0) \times (-\infty,-v-w) \times [-v-w,0)\} \cup \{[-v-w,0) \times (-\infty,0) \times (-\infty,-v-w)\} \cup \{(-\infty,-v-w) \times [-v-w,0) \times (-\infty,0)\} \cup \{(-\infty,-v-w) \times (-\infty,-v-w) \times (-\infty,-v-w)\}$
$\{G_2\}$ | $\{(-\infty,-v-w) \times (-\infty,0) \times [0,-v)\} \cup \{(-\infty,-v-w) \times (-\infty,-v) \times [-v,\infty)\} \cup \{[-v-w,0) \times (-\infty,-v-w) \times [0,-v)\} \cup \{[-v-w,-v) \times (-\infty,-v-w) \times [-v,\infty)\}$
$\{G_3\}$ | $\{[0,-v) \times (-\infty,0) \times (-\infty,-v-w)\} \cup \{[-v,\infty) \times (-\infty,-v) \times (-\infty,-v-w)\} \cup \{[0,-v) \times (-\infty,-v-w) \times [-v-w,0)\} \cup \{[-v,\infty) \times (-\infty,-v-w) \times [-v-w,-v)\}$
$\{G_4\}$ | $\{(-\infty,0) \times [0,-v) \times (-\infty,-v-w)\} \cup \{(-\infty,-v) \times [-v,\infty) \times (-\infty,-v-w)\} \cup \{(-\infty,-v-w) \times [0,-v) \times [-v-w,0)\} \cup \{(-\infty,-v-w) \times [-v,\infty) \times [-v-w,-v)\}$
$\{G_5\}$ | $[-v,\infty) \times (-\infty,-v-w) \times [-v,\infty)$
$\{G_6\}$ | $(-\infty,-v-w) \times [-v,\infty) \times [-v,\infty)$
$\{G_7\}$ | $[-v,\infty) \times [-v,\infty) \times (-\infty,-v-w)$
$\{G_8\}$ | $\{[-v,\infty) \times [-v-w,\infty) \times [-v,\infty)\} \cup \{[-v,\infty) \times [-v,\infty) \times [-v-w,-v)\} \cup \{[-v-w,-v) \times [-v,\infty) \times [-v,\infty)\}$

The identifiability of the model depends on the existence and proportion of the unique equilibria. If for any parameter values there exists a network that can only be a unique PS network, then the $\mathcal{E}$ sets corresponding to the unique PS networks can be used to construct moment restrictions. If in addition the proportion of the unique PS networks among all network outcomes is not too small, we could in principle identify the true parameter value using only the moment restrictions from the unique PS networks. For instance, in Example 2.2, $G_5$ is a unique PS network under the assumed parameter values. [17] Moreover, it can be shown that under other parameter values there also exists a unique PS network. Hence, in this simple example we expect that the model is identified.

[17] In this example, $G_6$ and $G_7$ are also unique PS networks, but they are the same network as $G_5$ and provide the same moment restriction. See Section 2.4.1 for more details.

In more general settings, however, the proportion of unique equilibria may be small or even zero. For example, it can be shown that for the setting in Example 2.2 with $N = 4$ and parameter values $v > 0$, $v + 2w > 0$, there is no unique PS network. This implies that the model cannot be identified by simply considering the unique equilibria. As is known, multiple equilibria do not help identify the model: for networks that are contained in nonsingleton PS sets, we are unable to establish a one-to-one mapping from each network to a region of $\varepsilon$ where the network is PS because such regions overlap. This is the incoherency problem addressed in the econometric literature (Bresnahan & Reiss, 1991; Heckman, 1978; Tamer, 2003). Therefore, the model is generally not identified unless we have additional information that specifies how a network is chosen from a PS set.
2.3.2 Partial identification

We overcome the nonidentifiability problem using partial identification; i.e., we derive bounds on the probability of observing a network without making any assumptions on the equilibrium selection. As discussed, given $X$ and $\varepsilon$ there is a unique PS set $PS(U(X, \varepsilon; \theta_u))$. To complete the model, suppose there is an equilibrium selection mechanism that picks a network from each PS set according to certain distributions. Then the probability that network $G$ occurs given $X$ and $N$ is

$$\Pr(G \mid X, N) = \int \Pr(G \mid PS(U(X, \varepsilon; \theta_u)), X, \varepsilon)\, dF_\varepsilon(\varepsilon \mid \theta_\varepsilon), \tag{2.9}$$

where $\Pr(G \mid PS(U(X, \varepsilon; \theta_u)), X, \varepsilon)$ represents the equilibrium selection mechanism, which specifies the probability with which network $G$ is chosen from set $PS(U(X, \varepsilon; \theta_u))$. Equation (2.9) is similar to what Ciliberto and Tamer (2009) establish in entry games and Bajari, et al. (2010) in discrete games with complete information.

We allow the equilibrium selection mechanism to take any form and derive bounds on $\Pr(G \mid X, N)$ by exploiting the fact that a selection probability always lies between 0 and 1. This approach has been widely applied to game-theoretic models with multiple equilibria for its advantage of allowing for any equilibrium selection (see, e.g., Andrews, et al., 2004; Berry & Tamer, 2006; Ciliberto & Tamer, 2009; Pakes, et al., 2006). Following closely the argument in Ciliberto and Tamer, we split the right hand side of equation (2.9) into two parts

$$\Pr(G \mid X, N) = \int_{G \in PS(U(X,\varepsilon;\theta_u))\ \&\ |PS(U(X,\varepsilon;\theta_u))| = 1} dF_\varepsilon(\varepsilon \mid \theta_\varepsilon) + \int_{G \in PS(U(X,\varepsilon;\theta_u))\ \&\ |PS(U(X,\varepsilon;\theta_u))| \ge 2} \Pr(G \mid PS(U(X, \varepsilon; \theta_u)), X, \varepsilon)\, dF_\varepsilon(\varepsilon \mid \theta_\varepsilon), \tag{2.10}$$

where we have used the fact that the selection probability $\Pr(G \mid PS(U(X, \varepsilon; \theta_u)), X, \varepsilon)$ equals 1 when $G \in PS(U(X, \varepsilon; \theta_u))$ and $|PS(U(X, \varepsilon; \theta_u))| = 1$, i.e., $G$ is the unique PS network.
Replacing the selection probability in the second integral by 0 and 1 and combining the two integrals we obtain

$$\Pr(G \mid X, N) \le \int_{G \in PS(U(X,\varepsilon;\theta_u))} dF_\varepsilon(\varepsilon \mid \theta_\varepsilon), \tag{2.11}$$

and

$$\Pr(G \mid X, N) \ge \int_{G \in PS(U(X,\varepsilon;\theta_u))\ \&\ |PS(U(X,\varepsilon;\theta_u))| = 1} dF_\varepsilon(\varepsilon \mid \theta_\varepsilon). \tag{2.12}$$

The upper bound of $\Pr(G \mid X, N)$ in (2.11) is the probability that $G$ is a PS network, and the lower bound in (2.12) is the probability that $G$ is a unique PS network. These bounds provide the best possible bounds for $\Pr(G \mid X, N)$ because the selection probability in (2.10) can be any number between 0 and 1.

If we use inequalities (2.11) and (2.12) to estimate the parameters, a computational problem can arise if the network size $N$ is large. Because the lower bound in (2.12) has no closed form, we need to compute it by simulation (see Section 2.4.3 for discussion on computation of the bounds). That is, for each $\theta$, $X$ and simulated $\varepsilon$, we need to check whether a network is uniquely pairwise stable, a task that requires checking pairwise stability for $2^{N(N-1)/2}$ possible networks. [18] This is computationally infeasible if $N$ is not small. Even in a network with 20 individuals, the total number of networks to check is $2^{190} \approx 10^{57}$.

[18] By virtue of Lemma 2.1, the total number of networks to check is in fact $2^{N(N-1)/2} - N(N-1)/2$ because there is no need to check networks that are adjacent to the network under consideration (i.e., those that differ by only one link). However, this does not reduce the order of the computational complexity.

Discussion of the equilibrium selection approach

Unlike our approach, one can achieve identifiability by making assumptions about the equilibrium selection mechanism. If each PS set is associated with a probability distribution that specifies how each network in the set is chosen, then equation (2.9) provides moment restrictions that can be used to identify the model.

In the context of network formation, one way to do this is to use the sequential model discussed in Section 2.2.2. To see why this model can create a
valid equilibrium selection mechanism, let us fix $X$ and $\varepsilon$ and suppose individuals form links as described in the model. Because the meetings are random, the sequence of networks across periods forms a nondegenerate Markov chain with states corresponding to the networks. [19] If in addition the individuals are allowed to make mistakes (i.e., forming or deleting a link randomly rather than based on utility maximization), and if the probability of making a mistake is sufficiently small, then it can be shown that the Markov chain converges to a unique stationary distribution which typically assigns probability one to a single PS network. [20] This network is essentially the most "stable" one among all the PS networks (Jackson & Watts, 2002; Young, 1993). [21] Therefore, the stationary distribution defines a valid equilibrium selection mechanism for equation (2.9).

[19] The term "nondegenerate" means that the transition probabilities of the Markov chain can have numbers other than 0 and 1.

[20] Suppose closed cycles are ruled out for the given $X$ and $\varepsilon$.

[21] More precisely, the PS network in the support of the stationary distribution is the network that has the minimum resistance. The resistance of network $G$ is defined as follows. For each $G' \ne G$, there is a smallest number of mistakes that are needed to move from $G'$ to $G$. The sum of these smallest numbers for all $G'$ is the resistance of $G$. The network with the minimum resistance must be one of the PS networks if closed cycles are ruled out. This network can be found by computing the resistance of each PS network, which amounts to solving a shortest path problem in a directed graph, and finding the PS network that has the minimum resistance, which is known as the arborescence problem in combinatorial optimization. See Young (1993) and Jackson and Watts (2002) for more details.

This sequential model is considered both in Christakis et al. (2010) and Mele (2010), but they choose a different setting which changes the nature of the equilibrium selection mechanism and could lead to biased estimates of the parameters. Christakis et al. consider a variant of the model that does not allow for mistakes. In this case, any network in the PS set is an absorbing state of the Markov chain, and any distribution that has support on the PS set is a stationary distribution. Hence, the equilibrium selection mechanism can pick any network in the PS set, and which one it picks depends on an arbitrary assumption of the initial probabilities. [22] Mele considers a version of the sequential model for directed networks. [23] The Markov chain in his model has a unique stationary distribution, but that stationary distribution has support on all possible networks, not just the equilibria. [24] This is an inappropriate equilibrium selection mechanism because it assigns nonzero probabilities to nonequilibrium networks. [25]

[22] As discussed in Section 2.2.2, Christakis et al. (2010) do not use a stationary distribution. They use the transition probabilities in the first few periods to construct the likelihood of the data. This approach produces estimates that are subject to the choice of the initial state and length of periods.

[23] That is, all the links are directed and the link from $i$ to $j$ is formed once $i$ wants to form it.

[24] The equilibrium concept that Mele (2010) uses is Nash equilibrium.

[25] The reason that the stationary distribution covers so many networks is that Mele (2010) defines the equilibria only based on the observables and uses the i.i.d. shocks $\varepsilon_{ij}$, which are assumed to follow a logistic distribution, as the random device to ensure the ergodicity of the Markov chain. In fact, Young (1993) shows a similar result in the model discussed above that the support of a unique stationary distribution can contain any networks if the probability of making a mistake is not sufficiently small.

An alternative way is to specify a more flexible equilibrium selection mechanism (Bajari, et al., 2010; Bajari, et al., 2011). For example, one can assume that the equilibrium selection satisfies

$$\Pr(G \mid PS(U), X, \varepsilon) = P(G, PS(U), X, \varepsilon; \lambda), \tag{2.13}$$

where $P$ is a probability distribution known up to a finite-dimensional parameter $\lambda$ (Bajari, et al., 2010). For references on how to specify $P$, see, e.g., Ackerberg and Gowrisankaran (2006), Bajari, et al. (2010), Bjorn and Vuong (1985), Jia (2008). Bajari, et al. (2011) allow the equilibrium selection mechanism to be nonparametric and provide conditions under which the model is identified. For complete information games, they show that to ensure identifiability the nonparametric equilibrium selection cannot depend on $\varepsilon$. Such dependence is allowed for in our approach.

2.3.3 Subnetworks

We propose a novel method to tackle the computational problem when networks are large. The idea is simple. While it is infeasible to use full information on the pairwise stability of a whole network, it may be feasible to use information on the pairwise stability of "parts" of the network. This partial information, though weak, may be useful to infer the parameters of interest.

The precise concept of "parts" of the network is subnetworks. A subnetwork is the restriction of a network to a subset of the individuals. The formal definition is presented in graph-theoretic notation. Recall that a network can be denoted by a graph $G = (V, E)$. For $A \subset V$, we say graph $G_A = (A, E_A)$ is the subnetwork of $G = (V, E)$ in $A$ if the edge set $E_A \subset E$ contains the links in $G$ that connect two individuals in $A$. [26] Moreover, let graph $G_{-A} = (V, E \setminus E_A)$ denote the complement of $G_A$ in $G$, i.e., the remainder of $G$ after the links in $E_A$ are omitted. The links in $G_{-A}$ either connect two individuals not in $A$ or connect individuals in $A$ to individuals not in $A$. The links of the former type form the subnetwork in $V \setminus A$, i.e., $G_{V \setminus A} = (V \setminus A, E_{V \setminus A})$, while the links of the latter type form a graph that we call the neighborhood of $A$ in $G$, which is denoted by $B_A(G) = (V, E \setminus (E_A \cup E_{V \setminus A}))$. [27] For convenience, we suppress $G$ in $B_A(G)$ whenever possible. If $A = \{i, j\}$, we abbreviate $G_{\{i,j\}}$, $G_{-\{i,j\}}$ and $B_{\{i,j\}}$ to $G_{ij}$, $G_{-ij}$, and $B_{ij}$ in accordance with the previously used notation.

[26] The subnetworks defined here are equivalent to the induced subgraphs defined in graph theory (see Bollobás (1998, p. 2)).

[27] Note that due to symmetry the neighborhood of $A$ is the same as that of $V \setminus A$, i.e., $B_A(G) = B_{V \setminus A}(G)$.

Figure 3 illustrates a subnetwork and its neighborhood in a network of individuals $1, 2, \ldots, 12$. The red lines form the subnetwork in $\{1, 2, 3\}$, the blue lines form its neighborhood, and the black lines form the subnetwork in $\{4, 5, \ldots, 12\}$.

[Figure 3: A Subnetwork and Its Neighborhood]

In the equivalent matrix notation, subnetwork $G_A$ is represented by the submatrix of $G$ with rows and columns in $A$, and its complement $G_{-A}$ is represented by the rest of $G$ that is not in $G_A$. Within $G_{-A}$, the submatrix with rows in $A$ and columns in $V \setminus A$ represents neighborhood $B_A$ and the submatrix with rows and columns in $V \setminus A$ represents subnetwork $G_{V \setminus A}$. We denote the sets of all possible $G_A$, $G_{-A}$, and $B_A$ by $\mathcal{G}_A$, $\mathcal{G}_{-A}$, and $\mathcal{B}_A$, respectively. The submatrix representations imply that $|\mathcal{G}_A| = 2^{|A|(|A|-1)/2}$, $|\mathcal{G}_{-A}| = 2^{[N(N-1) - |A|(|A|-1)]/2}$, and $|\mathcal{B}_A| = 2^{|A|(N-|A|)}$.

From equation (2.9) we obtain

$$\Pr(G_A \mid X, N) = \sum_{G_{-A} \in \mathcal{G}_{-A}} \Pr(G_A, G_{-A} \mid X, N) = \int \sum_{G_{-A} \in \mathcal{G}_{-A}} \Pr(G_A, G_{-A} \mid PS(U(X, \varepsilon; \theta_u)), X, \varepsilon)\, dF_\varepsilon(\varepsilon \mid \theta_\varepsilon). \tag{2.14}$$

If all the networks in $PS(U(X, \varepsilon; \theta_u))$ have the same subnetwork $G_A$ in $A$, we have $\sum_{G_{-A}} \Pr(G_A, G_{-A} \mid PS(U(X, \varepsilon; \theta_u)), X, \varepsilon) = 1$. However, the value of the sum is unknown if the networks in $PS(U(X, \varepsilon; \theta_u))$ have more than one subnetwork in $A$.
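Following the same idea as in (2.9), we derive upper and lower bounds for (2.14) that are analogous to those in (2.11) and (2.12). To be precise, we split the integral in (2.14) into two parts (for simplicity, we abbreviate $\Delta U(X, \varepsilon; \theta_u)$ to $\Delta U$ and $F_\varepsilon(\varepsilon \mid \theta_\varepsilon)$ to $F_\varepsilon$):
\[
\Pr(G_A \mid X, N) = \int_{\{\exists G_{-A},\, (G_A, G_{-A}) \in \mathcal{PS}(\Delta U)\}\ \&\ |\mathcal{PS}_A(\Delta U)| = 1} dF_\varepsilon
+ \int_{\{\exists G_{-A},\, (G_A, G_{-A}) \in \mathcal{PS}(\Delta U)\}\ \&\ |\mathcal{PS}_A(\Delta U)| \geq 2} \sum_{G_{-A}} \Pr\big(G_A, G_{-A} \mid \mathcal{PS}(\Delta U), X, \varepsilon\big)\, dF_\varepsilon, \tag{2.15}
\]
where $\mathcal{PS}_A(\Delta U(X, \varepsilon; \theta_u)) := \{G_A : \exists G_{-A},\, (G_A, G_{-A}) \in \mathcal{PS}(\Delta U(X, \varepsilon; \theta_u))\}$ is the set of subnetworks in $A$ that are part of a network in $\mathcal{PS}(\Delta U(X, \varepsilon; \theta_u))$. [28] Replacing the sum in the second integral in (2.15) by 0 and 1 and combining the two integrals, we obtain
\[
\Pr(G_A \mid X, N) \leq \int_{\exists G_{-A},\, (G_A, G_{-A}) \in \mathcal{PS}(\Delta U(X, \varepsilon; \theta_u))} dF_\varepsilon(\varepsilon \mid \theta_\varepsilon), \tag{2.16}
\]
and
\[
\Pr(G_A \mid X, N) \geq \int_{\forall G_{-A},\, \forall G'_A \neq G_A,\, (G'_A, G_{-A}) \notin \mathcal{PS}(\Delta U(X, \varepsilon; \theta_u))} dF_\varepsilon(\varepsilon \mid \theta_\varepsilon), \tag{2.17}
\]
where (2.17) follows because, given that $\mathcal{PS}(\Delta U(X, \varepsilon; \theta_u))$ is nonempty, the event that all the networks in $\mathcal{PS}(\Delta U(X, \varepsilon; \theta_u))$ have the same subnetwork $G_A$ is equivalent to the event that any network that has a subnetwork different from $G_A$ is not in $\mathcal{PS}(\Delta U(X, \varepsilon; \theta_u))$. The bounds in (2.16) and (2.17) are analogous to those in (2.11) and (2.12): the upper bound in (2.16) is the probability that $G_A$ can be complemented by some $G_{-A}$ such that $(G_A, G_{-A})$ is PS, and the lower bound in (2.17) is the probability that only $G_A$ has this property. These bounds are still sharp because the sum in (2.15) can be any number between 0 and 1.

Footnote 28: Equation (2.15) holds because $\sum_{G_{-A}} \Pr(G_A, G_{-A} \mid \mathcal{PS}(\Delta U(X, \varepsilon; \theta_u)), X, \varepsilon) > 0$ only if there is $G_{-A}$ such that $(G_A, G_{-A}) \in \mathcal{PS}(\Delta U(X, \varepsilon; \theta_u))$.

Table 4: The $PS_{NT}$ Sets that Contain a Network with $G_{13} = 1$

| Networks | $PS_{NT}$ sets |
| $G_2$ | $\{G_2\}$, $\{G_2, G_3\}$, $\{G_2, G_4\}$, $\{G_2, G_8\}$, $\{G_2, G_3, G_8\}$, $\{G_2, G_4, G_8\}$, $\{G_2, G_3, G_4, G_8\}$ |
| $G_5$ | $\{G_5\}$ |
| $G_6$ | $\{G_6\}$ |
| $G_8$ | $\{G_8\}$, $\{G_1, G_8\}$, $\{G_2, G_8\}$, $\{G_3, G_8\}$, $\{G_4, G_8\}$, $\{G_2, G_3, G_8\}$, $\{G_2, G_4, G_8\}$, $\{G_3, G_4, G_8\}$, $\{G_2, G_3, G_4, G_8\}$ |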
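Example 2.3 (Example 2.2 continued) Follow the setting in Example 2.2. Consider the subnetwork in $\{1, 3\}$, i.e., $G_{13}$. We derive (2.14), (2.16), and (2.17) for $G_{13} = 1$. According to Figure 2, the networks that have subnetwork $G_{13} = 1$ are $G_2$, $G_5$, $G_6$, and $G_8$. The $PS_{NT}$ sets in Tables 2 and 3 that contain any of these networks are listed in Table 4. It is then straightforward to derive equation (2.14) for $G_{13} = 1$: [29]
\[
\begin{aligned}
\Pr(G_{13} = 1) ={}& \int_{\mathcal{E}_{2}} dF_\varepsilon
+ \int_{\mathcal{E}_{23}} \Pr(G_2 \mid \{G_2, G_3\}, \varepsilon)\, dF_\varepsilon
+ \int_{\mathcal{E}_{24}} \Pr(G_2 \mid \{G_2, G_4\}, \varepsilon)\, dF_\varepsilon
+ \int_{\mathcal{E}_{28}} dF_\varepsilon \\
&+ \int_{\mathcal{E}_{238}} \big(\Pr(G_2 \mid \{G_2, G_3, G_8\}, \varepsilon) + \Pr(G_8 \mid \{G_2, G_3, G_8\}, \varepsilon)\big)\, dF_\varepsilon \\
&+ \int_{\mathcal{E}_{248}} \big(\Pr(G_2 \mid \{G_2, G_4, G_8\}, \varepsilon) + \Pr(G_8 \mid \{G_2, G_4, G_8\}, \varepsilon)\big)\, dF_\varepsilon \\
&+ \int_{\mathcal{E}_{2348}} \big(\Pr(G_2 \mid \{G_2, G_3, G_4, G_8\}, \varepsilon) + \Pr(G_8 \mid \{G_2, G_3, G_4, G_8\}, \varepsilon)\big)\, dF_\varepsilon \\
&+ \int_{\mathcal{E}_{5}} dF_\varepsilon + \int_{\mathcal{E}_{6}} dF_\varepsilon + \int_{\mathcal{E}_{8}} dF_\varepsilon
+ \int_{\mathcal{E}_{18}} \Pr(G_8 \mid \{G_1, G_8\}, \varepsilon)\, dF_\varepsilon
+ \int_{\mathcal{E}_{38}} \Pr(G_8 \mid \{G_3, G_8\}, \varepsilon)\, dF_\varepsilon \\
&+ \int_{\mathcal{E}_{48}} \Pr(G_8 \mid \{G_4, G_8\}, \varepsilon)\, dF_\varepsilon
+ \int_{\mathcal{E}_{348}} \Pr(G_8 \mid \{G_3, G_4, G_8\}, \varepsilon)\, dF_\varepsilon,
\end{aligned} \tag{2.18}
\]
where we use $\mathcal{E}_{23}$ to denote the set $\mathcal{E}(\{G_2, G_3\})$ in Table 2, and similarly for the other sets. Because $G_{13} = 1$ in both $G_2$ and $G_8$, we have $\Pr(G_2 \mid \{G_2, G_8\}, \varepsilon) + \Pr(G_8 \mid \{G_2, G_8\}, \varepsilon) = 1$. Replacing the selection probabilities in (2.18) by 0 and 1, we obtain
\[
\Pr(G_{13} = 1) \leq \int_{\mathcal{E}_{2} \cup \mathcal{E}_{23} \cup \mathcal{E}_{24} \cup \mathcal{E}_{28} \cup \mathcal{E}_{238} \cup \mathcal{E}_{248} \cup \mathcal{E}_{2348} \cup \mathcal{E}_{5} \cup \mathcal{E}_{6} \cup \mathcal{E}_{8} \cup \mathcal{E}_{18} \cup \mathcal{E}_{38} \cup \mathcal{E}_{48} \cup \mathcal{E}_{348}} dF_\varepsilon, \tag{2.19}
\]
and
\[
\Pr(G_{13} = 1) \geq \int_{\mathcal{E}_{2} \cup \mathcal{E}_{28} \cup \mathcal{E}_{5} \cup \mathcal{E}_{6} \cup \mathcal{E}_{8}} dF_\varepsilon. \tag{2.20}
\]

Footnote 29: We suppress the conditioning on $X$ and $N$ in the left-hand side of (2.18) because in this example there is no $X$ and $N = 3$ for all networks.

Similarly to the lower bound in (2.12), we need to compute the bounds in (2.16) and (2.17) by simulation. The upper bound in (2.16) requires checking the pairwise stability of $(G_A, G_{-A})$ for all $G_{-A}$, and the lower bound in (2.17) requires checking that of $(G'_A, G_{-A})$ for all $G_{-A}$ and all $G'_A \neq G_A$. Their total computational complexity is exactly the same as that of the lower bound in (2.12).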
However, we can derive inequalities from (2.16) and (2.17) that are computationally feasible.

The main idea is to use the pairwise stability of subnetworks rather than networks. We say a subnetwork $G_A$ is pairwise stable (PS) for a given complement $G_{-A}$ if equation (2.5) (or (2.6)) is satisfied for all $i, j \in A$. That is, $G_A$ is viewed as a "network" in $A$, and this "network" is PS. More generally, we say a collection of links is pairwise stable (PS) for a given complement of the collection if each link in the collection satisfies equation (2.5) (or (2.6)). [30] This definition is needed when we consider the pairwise stability of $G_{-A}$ for a given $G_A$. It is simple to see that if network $G$ is PS, then for any $A \subseteq V$ subnetwork $G_A$ is PS for $G_{-A}$. This property can be used to obtain new inequalities that only involve the pairwise stability of subnetworks and thus have lower computational complexity.

Footnote 30: This collection is not necessarily a network.

To derive the new inequalities, for any $A \subseteq V$ and $i, j \in A$, we write subnetwork $G_A = (G_{ij}, G_{A,-ij})$, where $G_{A,-ij}$ is the remainder of $G_A$ after link $ij$ is omitted. It is clear that $G = (G_{ij}, G_{A,-ij}, G_{-A})$ and $G_{-ij} = (G_{A,-ij}, G_{-A})$. Denote by $\mathcal{G}_{A,-ij}$ the set of all possible $G_{A,-ij}$. Then $|\mathcal{G}_{A,-ij}| = |\mathcal{G}_A|/2 = 2^{|A|(|A|-1)/2 - 1}$. Let $X_A = (X_i)_{i \in A}$ denote an $|A| \times K$ matrix of attributes of the individuals in $A$, and $\varepsilon_A = (\varepsilon_{ij})_{i,j \in A,\, i \neq j}$ an $|A|(|A|-1) \times 1$ vector of preferences of the individuals in $A$ for the links between them. The distribution of $\varepsilon_A$ is denoted by $F_{\varepsilon_A}(\varepsilon_A \mid \theta_\varepsilon)$. Given $G_{-A}$, $X_A$, and $\varepsilon_A$, there is a unique collection of PS subnetworks in $A$. This collection is called the PS set of subnetworks in $A$, denoted by $\mathcal{PS}(\Delta U_A(G_{-A}, X_A, \varepsilon_A; \theta_u))$, where
\[
\Delta U_A(G_{-A}, X_A, \varepsilon_A; \theta_u) = \Big\{ \big\{ \Delta_{ij} U_i(G_{A,-ij}, G_{-A}, X_i, X_j, \varepsilon_{ij}; \theta_u) \big\}_{G_{A,-ij} \in \mathcal{G}_{A,-ij}} \Big\}_{i,j \in A,\, i \neq j} \in \mathbb{R}^{|A|(|A|-1)|\mathcal{G}_A|/2}
\]
represents the marginal-utility profile of the individuals in $A$.
Then from the aforementioned property of the pairwise stability of subnetworks, we obtain the new inequalities in the lemma below.

Lemma 2.2 Under Assumption 2.1, for any $A \subseteq V$,
\[
\Pr(G_A \mid X_A, N) \leq \int_{\exists G_{-A},\, G_A \in \mathcal{PS}(\Delta U_A(G_{-A}, X_A, \varepsilon_A; \theta_u))} dF_{\varepsilon_A}(\varepsilon_A \mid \theta_\varepsilon), \tag{2.21}
\]
and
\[
\Pr(G_A \mid X_A, N) \geq \int_{\forall G_{-A},\, \forall G'_A \neq G_A,\, G'_A \notin \mathcal{PS}(\Delta U_A(G_{-A}, X_A, \varepsilon_A; \theta_u))} dF_{\varepsilon_A}(\varepsilon_A \mid \theta_\varepsilon). \tag{2.22}
\]

Proof. See the appendix.

The upper bound in (2.21) is the probability that $G_A$ is PS for some $G_{-A}$, and the lower bound in (2.22) is the probability that only $G_A$ has this property. In contrast to the bounds in (2.16) and (2.17), the new bounds in (2.21) and (2.22) take into account only the pairwise stability of the subnetworks, ignoring the pairwise stability of the rest of the networks. As a consequence, these bounds are generally wider than their counterparts in (2.16) and (2.17) and thus not sharp. [31] In fact, this loss of sharpness is exactly the source of the reduction in the computational burden. As the bounds in (2.21) and (2.22) do not need information about the pairwise stability outside the subnetworks, their computation only requires checking the pairwise stability of all $G_A$ for all $G_{-A}$.

More importantly, note that a large proportion of the above computation is to repeat the task for all $G_{-A}$, whereas $G_{-A}$ plays a role only in the marginal-utility profile $\Delta U_A$. If the effect of $G_{-A}$ can be captured by some components of $G_{-A}$, we can avoid checking for all $G_{-A}$ and further reduce the computational burden in (2.21) and (2.22). In fact, the utility function in (2.3) satisfies a locality property that can be exploited to achieve the desired reduction in the computational complexity. To be concrete, observe that, for any $i, j \in V$, the marginal utility $\Delta_{ij} U_i$ in (2.4) depends on $G_{-ij}$ only through the numbers of $i$'s friends, $j$'s friends, and $i, j$'s friends in common, all of which are determined by the neighborhood of $\{i, j\}$, $B_{ij}$.
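The locality property described above can be illustrated with a minimal sketch: the three counts that enter the marginal utility of link $ij$ are computed from the links between $\{i, j\}$ and everyone else, and they do not change when a link that touches neither $i$ nor $j$ is altered. The function name `local_statistics`, the 0-based labels, and the example network are hypothetical choices for illustration.

```python
import numpy as np

def local_statistics(G, i, j):
    """Return (# friends of i, # friends of j, # common friends), all outside {i, j}."""
    G = np.asarray(G)
    others = [k for k in range(G.shape[0]) if k not in (i, j)]
    friends_i = int(sum(G[i, k] for k in others))
    friends_j = int(sum(G[j, k] for k in others))
    common = int(sum(G[i, k] * G[j, k] for k in others))
    return friends_i, friends_j, common

# Hypothetical five-person network (symmetric 0/1 adjacency matrix).
G = np.zeros((5, 5), dtype=int)
for a, b in [(0, 2), (0, 3), (1, 2), (3, 4)]:
    G[a, b] = G[b, a] = 1

stats_before = local_statistics(G, 0, 1)

# Remove the link (3, 4), which touches neither individual 0 nor individual 1.
G_changed = G.copy()
G_changed[3, 4] = G_changed[4, 3] = 0
stats_after = local_statistics(G_changed, 0, 1)

print(stats_before, stats_after)
```

Because the counts ignore links among the other individuals, any two complements with the same neighborhood of $\{i, j\}$ give the same three numbers.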
In general, we assume that the utility function satisfies the property of local externality.

Footnote 31: For example, the upper bound in (2.16) excludes the region of $\varepsilon$ where subnetwork $G_A$ is PS for a $G_{-A}$ but this $G_{-A}$ is not PS for $G_A$; however, this region is included in the upper bound in (2.21). Similarly, the lower bound in (2.17) includes the region of $\varepsilon$ in which for any $G'_A \neq G_A$, (i) if there is a $G_{-A}$ such that $G'_A$ is PS for $G_{-A}$, then this $G_{-A}$ is not PS for $G'_A$, and (ii) there exists at least one such $G_{-A}$. This region, however, is excluded from the lower bound in (2.22).

Assumption 2.2 (Local externality) For any $i, j \in V$,
\[
\Delta_{ij} U_i(G_{-ij}, X_i, X_j, \varepsilon_{ij}; \theta_u) = \Delta_{ij} U_i(B_{ij}, X_i, X_j, \varepsilon_{ij}; \theta_u). \tag{2.23}
\]

Because $B_{ij}$ is fully determined by $G_{A,-ij}$ and $B_A$, i.e., $B_{ij} = B_{ij}(G_{A,-ij}, B_A)$, [32] the local externality property implies that for any $A \subseteq V$, the marginal-utility profile depends on $G_{-A}$ only through $B_A$, i.e.,
\[
\Delta U_A(G_{-A}, X_A, \varepsilon_A; \theta_u) = \Delta U_A(B_A, X_A, \varepsilon_A; \theta_u) = \Big\{ \big\{ \Delta_{ij} U_i(B_{ij}(G_{A,-ij}, B_A), X_i, X_j, \varepsilon_{ij}; \theta_u) \big\}_{G_{A,-ij} \in \mathcal{G}_{A,-ij}} \Big\}_{i,j \in A,\, i \neq j}.
\]
Therefore, we obtain the proposition below.

Footnote 32: $G_{A,-ij}$ determines the part of the neighborhood that is inside subnetwork $G_A$, and $B_A$ determines the part outside $G_A$.

Proposition 2.1 Under Assumptions 2.1 and 2.2, for any $A \subseteq V$,
\[
\Pr(G_A \mid X_A, N) \leq \int_{\exists B_A,\, G_A \in \mathcal{PS}(\Delta U_A(B_A, X_A, \varepsilon_A; \theta_u))} dF_{\varepsilon_A}(\varepsilon_A \mid \theta_\varepsilon), \tag{2.24}
\]
and
\[
\Pr(G_A \mid X_A, N) \geq \int_{\forall B_A,\, \forall G'_A \neq G_A,\, G'_A \notin \mathcal{PS}(\Delta U_A(B_A, X_A, \varepsilon_A; \theta_u))} dF_{\varepsilon_A}(\varepsilon_A \mid \theta_\varepsilon). \tag{2.25}
\]

Proof. This follows immediately from Lemma 2.2 and the discussion above.

In the computation of the bounds in (2.24) and (2.25), the total number of subnetworks whose pairwise stability needs to be checked is equal to $2^{|A|(|A|-1)/2}$ times the number of $B_A$ that yield distinct values of $\Delta U_A$. To be clear about the computational effort, let us discuss the magnitude of the last number.
Because the effects of the links in $B_{ij}$ are assumed to be independent of $X_k$, $k \neq i, j$, i.e., all the friends in $V \setminus \{i, j\}$ are treated as identical, $B_A$ can be fully captured by a vector of integers, whose components represent the numbers of friends in $V \setminus A$ that each $i$ in $A$ has, that each pair $(i, j)$ in $A$ has in common, that each triplet $(i, j, k)$ in $A$ has in common, ..., and that all individuals in $A$ have in common. Moreover, we have made two assumptions about the utility function in (2.3) that further reduce the number of possible values of $\Delta U_A(B_A)$. First, we only consider the first two types of friends, i.e., friends of each individual and common friends of each pair. Second, we choose a linear cost function $C_i(n) = C_i n$ that satisfies $C_i = \infty$ if $n > \bar{b}$ for some integer $\bar{b} < \infty$. This cost function ensures that in equilibrium nobody has more than $\bar{b}$ friends. Under these two assumptions, the role of $B_A$ in $\Delta U_A$ can be captured by $\binom{|A|}{1} + \binom{|A|}{2}$ integers, each of which takes a value in $0, 1, \ldots, \bar{b}$. Thus, the number of distinct $\Delta U_A(B_A)$ is bounded above by $(\bar{b} + 1)^{|A|(|A|+1)/2}$, which, unlike $2^{|A|(N-|A|)}$ (the cardinality of $\mathcal{B}_A$), is independent of $N$. [33] Therefore, the computation of the bounds in (2.24) and (2.25) is feasible no matter how large $N$ is, provided that we choose small $|A|$ and $\bar{b}$.

Footnote 33: In fact, the number of possible $\Delta U_A(B_A)$ is substantially smaller than $(\bar{b} + 1)^{|A|(|A|+1)/2}$ because the upper limit $\bar{b}$ is imposed on the total number of $i$'s own friends and the friends that $i$ shares with others. For $N < (\bar{b} + 1)|A|$, the number is even smaller.

Note that the computational complexity does not increase for other utility specifications that satisfy Assumption 2.2, provided that the number of possible friends does not exceed $\bar{b}$ and only the common friends of pairs are considered. In particular, the computational complexity is the same if the effects of one's own friends or common friends are allowed to be nonlinear or to interact with the attributes.
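A small arithmetic sketch makes the comparison above concrete. It evaluates the cardinality $2^{|A|(N-|A|)}$ of the set of neighborhoods and the upper bound $(\bar{b}+1)^{|A|(|A|+1)/2}$ on the number of distinct marginal-utility profiles. The function name and the example values of $N$, $|A|$, and $\bar{b}$ are assumptions chosen for illustration.

```python
def neighborhood_counts(N, a, b_bar):
    """Return (number of neighborhoods B_A, upper bound on distinct profiles)."""
    n_neighborhoods = 2 ** (a * (N - a))         # |B_A| = 2^{|A|(N-|A|)}
    profile_bound = (b_bar + 1) ** (a * (a + 1) // 2)  # (b_bar+1)^{|A|(|A|+1)/2}
    return n_neighborhoods, profile_bound

# Hypothetical values: subnetworks of size 3, at most 5 friends per individual.
for N in (10, 100, 1000):
    n_neighborhoods, profile_bound = neighborhood_counts(N, a=3, b_bar=5)
    # The first count explodes with N; the second stays at 6**6 = 46656.
    print(N, len(str(n_neighborhoods)), profile_bound)
```

The loop prints the number of decimal digits of the first count rather than the count itself, since it grows exponentially in $N$, while the bound on distinct profiles does not depend on $N$ at all.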
The computation becomes easier if the number of common friends is replaced by a dummy (i.e., whether or not two individuals have a friend in common), as specified in Christakis et al. (2010).

Now we consider $\Pr(G_A, B_A \mid X_A, N)$. Following the same idea as in Lemma 2.2 and Proposition 2.1, we derive an upper bound for $\Pr(G_A, B_A \mid X_A, N)$.

Proposition 2.2 Under Assumptions 2.1 and 2.2, for any $A \subseteq V$,
\[
\Pr(G_A, B_A \mid X_A, N) \leq \int_{G_A \in \mathcal{PS}(\Delta U_A(B_A, X_A, \varepsilon_A; \theta_u))} dF_{\varepsilon_A}(\varepsilon_A \mid \theta_\varepsilon). \tag{2.26}
\]

Proof. See the appendix.

The upper bound in (2.26) has a simple interpretation. It is the probability that $G_A$ is PS for the given $B_A$. Note that inequality (2.26) uses the realized $B_A$ in the data, rather than considering all possible $B_A$ as in (2.24) and (2.25). As a consequence, this inequality is more informative about the parameters. Another benefit from using the observed $B_A$ is that the computation of the bound is much easier than those in (2.24) and (2.25). We mention again that $B_A$ can be captured by $\binom{|A|}{1} + \binom{|A|}{2}$ integers, each of which takes a value in $0, 1, \ldots, \bar{b}$. This substantially reduces the dimension of $B_A$ in $\Pr(G_A, B_A \mid X_A, N)$ and improves the estimation precision.

Example 2.4 (Example 2.3 continued) We have the same setting as in Example 2.3 and derive inequalities (2.21) and (2.22) for subnetwork $G_{13} = 1$. Note that in this simple example inequalities (2.24) and (2.25) are the same as inequalities (2.21) and (2.22) because $G_{-13} = (G_{12}, G_{23}) = B_{13}$. Unlike in Example 2.3, the $\mathcal{E}(H)$ in Tables 2 and 3 are not useful for calculating the integration regions in (2.21) and (2.22). Instead, we calculate them directly. First note that the possible $PS_{NT}$ sets of $G_{13}$ are only $\{G_{13} = 1\}$ and $\{G_{13} = 0\}$, and the complement $G_{-13}$ takes four possible values, i.e., $\mathcal{G}_{-13} = \{(0, 0), (1, 0), (0, 1), (1, 1)\}$, where, for example, $(1, 0)$ represents the complement with $G_{12} = 1$ and $G_{23} = 0$.
As the pairwise stability of $G_{13}$ for a given $G_{-13}$ is affected by $\varepsilon$ only through $\varepsilon_{13}$, we find the regions of $\varepsilon_{13}$ in which the $PS_{NT}$ set is $\{G_{13} = 1\}$ or $\{G_{13} = 0\}$ for each $G_{-13} \in \mathcal{G}_{-13}$. All the regions are listed in Table 5. Using these regions, we can calculate the integration regions in (2.21) and (2.22). To be specific, the integration region in (2.21) for $G_{13} = 1$ (i.e., the region in which there is $G_{-13}$ such that $G_{13} = 1$ is $PS_{NT}$) is equal to the union of the four regions in the top panel of Table 5, i.e., $[-v-w, \infty)$ (recall that $v < 0 < v + w$). Moreover, the integration region in (2.22) for $G_{13} = 1$ (i.e., the region in which there is no $G_{-13}$ such that $G_{13} = 0$ is $PS_{NT}$) is equal to the complement of the union of the four regions in the bottom panel of Table 5, i.e., $[-v, \infty)$. Therefore, (2.21) and (2.22) for $G_{13} = 1$ are
\[
\Pr(G_{13} = 1) \leq \int_{[-v-w, \infty)} dF_{\varepsilon_{13}}, \tag{2.27}
\]
and
\[
\Pr(G_{13} = 1) \geq \int_{[-v, \infty)} dF_{\varepsilon_{13}}. \tag{2.28}
\]

Table 5: The $PS_{NT}$ Sets of Subnetworks

| $PS_{NT}$ sets of $G_{13}$ | $G_{-13}$ | Corresponding regions of $\varepsilon_{13}$ |
| $\{G_{13} = 1\}$ | $G_{-13} = (0, 0)$ | $[0, \infty)$ |
| | $G_{-13} = (1, 0)$ | $[-v, \infty)$ |
| | $G_{-13} = (0, 1)$ | $[-v, \infty)$ |
| | $G_{-13} = (1, 1)$ | $[-v-w, \infty)$ |
| $\{G_{13} = 0\}$ | $G_{-13} = (0, 0)$ | $(-\infty, 0)$ |
| | $G_{-13} = (1, 0)$ | $(-\infty, -v)$ |
| | $G_{-13} = (0, 1)$ | $(-\infty, -v)$ |
| | $G_{-13} = (1, 1)$ | $(-\infty, -v-w)$ |

We define the identified set as the collection of $\theta \in \Theta$ that satisfy inequalities (2.24)-(2.26).

Definition 2.3 The identified set of $\theta \in \Theta$ is
\[
\Theta_I = \{\theta \in \Theta \mid \text{(2.24)-(2.26) hold},\ \forall A \text{ s.t. } 2 \leq |A| \leq \bar{a},\ \forall G_A,\ \forall B_A,\ \forall X_A\} \tag{2.29}
\]
for some integer $\bar{a}$.

$\Theta_I$ is nonempty because the true $\theta_0$ lies in it. We want $\Theta_I$ to be as tight as possible (recall that the inequalities in (2.24)-(2.26) are not sharp), and thus define it using all the feasible inequalities. The upper limit $\bar{a}$ restricts the size of the subnetworks to consider. It is chosen according to the computational capability. In practice we choose $\bar{a} = 5$.

2.4 Estimation

In this section, we discuss the estimation and inference of the identified set in (2.29). This set is defined by conditional moment inequalities (2.24)-(2.26).
We assume $X$ is discrete and transform the conditional moment inequalities into unconditional moment inequalities, i.e.,
\[
\begin{aligned}
p(g_A, x_A, n) - H_1(g_A, x_A, n; \theta)\, p(x_A, n) &\leq 0, \\
H_2(g_A, x_A, n; \theta)\, p(x_A, n) - p(g_A, x_A, n) &\leq 0, \\
p(g_A, b_A, x_A, n) - H_3(g_A, b_A, x_A, n; \theta)\, p(x_A, n) &\leq 0,
\end{aligned} \tag{2.30}
\]
where
\[
\begin{aligned}
p(g_A, b_A, x_A, n) &= \Pr(G_A = g_A, B_A = b_A, X_A = x_A, N = n), \\
p(g_A, x_A, n) &= \Pr(G_A = g_A, X_A = x_A, N = n), \\
p(x_A, n) &= \Pr(X_A = x_A, N = n),
\end{aligned} \tag{2.31}
\]
and
\[
\begin{aligned}
H_1(g_A, x_A, n; \theta) &= \int \max_{b_A} 1\{g_A \in \mathcal{PS}(\Delta U_A(b_A, x_A, \varepsilon_A; \theta_u))\}\, dF_{\varepsilon_A}(\varepsilon_A \mid \theta_\varepsilon), \\
H_2(g_A, x_A, n; \theta) &= \int \min_{g'_A \neq g_A,\, b_A} 1\{g'_A \notin \mathcal{PS}(\Delta U_A(b_A, x_A, \varepsilon_A; \theta_u))\}\, dF_{\varepsilon_A}(\varepsilon_A \mid \theta_\varepsilon), \\
H_3(g_A, b_A, x_A, n; \theta) &= \int 1\{g_A \in \mathcal{PS}(\Delta U_A(b_A, x_A, \varepsilon_A; \theta_u))\}\, dF_{\varepsilon_A}(\varepsilon_A \mid \theta_\varepsilon).
\end{aligned} \tag{2.32}
\]
The unconditional moment inequalities are equivalent to their conditional counterparts as long as we consider all possible values of $(x_A, n)$ (CHT, 2007; Bugni, 2010).

In Section 2.4.1, we first address the issues in estimating $p(g_A, x_A, n)$ and $p(g_A, b_A, x_A, n)$ when the individuals in a network have no identities. Then we discuss in Section 2.4.2 how to estimate the identified set and construct a confidence region, assuming that the bound functions in (2.32) are known. The computation of the bounds is discussed in Section 2.4.3.

2.4.1 Graph isomorphism

In order to estimate $p(g_A, x_A, n)$ and $p(g_A, b_A, x_A, n)$, we need to label the individuals in a network $1, 2, \ldots, N$ so that we can construct the matrix of the network and obtain a subnetwork from the corresponding submatrix. However, most network data (such as Add Health) do not have individuals' identities, i.e., the individuals in each network can be numbered arbitrarily. [34] If $X$ is continuous, the individuals in a network have distinct $X$ with probability one, so we can sort the individuals by their $X$ and use the rank of each individual as the label. However, if $X$ is discrete, with positive probability two individuals have the same $X$.
In this case, there is more than one way to label the individuals in a network, and how we label them affects the matrix representations. For example, the two networks of two boys (the black dots) and two girls (the white dots) in Figure 4 have exactly the same topology. We label the girls $1, 2$ and the boys $3, 4$ in both networks. However, because the two boys are labeled differently in terms of their relative positions to the two girls, we obtain two different matrices. Note that this labeling problem also arises in subnetworks.

Footnote 34: Network data that have individuals' identities are those that contain repeated observations for the same group of individuals, for example, a time series of networks. In such data, the individuals in each network are numbered in a fixed way, and individual $i$ in any network represents the same agent. The difference between network data without identities and those with identities is analogous to the difference between repeated cross-sectional data and panel data.

Another problem due to the lack of identities is how to choose the subnetworks from a network. Because individual $i$ in network $j$ has nothing to do with individual $i$ in network $k$, it makes no sense to choose from each network only the subnetworks that have labels in a given subset $A$, e.g., $A = \{1, 2\}$. [35] Instead, we should look at all the subnetworks of size $|A|$. If there is more than one such subnetwork that has the targeted $x_A$, we will have trouble determining which subnetwork represents $g_A$. For example, if we consider the subnetworks of a boy and a girl in the networks in Figure 4, we will find (in both networks) four such subnetworks, i.e., the subnetworks in $\{1, 3\}$, $\{1, 4\}$, $\{2, 3\}$, and $\{2, 4\}$.

Footnote 35: In network data that have individuals' identities, however, we should only consider the subnetworks that have labels in the given $A$ because only these subnetworks are the observations for the subnetwork of the individuals in $A$.

Figure 4: An Example of the Labeling Problem
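The labeling problem can be reproduced in a few lines. The sketch below builds two labelings of a hypothetical network of two girls and two boys (the exact links in Figure 4 are not recoverable here, so the links are assumed), shows that their matrices differ, and maps both to a common canonical form by brute force over all relabelings. This brute-force search is an illustrative substitute that is only practical for very small networks; it is not the Nauty algorithm.

```python
from itertools import permutations
import numpy as np

def adjacency(n, edges):
    """Build a symmetric 0/1 adjacency matrix from a list of undirected edges."""
    g = np.zeros((n, n), dtype=int)
    for i, j in edges:
        g[i, j] = g[j, i] = 1
    return g

def canonical_form(g, x):
    """Smallest (attributes, upper-triangle links) pair over all relabelings.

    Two attributed networks are isomorphic exactly when this value coincides.
    """
    g = np.asarray(g)
    n = len(x)
    best = None
    for perm in permutations(range(n)):
        # perm[k] is the original individual placed at new position k.
        x_new = tuple(x[p] for p in perm)
        g_new = tuple(int(g[perm[r], perm[c]])
                      for r in range(n) for c in range(r + 1, n))
        key = (x_new, g_new)
        if best is None or key < best:
            best = key
    return best

# Attribute 0 = girl, 1 = boy; individuals 0, 1 are girls and 2, 3 are boys.
x = (0, 0, 1, 1)
g1 = adjacency(4, [(0, 2), (1, 3)])   # each girl linked to "her" boy
g2 = adjacency(4, [(0, 3), (1, 2)])   # same topology, boys labeled the other way
g3 = adjacency(4, [(0, 1)])           # a genuinely different network

print(np.array_equal(g1, g2))                              # matrices differ
print(canonical_form(g1, x) == canonical_form(g2, x))      # same equivalence class
```

Grouping observed subnetworks by such a canonical form is one way to count realizations of equivalence classes rather than of arbitrary labelings.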
These problems arise because we try to distinguish between the networks/subnetworks that are observationally equivalent in network data without identities. The first problem is resolved if we consider the distribution of the equivalence classes of networks/subnetworks. As for the second problem, we show that the subnetworks of the same size follow the same distribution, so it does not matter which subnetwork we choose. The rest of the section is devoted to proving these results.

Fix $n$ and let $\pi$ be a permutation over $V = \{1, \ldots, n\}$. Denote by $i^\pi := \pi(i)$ the image of $i \in V$ and by $A^\pi := \{\pi(i) : i \in A\}$ the image of $A \subseteq V$. For expositional convenience, we associate a network/subnetwork with the attributes and write a network as $(g, x, n)$, a subnetwork as $(g_A, x_A, n)$, and a subnetwork with a neighborhood as $(g_A, b_A, x_A, n)$.

First we consider networks. For network $(g, x, n)$, we define $(g^\pi, x^\pi, n)$ to be a network such that
\[
g^\pi_{i^\pi j^\pi} = g_{ij},\ \forall i, j \in V,\ i \neq j; \qquad x^\pi_{i^\pi} = x_i,\ \forall i \in V,
\]
i.e., the edges and attributes are preserved under $\pi$. It is clear that networks $(g, x, n)$ and $(g^\pi, x^\pi, n)$ are equivalent except for the labels of the individuals. Following the terminology in graph theory, we say that $(g, x, n)$ and $(g^\pi, x^\pi, n)$ are isomorphic [36] and write
\[
(g, x, n) \cong (g^\pi, x^\pi, n).
\]
Intuitively, we should treat $(g, x, n)$ and $(g^\pi, x^\pi, n)$ as the same realization of $(G, X, N)$. This is valid because
\[
\Pr(G = g, X = x, N = n) = \Pr(G = g^\pi, X = x^\pi, N = n). \tag{2.33}
\]
The proof of equation (2.33) is in the appendix.

Footnote 36: See, e.g., Godsil and Royle (2001).

Now we consider subnetworks. The logic is similar. For subnetwork $(g_A, x_A, n)$, we define $(g^\pi_{A^\pi}, x^\pi_{A^\pi}, n)$ to be a subnetwork in $A^\pi$ such that
\[
g^\pi_{i^\pi j^\pi} = g_{ij},\ \forall i, j \in A,\ i \neq j; \qquad x^\pi_{i^\pi} = x_i,\ \forall i \in A,
\]
i.e., the edges in $g_A$ and the attributes in $A$ are preserved under $\pi$. Then $(g_A, x_A, n)$ and $(g^\pi_{A^\pi}, x^\pi_{A^\pi}, n)$ are equivalent subnetworks except for the labels.
We say they are isomorphic and write
\[
(g_A, x_A, n) \cong (g^\pi_{A^\pi}, x^\pi_{A^\pi}, n).
\]
Similar to equation (2.33), we show in the appendix that
\[
\Pr(G_A = g_A, X_A = x_A, N = n) = \Pr(G_{A^\pi} = g^\pi_{A^\pi}, X_{A^\pi} = x^\pi_{A^\pi}, N = n). \tag{2.34}
\]
Equation (2.34) has several implications. First, for $A = A^\pi$, we can treat $(g_A, x_A, n)$ and $(g^\pi_{A^\pi}, x^\pi_{A^\pi}, n) = (g^\pi_A, x^\pi_A, n)$ as the same realization of $(G_A, X_A, N)$. This is similar to what we do for networks. Second, for $A \neq A^\pi$, equation (2.34) implies
\[
(G_A, X_A, N) \overset{d}{=} (G_{A^\pi}, X_{A^\pi}, N), \tag{2.35}
\]
i.e., subnetworks $(G_A, X_A, N)$ and $(G_{A^\pi}, X_{A^\pi}, N)$ follow the same distribution. [37] This demonstrates that we can pick any subnetwork of size $|A|$ from a network to estimate $p(g_A, x_A, n)$. Moreover, given that $(G_A, X_A, N)$ and $(G_{A^\pi}, X_{A^\pi}, N)$ follow the same distribution, we can treat $(g_A, x_A, n)$ and $(g^\pi_{A^\pi}, x^\pi_{A^\pi}, n)$ as the same realization.

Footnote 37: We can choose a $\pi$ such that $i < j \Leftrightarrow i^\pi < j^\pi$, $\forall i, j \in A$, $i \neq j$, so $g^\pi_{A^\pi} = g_A$ and $x^\pi_{A^\pi} = x_A$. Expression (2.35) then follows from equation (2.34).

The same argument applies to subnetworks with neighborhoods. For example, we show that
\[
(G_A, B_A, X_A, N) \overset{d}{=} (G_{A^\pi}, B_{A^\pi}, X_{A^\pi}, N). \tag{2.36}
\]
See the appendix for more details.

In sum, we can choose any subnetwork of size $|A|$ from a network to estimate $p(g_A, x_A, n)$ and $p(g_A, b_A, x_A, n)$ because all such subnetworks follow the same distribution. Moreover, we treat all the isomorphic $(g_A, x_A, n)$ and $(g_A, b_A, x_A, n)$ as the same realizations, i.e., we consider the equivalence classes of subnetworks. To do this in practice, we need a graph-isomorphism algorithm to find which subnetworks are isomorphic. In the Monte Carlo simulations we use the algorithm Nauty. [38] More details about how to implement Nauty can be found in the appendix.

How are the moment inequalities in (2.30) affected if we replace the subnetworks by their equivalence classes?
In fact, it is easy to show that the bound functions in (2.32) are invariant under isomorphisms, e.g.,
\[
H_1(g_A, x_A, n; \theta) = H_1(g^\pi_{A^\pi}, x^\pi_{A^\pi}, n; \theta).
\]
Therefore, we can simply multiply the bound functions for a subnetwork by the number of subnetworks isomorphic to the subnetwork, and the moment inequalities in (2.30) still hold.

Footnote 38: See http://cs.anu.edu.au/~bdm/nauty/.

2.4.2 Estimation and inference

Now we discuss how to estimate the identified set and construct a confidence region. Let $a = |A|$, and denote by $(g_a, x_a, n)$ and $(g_a, b_a, x_a, n)$ the realizations of $(G_A, X_A, N)$ and $(G_A, B_A, X_A, N)$. This is valid because the values of the realizations are the equivalence classes of subnetworks, which depend only on $|A|$.

As discussed, we can pick from each network any subnetwork of size $a$. These subnetworks form an i.i.d. sample and can provide consistent estimators for $p(g_A, x_A, n)$ and $p(g_A, b_A, x_A, n)$. However, we exploit the data more efficiently by picking from each network all subnetworks of size $a$. By taking the average over all such subnetworks from the same network we obtain estimators with smaller variances. [39]

Footnote 39: Note that subnetworks from the same network are dependent because they are part of a single PS network, but this dependence does not cause any additional problems when we consider the asymptotics for a large number of networks.

To be precise, let $A_1, \ldots, A_{\binom{n}{a}}$ be all the subsets of $V = \{1, \ldots, n\}$ that have $a$ elements. We define the moment functions for observation $t$ to be
\[
\begin{aligned}
m_t^{a1}(g_a, x_a, n; \theta) &= p_t(g_a, x_a, n) - H_1(g_a, x_a, n; \theta)\, p_t(x_a, n), \\
m_t^{a2}(g_a, x_a, n; \theta) &= H_2(g_a, x_a, n; \theta)\, p_t(x_a, n) - p_t(g_a, x_a, n), \\
m_t^{a3}(g_a, b_a, x_a, n; \theta) &= p_t(g_a, b_a, x_a, n) - H_3(g_a, b_a, x_a, n; \theta)\, p_t(x_a, n),
\end{aligned} \tag{2.37}
\]
for $a = 2, \ldots, \bar{a}$, where $p_t(g_a, b_a, x_a, n)$, $p_t(g_a, x_a, n)$, and $p_t(x_a, n)$ are the averages of the indicator functions over all the subnetworks of size $a$ in network $t$, i.e.,
\[
\begin{aligned}
p_t(g_a, b_a, x_a, n) &= \frac{1}{\binom{n}{a}} \sum_{j=1}^{\binom{n}{a}} 1\{G_{t,A_j} = g_a, B_{t,A_j} = b_a, X_{t,A_j} = x_a, N_t = n\}, \\
p_t(g_a, x_a, n) &= \frac{1}{\binom{n}{a}} \sum_{j=1}^{\binom{n}{a}} 1\{G_{t,A_j} = g_a, X_{t,A_j} = x_a, N_t = n\}, \\
p_t(x_a, n) &= \frac{1}{\binom{n}{a}} \sum_{j=1}^{\binom{n}{a}} 1\{X_{t,A_j} = x_a, N_t = n\}.
\end{aligned} \tag{2.38}
\]
Equation (2.34) implies that $E\, p_t(g_a, x_a, n) = p(g_a, x_a, n) := \Pr(G_A = g_a, X_A = x_a, N = n)$. Moreover, $p_t(g_a, x_a, n)$ has a smaller variance than $1\{G_{t,A_1} = g_a, X_{t,A_1} = x_a, N_t = n\}$. [40] Similar results hold for $p_t(g_a, b_a, x_a, n)$ and $p_t(x_a, n)$.

Footnote 40: This is due to the Cauchy-Schwarz inequality and $\Pr(G_{A_j} = G_{A_k}, X_{A_j} = X_{A_k}, N = n) \neq 1$, $\forall j \neq k$.

We stack all the moment functions in (2.37) into a vector. Suppose $(g_a, x_a, n)$ and $(g_a, b_a, x_a, n)$ have $\kappa_g$ and $\kappa_b$ distinct equivalence classes, i.e.,
\[
(g_a, x_a, n)_1, \ldots, (g_a, x_a, n)_{\kappa_g},
\]
and
\[
(g_a, b_a, x_a, n)_1, \ldots, (g_a, b_a, x_a, n)_{\kappa_b},
\]
where $\kappa_g$ and $\kappa_b$ are finite numbers that depend on $a$ and the number of possible values of $(x_a, n)$. The moment functions in (2.37) form the vector
\[
m_t(\theta) = \begin{pmatrix} m_t^{2}(\theta) \\ \vdots \\ m_t^{\bar{a}}(\theta) \end{pmatrix},
\]
where for each $a = 2, \ldots, \bar{a}$,
\[
m_t^{a}(\theta) = \begin{pmatrix} m_t^{a1}(\theta) \\ m_t^{a2}(\theta) \\ m_t^{a3}(\theta) \end{pmatrix},
\]
with
\[
\begin{aligned}
m_t^{a1}(\theta) &= \big(m_t^{a1}((g_a, x_a, n)_1; \theta), \ldots, m_t^{a1}((g_a, x_a, n)_{\kappa_g}; \theta)\big)', \\
m_t^{a2}(\theta) &= \big(m_t^{a2}((g_a, x_a, n)_1; \theta), \ldots, m_t^{a2}((g_a, x_a, n)_{\kappa_g}; \theta)\big)', \\
m_t^{a3}(\theta) &= \big(m_t^{a3}((g_a, b_a, x_a, n)_1; \theta), \ldots, m_t^{a3}((g_a, b_a, x_a, n)_{\kappa_b}; \theta)\big)'.
\end{aligned}
\]
Then the moment inequalities in (2.30) are simplified to
\[
E\, m_t(\theta) \leq 0. \tag{2.39}
\]

We estimate the identified set by minimizing the sample analog of a criterion function constructed based on the moment inequalities (CHT, 2007; Andrews & Guggenberger, 2009; Andrews & Soares, 2010; Andrews & Jia, 2012; Bugni, 2010; Romano & Shaikh, 2010, among others). Following the literature, we use the criterion function
\[
Q(\theta) = \big\| (E\, m_t(\theta))_+ \big\|^2, \tag{2.40}
\]
where $(x)_+ = \max(x, 0)$ and $\|\cdot\|$ is the Euclidean norm. The identified set is simply
\[
\Theta_I = \{\theta \in \Theta : Q(\theta) = 0\}. \tag{2.41}
\]
Its estimation is based on the sample analog of $Q(\theta)$,
\[
Q_T(\theta) = \big\| (E_T\, m_t(\theta))_+ \big\|^2, \tag{2.42}
\]
where $E_T\, m_t(\theta) = \frac{1}{T} \sum_{t=1}^{T} m_t(\theta)$ is the sample average of the moments. To account for misspecification, we use the normalized sample criterion function (CHT, 2007; Ciliberto & Tamer, 2009)
\[
Q'_T(\theta) = Q_T(\theta) - \inf_{\theta' \in \Theta} Q_T(\theta'). \tag{2.43}
\]
The estimator of $\Theta_I$ is given by
\[
\hat{\Theta}_I = \{\theta \in \Theta : T Q'_T(\theta) \leq c_T\}, \tag{2.44}
\]
where $c_T$ is chosen such that $c_T \to \infty$ and $c_T / T \to 0$, e.g., $c_T \propto \ln T$. Let $d(A, B)$ denote the Hausdorff distance between sets $A$ and $B$,
\[
d(A, B) = \max\Big\{ \sup_{a \in A} d(a, B),\ \sup_{b \in B} d(b, A) \Big\},
\]
with $d(a, B) = \inf_{b \in B} \|a - b\|$. Following CHT (2007), it can be shown that $\hat{\Theta}_I$ is a consistent estimator of $\Theta_I$, i.e., $d(\hat{\Theta}_I, \Theta_I) \overset{p}{\to} 0$ as $T \to \infty$.

Theorem 2.1 Suppose Assumptions 2.1 and 2.2 are satisfied. Assume $\Theta$ is a compact set.

(a) If $Q(\theta)$ and $Q_T(\theta)$ defined in (2.40) and (2.42) satisfy (i) $Q(\theta)$ is continuous in $\theta$, (ii) $\sup_{\theta \in \Theta} |Q(\theta) - Q_T(\theta)| = O_p(1/\sqrt{T})$, and (iii) $\sup_{\theta \in \Theta_I} Q_T(\theta) = O_p(1/T)$, then $\hat{\Theta}_I$ in (2.44) is a consistent estimator of $\Theta_I$, i.e.,
\[
d(\hat{\Theta}_I, \Theta_I) \overset{p}{\to} 0 \quad \text{as } T \to \infty.
\]
(b) For the utility function in (2.3), conditions (i)-(iii) in part (a) hold.

Proof. See the appendix.

The confidence region for the true parameters is constructed by inverting the acceptance region of a test (e.g., CHT, 2007; Andrews & Guggenberger, 2009; Andrews & Soares, 2010; Andrews & Jia, 2012). We use the confidence region proposed by CHT,
\[
C_T = \{\theta \in \Theta : T Q'_T(\theta) \leq c_{1-\alpha}(\theta)\}, \tag{2.45}
\]
where the critical value $c_{1-\alpha}(\theta) := \min(\hat{c}_{1-\alpha}(\theta), \hat{c}_{1-\alpha})$. $\hat{c}_{1-\alpha}(\theta)$ is a consistent estimator of $c_{1-\alpha}(\theta)$, the $1 - \alpha$ quantile of the limiting distribution of $T Q'_T(\theta)$. $\hat{c}_{1-\alpha}$ is an estimator that is $O_p(1)$ and larger than $\sup_{\theta \in \Theta_I} c_{1-\alpha}(\theta)$. It can be shown that for any parameter values in the identified set, the confidence region in (2.45) has an asymptotically correct size, i.e.,
\[
\inf_{\theta \in \Theta_I} \liminf_{T \to \infty} \Pr(\theta \in C_T) \geq 1 - \alpha.
\]
As the limiting distribution of $T Q'_T(\theta)$ is data dependent, we propose to use the subsampling method in CHT to obtain $c_{1-\alpha}(\theta)$.

The computation of the identified set and confidence region is nontrivial. These $(d_{\theta_u} + d_{\theta_\varepsilon})$-dimensional sets are defined as the lower level sets of the normalized sample criterion function, which needs to be computed by simulation (see Section 2.4.3 for the computation of the bound functions). Currently there is no optimization solver designed to find a continuum of optima for a simulated objective function. Algorithms like grid search or simulated annealing (Ciliberto & Tamer, 2009) are inefficient because they try to approximate an entire set. We propose a new algorithm that seeks to approximate the boundary of a set. [41] Suppose we want to compute the set $\{\theta \in \Theta : \tilde{Q}(\theta) \leq c\}$. The optimal $\theta$ that solves the minimization problem
\[
\min_{\theta \in \Theta} \big(\tilde{Q}(\theta) - c\big)^2
\]
gives a point on the boundary of the set. By randomly choosing the starting points and keeping only the solutions $(\theta, \tilde{Q}(\theta))$ that satisfy $\tilde{Q}(\theta) \approx c$, we can generate as many boundary points as we want. [42] Note that this algorithm requires $\tilde{Q}(\theta)$ to be continuous. [43] More details about the implementation of the algorithm are discussed in the appendix.

Footnote 41: We are grateful to Arnold Neumaier at the University of Vienna for suggesting the algorithm.

Footnote 42: We use the built-in local solver fminsearch in MATLAB to solve this minimization problem. fminsearch is a derivative-free local optimization solver which implements the Nelder-Mead algorithm.

Footnote 43: Otherwise, there is no guarantee that $\{\theta \in \Theta : \tilde{Q}(\theta) = c\}$ is the boundary of the set.

2.4.3 Computation of the bound functions

In this section we discuss how to compute the bound functions in (2.32). For expositional convenience, we denote the realizations by $(g_A, x_A, n)$ and $(g_A, b_A, x_A, n)$.

By virtue of Assumption 2.1, $H_3(g_A, b_A, x_A, n; \theta)$ has a closed form, i.e.,
\[
\prod_{\substack{i,j \in A \\ g_{ij} = 1}} \Pr\begin{pmatrix} \Delta_{ij} U_i(b_{ij}(g_{A,-ij}, b_A), x_i, x_j, \varepsilon_{ij}) \geq 0, \\ \Delta_{ij} U_j(b_{ij}(g_{A,-ij}, b_A), x_j, x_i, \varepsilon_{ji}) \geq 0 \end{pmatrix}
\times \prod_{\substack{i,j \in A \\ g_{ij} = 0}} \Pr\begin{pmatrix} \Delta_{ij} U_i(b_{ij}(g_{A,-ij}, b_A), x_i, x_j, \varepsilon_{ij}) < 0 \ \text{or} \\ \Delta_{ij} U_j(b_{ij}(g_{A,-ij}, b_A), x_j, x_i, \varepsilon_{ji}) < 0 \end{pmatrix} \tag{2.46}
\]
in the nontransferable case and
\[
\prod_{\substack{i,j \in A \\ g_{ij} = 1}} \Pr\begin{pmatrix} \Delta_{ij} U_i(b_{ij}(g_{A,-ij}, b_A), x_i, x_j, \varepsilon_{ij}) \\ +\, \Delta_{ij} U_j(b_{ij}(g_{A,-ij}, b_A), x_j, x_i, \varepsilon_{ji}) \geq 0 \end{pmatrix}
\times \prod_{\substack{i,j \in A \\ g_{ij} = 0}} \Pr\begin{pmatrix} \Delta_{ij} U_i(b_{ij}(g_{A,-ij}, b_A), x_i, x_j, \varepsilon_{ij}) \\ +\, \Delta_{ij} U_j(b_{ij}(g_{A,-ij}, b_A), x_j, x_i, \varepsilon_{ji}) < 0 \end{pmatrix} \tag{2.47}
\]
in the transferable case. These functions are easy to compute when $\varepsilon_{ij}$ and $\varepsilon_{ji}$ are assumed to follow a logit or normal distribution.

$H_1(g_A, x_A, n; \theta)$ and $H_2(g_A, x_A, n; \theta)$ do not have closed forms because of the maximal or minimal functions in the integrands. These bounds need to be computed by simulation. One simple way is to use crude frequency simulators. That is, we simulate an i.i.d. sample $\varepsilon_{A,1}, \ldots, \varepsilon_{A,R}$ and use the averages of the maximal and minimal functions in (2.32) over this sample as the simulators (McFadden, 1989; Pakes & Pollard, 1989). However, these simulators are not continuous in $\theta$ and lead to computational challenges in the estimation.
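The lack of continuity of a crude frequency simulator can be seen in a one-link sketch. It compares a frequency simulator of $\Pr(d + \varepsilon_{ij} + \varepsilon_{ji} \geq 0)$ with its closed form under the assumption that $(\varepsilon_{ij}, \varepsilon_{ji})$ are independent standard normal, so that the sum has variance 2. The scalar $d$ is a hypothetical stand-in for the deterministic part of the summed marginal utilities, and the function names and the number of draws are illustrative choices.

```python
import math
import numpy as np

def normal_cdf(z):
    """Standard normal distribution function, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def closed_form_prob(d):
    """Pr(d + e1 + e2 >= 0) when e1, e2 are independent N(0, 1): Phi(d / sqrt(2))."""
    return normal_cdf(d / math.sqrt(2.0))

def frequency_prob(d, draws):
    """Crude frequency simulator: share of fixed draws with d + e1 + e2 >= 0."""
    return float(np.mean(d + draws.sum(axis=1) >= 0.0))

rng = np.random.default_rng(0)
draws = rng.standard_normal((100, 2))   # R = 100 draws of (e1, e2), held fixed

grid = np.linspace(-1.0, 1.0, 2001)
freq_values = {frequency_prob(d, draws) for d in grid}

# The frequency simulator is a step function of d: it can only take the
# values 0, 1/R, ..., 1, whereas the closed form moves smoothly with d.
print(len(freq_values), closed_form_prob(0.0))
```

With the draws held fixed, the simulated probability jumps whenever $d$ crosses one of the simulated thresholds and is flat in between, which is what makes a criterion function built on it hard to handle with a solver that presumes continuity.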
In particular, the algorithm we propose in Section 2.4.2 requires the sample criterion function to be continuous.

We construct more sophisticated simulators that are continuous in $\theta$ using the GHK algorithm (Hajivassiliou & Ruud, 1994; Geweke & Keane, 2001). The idea is to express the multiple integrals in $H_1$ and $H_2$ as functions of a sequence of conditional probabilities, each of which involves only one pair of $(\varepsilon_{ij}, \varepsilon_{ji})$ in $\varepsilon_A$ and thus has a closed form if we assume, e.g., that $(\varepsilon_{ij}, \varepsilon_{ji})$ follows a normal distribution. Using the closed forms of the conditional probabilities we can construct simulators that are continuous in $\theta$.

For example, $H_1(g_A, x_A, n; \theta)$ is the probability of the event that subnetwork $g_A$ is PS for some neighborhood $b_A$. Let $a = |A|$ and denote the $a(a-1)/2$ links in $g_A$ by $g_{12}, g_{13}, \ldots, g_{a-1,a}$ (we assume $a \geq 3$). The above event is equivalent to a sequence of events that
\[
\begin{aligned}
& g_{12} \text{ is PS for some } b_A, \text{ and} \\
& g_{13} \text{ is PS for some } b_A \text{ such that } g_{12} \text{ is PS, and} \\
& \quad \vdots \\
& g_{a-1,a} \text{ is PS for some } b_A \text{ such that } g_{12}, \ldots, g_{a-2,a} \text{ are PS.}
\end{aligned} \tag{2.48}
\]
The event in each line of (2.48) involves only one pair of $(\varepsilon_{ij}, \varepsilon_{ji})$, so its probability conditional on the events in the previous lines can be computed analytically. The analytical forms of the conditional probabilities are continuous in $\theta$ because the utility and distribution functions are continuous in $\theta$. See the appendix for details about the simulators.

2.5 Monte Carlo Simulations

In this section, we implement Monte Carlo simulations for the methods developed in Sections 2.3 and 2.4. For comparison, we consider three data generating processes (DGPs) with different network sizes $N = 3$, $4$, and $6$. Other specifications of the DGPs are the same. We use the utility function
\[
U_i = \sum_{j=1}^{N} G_{ij} u_{ij} + \sum_{j=1}^{N} \sum_{k=1, k \neq i}^{N} G_{ij} G_{jk} v + \sum_{j=1}^{N} \sum_{k > j}^{N} G_{ij} G_{ik} G_{jk} w,
\qquad
u_{ij} = \beta |X_i - X_j| + \varepsilon_{ij},
\]
where $X \in \{0, 1\}$ is a dummy variable and $(\varepsilon_{ij}, \varepsilon_{ji}) \overset{iid}{\sim} N(0, I_2)$, where $I_2$ is the $2 \times 2$ identity matrix.
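The utility function above can be evaluated directly, which gives a compact way to check whether a small network is pairwise stable with transfers. The sketch below rests on two assumptions that are not restated in this section: the marginal utility of a link is taken to be the utility with the link minus the utility without it, and a network is treated as pairwise stable with transfers when the summed marginal utilities of the two parties are nonnegative for every existing link and negative for every absent link, in line with the conditions in (2.47). All function names, the 0-based labels, and the numerical inputs are hypothetical.

```python
import numpy as np

def utility(i, G, X, eps, beta, v, w):
    """U_i for a symmetric 0/1 matrix G with zero diagonal, as in the DGP above."""
    n = G.shape[0]
    direct = sum(G[i, j] * (beta * abs(X[i] - X[j]) + eps[i, j])
                 for j in range(n) if j != i)
    indirect = sum(G[i, j] * G[j, k]
                   for j in range(n) for k in range(n)
                   if j != i and k != i and k != j)
    triangles = sum(G[i, j] * G[i, k] * G[j, k]
                    for j in range(n) for k in range(j + 1, n)
                    if i not in (j, k))
    return direct + v * indirect + w * triangles

def marginal_utility(i, j, G, X, eps, beta, v, w):
    """Assumed definition: U_i with link ij present minus U_i with it absent."""
    G1 = G.copy(); G1[i, j] = G1[j, i] = 1
    G0 = G.copy(); G0[i, j] = G0[j, i] = 0
    return (utility(i, G1, X, eps, beta, v, w)
            - utility(i, G0, X, eps, beta, v, w))

def is_ps_transferable(G, X, eps, beta, v, w):
    """Check the transferable-utility stability condition link by link."""
    n = G.shape[0]
    for i in range(n):
        for j in range(i + 1, n):
            s = (marginal_utility(i, j, G, X, eps, beta, v, w)
                 + marginal_utility(j, i, G, X, eps, beta, v, w))
            if (G[i, j] == 1 and s < 0) or (G[i, j] == 0 and s >= 0):
                return False
    return True

# Hypothetical inputs: three identical individuals, all preference shocks equal to 1.
X = np.zeros(3)
eps = np.ones((3, 3))
complete = np.ones((3, 3), dtype=int) - np.eye(3, dtype=int)
empty = np.zeros((3, 3), dtype=int)
mu = marginal_utility(0, 1, complete, X, eps, beta=-0.5, v=-0.4, w=0.6)
print(mu, is_ps_transferable(complete, X, eps, -0.5, -0.4, 0.6))
```

With these inputs the marginal utility of a link in the complete network is $1 + v + w$, so the complete network passes the check while the empty network does not.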
The parameters of interest are $(\beta, v, w)$, and their true values are $(\beta_0, v_0, w_0) = (-0.5, -0.4, 0.6)$. The simulated networks are PST networks, where the equilibrium selection distributions are uniform over each PST set. For each DGP, we consider sample sizes T = 100, 500, and 1000. All the experiments are repeated 10 times. The bounds are computed using R = 100 simulations.

For N = 3 and 4, we use the full networks, not the subnetworks, to estimate the identified sets. These estimates reflect the information deficiency due only to multiple equilibria and thus provide benchmarks. For N = 6, we consider two estimates: one obtained from using subnetworks of size up to a = 3 and the other from using subnetworks of size up to a = 4. We compare these estimates with those from N = 3 and 4 and see how much can be learned from the subnetwork analysis.

The identified sets in this example are three-dimensional and are computed using the algorithm proposed in Section 2.4.2. For each specification, we generate 50 points on the boundary of the identified set. More details about implementing the algorithm are discussed in the appendix. Other computational issues, including how to calculate all possible neighborhoods and how to find the equivalence classes of isomorphic networks/subnetworks, are also discussed in the appendix.

We report the one-dimensional projections of the estimates in Table 6 (for N = 3, 4) and Table 7 (for N = 6) and plot the two-dimensional projections in Figures 5-7. [44] Compared with the estimates using the full networks, the estimates from a = 3 generally give wider bounds in large samples (T = 500, 1000). This is expected given that we only use pairs and triples in a network with 6 individuals. However, the estimates from a = 4 give tight bounds that are very similar to those from N = 3 and 4, especially for $v$ and $w$. For example, for T = 1000, the estimates from a = 4 perform almost as well as those from N = 3 and 4.
This seems to suggest that the subnetwork approach is promising: we save a lot of computational effort by reducing the network size from 6 to 4, and we still obtain tight bounds.

We also find that the estimates from N = 3 and 4 perform poorly in small samples (T = 100), much worse than those from a = 3 and 4. This reflects the fact that a larger network contains more subnetworks of a given size than a smaller network does. For example, in a network with N = 6, there are 20 subnetworks of size 3, while in a network with N = 3, there is only one such subnetwork (the network itself). Taking the average over all subnetworks in a network tends to improve the estimation precision.

Footnote 44: For the two-dimensional projections, we discretize the points (w.r.t. the x-coordinate) into 10 bins and use the maximum and minimum in each bin to approximate the estimated set. Then we calculate the averages and 80th percentiles of the 10 samples within each bin as the mean estimates and confidence regions of the two-dimensional projections.

Table 6: Projections of the Estimated Identified Sets: N = 3, 4

True values    T       N = 3                   N = 4
β = -0.5       100     [-1.461, -0.226]        [-0.849, 0.040]
                       ([-1.369, 0.079])       ([-1.071, 0.052])
v = -0.4               [-6.368, -0.253]        [-3.423, -0.324]
                       ([-10.655, -0.208])     ([-5.515, -0.291])
w = 0.6                [0.445, 14.234]         [0.566, 8.622]
                       ([0.286, 34.058])       ([0.548, 16.473])
β = -0.5       500     [-0.626, -0.432]        [-0.611, -0.372]
                       ([-0.668, -0.370])      ([-0.671, -0.270])
v = -0.4               [-0.513, -0.317]        [-0.564, -0.318]
                       ([-0.571, -0.267])      ([-0.708, -0.290])
w = 0.6                [0.531, 0.835]          [0.551, 0.882]
                       ([0.485, 0.887])        ([0.523, 1.041])
β = -0.5       1000    [-0.579, -0.449]        [-0.571, -0.424]
                       ([-0.617, -0.402])      ([-0.596, -0.404])
v = -0.4               [-0.530, -0.349]        [-0.509, -0.352]
                       ([-0.581, -0.309])      ([-0.536, -0.330])
w = 0.6                [0.555, 0.762]          [0.574, 0.769]
                       ([0.523, 0.822])        ([0.573, 0.783])

Note. Intervals not in parentheses are the mean estimates of the projections. Intervals in parentheses are the 80% confidence intervals.
Table 7: Projections of the Estimated Identified Sets: N = 6

True values    T       N = 6, a = 3            N = 6, a = 4
β = -0.5       100     [-0.657, 0.251]         [-0.633, -0.067]
                       ([-0.737, 0.746])       ([-0.671, 0.079])
v = -0.4               [-0.772, -0.256]        [-0.563, -0.264]
                       ([-0.890, -0.190])      ([-0.645, -0.222])
w = 0.6                [0.447, 1.147]          [0.486, 0.933]
                       ([0.396, 1.275])        ([0.399, 1.072])
β = -0.5       500     [-0.639, 0.112]         [-0.626, -0.036]
                       ([-0.782, 0.304])       ([-0.688, 0.034])
v = -0.4               [-0.802, -0.290]        [-0.564, -0.231]
                       ([-0.919, -0.203])      ([-0.628, -0.183])
w = 0.6                [0.482, 1.159]          [0.467, 0.801]
                       ([0.420, 1.272])        ([0.391, 0.874])
β = -0.5       1000    [-0.692, 0.281]         [-0.656, -0.094]
                       ([-0.820, 0.578])       ([-0.690, -0.056])
v = -0.4               [-0.773, -0.248]        [-0.523, -0.240]
                       ([-0.823, -0.185])      ([-0.592, -0.184])
w = 0.6                [0.415, 1.197]          [0.517, 0.758]
                       ([0.363, 1.273])        ([0.448, 0.784])

Note. Intervals not in parentheses are the mean estimates of the projections. Intervals in parentheses are the 80% confidence intervals.

[Figure 5, four panels: N=3, T=500; N=4, T=500; N=6, a<=3, T=500; N=6, a<=4, T=500. Horizontal axis: beta. Vertical axis: v.]

Figure 5: Estimated identified sets projected to the $(\beta, v)$-plane. Solid curves and dotted curves are the mean estimates and 80th percentiles of the projections.

2.6 Conclusions

In this chapter, we develop a structural model for network formation games. We use pairwise stability to map between observables and primitives and resort to partial identification to cope with multiple equilibria. The bounds derived following Ciliberto and Tamer (2009) are computationally infeasible when networks are large. We contribute to the literature by proposing a novel method based on subnetworks that can provide feasible bounds no matter how large the networks are.
This subnetwork approach does not impose any additional assumptions on the equilibrium condition, nor any restrictions on the dependence between subnetworks. It can provide the most robust estimates for many effects that are of interest in the network literature, such as the homophily effects, the effects of friends of friends and the effects of friends in common.

[Figure 6, four panels: N=3, T=500; N=4, T=500; N=6, a<=3, T=500; N=6, a<=4, T=500. Horizontal axis: beta. Vertical axis: w.]

Figure 6: Estimated identified sets projected to the $(\beta, w)$-plane. Solid curves and dotted curves are the mean estimates and 80th percentiles of the projections.

[Figure 7, four panels: N=3, T=500; N=4, T=500; N=6, a<=3, T=500; N=6, a<=4, T=500. Horizontal axis: v. Vertical axis: w.]

Figure 7: Estimated identified sets projected to the $(v, w)$-plane. Solid curves and dotted curves are the mean estimates and 80th percentiles of the projections.

Our approach is also useful for policy evaluation. For example, consider a policy maker who is interested in policies that reduce racial segregation. Suppose that she starts an interdistrict transfer program which can increase racial diversity of schools, and her goal is to promote interracial friendships, measured by the proportion of interracial friendships among all possible relationships. Our approach can be used to evaluate such a program, but with limitations. Given that the model parameters are only partially identified and there may be multiple equilibria when we simulate network outcomes, for a given racial composition what we can predict is an interval for the proportion of interracial friendships. The endpoints of this interval are the maximal and minimal proportions among all the pairwise stable networks for all the parameter values in the estimated identified set.
Moreover, if the policy maker is interested in the optimal racial composition, we may need to use, e.g., a minimax-regret criterion in statistical decision theory (Manski, 2004) and find the racial composition that maximizes the lowest level of interracial friendships. We leave this policy analysis for future work.

One concern about our approach is how informative the bounds are given that we only consider small subnetworks. In a small-scale Monte Carlo study, we show that picking 4 individuals out of 6 can provide tight bounds for the parameters. In practice, however, there may be hundreds or even thousands of individuals in a network (like Facebook). Will the bounds still be useful? Our answer is that it depends on the application. In fact, the tightness of the bounds derived from subnetworks, relative to that of the bounds derived from networks, mainly depends on two things: the number of possible neighborhoods ($|B_A|$) and the magnitude of the utility interdependence ($v$ and $w$). If both of them are large, the maximal and minimal values of the neighborhood effects will be poor approximations for the realized neighborhood effect. In this case, we suspect that the bounds are too wide to be informative. However, if either the number of possible neighborhoods or the utility interdependence is small, i.e., varying across neighborhoods has little influence on the utility from a link, we expect that the bounds are still tight. Further evaluation of the tightness of the bounds is left for future work.

Another limitation is that we need a large number of networks. Because the subnetworks from a network are arbitrarily correlated, we cannot use the asymptotics for a large number of individuals within a single network. This is not because of the utility interdependence, but because the pairwise stability condition leads to arbitrary dependence between any two links.
We may have to relax the pairwise stability condition to some extent in order to allow for the asymptotics within a single large network. Chandrasekhar and Jackson (2012) have considered such asymptotics in random graph models. It would be more attractive if this could be done within the strategic framework.

Chapter Three
Social Interaction Models

3.1 Introduction

Social and economic networks have been attracting increasing attention in economic research because individuals' behaviors and outcomes may be influenced by their classmates, colleagues, neighbors, friends, etc., in a variety of applications, such as schooling, labor supply, and consumption. In this chapter, we look at social interactions between individuals in a given network, that is, how individuals are affected by their friends given the network structure. Our goal is to identify the interaction effects when the network structure is exogenous.

We focus on a specific social interaction model where individuals interact because they can learn from their neighbors about a new agricultural technology. In particular, a farmer observes the experiences of his neighbors with the new technology and uses this information to infer the unknown features of the technology. The inferred information is then taken into account when the farmer makes a production decision. This model is motivated by Conley and Udry (2010), who are the first to examine empirically whether farmers could learn from their neighbors about an agricultural technology. We extend Conley and Udry in two aspects. First, they assume an ad hoc learning mechanism, while we model the learning behaviors in a more general Bayesian updating framework, following the idea in Bala and Goyal (1998). Our study is different from Bala and Goyal in the sense that their focus is on the long-term equilibrium after the learning process converges, while we are interested in the temporal pattern during the learning process.
Second, Conley and Udry make parametric assumptions about the production function, while we aim to identify the social interactions when the production function is nonparametrically specified. In particular, we provide conditions under which the input demand and production functions and the average learning effects can be nonparametrically identified.

Our research is an application of the econometric literature on nonparametric identification. For comprehensive surveys on nonparametric identification, see Matzkin (1994, 2007). We show that under certain conditions the input demand function becomes a nonadditive index model as in Matzkin (2007). By applying the results in Matzkin (2003, 2007), we show that the input demand function can be nonparametrically identified if we impose certain restrictions on the production function and the distribution function of the unobservables. Since the input demand function and the production function form a triangular system, once the former is identified, the latter is also identified, where the information learned from one's neighbors serves as an instrument for the endogenous input demand in the production function. In addition, we show that the average learning effects can be identified under much weaker assumptions.

The rest of the chapter is organized as follows. Section 3.2 proposes an economic model of social learning where the learning behaviors are modeled in a Bayesian updating framework. Section 3.3 presents conditions under which the model in Section 3.2 is nonparametrically identified. Section 3.3.1 considers the i.i.d. case, while Section 3.3.2 relaxes the i.i.d. assumption by allowing for neighborhood-level heterogeneity. Section 3.4 concludes the chapter.

3.2 A Model of Social Interactions Through Learning

Suppose there is a new agricultural technology that becomes available to farmers, who can either stay with the old technology or switch to the new one.
The old technology is well known, so a farmer knows exactly how much profit he can earn if he stays with it. Denote the farmer's profit from the old technology by $\Pi(w)$, where $w \in \mathbb{R}^K$ is a vector of observed characteristics of the farmer and his plots. The farmer will switch to the new technology if the expected profit from it is greater than $\Pi(w)$.

The production function using the new technology is

$$
y = f(x, w, \varepsilon, \theta), \tag{3.1}
$$

where $x$ represents the inputs, including fertilizer, irrigation, labor, etc. For simplicity, $x$ is assumed to be a scalar, i.e., $x \in \mathbb{R}_+$. $\varepsilon \in \mathbb{R}_+$ represents the idiosyncratic productivity shock, which is observed by the farmer, but neither by other farmers nor by researchers. Assume $\varepsilon$ is i.i.d. across farmers with a distribution function $F_\varepsilon$, which is continuous and strictly increasing on $\mathbb{R}_+$ with density $p_\varepsilon(\varepsilon)$. Later we will relax the i.i.d. assumption by allowing $\varepsilon$ to be correlated across farmers. $\theta \in [0, 1]$ represents the unknown features of the new technology, which are common to all the farmers. For simplicity, we assume $\theta$ takes only a finite number of values, and $\theta$ and $\varepsilon$ are independent. Moreover, the production function satisfies the following assumptions.

Assumption 3.1 The function $f : \mathbb{R}_+ \times \mathbb{R}^K \times \mathbb{R}_+ \times [0, 1] \to \mathbb{R}_+$ satisfies the following properties:
(i) For all $w$ and $\theta$, $f$ is twice continuously differentiable in $x$ and $\varepsilon$;
(ii) For all $w$, $\varepsilon$ and $\theta$, $f$ is strictly increasing and strictly concave in $x$: $\frac{\partial f(x,w,\varepsilon,\theta)}{\partial x} > 0$, $\frac{\partial^2 f(x,w,\varepsilon,\theta)}{\partial x^2} < 0$;
(iii) For all $x$, $w$ and $\theta$, $f$ is strictly increasing and strictly concave in $\varepsilon$: $\frac{\partial f(x,w,\varepsilon,\theta)}{\partial \varepsilon} > 0$, $\frac{\partial^2 f(x,w,\varepsilon,\theta)}{\partial \varepsilon^2} < 0$;
(iv) For all $x$, $w$, $\varepsilon$ and $\theta$, $\frac{\partial^2 f(x,w,\varepsilon,\theta)}{\partial x \partial \varepsilon} > 0$;
(v) For all $x$, $w$, $\varepsilon$ and $\theta$, $f(x,w,\varepsilon,\theta) > 0$ if and only if $x\varepsilon > 0$;
(vi) For all $w$, $\varepsilon$ and $\theta$, $\lim_{x \to 0} \frac{\partial f(x,w,\varepsilon,\theta)}{\partial x} = \infty$.

Example 3.1 The following Cobb-Douglas production function

$$
y = A(w)\, x^{\theta} \varepsilon^{1-\theta}
$$

satisfies Assumption 3.1, where $x$ is the input, $A(w)$ is the observed productivity, $\varepsilon$ is the unobserved shock, and $\theta$ is the share of $x$.
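The claim in Example 3.1 can be checked directly. The short derivation below is a sketch that assumes the Cobb-Douglas form $f = A(w)\,x^{\theta}\varepsilon^{1-\theta}$ with $A(w) > 0$, an interior share $0 < \theta < 1$, and $x, \varepsilon > 0$; these restrictions are stated here as assumptions for the illustration.

```latex
\begin{align*}
\frac{\partial f}{\partial x} &= \theta A(w)\, x^{\theta-1}\varepsilon^{1-\theta} > 0,
&
\frac{\partial^2 f}{\partial x^2} &= -\theta(1-\theta) A(w)\, x^{\theta-2}\varepsilon^{1-\theta} < 0,
\\
\frac{\partial f}{\partial \varepsilon} &= (1-\theta) A(w)\, x^{\theta}\varepsilon^{-\theta} > 0,
&
\frac{\partial^2 f}{\partial \varepsilon^2} &= -\theta(1-\theta) A(w)\, x^{\theta}\varepsilon^{-\theta-1} < 0,
\\
\frac{\partial^2 f}{\partial x\,\partial \varepsilon} &= \theta(1-\theta) A(w)\, x^{\theta-1}\varepsilon^{-\theta} > 0,
&
\lim_{x\to 0}\frac{\partial f}{\partial x} &= \lim_{x\to 0}\theta A(w)\, x^{\theta-1}\varepsilon^{1-\theta} = \infty .
\end{align*}
```

The signs rely on $\theta$ lying strictly between 0 and 1: at the endpoints the concavity and cross-partial conditions hold only weakly. Positivity of $f$ exactly when $x\varepsilon > 0$ follows from the product form.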
Farmers have a common prior belief about $\theta$, and each of them updates the belief based on his own information. Denote the prior probability by $\pi(\theta)$, where $0 \le \pi(\theta) \le 1$ and $\sum_{\theta \in \Theta} \pi(\theta) = 1$. $\theta$ is independent of $w$ and $\varepsilon$. The information used to update the prior belief includes the experiences with the new technology of the farmer (if any) and his neighbors.

To describe the information more clearly, let us introduce the network structure. Suppose there are countably many farmers in the economy, labeled $1, 2, \ldots$. We say farmer $j$ is an information neighbor of farmer $i$ if farmer $j$'s past experiences with the new technology are observed by farmer $i$, or more specifically, if farmer $j$ uses the new technology and $(y_j, x_j)$ is observed by farmer $i$. [1] Let $N_i$ be the set of farmer $i$'s information neighbors. The information relation is reflexive (i.e., $i \in N_i$), but not necessarily symmetric (i.e., $j \in N_i$ does not imply $i \in N_j$) nor transitive (i.e., $j \in N_i$ and $k \in N_j$ does not imply $k \in N_i$). We assume that the information neighborhood $N_i$ is exogenous, i.e., $N_i$ and $\varepsilon_i$ are independent.

Footnote 1: Note that farmer $i$ can never observe farmer $j$'s productivity shock $\varepsilon_j$ even if $j$ is $i$'s information neighbor.

Let $I_i = \{(y_j, x_j), \forall j \in N_i\}$ denote the information set of farmer $i$. Suppress the subscript $i$ for simplicity. According to Bayes' rule, if $I$ is nonempty, the posterior probability $\pi(\theta \mid I)$ is given by [2]

$$
\pi(\theta \mid I) = \frac{\prod_{j \in N} p(y_j, x_j, w_j, \theta)\, \pi(\theta)}{\sum_{\theta' \in \Theta} \prod_{j \in N} p(y_j, x_j, w_j, \theta')\, \pi(\theta')}, \tag{3.2}
$$

where $p(y, x, w, \theta)$ is the density of $y$ conditional on $x$, $w$, and $\theta$. For all $\theta \in \Theta$, since $f(x, w, \varepsilon, \theta)$ is strictly increasing in $\varepsilon$, it has the inverse function $f^{-1}(x, w, y, \theta)$ with respect to $\varepsilon$, and $\frac{\partial f^{-1}(x, w, y, \theta)}{\partial y}$ is strictly positive. Hence, $p(y, x, w, \theta) = p_\varepsilon(f^{-1}(x, w, y, \theta)) \frac{\partial f^{-1}(x, w, y, \theta)}{\partial y}$.

Footnote 2: If $I$ is empty, i.e., the farmer has no information neighbor, then $\pi(\theta \mid I) = \pi(\theta)$.

Given $\pi(\theta \mid I)$, a farmer chooses $x$ to maximize his expected profit

$$
\max_x \sum_{\theta \in \Theta} \pi(\theta \mid I) f(x, w, \varepsilon, \theta) - cx, \tag{3.3}
$$

where $c$ is the unit cost of the input. The output price is normalized to 1. Under Assumption 3.1, the maximization problem in (3.3) has a unique interior solution $x^*$, which is obtained from the first-order condition

$$
\sum_{\theta \in \Theta} \pi(\theta \mid I) \frac{\partial f(x^*, w, \varepsilon, \theta)}{\partial x} - c = 0. \tag{3.4}
$$

$x^*$ satisfies the second-order condition because of the concavity of $f$. Let $\Pi^* = \sum_{\theta \in \Theta} \pi(\theta \mid I) f(x^*, w, \varepsilon, \theta) - c x^*$ be the optimal profit from the new technology. The farmer switches to the new technology if and only if $\Pi^* \ge \Pi(w)$, i.e.,

$$
x = \begin{cases} x^* & \text{if } \Pi^* \ge \Pi(w), \\ 0 & \text{if } \Pi^* < \Pi(w). \end{cases} \tag{3.5}
$$

In this chapter, we ignore the selection problem and only consider the farmers who switch.

The following lemma says that the optimal input demand is continuous and strictly increasing in $\varepsilon$.

Lemma 3.1 Under Assumption 3.1, $x^*$ is continuous and strictly increasing in $\varepsilon$.

Proof. The continuity is from Assumption 3.1 (i). Differentiating (3.4) with respect to $\varepsilon$ gives $\sum_{\theta \in \Theta} \pi(\theta \mid I) \left[ \frac{\partial^2 f(x^*, w, \varepsilon, \theta)}{\partial x^2} \frac{\partial x^*}{\partial \varepsilon} + \frac{\partial^2 f(x^*, w, \varepsilon, \theta)}{\partial x \partial \varepsilon} \right] = 0$. Then $\frac{\partial x^*}{\partial \varepsilon} > 0$ by Assumption 3.1 (ii) and (iv).

To derive further properties of the input demand as a function of $\pi$, we assume hereafter that there are only two possible states, $\theta_0$ and $\theta_1$. Suppress $\theta$ and simplify the notation as

$$
f_0(x, w, \varepsilon) = f(x, w, \varepsilon, \theta_0), \qquad f_1(x, w, \varepsilon) = f(x, w, \varepsilon, \theta_1).
$$

Let $x_0$ and $x_1$ be the values that satisfy

$$
\frac{\partial f_0(x_0, w, \varepsilon)}{\partial x} - c = 0, \tag{3.6}
$$
$$
\frac{\partial f_1(x_1, w, \varepsilon)}{\partial x} - c = 0, \tag{3.7}
$$

and assume they maintain the same ordering for all $w$ and $\varepsilon$.

Assumption 3.2 For all $w$ and $\varepsilon$, $x_0 < x_1$, where $x_0$ and $x_1$ are defined in (3.6) and (3.7).

Then the optimal $x^*$ satisfies the following lemma.

Lemma 3.2 Under Assumptions 3.1-3.2, $x^*$ that solves (3.4) lies in the interval $[x_0, x_1]$. Furthermore, $x^*$ is continuous and strictly decreasing in $\pi(\theta_0 \mid I)$ for any $\pi(\theta_0 \mid I) \in (0, 1)$.

Proof. See the appendix.

3.3 Identification

We are interested in the identification of the input demand function and production function under the true state. Denote the true state by $\theta^*$.
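The input demand function $m$ and the production function $f^*$ under $\theta^*$ are given by

$$
x = m(c, w, \varepsilon, \pi, I) = m(c, w, \varepsilon, \pi(\theta \mid I)), \tag{3.8}
$$
$$
y = f(x, w, \varepsilon, \theta^*) \equiv f^*(x, w, \varepsilon). \tag{3.9}
$$

Expression (3.8) is valid because by (3.4) $\pi$ and $I$ affect $x$ only through $\pi(\theta \mid I)$. We observe $(x, y, c, w, I)$, but not $(\theta, \varepsilon)$. $(c, w, I, \theta)$ is assumed to be independent of $\varepsilon$, where the independence of $I$ and $\varepsilon$ comes from the exogeneity of the information network. In fact, equations (3.8) and (3.9) form a triangular system, whose identification essentially relies on the exogenous variation in $(c, I)$, as will be shown in this section.

Let

$\mathcal{M} = \{ m(c, w, \varepsilon, \pi(\theta \mid I)) : \mathbb{R}_+ \times \mathbb{R}^K \times \mathbb{R}_+ \times [0, 1] \to \mathbb{R}_+$ such that $m$ is continuous and strictly increasing in $\varepsilon$ and $\pi(\theta \mid I)$ for all $c, w \}$,
$\mathcal{F} = \{ f^*(x, w, \varepsilon) : \mathbb{R}_+ \times \mathbb{R}^K \times \mathbb{R}_+ \to \mathbb{R}_+$ that satisfies Assumption 3.1$\}$,
$\mathcal{P} = \{ \pi(\cdot) : \{\theta_0, \theta_1\} \to [0, 1] \}$,
$\mathcal{Q} = \{ F_\varepsilon : \mathbb{R}_+ \to [0, 1]$ that is continuous and strictly increasing on $\mathbb{R}_+ \}$.

Our goal is to recover $(m, f^*, \pi, F_\varepsilon) \in \mathcal{M} \times \mathcal{F} \times \mathcal{P} \times \mathcal{Q}$ from the joint distribution of the observed variables $(x, y, c, w, I)$. We say $(m, f^*, \pi, F_\varepsilon)$ is identified if it is uniquely determined in $\mathcal{M} \times \mathcal{F} \times \mathcal{P} \times \mathcal{Q}$ by the joint distribution of $(x, y, c, w, I)$. If $(m, f^*, \pi, F_\varepsilon)$ and $(\tilde{m}, \tilde{f}^*, \tilde{\pi}, \tilde{F}_\varepsilon)$ in $\mathcal{M} \times \mathcal{F} \times \mathcal{P} \times \mathcal{Q}$ can generate the same joint distribution of $(x, y, c, w, I)$, we say $(m, f^*, \pi, F_\varepsilon)$ and $(\tilde{m}, \tilde{f}^*, \tilde{\pi}, \tilde{F}_\varepsilon)$ are observationally equivalent.

Besides $(m, f^*, \pi, F_\varepsilon)$, we are also interested in the average learning effect on input demand

$$
\int [m(c, w, \varepsilon, \pi, I') - m(c, w, \varepsilon, \pi, I)] \, dF_\varepsilon(\varepsilon), \tag{3.10}
$$

and on output

$$
\int [f^*(m(c, w, \varepsilon, \pi, I'), w, \varepsilon) - f^*(m(c, w, \varepsilon, \pi, I), w, \varepsilon)] \, dF_\varepsilon(\varepsilon), \tag{3.11}
$$

when the information set changes from $I$ to $I'$. In particular, if $I = \emptyset$, (3.10) and (3.11) are the average learning effects for an isolated novice farmer if he receives information $I'$. The identification of the average effects in (3.10) and (3.11) requires weaker conditions than the identification of $(m, f^*, \pi, F_\varepsilon)$.

3.3.1 When $\varepsilon$ is i.i.d.

We start with the case of i.i.d. $\varepsilon$. First we show that without further restrictions $(m, f^*, \pi, F_\varepsilon)$ is not identified.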
Proposition 3.1 (No Identification) Suppose Assumptions 3.1-3.2 are satisfied. For any $(m, f^*, \pi, F_\varepsilon) \in \mathcal{M} \times \mathcal{F} \times \mathcal{P} \times \mathcal{Q}$, there is a $(\tilde{m}, f^*, \tilde{\pi}, F_\varepsilon) \in \mathcal{M} \times \mathcal{F} \times \mathcal{P} \times \mathcal{Q}$ that is observationally equivalent to $(m, f^*, \pi, F_\varepsilon)$.

Proof. See the appendix.

The intuition behind Proposition 3.1 is simple. The input demand $m$ is determined by $f_0$, $f_1$ and $\pi$. Both $f$ under the false state and $\pi$ affect $m$ (and thus the observed distribution) only through the first-order condition. One single restriction cannot pin down two functions that can be chosen freely. To achieve identification, we have to restrict either $f$ under the false state or $\pi$. In this chapter, we restrict the prior to be noninformative and flat.

Assumption 3.3 The prior is noninformative and flat, i.e., $\pi(\theta_0) = \pi(\theta_1) = \frac{1}{2}$.

Assumption 3.3 requires that a novice farmer with no information neighbors believes that all possible states are equally likely. Under this assumption, the posterior probability becomes

$$
\pi(\theta_0 \mid I) = \frac{\prod_{j \in N} p(y_j, x_j, w_j, \theta_0)}{\prod_{j \in N} p(y_j, x_j, w_j, \theta_0) + \prod_{j \in N} p(y_j, x_j, w_j, \theta_1)} = \frac{1}{1 + \prod_{j \in N} \frac{p(y_j, x_j, w_j, \theta_1)}{p(y_j, x_j, w_j, \theta_0)}}. \tag{3.12}
$$

Equation (3.12) indicates that the information set $I$ affects $\pi(\theta_0 \mid I)$ only through a single index, i.e., the likelihood ratio $\prod_{j \in N} \frac{p(y_j, x_j, w_j, \theta_1)}{p(y_j, x_j, w_j, \theta_0)}$. Since $I$ enters $m$ only through $\pi(\theta_0 \mid I)$, its role in $m$ is hence fully captured by this likelihood ratio. This is a handy simplification. Let $\frac{P_1}{P_0} \equiv \prod_{j \in N} \frac{p(y_j, x_j, w_j, \theta_1)}{p(y_j, x_j, w_j, \theta_0)}$. Redefine the input demand function as

$$
x = m\left( \frac{P_1}{P_0}, c, w, \varepsilon \right). \tag{3.13}
$$

It is easy to show that $m$ in (3.13) is strictly increasing in $\frac{P_1}{P_0}$.

Lemma 3.1 Suppose Assumptions 3.1-3.3 are satisfied. Then $m$ in (3.13) is strictly increasing in $\frac{P_1}{P_0}$ for any $c, w$.

Proof. This is an immediate result of equation (3.12) and Lemma 3.2.

Lemmas 3.1-3.1 imply that $m$ is continuous and strictly increasing in $\frac{P_1}{P_0}$ and $\varepsilon$ for any $(c, w)$. Moreover, $\frac{P_1}{P_0}$ is a continuous function of $\{y_j, x_j\}_{j \in N}$ for any $w_j$, $j \in N$.
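The single-index structure in (3.12) can be sketched numerically. The example below is an illustration under stated assumptions rather than part of the original analysis: it takes the Cobb-Douglas technology $f = A\,x^{\theta}\varepsilon^{1-\theta}$ with $A = 1$ by default, lets the caller supply the log density of the shock, and computes the posterior on $\theta_0$ from the log likelihood ratio under a flat prior. All function names are hypothetical.

```python
import math


def inverse_shock(y, x, theta, A=1.0):
    """f^{-1}(x, w, y, theta): the shock implied by (y, x) under Cobb-Douglas."""
    return (y / (A * x ** theta)) ** (1.0 / (1.0 - theta))


def inverse_derivative(y, x, theta, A=1.0):
    """Derivative of f^{-1} with respect to y under Cobb-Douglas."""
    return (
        (1.0 / (1.0 - theta))
        * A ** (-1.0 / (1.0 - theta))
        * (y / x) ** (theta / (1.0 - theta))
    )


def log_density(y, x, theta, log_p_eps):
    """log p(y, x, w, theta) = log p_eps(f^{-1}) + log(d f^{-1} / dy)."""
    return log_p_eps(inverse_shock(y, x, theta)) + math.log(
        inverse_derivative(y, x, theta)
    )


def posterior_theta0(observations, theta0, theta1, log_p_eps):
    """Posterior on theta0 under a flat prior, as 1 / (1 + likelihood ratio)."""
    log_ratio = sum(
        log_density(y, x, theta1, log_p_eps) - log_density(y, x, theta0, log_p_eps)
        for y, x in observations
    )
    # Evaluate 1 / (1 + exp(log_ratio)) without overflow for large ratios.
    if log_ratio > 0:
        e = math.exp(-log_ratio)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(log_ratio))
```

With an empty list of observations the function returns 1/2, the flat prior. Working with the log likelihood ratio keeps the product over many neighbors from underflowing.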
If we can further impose restrictions so that $\frac{P_1}{P_0}$ is a strictly monotonic function of certain arguments, equation (3.13) becomes a nonadditive index model studied in Matzkin (1994, 2007). We can then apply Matzkin's results to identify (3.13) and thus (3.8).

It is known that $m$ and $F_\varepsilon$ are not jointly identified unless certain restrictions are imposed on either $m$ or $F_\varepsilon$ (Matzkin, 2007). We follow Chernozhukov, Imbens and Newey (2007) and Imbens and Newey (2009) to normalize $\varepsilon$ to be uniformly distributed in $[0, 1]$. This normalization is not restrictive. It is equivalent to redefining $\varepsilon$ as the quantile in the distribution of the original $\varepsilon$, i.e., $\tilde{\varepsilon} = F_\varepsilon(\varepsilon)$. Under the assumption that $\varepsilon$ is a scalar with a continuous and strictly increasing distribution, there is a bijection between $\varepsilon$ and $\tilde{\varepsilon}$.

Assumption 3.4 $\varepsilon \sim U(0, 1)$.

Lemma 3.2 Suppose Assumptions 3.1-3.4 are satisfied. If $\frac{P_1}{P_0}$ is identified, then $m$ is identified.

Proof. For simplicity, assume $X$ has an everywhere positive density conditional on $\left( \frac{P_1}{P_0}, c, w \right)$, which ensures the existence of the inverse function of $F_{X \mid \frac{P_1}{P_0}, c, w}$. For all $e$ and $\left( \frac{P_1}{P_0}, c, w \right)$,

$$
\begin{aligned}
e = F_\varepsilon(e) &= \Pr(\varepsilon \le e) \\
&= \Pr\left( \varepsilon \le e \,\middle|\, \tfrac{P_1}{P_0}, c, w \right) \\
&= \Pr\left( m\left( \tfrac{P_1}{P_0}, c, w, \varepsilon \right) \le m\left( \tfrac{P_1}{P_0}, c, w, e \right) \,\middle|\, \tfrac{P_1}{P_0}, c, w \right) \\
&= \Pr\left( X \le m\left( \tfrac{P_1}{P_0}, c, w, e \right) \,\middle|\, \tfrac{P_1}{P_0}, c, w \right) \\
&= F_{X \mid \frac{P_1}{P_0}, c, w}\left( m\left( \tfrac{P_1}{P_0}, c, w, e \right) \right),
\end{aligned}
$$

where the conditional distribution $F_{X \mid \frac{P_1}{P_0}, c, w}$ can be obtained from the data. Therefore $m$ is identified by

$$
m\left( \frac{P_1}{P_0}, c, w, e \right) = F^{-1}_{X \mid \frac{P_1}{P_0}, c, w}(e). \tag{3.14}
$$

The normalization in Assumption 3.4 also simplifies $\frac{P_1}{P_0}$. Because $p_\varepsilon(\varepsilon) = 1$ for all $\varepsilon$, we have

$$
\begin{aligned}
\frac{P_1}{P_0} &= \prod_{j \in N} \frac{p(y_j, x_j, w_j, \theta_1)}{p(y_j, x_j, w_j, \theta_0)} \\
&= \prod_{j \in N} \frac{p_\varepsilon(f_1^{-1}(x_j, w_j, y_j)) \, \partial f_1^{-1}(x_j, w_j, y_j)/\partial y}{p_\varepsilon(f_0^{-1}(x_j, w_j, y_j)) \, \partial f_0^{-1}(x_j, w_j, y_j)/\partial y} \\
&= \prod_{j \in N} \frac{\partial f_1^{-1}(x_j, w_j, y_j)/\partial y}{\partial f_0^{-1}(x_j, w_j, y_j)/\partial y}.
\end{aligned} \tag{3.15}
$$

In order to achieve the desired monotonicity of the likelihood ratio, we need certain assumptions about the production function.
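First we assume that the production function is homogeneous in $x$ and $\varepsilon$.

Assumption 3.5 $f_i(x, w, \varepsilon)$ ($i = 0, 1$) is homogeneous of degree 1 in $x$ and $\varepsilon$, namely

$$
f_i(\lambda x, w, \lambda \varepsilon) = \lambda f_i(x, w, \varepsilon), \quad i = 0, 1,
$$

for all $\lambda \in \mathbb{R}_+ \setminus \{0\}$, $x \in \mathbb{R}_+$, $w \in \mathbb{R}^K$, $\varepsilon \in \mathbb{R}_+$.

For $i = 0, 1$, if $f_i(x, w, \varepsilon)$ is homogeneous of degree 1 in $x$ and $\varepsilon$, its inverse function with respect to $\varepsilon$, $f_i^{-1}(x, w, y)$, is homogeneous of degree 1 in $x$ and $y$, i.e., $f_i^{-1}(\lambda x, w, \lambda y) = \lambda f_i^{-1}(x, w, y)$. By continuous differentiability of $f_i^{-1}$, the partial derivative of $f_i^{-1}$ with respect to $y$, $\partial f_i^{-1}(x, w, y)/\partial y$, is homogeneous of degree 0 in $x$ and $y$. [3] Therefore, we have $\partial f_i^{-1}(x, w, y)/\partial y = \partial f_i^{-1}(1, w, \frac{y}{x})/\partial \frac{y}{x}$ and

$$
\frac{P_1}{P_0} = \prod_{j \in N} \frac{\partial f_1^{-1}(1, w_j, \frac{y_j}{x_j})/\partial \frac{y_j}{x_j}}{\partial f_0^{-1}(1, w_j, \frac{y_j}{x_j})/\partial \frac{y_j}{x_j}} = \prod_{j \in N} h\left( \frac{y_j}{x_j}, w_j \right), \tag{3.16}
$$

where $h\left( \frac{y}{x}, w \right) = \frac{\partial f_1^{-1}(1, w, \frac{y}{x})/\partial \frac{y}{x}}{\partial f_0^{-1}(1, w, \frac{y}{x})/\partial \frac{y}{x}}$. Note that only the ratio $\frac{y}{x}$ enters the likelihood ratio in (3.16).

Footnote 3: Because $\frac{\partial f_i^{-1}(\lambda x, w, \lambda y)}{\partial (\lambda y)} = \frac{\partial [\lambda f_i^{-1}(x, w, y)]}{\lambda \, \partial y} = \frac{\partial f_i^{-1}(x, w, y)}{\partial y}$, for $i = 0, 1$.

Next, we restrict $h$ in (3.16) to be strictly increasing in $\frac{y}{x}$.

Assumption 3.6 $h\left( \frac{y}{x}, w \right)$ is strictly increasing in $\frac{y}{x}$ for any $w$.

The following two examples provide production functions that satisfy Assumption 3.6.

Example 3.2 Consider the Cobb-Douglas production function in Example 3.1,

$$
f_i(x, w, \varepsilon) = A(w)\, x^{\theta_i} \varepsilon^{1 - \theta_i}, \quad i = 0, 1,
$$

where $0 < \theta_i < 1$, $i = 0, 1$. Suppose $\theta_1 > \theta_0$ as required by Assumption 3.2. The partial derivative of $f_i^{-1}$ with respect to $y$ is

$$
\frac{\partial f_i^{-1}(x, w, y)}{\partial y} = \frac{1}{1 - \theta_i} \left( \frac{1}{A(w)} \right)^{\frac{1}{1 - \theta_i}} \left( \frac{y}{x} \right)^{\frac{\theta_i}{1 - \theta_i}}, \quad i = 0, 1.
$$

Therefore

$$
h\left( \frac{y}{x}, w \right) = \frac{1 - \theta_0}{1 - \theta_1} \, A(w)^{\frac{\theta_0 - \theta_1}{(1 - \theta_0)(1 - \theta_1)}} \left( \frac{y}{x} \right)^{\frac{\theta_1 - \theta_0}{(1 - \theta_0)(1 - \theta_1)}}.
$$

When $\theta_1 > \theta_0$, $h\left( \frac{y}{x}, w \right)$ is strictly increasing in $\frac{y}{x}$.

Example 3.3 Consider the CES production function

$$
f_i(x, w, \varepsilon) = A(w) \left[ \theta_i x^{\rho} + (1 - \theta_i) \varepsilon^{\rho} \right]^{\frac{1}{\rho}}, \quad i = 0, 1,
$$

where $0 < \rho < 1$. Suppose $\theta_1 > \theta_0$.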
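[4] The partial derivative of $f_i^{-1}$ with respect to $y$ is

$$
\frac{\partial f_i^{-1}(x, w, y)}{\partial y} = \left( \frac{1}{1 - \theta_i} \right)^{\frac{1}{\rho}} \left( \frac{1}{A(w)} \right)^{\rho} \left[ \left( \frac{1}{A(w)} \right)^{\rho} - \theta_i \left( \frac{y}{x} \right)^{-\rho} \right]^{\frac{1 - \rho}{\rho}}.
$$

Therefore

$$
h\left( \frac{y}{x}, w \right) = \left( \frac{1 - \theta_0}{1 - \theta_1} \right)^{\frac{1}{\rho}} \left[ \frac{\left( \frac{1}{A(w)} \right)^{\rho} - \theta_1 \left( \frac{y}{x} \right)^{-\rho}}{\left( \frac{1}{A(w)} \right)^{\rho} - \theta_0 \left( \frac{y}{x} \right)^{-\rho}} \right]^{\frac{1 - \rho}{\rho}}.
$$

When $\theta_1 > \theta_0$, $h\left( \frac{y}{x}, w \right)$ is strictly increasing in $\frac{y}{x}$.

Footnote 4: This is consistent with Assumption 3.2 if 1 + 1 1 ( c A(w) ) 1 1 1 ()> 0 for all:

According to Matzkin (2007, Theorem 3.2), under Assumptions 3.1-3.6, the nonadditive index model

$$
x = m\left( \prod_{j \in N} h\left( \frac{y_j}{x_j}, w_j \right), c, w, \varepsilon \right) \tag{3.17}
$$

is identified, or equivalently, $h$ is identified, if and only if there is no continuous, strictly increasing function $g : \mathbb{R}_+ \to \mathbb{R}_+$ such that for any two $h, \tilde{h} \in \mathcal{H}$, where

$$
\mathcal{H} = \left\{ h\left( \tfrac{y}{x}, w \right) : h \text{ is continuous and strictly increasing in } \tfrac{y}{x} \text{ for all } w \right\},
$$

$\tilde{h} = g \circ h$. To identify $h$, we need to restrict $\mathcal{H}$ to a subset $\mathcal{H}_I \subset \mathcal{H}$, so that no two functions in $\mathcal{H}_I$ are strictly increasing transformations of each other.

Compared to a standard nonadditive index model as in Matzkin (2007), (3.17) has more features that are useful for identification. First, for farmers who have only one information neighbor, i.e., $N = \{j\}$, we have $x = m\left( h\left( \frac{y_j}{x_j}, w_j \right), c, w, \varepsilon \right)$, the same as in a standard nonadditive index model. Moreover, for farmers who have two information neighbors, i.e., $N = \{j, k\}$, (3.17) becomes $x = m\left( h\left( \frac{y_j}{x_j}, w_j \right) h\left( \frac{y_k}{x_k}, w_k \right), c, w, \varepsilon \right)$. This implies that if $h \in \mathcal{H}_I$ is not identified, or equivalently, there is an $\tilde{h} \in \mathcal{H}_I$ and a continuous, strictly increasing function $g$ such that $\tilde{h} = g \circ h$, then this $g$ should also satisfy $\tilde{h}(j) \tilde{h}(k) = g(h(j) h(k))$, where $h(j) = h\left( \frac{y_j}{x_j}, w_j \right)$, $h(k) = h\left( \frac{y_k}{x_k}, w_k \right)$. Since $\tilde{h}(j) = g(h(j))$ and $\tilde{h}(k) = g(h(k))$, we have

$$
g(h(j) h(k)) = g(h(j)) \, g(h(k)). \tag{3.18}
$$

In particular, when $h(j) = h(k)$, it follows that

$$
g(h^2) = g(h)^2. \tag{3.19}
$$

Pick $h(k) = 1$. By $g(h(k)) = \frac{g(h(j) h(k))}{g(h(j))} = \frac{g(h(j))}{g(h(j))}$ we have

$$
g(1) = 1. \tag{3.20}
$$

Pick $h(k) = \frac{1}{h(j)}$. By (3.18) and (3.20), $g\left( \frac{1}{h(j)} \right) g(h(j)) = g\left( \frac{1}{h(j)} h(j) \right) = g(1) = 1$.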
Hence

$$
g\left( \frac{1}{h} \right) = \frac{1}{g(h)}. \tag{3.21}
$$

By induction, from farmers who have $n$ information neighbors, we obtain

$$
g(h^n) = g(h)^n, \quad n = 1, 2, \ldots. \tag{3.22}
$$

Combining (3.18)-(3.22) gives

$$
g(h^k) = g(h)^k, \quad k = 0, \pm 1, \pm 2, \ldots. \tag{3.23}
$$

We are ready to present two additional conditions under each of which $h$ is identified. The first condition imposes a homogeneity restriction on $h$, and the second restricts the value of $h$ at certain points. These restrictions are constructed following the insights of Matzkin (2003, 2007), where she proposes functional restrictions that are helpful to achieve identification.

Assumption 3.7 $h\left( \frac{y}{x}, w \right)$ is homogeneous of degree 1 in $\frac{y}{x}$ for all $w$, i.e., $h\left( \lambda \frac{y}{x}, w \right) = \lambda h\left( \frac{y}{x}, w \right)$ for all $\lambda \in \mathbb{R}_+ \setminus \{0\}$.

Theorem 3.1 (Identification of $m$) Suppose Assumptions 3.1-3.7 are satisfied. For all $h, \tilde{h} \in \mathcal{H}_I$, where

$$
\mathcal{H}_I = \left\{ h\left( \tfrac{y}{x}, w \right) \in \mathcal{H} : h \text{ is homogeneous of degree 1 in } \tfrac{y}{x} \text{ for all } w \right\},
$$

if there is a continuous, strictly increasing function $g$ such that $\tilde{h} = g \circ h$, then $g$ is an identity function and $m$ in (3.17) is identified.

Proof. Suppress $w$ for simplicity. For any $\lambda > 0$,

$$
g(\lambda) \, g\left( h\left( \tfrac{y}{x} \right) \right) = g\left( \lambda h\left( \tfrac{y}{x} \right) \right) = g\left( h\left( \lambda \tfrac{y}{x} \right) \right) = \tilde{h}\left( \lambda \tfrac{y}{x} \right) = \lambda \tilde{h}\left( \tfrac{y}{x} \right) = \lambda g\left( h\left( \tfrac{y}{x} \right) \right),
$$

where the first equality is from (3.18), the second and the fourth are from Assumption 3.7, and the third and the last follow by construction. Since $g\left( h\left( \frac{y}{x} \right) \right) > 0$, we have $g(\lambda) = \lambda$, i.e., an identity function. This implies that $h$ is identified. By (3.14), $m$ is identified. The proof is complete.

Assumption 3.8 $h\left( \frac{\bar{y}}{\bar{x}}, w \right) = \alpha(w)$ for all $w$, where $\alpha(\cdot)$ is known, and $\bar{y}$ and $\bar{x}$ are certain values of $y$ and $x$.

Theorem 3.2 (Identification of $m$) Suppose Assumptions 3.1-3.6 and 3.8 are satisfied. For all $h, \tilde{h} \in \mathcal{H}'_I$, where

$$
\mathcal{H}'_I = \left\{ h\left( \tfrac{y}{x}, w \right) \in \mathcal{H} : h\left( \tfrac{\bar{y}}{\bar{x}}, w \right) = \alpha(w) \text{ for all } w \right\},
$$

if there is a continuous, strictly increasing function $g$ such that $\tilde{h} = g \circ h$, then $g$ is an identity function and $m$ in (3.17) is identified.

Proof. See the appendix.

Once $m$ is identified, we can show that $f^*$ in (3.9) is also identified, following the ideas in Matzkin (2004) and Theorem 3.1 in Heckman, Matzkin and Nesheim (2010).
Theorem 3.3 (Identification of $f^*$) Suppose Assumptions 3.1-3.6 are satisfied. Moreover, suppose Assumption 3.7 or 3.8 is satisfied. Then $f^*$ in (3.9) is identified.

Proof. According to Theorems 3.1-3.2, under the stated assumptions, both $m$ and $\frac{P_1}{P_0}$ are identified. Now consider

$$
y = f^*(x, w, \varepsilon) = f^*\left( m\left( \tfrac{P_1}{P_0}(I), c, w, \varepsilon \right), w, \varepsilon \right) = r\left( \tfrac{P_1}{P_0}(I), c, w, \varepsilon \right),
$$

where $r$ denotes the reduced form. Since $\varepsilon$ is independent of $\left( \frac{P_1}{P_0}(I), c, w \right)$ and $F_\varepsilon$ is known, $r$ is identified similarly to (3.14) in Lemma 3.2. For any $x \in \mathbb{R}_+$, $w \in \mathbb{R}^K$, $\varepsilon \in [0, 1]$, choose $I^*$ and $c^*$ that satisfy $x = m\left( \frac{P_1}{P_0}(I^*), c^*, w, \varepsilon \right)$. Then $f^*$ is identified from

$$
f^*(x, w, \varepsilon) = r\left( \tfrac{P_1}{P_0}(I^*), c^*, w, \varepsilon \right). \tag{3.24}
$$

The intuition behind Theorem 3.3 is simple. For any value of $(x, w, \varepsilon)$, since $m$ and $\frac{P_1}{P_0}$ are identified, we can find all $(I, c)$ that are consistent with $(x, w, \varepsilon)$. Moreover, the reduced form $r$ is identified, so we can calculate the corresponding $y$ for a given $(I, c, w, \varepsilon)$. Relating $(x, w, \varepsilon)$ to $y$ we identify $f^*$. In this nonparametric triangular system, $(I, c)$ serve as instruments for $x$.

Unlike (3.8) and (3.9), whose identification requires undesirable assumptions, in particular Assumptions 3.7 and 3.8, the average learning effects in (3.10) and (3.11) can be identified under much weaker conditions, following Blundell and Powell (2006), Matzkin (2007), and Imbens and Newey (2009). Specifically, we can replace Assumption 3.1 by a weaker assumption that ensures the existence of an interior solution.

Assumption 3.9 $f(x, w, \varepsilon, \theta) : \mathbb{R}_+ \times \mathbb{R}^K \times \mathbb{R}_+ \times [0, 1] \to \mathbb{R}_+$ is twice continuously differentiable, strictly increasing, and strictly concave in $x$, with $\lim_{x \to 0} \frac{\partial f(x, w, \varepsilon, \theta)}{\partial x} = \infty$, for all $w$, $\varepsilon$, $\theta$.

Moreover, suppose Assumption 3.3 remains satisfied so that $\pi$ disappears, and suppose $\varepsilon$ is independent of $(I, c, w)$,

$$
\varepsilon \mid I, c, w \sim \varepsilon. \tag{3.25}
$$

Rewrite (3.8) and (3.9) as

$$
x = m(I, c, w, \varepsilon), \qquad y = f^*(m(I, c, w, \varepsilon), w, \varepsilon) = r(I, c, w, \varepsilon).
$$

Then the average learning effects on input demand and output

$$
\int [m(I', c, w, \varepsilon) - m(I, c, w, \varepsilon)] \, dF_\varepsilon(\varepsilon), \tag{3.26}
$$
$$
\int [r(I', c, w, \varepsilon) - r(I, c, w, \varepsilon)] \, dF_\varepsilon(\varepsilon) \tag{3.27}
$$

are identified.
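Under the independence condition, each average effect reduces to a difference of conditional means of the observed outcome, which suggests a simple sample analogue when the conditioning variables are discrete. The sketch below is an illustration only: the record layout (dictionaries with keys `info`, `c`, `w`, `x`, `y`) and the function name are hypothetical, and continuous conditioning variables would call for a smoothing estimator instead of exact cell matching.

```python
from statistics import mean


def average_learning_effect(records, info_from, info_to, c, w, outcome="x"):
    """Difference in cell means of the outcome between two information sets.

    records: iterable of dicts with keys "info", "c", "w" and the outcome key.
    Mirrors E(outcome | I', c, w) - E(outcome | I, c, w) with discrete cells.
    """
    def cell_mean(info):
        values = [
            r[outcome]
            for r in records
            if r["info"] == info and r["c"] == c and r["w"] == w
        ]
        if not values:
            raise ValueError(f"no observations in cell info={info!r}")
        return mean(values)

    return cell_mean(info_to) - cell_mean(info_from)
```

Calling the function with `outcome="x"` gives the sample counterpart of the effect on input demand, and `outcome="y"` gives the counterpart for output.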
99 Theorem 3.4 (Identication of Average Learning Eects) Suppose As- sumptions 3.3 and 3.9 are satised. Moreover, suppose condition (3.25) is satised. Then the average learning eects (3.26) and (3.27) are identied. Proof. For any (I;c;w), we have E(xjI;c;w) = Z m(I;c;w;")dF "jI;c;w (") = Z m(I;c;w;")dF " ("); where the second equality follows by the independence condition (3.25). Therefore Z [m(I 0 ;c;w;")m(I;c;w;")]dF " (") =E(xjI 0 ;c;w)E(xjI;c;w) is identied since the right-hand side can be obtained from the observed data. Similarly, Z [r(I 0 ;c;w;")r(I;c;w;")]dF " (") = Z r(I 0 ;c;w;")dF "jI 0 ;c;w (") Z r(I;c;w;")dF "jI;c;w (") = E(yjI 0 ;c;w)E(yjI;c;w) is also identied. 100 3.3.2 When" has geographic-neighborhood heterogene- ity A potential problem with the assumption of i.i.d. " is that " may be correlated across farmers. For example, farmers who live close may have similar unobserved heterogeneity, such as soil quality or weather shocks. In this section, we relax the i.i.d assumption by allowing for an unobserved eect in each geographic neighborhood. Suppose " takes the form " ik =( k ; ik ); (3.28) where k is the unobserved heterogeneity in geographic neighborhoodk, and ik is the unobserved heterogeneity of individuali in geographic neighborhood k. For all i;k, we assume that both v k and ik are i.i.d., k and ik are independent, and : R 2 ! R is continuous and strictly increasing in both of its arguments. Please note that the geographic neighborhoods do not necessarily coincide with the information neighborhoods. For example, a farmer may talk to someone who lives far from him (i.e., not in his own geographic neighborhood) for experiences in the new technology. Now we consider the learning behaviors in the presence of neighborhood eects. Suppose farmer i lives in geographic neighborhood k. We assume that i observes v k , but does not observe v k 0 for all k 0 6=k. 
If i's information neighbor j also lives in geographic neighborhood k, i observes j's v k , so the density of y j in (3.2) is computed conditional on (x j ;w j ;v k ). However, if 101 j does not live in geographic neighborhood k, i cannot observe j's v, and thus computes the density of y j conditional on (x j ;w j ) only. In the for- mer case, p(y;x;w;;) = p ( 1 (;f 1 (x;w;y;))) @ 1 (;f 1 (x;w;y;)) @y , where 1 (;:) is the inverse function of(;) with respect to, while in the latter p(y;x;w;) =p (;) (f 1 (x;w;y;)) @f 1 (x;w;y;) @y . Therefore, the posterior be- lief in (3.2) generally depends on, i.e. (jI;), unless all ofi' information neighbors live outside i's geographic neighborhood. We modify (3.8) and (3.9) as x = m(c;w;(;);(jI;)) (3.29) y = f (x;w;(;)): (3.30) The identication of (3.29) is challenging because of the correlation between (;) and(jI;) throughv. However, if we only use those farmers whose information neighbors live outside their neighborhoods, then (3.29) becomes x =m(c;w;(;);(jI)): (3.31) Under a normalization assumption imposed on(;) as in Assumption 3.4, (3.31) and (3.30) can be identied similarly to (3.8) and (3.9) in Theorems 3.1-3.3. 102 3.4 Conclusions In this chapter, we develop a structural model of social learning and pro- pose conditions under which the model is nonparametrically identied. We show that the input demand function in the presence of learning can be identied by applying Matzkin(2007)'s results on the identication of non- additive index models. Since the input demand function and the production function form a triangular system, once the former is identied, the latter is also identied. Moreover, the average learning eects can be identied under weaker conditions. The current study has several limitations. First, the identication results in the case with unobserved neighborhood-level heterogeneity is based on a strong assumption that the error term is uniformly distributed and is inde- pendent of the observables. 
It is more reasonable to relax this random-effect-type assumption by allowing for correlation between the neighborhood-level unobservables and the observables. Under this more general assumption, however, the nonidentifiability results for nonseparable fixed-effect models (Chamberlain, 2010) imply that nonparametric identification may not be possible. Further examination of this issue is left for future work. Second, since individuals choose between the old and new technologies, the input demand function and production function in (3.8) and (3.9) are in fact selection models with a selection rule defined in (3.5). The identification of these selection models is not easy. We leave it for future research. Third, as we have seen in Chapter 2, network structure is in general an outcome of individuals' choices and thus is endogenously determined. It is more reasonable to consider a social interaction model in which networks are endogenously formed. The identification of interaction effects in such a model is challenging because there may be multiple equilibria in the network formation. This is an interesting research topic that deserves further investigation.

Chapter Four
Conclusions

In this research, we provide a structural analysis of network-related models. We analyze a network formation model and a social interaction model: the former is used to explore the determinants of network structure, and the latter is used to explore the interactions between individuals in a given network. In the network formation model, we make the weakest possible assumptions about individuals' behaviors, and overcome the nonidentification problem in the presence of multiple equilibria using partial identification. Moreover, we propose to use subnetworks to derive probability bounds that are computationally attractive. In the social interaction model, we focus on a special case where individuals interact with their neighbors due to learning.
We provide conditions under which the structural functions and the average learning effects in this special model can be nonparametrically identified.

A crucial assumption in the social interaction model in Chapter 3 is that the network in which individuals interact is exogenously determined. However, according to the network formation model in Chapter 2, a network is in fact an equilibrium choice of the individuals involved and is thus formed endogenously. It is more appropriate to incorporate a network formation model into a social interaction model and consider social interactions in endogenously formed networks. The resulting integrated model is similar to a sample selection model in which individuals form a network in the first stage and interact with their friends in the network in the second stage. Identifying such a selection model, however, is difficult in part because there may be multiple equilibria in the network formation stage. A more restricted model of network formation may be needed to achieve the identification of the social interactions in the second stage.

Appendix A
Proofs

Proofs and Discussions in Chapter 2

Formal Definitions of Closed Cycles. Jackson and Watts (2001, 2002) introduced the definition of closed cycles in the case without transfers. Here we present a variation of their definition that also allows for transfers. We say that two networks are adjacent if they differ by one link. An improving path from a network $G$ to a network $G'$ is a sequence of adjacent networks $G_1 = G, G_2, \ldots, G_K, G_{K+1} = G'$ such that for each $k = 1, \ldots, K$, $G_{k+1}$ defeats $G_k$, i.e.,

in the case without transfers,
- either $(G_k)_{ij} = 1$, $(G_{k+1})_{ij} = 0$, and $U_i(G_{k+1}) > U_i(G_k)$, for some $i, j$,
- or $(G_k)_{ij} = 0$, $(G_{k+1})_{ij} = 1$, $U_i(G_{k+1}) > U_i(G_k)$, and $U_j(G_{k+1}) \geq U_j(G_k)$, for some $i, j$;

in the case with transfers,
- $(G_k)_{ij} \neq (G_{k+1})_{ij}$ and $U_i(G_{k+1}) + U_j(G_{k+1}) > U_i(G_k) + U_j(G_k)$, for some $i, j$.
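The defeat relation just defined can be checked mechanically on small networks. The sketch below is a minimal illustration in Python, assuming a hypothetical additively separable utility; the payoff table `U` and all function names are ours and are not part of the model in Chapter 2. It enumerates all networks on three nodes and returns those that no adjacent network defeats, that is, the networks with no improving path leaving them.

```python
from itertools import combinations


def all_networks(n):
    """Yield every undirected network on n nodes as a frozenset of links (i, j), i < j."""
    links = list(combinations(range(n), 2))
    for mask in range(1 << len(links)):
        yield frozenset(link for bit, link in enumerate(links) if (mask >> bit) & 1)


def defeats(g_new, g_old, utility, transfers):
    """Return True if g_new defeats g_old; False if they are not adjacent."""
    diff = g_new ^ g_old
    if len(diff) != 1:
        return False
    (i, j), = diff
    gain_i = utility(i, g_new) - utility(i, g_old)
    gain_j = utility(j, g_new) - utility(j, g_old)
    if transfers:
        # With transfers only the joint surplus of the pair matters.
        return gain_i + gain_j > 0
    if (i, j) in g_old:
        # Deletion: either party strictly gains from severing the link.
        return gain_i > 0 or gain_j > 0
    # Addition: one party strictly gains and the other does not lose.
    return (gain_i > 0 and gain_j >= 0) or (gain_j > 0 and gain_i >= 0)


def undefeated(n, utility, transfers):
    """Networks on n nodes that no adjacent network defeats."""
    nets = list(all_networks(n))
    return [g for g in nets if not any(defeats(h, g, utility, transfers) for h in nets)]


# Hypothetical payoffs: U[(i, j)] is i's benefit from a link with j.
U = {(0, 1): 1.0, (1, 0): -0.5, (0, 2): -1.0, (2, 0): 0.2, (1, 2): 0.3, (2, 1): 0.4}


def utility(i, g):
    """Additively separable utility of individual i in network g (illustration only)."""
    return sum(U[(i, b if a == i else a)] for (a, b) in g if i in (a, b))


stable_with_transfers = undefeated(3, utility, transfers=True)
stable_without_transfers = undefeated(3, utility, transfers=False)
```

With these illustrative payoffs, the link between individuals 0 and 1 survives with transfers because the joint surplus 1.0 - 0.5 is positive, but not without transfers because individual 1 strictly gains from severing it. Full enumeration is feasible only for very small networks, since the number of networks grows as 2 to the power N(N-1)/2.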
Based on the definition of improving paths, we say that a set of networks $\mathcal{C}$ forms a cycle if for any $G, G' \in \mathcal{C}$ there is an improving path from $G$ to $G'$. A closed cycle $\mathcal{C}$ is a cycle such that there is no improving path from any $G \in \mathcal{C}$ to any $G' \notin \mathcal{C}$. These definitions are appropriate for both cases with and without transfers.

Footnote 1: The definitions of improving paths above are suitable for pairwise stability as presented in Definitions 2.1 and 2.2. For pairwise stability as defined in the econometric models (2.5) and (2.6), some modifications are needed. That is, when the change from $G_k$ to $G_{k+1}$ is an addition, i.e., $(G_k)_{ij} = 0$, $(G_{k+1})_{ij} = 1$, the inequalities should be nonstrict, i.e., $U_i(G_{k+1}) \geq U_i(G_k)$ and $U_j(G_{k+1}) \geq U_j(G_k)$ without transfers, and $U_i(G_{k+1}) + U_j(G_{k+1}) \geq U_i(G_k) + U_j(G_k)$ with transfers.

Lemma 1.1 (Lemma 1 in Jackson and Watts (2002)) For any utility function there is a $PS_T$ network or a closed cycle in the case with transfers.

Proof. Because the proof of Lemma 1 in Jackson and Watts (2002) is mainly logical and does not involve the technical definition of improving paths given above, where the cases with and without transfers differ, the proof is just a restatement of that proof. We include it here for completeness. A network is $PS_T$ if and only if it does not lie on an improving path that leads to another network. Start with a network. If it is $PS_T$, the proof is finished. If not, it lies on an improving path to another network. As the number of possible networks is finite, the improving path either ends at a network that has no improving path leaving it and hence is $PS_T$, or recurrently hits some networks. In the latter case, the improving path forms a cycle. This implies that if a $PS_T$ network does not exist, there is a cycle. Now we show that in this case there must be a closed cycle. Given the finite number of networks, there is a maximal cycle that is not contained in another cycle. Consider the collection of all maximal cycles.
There must be a maximal cycle that has no improving path leaving it. Otherwise, if all maximal cycles have an improving path leaving them, then there is a larger cycle, which contradicts maximality. Hence there is a closed cycle. Proof of Proposition 2.1. First we verify that the stated condition is sucient in the case with transfers. That is, if there is a function : G ! R such that for any adjacent G and G 0 , G 0 defeats G if and only if (G 0 ) > (G), then there is no cycle, where "adjacent", "defeat", and "cycle" are dened in the paragraph "Formal denitions of closed cycles" in the appendix. 2 The proof is identical to that of Theorem 1 in Jackson and Watts (2001) because, again, the latter does not involve the technical denition of improving paths. Suppose there is a cycleC. So forG2C there is an improving path from G to G. This implies that (G) > (G), which is impossible. Hence there is no cycle. 2 If the pairwise stability is dened as in equations (2.5) and (2.6), the condition needs a minor modication. That is, G 0 defeats G if and only if (G 0 ) (G) if from G to G 0 is an addition and (G 0 )> (G) if a deletion. The suciency result still holds. 109 Now we propose a function for the utility function in (2.3) (G) = N P i=1 N P j=1 G ij u ij + 1 2 N P i=1 N P j=1 N P k=1 k6=i G ij G jk v + 2 3 N P i=1 N P j=1 N P k>j G ij G ik G jk w N P i=1 C i N P j=1 G ij ! : (1.1) We show that this function has the desired property. Consider two adjacent networksG andG 0 . Suppose they dier by linkij. Without loss of generality assume that G = (0;G ij ) and G 0 = (1;G ij ). It suces to show that (G 0 ) (G) = ij U i (G ij ) + ij U j (G ij ): By simple algebra we obtain (G 0 ) (G) = u ij +u ji + N P k=1 k6=i G jk v + N P k=1 k6=j G ik v + 2 N P k=1 G ik G jk w C i 0 @ N P k=1 k6=j G ik 1 A C j 0 @ N P k=1 k6=i G jk 1 A : (1.2) Please note that because G ij =G ji , G ij appears in G ij G jk , G kj G ji , G ji G ik , andG ki G ij in the rst triple sum in (1.1). 
When we take the dierence, this triple sum becomes 2 P N k=1;k6=i G jk v + 2 P N k=1;k6=j G ik v. By virtue of the " 1 2 " in (1.1) we then obtain the third and forth terms in the right hand side of (1.2). The " 2 3 " in (1.1) plays a similar role. Now we look at the individual 110 marginal utilities. From (2.4) we have ij U i (G ij ) =u ij + N P k=1 k6=i G jk v + N P k=1 G ik G jk w C i 0 @ N P k=1 k6=j G ik 1 A ; ij U j (G ij ) =u ji + N P k=1 k6=j G ik v + N P k=1 G jk G ik w C j 0 @ N P k=1 k6=i G jk 1 A : It is clear that (G 0 )(G) = ij U i (G ij )+ ij U j (G ij ), so the function in (1.1) has the desired property. Proof of Proposition 2.2. We present the denitions of convexity in one's own links and strategic complementarity in Hellmann (2013) using the notation in this chapter. Let (G ij ) kl denote link kl in G ij for kl6= ij. According to Hellmann,U i (G) is convex in one's own links if for anyj2V , j6=i, and for any G ij ;G 0 ij 2G ij such that 1. (G ij ) kl = G 0 ij kl , for all k;l2V and k;l6=i, and 2. for some LVnfi;jg, (G ij ) il = 0 and G 0 ij il = 1, for all l2L, we have ij U i (G 0 ij ) ij U i (G ij ). In other words, ifG 0 ij dier fromG ij by adding some links that involvei, then the marginal utility ofi from linkij with these additional links is larger than without. Moreover, U i (G) satises the strategic complements property if for any j 2 V , j 6= i, and for any G ij ;G 0 ij 2G ij such that 1. (G ij ) il = G 0 ij il , for all l2V and l6=i;j, and 111 2. for some LVnfig, (G ij ) kl = 0 and G 0 ij kl = 1 for all k;l2L, we have ij U i (G 0 ij ) ij U i (G ij ). In other words, ifG 0 ij dier fromG ij by adding some links that do not involve i, then the marginal utility of i from link ij given these additional links is larger than without these links. According to Theorem 1 in Hellmann, if both properties are satised, then there is no closed cycle. 
It suffices to verify that the utility function in (2.3) with $v \geq 0$, $w \geq 0$, and the assumed cost function has both properties. We first consider the case of $C_i < \infty$. The marginal utility (2.4) is
$$\Delta_{ij} U_i(G_{-ij}) = u_{ij} + \sum_{k=1,\, k \neq i}^{N} G_{jk} v + \sum_{k=1}^{N} G_{ik} G_{jk} w - C_i.$$
Since $v \geq 0$ and $w \geq 0$, changing $G_{ik}$ or $G_{jk}$ from 0 to 1 for some $k \in V$ weakly increases $\Delta_{ij} U_i(G_{-ij})$. Hence both properties are satisfied. In Section 2.3.3, we need a cost function $C_i(n) = C_i n$ that satisfies $C_i < \infty$ if $n \leq b$ and $C_i = \infty$ if $n > b$, for some integer $b < \infty$. For this cost function, it is easy to see that in either a $PS_{NT}$ network or a closed cycle, nobody has more than $b$ friends (see footnote 3). Therefore we return to a linear cost function and the above argument applies.

Footnote 3: For a closed cycle $\mathcal{C}$, suppose there is a $G \in \mathcal{C}$ such that $\sum_{j=1}^{N} G_{ij} > b$ for some $i \in V$. Then $i$ has the incentive to delete links until the number of friends is less than or equal to $b$. This forms an improving path from $G$ to another $G'$. By the definition of closed cycles, $G' \in \mathcal{C}$. However, there is no improving path from $G'$ to $G$, because the deleted links have to be added back, which is never beneficial for $i$. This contradicts that $\mathcal{C}$ is a closed cycle.

Proof of Lemma 2.1. We prove the statement for $PS_{NT}$. The proof for $PS_T$ is similar and omitted. Suppose there are adjacent $G, G' \in \mathcal{H}$. Then $G_{ij} \neq G'_{ij}$ and $G_{-ij} = G'_{-ij}$ for some $i, j$. Without loss of generality, suppose $G_{ij} = 1$ and $G'_{ij} = 0$. Because $\mathcal{H}$ is a $PS_{NT}$ set, we have $\Delta_{ij} U_i(G_{-ij}, X_i, X_j, \varepsilon_{ij}) \geq 0$ and $\Delta_{ij} U_j(G_{-ij}, X_j, X_i, \varepsilon_{ji}) \geq 0$, and $\Delta_{ij} U_i(G'_{-ij}, X_i, X_j, \varepsilon_{ij}) < 0$ or $\Delta_{ij} U_j(G'_{-ij}, X_j, X_i, \varepsilon_{ji}) < 0$. No $\varepsilon$ can satisfy these conditions because $G_{-ij} = G'_{-ij}$, thereby contradicting the assumption. Hence the networks in $\mathcal{H}$ must be mutually nonadjacent.

Proof of Lemma 2.2. Given $X$ and $\varepsilon$, if there is a $G_{-A}$ such that $(G_A, G_{-A})$ is pairwise stable, then $G_A$ must be pairwise stable for that $G_{-A}$.
This gives 1f9G A ; (G A ;G A )2PS (U (X;"; u ))g 1f9G A ;G A 2PS (U A (G A ;X A ;" A ; u ))g: (1.3) Similarly, for anyG A and anyG 0 A 6=G A , ifG 0 A is not pairwise stable for the G A , then (G 0 A ;G A ) must not be pairwise stable. Therefore, 1f8G A ;8G 0 A 6=G A ; (G 0 A ;G A ) = 2PS (U (X;"; u ))g 1f8G A ;8G 0 A 6=G A ;G 0 A = 2PS (U A (G A ;X A ;" A ; u ))g: (1.4) LetX A = (X i ) i2VnA denote a (NjAj)K matrix of attributes of the in- dividuals not inA, and" A = (" ij ) i or j2VnA denote a (N (N 1)jAj (jAj 1)) 1 vector of preferences for the links in G A . It is clear that X = (X A ;X A ) and" = (" A ;" A ). Because the second lines of (1.3) and (1.4) do 113 not depend on " A , when we substitute them into (2.16) and (2.17), " A is integrated out. Therefore we obtain Pr(G A jX;N) Z 9G A ;G A 2PS(U A (B A ;X A ;" A ;u)) dF " A (" A j " ); (1.5) and Pr(G A jX;N) Z 8G A ;8G 0 A 6=G A ;G 0 A = 2PS(U A (B A ;X A ;" A ;u)) dF " A (" A j " ): (1.6) Moreover, because the right hand sides of (1.5) and (1.6) does not depend on X A and E (Pr(G A jX;N)jX A ;N) = E (Pr(G A jX A ;X A ;N)jX A ;N) = Pr(G A jX A ;N), we obtain (2.21) and (2.22). Proof of Proposition 2.2. For anyAV , we haveG = G A ;B A ;G VnA . Similarly to (2.14), we obtain Pr (G A ;B A jX;N) = Z X G VnA 2G VnA Pr G A ;B A ;G VnA PS (U (X;"; u ));X;" dF " ("j " ): (1.7) Following the ideas in (2.16) , Lemma 2.2, and Proposition 2.1, we obtain X G VnA 2G VnA Pr G A ;B A ;G VnA PS (U (X;"; u ));X;" 1 9G VnA ; G A ;B A ;G VnA 2PS (U (X;"; u )) 114 1 9G VnA ;G A 2PS U A B A ;G VnA ;X A ;" A ; u = 1fG A 2PS (U A (B A ;X A ;" A ; u ))g: (1.8) (2.26) then follows by applying (1.8) to (1.7) and integrating out " A and X A as in Lemma 2.2. Proof of Equation (2.33). 
It is sucient to show Pr (G =gjX =x;N =n) = Pr (G =g jX =x ;N =n); (1.9) Pr (X =x;N =n) = Pr (X =x ;N =n): (1.10) To show equation (1.9), note that pairwise stability and equilibrium selection are invariant under isomorphisms, i.e., g2PS (U (x;")),g 2PS (U (x ;" )); (1.11) Pr (gjPS (U (x;"));x;") = Pr (g jPS (U (x ;" ));x ;" ); (1.12) where " is dened by " i j = " ij ;8i;j2 V;i6= j. Under Assumption 2.1 (iii)," ij are i.i.d.. Equation (1.9) then follows from equation (2.9). Equation (1.10) holds because X i are i.i.d., a result of the data generating process in Assumption 2.1. Proof of Equation (2.34). We prove this equation in Proposition 1.1 below. 115 Denition of Isomorphic Subnetworks with Neighborhoods. Let be a permutation over V = f1;:::;ng. For (g A ;b A ;x A ;n), we dene (g A ;b A ;x A ;n) to be a subnetwork with a neighborhood such that g i j =g ij ; 8i2A;8j2V;i6=j; x i j =x i ; 8i2A; i.e., the edges in g A and b A and the attributes in A are preserved under . 4 Then (g A ;b A ;x A ;n) and (g A ;b A ;x A ;n) are equivalent subnetworks with neighborhoods, except for the labels. We say they are isomorphic and write (g A ;b A ;x A ;n) = (g A ;b A ;x A ;n): Proposition 1.1 Under Assumption 2.1, for any AV , we have Pr (G A =g A ;X A =x A ;N =n) = Pr (G A =g A ;X A =x A ;N =n); (1.13) and Pr (G A =g A ;B A =b A ;X A =x A ;N =n) = Pr (G A =g A ;B A =b A ;X A =x A ;N =n): (1.14) 4 The attributes of the individuals inVnA may not necessarily be preserved. This is the result of the local externality assumption because two neighborhoods with heterogeneous neighbors but the same topology in other aspects are still considered equivalent. 116 Proof of Proposition 1.1. It is sucient to show (1.14) because by construction there is a one-to-one mapping betweenb A andb A , so summing over b A and b A on both sides of (1.14) we obtain (1.13). 
Equation (1.14) follows from Pr (G A =g A ;B A =b A jX A =x A ;N =n) = Pr (G A =g A ;B A =b A jX A =x A ;N =n); (1.15) and Pr (X A =x A ;N =n) = Pr (X A =x A ;N =n): (1.16) To show (1.15), note that from (1.7) in the proof of Proposition 2.2 we obtain Pr (G A =g A ;B A =b A jX A =x A ;N =n) = ZZ X g VnA Pr g A ;b A ;g VnA PS U x A ;x VnA ;" ;x A ;x VnA ;" dF X VnA x VnA dF " ("): (1.17) Similarly, Pr (G A =g A ;B A =b A jX A =x A ;N =n) = ZZ X g V nA Pr g A ;b A ;g V nA PS U x A ;x V nA ;" ;x A ;x V nA ;" dF X V nA x V nA dF " (" ); (1.18) 117 where g V nA ;x V nA ;n is a subnetwork in V nA such that g i j =g ij ; 8i;j2VnA;i6=j x i =x i ; 8i2VnA; F X VnA and F X V nA are the distribution functions of X VnA and X V nA , and " is dened as in the proof of equation (2.33). 5 Let g = g A ;b A ;g VnA , g = g A ;b A ;g V nA , x = x A ;x VnA , and x = x A ;x V nA . It is straightforward to show that networks (g;x;n) and (g ;x ;n) are isomorphic under . Therefore, we have Pr g A ;b A ;g VnA PS U x A ;x VnA ;" ;x A ;x VnA ;" = Pr (gjPS (U (x;"));x;") = Pr (g jPS (U (x ;" ));x ;" ) = Pr g A ;b A ;g V nA PS U x A ;x V nA ;" ;x A ;x V nA ;" ; (1.19) where the second equality is from (1.11) and (1.12). Under Assumption 2.1, X i are i.i.d., soX VnA d =X V nA and both are exchangeable. Moreover," ij are i.i.d., so" is exchangeable. Therefore, applying (1.19) to (1.17) and (1.18) we obtain (1.15). Equation (1.16) holds also because X i are i.i.d., soX A d =X A 5 Note that it is valid to denote by g V nA ;x V nA ;n an arbitrary subnet- work in V nA in equation (1.18) because there is a one-to-one mapping between g V nA ;x V nA ;n and g VnA ;x VnA ;n and the latter can be arbitrary. 118 and both are exchangeable. Discussion of Equation (1.14). Equation (1.14) implies that (G A ;B A ;X A ;N) d = (G A ;B A ;X A ;N); and we can treat (g A ;b A ;x A ;n) and (g A ;b A ;x A ;n) as the same realization. Proof of Theorem 2.1. This is an application of Theorems 3.1 and 4.2 in CHT (2007). 
(a) Condition (iii) implies that inf 2 TQ T () inf 2 I TQ T () sup 2 I TQ T () =O p (1): Hence, from conditions (ii) and (iii) we obtain sup 2 p TjQ ()Q 0 T ()j sup 2 p TjQ ()Q T ()j+ inf 0 2 p TQ T ( 0 ) =O p (1); and sup 2 I TQ 0 T () = sup 2 I TQ T () inf 0 2 TQ T ( 0 ) =O p (1); Therefore, following the proof of Theorem 3.1 in CHT, we can show that sup 2 I d ; ^ I = 0; and sup 2 ^ I d (; ) p ! 0; (1.20) which imply d ^ I ; I p ! 0. The rst part of (1.20) holds because sup 2 I TQ 0 T () =O p (1)<c T and thus I ^ I . The second part of (1.20) 119 follows from sup 2 ^ I Q () sup 2 ^ I jQ ()Q 0 T ()j + sup 2 ^ I Q 0 T ()O p 1 T + c T T =o p (1); (1.21) and inf 2n " I Q () ("); (1.22) for some (") > 0, where " I = f2 :d (; I )<"g, " > 0. (1.22) is satised because Q () is continuous in under condition (i) and thus inf 2n " I Q () = Q ( ) > 0 for some 2 n " I by compactness of . Combining (1.21) with (1.22) yields ^ I \ n " I =; with probability ap- proaching 1 we obtain the second part of (1.20). (b) Now we show that conditions (i)-(iii) hold for the utility function in (2.3). To show condition (i), note that Q () is continuous in if the bound functions in (2.32) are continuous in. The continuity of the bound functions follows from the fact that these bound functions are integrals over " which follows a continuous distribution under Assumption 2.1 and the boundary functions of the integration regions are continuous in . Condition (ii) is satised becausefm t ();2 g is Donsker. This is because m t () satises the nite-dimensional convergence property due to standard central limit theorem and the stochastic equicontinuity property due to the continuity of m t () in which follows from the continuity of the 120 bound functions. To be precise, by CLT we have 0 B B B B @ 1 p T P T t=1 P g t P g 1 p T P T t=1 P b t P b 1 p T P T t=1 P x t P x 1 C C C C A d !N (0; ); for certain positive denite matrix . 
In the above expression, P g t , P b t , and P x t are the vectors of their counterparts in (2.38) over all equivalence classes of (g a ;x a ;n); (g a ;b a ;x a ;n) and (x a ;n), and all 2 a a, and P g , P b , and P x are the corresponding mean vectors. It is then straightforward to show that (G T m t ( 1 );:::;G T m t ( k )) d !N (0; ( 1 ;:::; k )); (1.23) for certain positive denite matrix ( 1 ;:::; k ), for all nite k and all 1 ;:::; k 2 , where G T m t ( j ) = 1 p T T X t=1 (m t ( j )Em t ( j )); j = 1;:::;k: Condition (iii) is a result offm t ();2 g being Donsker andEm t () 0 for 2 I . To see this, write TQ T () = p TE T m t () + 2 = G T m t () + p TEm t () + 2 : 121 For2 I , we haveTQ T () p ! 0 asT!1 forEm t ()< 0, andTQ T () = (G T m t ()) + 2 = O p (1) for Em t () = 0. Hence condition (iii) is satised. Simulators for H 1 (g A ;x A ;n;) and H 2 (g A ;x A ;n;) To describe the simulators, let us x A V and index the links in g A by l 1 ;l 2 ;:::;l , where =jAj (jAj 1)=2. For any k = 1;:::;, let x l k (a 2K matrix) and" l k (a 2 1 vector) represent the observed attributes and unobserved preferences of the two individuals involved in link l k . Moreover, let g l k = (g l 1 ;:::;g l k ); x l k = (x l 1 ;:::;x l k ); " l k = (" l 1 ;:::;" l k ); represent links l 1 ;:::;l k and the attributes and preferences associated with these links. Fork =, they are simplyg A ,x A , and" A . Note that for linkij that is among l 1 ;:::;l k , its neighborhood is b ij =b ij (g Aij ;b A ) =b ij g (l k )ij ;g A(l k ) ;b A ; whereg (l k )ij represents linksl 1 ;:::;l k exceptij, andg A(l k ) represents the links ing A other thanl 1 ;:::;l k . 
We then denote the PS set of links l 1 ;:::;l k 122 by PS U l k g A(l k ) ;b A ;x l k ;" l k ; u ; where U l k g A(l k ) ;b A ;x l k ;" l k ;: :=ff ij U i (b ij (g (l k )ij ;g A(l k ) ;b A ); x i ;x j ;" ij ;:); ij U j (b ji g (l k )ij ;g A(l k ) ;b A ;x j ;x i ;" ji ;:)g g (l k )ij g ij2fl 1 ;:::;l k g 2R k2 k represents the marginal-utility prole of linksl 1 ;:::;l k . Note that for k = the marginal-utility prole and PS set are simply U A (b A ;x A ;" A ; u ) andPS (U A (b A ;x A ;" A ; u )) as before. We rst consider the simulator for H 1 (g A ;x A ;n;). As mentioned in the main text, the key step is to express H 1 (g A ;x A ;n;) as a function of a sequence of conditional probabilities which have closed forms. Theorem 1.1 Suppose Assumptions 2.1 and 2.2 are satised. For any (g A ; x A ;n) and , we have H 1 (g A ;x A ;n;) = Pr (" l 1 2D 1 (g A ;x l 1 ; u )j " ) Z Y k=2 Pr " l k 2D kjk1 ~ " l k1 ;g A ;x l k ; u ~ " l k1 ; " 1 Y k=2 f " l k ~ " l k j~ " l k1 f " l 1 (~ " l 1 ) ! d~ " l 1 ; (1.24) where we dene the sets D k (g A ;x l k ; u ) 123 = 8 > < > : " l k 2R 2k : max b A 1 g l k 2PS U l k g A(l k ) ;b A ;x l k ;" l k ; u = 1 9 > = > ; ; (1.25) for k = 1;:::;, and D kjk1 " l k1 ;g A ;x l k ; u := " l k 2R 2 : " l k1 ;" l k 2D k (g A ;x l k ; u ) ; (1.26) for any " l k1 2D k1 g A ;x l k1 ; u and k = 2;:::;. In equation (1.24), f " l 1 (~ " l 1 ) represents the density of " l 1 subject to " l 1 2 D 1 (g A ;x l 1 ; u ), and f " l k ~ " l k j~ " l k1 represents the conditional density of " l k given ~ " l k1 subject to " l k 2D kjk1 ~ " l k1 ;g A ;x l k ; u , i.e., f " l 1 (~ " l 1 ) = f " l (~ " l 1 j " ) 1f~ " l 1 2D 1 (g A ;x l 1 ; u )g R D 1 (g A ;x l 1 ;u) dF " l (" l 1 j " ) ; f " l k ~ " l k j~ " l k1 = f " l (~ " l k j " ) 1 ~ " l k 2D kjk1 ~ " l k1 ;g A ;x l k ; u R D kjk1 (~ " l k1 ;g A ;x l k ;u) dF " l (" l k j " ) ; (1.27) k = 2;:::; 1; where F " l and f " l represent the distribution and density functions of " l . Proof of Theorem 1.1. 
The proof is similar to that of Theorem 2.1.1 in Geweke and Keane (2001, p. 3471). Consider the integral on the right hand 124 side of (1.24). It satises Z Y k=2 Pr " l k 2D kjk1 ~ " l k1 ;g A ;x l k ; u ~ " l k1 ; " 1 Y k=2 f " l k ~ " l k j~ " l k1 f " l 1 (~ " l 1 )d~ " l 1 = Z Z Pr " l 2D j1 ~ " l 1 ;g A ;x l ; u ~ " l 1 ; " Pr " l 1 2D 1j2 ~ " l 2 ;g A ;x l 1 ; u ~ " l 2 ; " Pr " l 2 2D 2j1 (~ " l 1 ;g A ;x l 2 ; u ) ~ " l 1 ; " f " l 1 ~ " l 1 j~ " l 2 f " l 2 ~ " l 2 j~ " l 3 f " l 1 (~ " l 1 )d~ " l 1 d~ " l 2 d~ " l 1 ; (1.28) The integral over ~ " l 1 in (1.28) is Z Pr " l 2D j1 ~ " l 1 ;g A ;x l ; u ~ " l 1 ; " f " l 1 ~ " l 1 j~ " l 2 d~ " l 1 = Pr 0 B @ " l 2D j1 ~ " l 2 ;" l 1 ;g A ;x l ; u & " l 1 2D 1j2 ~ " l 2 ;g A ;x l 1 ; u ~ " l 2 ; " 1 C A Pr " l 1 2D 1j2 ~ " l 2 ;g A ;x l 1 ; u ~ " l 2 ; " : Substituting the last expression, the integral over ~ " l 2 and ~ " l 1 in (1.28) is Z Pr 0 B @ " l 2D j1 ~ " l 2 ;" l 1 ;g A ;x l ; u & " l 1 2D 1j2 ~ " l 2 ;g A ;x l 1 ; u ~ " l 2 ; " 1 C A Pr " l 1 2D 1j2 ~ " l 2 ;g A ;x l 1 ; u ~ " l 2 ; " 125 Pr " l 1 2D 1j2 ~ " l 2 ;g A ;x l 1 ; u ~ " l 2 ; " f " l 2 ~ " l 2 j~ " l 3 d~ " l 2 = Pr 0 B @ " l k 2D kjk1 ~ " l 3 ;" l 2 ;:::;" l k1 ;g A ;x l k ; u k = 2;:::; ~ " l 3 ; " 1 C A Pr " l 2 2D 2j3 ~ " l 3 ;g A ;x l 2 ; u ~ " l 3 ; " : Proceeding in this way, the integral over ~ " l j ;:::; ~ " l 1 ,j 3, in (1.28) is Z Pr 0 B @ " l k 2D kjk1 ~ " l j ;" l j+1 ;:::;" l k1 ;g A ;x l k ; u k =j + 1;:::; ~ " l j ; " 1 C A Pr " l j+1 2D j+1jj ~ " l j ;g A ;x l j+1 ; u ~ " l j ; " Pr " l j+1 2D j+1jj ~ " l j ;g A ;x l j+1 ; u ~ " l j ; " f " l j ~ " l j j~ " l j1 d~ " l j = Pr 0 B @ " l k 2D kjk1 ~ " l j1 ;" l j ;:::;" l k1 ;g A ;x l k ; u k =j;:::; ~ " l j1 ; " 1 C A Pr " l j 2D jjj1 ~ " l j1 ;g A ;x l j ; u ~ " l j1 ; " : For j = 1, the last expression becomes Pr 0 B @ " l k 2D kjk1 " l 1 ;:::;" l k1 ;g A ;x l k ; u ;k = 2;:::;; & " l 1 2D 1 (g A ;x l 1 ; u ) " 1 C A Pr (" l 1 2D 1 (g A ;x l 1 ; u )j " ) = Pr ((" l 1 
;:::;" l )2D (g A ;x l ; u )j " ) Pr (" l 1 2D 1 (g A ;x l 1 ; u )j " ) = M 1 (g A ;x A ;) Pr (" l 1 2D 1 (g A ;x l 1 ; u )j " ) : 126 Equation (1.24) then follows. In equation (1.24),D 1 (g A ;x l 1 ; u ) is the region of " l 1 where g l 1 is PS for some b A , andD kjk1 ~ " l k1 ;g A ;x l k ; u , k = 2;:::;, is the region of " l k whereg l k is PS for someb A given ~ " l k1 . Note that ~ " l k1 2D k1 (g A ;x l k1 ; u ) already ensures that g l k1 is PS for some b A . It is sucient for " l k to lie inD kjk1 ~ " l k1 ;g A ;x l k ; u if g l k is PS for some b A . Therefore, these regions are subsets of R 2 whose boundaries can be obtained by checking equations (2.5) or (2.6) for each ofg l 1 ;g l 2 ;:::;g l . Based on these boundaries, we can calculate analytically the probabilities Pr(" l 1 2D 1 (g A ;x l 1 ; u )j " ) and Pr(" l k 2 D kjk1 ~ " l k1 ;g A ;x l k ; u ~ " l k1 ; " ) in (1.24). The closed forms of these probabilities are continuous in because both the utility and distribution functions are continuous in . As an illustration, let us assume " l 1 = (" ij ;" ji ) N (0; 2 I 2 ), where I 2 is the 2 2 identity matrix. Consider the case with transfers. Write the marginal utility function in (2.4) as ij U i (b ij ;x i ;x j ;" ij ; u ) = ij U i (b ij ;x i ; x j ; u ) +" ij and the same for ij U j . Then Pr (" l 1 2D 1 (g A ;x l 1 ; u )j " ) is equal to P " ij +" ji max b A ij U i (g Aij ;b A ;:) + ij U j (g Aij ;b A ;:) = max b A ij U i (g Aij ;b A ;:) + ij U j (g Aij ;b A ;:) = p 2 127 if g ij = 1 and P " ij +" ji < min b A ij U i (g Aij ;b A ;:) + ij U j (g Aij ;b A ;:) = 1 min b A ij U i (g Aij ;b A ;:) + ij U j (g Aij ;b A ;:) = p 2 if g ij = 0. Theorem 1.1 shows thatH 1 (g A ;x A ;n;) can be expressed as the expecta- tion of the product of the probability terms over ~ " l 1 ; ~ " l 2 ;:::; ~ " l a1 that follow the distributions given in (1.27). Therefore, we can simulate an i.i.d. 
sam- ple of ~ " l 1 ; ~ " l 2 ;:::; ~ " l a1 from the distributions in (1.27) and use the average of the product over this sample as the simulator for H 1 (g A ;x A ;n;). The procedure is as follows: Algorithm of the simulator for H 1 (g A ;x A ;n;) 1. Generate ~ " l 1 ;r ;:::; ~ " l a1 ;r as follows: (a) Compute the setD 1 (g A ;x l 1 ; u ) and generate ~ " l 1 ;r from the distri- bution f " l 1 (~ " l 1 ). (b) fork = 2;:::; 1, compute the setD kjk1 ~ " l k1 ;r ;g A ;x l k ; u and generate ~ " l k ;r from the distributionf " l k ~ " l k j~ " l k1 ;r given the simulated ~ " l k1 ;r . 2. Compute Pr (" l 1 2D 1 (g A ;x l 1 ; u )j " ) and 128 Pr " l k 2D kjk1 ~ " l k1 ;r ;g A ;x l k ; u ~ " l k1 ;r ; " (k = 2;:::;) ana- lytically. 3. Let ^ H 1;r (g A ;x A ;n;) = Pr (" l 1 2D 1 (g A ;x l 1 ; u )j " ) Y k=2 Pr " l k 2D kjk1 ~ " l k1 ;r ;g A ;x l k ; u ~ " l k1 ;r ; " : 4. Repeat Steps 1-3 for r = 1;:::;R and obtain the simulator ^ H 1 (g A ;x A ;n;) = 1 R R X r=1 ^ H 1;r (g A ;x A ;n;): The simulator ^ H 1 (g A ;x A ;n;) is continuous in because each ^ H 1;r (g A ;x A ; n;) is continuous in . Due to standard central limit theorem we have p R ^ H 1 (g A ;x A ;n;)H 1 (g A ;x A ;n;) =O p (1) as R!1. In practice, we choose R =O(T 2+ ) for some > 0 so that the simulation errors do not have any eect on the estimation. Now we consider the simulator for H 2 (g A ;x A ;n;). 
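The recursive structure of the algorithm for the simulator of H_1 given above can be sketched in a stripped-down setting. The Python sketch below assumes, purely for illustration, that each link has a scalar standard normal shock and that each conditional region is a half-line whose lower threshold may depend on the earlier draws; the function names and this simplification are ours and do not reproduce the sets defined in (1.25) and (1.26). In each replication it multiplies the closed-form conditional probabilities (Step 2), draws the next shock from the truncated distribution (Step 1), and finally averages the products over replications (Steps 3 and 4).

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def draw_truncated(lower, rng):
    """Draw from N(0, 1) truncated to [lower, inf) by inverting the CDF."""
    p_low = STD_NORMAL.cdf(lower)
    u = p_low + (1.0 - p_low) * rng.random()
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep inv_cdf inside (0, 1)
    return STD_NORMAL.inv_cdf(u)


def simulate_h1(thresholds, n_draws, seed=0):
    """Average over replications of prod_k Pr(eps_k >= a_k(earlier draws)).

    thresholds[k] maps the list of earlier draws to the lower bound a_k
    of the (hypothetical) half-line region for the k-th shock.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        draws, weight = [], 1.0
        for k, threshold in enumerate(thresholds):
            a_k = threshold(draws)
            weight *= 1.0 - STD_NORMAL.cdf(a_k)  # closed-form probability
            if k < len(thresholds) - 1:
                # The last shock is never conditioned on, so it is not drawn.
                draws.append(draw_truncated(a_k, rng))
        total += weight  # product for this replication
    return total / n_draws  # average over replications


# Region {eps_1 >= 0, eps_2 >= -eps_1}; its exact probability is 3/8.
estimate = simulate_h1([lambda d: 0.0, lambda d: -d[0]], n_draws=20_000)
```

Because every replication contributes a product of smooth probabilities rather than an indicator, the simulated value varies smoothly with the thresholds, which mirrors the continuity in the parameter noted above for the simulator.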
Since

$$H_2(g_A, x_A, n; \theta) = \int \min_{g'_A \ne g_A,\, b_A} \big(1 - 1\{g'_A \in PS(U_A(b_A, x_A, \varepsilon_A; \theta_u))\}\big)\, dF_{\varepsilon_A} = 1 - \int \max_{g'_A \ne g_A,\, b_A} 1\{g'_A \in PS(U_A(b_A, x_A, \varepsilon_A; \theta_u))\}\, dF_{\varepsilon_A} = 1 - \tilde H_2(g_A, x_A, n; \theta),$$

we can apply Theorem 1.1 to $\tilde H_2(g_A, x_A, n; \theta)$, if we replace $\mathcal{D}_k(g_A, x^{l_k}; \theta_u)$ in (1.25) by

$$\mathcal{D}'_k(g_A, x^{l_k}; \theta_u) = \Big\{\varepsilon^{l_k} \in \mathbb{R}^{2k} : \max_{{g'}^{l_k} \ne g^{l_k},\, b_A} 1\big\{{g'}^{l_k} \in PS\big(U^{l_k}(g_{A(-l_k)}, b_A, x^{l_k}, \varepsilon^{l_k}; \theta_u)\big)\big\} = 1\Big\}$$

for $k = 1, \ldots, \bar a$, replace $\mathcal{D}_{k|k-1}(\varepsilon^{l_{k-1}}, g_A, x^{l_k}; \theta_u)$ in (1.26) by

$$\mathcal{D}'_{k|k-1}(\varepsilon^{l_{k-1}}, g_A, x^{l_k}; \theta_u) := \big\{\varepsilon_{l_k} \in \mathbb{R}^2 : (\varepsilon^{l_{k-1}}, \varepsilon_{l_k}) \in \mathcal{D}'_k(g_A, x^{l_k}; \theta_u)\big\},$$

for all $\varepsilon^{l_{k-1}} \in \mathcal{D}'_{k-1}(g_A, x^{l_{k-1}}; \theta_u)$, $k = 2, \ldots, \bar a$, and redefine $f_{\varepsilon_{l_1}}(\tilde\varepsilon_{l_1})$ and $f_{\varepsilon_{l_k}}(\tilde\varepsilon_{l_k} \mid \tilde\varepsilon^{l_{k-1}})$, $k = 2, \ldots, \bar a - 1$, accordingly. The counterpart of (1.24) for $\tilde H_2(g_A, x_A, n; \theta)$ is

$$\tilde H_2(g_A, x_A, n; \theta) = \Pr(\varepsilon_{l_1} \in \mathcal{D}'_1(g_A, x^{l_1}; \theta_u) \mid \theta_\varepsilon) \int \prod_{k=2}^{\bar a} \Pr(\varepsilon_{l_k} \in \mathcal{D}'_{k|k-1}(\tilde\varepsilon^{l_{k-1}}, g_A, x^{l_k}; \theta_u) \mid \tilde\varepsilon^{l_{k-1}}; \theta_\varepsilon) \Big(\prod_{k=2}^{\bar a - 1} f_{\varepsilon_{l_k}}(\tilde\varepsilon_{l_k} \mid \tilde\varepsilon^{l_{k-1}})\, f_{\varepsilon_{l_1}}(\tilde\varepsilon_{l_1})\Big)\, d\tilde\varepsilon^{l_{\bar a - 1}} \tag{1.29}$$

The proof of (1.29) is identical to that of Theorem 1.1. Hence we can use the same algorithm as above to construct a simulator for $\tilde H_2(g_A, x_A, n; \theta)$ and thus for $H_2(g_A, x_A, n; \theta)$.

Computational Issues

1. Calculating all possible neighborhoods. The brute-force method is to calculate the $2^{a(n-a)}$ possible neighborhoods (for a subnetwork of size $a$ in a network with $n$ individuals) and find all the distinct combinations of the numbers of one's own friends and friends in common. This is infeasible even for a moderate $n$. We transform the problem into an integer linear programming problem which can be solved easily. [6] We first generate all potential neighborhoods given that one makes no more than $\bar b$ friends. Then we check for each potential neighborhood whether it is compatible (e.g. whether the number of one's own friends is compatible with the numbers of friends in common) and whether it is feasible given that there are at most $n - a$ individuals to form the neighborhood.
This is equivalent to solving the following integer linear programming problem:

$$\min_x\ e'x \quad \text{s.t.} \quad \Lambda x = b, \quad 0 \le x \le \bar b, \quad x \text{ is an integer vector.}$$

[6] The integer linear programming solver we use is CPLEX.

In the above problem, $x$ represents the full configuration of a neighborhood, i.e., the numbers of friends each individual has, the numbers of friends each pair has in common, the numbers of friends each triple has in common, etc. It is an integer vector with dimension $d_x = \binom{n}{1} + \cdots + \binom{n}{\min(a, \bar b)}$. $b$ represents a potential neighborhood that considers only the first two types of friends. It is an integer vector with dimension $d_b = \binom{n}{1} + \binom{n}{2}$. $e$ is a $d_x \times 1$ vector with all components equal to 1. $\Lambda$ is a $d_b \times d_x$ matrix with entries equal to 0 or 1 and is determined by the relations of all types of friends. For each potential neighborhood $b$, if the above problem has a solution and the minimum $e'x$ is not larger than $n - a$, we keep this $b$ as a valid neighborhood.

2. Calculating the equivalence classes of isomorphic networks/subnetworks. We use a portable C package named Nauty, developed by McKay, to determine whether two networks/subnetworks are isomorphic. [7] Nauty uses one of the state-of-the-art algorithms in graph isomorphism. See McKay (1981) for the theory of the algorithm. To call Nauty, we need to transform subnetworks and subnetworks with neighborhoods into standard graphs. More precisely, we transform a subnetwork $(g_A, x_A, n)$ into a vertex-colored graph, where the vertices are colored by the attributes. Moreover, we transform a subnetwork with a neighborhood $(g_A, b_A, x_A, n)$ into a vertex-and-edge-colored graph, where the vertices are colored by the attributes and the numbers of each one's own friends, and the edges are colored by the numbers of friends in common.

[7] http://cs.anu.edu.au/~bdm/nauty/.

3. Computing the identified sets.
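Before we solve the minimization problem presented in Section 2.4.2, we first solve the following optimization problems to enhance efficiency:

$$\min_\theta\ \theta_i \quad \text{s.t.} \quad Q(\theta) \le c \qquad \text{and} \qquad \max_\theta\ \theta_i \quad \text{s.t.} \quad Q(\theta) \le c, \qquad i = 1, \ldots, d_\theta,$$

where $d_\theta$ is the dimension of $\theta$, and $\theta_i$ is the $i$th component of $\theta$. [8] The optima of these problems give us the smallest box that contains the set, i.e., the one-dimensional projections of the set. [9] Using starting points randomly generated from this box, we can solve the aforementioned minimization problem with much higher efficiency. To account for the quality of the solutions, we keep updating the box using the boundary points obtained from the minimization problem.

[8] We are grateful to Arnold Neumaier at the University of Vienna for suggesting these algorithms.
[9] We use the global optimization solver MCS to solve these problems. For a description of MCS, see http://www.mat.univie.ac.at/~neum/software/mcs/.

Proofs in Chapter 3

Proof of Lemma 3.2. $x^*$ solves

$$\mu(\theta_0 \mid I)\, \frac{\partial f_0(x, w, \varepsilon)}{\partial x} + \mu(\theta_1 \mid I)\, \frac{\partial f_1(x, w, \varepsilon)}{\partial x} = c \tag{1.30}$$

By concavity, $\frac{\partial^2 f_0}{\partial x^2} < 0$ and $\frac{\partial^2 f_1}{\partial x^2} < 0$, so $\frac{\partial f_0}{\partial x}\big|_{x_1} < c$ and $\frac{\partial f_1}{\partial x}\big|_{x_0} > c$ by (3.6), (3.7) and $x_0 < x_1$. Hence

$$\mu(\theta_0 \mid I)\, \frac{\partial f_0(x_0, w, \varepsilon)}{\partial x} + \mu(\theta_1 \mid I)\, \frac{\partial f_1(x_0, w, \varepsilon)}{\partial x} \ge c,$$
$$\mu(\theta_0 \mid I)\, \frac{\partial f_0(x_1, w, \varepsilon)}{\partial x} + \mu(\theta_1 \mid I)\, \frac{\partial f_1(x_1, w, \varepsilon)}{\partial x} \le c.$$

By Assumption 1 the LHS of (1.30) is continuous and strictly decreasing, therefore there is a unique solution $x^*$ to (1.30) such that $x^* \in [x_0, x_1]$. $x^* = x_0$ if $\mu(\theta_0 \mid I) = 1$ and $x^* = x_1$ if $\mu(\theta_0 \mid I) = 0$. For $0 < \mu(\theta_0 \mid I) < 1$, $x_0 < x^* < x_1$.

To prove the second statement, suppose $0 < \mu(\theta_0 \mid I) < \tilde\mu(\theta_0 \mid I) < 1$. Since $x_0 < x^* < x_1$ we have $\frac{\partial f_0}{\partial x}\big|_{x^*} < c$ and $\frac{\partial f_1}{\partial x}\big|_{x^*} > c$. So $\tilde\mu(\theta_0 \mid I)\, \frac{\partial f_0(x^*, w, \varepsilon)}{\partial x} + \tilde\mu(\theta_1 \mid I)\, \frac{\partial f_1(x^*, w, \varepsilon)}{\partial x} < c$. Therefore $\tilde x^* < x^*$ by continuity and decreasing monotonicity of the LHS of (1.30).

Proof of Proposition 3.1.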
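From the first-order condition (3.4), after simple algebra we obtain

$$\mu(\theta_0 \mid I) = \frac{\frac{\partial f_1(x, w, \varepsilon)}{\partial x} - c}{\frac{\partial f_1(x, w, \varepsilon)}{\partial x} - \frac{\partial f_0(x, w, \varepsilon)}{\partial x}} \tag{1.31}$$

By Bayes' rule (3.2), $\mu(\theta_0 \mid I)$ is given by

$$\mu(\theta_0 \mid I) = \frac{\prod_{j \in N} p(y_j, x_j, w_j; \theta_0)\, \mu(\theta_0)}{\prod_{j \in N} p(y_j, x_j, w_j; \theta_0)\, \mu(\theta_0) + \prod_{j \in N} p(y_j, x_j, w_j; \theta_1)\, \mu(\theta_1)} = \frac{\mu(\theta_0)}{\mu(\theta_0) + \prod_{j \in N} \frac{p(y_j, x_j, w_j; \theta_1)}{p(y_j, x_j, w_j; \theta_0)}\, \mu(\theta_1)} \tag{1.32}$$

Equating (1.31) and (1.32) gives

$$\frac{\mu(\theta_0)}{\mu(\theta_0) + \prod_{j \in N} \frac{p(y_j, x_j, w_j; \theta_1)}{p(y_j, x_j, w_j; \theta_0)}\, \mu(\theta_1)} = \frac{\frac{\partial f_1(x, w, \varepsilon)}{\partial x} - c}{\frac{\partial f_1(x, w, \varepsilon)}{\partial x} - \frac{\partial f_0(x, w, \varepsilon)}{\partial x}} \tag{1.33}$$

Take $(m, f_\theta, \mu, F_\varepsilon)$ as given. Without loss of generality, suppose $\theta = \theta_0$. [10] Pick any $\tilde f_1$ that satisfies Assumption 1. Let $\tilde p_1 = \tilde p(y, x, w; \theta_1) = p_\varepsilon(\tilde f_1^{-1}(x, w, y))\, \partial \tilde f_1^{-1}(x, w, y) / \partial y$, where $\tilde f_1^{-1}(x, w, y)$ is the inverse function of $y = \tilde f_1(x, w, \varepsilon)$ with respect to $\varepsilon$. Plug $\tilde f_1$ and $\tilde p_1$ into (1.33) and solve for $\mu(\theta_0)$. Denote the solution by $\tilde\mu(\theta_0)$. Given $f_\theta$, $\tilde f_1$ and $\tilde\mu$, we can derive the input demand function from the first-order condition (3.4). Denote it by $\tilde m$. By construction we have $x = \tilde m(c, w, \varepsilon, \tilde\mu, I)$ and $y = f_\theta(x, w, \varepsilon)$. Hence $(\tilde m, f_\theta, \tilde\mu, F_\varepsilon)$ generates the same joint distribution of $(x, y, c, w, I)$ as $(m, f_\theta, \mu, F_\varepsilon)$ does and is thus observationally equivalent to the latter.

[10] The proof is similar if $\theta = \theta_1$.

Proof of Theorem 3.2. Suppress $w$ for simplicity. Without loss of generality, suppose $\alpha > 1$ (the proof for $0 < \alpha < 1$ is similar). By Assumption 8, $g(\alpha) = g(h(\cdot)) = \tilde h(\cdot) = \alpha$. Hence by (3.23), $g(\alpha^k) = \alpha^k$ for all integers $k$. The rest of the proof is divided into three steps: (i) we show that $g(\alpha^r) = \alpha^r$ for all real $r \in [0, 1]$; (ii) $g(z) = z$ for all $z \in [1, \alpha]$; (iii) $g(z) = z$ for all $z \in (0, \infty)$.

To show $g(\alpha^r) = \alpha^r$ for real $r \in [0, 1]$, we claim first that $g(\alpha^{1/2}) = \alpha^{1/2}$. If not, $g(\alpha) = g((\alpha^{1/2})^2) = (g(\alpha^{1/2}))^2 \ne (\alpha^{1/2})^2 = \alpha$, a contradiction. By induction, $g(\alpha^{2^{-n}}) = \alpha^{2^{-n}}$ for all $n$. Any real $r$ in $[0, 1]$ has a base-2 expansion, $r = \sum_{j=1}^{\infty} b_j 2^{-j}$ where $b_j \in \{0, 1\}$. For any $n$, $g\big(\alpha^{\sum_{j=1}^{n} b_j 2^{-j}}\big) = \alpha^{\sum_{j=1}^{n} b_j 2^{-j}}$. Since $\sum_{j=1}^{n} b_j 2^{-j} \to$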
$r$ as $n \to \infty$ and $g$ is continuous, we have $g(\alpha^r) = \alpha^r$ for any real $r \in [0, 1]$.

Next, for any $z \in [1, \alpha]$, let $r = \frac{\log z}{\log \alpha}$. Then $r \in [0, 1]$ and $z = \alpha^r$. By (i), $g(z) = z$.

In the third step, for any $z \in (0, \infty)$, there is an integer $k$ such that $\alpha^k \le z \le \alpha^{k+1}$. Then $1 \le \frac{z}{\alpha^k} \le \alpha$, so by (ii) $g\big(\frac{z}{\alpha^k}\big) = \frac{z}{\alpha^k}$. By (3.23), $g(z) = \alpha^k \cdot \frac{z}{\alpha^k} = z$. Therefore, $g$ is the identity function on $(0, \infty)$ and the model (3.17) is identified.
Asset Metadata
Creator
Sheng, Shuyang (author)
Core Title
A structural econometric analysis of network and social interaction models
School
College of Letters, Arts and Sciences
Degree
Doctor of Philosophy
Degree Program
Economics
Publication Date
08/07/2013
Defense Date
04/25/2013
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
(digital)
Tag
Bayesian learning,multiple equilibria,network formation,nonadditive index models,nonparametric identification,OAI-PMH Harvest,pairwise stability,partial identification,simulation,social interactions,subnetworks
Format
application/pdf (imt)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Ridder, Geert (committee chair), Moon, Hyungsik Roger (committee member), Strauss, John A. (committee member), Yang, Sha (committee member)
Creator Email
shengshuyang@gmail.com,ssheng@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c3-320029
Unique identifier
UC11295209
Identifier
etd-ShengShuya-1990.pdf (filename),usctheses-c3-320029 (legacy record id)
Legacy Identifier
etd-ShengShuya-1990.pdf
Dmrecord
320029
Document Type
Dissertation
Rights
Sheng, Shuyang
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA