CONSPICUOUS CONNECTIONS AS SIGNALS OF EXPERTISE IN NETWORKS
by
Leila Bighash
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree of
DOCTOR OF PHILOSOPHY
(COMMUNICATION)
August 2018
Copyright August 2018 Leila Bighash
ACKNOWLEDGEMENTS
My advisor, Professor Peter Monge, has been my intellectual guide, pushing me to think
carefully about my theoretical development and analyses throughout this dissertation writing
process. He provided patience, support, and detailed feedback, which allowed me to develop,
complete, and be proud of this dissertation. I cannot thank him enough for being my advisor and
for being such a wonderful mentor to me throughout my Ph.D.
I would also like to thank my dissertation committee members, Professors Andrea
Hollingshead and Mike Ananny. Their feedback, questions, and unique perspectives on my work
provided me the opportunity to work harder and think more deeply about how I approach my
research for this dissertation and beyond. I cherish their guidance and mentorship throughout my
Ph.D., and I look forward to our future collaborations. I also thank Professor Janet Fulk, another
mentor who very much informed my work and helped me throughout the development of the
dissertation, as well as any other mentors inside and outside USC and the Annenberg School
with whom I discussed this work. Even though the dissertation is a solo work, I felt like I had
intellectual collaborators and supporters throughout.
To my understanding, brilliant, and/or hilarious friends and colleagues: Katie Elder,
LeeAnn Sangalang, Christy Hagen, Poong Oh, Kristen Steves Alexander, Katrina Pariera, Traci
Gillig, Diana Lee, Emily Sidnam, Sonia Shaikh, and many others – thank you for helping me
maintain the light and fun from beginning to end of the Ph.D., even during the most challenging
moments. Whether we were discussing our next project(s), toiling over an analysis conundrum,
pondering through an intellectual puzzle, or recapping our favorite reality TV episode, your
conversations and companionship were (almost) as necessary as water, food, shelter, and air
during this process. I appreciate you all!
My parents and other family members have been so supportive throughout this process,
and that was so important and crucial to my success. Thanks for believing in me and
understanding that this is a (very) long process. My parents, Iraj and Theresa Bighash, have
always inspired me to be curious and hard-working. They are some of the smartest people I
know, yet still kind, gracious, and giving. I hope I can be a good force in the world like them.
And finally to my partner, Bryan Tracey: thank you a million times for being so patient
and supportive during this dissertation writing process and the whole Ph.D. You helped me more
than you know. Having you there allowed me to escape when I needed it most, and this was an
absolute necessity for me. I couldn’t have done it without you, and I’m finally done!
Now on to the next adventure.
TABLE OF CONTENTS
ACKNOWLEDGEMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT
CHAPTER 1: INTRODUCTION
CHAPTER 2: EXPERTISE AND SIGNALING
2.1 Expertise in Transactive Memory System Networks
2.2 Signals and Signaling Games
2.3 Types of Signals
2.4 Equilibria
CHAPTER 3: CONSPICUOUS CONNECTIONS AS SIGNALS
3.1 Conspicuous Consumption
3.2 Connections as Signals
3.3 Conspicuous Connection
3.4 Conceptual Signaling Networks
CHAPTER 4: METHOD
4.1 GitHub as the Empirical Site
4.2 Operationalizing Concepts
4.3 Analyses for Hypotheses
CHAPTER 5: RESULTS
5.1 Descriptive Statistics on GitHub
5.2 Results for Individual-Level Hypotheses
5.3 Results for System- and Network-Level Hypotheses
CHAPTER 6: DISCUSSION
6.1 Implications
6.2 Limitations
6.3 Future Work
6.4 Conclusion
REFERENCES
LIST OF TABLES
Table 2.1 Normal Form of a Lewis Signaling Game
Table 3.1 Categories of Conspicuous Connection Signals
Table 4.1 Concepts and Operationalizations
Table 5.1 Example Contingency Table for JavaScript – Signaling and Expertise
Table 5.2 Comparing Zero-Inflated Negative Binomial Models for Hypothesis 1(a), DV = contribution requests (number of pull request assignments), n = 430,306
Table 5.3 Comparing Negative Binomial Models for H1(b), DV = future connections, n = 430,306
Table 5.4 Hypothesis 1(c): Logistic Regression Odds Ratios (OR) for 76 Language Groups with Informative Signaling, n = 140,415
Table 5.5 Comparing Negative Binomial Models for H1(d), DV = content spread, n = 140,415
Table 5.6 Welch’s t-tests comparing mean number of CC signals of experts vs. non-experts (H2)
Table 5.7 Random Effects Model Meta-Analysis for H2, Effect: Mean Difference Between Expert and Non-Expert Signaling
Table 5.8 Comparing Negative Binomial Models for H4, DV = post signal performance, n = 318,783
Table 5.9 Summary of Results for All Hypotheses
LIST OF FIGURES
Figure 2.1 A General Communication System (Shannon & Weaver, 1949, p. 7)
Figure 2.2 Extensive Form of a Lewis Signaling Game
Figure 3.1 Hypothetical Social Relation Network
Figure 3.2 Distinction Between the Source-Target of the Connection and Sender-Receiver of the Signal
Figure 3.3 An Example Profile on GitHub
Figure 4.1 The GitHub Flow (reproduced from “Understanding the GitHub Flow · GitHub Guides,” n.d.)
Figure 5.1 Top 20 Programming Languages on GitHub in 2013
Figure 5.2 Histogram of Information in Signals
Figure 5.4 Residuals and Q-Q plot for H3 Normal Linear Model
Figure 5.5 Residuals and Q-Q plot for H3 Box-Cox Transformed Linear Model
Figure 5.6 Plot of Non-Expert Proportion against Mean Number of Expert CC Signals with linear regression line (in blue)
Figure 5.7 Assortativity Coefficient Histogram
ABSTRACT
To create complicated products or learn new things, people must access the expertise of
others, which can be difficult (Fulk & Yuan, 2013). Many online spaces facilitate knowledge
sharing or the use of that expertise to create new information or products. The problem is
how to find an expert who meets or exceeds those expertise needs. People develop ideas about
the expertise of others by learning about their attributes, such as who knows what and who
knows who (Hollingshead, 1998; Wegner, 1995). This research seeks to understand how
communication behavior in a network about others’ networks may lead to different outcomes in
online communities at the individual, group, and system levels.
Signaling theory (Donath, 2007; D. Lewis, 1969; Spence, 1973; Zahavi, 1975) provides a
foundation to understand how people communicate to resolve information asymmetries. In its
simplest form, a sender has some private information about him/herself or the environment that
is unobservable to a receiver. The sender can send a signal to the receiver to try to alter the
receiver’s beliefs regarding that unobservable information, and ultimately change the receiver’s
behavior. Ideally these signals are honest, but sometimes signals are deceptive. An education is a
classic example of a reliable signal to employers that a job-seeker will be a high-quality
employee (Spence, 1973). In other words, education honestly indicates the high quality of a
potential employee, a quality that is not directly observable prior to hiring. Contingent on costs,
however, low-quality workers could hijack this signal to try to deceive employers into believing
that they will be of high quality. Another example is conspicuous consumption, in which individuals
signal their status and wealth by purchasing luxury goods that are of the same quality as other
goods but cost much more (Bliege Bird & Smith, 2005).
In its current form, signaling theory provides a way to understand when and how signals
can reliably distinguish between different types (e.g., high-quality vs. low-quality; rich vs. poor;
etc.), with little overall threat from deceivers (e.g., those of low quality pretending to be high
quality through signals). This study extends signaling theory in three ways: 1) to explore the
signals that serve the purpose of conspicuous connection in online spaces; 2) to explain signaling
in the context of the networks that are formed based on this conspicuous connection, linking this
to transactive memory systems theory; and 3) to differentiate between types and levels of
conspicuous connection signals to understand individual and network outcomes.
Specifically, senders in online communities may or may not be able to reliably signal to
sets of receivers in different ways. Expanding on the work of Donath and boyd (2004), Shami et
al. (2009), and Lin, Prabhala, and Viswanathan (2013) that conceptualizes connections as signals
in online networks, this study proposes to examine conspicuous connections as signals. Like
Donath and boyd (2004), who focus on connections as reliable signals that corroborate identity
claims, this study treats connections as signals, here of another person’s type and level of
expertise in a particular knowledge area. Conspicuous connection signals can include information about the size of
senders’ multiplex ego networks, the composition of their ego networks, and the informational
resources contained in their ego networks. Signals about ego networks can change receivers’
behaviors, like with whom these receivers connect or from whom they obtain information and
contributions. People can copy others’ signaling strategies (i.e., replicated signals) or try new
signaling strategies (i.e., mutated signals). In this evolutionary context, some signals will be
selected because of their ability to reliably distinguish between types. Other signals will
eventually die out or may not be used for their discriminating functionality.
Both individual-level and network-level hypotheses examine the development of large-
scale TM systems through signaling in networks, about networks. Six hypotheses are explored
by analyzing data from an open-source software programming online community, GitHub. Using
both traditional statistical and social network analysis techniques, results point to the way
conspicuous connecting impacts with whom people connect in the future, which contributions
they request and accept, and how they form and manage large, dispersed transactive memory
systems to build complicated products. In particular, information in signals has a large impact on
expected counts of future followers, content spread, and post signal performance for individuals.
CHAPTER 1: INTRODUCTION
How do people in large systems with hundreds of thousands to millions of people, like in
many social media platforms and online communities, work together and coordinate? For
example, building software is a very technical and expertise-oriented undertaking that requires
that people know where to find others who have the skills to help on specific aspects of the project.
On GitHub, a social media software coding website, people are able to find those experts every
day and build open source software with the contributions of people who are scattered all over
the world. This study utilizes the context of software development on GitHub to understand how
people communicate using their network connections to indicate their expertise efficiently.
Organizing to solve complicated problems, run large businesses, or build complex
products requires different kinds of expertise, which is distributed among different people in a
system (Monge & Contractor, 2003, p. 91). However, it is often difficult and costly to locate
experts (Fulk & Yuan, 2013). Despite the difficulty, successful projects must harness, mobilize,
and exploit people’s varied expertise, especially in efforts like software development that
involves “knowledge work” where the “most important resource is expertise” (Faraj & Sproull,
2000, p. 1554). Open-source social coding websites, where people voluntarily join and contribute
their software programming knowledge and skills, are highly distributed in terms of expertise
because of their large size, geographical dispersion, and lack of traditional organizational
structure. As such, successfully building a collective body of knowledge that is useful and can be
used to create a final deliverable also involves determining ‘who knows who’ and ‘who knows
what’ (Hollingshead, 1998; Leonardi, 2014; Wegner, 1995).
Communication between people is one seemingly simple way to resolve the difficulty in
finding experts by impacting perceptions about collective knowledge networks (Hollingshead,
1998; Wegner, 1995). But again, in very large groups or organizations, search strategies like
direct, one-on-one communication may be impossible because of size and geographic
distribution (Leonardi, 2014). In addition, evidence reveals that people are aware that others’
claims of expertise are not always reliable; in other words, “a lot of people like to look or act like
they know something they don’t” (Leonardi, 2014, p. 804, quoting a participant). Because of
this, it is important for people to corroborate those expertise claims in some way. For example,
the participant quoted in Leonardi’s (2014) study dedicated time to read other messages and
content to determine whether people know what they claim to know.
However, people may not always be able to ascertain beforehand if content or
information given by a contributor will be of high quality. Those searching for information or
expertise are, after all, asking others for that knowledge, so they may not have the ability, time,
or effort to evaluate the contribution accurately and efficiently, particularly in large, dispersed
communities. The main research question for this study is, what kind of communication can help
people coordinate in large, dispersed, online communities? In particular, what kind of
communication can help people learn about who knows what so that they can find the right
experts to build complicated products?
When people lack the ability or time to evaluate contributions directly, they can instead evaluate
the environment around them through signals. Signals, defined as encoded messages that indicate
some unobservable quality and that can then change a receiver’s perceptions or behaviors, can help
corroborate whether someone is an expert before information provision, requests for contribution,
or connection occurs, rather than requiring receivers to discern the quality of contributions directly after provision.
Donath and boyd (2004) contend that “public displays of connection” are signals that can
authenticate people’s claimed identity on social network sites (SNSs). This study contends that
these displays of connection, or conspicuous connections, may also corroborate other claims that
may be unobservable, such as claims about expertise. Connections must be conspicuous to be
signals, because they must be visible at some level to others so that they impact their behavior
(Connelly, Certo, Ireland, & Reutzel, 2011).
Signaling theory was developed to understand how two or more parties reduce
information asymmetries (Connelly et al., 2011), and stems from evolutionary biology (Zahavi,
1975), philosophy (D. Lewis, 1969), and economics (Spence, 1973), with more recent
applications to online contexts (Donath, 2007). At its simplest, signaling theory involves a sender
who has private information and sends a signal to a receiver with the goal of altering the
receiver’s beliefs and ultimately behaviors and actions (Connelly et al., 2011; Lachmann,
Számadó, & Bergstrom, 2001; Maynard-Smith & Harper, 2003). A signal is an encoded message
that serves the purpose of influencing one or more receivers by transmitting information about an
unobservable quality (a definition that is explored in greater detail in Chapter 2). Receivers, who
cannot directly observe the private information, obtain and interpret signals with the goal of
acting in ways that make them better off, such as making more informed decisions (Connelly et
al., 2011). Ideally signals are reliable and honest, meaning the observable signal is correlated
with the unobservable quality that it indicates (Számadó, 2011), but sometimes signals are
deceptive (Ott, Cardie, & Hancock, 2012).
For signaling theory to be an effective theoretical framework for examining empirical
phenomena, at least two parties must have access to different information (Connelly et al.,
2011). This condition applies to many human communication contexts including labor markets
(Spence, 1973), used car markets (Akerlof, 1970), and online environments (Donath, 2007;
Flanagin & Metzger, 2013; Shami et al., 2009). Information asymmetries occur when a person,
group of people, organization, or other entity does not have information that others have
(Connelly et al., 2011). Information asymmetry among people, essentially meaning “different
people know different things” (Stiglitz, 2002, p. 469), is also a feature of transactive memory
(TM) systems, where specialization aids efficiency. Individuals both with and without
information can be motivated to communicate to resolve those information asymmetries
(Skyrms, 2010; Spence, 2002). Those who do not have information often want to obtain
information that others have to make better decisions or coordinate better, among other
outcomes. Those with information can also receive benefit by communicating to others if they
can convince them to act in ways beneficial to them.¹ In the case of teams working toward
building a product like software, team members and managers would want to resolve information
asymmetries to achieve coordination, effectiveness, and project success at the team level and
increased reputation at the individual level.

¹ There are also incentives for people to withhold information when they can retain an advantage
by maintaining an information asymmetry. However, when incentives or payoffs are lined up
such that the sender’s success is dependent at least partially on the receiver’s success, then
withholding information or being deceptive are usually not good strategies for the sender.
Traditional signaling games (D. Lewis, 1969) assume aligned interests.

Akerlof’s (1970) analysis of information asymmetry in car sales markets was one of the
first in the body of research on the economics of information to explore how imperfect
information may lead to different outcomes compared to the previous models that assumed
perfect information (Stiglitz, 2002). The study starts with the premise that some car dealers sell
cars that are “lemons” (i.e., low-quality) and some sell cars that are high-quality, but in Akerlof’s
model only the dealers know what type of car they are selling. Because buyers do not have
information on car type, they are willing to pay only the midrange between what they would pay
for a high-quality car and a low-quality car. Akerlof’s analysis concluded that in this situation,
eventually only low-quality cars would be sold, because high-quality cars would be unable to
obtain the price that they deserve and would be pushed out of the market. This is called the
adverse selection problem, where information asymmetries result in only low-quality types
staying in a market. However, Akerlof’s model and conclusion relied on the assumptions that
good-quality car sellers could not convey information about the quality of their cars and buyers
could not distinguish good-quality from bad-quality cars, assumptions that are also generally
unrealistic (Stiglitz, 2002).
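
The unraveling logic described above can be illustrated with a minimal sketch. It is not Akerlof's formal model: the function name, the two car types, the dollar values, and the 50/50 share of types are all hypothetical assumptions chosen only to make the mechanism concrete. Buyers offer the average value of whatever types are still on the market, and any seller whose reservation price exceeds that offer withdraws.

```python
def lemons_market(buyer_value, seller_reserve, share, max_rounds=10):
    """Repeat buyer offers until the set of car types on the market stops changing.

    All inputs are illustrative assumptions, keyed by car type:
      buyer_value    - what a buyer would pay if the type were known
      seller_reserve - the lowest price a seller of that type accepts
      share          - the fraction of all cars that are of that type
    """
    on_market = set(buyer_value)
    history = []
    for _ in range(max_rounds):
        if not on_market:
            break
        # Buyers cannot tell types apart, so they offer the average value
        # of the types that are still being sold.
        total_share = sum(share[t] for t in on_market)
        price = sum(share[t] * buyer_value[t] for t in on_market) / total_share
        history.append((sorted(on_market), price))
        # Sellers whose reservation price exceeds the offer leave the market.
        staying = {t for t in on_market if seller_reserve[t] <= price}
        if staying == on_market:
            break
        on_market = staying
    return on_market, history


# Hypothetical numbers: half of all cars are high quality.
final_types, rounds = lemons_market(
    buyer_value={"high": 10000, "low": 4000},
    seller_reserve={"high": 8000, "low": 3000},
    share={"high": 0.5, "low": 0.5},
)
for types, price in rounds:
    print(types, price)
```

With these assumed numbers, the first offer is the midpoint of the two buyer values, which falls below the high-quality sellers' reservation price, so only low-quality cars remain in the second round.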
Spence (1973) went a step further by examining how to resolve information asymmetries
through signaling, this time in a different context: the labor market. Spence focused on a specific
signaling mechanism, obtaining an education, that would allow employers to determine whether
a job seeker would be a high-quality employee and thus deserving of a higher wage. Obtaining
an education is assumed to be more costly for low-quality employees than for high-quality
employees, so low-quality employees would actually prefer to forgo an education and accept a
lower wage rather than incur the higher cost of education. Consequently, an education signal
(assumed to be visible; Spence, 2002) is a reliable indicator, separating high-quality from
low-quality employees. This example illustrates a separating equilibrium (explored in more detail in
the next chapter), where those of high quality signal and those of low quality do not.
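
The condition behind this separating equilibrium can be sketched as a simple check. This is an illustration under stated assumptions rather than Spence's full model: the function name, the wages, and the education costs are hypothetical, and ties are treated as weakly favoring the equilibrium action. Separation holds when the wage premium for education covers the cost for high-quality workers but not for low-quality workers.

```python
def is_separating(wage_educated, wage_uneducated, cost_high, cost_low):
    """Return True if only high-quality workers find education worthwhile.

    All arguments are illustrative assumptions:
      wage_educated / wage_uneducated - wages employers pay with and without the signal
      cost_high / cost_low            - cost of education for each worker type
    """
    premium = wage_educated - wage_uneducated
    high_type_signals = premium >= cost_high   # education pays off for the high type
    low_type_abstains = premium <= cost_low    # education does not pay off for the low type
    return high_type_signals and low_type_abstains


# Hypothetical wage premium of 40 in each case.
separating = is_separating(100, 60, cost_high=20, cost_low=50)
everyone_signals = is_separating(100, 60, cost_high=20, cost_low=30)
nobody_signals = is_separating(100, 60, cost_high=45, cost_low=50)
print(separating, everyone_signals, nobody_signals)
```

In the second case education is cheap enough that low-quality workers would also obtain it, and in the third it is too costly even for high-quality workers. In both cases the signal no longer distinguishes between types.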
Like car sales markets and labor markets, other knowledge-intensive social processes can
be examined through the lens of signaling, such as collaborative open-source software
development. This study examines how signals about relationships and connections may act to
corroborate claims about expertise in this context. In other words, some conspicuous connections
can separate the real experts in specific software languages from the amateurs or deceivers.
Conspicuous means there is some level of signal observability or visibility (Connelly et al.,
2011), and connection means these signals are about aspects of an individual’s or other entity’s
network. Along with establishing conspicuous connections as signals, these signals are examined
as they exist in networks of people, expertise claims, and content.
Conspicuous connection signals are theorized to facilitate efficient and effective
outcomes through the development of transactive memory systems in large, dispersed groups.
Transactive memory systems, or “the shared division of cognitive labor with respect to the
encoding, storage, retrieval, and communication of information from different domains” (K.
Lewis & Herndon, 2011, p. 1254), have been examined from a network perspective (Borgatti &
Cross, 2003; Contractor & Monge, 2002; Monge & Contractor, 2003; Palazzolo, Serb, She, Su,
& Contractor, 2006). The perspective adopted in this study considers the development and
functioning of transactive memory systems as a networked signaling process about expertise and
information sharing. Using signaling theory can help elucidate the causal mechanisms at work
when larger, more elusive, or transitory groups behave in ways that resemble TM systems.
Larger groups are theoretically expected to have less developed TM systems in terms of
“accuracy of expertise recognition and … knowledge differentiation” (Palazzolo et al., 2006, p.
243), but including signals may help account for situations in which well-functioning TM
systems exist in larger groups. Combining signaling theory with transactive memory systems
theory can help uncover network- and system-level outcomes by starting with individual-level
communication and actions.
The rest of this dissertation expands on this introduction, with a chapter on the previous
literature and theoretical background, a chapter building the theory of signaling through
conspicuous connection as a causal mechanism in forming and maintaining large TM systems, a
methodological chapter, a chapter on the empirical results, and a chapter discussing the
implications of this work and potential for future work. Chapter two reviews the literature to
understand expertise from a signaling perspective as it can be examined in the context of
transactive memory systems theory and knowledge networks. The first section overviews the
literature on expertise in networks, particularly in expert communities and transactive memory
systems. The next section explains Shannon’s mathematical model of communication to
understand what a signal is, followed by a review of Lewis signaling games to show that
communication as signals can be regarded not just through a transmission lens, but also as a
constitutive and social process between two or more people. Next, the different categories of
signals as established in the literature are reviewed. The next section reviews equilibria that
either separate or pool entities with different characteristics, which is important for understanding
macro-level outcomes.
Chapter three builds the theory of conspicuous connections as signals. The concept of
conspicuous consumption as costly signals is reviewed before making the comparison to
conspicuous connections. The theory is reinforced by examining previous research exploring
connections as signals. The more nuanced details of conspicuous connections are developed,
including distinguishing between high visibility and low visibility and high information and low
information conspicuous connections. Additionally, receivers may themselves require different
levels of expertise depending on their knowledge and/or information requirements, so signaling
can be a way both to separate high-quality experts from non-experts or deceivers and to match those
experts to appropriate receivers. Last, to understand this kind of signaling from a network
perspective, the multimodal and multiplex networks of interest are explicated, including the
person-expertise network, the person-person network, and the person-content network, which
together comprise the networked signaling involved in large-scale TM systems.
Chapter four discusses the context and the methodology used to test the hypotheses laid out in
chapter three by examining a social media coding website, GitHub, using digital data that are
available through an openly accessible archive. This conspicuous connection signaling context
provides an opportunity to examine individual behavior and group outcomes in a large,
distributed system of intensive knowledge work to see if the constructs of interest are empirically
linked in the theoretically proposed ways. The chapter describes the context, discusses the
specific variables that operationalize the concepts in the hypotheses, and then details the analyses
conducted that lead to the results.
Chapter five presents the empirical results from the observational data analysis, including
both descriptive analysis of the data as well as the inferential analysis of the hypotheses of
interest. The hypotheses are explored through both traditional statistical and network analysis
techniques. The results highlight the influence of two concepts associated with conspicuous
connection signals, visibility and information, on outcomes such as future followers, content
spread, contribution requests, and team success. Chapter six concludes this study with a
discussion on the implications, limitations, and areas for future research. In sum, information
theory, signaling theory, social network theory, and transactive memory systems theory are
combined in this work to understand how people search for and find experts in large online
networks.
CHAPTER 2: EXPERTISE AND SIGNALING
This chapter provides a background on communicating about expertise in social media
and online communities from a signaling perspective. The first section overviews the concept of
expertise itself, exploring the tension between social closure in communities of experts compared
to diversity and specialization in transactive memory (TM) systems. The next section defines
signals in the context of information theory and signaling games. The third section explores how
different types of signals are categorized in the literature. The last section in this chapter explores
different equilibria to understand outcomes at macro levels.
2.1 Expertise in Transactive Memory System Networks
The motivating research question for this study is, what type of communication helps
people find experts easily and efficiently in large, dispersed, online spaces where expertise is
difficult to ascertain through simple one-on-one communication? To answer this question, first the concept
of expertise must be examined in terms of what it is and how it is found in different kinds of
systems.
Expertise is “the most critical resource” whenever knowledge is the commodity of
interest (Faraj & Sproull, 2000, p. 1554; see also Palazzolo et al., 2006). Successfully and
efficiently sharing knowledge to organize and solve complicated problems and build complicated
products requires that participants in a system are able to locate experts (Fulk & Yuan, 2013), but
expertise itself is often unobservable to others. Because of this, locating experts can be difficult
(Fulk & Yuan, 2013), but one way to find experts is through communication, which impacts
group members’ perceptions about collective knowledge networks (Hollingshead, 1998; Wegner,
1995).
Expertise is defined and explored differently depending on the disciplinary tradition.
While some scholarship generally considers expertise as a trait of exceptional individuals
without examining the context within which their exceptionalism occurs (Ericsson & Smith,
1991), other traditions emphasize context to understand how expertise develops and functions in
society, often examining expertise situated and institutionalized in professions (Evetts, Mieg, &
Felt, 2006; Mieg, 2006). From this perspective, expertise is inherently relational, meaning others
determine whether someone is an expert by comparing that person to others (Evetts et al., 2006;
Fulk, 2016; Leonardi & Treem, 2012) and through communication or “visible performances of
knowledge without interrogation of work practice” (Treem, 2012, p. 43). Claiming expertise is
contextual and dynamic, in that any person can act as an expert in certain situations, but they
may lose their expert status in other situations and contexts (Mieg, 2006).
Claiming expertise is also a form of social closure or the exclusion of non-experts by
those who claim to be experts (Evetts et al., 2006). Expertise itself can be considered “the result
of successful socialization within a particular community, which gives a sociological definition
of expertise as social fluency within a form-of-life” (Evans, 2008, p. 283). Some combine the
individual trait approach and the relational approach by saying that expertise is an individual’s
possessed knowledge in relation to a group or task, defining it as “the specialized skills and
knowledge that an individual brings to the team’s task” (Faraj & Sproull, 2000, p. 1555). These
definitions, descriptions, and claims of expertise incorporate ideas of relationships between
people and participation in groups of like-minded others. Expertise communities involve
networks of people, both actual and perceived, and those actual and perceived networks of
people can have intersections and divergences.
Transactive memory systems (TM systems) theory also focuses on networked
relationships of people with expertise but, instead of expertise similarities, focuses on the
distribution of the varied expertise of people within groups and how this specialization or
division of labor is harnessed to enhance group performance (Hollingshead, 2001; K. Lewis &
Herndon, 2011). A TM system involves “group information-processing” (J.-Y. Lee, Bachrach, &
Lewis, 2014, p. 951), where the individuals within the group “coordinate their information
retrieval and distribution… that transcends the sum of the individual parts” (Yuan, Fulk, &
Monge, 2007, p. 132). Rather than relying only on one’s individual mental faculties, a TM
system requires “the shared division of cognitive labor with respect to the encoding, storage,
retrieval, and communication of information from different domains” (K. Lewis & Herndon, 2011,
p. 1254).
There are two components of TM systems theory: (1) the “individuals’ mental models of
their ego-centered networks” (Jarvenpaa & Majchrzak, 2008, p. 261; see also Akgün, Byrne,
Keskin, Lynn, & Imamoglu, 2005) consisting of their metaknowledge of label and location of
information or expertise (Ren & Argote, 2011, p. 192), and (2) the development of the shared group
cognition regarding who knows what and who knows who so that information can be encoded,
stored, and retrieved efficiently (Hollingshead, 1998; Ren & Argote, 2011). A strength in a
transactive memory system as a whole is in the informational diversity and access to that
informational diversity (K. Lewis & Herndon, 2011), whereas expert communities often result in
information redundancies and social closure (J.-Y. Lee et al., 2014).
The development of both individuals’ mental representations of the network of possible
information sources within a group, organization, or other set of people and the collective
transactive memory system as a whole involves interaction and communication between system
participants (Palazzolo et al., 2006); as such, memory is “an evolving construct, changing as new
information is added and old memories are reinterpreted, and as areas of expertise are created or
reassigned,” and this changing memory occurs because of transactions or communication
(Hollingshead & Brandon, 2003, p. 609). Directory updating, one of the three “generative
mechanisms” in transactive memory systems,² where people determine who has expertise in a
group or network, occurs through making judgments based on some form of observation or
communication (Palazzolo et al., 2006; Wegner, 1995). However, because individuals often
focus on shared or common information held by more than one member rather than the unique
information held only by one (Stasser & Titus, 2003), TM systems research suggests that social
closure through frequent reciprocal ties may result in less developed transactive memory systems
(J.-Y. Lee et al., 2014). Therefore, there is an inherent tension between tightly-knit expertise
networks and loosely-connected TM system networks. Additionally, dispersed and large
collectives face even more challenges in knowledge-sharing and teamwork because of
employees’ difficulty in knowing who knows who and who knows what, lack of colocation, and
unfamiliarity/reluctance, among other issues (Ellison, Gibbs, & Weber, 2015).
² Palazzolo et al. (2006, p. 225) identify “communication to allocate information” and
“communication to retrieve information” as the other two generative mechanisms.
Signaling theory provides a systematic way to theoretically resolve this tension between
the transactions (i.e., communication) necessary between diverse others in developing larger TM
systems and the tight clusters within which experts are embedded. Different types of signals,
such as the network connections established in the next chapter as important signals of
individuals’ different areas of expertise, can help people resolve this tension in systems that
include both locally tight clusters and more broadly diverse groups of experts. The next sections
review information theory and signals as a foundation to understand the development and
functioning of transactive memory systems through signaling.
2.2 Signals and Signaling Games
Shannon (1948) proposed a general model of communication, and Figure 2.1 provides the
schematic diagram of a general communication system, involving a source, transmitter, channel,
receiver, and destination. Using the definition of communication provided at the beginning of
Weaver’s (1949) chapter as “all of the procedures by which one mind may affect another” (p. 3),
in this communication system the source is the mind that affects the destination.
Figure 2.1 A General Communication System (Shannon & Weaver, 1949, p. 7)
In the system illustrated in Figure 2.1, a source chooses a message among a set of
possible messages.³ This message is sent to a transmitter that encodes this message into a signal
or set of signals that can be sent via a communication channel, which may encounter noise. The
signal is received by a receiver that decodes the signal back into a message, and finally the
destination receives the decoded message. Weaver explained how this can move beyond only
electrical or man-made systems with this example: “When I talk to you, my brain is the
information source, yours the destination; my vocal system is the transmitter, and your ear and
the associated eighth nerve is the receiver” (Shannon & Weaver, 1949, p. 7). In this case,
thoughts (message) are encoded into words and phrases (signal) that can be decoded and
interpreted by the other person (if the language is understood by both parties).⁴ Because “all
communication involves some sort of encoding of messages” (Pierce, 1980, p. 78), information
theory applies not just to technical communication systems but also when trying to understand
how people communicate with each other using different messages and signals and through
different communication channels.
³ In Shannon’s model of communication, messages are not yet encoded, unlike in other
communication theories where messages themselves are encoded using symbols and signs.
⁴ While Shannon (1948) viewed this process as an engineering problem, Weaver’s (1949)
interpretation involved generalizing this beyond only technical considerations and into semantic
and effectiveness considerations, as well. Weaver (1949) notes that, in addition to the technical
receiver in the system, one could include a semantic receiver, where a person actually interprets
the meaning of a message.
Signals are made up of either symbols or signs (McGlone & Giles, 2011), where signs
are causally related to the concept or object of interest (e.g., a pictogram) and symbols are
arbitrary indicators of some physical entities or abstract concepts (e.g., a word or set of words,
excluding onomatopoeias) (Pierce, 1980, p. 293). In information theory, a message can be
thought of as a concept or thought that has yet to be encoded into a decipherable form, and a
signal or set of signals encodes a message into a form that can be transmitted to another party.
Evolutionary biologists include directedness (i.e., toward the destination) as a defining quality
of a signal, where the signal “serves to influence the behavior of [the receiver] by
transmitting information” (Lachmann et al., 2001, p. 13190; see also Maynard-Smith & Harper,
2003, for a more detailed review of signaling from this perspective). As such, signals are defined
here as part or all of a message that has been encoded into transmittable symbols and/or
signs that then influence one or more receivers by transmitting information.
At the center of information theory is of course information, defined as anything that
reduces uncertainty about alternatives in a given context. In the context of message transmission
and/or message meaning, information is defined as the logarithm of the number of possible
message choices or meaning interpretations (Pierce, 1980; Shannon, 1948; Shannon & Weaver,
1949). When the logarithm is calculated at base two, information is measured in terms of binary
digits (“bits,” a shortened combination of binary and digits; Pierce, 1980; Shannon & Weaver,
1949).⁵ This measure of the amount of information is the complement of uncertainty, also known
as entropy. The more entropy or uncertainty exists, the greater the potential for information. The
less entropy or uncertainty that exists in a given context, the less potential there is for
information to reduce uncertainty. In other words, if little uncertainty or entropy exists, there is
little information left to complete the puzzle (Shannon & Weaver, 1949, p. 9).
⁵ Logarithms are inverse exponents. For example, log2(4) asks what power the number 2 (the
logarithm’s base) must be raised to in order to equal 4 (i.e., what is x when 2^x = 4). In this
example, log2(4) = 2.
When there are two possible message choices, log2(2) = 1, so one bit of information is needed to
represent a two-choice situation. This quality of logarithms with a base of two means that every
time the number of possible message choices doubles, the number of bits goes up by one. Stated
differently, under equally probable conditions, a transfer of one bit of information reduces the
number of possible alternatives by half. In other words, calculating the log at base 2 provides
the number of yes or no questions needed to determine which alternative was chosen.
Information and entropy can be defined from the perspective of both the source and the
receiver. For the source, entropy is a measure of one’s freedom of choice (with more message
choices there is more freedom in choosing which message to send; with fewer messages there is
less freedom). From the destination’s perspective, entropy is the “amount of surprise” (Mitchell,
2009, p. 54), meaning how surprising a particular message is once it is sent (with more message
choices, there is more surprise; with fewer, there is less surprise). It also quantifies how much
information a signal carries from a source to that destination. In a communication system, if it is
equally likely that all signals are sent, then there is full freedom of choice and the highest entropy
and also the most information contained in each signal; if one signal is certain to be sent, then
there is no freedom of choice, no entropy/uncertainty, and no information contained in the signal
because all parties know it will be sent (Shannon & Weaver, 1949, p. 15). For example, if a
person can only say “Hold the door!” and no other phrases are possible no matter the
circumstance, then there is no freedom of choice and the phrase “Hold the door!” contains no
information. It is certain that this person always says that one phrase, thus there is no entropy;
there is no freedom for the sender and there is no surprise for the receiver. On the other hand, if
the person can say either “Hold the door!” or “Winter is coming,” then there are two message
choices, and there is some information (to be exact, there is log2(2) = 1 bit of information), some
surprise, and some uncertainty. As the number of possible message choices increases, the amount
of information, surprise, or entropy also increases. When one of the many messages is sent and
received, information is transmitted by the sender and uncertainty is reduced for the receiver. In
a social online project development environment, uncertainty exists about who has what
knowledge and skills and who is an expert in those areas. There could be many possible ways to
convey this information, thus uncertainty is high but the amount of information conveyed once a
message is sent is also high.
Both messages and signals contain information, but the information in each can differ when the
channel lacks sufficient carrying capacity or introduces noise. As Shannon and Weaver (1949)
state:
The statistical nature of messages is entirely determined by the character of the source.
But the statistical character of the signal as actually transmitted by a channel, and hence
the entropy in the channel, is determined both by what one attempts to feed into the
channel and by the capabilities of the channel to handle different signal situations. (pp. 17-18)
For example, a source may be very complex, like a human wishing to send many
different messages related to all facets of life, imagination, and/or society, or a source may be
very simple like a bacterium, only wanting and able to send a few messages related to its
survival. However, despite a human’s ability to think of many different messages he or she
wishes to send, the communication channel in which he or she communicates may only allow
him or her to use, for example, hand gestures as signals if there is no capacity for sound (e.g., a
thumbs-up or thumbs-down). On the other hand, if the communication channel involved directly
accesses the human mind’s thoughts (i.e., mind reading), then there would be no loss of
information when converting a message to a signal because they would be equivalent (in fact,
there would be no conversion), and there would be no loss of information in the communication
channel (e.g., the thumbs up or thumbs down would instead include all details on the thoughts,
attitudes, beliefs, and feelings on the topic at hand).
In the case where the communication channel has certain constraints, “the signal entropy
is exactly equal to the channel capacity” (Shannon & Weaver, 1949, p. 18). Channel noise adds
uncertainty and thus information, but this is spurious information unrelated to the message
(Shannon & Weaver, 1949). From the perspective of an honest source and destination, this
would be undesirable uncertainty (on the other hand, a third party may wish to disrupt
communication or the source may wish to deceive, so noise could be desirable in this case,
illustrating the subjective normativity of elements of the system).
Information theory, or, as some call it, the transmission model (because it describes the transmission of
information from one mind to another), is a debated paradigm from which to study
communication. Craig (1999), for example, says that many scholars have heavily attacked the
transmission model, calling it “philosophically flawed, fraught with paradox, and ideologically
backward” (p. 125) in part because these message-driven theories do not “regard the human
participants … as capable of making up their own meanings, negotiating relationships among
themselves, and reflecting on their own realities” (Krippendorff, 1993, p. 35).
Additionally, some scholars argue that information theory not only does not (as admitted by
Shannon) but cannot account for meaning (Mitchell, 2009), a critical if not necessary concept in
understanding communication as a whole.
Lewis signaling games provide an answer to how extensions of information theory
combined with game theory can in fact “[conceptualize] communication as a constitutive process
that produces and reproduces shared meaning” (Craig, 1999, p. 125), especially at higher levels
of analysis, thus resolving some of the problems of information theory put forth by other
scholars. They also provide more insight into the directed and influential nature of signals, as
defined by evolutionary biologists (Lachmann et al., 2001). Table 2.1 provides an example of a
simple Lewis signaling game in normal form and Figure 2.2 provides an example in extensive
form, with two states of nature, two signals, and two acts (D. Lewis, 1969; Skyrms, 2010). The
extensive form provides a way to understand Lewis signaling games in sequence, while the
normal form represents all possible strategies in a matrix. Also, signaling games do not
conceptually separate signals and messages, effectively ignoring encoding by the transmitter and
any noise via the channel.
Table 2.1 Normal Form of a Lewis Signaling Game
State of Nature:        State 1             State 2
Receiver:           Act 1    Act 2      Act 1    Act 2
Sender: Signal 1    1, 1     0, 0       0, 0     1, 1
Sender: Signal 2    1, 1     0, 0       0, 0     1, 1
Figure 2.2 Extensive Form of a Lewis Signaling Game
In this signaling game, nature “chooses” one of two states. The state of nature could be a
property of the sender or the environment; for example, the state of nature could be that it is
raining outside, or more relevant to this study, it could be that the sender is an expert in some
area. For this example, the state of nature will be that the sender is either an Expert (State 1) or
Not Expert (State 2). The sender observes the state of nature (e.g., knows whether they are an
expert) and sends a signal (either Signal 1 or Signal 2) to the receiver, who cannot directly
observe the state of nature. After observing the signal, the receiver then chooses between one of
two acts. For this example, Act 1 could be to Request Information from the expert and Act 2
could be to Not Request Information from the expert. If the act that the receiver chooses corresponds
to the state of nature, both the sender and receiver get a positive payoff (see (1, 1) in Table 2.1
and Figure 2.2).
For example, perhaps the receiver upon receiving Signal 1 decides to do Act 1, which in
this example would be to Request Information; if the State of Nature is also State 1 (Expert), then
both the sender and the receiver will be positively rewarded because they can effectively transfer
knowledge from the sender (who is an expert) to the receiver (who wants the information). If
the act the receiver chooses does not correspond to the state of nature, both the sender and
receiver get no payoff (see (0, 0) in Table 2.1 and Figure 2.2).
For example, if the State of Nature is that the sender is Not Expert (State 2) but the
receiver Requests Information (Act 1), then neither will get a positive reward because they waste
time on the request and the response. The case where the receiver does Not Request Information
can be explored as well: if the state of nature is that the sender is Not Expert then both the sender
and receiver get a positive payoff (e.g., no wasted time on request or response so they can focus
on other activities), whereas if it is Expert then both the sender and receiver get no payoff (e.g.,
the receiver does not ask and the sender does not respond, so neither get the opportunity to learn
or contribute, respectively). Thus, in this game, it is in both the sender’s and receiver’s benefit to
coordinate with one another.
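The payoff structure described above can be written out as a small sketch. This is an illustration only: the strategy names and the assumption that the two states are equally likely (`p_expert=0.5`) are not taken from the source, and the strategies are simple fixed mappings.

```python
# The act that corresponds to each state of nature in the running example.
MATCHING_ACT = {
    "Expert": "Request Information",
    "Not Expert": "Not Request Information",
}


def payoff(state, act):
    """Return (sender payoff, receiver payoff): (1, 1) on a match, else (0, 0)."""
    return (1, 1) if MATCHING_ACT[state] == act else (0, 0)


def expected_payoff(sender_strategy, receiver_strategy, p_expert=0.5):
    """Average common payoff for fixed strategies.

    sender_strategy maps each state to a signal; receiver_strategy maps each
    signal to an act. p_expert is an assumed probability of the Expert state.
    """
    total = 0.0
    for state, prob in (("Expert", p_expert), ("Not Expert", 1 - p_expert)):
        signal = sender_strategy[state]
        act = receiver_strategy[signal]
        total += prob * payoff(state, act)[0]
    return total


# Hypothetical strategies, named here only for illustration.
informative_sender = {"Expert": "Signal 1", "Not Expert": "Signal 2"}
uninformative_sender = {"Expert": "Signal 1", "Not Expert": "Signal 1"}
heeding_receiver = {
    "Signal 1": "Request Information",
    "Signal 2": "Not Request Information",
}

print(expected_payoff(informative_sender, heeding_receiver))
print(expected_payoff(uninformative_sender, heeding_receiver))
```

When the sender uses a different signal for each state and the receiver responds accordingly, the act always matches the state; when the sender uses the same signal in both states, the receiver can match only one of them.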
As such, in the context of signaling games, “a signal is a specific type of physical
interaction, one in which the content of the interaction is determined by the sender, and it
changes the receiver’s behavior by altering the way the receiver evaluates actions” (Gintis, 2009,
p. 179, emphasis original). In addition, “a signal is… generally the result of a coevolutionary
process between senders and receivers in which both benefit from its use” (Gintis, 2009, p. 180,
emphasis original). This claim holds because, in such a system, senders who send signals
are better off than those who do not, receivers who heed signals are better off than those
who do not, and any “mutant” strategies (in evolutionary terms) will not have the chance to
propagate in the system. This only happens in systems where the state of nature is not certain
(e.g., the sender is not always an expert, no matter what), and where signaling can in fact reduce
uncertainty by providing the receiver with more information.
When senders learn to send signals that correspond with what receivers will do to match
with states of nature, together the sender and receiver have created shared meaning behind a
previously arbitrary signal. In the signaling game above, the choice of signal itself (Signal 1 or
Signal 2) does not matter, which is why it was not explored in the example. The signal may be a
claim of expertise, or, as expanded on in the next chapter, a conspicuous connection to other
experts, among many other possibilities. Instead of the form of the signal, what matters is that
the signal transmits information such that the receiver makes the correct decision on which
action to take, and this occurs through the evolutionary process. Skyrms (2010) explores this by
simulating iterative signaling games between senders and receivers. In the beginning, the sender
and receiver may not communicate properly, but over time they can learn to signal and act in
ways that coordinate with one another so that the State matches the Act. In this way, signals gain
meaning over time after repeated interactions.
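The idea that signals gain meaning through repeated interaction can be illustrated with a small simulation. The sketch below assumes a simple urn-style reinforcement rule, in which a choice that led to a matching act becomes more likely to be chosen again; the source does not specify a learning rule, so this rule, the function name, the number of rounds, and the equal state probabilities are all assumptions.

```python
import random


def simulate_learning(rounds=20000, seed=0):
    """Simulate a two-state, two-signal, two-act Lewis signaling game.

    Returns the final sender weights, receiver weights, and a list of
    booleans recording whether the act matched the state in each round.
    """
    rng = random.Random(seed)
    # sender[state][signal] and receiver[signal][act] are choice weights,
    # all starting equal so that no signal has any meaning at the outset.
    sender = [[1.0, 1.0], [1.0, 1.0]]
    receiver = [[1.0, 1.0], [1.0, 1.0]]
    successes = []
    for _ in range(rounds):
        state = rng.randrange(2)  # nature picks a state with equal probability
        signal = rng.choices([0, 1], weights=sender[state])[0]
        act = rng.choices([0, 1], weights=receiver[signal])[0]
        success = act == state
        if success:
            # Reinforce the choices that produced a positive payoff.
            sender[state][signal] += 1.0
            receiver[signal][act] += 1.0
        successes.append(success)
    return sender, receiver, successes
```

Comparing the share of successful rounds early in a run with the share late in the run, for example `sum(successes[:100]) / 100` against `sum(successes[-1000:]) / 1000`, gives a rough sense of how coordination develops under this assumed rule.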
For example, at time 1, the sender may observe and know that the State of nature is State
1, so the sender sends Signal 1, but then the receiver may do Act 2 instead of Act 1, resulting in
zero payoff for both, because the receiver has no reference for which signal corresponds with
which state. Then, at time 2, the sender may again observe State 1, again send Signal 1, and the
receiver may now learn to do Act 1 to get the higher payoff.⁶ In signaling systems, meaning is
thus not innate but rather convention developed iteratively (Skyrms, 2010), where signals
“acquire meaning” when senders and receivers “somehow find their way to an equilibrium where
information is transmitted” (Skyrms, 2009, p. 772). When meaning is transmitted, this “[creates]
optimizing solutions for signallers and receivers alike” (Bergh, Connelly, Ketchen, & Shannon,
2014, p. 1336), or in other words, “signals evolved because, on average, they increase the fitness
of the signaller by altering the behaviour of the receiver in a favourable way” (Számadó, 2011, p.
4).
⁶ This hypothetical example is likely unrealistic in terms of how quickly the sender and receiver
learn to communicate and establish the meaning of signals. It may take several iterations before
the sender and receiver learn how to interact to get positive rewards.
While Shannon’s theory provides a way to quantify the amount of information in a whole
situation to understand how to design communications systems, it is prudent to understand the
amount of information contained in a single signal for signaling games. This quantity is based on
conditional probability: the information in a signal about a state is equal to the logarithm of the
ratio of the probability of the state given the signal to the unconditional probability of the state
(Skyrms, 2010). If, for example, a signal provided no information about a state, then the
unconditional probability of the state would be equal to the probability of the state given the
signal (i.e., if x = y, then x / y = 1 and log2(1) = 0). Because one signal provides information not
just about one state but about other states as well, these values can be combined across all states,
each weighted by the probability of the state given the signal, to find the total information in the
signal. Using logarithms at base 2 expresses this figure in bits. Skyrms’ (2010, p. 36) formula is as
follows,
$$I(\text{signal} \mid \text{states}) = \sum_{i} P(\text{state}_i \mid \text{signal}) \cdot \log_2\left[\frac{P(\text{state}_i \mid \text{signal})}{P(\text{state}_i)}\right]$$
where $I(\text{signal} \mid \text{states})$ is the information in the signal given all the states,
$P(\text{state}_i \mid \text{signal})$ is the conditional probability of each state i given the signal, and $P(\text{state}_i)$
is the unconditional probability of each state i. For example, suppose there are two mutually
exclusive and exhaustive states, Rainy and Clear Skies. In Los Angeles, California, the
unconditional probability of rain is quite low; there were only 28 days with more than 0.1 inches
of precipitation in 2015 (“NOAA National Centers for Environmental Information,” 2015). To
simplify the example, the probability of state Rainy is set to 0.05, a round figure somewhat below
the observed 28 out of 365 days (about 0.08), and the probability of state Clear Skies to 0.95. However, if a
sender sends a signal, like the word “rain,” perhaps the probability of state Rainy conditional on
the signal is 0.75 (i.e., now there is a 75 percent chance that the state of nature is Rainy), while
the probability of state Clear Skies conditional on the signal is 0.25. The above formula can be
applied as follows:
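$$
\begin{aligned}
I(\text{"rain"} \mid \text{Clear Skies or Rainy}) &= \left(0.75 \cdot \log_2\left(\frac{0.75}{0.05}\right)\right) + \left(0.25 \cdot \log_2\left(\frac{0.25}{0.95}\right)\right) \\
&\approx 2.93 + (-0.48) \approx 2.45
\end{aligned}
$$
In this example, the signal “rain” contains about 2.45 bits of information about the state of nature when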
the state of nature is either Rainy or Clear Skies and the conditional probabilities as well as the
unconditional probabilities are taken into account. However, in signaling games there is not only
information about the state of nature, but also “information about the act that will be chosen”
based on the signal (Skyrms, 2010, p. 38). The same formulas can be applied, replacing “state”
with “act.” So, if a signal is sent and it changes the probabilities of an act occurring, then it
contains information about the act. In the example above, the signal would contain information
about whether the receiver brings the umbrella. If the probability that the receiver brings the
umbrella is the same regardless of whether the sender sends the signal, then the signal contains
no information about the act.⁷ If the sent signal changes the probability that the receiver brings
an umbrella, then the signal does contain information about the act and it can be calculated the
same as the information about the state of nature.
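The information quantity used in this section can be computed directly from the unconditional and conditional probabilities. The sketch below is an illustration, not part of the original analysis: the function name is hypothetical, and the probabilities are the ones from the rain example (0.05 and 0.95 before the signal, 0.75 and 0.25 after it). The same function can be applied to acts by passing act probabilities instead of state probabilities.

```python
import math


def information_in_signal(prior, posterior):
    """Information in a signal, in bits.

    prior maps each state (or act) to its unconditional probability;
    posterior maps each state (or act) to its probability given the signal.
    Each log ratio is weighted by the probability given the signal, which
    makes this the Kullback-Leibler divergence of posterior from prior.
    """
    total = 0.0
    for state, p_given_signal in posterior.items():
        if p_given_signal > 0:  # a zero-probability term contributes nothing
            total += p_given_signal * math.log2(p_given_signal / prior[state])
    return total


prior = {"Rainy": 0.05, "Clear Skies": 0.95}
posterior = {"Rainy": 0.75, "Clear Skies": 0.25}
bits = information_in_signal(prior, posterior)
print(round(bits, 2))

# A signal that leaves every probability unchanged carries no information.
print(information_in_signal(prior, prior))
```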
⁷ The signal may provide information about other kinds of acts, such as whether someone gets
wet that day, but the specific information about the act of bringing an umbrella is what is key
here.
Skyrms (2010) claims to take on critics who say that meaning does not have a place in
information theory by quantifying informational content (as opposed to quantity) through the
same logic as above. He quantifies informational content intuitively, capturing not just how much
a given signal changes the probabilities of states or acts, but also in which direction. To do this,
informational content is represented mathematically by a vector of values, one for each state (or
act), where each value is the log of the ratio of the probability of the state given the signal to the
unconditional probability of the state (Skyrms, 2010, p. 41). The elements of the vector can be
calculated as follows,
$$\left\langle \log_2\left[\frac{P(\text{state}_1 \mid \text{signal})}{P(\text{state}_1)}\right], \log_2\left[\frac{P(\text{state}_2 \mid \text{signal})}{P(\text{state}_2)}\right], \ldots, \log_2\left[\frac{P(\text{state}_n \mid \text{signal})}{P(\text{state}_n)}\right] \right\rangle$$
where $P(\text{state}_n \mid \text{signal})$ is the conditional probability of state n given the signal and $P(\text{state}_n)$ is
the unconditional probability of state n. In a hypothetical example where there are two equally
likely states, Rainy and Clear Skies, and the signal “rain” changes the probability of the state
Rainy to be 100 percent certain, then the vector is:
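$$
\begin{aligned}
\left\langle \log_2\left[\frac{P(\text{Rainy} \mid \text{"rain"})}{P(\text{Rainy})}\right], \log_2\left[\frac{P(\text{Clear Skies} \mid \text{"rain"})}{P(\text{Clear Skies})}\right] \right\rangle
&= \left\langle \log_2\left[\frac{1.0}{0.5}\right], \log_2\left[\frac{0.0}{0.5}\right] \right\rangle \\
&= \langle 1, -\infty \rangle
\end{aligned}
$$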
In this example, the first element in the vector, 1, indicates that the signal doubled the
probability of the state of nature Rainy (from 0.5 to 1.0, or 100 percent), and the second element in the
vector, −∞, indicates that the state of nature Clear Skies is now impossible (probability of 0)
given the signal. Skyrms explains that a proposition, or a statement with meaning or content, can
be considered as a set of states of nature that the true state of nature is either in or not in;
therefore, one can see that the vector provides a propositional statement. In the example, the
vector shows that, given the signal, the state of nature is definitely Rainy and absolutely not
Clear Skies. As such, not only can the amount of information in a signal be captured but also the
informational content, or in other words meaning.
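The informational content vector can be computed in the same way as the information quantity, keeping one log ratio per state rather than combining them. The following is a minimal sketch with a hypothetical function name; it uses the probabilities from the two-state example above and represents an impossible state as negative infinity.

```python
import math


def content_vector(prior, posterior):
    """Return one log2 ratio per state, in the order the states appear in prior.

    A state whose probability given the signal is zero is assigned negative
    infinity, meaning the signal rules that state out.
    """
    vector = []
    for state in prior:
        p_given_signal = posterior[state]
        if p_given_signal > 0:
            vector.append(math.log2(p_given_signal / prior[state]))
        else:
            vector.append(-math.inf)
    return vector


prior = {"Rainy": 0.5, "Clear Skies": 0.5}
posterior = {"Rainy": 1.0, "Clear Skies": 0.0}
vector = content_vector(prior, posterior)
print(vector)
```

A positive element means the signal raised the probability of that state, a negative element means the signal lowered it, and zero means the signal left it unchanged.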
2.3 Types of Signals
Encoding a message into an arbitrary signal or set of signals means that one could send
different signals to indicate the same message, particularly with access to different
communication systems. For example, if the sender wanted to influence a receiver to be chosen
for a sports team, a signal of physical strength (made up of symbols) could be the phrase, “I am
strong,” but another signal of physical strength (made up of a sign) could be to flex one’s
muscles. These different signals may indicate the same underlying message and unobservable
quality, but may also be perceived differently by the receiver because of the properties of each
signal. These properties, such as intentionality, relation to the underlying message, and cost, have
been used to understand different types of signals and how they may be related to whether a
sender sends a signal and whether a receiver reacts to a signal.
Some delineations of signals focus on intentionality, claiming that signals must be sent
purposefully while cues are unintentional (Donath, 2011). Skyrms’ (2010) claim, on the other
hand, is that this kind of theory of signaling supposes that the mind of an individual is thinking
about or of something, and that there is some sort of “mental representation” going on inside the
head of individuals (Siewert, 2016). A theory of signaling, however, does not need to consider
the mental life of the signalers, separating the mental from the physical (Skyrms, 2010, p. 43). In this
study, cues are defined as any information in the world that someone may use to inform how they
will act in the future (e.g., smoke billowing from a fire); signals are not differentiated from cues
based on intentionality, but rather because they evolved specifically because of their influence on
receivers’ actions and responses (Maynard-Smith & Harper, 2003, p. 3). In other words, senders
send signals because they influence others, regardless of whether they do so intentionally, and
the reason signals influence others is that they evolved specifically to do so.
Maynard-Smith and Harper (2003) provide an example of a spider that illustrates the
point more clearly. In their example, the “act of conveying information about size [of the
spider’s body] is a signal” (Maynard-Smith & Harper, 2003, p. 4). A spider’s size impacts the
amplitude of vibrations sent through its web, an action (by a sender) which evolved because of
its effect on intimidating smaller-sized spiders (receivers) so that they run away and the larger
spiders do not waste energy to fight, thus adhering to Maynard-Smith and Harper’s definition of
a signal. One does not need to consider the mental life of the spider, or any other being for that
matter, to accept that it is sending a signal (Skyrms, 2010). Another example is deepening the
pitch of a man’s voice to signal dominance or strength; this action may not be done consciously
or intentionally, but it likely evolved to influence others in a specific way (e.g., to obtain more
sexual partners) and is thus a signal (Puts, Gaulin, & Verdolini, 2006). Similarly, the shaking of
a voice during a speech can signal nervousness, even though the sender likely does not want to
convey this information. Intentional signals have received the majority of attention from
scholarly work in economics and human behavior research (Connelly et al., 2011; Spence, 2002).
Unintentional signals are often intrinsic, unalterable, or difficult to change, and are often sent
without the sender’s knowledge, but can nonetheless influence others (Spence, 2002) and may be an
avenue for future research (Connelly et al., 2011). There are thus two important characteristics of
signals regardless of their intentionality: (1) “signals carry information from a signaler to a
receiver,” and (2) “signals influence receiver behavior” (Lachmann et al., 2001, p. 13189).
One of the assumptions of Lewis signaling games is that senders and receivers have
corresponding interests; in these circumstances, it is in the interest of both the sender and
receiver to accurately and honestly transmit and react to information. However, since oftentimes
senders and receivers have at least partially conflicting interests, many scholars have focused on
how signals are kept honest (BliegeBird & Smith, 2005; Gintis, Smith, & Bowles, 2001). If
senders and receivers have conflicting interests (where the players prefer different strategies
given the payoffs, which is different from the signaling game in section 2.2), then the sender has
incentives to misrepresent, and thus there must be some sort of “countervailing incentives” to
maintain honesty (Lachmann et al., 2001, p. 13189).
One of the mechanisms to keep signals honest is cost. Cost can be separated into two
categories: the cost of sending the signal (i.e., “production costs”) and the potential cost of
cheating (Számadó, 2011). In essence, production costs involve how much it costs to create and
send the signal, and these costs can be further deconstructed into efficacy cost (the cost of
transmitting the information unambiguously) and strategic cost (responsible for the reliability of
the signal) (Maynard-Smith & Harper, 2003; Számadó, 2011). The potential cost of cheating, on
the other hand, is a cost imposed on a sender once the signal has either been verified or
uncovered as a lie (Lachmann et al., 2001; Számadó, 2011). In the context of a signaling game,
costs would be built into the payoffs of senders.
Some have proposed that only signals with associated production costs, called assessment
signals and handicap signals, can be honest because costs can factor into how both senders and
receivers evaluate the value of signaling. Assessment signals have causal links to unobservable
qualities they indicate and cannot be deceptive (Donath, 2007; Guilford & Dawkins, 1995). For
example, someone lifting a very heavy weight signals their strength, but a weak person would
not be able to deceive by lifting the same weight (Donath, 2007). For assessment signals,
production costs are out of range for those without the quality, making it impossible for those
without the quality to signal in this way. In communication and linguistics, signals that are
causally related to the underlying message are called signs (McGlone & Giles, 2011).
Like assessment signals, handicap signals are costly to make (Zahavi, 1975), thus the
term handicap, though they are not causally linked to the unobservable quality. In other words,
those without the unobservable quality could send the handicap signal, but it would generally be
too costly for individuals without the unobservable quality to send it (i.e., it costs less for them to
not signal). Handicap signals can be considered types of assessment signals because “only
someone who has an excess of a given resource can afford to expend it for communicative
display” (Donath, 2007, p. 234). For example, Zahavi (1975) suggests that a female bird can
discern whether a male bird is of high quality because of the signal of colorful feathers, which attracts
both females and predators. A male with colorful feathers has “already withstood the extra
predation risk involved in its plumage” (Zahavi, 1975, p. 210), thus indicating quality. In other
words, colorful feathers are costly (e.g., the bird must avoid fights and death from predators, and
must expend metabolic resources on maintaining the color), which means that signal cost can be
correlated with quality even though the colorful feathers are not causally related to how fit the male bird
is. This stream of research has shown that signals that are costly to produce are less likely to be
faked, because only those with the necessary resources are able or motivated to produce the
signals. Research has explored human handicap signaling, as well, such as conspicuous
consumption where people purchase and display expensive goods, even if these goods are
functionally equivalent to lower cost goods, to signal that they are high status or wealthy (e.g.,
BliegeBird & Smith, 2005; Corneo & Jeanne, 1997; Sundie et al., 2011).
While some have claimed that signals themselves must be costly to be honest (Grafen,
1990; Zahavi, 1975), others have shown that honesty is maintained not exclusively by the cost of
signaling, but also or only by the potential cost of cheating after a signal has been verified
(Lachmann et al., 2001; Számadó, 2011). Conventional signals, such as human language
(Donath, 2007; Lachmann et al., 2001), evolved as arbitrary indicators of some quality, where all
or most signalers in the system agree to its correspondence to a quality; another signal or
signaling system could have just as easily evolved instead (Guilford & Dawkins, 1995; D. Lewis,
1969; Skyrms, 2010). For example, the word tree has no inherent relation to a tall plant with a
trunk and branches, and yet in English we all agree that the word tree is associated with this
specific kind of plant; we use it to refer to these objects in a way that can help us coordinate and
understand (e.g., “Meet me at the big tree!”). Another word could have easily evolved instead,
but over time this became accepted as convention.
Despite their arbitrariness, conventional signals can be reliable under certain
circumstances, such as when costs are imposed on cheaters once they are detected by the
receivers themselves or some other entity (Guilford & Dawkins, 1995; Számadó, 2011) or when
senders generally benefit from accurately transmitting information through signaling (D. Lewis,
1969; Skyrms, 2010).
Some TM systems research specifies how people develop their mental representations of
expertise and network connections by communicating or signaling (though the term signaling is
not used). For example, Hollingshead and Fraidin (2003) found that sometimes prior to
interaction, stereotypes based on group members’ visible characteristics may be used to infer
domains of expertise, such as the gender stereotypes of cars for men and cosmetics for women.
In the signaling framework proposed here, gender is a signal that is not easily changed in any one
moment by an individual but is still used by receivers to infer traits of that individual (Spence,
2002), whether those inferences are accurate or inaccurate. More complex signaling, such as
language or other displays of attributes, can also change perceptions of expertise and lead “to
more sophisticated attributions that are more likely to be accurate” (Hollingshead & Brandon,
2003, p. 609).
When that communication changes the perceptions or behaviors of receivers (in this case, by
updating receivers’ directories of who knows what and who knows who), that
communication can be called signaling (Donath, 2007; Maynard-Smith & Harper, 2003).
Additionally, signaling is an effective mechanism that can help coordinate among groups of
different sizes and with different levels of familiarity. Transactive memory systems operate well
primarily among small groups and teams (Palazzolo, 2005) and when those groups or teams are
familiar enough with each other, such as if they go through training together (Moreland, 1999).
Signaling about expertise in different ways can function as a communicative mechanism that
mitigates the negative effect of unobservable expertise in large groups, such as in the context of
social media and online communities or large dispersed organizations. Additionally, examining
signaling that may resemble TM system structures from a “network perspective allows for the
direct measure of connections between multiple people and, therefore, provides more rich
information regarding the underlying processes of the system” (Palazzolo et al., 2006, p. 224).
2.4 Equilibria
To understand TM system group-level outcomes, which is a main goal of TM system
theory, more macro-level signaling concepts must be incorporated. Among the most important
outcomes of signaling games are equilibria, as they provide information on the “end state” of a
whole system. An equilibrium in a game theoretic model like a signaling game occurs when a
change in strategy by any player would not make them better off than if they stayed with the
status quo. In a signaling game, an equilibrium that accurately transmits information from the
sender to the receiver creates a “signaling system,” meaning that there is now agreement between
the sender and receiver about which signal corresponds to which state of nature and thus which
action (D. Lewis, 1969; Skyrms, 2010). Some assumptions of signaling games include that the
game is sequential and that the person (or entity) with more information plays first, meaning they
make the first move (i.e., signal) and then the person with less information responds (i.e., action)
(Riley, 2001).
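The equilibrium condition described above can be sketched in Python for a deliberately small case. The sketch assumes a common-interest game with two equally likely states, two signals, two acts, a shared payoff of 1 when the act matches the state, and pure strategies only; these assumptions and all names (`expected_payoff`, `is_equilibrium`) are illustrative and are not drawn from the cited sources.

```python
from itertools import product

# Assumed toy setup: two equally likely states, two signals, two acts.
STATES = (0, 1)
SIGNALS = (0, 1)
ACTS = (0, 1)


def expected_payoff(sender, receiver):
    """Shared payoff: the fraction of states in which the act matches the state.

    sender[state] is the signal sent; receiver[signal] is the act taken.
    """
    matches = sum(1 for state in STATES if receiver[sender[state]] == state)
    return matches / len(STATES)


def is_equilibrium(sender, receiver):
    """True if neither player can gain by unilaterally switching strategy."""
    current = expected_payoff(sender, receiver)
    sender_options = product(SIGNALS, repeat=len(STATES))
    receiver_options = product(ACTS, repeat=len(SIGNALS))
    sender_stays = all(
        expected_payoff(alt, receiver) <= current for alt in sender_options
    )
    receiver_stays = all(
        expected_payoff(sender, alt) <= current for alt in receiver_options
    )
    return sender_stays and receiver_stays


# A signaling system: each state has its own signal, and each signal its own act.
signaling_system = is_equilibrium(sender=(0, 1), receiver=(0, 1))
# The sender always sends signal 0 and the receiver always takes act 0.
pooling_profile = is_equilibrium(sender=(0, 0), receiver=(0, 0))
# The sender separates the states, but the receiver reads the signals backwards.
miscoordinated = is_equilibrium(sender=(0, 1), receiver=(1, 0))
```

Under these assumptions the signaling system is an equilibrium with a shared payoff of 1. The profile in which the sender always sends the same signal is also an equilibrium in the sense used here, since no unilateral change improves on its payoff of 0.5, but it transmits no information. The miscoordinated profile is not an equilibrium because either player could gain by switching.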
Economics researchers have focused on two kinds of equilibria that can emerge from
signaling games: the pooling equilibrium and the separating equilibrium.
8
Under the simple
condition where there are two groups, one of high quality and one of low quality, a pooling
equilibrium occurs when both the high- and low-quality groups could benefit from signaling (or
not signaling), so receivers have no way to distinguish between high- or low-quality groups.
8
There is also a semi-pooling (or, equivalently, semi-separating) equilibrium which occurs when
senders of different types may send the same signal, so receivers are only partially able to
distinguish between types (Cadsby, Frank, & Maksimovic, 1990; Lachmann, Számadó, &
Bergstrom, 2001).
Signaling will not occur if it is “too expensive for [the high-quality group] to distinguish itself”
(Spence, 2002, p. 438). This is dependent on the relative size of the groups. Suppose, for
example, there are two kinds of bands that play at weddings in a city, one kind that plays fun
music and one that plays boring music. The fun bands could charge up to $3,000 to play, while
the boring bands could only charge $1,000. Luckily in this city, of the 100 bands, only 10 are
boring. The fun bands could signal to potential clients that they are fun by building a website, but
this signal may cost them about $300 per wedding to maintain. If, instead, these bands just sit
back, relax, and do not signal, they can expect:
\[ (3000 \times 0.9) + (1000 \times 0.1) \]
or $2,800 to play at a wedding, which is the average payoff for all bands when weighting fun
bands’ and boring bands’ prices by the distribution of fun bands and boring bands. If they built
the $300 website that would signal they are fun, they would get $2,700 total per wedding, a loss
of $100 compared to not signaling. Even though boring bands can expect $2,800, as well, the
real harm to the fun bands comes from the cost of the signal and not from the competition in this
case. A bride and groom know they have a good shot at getting a fun band because there are so
many fun bands in town.
9
If no signals are sent by bands to distinguish themselves, then the couple
would still pay less than the maximum for fun bands just in case they hire one of the 10 boring
bands, but they know they are unlikely to do so. In the context of online knowledge-sharing
communities, if there are many experts in a field and very few amateurs or novices, a person may
not need to signal because their knowledge will be rewarded highly regardless of whether they
signal.
9
In this model, the receivers always know the distribution of high- and low-quality members, but
they just cannot tell apart which members belong to which group without a signal. While this
example is highly unlikely and a bride and groom would prefer bands that can signal their quality
rather than not because it is a one-time special event, this example provides an intuition for why
signaling may not occur in some situations.
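The arithmetic of the wedding band example can be checked with a brief Python sketch. The figures are those given in the text; the variable and function names are illustrative assumptions, and band counts are used instead of proportions so that the division is exact.

```python
# Figures from the wedding band example; the names are illustrative.
FUN_PRICE = 3000
BORING_PRICE = 1000
FUN_BANDS = 90
BORING_BANDS = 10
WEBSITE_COST = 300


def pooled_price(price_high, price_low, count_high, count_low):
    """Average price when clients cannot tell the two kinds of bands apart."""
    total = price_high * count_high + price_low * count_low
    return total / (count_high + count_low)


# Without a signal, every band is paid the population-weighted average.
payoff_without_signal = pooled_price(
    FUN_PRICE, BORING_PRICE, FUN_BANDS, BORING_BANDS
)
# With the website, a fun band earns the full fun price minus the signal cost.
payoff_with_signal = FUN_PRICE - WEBSITE_COST
gain_from_signaling = payoff_with_signal - payoff_without_signal
```

With these figures the pooled payoff is $2,800, the payoff with the website is $2,700, and the gain from signaling is negative ($100 less per wedding), so fun bands have no incentive to signal.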
Signaling also will not occur if the low-quality group has incentives to mimic the high-
quality group (Stiglitz, 2002, p. 476), so eventually the high-quality group decides not to signal
at all because they know others will be unable to tell them apart from low-quality groups. Stiglitz
(2002) mentions cases where mimicry is easy (i.e., it is not costly for low-quality types to signal
the same as high-quality types), thus resulting in pooling because everyone can signal. For
example, imagine that it is possible for any band, regardless of their true quality, to build a
website for $100 that would signal they are fun. In this case, boring bands could easily mimic
fun bands. If all bands had a website that signaled they are fun, then there would be no point in
building the website in the first place. Through backward induction, it is clear that fun bands would
not signal in this case, and boring bands would also not signal. In the context of online
knowledge-sharing communities, if it is easy for amateurs or novices to mimic experts, experts
may not waste their effort on trying to prove the worth of their knowledge. As such, there are
cases when people, groups, organizations, or other entities may decide not to signal to
distinguish themselves or may not be able to distinguish themselves from others through
signaling. When signaling does not occur, there are incentives for individuals in both groups to
keep information private, to not be transparent, or to keep secrets (Stiglitz, 2002).
On the other hand, a separating equilibrium emerges when the high-quality group signals
and the low-quality group does not. This occurs when the high-quality group’s payoffs for
signaling are greater than not signaling and the low-quality group’s payoffs for signaling are less
than not signaling (Connelly et al., 2011; Spence, 2002). This is the ideal case, where receivers
can accurately distinguish between types because of reliable signals. From the perspective of the
bride and groom in the ongoing example, they would much prefer to eliminate all uncertainty
and be sure to get a fun band at their wedding. Rather than have the cost of the signal be the
same for both types of bands, perhaps maintaining a website is more expensive for boring bands
than for fun bands. A boring band would have to spend $2,200 per wedding to hire a web designer
to make them look fun and maintain a really fun-looking website, while the fun band would only
have to spend $100 (their fun-band content speaks for itself!).
Even with the same distribution of fun and boring bands in the city (90 compared to 10,
respectively), the outcome looks very different. Now boring bands can expect to make $1,000
per wedding if they do not signal with the website, and only $800 if they do signal, a loss of
$200 per wedding; in this case, boring bands will not signal. Fun bands, on the other hand, can
expect to make $2,800 per wedding if they do not signal with the website and $2,900 per
wedding if they do signal, a gain of $100 per wedding; fun bands will signal given these
parameters. This shows how, under the right circumstances, signaling can lead to a separating
equilibrium. Some scholars consider the concept of separating equilibrium as “the essential
predictive mechanism that drives the unique explanations associated with signalling [sic] theory-
based hypotheses” (Bergh et al., 2014, p. 1335). However, this mechanism relies on cost, which
often is not a real factor in conventional signaling.
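The separating condition in this example can be expressed as a small Python sketch. The no-signal payoffs and signal costs are taken directly from the text; the assumption that any band with a website is paid the full fun-band price of $3,000, along with all names used, is introduced here for illustration.

```python
# Price paid to any band whose website signals that it is fun (assumed).
SIGNAL_PRICE = 3000

# Payoffs per wedding without a website, as stated in the text.
no_signal_payoff = {"fun": 2800, "boring": 1000}
# Cost per wedding of maintaining a fun-looking website, by band type.
signal_cost = {"fun": 100, "boring": 2200}


def net_gain_from_signaling(band_type):
    """Payoff with the website minus payoff without it."""
    payoff_with_signal = SIGNAL_PRICE - signal_cost[band_type]
    return payoff_with_signal - no_signal_payoff[band_type]


gains = {band_type: net_gain_from_signaling(band_type) for band_type in signal_cost}

# Separation requires that signaling pays for fun bands but not for boring bands.
is_separating = gains["fun"] > 0 and gains["boring"] < 0
```

Under these figures fun bands gain $100 per wedding by signaling and boring bands lose $200, so only fun bands signal and the condition for a separating equilibrium holds.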
The term equilibrium has been used in this chapter to refer to the meaning that is
transmitted in signaling games. To understand how signals propagate and meaning emerges in a
system, Skyrms (2010) and others (e.g., Gintis et al., 2001) utilize a different kind of equilibrium
analysis: evolutionary stability. In classical game theory, players have strategies from which they
rationally choose the best option, but in evolutionary game theory, types (i.e., species) have
strategy sets and individuals inherit their strategy based on their type (Gintis, 2009, p. 229). An
evolutionarily stable strategy is one that, if it is used by all individuals, cannot be invaded by a
“mutant” alternative strategy (Gintis, 2009). Skyrms’ (2010) exploration of the way signaling
systems evolve utilizes the concept of evolutionarily stable strategies. Putting the above into
evolutionary game theoretic terms, the species would be high-quality or low-quality and
individuals would be one of those two types of species. Those who are part of the high-quality
species would have certain signaling strategies, and those who are part of the low-quality species
would have other signaling strategies, and over time the progeny (however defined) of these
individuals would inherit their ancestors’ strategies.
This chapter provided the necessary background to understand TM systems, expertise,
information theory, and signaling theory as a foundation for examining conspicuous connection
signaling. The next chapter builds on these theories to begin to develop a theory of conspicuous
connection signaling.
CHAPTER 3: CONSPICUOUS CONNECTIONS AS SIGNALS
Building on the literature discussed in the previous chapter, this chapter develops a theory
of conspicuous connections as signals of expertise. To do this, the first section discusses a related
signaling theory of conspicuous consumption, explaining how conspicuousness (i.e.,
noticeability or visibility) is an important aspect of this type of signaling. Conspicuous
consumption literature links to conspicuous connections analogously, where, rather than
displaying connections to expensive objects, people instead display connections to other people
as signals. The next section begins to explicate a theory of conspicuous connections at the
individual level of senders and receivers, and hypothesizes how signal visibility and information
may lead to varying outcomes in social online communities. These micro foundations provide a
basis to also explore the separating and matching functions of these types of signals. The last
section focuses on the network implications of these micro foundations by conceptualizing the
different interconnected networks and how these networks may be correlated to each other to
understand the development and functioning of whole TM systems.
3.1 Conspicuous Consumption
Conspicuous consumption is an example of costly signaling among humans (BliegeBird
& Smith, 2005). Thorstein Veblen (1899) is credited with the first explication of a theory of
conspicuous consumption in Theory of the Leisure Class, where he proposed that purchasing and
displaying overpriced goods (or, alternatively, prolonged leisure time) signaled great wealth,
particularly in situations where wealth was difficult to discern, such as economically mobile
societies (BliegeBird & Smith, 2005; Trigg, 2001). In traditional economic theory, price should
signal the quality of the good (i.e., higher priced goods are of higher quality, and vice versa for
lower priced goods), but in the case of conspicuous consumption the price of the good instead
signals the quality of the consumer (Corneo & Jeanne, 1997). Essentially, those with money to
waste are able to purchase expensive products that are functionally equivalent to lower cost
products in order to portray themselves as high status to others (Bagwell & Bernheim, 1996;
Trigg, 2001).
In this way, wealthy individuals do not signal to others that they are of high status by
saying, “I am wealthy,” but rather signal their high status by purchasing luxury items and
connecting themselves to those products by displaying them. Conspicuous consumption is a
signal because items are bought and displayed to affect others’ behaviors and perceptions,
defining characteristics of signals according to Chapter 2. This can be thought of as a form of
handicap signaling, where an individual is only able to signal by wasting a resource if that
individual has a lot of that resource. Veblen (1899) compared conspicuous consumption motives
of the rich and the poor: he describes the rich as motivated by “invidious comparison,” meaning
they want to be distinguished from the poor, while the poor are motivated by “pecuniary
emulation,” meaning they want to be mistaken for the rich. As such, upper classes take measures
to make sure the costs of such goods are high enough that pecuniary emulation (i.e., deception)
is unlikely (Bagwell & Bernheim, 1996). Many have pointed to the link between the work of
Veblen and that of Pierre Bourdieu, particularly his discussion of symbolic and cultural capital in
Outline of a Theory of Practice (1977) and Distinction: A Social Critique of the Judgement of
Taste (1984) (BliegeBird & Smith, 2005; Trigg, 2001).
Since Veblen’s publication in 1899, a considerable body of research has expanded on the
conspicuous consumption concept. Psychologists examined how material goods may be thought
of as a signal of one’s identity, finding those who were more insecure about but also more
committed to their identities were more likely to own and wear items that they perceived as
essential to that identity (Braun & Wicklund, 1989). Some scholars focused on how conspicuous
consumption may function in a psychologically similar way to conspicuous peacock tails,
impacting sexual behavior in humans such that men will consume conspicuously when they are
interested in short-term mating (Sundie et al., 2011). Others found in several experiments that
displaying luxury brands elicited preferential treatment by others (Nelissen & Meijers, 2011).
Wedding celebrations in rural India have also been examined as a form of conspicuous
consumption signaling; when families live in different villages there is information asymmetry
regarding the status of the families, so a bride’s family will want to symbolically signal status
through a more extravagant wedding (Bloch, Rao, & Desai, 2004). Still others have tried to
understand market dynamics of the conspicuous goods themselves, focusing on bandwagon
effects (i.e., demand for the good increases) and snob effects (i.e., demand for the good
decreases) (Corneo & Jeanne, 1997). Expanding the concept to other contemporary practices,
some scholars have examined other status-enhancing signals like conspicuous compassion
(West, 2004), conspicuous donation (Grace & Griffin, 2006, 2009), and conspicuous
conservation (Griskevicius, Tybur, & Van den Bergh, 2010; Sexton & Sexton, 2014), all
building on Veblen’s original work.
One of the key aspects of conspicuous consumption and related concepts is
conspicuousness, or visibility of one’s actions (e.g., purchases, empathy, contribution, etc.) in
order to elicit the desired response and status enhancement (Chaudhuri, Mazumdar, & Ghoshal,
2011; Grace & Griffin, 2006). In other words, if the action is not observable by others, it would
not be useful in the sense of creating the impression that someone is, for example, wealthy,
environmentally-friendly, kind, or charitable. Signaling theory refers specifically to
observability, or “the extent to which outsiders [i.e., receivers] are able to notice the signal”
(Connelly et al., 2011, p. 45).
The similar construct of visibility has received much attention in its own right in recent
years, particularly in terms of how contemporary information and
communication technologies may facilitate visibility in different ways (Leonardi, 2014; Stohl,
Stohl, & Leonardi, 2016; Treem & Leonardi, 2012). Despite the increased attention to visibility
in recent research, a clear and concise conceptual definition has not been established. Even work
that claims to develop a “theory of communication visibility” does not provide a clear definition
of what visibility is (Leonardi, 2014).
Scholars generally conceive of visibility as an “affordance” of an individual or other
actor, where an affordance is the interaction between the features of an object and the
perceptions of an individual such that each individual may perceive different affordances of the
same object even though they have the same features (Gibson, 1986; Treem & Leonardi, 2012).
As Brighenti (2007) discusses, visibility is also tied to perception, or seeing and being seen; it
can also be understood as related to its absence, where a lack of visibility means that there is
complete unawareness of the object, person, communication or other entity of interest. Stohl et
al. (2016) conceive of visibility as being composed of three parts: “availability of information,
approval to disseminate information, and accessibility of information to third parties” (p. 124).
Visibility is often not symmetrical, and often one entity is more visible relative to another
(Brighenti, 2007). The level of visibility can be understood from the receiver’s perspective as the
amount of effort receivers must expend to locate and access the information from the object,
person, signal, or other entity of interest (Treem & Leonardi, 2012). Section 3.3 provides a
definition of visibility, particularly a type of visibility that is relevant to this study: networked
signal visibility.
3.2 Connections as Signals
Just as conspicuous consumption signals high status (BliegeBird & Smith, 2005),
relationships or connection information can signal certain important aspects of one’s identity. As
Donath and boyd (2004) state, “a public display of connections can be viewed as a signal of the
reliability of one’s identity claims” (p. 73). By “making network structure and activity into an
everyday part of impression formation” (Donath, 2007, p. 241), visible networked connections
“can clarify ambiguous [self-]presentation, moderate an extreme performance, and confirm an
ambitious one” (Donath, 2007, p. 243).
There is empirical evidence that these claims hold merit. For example, one study found
that people perceive others as more similar to their friends when those connections are displayed,
illustrating that connection information on SNSs matters (Utz, 2010). Another study found that,
controlling for performance, network connectivity (i.e., reach) was a positive predictor for
whether someone obtained or maintained employment in the NCAA, and identifying as a
member of a particular group of coaches also impacted prestige of positions (Halgin, 2008).
Additionally, scholars have found that entrepreneurs who receive third-party endorsements in the
highly uncertain and ambiguous early stage of startups were more successful in obtaining
external funding because these signals helped disambiguate other “weak” signals such as
managerial experience or the introduction of a product to market (Plummer, Allison, & Connelly,
2015).
Scholars using signaling theory in the context of online peer-to-peer lending found that
online friendships (with verified users) of potential borrowers signaled their credit quality and
were associated with the rate of lending, interest rates, and subsequent default rates (M. Lin,
Prabhala, et al., 2013). On the P2P lending website, one’s real identity is only visible to other
individuals on the site who have been added as “friends.” Friendship data is also highly visible
on that site, and there are different types of friendships and roles (i.e., friendship with a verified
profile, friendship with an unverified profile, friendship with lenders-only, friendship with
borrowers-only, etc.). Similar to Lin and colleagues’ (2013) focus on a marketplace, this study
focuses on what can be considered an expertise market, where senders are expertise suppliers
and receivers are expertise demanders (L. Lin, Geng, & Whinston, 2005).
Shami and colleagues (2009) also explored different kinds of signals that corroborate
expertise in online environments, and they considered social connection information to be
assessment signals because they were calculated by the site itself. In other words, the information
cannot be manipulated by the person himself or herself and thus is directly linked to the quality of
interest, expertise. They found that participants used this social connection information to decide
whom to contact as an expert because they presumed the “paths were honest signals of expertise
since an expert would be linked to other experts within a connection chain” (p. 75). While Shami
and colleagues found evidence that connection signals do impact peoples’ decision-making
regarding expertise and knowledge-sharing, they did not specifically theorize about connection
signals. Additionally, while they examined individual-level outcomes, they did not explore how
this may impact the network and system levels. The next section specifically theorizes about
conspicuous connections as signals and moves to exploring this from group and network
perspectives.
3.3 Conspicuous Connection
This study defines conspicuous connection as exhibiting relevant connections to others to
corroborate one’s claims surrounding expertise; in other words, individuals exhibit their
connections to other individuals as signals.
10
Depending on the context, conspicuous connections
may be signals of different aspects of an individual. For example, in a social setting an individual
may want to establish rapport with a new crowd by mentioning a connection to a known mutual
friend. On the other hand, in a professional setting, one may want to provide evidence of a
connection to a well-known individual in that same profession. In an SNS focused on building
complicated products like software and where there is high information asymmetry regarding
who knows what and who knows who, establishing that one has connections to others in an
expert community can signal that the individual is an expert as well. When expertise is
conceptualized relationally based on communication as explored in the previous chapter (Evans,
2008; Evetts et al., 2006; Mieg, 2006; Treem, 2012), signaling through connections is an
applicable and useful perspective. And while one may wish to signal as much as possible and in
as many ways as possible, the “network environment … [provides] benefits and constraints that
the actor may, or may not, exploit and manage” (Borgatti, Brass, & Halgin, 2014, p. 4).
[10] This definition is based on the definition of conspicuous consumption as “attaining and
exhibiting costly items to impress upon others that one possesses wealth or status” (Sundie et al.,
2011, p. 1); individuals obtain (seemingly by purchasing with excess money) and exhibit items
as signals.
Just as there are many different ways to signal through conspicuous consumption, such as
spending on extravagant weddings (Bloch et al., 2004) or purchasing luxury cars (Sundie et al.,
2011), there are also different ways that individuals may signal through conspicuous connection
based on the types of ties and the ties’ characteristics. In social network theory and analysis
literature, networks can be multiplex, meaning people can be connected to others in different
ways (Shumate & Contractor, 2013). Different types of ties, or the kinds of relations that people
or other entities can have with one another, include similarities, social relations, interactions, and
flows (Borgatti et al., 2014; Borgatti, Mehra, Brass, & Labianca, 2009). These ties can also have
different characteristics, including degree, symmetry, affect, and strength (Kane, Alavi,
Labianca, & Borgatti, 2014, p. 11). Signals about different types of ties and characteristics can
provide third parties, or receivers, with information about a focal individual, in aggregate
changing receivers’ mental representations of the expertise network, also known as directory
updating (Palazzolo, 2005; Palazzolo et al., 2006; Yuan, Fulk, Monge, & Contractor, 2010).
Shumate and colleagues (2013; 2013) call this communication or signaling about ties a
“representational” tie, defined as “messages about an association among actors communicated to
a third party or to the public” (Shumate & Contractor, 2013, p. 452). The signaling framework,
on the other hand, provides a way to understand how communication about connections can
influence others and the overall functioning and development of the network. While the creation
of a tie is a behavior by a source to a target, the visible tie as seen by receivers is a representation
of the source’s and target’s qualities. The different categories of ties represent individuals in
different ways, and this is explained in more detail later in this section.
Table 3.1 provides a matrix of the different types of conspicuous connection signals
based on both the type of connection and the direction of connection. The table provides
examples of those individual signals, though aggregated signals can also be sent. This is
explained in the following paragraphs.
Table 3.1 Categories of Conspicuous Connection Signals
Rows: Type of Signal. Columns: Direction of the Connection.
Type of Signal | Incoming | Outgoing | Symmetrical
Similarity | (none) | Saying: “I am an expert in knowledge area A.”; building a visible project in a programming language | (none)
Social Relation | Visibly followed by Person X; having a visible hyperlink from Person X’s website to mine | Visibly following Person X; visibly hyperlinking to Person X’s website | Visibly following and followed by Person X; visible hyperlinks to and from Person X’s website; visible evidence of a working relationship with Person X
Interaction | Being asked a visible question by Person X; receiving a visible response from Person X | Asking Person X a visible question; responding visibly to Person X | Receiving a visible question from and visibly answering a question to Person X; receiving a visible email from and sending a visible email to Person X
Flow | Visibly receiving information on topic Z from Person X | Visibly providing Person X with information on topic Z | (none)
In this table, only individual signal examples are provided at the intersection of the types
of ties and the direction of the ties. All of these signals can be aggregated. For example, for
incoming social relation connections, an aggregated signal is for an individual to visibly show
the number of followers they have. To be signals, these connections must be visible or
conspicuous to at least one other person outside of the connection itself (and could be visible to
many others). If the connections exist but others cannot see those connections, they are not
signals. For example, someone may reach out and develop a back-and-forth email
correspondence with another individual privately. If only those two individuals in the email
correspondence are aware of that symmetric information flow connection, that connection is not
a conspicuous connection signal because there is no third-party receiver. If, on the other hand, one of them
forwards that email correspondence to another person, it is now a signal because the third person
outside of the email correspondence connection is aware of that connection. If a person claims
they have a connection with someone else when they do not, this would be a deceptive signal.
Deceptive signals can still change the perceptions individuals have of the available knowledge
network, even if they do so inaccurately. For example, a recent phenomenon involves purchasing
fake followers and deploying automated accounts on social media, with some estimates
that 15 percent of users on Twitter are not necessarily associated with one human but are instead
bots (Confessore, Dance, Harris, & Hansen, 2018). Conspicuousness is not necessarily about indicating
expertise; rather, it is a necessary condition, at some minimal level, for a tie to be a
signal. Without conspicuousness, also called visibility, the tie would not be a signal at all.
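The visibility condition described above can be illustrated with a minimal sketch. The `Tie` class, its `observers` field, and the example names are hypothetical and are not part of this study's data or methods; the sketch only encodes the rule that a connection counts as a signal when at least one person outside the connection can observe it.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tie:
    """A directed connection plus the set of people who can observe it (hypothetical model)."""
    source: str
    target: str
    observers: frozenset = field(default_factory=frozenset)


def is_signal(tie: Tie) -> bool:
    """A tie is a conspicuous connection signal only if a third party can see it."""
    third_parties = tie.observers - {tie.source, tie.target}
    return len(third_parties) > 0


# A private email exchange: only the two correspondents know about it.
private_email = Tie("Ana", "Ben", frozenset({"Ana", "Ben"}))
# The same exchange after it is forwarded to a third person.
forwarded_email = Tie("Ana", "Ben", frozenset({"Ana", "Ben", "Cy"}))

print(is_signal(private_email))    # False
print(is_signal(forwarded_email))  # True
```

Note that the check says nothing about whether the tie is truthful; a deceptive claim that is visible to a third party would still pass it.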
Revisiting the definitional problem with visibility discussed in section 3.1 and thinking
about this from the conspicuous connection signaling theory developed here, a particular type of
visibility can be defined: networked signal visibility. Visibility of a signal in this sense is
determined not only by the perception of one individual along with the capabilities of the system
of interest, but also the perception, actions, and connections of others in the network. For
example, if a piece of content is shared on social media, the visibility of that piece of content is
not solely determined or perceived by either the individual who posts that content or the
system(s) within which the content exists. Instead, because individuals and other entities in the
system exist in a network of connections, the visibility of that content relies on the sender and the
system along with the network of individuals that interact with that content. Visibility can thus
change over time, depending on if others share that content, recreate that content, or try to
diminish the reach of that content.
Consequently, in this study visibility is not an affordance of technologies as perceived by
individuals, but rather a property of some entity, like a signal, that can move through a network.
In particular, the visibility of conspicuous connection signals is not always manipulated by the
sender of interest, but rather can be changed by those in the network by connecting to that
sender. As those individual conspicuous connections aggregate and increase, the visibility of the
signal increases as well. For this reason, visibility exists on a continuum, or in other words the
amount of effort to locate, access, and interpret signals about expertise may vary on some scale.
As it becomes less difficult for receivers to locate, access, and interpret signals indicating an
unobservable quality (in this case expertise), it becomes more likely that those receivers will change
their beliefs and behaviors based on those signals. With these characteristics in mind, the
definition of networked signal visibility in this study is the extent to which a signal in a
communication system network can be located, accessed, and interpreted by those in the
network.
The first type of tie in Table 3.1, similarity relationships, includes things such as co-membership,
co-location, or similar attributes. While similarities are not considered actual social
ties between individuals, they are generally considered to provide the conditions necessary for
ties to form (Borgatti et al., 2014; Borgatti & Cross, 2003). Similarities in knowledge-intensive
networks occur when, for example, individuals have expertise in the same areas. The principle of
homophily, the idea that “contact between similar people occurs at a higher rate than among dissimilar
people” (McPherson, Smith-Lovin, & Cook, 2001, p. 416), means that when people have
similarity ties, they also have the conditions necessary for other kinds of ties to form, such as
friendship ties, advice ties, information exchange ties, work ties, and others. However, claiming
a similarity tie is different from having a similarity tie. For example, one may not signal at all,
never admitting their expertise in a certain area though they are an expert. On the other hand, one
could signal by saying, “I am an expert in this area,” without corroboration of this conventional
signal. People who claim expertise in the same area are similar to each other at least
superficially.
The second type of tie, social relations, consists of the “persistent social connections
between nodes, such as role-based connections (friends, family) or affective states (likes,
dislikes)” (Kane et al., 2014, p. 8). In the case of knowledge-intensive networks, social relations
may include co-workers, recommenders, advisors, supervisors, mentors/mentees, or admirers,
among other possibilities. For example, someone may signal by saying or showing that they
worked with a prominent person in a certain expertise area. Again, the defining condition of a
signal about network ties, including social relations, is that an individual sends information about
this social connection to a third party, changing that third party’s behavior and/or perceptions.
Third, interactions involve “discrete, transitory relational events,” like exchanging an
email or talking with another person (Kane et al., 2014, p. 8). In the case of knowledge-intensive
networks, interactions include the actual exchange or transaction of information or knowledge,
including both the request for contribution and the response (Bighash, Oh, Fulk, & Monge,
2018). People signal by claiming to have had or showing interactions with others. Those
interactions do not become signals until they become known to someone outside of the
interaction.
Last, flows involve “the tangible and intangible material” that moves from one person or
entity to another, such as information or goods (Kane et al., 2014, pp. 8–9). In the case of
knowledge-intensive networks, this would specifically include the flow of information or other
units of interest (like software code) from one person to others, though no one ever fully
transfers or loses possession of information but only replicates it (Fulk, Heino, Flanagin, Monge,
& Bar, 2004). As in all cases, it is not the flow connection itself that is the signal, but the
communication about the connection. As such, if people are able to deceive in the system, it is
very possible that someone may signal to another that they have these connections when in fact
they are lying and do not have them.
Each of these conspicuous connection signals about different types of connections may
have different characteristics, including symmetry and degree or centrality.[11] Symmetry refers to
the reciprocated nature of a tie (Kane et al., 2014). Some ties are directional, meaning that the
connection has an origin and a target, while other ties are non-directional, meaning that the
connection exists and does not indicate origin or target (Wasserman & Faust, 1994, p. 44).
[11] While affect and strength are two other characteristics of ties, this study only focuses on
symmetry and centrality because affect (e.g., like or dislike, friend or enemy) and strength (e.g.,
you are a close friend versus a distant friend) are not as directly related to building products in
knowledge-intensive online work communities.
By
definition, a non-directional tie is symmetrical, while a directional tie may be either symmetrical
or asymmetrical. For directional ties, people may signal about different symmetric or asymmetric
relationships. First, individuals may signal by exhibiting to others that they have one-way,
outgoing connections. For example, online this may look like a “following” relationship, where
the focal individual follows another on a social media platform, or a hyperlink from the focal
individual’s or entity’s website out to another’s. These are examples of one-way outgoing social
relations that are observable to others in a system. Another example could be an outgoing email,
which is an example of a one-way outgoing interaction; however, in order for this to be a signal,
the email or mentions of the email would have to be visible to others in the system, as
observability is a necessary condition of a signal (Connelly et al., 2011). Perhaps a better
example would be forwarding an email sent to one person to others. A non-digital counterpart
may include communicative actions like namedropping or discussing with whom one works.
Second, individuals may signal by showing others they have one-way, incoming
connections. Online incoming social relations could look like a “follower” relationship, where
the focal individual has others who follow them on a social media platform, or an incoming
hyperlink to the focal individual’s or entity’s website. Non-digital counterparts could include
things like third-party endorsements or the inverse of the outgoing relationships. It is important
to remember that one individual’s outgoing connection is another’s incoming connection,
creating a network of relationships in the aggregate.
The third type of conspicuous connection signal is a symmetrical two-way connection,
meaning one of two possibilities: (1) that one individual’s outgoing or incoming connection
necessarily implies the other’s in reciprocation; or (2) that both an incoming and outgoing
connection have been established. Online examples include those social media sites that only
allow mutual relationships, such as Facebook friendships, or cases in which an individual both
follows and is followed by another individual.
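The three directions discussed above (outgoing, incoming, and symmetrical) can be sketched as a small classification over a set of directed ties. This is only an illustration under stated assumptions: the function names `direction` and `as_mutual`, the edge representation as `(source, target)` pairs, and the example users are all hypothetical.

```python
def direction(focal: str, other: str, edges: set[tuple[str, str]]) -> str:
    """Classify the tie between `focal` and `other` from the focal individual's perspective."""
    outgoing = (focal, other) in edges   # focal follows other
    incoming = (other, focal) in edges   # other follows focal
    if outgoing and incoming:
        return "symmetrical"
    if outgoing:
        return "outgoing"
    if incoming:
        return "incoming"
    return "none"


def as_mutual(edges: set[tuple[str, str]]) -> set[tuple[str, str]]:
    """Model a platform that only allows mutual ties: every tie implies its reverse."""
    return edges | {(target, source) for source, target in edges}


# Hypothetical directed "following" ties, written as (source, target).
follows = {("A", "B"), ("B", "A"), ("C", "A")}

print(direction("A", "B", follows))             # symmetrical
print(direction("A", "C", follows))             # incoming
print(direction("C", "A", follows))             # outgoing
print(direction("A", "C", as_mutual(follows)))  # symmetrical
```

The `as_mutual` helper corresponds to the first possibility in the text, where one connection necessarily implies the other; the unmodified edge set corresponds to the second, where both ties must be established separately.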
In addition to symmetry, degree or the number of ties an individual has is an important
characteristic in social networks (Wasserman & Faust, 1994), and, like symmetry, one that may
also be determined or limited by a particular online space (Kane et al., 2014). Individuals may
signal regarding the number of ties they have, focusing on different types of ties (e.g., “I got so
many emails today!” or displaying the number of ties one has). Signaling this kind of aggregated
information regarding the number of ties can indicate to third parties how important or central
one is in a particular network (Kane et al., 2014; Wasserman & Faust, 1994). However, the
commonly expected effects of centrality on the flow of information may not always occur
in reality: although a central node is usually expected to facilitate the flow of information in
a network, different types of centrality may either hinder or help that flow (Borgatti, 2005).
Previous studies have shown that a high volume of information in the form of reviews affects
how much people trust that information, particularly if that information is user-generated and
not formally credentialed
(Flanagin & Metzger, 2013). As such, it can be expected that when expertise is unknown and not
credentialed as in many online spaces, signaling by indicating high numbers of connections may
lead to more trust by others.
One way to measure conspicuousness, also called visibility, would be as the volume of those
individual connections as they aggregate; for example, one study found that a higher volume of movie
ratings was perceived as a more reliable signal than a lower volume (Flanagin & Metzger, 2013).
When someone has a greater number of conspicuous connections, the signal is hard to ignore and
is easier to locate, access, and interpret, and so it is much more visible. This is like conspicuous
consumption, where purchasing many luxury items could be a stronger signal of wealth than just
purchasing one luxury item. In terms of Table 3.1, this would mean that aggregated signals
would change the behavior of receivers more than individual signals.
On some social media sites, like LinkedIn and Facebook for example, a tie is necessarily
mutual[12]; this means that if one person requests to connect with another, that other person must
agree to the connection before the connection exists. As such, all ties in those networks are
symmetrical. In a network where a connection does not imply reciprocity (i.e., the network is
directed), as on sites such as Twitter and GitHub, a person can follow another without that other
person following back. In this case, incoming and outgoing connections can have different
interpretations by those viewing those connections as signals, particularly when the purpose of
the network is for knowledge work. Figure 3.1 provides a hypothetical social relation network,
with the aggregated number of network connections listed next to each node as the visible
signals that others can interpret.
[12] This only includes the main types of ties on these sites (friendships on Facebook; connections
on LinkedIn). These sites have now incorporated other types of ties where you can “follow”
another user to pay attention to their public content without those users following back.
Figure 3.1 Hypothetical Social Relation Network
The users in the network in Figure 3.1 not only see their own connections, but also see
the aggregated number of connections of other users along with the specific people to whom they
are connected. For example, User C is only connected with an outgoing tie to User B, but User C
is still able to see all of the other users’ connections as signals, like that User A is following three
other users. User C can compare the users’ aggregated number of connections and determine that
User B has the highest number of incoming connections while User A has the highest number of
outgoing connections. Figure 3.2 provides another illustration to emphasize the difference
between the source and target of the connection(s) in the network compared to the sender and
receiver of the visible signals about the connections, known in this study as conspicuous
connections.
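The aggregated counts that a receiver such as User C compares can be sketched as in-degree and out-degree tallies over a directed network. The edge list below is an assumption made for illustration and is not the actual network in Figure 3.1; it was chosen only to be consistent with the description in the text (User C follows only User B, User A follows three others, User B has the most incoming connections). Users D and E are hypothetical.

```python
from collections import Counter

# Hypothetical "following" ties as (source, target); not the real Figure 3.1 data.
edges = [
    ("A", "B"), ("A", "D"), ("A", "E"),  # User A follows three other users
    ("C", "B"),                          # User C's only tie is outgoing, to User B
    ("D", "B"), ("E", "B"),
]

# Aggregated signals: how many users each person follows and is followed by.
out_degree = Counter(source for source, _ in edges)
in_degree = Counter(target for _, target in edges)

users = sorted({user for edge in edges for user in edge})
for user in users:
    print(f"User {user}: following {out_degree[user]}, followers {in_degree[user]}")

# What a receiver could conclude by comparing the visible counts.
most_followed = max(users, key=lambda user: in_degree[user])
most_following = max(users, key=lambda user: out_degree[user])
print(most_followed, most_following)  # B A
```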
Figure 3.2 Distinction Between the Source-Target of the Connection and Sender-Receiver of the
Signal
Figure 3.2 shows only one of the sources of the four connections to illustrate the source-
to-target relationship in the network; three other sources are outside of the view of the zoomed-in
network. A source of a connection is the node from which a connection originates in a directed
network, while a target is the node to which the connection is pointed. This can be in the form of
a follower-followee relationship in a network like Twitter or GitHub, for example. These
incoming connections, either individual or aggregated, become signals sent by that target who
becomes the sender when a third party (the receiver in this figure) can access, locate, and
interpret those connections at some level; in other words, the tie becomes visible to a third party.
When a receiver sees those incoming social relation conspicuous connections, either aggregated
or individual, they make judgements about that person’s expertise.
Although the incoming connections are determined in part by others in the network
because they must connect to the individual of interest, those incoming connections are signals
of that individual of interest because they tell others in the network important information about
that individual. In signaling theory, signals do not have to be purposefully sent, but rather must
provide some sort of information about the sender that is useful to the sender and the receiver
(Skyrms, 2010). For example, sometimes a person will have a shaky voice when they speak
publicly, which is a signal sent by a sender that communicates nervousness, but that shaky
voice is not intentional, and in fact the speaker might wish to avoid it if he or she
knew it was occurring. Similarly, peacocks do not purposefully grow their tails into large and
colorful displays, but rather this is a product of evolution to signal status and mating superiority
(Sundie et al., 2011; Zahavi, 1975). These signals are all sent by the sender, though they are not
purposefully created by the sender.
In the same way, an incoming signal is not necessarily purposefully created by the sender
but still provides information about that sender. Previous research has found that the number of
followers impacts perceptions of source credibility on social media (Jin & Phua, 2014; J. Y. Lee
& Sundar, 2013; Westerman, Spence, & Van Der Heide, 2012). Figure 3.3 provides an example
of a public profile of the user ‘mojombo’ on GitHub from 2013, where both the number
following and followers appear prominently on the profile.
Figure 3.3 An Example Profile on GitHub
On the left-hand side of a user’s profile page, the user’s profile picture, name, handle, and
other information entered by the user are displayed. These are all signals that may be utilized by
other users when making a determination about an individual’s expertise. For example, there
may be certain stereotypes that are related to an individual being more of a programming expert,
like being male versus female (Hollingshead & Fraidin, 2003). While these signals are not
examined in this study, they will be important to examine in future research as they are related to
the perceptions of conspicuous connection signals. The discussion chapter addresses future
studies that may help distinguish and relate these signals. Right below this, the website
aggregates the total number of followers, total number of starred repositories, and total number
following (in bold on the bottom left). These numbers provide information about the user: (a)
that he finds 11 people worthy enough to follow, and (b) that about 13 thousand people find him
worthy to follow. This is similar to many other social media sites, like Twitter and Instagram,
that prominently display both the number of followers and the number following. Some scholars
call these “system-generated cues” because the social media site or other system aggregates these
numbers and displays them rather than the user inputting this information, and these are
perceived by users to be more credible than cues generated by the user (Walther, Van Der Heide,
Kim, Westerman, & Tong, 2008). Additionally, others can drill down into the specific people
that follow the user and are being followed by the user by clicking on the number and examining
those users. In this example, it may be easy to peruse all of the specific other users that
‘mojombo’ is following, but it would be difficult to examine all of the users that follow him.
Although receivers may not be able to examine each individual follower, the number provides an important
signal about the expertise of the user.
Based on this, third parties, or receivers, receiving conspicuous connection signals in
knowledge networks will have different ideas of the motivations behind the behavior of tie
creation depending on the type and direction of the tie. Those ties will represent different aspects
of an individual’s qualities. For example, outgoing connections are assumed to be perceived by
receivers as an indication of that person’s interest. When an individual connects to another and
that connection is visible, receivers of that conspicuous connection signal will perceive the
motivation to be that they are interested in that other individual in some way, such as their
content. An individual can create many outgoing connections by following many people;
however, the people that they follow do not necessarily have any opinion about the focal
individual. They do not need to respond to the focal individual’s connection in any way.
Outgoing connections provide information about what that focal individual finds interesting or
worth following.
Incoming connections, on the other hand, are assumed to be perceived by receivers as an
indication of that person’s expertise. If the focal individual is the target of many incoming
connections, then those people who are following that focal individual have an opinion of that
individual. Their visible incoming connections provide information about that focal individual,
namely that others perceive the focal individual as interesting and so the person must have some quality
that makes them interesting; in a knowledge-intensive network, this quality would likely be
expertise.
For this study examining claims of expertise, incoming social relation connections are the
conspicuous connection signals of interest. These conspicuous connection (CC) signals can also
be called “acknowledgement CC signals” because the source of the connection is acknowledging
the expertise or worth of the target of the connection. Online spaces may be particularly suited to
this form of conspicuous connection because of the opportunities these social media network
sites (SNSs) (Kane et al., 2014) afford, particularly visibility (Leonardi, 2014; Treem & Leonardi,
2012). Many of these sites allow people to see not just personal relationships and communication
patterns, but also the relationships of others (Majchrzak, Faraj, Kane, & Azad, 2013; Treem &
Leonardi, 2012). While hyperlinking can be considered a “representational” tie that has physical-
world counterparts such as name-dropping (Shumate et al., 2013; Shumate & Contractor, 2013),
some SNSs allow for displays of networked connections in ways that are less common in the
non-virtual world, such as visualizing users’ full networks or publicly displaying all or some
friendships or connections (Donath, 2007; Donath & boyd, 2004). Which of these different kinds
of conspicuous connection is possible is at least partially a function of a site’s design (Norman,
2008).
Consequently, not only do the signals themselves matter but the visibility of those signals
also matters. Exhibition or conspicuousness is necessary for signaling because observability (or
visibility) is an important quality of a signal (Connelly et al., 2011). This visibility can vary from
the perspective of both the sender and receiver(s). A sender may make efforts to limit the visibility
of a signal to only a certain set of receivers and not make it publicly available, or may instead
attempt to make a signal as widely observable as possible. However, senders
may not have direct control of the visibility of all of the signals they send. From the perspective
of the receiver, no matter how visible a sender or a system attempts to make a signal, if the
receiver does not have the decoding tools necessary to understand the signal or the ability to
filter that signal compared to a myriad of other signals in the environment, the information
contained in the signal may be lost and thus not visible.
If the signal(s) the individual sends change the probability that receivers will perceive
him or her as an expert, then receivers should request and accept that individual’s contribution. It
would be in receivers’ interest to request and accept the contribution because the receivers are
trying to satisfy the need for expertise. Many SNSs have capabilities where content is evaluated
in some way, through upvotes, page views, or some other metric that is observable by users.
When contributions created by a self-proclaimed expert are evaluated by others, this provides a
metric to determine whether receivers’ expertise needs were fulfilled. Additionally, those who
successfully signal should become even more embedded in the community and obtain more
contribution requests. If experts really do signal well compared to non-experts, then those who
use high visibility and high information signals should be more likely to be experts, and they
should also have content that spreads more widely through the network.
This leads to the following hypotheses about senders:
Hypothesis 1: As the visibility and/or information of conspicuous connection signals
observed by a receiver about a sender increases, (a) the number of contribution requests
to the sender increases, (b) the number of future connections the sender receives
increases, (c) the sender is more likely to be an expert, and (d) the sender’s content will
spread more widely.
Signaling theory has additional predictive power by “specifying how signal senders and
observers can distinguish between – or ‘separate’” signalers of different underlying
qualities (Bergh et al., 2014, p. 1335). The concept of the separating equilibrium as explained in
the previous chapter can move beyond only individual-level hypotheses and into system-level
hypotheses. In the case of conspicuous connections in online knowledge-intensive networks
where expertise is the unobservable quality of interest, there are more than two types (i.e., there
are experts in many different areas). However, within each expertise area high-quality or low-
quality experts (or, in other words, experts versus amateurs or laypersons) can be distinguished.
For a separating equilibrium to occur in this context (i.e., within each expertise area), the same
conditions must hold. First, the payoff for people with expertise in an area to signal to corroborate that expertise
must be greater than the payoff for not signaling at all. Second, those who do not have expertise
in that area should obtain a higher payoff for not signaling than for signaling. Intuitively this
makes sense. If an individual is not an expert, they would not gain from signaling by forming
conspicuous connections to corroborate that false identity.
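The two conditions for a separating equilibrium within an expertise area can be written compactly. The notation is an assumption introduced here for illustration and does not come from the cited sources: U_E and U_N denote the payoffs to an expert and a non-expert, respectively, and s and ¬s denote signaling and not signaling through conspicuous connections.

```latex
\begin{align}
U_E(s) &> U_E(\neg s) && \text{(experts gain more from signaling than from not signaling)} \\
U_N(\neg s) &> U_N(s) && \text{(non-experts gain more from not signaling than from signaling)}
\end{align}
```

When both inequalities hold, only experts find signaling worthwhile, so receivers can use the presence of the signal to tell the two groups apart.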
The more complicated case arises when comparing experts and amateurs in a knowledge
area. Amateurs potentially could gain from being mistaken for an expert in the knowledge area
they claim. However, the structure of the costs and information content of conspicuous
connection signals could mitigate the potential of deception (where amateurs signal and deceive
receivers who are trying to obtain knowledge on a knowledge-sharing platform) or of pooling
equilibria (where experts and amateurs do not distinguish themselves through signaling). Based
on the previous review of equilibria, costs, and information, the following hypotheses can be
deduced:
Hypothesis 2: Experts will signal using conspicuous connections more than those with no
expertise.
Additionally, as explored in Chapter 2, Spence (2002) describes a case where the high-
quality group decides not to distinguish itself through signaling if the low-quality group is
relatively small. This is because the average wage or productivity (which is what they would
receive if they did not distinguish themselves) in the market or system would be weighted toward
their higher wage; they would therefore not lose much of their wage (or other benefit) and would
avoid the costs of signaling by not signaling. Thus, if the proportion of amateurs is relatively
small, experts will not have incentive to distinguish themselves because the average payoff (in
this case, reach of their content) will be high even if they do not. If the proportion of experts is
relatively small, they will have more incentive to distinguish themselves because they will want
to garner the higher payoff than the average in the community. The following hypothesis
postulates how relative group size impacts whether senders signal:
Hypothesis 3: As the proportion of non-experts increases in comparison to experts in a
knowledge area, experts will send more conspicuous connection signals.
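The reasoning behind Hypothesis 3 can be illustrated numerically. The sketch below assumes a simple setup that is not taken from the dissertation's data: when nobody signals, everyone receives the population-weighted average payoff, and an expert who signals receives the full expert payoff minus a signaling cost. All payoff and cost values are hypothetical.

```python
def pooled_payoff(share_experts, payoff_expert, payoff_amateur):
    """Average payoff everyone receives when experts and amateurs are not distinguished."""
    return share_experts * payoff_expert + (1.0 - share_experts) * payoff_amateur


def expert_gain_from_signaling(share_experts, payoff_expert, payoff_amateur, signal_cost):
    """Net gain to an expert from separating (signaling) rather than pooling."""
    separating = payoff_expert - signal_cost
    return separating - pooled_payoff(share_experts, payoff_expert, payoff_amateur)


# Hypothetical values: expert reach 100, amateur reach 20, signaling cost 10.
for share in (0.95, 0.50, 0.10):
    gain = expert_gain_from_signaling(share, 100.0, 20.0, 10.0)
    print(f"share of experts = {share:.2f}: gain from signaling = {gain:+.1f}")
```

Under these assumed numbers the gain is negative when experts make up almost the whole population and grows as the share of non-experts rises, which is the pattern Hypothesis 3 predicts.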
According to Bergh and colleagues (2014), understanding signaling equilibria means
understanding not just receiver reaction to a signal, but also confirmation of the quality of
interest following a signal. In this study, the post signal performance is how well an individual
does at the job of interest. If an individual contributes often, this shows that they are not only
signaling but also taking actions in the system to help the overall goals of the community. This
would confirm that visible and informative signals are working properly in the system.
Hypothesis 4: The post signal performance of those individuals that send high visibility
and/or high information signals will be better than those that send low visibility and/or
low information signals.
Building on these ideas about how individual-level behavior may lead to macro,
emergent outcomes, the next section discusses several conceptual networks to understand how
people, expertise, and content are linked to one another in a theoretical network that involves
conspicuous connection signaling.
3.4 Conceptual Signaling Networks
In aggregate, these conspicuous connections, which are conceptualized as signals that
corroborate one’s claims of expertise, form networks. Three conceptual networks of interest
inform the hypotheses and research questions at the network and system levels.
The first network of interest is the network that links people and their expertise claims, or
the Person-Expertise (PE) network. Expertise is inherently relational (Evetts et al., 2006; Mieg,
2006), involving communication (Treem, 2012) about the type of expertise people may have.
Similar to TMS literature that has examined communication and expertise from a social network
perspective (J.-Y. Lee et al., 2014; Palazzolo et al., 2006), this study examines how people are
linked not only to each other, but also to their expertise areas. The PE network is a bimodal
network, meaning it involves entities or nodes of two different kinds: people and expertise
claims. A key point here is that while individuals’ actual expertise areas are unobservable to
others in the network, their expertise claims are observable (i.e., expertise claims are signals).
The PE network involves one-way conspicuous connections from people to their
expertise claims, which in the case of GitHub are programming languages. This results in a
network of similarities between people. For example, one person may claim to be an expert in
the programming languages of R and Python, while another person may claim to be an expert in
the programming languages of Python and JavaScript. While these two individuals overlap in the
Python programming language expertise area, they are not directly connected, but rather only
connected if this bimodal network is projected to a unimodal network where people can be
connected to each other if they share the same expertise claim (in this example, they share the
expertise claim of Python). The network could also be projected to a one-mode network of
expertise claims to understand how expertise claims are linked to each other through people. In
the terms from Table 3.1, this network would consist of similarity ties.
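The projection described above can be sketched in a few lines of Python. This is a minimal illustration using the R / Python / JavaScript example from the text. The person labels and function names are hypothetical, and the sketch is not the procedure used to build the networks in this study.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical person-to-expertise-claim ties (the bimodal PE network).
pe_edges = [
    ("person_a", "R"),
    ("person_a", "Python"),
    ("person_b", "Python"),
    ("person_b", "JavaScript"),
]


def project_to_people(edges):
    """Link every pair of first-mode nodes that share a second-mode node.

    Returns {frozenset({p, q}): set of shared second-mode nodes}.
    """
    members_by_target = defaultdict(set)
    for source, target in edges:
        members_by_target[target].add(source)

    shared = defaultdict(set)
    for target, members in members_by_target.items():
        for p, q in combinations(sorted(members), 2):
            shared[frozenset((p, q))].add(target)
    return dict(shared)


def project_to_claims(edges):
    """Project onto expertise claims by flipping each tie and reusing the same logic."""
    flipped = [(claim, person) for person, claim in edges]
    return project_to_people(flipped)


print(project_to_people(pe_edges))
print(project_to_claims(pe_edges))
```

In the person projection the two individuals are tied only through Python. In the claim projection R and Python are tied through the first person and Python and JavaScript through the second, while R and JavaScript remain unconnected.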
The second conceptual network of interest is one that connects people to each other, or
the Person-Person (PP) network. The PP network is a one-mode network (only one type of node:
people), but it can be multiplex, meaning there are different types of relationships between
people that can be indicated by a connection. The different connections that are available are a
function of the system within which people are creating their connections (and the researcher-
imposed boundaries that limit the scope of the system). For example, in the online world broadly,
a person may connect to another online through a website hyperlink, a Twitter follow, or a
Facebook friendship, among many other possibilities. Even within one platform, an individual
may connect to others in several different ways, depending on the capabilities of the platform. In
the PP network of interest here, conspicuous connection signals are connecting people to other
people. Of interest in the PP network is that these connections act as signals about expertise,
and these signals are sent to a third-party (Shumate et al., 2013; Shumate & Contractor, 2013).
The connections themselves may be important in other ways, like transmitting information
between those people who are connected, but for this study the connection is a signal to those
outside of the connection. These connections are visible to others in the network, and different
individuals may have different mental representations of the full PP network, or who knows who
and who knows what, based on what signals they observe (Akgün et al., 2005; Hollingshead &
Brandon, 2003). There may be online-only connections that are one-way or symmetrical, but
sometimes people are also connected offline and these offline connections can be represented
online. For example, someone may list another person’s name on a CV that they post online,
indicate coauthorship or joint project work on a personal website, or receive recommendations
and endorsements by coworkers or supervisors on professional social media sites. In terms of
Table 3.1, this network consists of social relation ties.
These networks matter in relation to one another because they can provide evidence for
whether conspicuous connection signaling is working in the system. Changes in one of the
networks should reflect changes in the other networks as well. Also, if the full PP network is
accurately corroborating the claims in the PE network, then this means there should be social
closure (Evetts et al., 2006; J.-Y. Lee et al., 2014) in the PP network (i.e., groups of likeminded
individuals will signal their connections to each other to corroborate their expertise to outsiders).
Hypothesis 5: People will have more connections with others in the PP network who
claim the same expertise area in the PE network than those who claim different expertise
areas.
The last conceptual network of interest is the network that connects people to
contributions, or the Person-Contribution (PC) network. Like the PE network, the PC network is
a bimodal network. However, this network also is multiplex, because people can be connected to
content either as providers or requestors. Providers are those who provide some form of
contribution to the collective or share their knowledge. For example, someone may submit a
piece of code to include in an open-source software project. Requestors are those who ask for
and would like to use that contribution for a project. This network provides both confirmation to
providers and feedback to requestors, as is necessary for testing separating equilibria. When
providers signal using conspicuous connection, they indicate to requestors that the contribution
would be worthy of including. For example, Flanagin and Metzger (2013) find that people
perceive online movie ratings that have more volume (i.e., more people have rated it) and that
come from experts as being more credible. Looking to Table 3.1, the one-mode projection of the
PC network consists of information flow ties.
These networks essentially are the formation and maintenance of TM systems, where
individuals know who has expertise and who knows whom, and can share the cognitive load and
information-processing among members of the team to build complicated products. As such, if
signaling is working in a way to lead to effective TM systems, teams who request contributions
from people who signal using conspicuous connections, or with connections that are higher
in information, should be more effective than teams who request contributions from others who
do not signal or signal using lower information signals.
Hypothesis 6: Team projects that request more contributions from individuals who
conspicuously connect more often or with higher information will be more successful.
This chapter provided a foundation for the theory of signaling through conspicuous
connection. This theory led to several hypotheses at both the individual and network levels. The
hypotheses in this section and the previous sections require methods that can untangle these
relations in a systematic way. The next chapter will outline the methodology that can provide
unique insight into conspicuous connections.
CHAPTER 4: METHOD
Observational digital data from an online social software coding website, GitHub, are
used to test the hypotheses raised in the previous chapter. This chapter outlines the empirical
setting and the methodological approach. The first section in this chapter describes the empirical
site within which the conspicuous connection is examined. The second section develops the
social network analysis foundation and how networks are represented in the empirical setting.
The third section discusses how the concepts from the hypotheses are operationalized as
variables to explore how signaling through conspicuous connection operates. The final section
discusses the analytical techniques used to test the hypotheses.
4.1 GitHub as the Empirical Site
GitHub is a social coding website that facilitates user collaboration on software by
hosting those projects on its site. GitHub utilizes Git, an open-source distributed version
control system (VCS) for software development that allows team members to easily manage and
track changes to software projects, called repositories, even while working at the same time if
they wish. Git’s distributed architecture means that each team member can work on a local copy
of the repository that contains the full history of all code changes before merging any changes to
a remote repository that is hosted somewhere else, such as GitHub.
Git involves several features that allow individuals to write and debug code and interact
with one another, while the GitHub site includes several more social features on top of the Git
VCS. Git allows users to branch a project, meaning they have a local version on their machine
and are working on it independently without affecting the master branch. Once users branch off
the master or other upstream branch, they create “commits,” which are any changes to the project
code including additions, edits, and deletions, with each one of these commits “considered a
separate unit of change” (“Understanding the GitHub Flow · GitHub Guides,” n.d.). Commits are
also associated with messages that provide for transparency of work and ease of understanding
for fellow coders on the same project. When changes have been made between the working
branch and the upstream branch, a user can open a “pull request” to initiate discussion on their
commits. Those reviewing the code may then ask questions, make comments, or have other types
of conversations surrounding the changes. After this discussion and review, the next step is to
deploy the changes in production before merging. Once the changes are verified and do not cause
issues, the branch can be merged with the master or upstream branch. Figure 4.1 is reproduced
from the GitHub website and illustrates the workflow from the GitHub guide.
Figure 4.1 The GitHub Flow (reproduced from “Understanding the GitHub Flow · GitHub
Guides,” n.d.)
In addition to the Git capabilities, GitHub allows users to connect with one another and
engage socially, like other social media network sites (Boyd & Ellison, 2007; Kane et al., 2014).
One of those social capabilities is to follow or be followed by other users and to see who others
are followed by and follow. When a user follows another on GitHub, their activity populates on a
front page feed similar to Facebook’s News Feed or Twitter’s timeline. This allows users to keep
track of others’ projects, work, and contributions. The user who is followed now has an increase
in “Followers,” and both the aggregate number of followers and the individual followers’ names
(and hyperlinks) are visible through their profile page. The user who followed the other user now
has an increase in their “Following” number, and again both the aggregated number of those they
are following as well as the individual names (and hyperlinks) are visible through their profile
page. This allows users to see who is popular and specifically who follows whom. Another social
and communicative function allows users to request feedback from a specific user or team on a
Pull Request with the @mention function.
GitHub is growing quickly. In 2011, there were about 1.2 million users and 1.7 million
repositories, and in 2012 the site grew to 2.8 million users and 4.6 million repositories (Doll,
2012). In 2016, GitHub hosted 19.4 million active repositories written by 5.8 million users in
316 different programming languages (“GitHub Octoverse 2017,” n.d.; La, 2016; Ramel, 2016).
The features on GitHub that allow people to interact and collaborate provide for an
opportunity to examine the hypotheses regarding expertise and signaling about and through
networks, the visibility of and information in those signals, and the development of large TM
system networks. GitHub works as a case study to test the relevant hypotheses because it is an
environment that relies on users’ expertise to complete software projects, where a large user base
is distributed across the world, and there are often information asymmetries regarding who are
the experts in different programming languages.
In addition, literature examining GitHub suggests that people do make judgments based
on the publicly visible information that is shared specifically on GitHub, including who follows
whom and the number of follows. For example, Dabbish et al. (2012) interviewed GitHub users
and found that “the number of followers a developer had was interpreted as a signal of status in
the community. Developers with lots of followers were treated as local celebrities (e.g. dhh).
Their activities were retold almost as local parable; our interviewees knew a great deal about
them and paid attention to their actions (P20)” (p. 1281). They also found that users
“…described following the actions of other developers because they deemed them
particularly good at coding. They referred to these developers with thousands of
followers as ‘coding rockstars’ (P20) and reported interest in how they coded, what
projects they were working on, and what projects they were following. In most cases this
was because these developers were deemed to have special skill and knowledge about the
domain (P17, P20) in part as a function of their large following” (Dabbish et al., 2012,
pp. 1283–1284).
This suggests that connections play a role in how people evaluate the expertise of others. This
provides an opportunity to examine how conspicuous connection signals may impact the
development of software projects and the formation of future following networks.
The GitHub platform allows for the development of software projects by often large and
dispersed groups of people, which is a different environment compared to most TM systems
work that focuses on small, collocated groups. This is an opportunity to understand how TM
systems theory may work in this context, with the addition of signaling as a mechanism for the
development of this shared information-processing and cognition. Also, because much of the
work using this theory focuses on how a well-functioning TM system can impact group-level
outcomes, the team’s output will be a good outcome variable that is comparable to other
research. For this reason, it is worth exploring if conspicuous connection signaling impacts how
groups are formed around software projects and the outcomes surrounding software projects like
the release of software versions or the number of commits, or changes to the code. Software is
considered a knowledge product that requires expertise to contribute to, and the contributions
themselves are usually sections of code.
One year of time-series “Follow Event” data from GitHub in 2013 were gathered from
GitHub Archive (GitHubarchive.org), “a project to record the public GitHub timeline, archive it,
and make it easily accessible for further analysis” (“GitHub Archive,” n.d.). The data are stored
on Google BigQuery, where different “events” data (the different ways that users interact with
their projects, the website, and each other) can be queried using SQL-like queries. The Follow
Events data include each user's username and unique ID number and the username and unique
ID number of who they followed, the time the follow was added, as well as information about
the followed user. While the database is updated every hour, the "Follow Events" that were
captured in their API until 2013 are no longer created. The person-to-person network is created
using this data.
In addition to the follow data, other events data within the same time period has been
gathered to examine the outcomes associated with users and teams. The main events in the API
that are useful for operationalizing concepts in this study are Pull Request Events. These occur
when a user notifies others about changes they have made in a branch of the original, allowing
others like the owner to decide whether to merge the changes with the master branch.
Additionally, in 2011 and 2012 the API did not capture many of the details for the Pull Request
events, which contain the language, popularity, and contribution information on repositories. For
this reason, the data analyzed in this study are limited to 2013. Section 4.2 expands on these
event types and how they relate to the concepts of interest in the previous chapter.
4.2 Operationalizing Concepts
The way people interact on GitHub is captured through the dataset described in
Section 4.1. This section provides more detail on the specific variables captured in the dataset
that can be used to operationalize the concepts in the hypotheses. Table 4.1 summarizes the
concepts, hypotheses, operationalizations, and event types in the database.
Table 4.1 Concepts and Operationalizations

| Concept | Hypotheses | Measured Variable(s) | Event Type in API |
|---|---|---|---|
| Expertise claims or the “unobservable quality” of being an expert | H1(c), H2, H3, H5 | The programming language label for users’ personal repositories | Pull Request Events |
| Conspicuous connections | H2, H3, H5, H6 | Being followed by another user (who claims expertise in a programming language) | Follow Events for follows; Pull Request Events (claims of expertise based on owned repository programming language) |
| Visibility of signals | H1(a-d), H4 | The number of people who follow and who are followed by a user (more follows → greater visibility) | Follow Events: aggregated |
| Information in signals | H1(a-d), H4 | Calculation of information based on Skyrms (see Section 2.2) | Follow Events for conspicuous connection signals; Pull Request Events for claims of expertise for states of nature |
| Request for contribution | H1(a) | Assignment to a pull request | Pull Request Events |
| Contribution / post signal performance | H4 | Number of times an individual contributes to repositories that are not their own | Pull Request Events |
| Spread of content | H1(d) | Number of watchers on all projects to which the user contributes | Pull Request Events |
| Success of projects | H6 | Number of watchers on the repository of interest | Pull Request Events |
The first important concept is claimed expertise. On GitHub, people contribute to others’
software projects and also create software projects on their own. The projects are created using
specific programming languages. The programming languages of the repositories are labeled
automatically by GitHub through the “Linguist library” (linguist, 2011/2018), which is an open
source program that GitHub employs to analyze in which language the project is being written.
This alleviates any concern that a programmer will incorrectly label a language, either
intentionally or unintentionally. The label is an indication of the programming language in which
the user is claiming to be an expert. This is captured in the Pull Request Events, where events are
captured when individuals interact with their own repositories or other repositories and the
detailed information about that repository is contained in that event. The site does not have other
capabilities for individuals to express their claimed expertise, but this provides a good proxy.
The PE network can be built with these user-to-repository programming language data.
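As a rough illustration of how user-to-language ties could be derived from event records, the sketch below attaches each repository's language label to the repository's owner. The record layout and field names (`actor`, `repo_owner`, `repo_language`) are hypothetical simplifications and do not reproduce the actual GitHub Archive schema.

```python
from collections import defaultdict

# Hypothetical, simplified event records; the field names are illustrative only.
events = [
    {"actor": "alice", "repo_owner": "alice", "repo_language": "Python"},
    {"actor": "alice", "repo_owner": "bob", "repo_language": "Ruby"},
    {"actor": "bob", "repo_owner": "bob", "repo_language": "Ruby"},
    {"actor": "bob", "repo_owner": "bob", "repo_language": None},
]


def expertise_claims(records):
    """Map each repository owner to the set of language labels on their repositories."""
    claims = defaultdict(set)
    for rec in records:
        language = rec["repo_language"]
        if language is None:
            continue  # skip repositories without a language label
        claims[rec["repo_owner"]].add(language)
    return dict(claims)


print(expertise_claims(events))
```

Note that in this sketch contributing to someone else's repository does not create a claim for the contributor, because only owned repositories count as expertise claims.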
Next is the existence of conspicuous connection signaling. For the year of 2013, Follow
Events were captured with the timestamp of the follow. This tells us exactly when each user was
followed and by whom they were followed. The PP network at any point in time is built with
these follow data. In the aggregate, the number of people who follow each user is counted to
establish how visible these conspicuous connections are, which is a necessary variable for the
hypotheses dealing with visibility. Chapter 3 develops a list of different types of conspicuous
connection signals (see Table 3.1). One of those types of conspicuous connection signals is
incoming, aggregated social relations. This is the type of signal that is examined here for both the
existence of conspicuous connection signaling as well as the visibility of those signals. Visibility,
or varying levels of conspicuousness, is operationalized as the starting number of followers
(incoming social relation, or acknowledgement conspicuous connection signals) for each
individual on GitHub in the beginning of 2013. Those individuals with the largest number of
conspicuous connection signals at the beginning of 2013 have the highest visibility, while those
with the fewest signals have the lowest visibility. This follows from the conceptual definition
provided in Chapter 3 because a larger number is assumed to be easier to interpret by receivers,
whereas a smaller number requires individuals to dig deeper into who is following in order to see
if those followers corroborate the expertise of the sender. The sender of the signal is in this case
the target of connection, as explicated in Chapter 3 and illustrated in Figures 3.1 and 3.2.
Senders’ profile elements showcasing the "number of followers" signal their expertise to
receivers.
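The visibility measure can be sketched as a simple count of incoming follows on either side of a cutoff date. The example assumes timestamped follow records of the form (follower, followed, time) are available both before and during the observation window. The records and the function name are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical follow events: (follower, followed, timestamp).
follow_events = [
    ("u1", "u3", datetime(2012, 11, 2)),
    ("u2", "u3", datetime(2012, 12, 30)),
    ("u1", "u2", datetime(2013, 3, 15)),
    ("u4", "u3", datetime(2013, 6, 1)),
]


def follower_counts(events, start, end):
    """Count incoming follows per user before `start` and within [start, end)."""
    starting, added = Counter(), Counter()
    for _follower, followed, when in events:
        if when < start:
            starting[followed] += 1
        elif when < end:
            added[followed] += 1
    return starting, added


starting, added = follower_counts(
    follow_events, datetime(2013, 1, 1), datetime(2014, 1, 1)
)
print(dict(starting), dict(added))
```

The first counter corresponds to starting visibility and the second to connections added during the year, which is the outcome used for Hypothesis 1(b).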
Another aspect of the hypotheses concerns the actions of the receivers. On GitHub, users on
projects can request that another user contribute to their project, and this is captured in Pull
Request Events when a user is assigned to another’s repository or collaborates on another’s
repository. This is necessary for the hypotheses regarding receiver behavior and post signal
performance of the sender. Spread of content is operationalized as the total number of watchers
on all projects to which a user contributes. On GitHub, people can flag a repository by
“watching” it, and this is captured in the API. This allows users to keep track of the progress on
the repository by being notified of conversations happening on the repository. This means that
there are more eyes on the repository. Previous research has also found that users on GitHub
view the number of watchers as an important indicator of the quality, usefulness, or importance
of that repository (Dabbish et al., 2012). The success of projects is also operationalized as the
number of watchers, but this time for each specific repository as the unit of analysis. Just as the
number of eyes on all repositories for individuals is a measure of spread of that content, eyes on
one repository indicates that the repository is interesting for others. The number of stargazers
would also be a good indicator of spread of content. When an individual flags a repository by
“starring” it, rather than following all of the activity of that repository they instead place it into a
folder so that they can easily find it again, essentially bookmarking the project. There are two
reasons why this study uses the number of watchers rather than the number of stargazers. First,
the number of watchers is a higher bar for spread because people are engaging with the
repository in a more substantial manner. They are not merely bookmarking the project, but rather
having that project’s activity populate on their personal feed. As such, this could be considered a
stricter and more conservative measure than the stargazers number. Second, the dataset gathered,
which will be discussed in the next chapter, does not include the number of stars on a large
number of repositories, while the number of watchers is a more consistent measure. In order to
avoid any sampling issues by not including those repositories without number of stars but which
may in fact have stars or where other issues with the data may exist, this study instead utilizes
watchers as a more consistent measure. This is also the case with success of projects. Projects are
successful if they have many people engaged with the project. While stars may be another
indicator, the same issues exist, and so the number of watchers per repository is also the
operational measure of success of a project.
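The two watcher-based measures differ only in their unit of analysis, which a short sketch makes concrete. The repository names, watcher counts, and function names below are hypothetical.

```python
# Hypothetical inputs: watcher counts per repository and the repositories
# each user contributed to.
watchers = {"repo_x": 120, "repo_y": 15, "repo_z": 3}
contributions = {
    "alice": {"repo_x", "repo_y"},
    "bob": {"repo_z"},
}


def spread_of_content(user, contributions, watchers):
    """Total watchers across every repository the user contributed to (user-level measure)."""
    return sum(watchers.get(repo, 0) for repo in contributions.get(user, ()))


def project_success(repo, watchers):
    """Watcher count for one repository (repository-level measure)."""
    return watchers.get(repo, 0)


print(spread_of_content("alice", contributions, watchers))
print(project_success("repo_x", watchers))
```

Repositories with no recorded watcher count are treated as zero in this sketch, which is an assumption made for simplicity rather than a documented handling rule.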
On GitHub, people can signal using conspicuous connections by following or being
followed by another user (or set of users). To calculate the amount of information in a
conspicuous connection signal, recall Skyrms’ (2010) formula expanded on in Section 2.2. There
are two possible states for each programming language on the website: expert or not expert. The
unconditional probability of a user being an expert in each programming language is calculated
by examining the total number of claimed experts in a language and dividing by the total number
of people on the site. The probability of someone being an expert given conspicuous connection
signaling is calculated by obtaining figures from contingency tables, which cross tabulate
expertise with signaling and provide the necessary information. This will be expanded on in the
next chapter. This provides the necessary components of the formula. The next section describes
the analysis techniques used to examine the hypotheses using these measured variables.
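The information calculation described above can be sketched from contingency-table counts. The example assumes the formula from Section 2.2 takes the log-ratio form, log2 of P(expert | signal) divided by P(expert). If the formula there differs, the function body would need to change accordingly. The counts are hypothetical.

```python
import math


def information_in_signal(n_total, n_experts, n_signalers, n_expert_signalers):
    """Bits of information a signal carries in favor of the 'expert' state.

    Assumes the log-ratio form log2(P(expert | signal) / P(expert)).
    """
    p_expert = n_experts / n_total
    p_expert_given_signal = n_expert_signalers / n_signalers
    return math.log2(p_expert_given_signal / p_expert)


# Hypothetical contingency-table counts for one programming language.
bits = information_in_signal(
    n_total=1000, n_experts=100, n_signalers=200, n_expert_signalers=80
)
print(bits)
```

With these assumed counts, signalers are four times as likely to be experts as a randomly chosen user, so the signal carries about two bits. A signal whose senders are experts at exactly the base rate carries zero bits.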
4.3 Analyses for Hypotheses
This section outlines the statistical analyses performed to test the hypotheses that are
reported in the next chapter, as well as the criterion used to evaluate competing models.
Hypothesis 1(a) states that as the visibility and/or information of conspicuous connection signals
observed by a receiver about an individual increases, the number of contribution requests to that
individual increases. Either Poisson or negative binomial regression is the appropriate
statistical model when there is a non-negative count dependent variable (number of contribution
requests) and a non-negative count independent variable (visibility of connections, operationalized as
the number of follows). A count variable is a common form of a “limited dependent variable,”
meaning it is not unbounded and continuous, and it cannot be assumed to have constant variance
or a normal error term (Coxe, West, & Aiken, 2013). In this case, a generalized linear model like
a Poisson regression must be performed where these assumptions are relaxed. Poisson
regression, however, relies on one parameter that is equal to both the mean and variance, so if
the dependent variable’s variance exceeds its mean, a negative binomial regression is a
more appropriate model that accounts for overdispersion by estimating a dispersion parameter
(Hilbe, 2011). Hypothesis 1(b) states that as the visibility and/or information of conspicuous
connection signals observed by a receiver about an individual increases, the number of
connections that individual gets in the future increases. Again, a Poisson or negative binomial
regression model can be used in this case, as both the visibility of connections (operationalized
as the total number of starting follower connections) and future connections
(operationalized as the total number of connections added through the year) are count variables.
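A quick descriptive screen for the choice between the two count models is to compare the sample variance of the dependent variable with its mean. The sketch below is only a heuristic on the raw counts, with an arbitrary illustrative threshold. It is not a formal overdispersion test and does not condition on the predictors. The data are hypothetical.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance divided by sample mean; close to 1 under a Poisson model."""
    return variance(counts) / mean(counts)


def suggest_count_model(counts, threshold=1.5):
    """Crude screen: suggest negative binomial when variance clearly exceeds the mean.

    The threshold is an arbitrary illustrative choice, not a formal test.
    """
    return "negative binomial" if dispersion_ratio(counts) > threshold else "Poisson"


# Hypothetical counts of contribution requests per user.
requests = [0, 0, 1, 0, 2, 0, 0, 35, 1, 0]
print(dispersion_ratio(requests), suggest_count_model(requests))
```

Heavy-tailed counts like these, where a few users receive most of the requests, produce a ratio far above 1 and point toward the negative binomial specification.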
Hypothesis 1(c) states that as the visibility and/or information of conspicuous connection
signals observed by a receiver about an individual increases, that individual is more likely to be
an expert. Logistic regression is appropriate, because the independent variables (visibility and/or
information of signals) are continuous and the dependent variable (whether someone is an
expert) is dichotomous. Hypothesis 1(d) states that as the visibility of conspicuous connection
signals observed by a receiver about an individual increases, that individual’s content will spread
more widely. In this case, there is a continuous independent variable (visibility and/or information of
signals) and a count dependent variable (spread of content measured as number of watchers on
repositories), so Poisson or negative binomial regression is the appropriate model.
Hypothesis 2 states that experts will signal more using conspicuous connection signals
than those with no expertise. In this case, independent t-tests can compare the two groups,
experts and non-experts, within the different programming languages in terms of the mean
number of signals sent by those in each group.
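For reference, the test statistic for such a two-group comparison can be computed directly. The sketch below uses the classic equal-variance (pooled) form of the independent-samples t-test. Whether the pooled or an unequal-variance version was used in the analysis is not stated here, so this choice is an assumption, and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees of freedom) for an equal-variance two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical numbers of signals for experts and non-experts in one language.
experts = [4, 5, 6]
non_experts = [1, 2, 3]
t, df = pooled_t_statistic(experts, non_experts)
print(t, df)
```

A positive t value here means the expert group sent more signals on average, which is the direction Hypothesis 2 predicts.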
Hypothesis 3 states that as the proportion of non-experts increases in comparison to
experts, experts will send more conspicuous connection signals. The proportions are within
programming language groups, so this means that the level of analysis is the programming
language groups. The proportion of experts to non-experts can be calculated for each language
group as the independent variable of interest. The mean number of conspicuous connection
signals sent can be calculated for experts within each group as the dependent variable. The
independent variable is continuous but the dependent variable is a mean based on count of
connections, so after checking assumptions a linear regression may be an appropriate analysis.
Hypothesis 4 states that the post signal performance of those individuals that send high
visibility and/or high information signals will be better than those that send low visibility (or low
information) signals. The visibility of signals and the post signal performance are count
variables, so like H1(a), H1(b), and H1(d), a Poisson or negative binomial regression will be an
appropriate analysis after checking assumptions. Hypothesis 5 states that it is more likely that
users will have connections with others in the PP network who claim the same expertise area.
This homophily hypothesis is tested by examining Newman’s assortativity coefficient, where a
coefficient of 1 means only experts connect with experts and non-experts connect with non-
experts with no crossover connections, and where a coefficient of -1 means there is only
connection between experts and non-experts and no people of the same type connect. A
coefficient of 0 means that the amount of within-group connection is random, with some within-
group and some between-group connection. The next chapter provides more details about this
analysis.
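The coefficient can be computed from a list of ties and a type label for each node. The sketch below follows the standard definition for a categorical attribute: the observed share of same-type ties minus the share expected by chance, divided by one minus that expected share. Treating follows as directed ties and the names used are assumptions made for illustration.

```python
from collections import Counter


def assortativity(edges, node_type):
    """Newman's assortativity coefficient for a categorical node attribute.

    edges: iterable of (source, target) pairs, treated as directed ties.
    node_type: dict mapping each node to its category.
    """
    edges = list(edges)
    m = len(edges)
    pair_counts = Counter((node_type[s], node_type[t]) for s, t in edges)
    out_share = Counter()  # share of ties that start at each type
    in_share = Counter()   # share of ties that end at each type
    for (i, j), count in pair_counts.items():
        out_share[i] += count / m
        in_share[j] += count / m
    same_type = sum(c for (i, j), c in pair_counts.items() if i == j) / m
    expected = sum(out_share[k] * in_share[k] for k in out_share)
    if expected == 1:
        raise ValueError("coefficient is undefined when all ties join a single type")
    return (same_type - expected) / (1 - expected)


# Hypothetical users labeled as expert or non-expert in one language.
types = {"a": "expert", "b": "expert", "c": "non-expert", "d": "non-expert"}
print(assortativity([("a", "b"), ("c", "d")], types))
print(assortativity([("a", "c"), ("c", "a")], types))
```

The first example contains only within-type ties and the second only cross-type ties. Note that with more than two categories or very unequal group shares, the lowest attainable value of the coefficient can lie above -1.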
Hypothesis 6 states that team projects that request more contributions from individuals
with a greater number of conspicuous connections will be more successful. This hypothesis takes
the software coding projects as the units of analysis. The independent variable for this hypothesis
is conceptualized as the average number of conspicuous connections of the individuals who
contribute to a repository. The dependent variable of success is operationalized as the number of
watchers on a project. Because watchers is a count, a Poisson or negative binomial regression is
the appropriate analysis technique depending on the dispersion of the dependent variable.
For the models where appropriate, Akaike’s information criterion (AIC) is used to
evaluate relative performance and select the model that is the best fit for the data. The AIC is an
information theoretic criterion that is advocated by many scholars as one of the most effective
and useful methods of model selection (Burnham & Anderson, 2002). The AIC is calculated
using the formula
−2 log(ℒ) + 2𝐾
where log(ℒ) is the maximized log-likelihood and 𝐾 is the number of parameters. Models with
more parameters are penalized. The value of each AIC cannot be evaluated independently, but
rather the relative values between models are examined and the model with the lowest AIC is
selected as the best fitted model for the data. AIC values depend on the maximum likelihood
value which is determined through maximum likelihood estimation of the parameters of the
model.
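The AIC formula above can be sketched in a few lines. The function names and the log-likelihood values in the example are hypothetical and serve only to show how the penalty for additional parameters enters the comparison.

```python
def aic(log_likelihood, n_params):
    """Akaike's information criterion: -2 log(L) + 2K."""
    return -2.0 * log_likelihood + 2.0 * n_params


def best_by_aic(models):
    """Return the name of the lowest-AIC model and all AIC scores.

    models maps a model name to (maximized log-likelihood, number of
    parameters). Only the relative values are meaningful.
    """
    scores = {name: aic(ll, k) for name, (ll, k) in models.items()}
    return min(scores, key=scores.get), scores


# Hypothetical values: the larger model fits slightly better, but not by
# enough to offset the penalty for its two extra parameters.
candidates = {"small": (-100.0, 3), "large": (-99.5, 5)}
winner, scores = best_by_aic(candidates)
```

In this made-up comparison the smaller model is selected, because the gain in log-likelihood from the extra parameters is smaller than the penalty they incur.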
The next chapter delves into the results of these analyses.
CHAPTER 5: RESULTS
This chapter details the results of the analyses performed using the GitHub data. The first
section provides the descriptive statistics, giving empirical background on the website, the users who
interact on the website, and the projects they own and on which they collaborate. The next
section examines the individual-level hypotheses that serve as the foundation for the theory. The
final section discusses the results for the system- and network-level hypotheses.
5.1 Descriptive Statistics on GitHub
In 2013, there were 259,556 unique public repositories[13] (i.e., software projects started
and/or maintained by users) on GitHub where pull requests were initiated, with 140,415 unique
users who own those repositories. Pull requests occur whenever someone submits proposed
changes to a repository. Pull requests act as “the heart of collaboration on GitHub” (“Hello
World · GitHub Guides,” 2016), because this is when proposed changes become visible to other
collaborators. At this point, others can review changes, make comments, and provide
suggestions. This is an important aspect of the forthcoming analysis, as many of the dependent
and independent variables come from pull request events. There were approximately 3.1 million
pull requests within the almost 260 thousand repositories. Of those, 682,344 pull requests were
acted on by the same user who owned the repository, while the remaining 2,431,650 pull
requests were acted on by a different user than who owned the repository.
Pull requests can be assigned to other users if the requester chooses, but most pull requests
were not assigned to anyone else. In fact, about 3 million pull requests had no assigned user.
The remaining 79,548 pull requests included an assigned user, with 18,338 unique users
assigned to those requests.
[13] All repositories and users accessed and reported in this study are fully public and
available through the GitHub API.
The repositories with pull requests in 2013 were labeled with 131 unique
programming languages. See Figure 5.1 for a bar graph showing the frequencies of the top 20
languages. The most frequently-used was JavaScript, with 45,750 repositories labeled with this
language. Many repositories were not labeled with a programming language (null = 39,131).
Figure 5.1 Top 20 Programming Languages on GitHub in 2013
Most important in this study is the main independent variable, conspicuous connection
signals. Throughout 2013, there were approximately 1.6 million follow events, where one user
followed another user, and this information was publicly visible to other users in the form of
both individual follows and aggregated counts. These follow events occurred among 602,246
unique users, 430,306 of whom were followed by another user in the dataset. The first follow event in
2013 occurred on January 1 at 8:00:48 Coordinated Universal Time (UTC), while the last
recorded follow event occurred on December 11 at 18:59:19 UTC. GitHub stopped recording
follow events in their API at the end of 2013. Although users can still follow other users on
GitHub, the events are no longer recorded as such in the API. While these data are longitudinal
and one person can have multiple follow events over time, for analysis the data are collapsed and
aggregated into a cross-section of the whole year. For this reason, some of the data dependency
issues in the data that would be problematic for many types of analyses are avoided. However,
the discussion chapter outlines some of the limitations that still exist.
The user with the most added followers through 2013 was ‘mojombo.’ This is the same
user profile that was examined in Chapter 3 as an example of how acknowledgment conspicuous
connections can be displayed on social media profiles. This user had 8,074 added followers
through 2013, and ended the year with 16,381 total followers. As of November 2017, this user
is still on GitHub with 20.6 thousand followers (“mojombo (Tom Preston-Werner),” n.d.). This
one user's profile shows that he is connected directly to his other online and offline personas as
a coder, through his personal website and apparent legal name. Users ended the year with an
average of 7.41 total followers, with a standard deviation of 64.44.
Information in a conspicuous connection signal can be calculated by looking at
contingency tables of each programming language, where the proportions of those who have at
least one incoming connection from someone in that language and those who do not are cross
tabulated with the proportions of those who are experts in that programming language and those who are not.
Table 5.1 provides an example of the cross tabulation of JavaScript signaling and JavaScript
experts.
Table 5.1 Example Contingency Table for JavaScript – Signaling and Expertise
                                          Expert in JavaScript
Acknowledgement Conspicuous
Connection Signal (JavaScript)      No        Yes       Total
No                                  0.8386    0.0216    0.8602
Yes                                 0.1201    0.0197    0.1398
Total                               0.9587    0.0413    1.0000
This table shows the joint proportion in each cell, with the marginal proportions given as totals. For example, to see
the total proportion of people who are experts in JavaScript, look to the third row (Total), second
column (Yes – Expert), and this equals 0.0413, or 4.13 percent of the 430,306 individuals are
experts in JavaScript. To see the total proportion of people who have a connection from someone
who is an expert in JavaScript, look to the second row (Yes – Signal), third column (Total), and
this equals 0.1398, or 13.98 percent of people in the network have received one or more
incoming connections from an expert in JavaScript, meaning they have one or more
acknowledgement CC signals. These values give all of the necessary information for the
information in signals equation from Chapter 2. This was,
\[ I(\mathit{signal} \mid \mathit{states}) = \sum_{i} P(\mathit{state}_i \mid \mathit{signal}) \, \log_2 \left[ \frac{P(\mathit{state}_i \mid \mathit{signal})}{P(\mathit{state}_i)} \right] \]
where 𝐼 (𝑠𝑖𝑔𝑛𝑎𝑙 | 𝑠𝑡𝑎𝑡𝑒𝑠 ) is the information in the signal given all the states,
𝑃 (𝑠𝑡𝑎𝑡𝑒 𝑖 | 𝑠𝑖𝑔𝑛𝑎𝑙 ) is the conditional probability of each state i given the signal, and 𝑃 (𝑠𝑡𝑎𝑡𝑒 𝑖 )
is the unconditional probability of each state i. In this case, the states of nature are “Expert” and
“Not Expert” in the particular programming language. The signal in this study is having
a conspicuous connection with at least one person in the particular programming language. Given the
rules of conditional probability,
\[ P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)} \]
where 𝑃 (𝐴 | 𝐵 ) is the conditional probability of event A given event B, 𝑃 (𝐴 𝑎𝑛𝑑 𝐵 ) is
the joint probability of events A and B, and 𝑃 (𝐵 ) is the unconditional probability of event B. As
such, the conditional probability of each state (expert and not expert) given the signal can be
obtained via the contingency tables. To obtain 𝑃 (𝑠𝑡𝑎𝑡𝑒 𝑖 | 𝑠𝑖𝑔𝑛𝑎𝑙 ), or the conditional probability
of being an expert given a signal and the conditional probability of being a non-expert given a
signal, the figures for 𝑃 (𝐴 𝑎𝑛𝑑 𝐵 ) (being an expert and having a signal, row 2 column 2; being a
non-expert and having a signal, row 2 column 1) and for 𝑃 (𝐵 ) (having a signal, row 2 column 3)
can be found in the contingency table. In the case of Table 5.1 for JavaScript, the joint
probability of being an expert (A) and having a conspicuous connection signal (B) is
𝑃 (𝐴 𝑎𝑛𝑑 𝐵 ) = 0.0197. The joint probability of being a non-expert (~A) and having a conspicuous
connection signal (B) is 𝑃 (~𝐴 𝑎𝑛𝑑 𝐵 ) = 0.1201. The unconditional probability of having a
conspicuous connection signal in JavaScript is 𝑃 (𝐵 ) = 0.1398. With these numbers, the
conditional probability of being an expert in JavaScript given a JavaScript conspicuous
connection signal is,
\[ P(\mathit{expert} \mid \mathit{signal}) = \frac{0.0197}{0.1398} = 0.1406 \]
The conditional probability of being a non-expert given a signal is,
\[ P({\sim}\mathit{expert} \mid \mathit{signal}) = \frac{0.1201}{0.1398} = 0.8594 \]
Plugging these numbers into the above equation to calculate the information in a signal,
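\[ I(\mathit{signal} \mid \mathit{expert} \text{ and } {\sim}\mathit{expert}) = \left( 0.1406 \cdot \log_2 \left[ \frac{0.1406}{0.0413} \right] \right) + \left( 0.8594 \cdot \log_2 \left[ \frac{0.8594}{0.9587} \right] \right) \]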
which equals 0.1131 bits of information contained in one JavaScript conspicuous
connection signal. This can be calculated for every language group that has conspicuous
connection signaling and experts. Of the 131 programming languages, the information in signals
can be calculated for 76 languages, where there are proportions in all necessary five cells in the
contingency table. Figure 5.2 provides a histogram highlighting the distribution of information in
signals.
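The calculation for a single language can be sketched as follows, using the rounded JavaScript proportions from Table 5.1. The function and variable names are hypothetical. Because the table entries are rounded to four decimals, the result agrees with the reported 0.1131 bits only approximately; the reported intermediate values were presumably computed from unrounded proportions.

```python
from math import log2


def signal_information(joint_with_signal, prior):
    """Information (in bits) carried by a signal about a set of states.

    joint_with_signal maps each state to P(state and signal);
    prior maps each state to the unconditional P(state).
    """
    p_signal = sum(joint_with_signal.values())  # P(signal), the row total
    bits = 0.0
    for state, joint in joint_with_signal.items():
        posterior = joint / p_signal  # P(state | signal)
        if posterior > 0:             # a zero-probability state adds nothing
            bits += posterior * log2(posterior / prior[state])
    return bits


# Rounded proportions from the "Yes" signal row and the "Total" row of Table 5.1.
javascript_bits = signal_information(
    joint_with_signal={"expert": 0.0197, "non_expert": 0.1201},
    prior={"expert": 0.0413, "non_expert": 0.9587},
)
```

The same function can be applied to each language group's contingency table, provided the signal row and both state totals are non-zero.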
Figure 5.2 Histogram of Information in Signals
Most languages’ signals have information between 0 and 1 bits, while a few
languages’ signals have high information. The top 10 languages in terms of their signals’
information include Ada (2.76 bits), Ceylon (2.01 bits), Eiffel (1.83 bits), F# (1.27 bits), Squirrel
(1.20 bits), Apex (1.12 bits), ColdFusion (1.11 bits), Julia (1.04 bits), OCaml (1.03 bits), and Tcl
(0.98 bits).
[Figure 5.2: histogram; x-axis: Information in Signals (0.0 to 2.5 bits); y-axis: Frequency (0 to 20)]
Prior to hypothesis testing, it should be noted that because of the large size of the dataset,
p-values will very often be in the significant range and thus can generally be considered low in
information value (M. Lin, Lucas Jr, & Shmueli, 2013). In addition, the analyses of several of the
hypotheses include non-independent observations (e.g., individual contributions and success of
projects), so the standard errors and p-values cannot be interpreted accurately in these cases.
While the below sections do report p-values in order to adhere to standard reporting practices,
the written findings emphasize practical implications by interpreting coefficients and evaluating
model fit through other substantive criteria, such as the Akaike information criterion (AIC), which
are described as they are used. The term effect size used in the next section refers to the
“magnitude of the differences found” (Sullivan & Feinn, 2012, p. 281), or how much the
independent variables impact the dependent variable on a unit-by-unit basis or as percentage
change.
5.2 Results for Individual-Level Hypotheses
Hypothesis 1(a) states that as the visibility and/or information of conspicuous connection
signals observed by a receiver about a sender increases, the number of contribution requests to
that sender increases. The independent variable of visibility is operationalized as the starting
number of followers in 2013 (mean = 4.82; SD = 41.51; min = 0; max = 9052). The independent
variable of information in signals is operationalized as the sum of the information in bits of the
conspicuous connection signals (mean = 0.21 ; SD = 1.23; min = 0; max = 204.89). The
dependent variable, request contributions, is operationalized in this dataset as the number of
times a user is assigned to a pull request in 2013 (mean = 0.15; SD = 3; min = 0; max = 491).
However, as stated in the previous section, only 79,548 pull requests include an assigned
user, with 18,338 unique users assigned to those requests. This means that the dependent variable
has an excessive number of zeros, and this is likely because there are two separate processes
occurring: one where individuals do not interact with other repositories and only use GitHub to
create and store their own personal content, and another where individuals do interact with other
repositories and people, using the social functionality of the site. This can be modeled using a
zero-inflated generalized linear model, where the excessive zeros and the count are modeled as
two processes. In a zero-inflated model, excessive zeros are modeled as a binomial with a logit
link function (either zero or not zero), while the counts in the non-zero group are modeled as
either Poisson or negative binomial with a log link function. To model the excess zeros portion
of the mixture model, two variables are hypothesized to affect whether someone uses the social
functionality of GitHub or just uses it as a storage site: how often each user interacts with their
own repositories and how often others interact with each user’s repositories. These two variables
are included in the excess zeros portion of the model to predict whether someone is in the non-
social category (a certain zero) or the social category (a count that could be zero, but not
certainly).
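The two-process structure described above can be made concrete with a small sketch of the zero-inflated negative binomial probability mass function. The parameterization (mean `mu` and dispersion `theta` for the count part, `pi_zero` for the probability of a certain zero) and all names and example values are assumptions for illustration; they are not the fitted values from this study.

```python
from math import exp, lgamma, log


def nb_pmf(y, mu, theta):
    """Negative binomial probability of count y, with mean mu > 0 and
    dispersion theta > 0 (variance = mu + mu**2 / theta)."""
    log_p = (
        lgamma(y + theta) - lgamma(theta) - lgamma(y + 1)
        + theta * log(theta / (theta + mu))
        + y * log(mu / (theta + mu))
    )
    return exp(log_p)


def zinb_pmf(y, pi_zero, mu, theta):
    """Zero-inflated negative binomial probability of count y.

    With probability pi_zero the observation is a certain zero (the
    non-social group); otherwise it is drawn from the count process,
    which can itself produce a zero.
    """
    count_part = (1.0 - pi_zero) * nb_pmf(y, mu, theta)
    return pi_zero + count_part if y == 0 else count_part


# Hypothetical parameter values, chosen only to illustrate the shape.
p_zero = zinb_pmf(0, pi_zero=0.9, mu=1.5, theta=0.5)
total_mass = sum(zinb_pmf(y, 0.9, 1.5, 0.5) for y in range(5000))
```

The probability of observing a zero exceeds `pi_zero` because zeros arise from both processes, which is what distinguishes a "certain zero" from a zero count among social users.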
In order to determine whether the Poisson or negative binomial distribution is more
appropriate for the count portion of the model, it must be determined whether the variance is
equal to or greater than the mean, with Poisson more appropriate for the former situation while
negative binomial more appropriate for the latter. To determine this, a Poisson generalized linear
model (GLM) was run with number of assignments as the dependent variable and both visibility
(number of follows at the beginning of the year) and total information in signals as the
independent variables, and then a dispersion test was performed. The “dispersiontest”
function from the “AER” package in R implements a test of overdispersion as outlined in
Cameron and Trivedi (1990). This test determines whether the variance
equals the mean (the null hypothesis) or is greater than the mean (overdispersion).
In this case, the null hypothesis that the variance is equal to the mean can be rejected (dispersion
parameter c = 57.62; p < 0.001), showing evidence for overdispersion. As such, a negative
binomial regression, which estimates a dispersion parameter in addition to the mean, can be
performed for the count portion of the mixture model to test the hypothesis.
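The intuition behind the variance-versus-mean comparison can be shown with a much simpler quantity than the regression-based test used here. The sketch below computes the raw variance-to-mean ratio of a set of counts. It is not a reimplementation of the Cameron and Trivedi test or of the `dispersiontest` function, it ignores the independent variables entirely, and the example counts are hypothetical.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by the mean of a list of counts.

    A value near 1 is what a Poisson distribution implies; a value well
    above 1 points toward overdispersion. Rough heuristic only.
    """
    return variance(counts) / mean(counts)


# Hypothetical counts: mostly zeros with one very active user.
skewed_counts = [0, 0, 0, 0, 10]
index = dispersion_index(skewed_counts)
```

A ratio far above one, as in this toy example, is the pattern that motivates moving from a Poisson to a negative binomial specification.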
To interpret the coefficients of the zero-inflated negative binomial (ZINB) model, the
estimates are exponentiated and can then be interpreted as multiplicative changes for a one unit
increase in an independent variable, though the count and the zero-inflated portions of the model
must be interpreted separately. The zero-inflated portion of the model models the log odds of
being an excessive zero. The count portion of the model models the log of expected count as a
function of the independent variables. To change to the original units of interest for both parts of
the model for easier interpretation and to understand effect size, the inverse of a natural log is
taken, which is the exponential function with e as the constant (e^y).
For the zero-inflated portion of the model, the intercept provides the estimated log odds
of being in the excessive zero group, and exponentiating the intercept is equal to the estimated
odds of being in the excessive zero group. Exponentiating the coefficients on the independent
variables in the zero-inflated portion gives the expected change in odds of being part of the
excessive zero group, holding the other independent variables constant. The odds ratio (OR) can
be interpreted as the percentage change in the odds, where
\[ 100 \times (\mathit{OR} - 1.00) = \%\ \text{change} \]
If the OR is greater than one, there is a positive percentage change in the odds of being in
the excessive zero group. If the OR is less than one, there is a negative percentage change in the
odds of being in the excessive zero group. For example, if the odds ratio is equal to one, this
means there is no change in odds of being in the excessive zero group compared to the other
group, or a zero percent change in the odds. If the odds ratio is equal to two, this means the
odds of being part of the excessive zero group double, or in other words for
a one unit increase in the independent variable there will be a 100 percent increase in the odds of
being in the excessive zero group. If the odds ratio is equal to 0.5, this means the
odds of being part of the excessive zero group are halved, or in other words there
is a 50 percent decrease in the odds of being in the excessive zero group with a one unit increase
in the independent variable.
For the count portion of the model, exponentiating the intercept provides the expected
count of the dependent variable when the independent variables are zero. Exponentiating the
coefficients provides an incidence rate ratio (IRR), or in other words, the factor by which the
expected count of the dependent variable (contributions) is multiplied given a one unit change in
the independent variable(s) (visibility or information). The difference between the IRR and one
is the proportion change, which can be multiplied by 100 to get the percentage change in the
count. For example, if the IRR equals 1.5, for every one unit increase in the independent variable
the expected count is multiplied by 1.5, or a 50 percent increase. If the IRR is equal to one, there is
a zero percent change in count. If the IRR is less than one, the percentage change is negative, or
a decrease in the expected count.
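The conversion from a coefficient on the log or logit scale to a percentage change is the same for odds ratios and incidence rate ratios, and can be sketched as a single helper. The function name is hypothetical; the two example coefficients are the information coefficient and the Own Repo Interact coefficient reported for Hypothesis 1(a).

```python
from math import exp, log


def percent_change(coefficient):
    """Percentage change in the odds (zero-inflated part) or in the
    expected count (count part) for a one unit increase in a predictor.

    Exponentiating gives the OR or IRR; 100 * (ratio - 1) is the change.
    """
    return 100.0 * (exp(coefficient) - 1.0)


doubling = percent_change(log(2.0))   # ratio of 2
halving = percent_change(log(0.5))    # ratio of 0.5
no_change = percent_change(0.0)       # ratio of 1

# Coefficients reported for Hypothesis 1(a).
information_effect = percent_change(0.0804)
own_repo_effect = percent_change(-0.3112)
```

A ratio of one therefore always corresponds to no change, a ratio above one to an increase, and a ratio below one to a decrease. For the two reported coefficients this gives roughly an 8.4 percent increase and a 27 percent decrease.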
Table 5.2 displays the outcomes of the zero-inflated negative binomial models. The AIC
allows for a relative examination of the performance of different models on the same outcome
variable (Akaike, 1998). A lower AIC indicates a better fit. The ZINB model with information as
the only independent variable has the lowest AIC of all models evaluated (AIC = 124583.5 for the ZINB
model with information as the IV, compared to the second-lowest AIC = 124585.4 for the ZINB
full model). While the difference between the two models’ AICs is small, the coefficient on
visibility is no longer significant in the full model, so the model with information as the only
independent variable is both the best fit and the more parsimonious, and it is selected as the
better model.
Table 5.2 Comparing Zero-Inflated Negative Binomial Models for Hypothesis 1(a), DV =
contribution requests (number of pull request assignments), n = 430,306
                       ZINB (visibility only)   ZINB (information only)  ZINB full model          ZINB null model
                       Coefficient (SE)         Coefficient (SE)         Coefficient (SE)         Coefficient (SE)
Count Model:
Intercept              0.1690 (0.0205)***       0.1180 (0.0211)***       0.1183 (0.0211)***       -1.9279 (0.0152)***
Visibility             0.0023 (0.0003)***                                0.0001 (0.0003)
Information                                     0.0804 (0.0070)***       0.0792 (0.0009)***
Zero-Inflated Model:
Intercept              2.4356 (0.0299)***       2.4275 (0.0303)***       2.4273 (0.0303)***       -9.2370 (13.0620)***
Own Repo Interact      -0.3043 (0.016)***       -0.3115 (0.0165)***      -0.3112 (0.0164)***
Other Repo Interact    -0.1371 (0.0045)***      -0.1389 (0.0046)***      -0.1388 (0.0046)***
AIC                    124688.4                 124583.5                 124585.4                 163616.6
Note. * p < 0.05. ** p < 0.01. *** p < 0.001.
For the best ZINB model, the odds of being part of the non-social group when both of the
independent variables equal zero are e^(2.4273) = 11.33. This shows that it is much more likely that an
individual on GitHub will not use the social functionalities, including interacting with others.
This may be because these people use GitHub as a storage site or do not engage with the site
beyond their initial profile creation. For every one unit increase in the amount individuals
interact with their own repository (Own Repo Interact), the odds of being part of the non-social
group are multiplied by e^(-0.3112) = 0.73 (holding all else constant), or a 27 percent decrease in the odds
of being in the non-social group. For every one unit increase in the amount individuals interact
with other repositories (Other Repo Interact), the odds of being part of the non-social group
are multiplied by e^(-0.1388) = 0.87 (holding all else constant), or a 13 percent decrease in the odds of
being in the non-social group. This shows that any interaction, either with their own or other
repositories, makes it less likely they will be part of the non-social, “storage-only” group.
For the count part of the best ZINB model (with information as the only independent
variable in the count portion), the baseline number of contribution requests is e^(0.1180) = 1.13
among those who have a chance of being social. A one unit increase of information multiplies the
expected count of assignments by e^(0.0804) = 1.084, or an 8.4 percent increase in contribution
requests (holding all else constant). While this is practically a fairly small increase, the
discussion chapter outlines some reasons why this increase is so small and also discusses some
potential future avenues for research that could lead to more insight. These findings provide
evidence in support of the part of H1(a) which says that as information in signals from a sender
increases, contribution requests also increase (coefficient on information = 0.0804, p < 0.001).
However, the evidence does not support that increased visibility of signals from a sender will
increase contribution requests (AIC on full model including visibility = 124585.4; AIC on
information-only model = 124583.5 and more parsimonious with fewer variables).
Hypothesis 1(b) states that as the visibility and/or information of conspicuous connection
signals observed by a receiver about a sender increases, the number of connections the sender
will receive in the future increases. Like H1(a), the first step is to perform a Poisson regression,
with visibility (the starting-level of conspicuous connection signals in 2013) and information
(sum of information in conspicuous connection signals through 2013) as the independent
variables and the total number of followers added through the year of 2013 as the dependent
variable (mean = 3.775; SD = 27.6; min = 1; max = 8074), and then evaluate the dispersion to
determine if a negative binomial regression is more appropriate. After using the glm function in
R and specifying “poisson” as the family to indicate the error distribution and natural log link
function, the dispersiontest function from the AER package was used again to test for
overdispersion (Cameron & Trivedi, 1990). The null hypothesis that the variance is equal to the mean
can be rejected (dispersion parameter c = 35.99; p < 0.001), showing evidence for
overdispersion. As such, a negative binomial (NB) regression in the generalized linear model family,
which estimates a dispersion parameter in addition to the mean, is performed to test the
hypothesis. Table 5.3 provides a summary of all four evaluated models.
Table 5.3 Comparing Negative Binomial Models for H1(b), DV = future connections, n =
430,306
              Model 1              Model 2              Full Model           Null Model
Intercept     0.9063 (0.0016)***   0.8118 (0.0015)***   0.7824 (0.0015)***   1.3285 (0.0018)***
Visibility    0.0348 (0.0000)***                        0.0107 (0.0000)***
Information                        0.8532 (0.0009)***   0.7061 (0.0014)***
AIC           1902731              1815044              1804731              2104077
Note. * p < 0.05. ** p < 0.01. *** p < 0.001.
The best NB model is the full model with both independent variables, as indicated by the
lowest AIC (AIC for full model = 1,804,731). This shows that the information gained through
including both independent variables, visibility and information, helps with the performance of
the model in predicting future connections. The coefficient on visibility is 0.011 (p < 0.001)
and the coefficient on information is 0.706 (p < 0.001). Like the previous hypothesis, the
coefficients can be exponentiated to get the expected multiplicative change in counts with a unit
change in the independent variable. In this model, holding all else constant, a one unit increase in
visibility multiplies future followers by e^(0.011) = 1.011, or a 1.1 percent increase. A one unit
increase in information in this model multiplies future followers by e^(0.706) = 2.026, or a 102.6
percent increase in future followers holding all else constant. The effects of the independent
variables on future followers vary greatly: visibility’s impact is fairly small (around one percent),
while information has a large effect (greater than 100 percent). The possible reasons for this are
discussed in the next chapter. The evidence provides support for H1(b) that both visibility
(coefficient = 0.011, p < 0.001) and information (coefficient = 0.706, p < 0.001) influence the
number of future followers.
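To show how the log link turns the full-model coefficients into expected counts, the sketch below evaluates the fitted mean for given values of the two predictors. The coefficients are those reported for the full model in Table 5.3; the function name and the example inputs are hypothetical, and the sketch covers only the mean, not the estimated dispersion.

```python
from math import exp

# Full-model coefficients as reported in Table 5.3.
INTERCEPT = 0.7824
B_VISIBILITY = 0.0107
B_INFORMATION = 0.7061


def expected_new_followers(visibility, information):
    """Expected number of followers added under the log link:
    exp(intercept + b_visibility * visibility + b_information * information).
    """
    return exp(INTERCEPT + B_VISIBILITY * visibility + B_INFORMATION * information)


baseline = expected_new_followers(0, 0)
one_more_bit = expected_new_followers(0, 1)
ratio_information = one_more_bit / baseline
ratio_visibility = expected_new_followers(1, 0) / baseline
```

The ratio between two predictions that differ by one unit of a predictor equals the exponentiated coefficient, which is why the effects are read as percentage changes rather than as fixed numbers of followers.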
Hypothesis 1(c) states that as the visibility and/or information of conspicuous connection
signals observed by a receiver about a sender increases, the sender is more likely to be an expert.
In this case, the dependent variable is binary (expert or not). As such, a logistic regression can be
performed to estimate the likelihood an individual is an expert as a function of visibility and
information. Only those 76 language groups that have informative signaling (as calculated in
section 5.1) are included in this analysis. Although information in signals is directly calculated
including the probability someone is an expert in all the languages, information of signals is
summed over all languages and so can be evaluated for this hypothesis.
In addition, this hypothesis will be evaluated on a group-by-group (language-by-
language) basis because individuals can be experts in multiple languages. As such, the binary
outcome of expert or not expert can be evaluated for each language. This analysis looked at users
who owned a repository in the pull requests database for 2013 (n = 140,415). Table 5.4 provides
the results of the logistic regressions predicting whether an individual is an expert based on the
visibility and information in their conspicuous connection signaling, including the odds ratios
for the intercept and the coefficients on visibility and information as independent
variables, the model AIC, and the null AIC. The p-values are also indicated for the intercept and
independent variables’ coefficients.
Table 5.4 Hypothesis 1(c): Logistic Regression Odds Ratios (OR) for 76 Language Groups with
Informative Signaling, n = 140,415
DV - Expert in: Intercept OR    Visibility OR   Information OR  Model AIC   Null AIC
null            0.2760 ***      0.9993 ***      1.0424 ***      147198.23   147286.30 ^
JavaScript      0.2534 ***      1.0002          1.1101 ***      143290.58   144054.69 ^
Ruby            0.1391 ***      1.0003          1.0775 ***      105831.69   106296.31 ^
Python          0.1386 ***      0.9999          1.0272 ***      104539.93   104594.17 ^
Java            0.1344 ***      1.0007 *        0.8994 ***      100458.72   100590.23 ^
PHP             0.1215 ***      1.0008 ***      0.9516 ***      95692.81    95728.55 ^
C               0.0614 ***      0.9998          1.0406 ***      62663.60    62752.72 ^
C++             0.0527 ***      0.9999          1.0079          55913.87    55911.40
CSS             0.0475 ***      0.9977 ***      1.0901 ***      52460.22    52606.44 ^
Shell           0.0355 ***      0.9995 **       1.0708 ***      42634.82    42823.18 ^
C#              0.0326 ***      1.0006          0.9108 ***      38593.89    38621.04 ^
Objective-C     0.0299 ***      0.9992 ***      1.0635 ***      37419.05    37532.13 ^
Perl            0.0167 ***      1.0004          1.0144          23710.75    23739.55 ^
CoffeeScript    0.0118 ***      0.9997          1.0469 ***      18126.24    18186.05 ^
Go              0.0094 ***      0.9983 ***      1.0944 ***      15156.33    15266.19 ^
VimL            0.0085 ***      0.9994 *        1.0652 ***      13972.79    14067.82 ^
Scala           0.0083 ***      0.9974 ***      1.0856 ***      13588.17    13630.70 ^
Lua             0.0058 ***      0.9991          1.0413 **       10086.67    10092.35 ^
Clojure         0.0052 ***      0.9985 ***      1.0827 ***      9302.85     9367.32 ^
Puppet          0.0053 ***      0.9992          1.0386 **       9321.94     9325.50 ^
Haskell         0.0052 ***      0.9971 ***      1.1071 ***      9223.23     9294.11 ^
Emacs Lisp      0.0047 ***      0.9997          1.0425 ***      8480.25     8511.88 ^
Groovy          0.0043 ***      0.9999          1.0106          7760.54     7756.85
R               0.0038 ***      0.9938 *        1.0811 ***      6944.60     6951.94 ^
Erlang          0.0035 ***      0.9998          1.0374 ***      6582.87     6604.14 ^
ActionScript    0.0027 ***      1.0016 *        0.9057          5110.58     5112.02 ^
Arduino         0.0019 ***      1.0013 *        0.9423          3928.57     3931.11 ^
Matlab          0.0021 ***      0.9778 *        0.8110          3814.99     3830.89 ^
TypeScript      0.0019 ***      1.0015 *        0.9162          3817.89     3818.33 ^
Common Lisp     0.0013 ***      0.9985          1.0671 ***      2771.65     2785.76 ^
OCaml           0.0013 ***      0.9964          1.0895 ***      2729.90     2745.83 ^
PowerShell      0.0012 ***      0.9997          1.0179          2669.46     2665.66
XSLT            0.0012 ***      1.0004          1.0005          2561.11     2558.12
D               0.0011 ***      0.9994          1.0207          2507.89     2504.06
Logos           0.0012 ***      0.9999          0.8610          2424.19     2422.60
Visual Basic    0.0011 ***      0.9648 *        0.9627          2195.31     2203.03 ^
Dart            0.0009 ***      0.9900 *        1.1306 ***      2105.29     2119.77 ^
Processing      0.0008 ***      0.9987          1.0399          1940.37     1937.47
Scheme          0.0008 ***      0.9993          1.0507 ***      1902.15     1909.18 ^
ColdFusion      0.0008 ***      1.0001          1.0115          1898.76     1895.01
Assembly        0.0008 ***      0.9989          1.0462 *        1881.36     1880.82
TeX             0.0009 ***      0.9684          1.0533          1820.28     1823.89 ^
FORTRAN         0.0008 ***      0.9759          1.1233          1793.38     1795.33 ^
Racket          0.0007 ***      0.9992          1.0469 **       1606.83     1607.80 ^
ASP             0.0010 ***      0.8946 *        0.2937          1564.46     1593.23 ^
Haxe            0.0007 ***      0.9638 ***      1.2796 ***      1523.57     1564.04 ^
Rust            0.0006 ***      0.9997          1.0419 **       1500.87     1505.40 ^
F#              0.0006 ***      0.9832 ***      1.1813 ***      1459.04     1475.95 ^
Elixir          0.0006 ***      0.9961          1.0815 *        1410.56     1416.77 ^
Julia           0.0006 ***      0.9774 *        1.1979 ***      1389.54     1401.91 ^
Delphi          0.0005 ***      0.9936          1.0630          1300.35     1297.25
Prolog          0.0005 ***      0.9987          1.0386          1194.72     1191.28
HaXe            0.0003 ***      0.9989          1.0539 *        815.59      816.10 ^
Vala            0.0003 ***      0.9841          1.0446          818.66      816.10
Verilog         0.0003 ***      0.9389          1.2178 *        764.88      767.62 ^
Tcl             0.0003 ***      0.9713          1.1161          687.56      685.86
AutoHotkey      0.0003 ***      1.0024          0.6782          671.37      669.35
Apex            0.0003 ***      1.0025          0.7901          655.47      652.78
Pascal          0.0002 ***      0.9195          1.1465          517.49      518.05 ^
LiveScript      0.0002 ***      0.9935          1.1017          465.78      466.36 ^
XQuery          0.0002 ***      1.0018          0.8555          452.59      448.97
Smalltalk       0.0001 ***      1.0001          1.0044          417.91      413.93
Lasso           0.0002 ***      0.9763          0.9393          399.05      396.26
Nemerle         0.0001 ***      0.9943          1.0931          397.29      396.26
Ada             0.0001 ***      0.9615          1.2437 **       342.19      342.65 ^
AppleScript     0.0001 ***      0.9999          1.0023          346.65      342.65
Eiffel          0.0001 ***      0.9363          1.2963 ***      322.59      324.55 ^
Objective-J     0.0001 ***      1.0027          0.6731          327.46      324.55
Gosu            0.0001 ***      0.9686          1.1561          216.21      212.99
Nimrod          0.0001 ***      0.9927          1.0997          177.76      174.37
Scilab          0.0001 ***      0.9992          1.0434          178.07      174.37
Awk             0.0000 ***      1.0011          0.9297          138.69      134.73
Ceylon          0.0000 ***      0.9932          1.1017          137.76      134.73
ooc             0.0000 ***      1.0002          1.0192          138.44      134.73
Squirrel        0.0000 ***      0.7876          1.3185          116.12      114.43
Factor          0.0000 ***      0.9994          1.0460          97.33       93.73
Note. * p < 0.05. ** p < 0.01. *** p < 0.001. ^ Model AIC < Null AIC
The table shows the odds of being in each group, as well as the practical difference in
odds for a unit increase in visibility and information, which is quite small for most language
groups regardless of whether the results are significant. For example, for JavaScript, the odds
of being an expert when both visibility and information are zero equal the exponentiated
intercept, $e^{-1.373} = 0.253$. With 95 percent confidence, the odds of being an expert in
JavaScript when both predictors are zero are between 0.250 and 0.257. For every one-unit
increase in visibility, the odds increase by a factor of 1.0002, which means that there is a
0.02 percent increase in the odds of being an expert with a one-unit increase in visibility,
holding information constant. With a one-unit increase in information, the odds of being an
expert increase by a factor of 1.110, or an 11 percent increase in the odds of being an expert,
holding visibility constant.
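The arithmetic behind this interpretation can be sketched in a few lines of Python. This is an illustrative sketch only, not the code used for the analysis; the function names are hypothetical, and the inputs are the JavaScript values quoted above. The last helper converts odds to a probability, which is a different quantity from the odds themselves.

```python
import math

def odds_from_logit(coefficient):
    """Exponentiate a logistic-regression coefficient.

    For an intercept this gives baseline odds; for a slope it gives an odds ratio.
    """
    return math.exp(coefficient)

def percent_change_in_odds(odds_ratio):
    """Percent change in the odds implied by an odds ratio."""
    return (odds_ratio - 1.0) * 100.0

def odds_to_probability(odds):
    """Convert odds to a probability (hypothetical helper, not used in the text)."""
    return odds / (1.0 + odds)

# Values quoted in the text for JavaScript.
baseline_odds = odds_from_logit(-1.373)          # about 0.253
visibility_pct = percent_change_in_odds(1.0002)  # about 0.02 percent
information_pct = percent_change_in_odds(1.110)  # about 11 percent
baseline_probability = odds_to_probability(baseline_odds)
```

Odds of about 0.253 correspond to a probability of roughly 0.20, which is why the percentage changes above are best read as changes in odds rather than changes in probability.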
Of the 76 languages, 37 have a significant estimated coefficient on information and 22
have a significant estimated coefficient on visibility. The next chapter discusses possible reasons
why so few of the languages have significant estimates on information and visibility for this
hypothesis. The mean odds ratio for visibility across all languages is 0.9880, meaning that
the odds of being an expert are expected to be on average 1.2 percent lower (because the ratio
is less than one) for every one-unit increase in visibility (all else equal), while for the 22
languages with significant coefficients the mean odds ratio is 0.9881. The mean odds ratio for
information across all languages is 1.0301, meaning that the odds of being an expert are
expected to be on average 3.01 percent higher for every one-unit increase in information (all
else equal). For the 37 languages with significant coefficients, the mean odds ratio is several
percentage points higher at 1.0819, meaning that the odds of being an expert are expected to be
on average 8.19 percent higher for every one-unit increase in information (all else equal).
For many of the languages an increase in visibility slightly lowers the odds of being an
expert, but for most languages an increase in information increases the odds of being an expert.
Only 16 of the 76 languages, or about 21 percent, have significant coefficients on both visibility
and information. The table also shows that, of the 76 language groups, 61.8 percent (47 groups)
have a full model that is better than the null model based on the AIC. When visibility is
significant at least at the p < 0.05 level, all 22 of the models are better than the null
model. When information is significant at least at the p < 0.05 level, 36 out of 37 models are
better than the null. Assembly is the only language that has a significant coefficient on
information (at the p < 0.05 level) while the null model is better than the full model.
Hypothesis 1(d) states that as the visibility and/or information of conspicuous connection
signals observed by a receiver about a sender increases, the sender’s content will spread more
widely. As with H1(a) and H1(b), either a Poisson or a negative binomial model is appropriate
depending on the dispersion of the dependent variable, content spread. Of the 430,306 users who
were followed by another user, 140,415 users have at least one active unique repository. Content
spread is operationalized as the sum of the number of watchers on all of each user’s owned
repositories in 2013 (mean = 53.45; SD = 524.15; min = 0; max = 56358), that is, the number of
people who are paying attention to an individual’s projects. A Poisson regression was performed
with content spread as the dependent variable and visibility and information as the independent
variables, and a dispersion test was conducted. The null hypothesis that the variance is equal to
the mean can be rejected (dispersion parameter c = 4299.856; p < 0.001), showing evidence of
overdispersion. As such, three negative binomial regressions were performed along with the null
model to assess model fit. Table 5.5 summarizes the four evaluated models.
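A rough way to see why a Poisson model is implausible here is the raw variance-to-mean ratio of the dependent variable, which is near 1 for Poisson-distributed counts. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and this unconditional ratio is not the model-based dispersion parameter c reported above, which comes from a test on the fitted Poisson regression.

```python
def dispersion_index(mean, sd):
    """Variance-to-mean ratio of a count variable (about 1 under a Poisson assumption)."""
    if mean <= 0:
        raise ValueError("mean must be positive")
    return sd ** 2 / mean

def suggests_overdispersion(mean, sd, threshold=1.0):
    """True when the variance exceeds the mean by more than the chosen threshold ratio."""
    return dispersion_index(mean, sd) > threshold

# Summary statistics for content spread reported in the text.
content_spread_index = dispersion_index(53.45, 524.15)  # roughly 5,140
```

With a ratio in the thousands, the variance is far larger than the mean, which is in line with the decision to move from the Poisson to the negative binomial model.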
Table 5.5 Comparing Negative Binomial Models for H1(d), DV = content spread, n = 140,415
             Model 1              Model 2              Full Model           Null Model
Intercept    3.4870 (0.0079)***   3.4844 (0.0080)***   3.4255 (0.0079)***   3.9788 (0.0082)***
Visibility   0.0209 (0.0001)***   --                   0.0132 (0.0002)***   --
Information  --                   0.4656 (0.0038)***   0.2391 (0.0057)***   --
AIC          816653               816854               815568               826909
Note. * p < 0.05. ** p < 0.01. *** p < 0.001.
The full model with both visibility and information had the best (lowest) AIC (AIC for full
model = 815,568). Results suggest that both visibility and information impact content spread,
with a coefficient on visibility = 0.013, p < 0.001 and a coefficient on information = 0.239,
p < 0.001. This means that for every one-unit increase in visibility, expected content spread is
multiplied by $e^{0.013} = 1.013$, holding information constant. Holding visibility constant,
expected content spread is multiplied by $e^{0.239} = 1.270$ for every one-unit increase in
information. This shows that both independent variables in this hypothesis, visibility and
information, help explain content spread, with visibility having a small effect (1.3 percent) while
information has a larger effect (27 percent). The null hypothesis that information and visibility
do not impact content spread can be rejected; these findings provide evidence in support of
H1(d).
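The two steps used here, choosing the model with the lowest AIC and converting the full model's coefficients to incidence rate ratios, can be sketched as follows. The function names are hypothetical, and the numbers are copied from Table 5.5; the sketch does not refit any model.

```python
import math

def best_by_aic(aic_by_model):
    """Return the name of the model with the lowest AIC."""
    return min(aic_by_model, key=aic_by_model.get)

def incidence_rate_ratio(coefficient):
    """Exponentiate a count-model coefficient to get a multiplicative effect."""
    return math.exp(coefficient)

# AIC values and full-model coefficients as reported in Table 5.5.
table_5_5_aic = {
    "Model 1": 816653,
    "Model 2": 816854,
    "Full Model": 815568,
    "Null Model": 826909,
}
selected = best_by_aic(table_5_5_aic)
irr_visibility = incidence_rate_ratio(0.0132)   # about 1.013
irr_information = incidence_rate_ratio(0.2391)  # about 1.270
```

Because the ratios are multiplicative, a value of 1.270 means a 27 percent higher expected count per unit of information, not 1.27 additional watchers.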
Hypothesis 2 examines the signaling behavior between two types of people: experts and
non-experts. It states that experts will signal more (i.e., have a greater mean number of signals)
than non-experts. This hypothesis is examined by focusing on the 430,306 users that received
follows from other users in 2013 to see how many users from whom they received follows were
embedded within an expertise group. Because each language group may have different behavior
and the users within the groups are different but overlapping, several separate t-tests must be
performed for the different programming languages. The Welch’s independent groups t-test is
preferred when comparing means between two groups with unequal variances and unequal
sample sizes, and it is often recommended as the default t-test for its robustness (Delacre,
Lakens, & Leys, 2017; Ruxton, 2006). Separate Welch’s t-tests were performed on the 103
programming language groups that had enough experts, defined here as those users who own a
repository labeled with that language, to do the comparison analysis with non-experts. The
dependent variable in this case is the number of incoming conspicuous connection signals (i.e.,
follows) from users who are experts (i.e., own a repository with that language). Table 5.6
provides the t-test results for all analyzed programming languages.
Table 5.6 Welch’s t-tests comparing mean number of CC signals of experts vs. non-experts (H2)
DV: Number of incoming CC signals within the expert language group
                           Experts             Non-Experts
Language                  N   Mean     SD         N  Mean    SD       t  p
JavaScript           19,232   2.29  10.55   411,074  0.20  0.99   27.50  ***
null                 14,954   1.24   6.20   415,352  0.18  0.92   20.80  ***
Ruby                 11,161   2.15   7.06   419,145  0.12  0.74   30.40  ***
Python               10,697   1.14   3.91   419,609  0.10  0.66   27.30  ***
PHP                   8,178   1.20   4.10   422,128  0.08  0.58   24.79  ***
Java                  8,127   0.68   2.77   422,179  0.06  0.41   20.33  ***
C                     4,937   0.66   2.20   425,369  0.05  0.30   19.69  ***
C++                   4,008   0.44   1.15   426,298  0.03  0.27   22.44  ***
CSS                   3,640   0.91   3.71   426,666  0.07  0.53   13.77  ***
Shell                 3,287   0.62   2.16   427,019  0.05  0.35   15.11  ***
Objective-C           2,766   1.87   6.93   427,540  0.03  0.27   13.94  ***
C#                    2,200   0.50   1.57   428,106  0.02  0.16   14.31  ***
Perl                  1,544   0.86   2.05   428,762  0.02  0.16   16.17  ***
CoffeeScript          1,227   0.59   1.89   429,079  0.03  0.29   10.41  ***
VimL                  1,002   0.66   2.85   429,304  0.02  0.19    7.11  ***
Go                      949   1.22   3.63   429,357  0.01  0.16   10.23  ***
Scala                   732   1.19   2.07   429,574  0.01  0.12   15.40  ***
Haskell                 557   1.29   2.52   429,749  0.01  0.09   12.02  ***
Emacs Lisp              536   0.81   2.46   429,770  0.01  0.10    7.56  ***
Clojure                 535   1.18   2.78   429,771  0.01  0.09    9.74  ***
Lua                     500   0.60   1.31   429,806  0.00  0.08   10.21  ***
Puppet                  434   0.42   1.16   429,872  0.00  0.07    7.55  ***
Erlang                  337   0.82   1.40   429,969  0.00  0.06   10.69  ***
Groovy                  303   0.40   1.10   430,003  0.00  0.05    6.25  ***
R                       290   0.84   2.33   430,016  0.00  0.06    6.11  ***
ActionScript            205   0.26   0.58   430,101  0.00  0.05    6.30  ***
Arduino                 155   0.22   0.83   430,151  0.00  0.05    3.25  **
Common Lisp             145   1.26   1.56   430,161  0.00  0.05    9.71  ***
Matlab                  139   0.12   0.35   430,167  0.00  0.03    4.09  ***
TypeScript              130   0.24   0.54   430,176  0.00  0.06    4.97  ***
OCaml                   117   1.04   1.38   430,189  0.00  0.04    8.17  ***
D                       101   0.29   0.67   430,205  0.00  0.03    4.31  ***
PowerShell               98   0.27   0.60   430,208  0.00  0.04    4.34  ***
Dart                     91   1.01   1.80   430,215  0.00  0.03    5.34  ***
Scheme                   78   0.37   0.84   430,228  0.00  0.04    3.90  ***
Processing               78   0.40   0.92   430,228  0.00  0.05    3.81  ***
Assembly                 77   0.06   0.25   430,229  0.00  0.03    2.27  *
ColdFusion               72   0.44   0.85   430,234  0.00  0.02    4.41  ***
XSLT                     70   0.13   0.38   430,236  0.00  0.03    2.83  **
Rust                     70   0.30   0.62   430,236  0.00  0.02    4.03  ***
Logos                    68   0.21   0.48   430,238  0.00  0.03    3.56  ***
F#                       68   1.43   1.59   430,238  0.00  0.03    7.41  ***
Haxe                     63   2.11   2.24   430,243  0.00  0.04    7.49  ***
Racket                   63   0.48   0.95   430,243  0.00  0.03    3.98  ***
Elixir                   62   0.95   1.91   430,244  0.00  0.04    3.92  ***
Visual Basic             61   0.08   0.33   430,245  0.00  0.02    1.92
FORTRAN                  60   0.17   0.46   430,246  0.00  0.02    2.82  **
TeX                      59   0.08   0.28   430,247  0.00  0.03    2.30  *
XML                      55   0.00   0.00   430,251  0.00  0.02  -13.64  ***
Julia                    53   0.47   0.77   430,253  0.00  0.02    4.43  ***
Delphi                   47   0.45   1.25   430,259  0.00  0.03    2.45  *
Prolog                   42   0.14   0.35   430,264  0.00  0.02    2.61  *
ASP                      38   0.13   0.47   430,268  0.00  0.02    1.70
HaXe                     36   0.81   1.31   430,270  0.00  0.02    3.70  ***
Vala                     30   0.07   0.25   430,276  0.00  0.01    1.44
AutoHotkey               27   0.15   0.36   430,279  0.00  0.01    2.12  *
VHDL                     25   0.00   0.00   430,281  0.00  0.01   -8.66  ***
Tcl                      24   0.17   0.38   430,282  0.00  0.01    2.14  *
DOT                      24   0.00   0.00   430,282  0.00  0.01   -8.00  ***
Verilog                  22   0.27   0.63   430,284  0.00  0.01    2.03
Apex                     21   0.19   0.40   430,285  0.00  0.01    2.17  *
Pascal                   18   0.67   1.91   430,288  0.00  0.02    1.48
XQuery                   16   0.13   0.34   430,290  0.00  0.01    1.46
OpenEdge ABL             16   0.00   0.00   430,290  0.00  0.01   -7.48  ***
LiveScript               15   0.60   0.99   430,291  0.00  0.02    2.36  *
Standard ML              13   0.00   0.00   430,293  0.00  0.02  -10.82  ***
Nemerle                  13   0.92   0.95   430,293  0.00  0.02    3.49  **
AppleScript              13   0.15   0.38   430,293  0.00  0.01    1.48
Smalltalk                12   0.58   0.90   430,294  0.00  0.02    2.24  *
Lasso                    11   0.18   0.40   430,295  0.00  0.01    1.49
Ada                       9   0.56   0.53   430,297  0.00  0.01    3.16  *
Objective-J               9   0.11   0.33   430,297  0.00  0.00    1.00
SuperCollider             9   0.00   0.00   430,297  0.00  0.01   -4.36  ***
Pure Data                 9   0.00   0.00   430,297  0.00  0.01   -6.71  ***
Eiffel                    8   1.00   0.93   430,298  0.00  0.01    3.05  *
Coq                       8   0.00   0.00   430,298  0.00  0.01   -4.12  ***
M                         8   0.00   0.00   430,298  0.00  0.01   -3.74  ***
Xtend                     7   0.00   0.00   430,299  0.00  0.01   -5.00  ***
Nimrod                    6   0.50   0.55   430,300  0.00  0.01    2.24
Kotlin                    6   0.00   0.00   430,300  0.00  0.01   -3.74  ***
ooc                       6   0.17   0.41   430,300  0.00  0.01    1.00
Scilab                    6   0.17   0.41   430,300  0.00  0.01    1.00
Gosu                      5   0.20   0.45   430,301  0.00  0.01    1.00
Io                        5   0.00   0.00   430,301  0.00  0.01   -4.00  ***
Squirrel                  5   0.20   0.45   430,301  0.00  0.00    1.00
DM                        4   0.00   0.00   430,302  0.00  0.00   -3.00  **
Ceylon                    4   1.25   0.96   430,302  0.00  0.01    2.61
Rebol                     4   0.00   0.00   430,302  0.00  0.00   -1.73
Augeas                    4   0.00   0.00   430,302  0.00  0.00   -3.16  **
Awk                       4   0.25   0.50   430,302  0.00  0.01    1.00
Factor                    4   0.75   0.50   430,302  0.00  0.02    3.00
Oxygene                   3   0.00   0.00   430,303  0.00  0.02  -10.96  ***
DCPU-16 ASM               3   0.00   0.00   430,303  0.00  0.01   -4.00  ***
Bro                       3   0.00   0.00   430,303  0.00  0.00   -2.65  **
Ragel in Ruby Host        3   0.00   0.00   430,303  0.00  0.00   -1.41
nesC                      2   0.00   0.00   430,304  0.00  0.00   -1.73
Slash                     2   0.00   0.00   430,304  0.00  0.00   -3.16  **
Arc                       2   0.00   0.00   430,304  0.00  0.00      NA
Turing                    2   0.00   0.00   430,304  0.00  0.00   -1.41
Nu                        2   0.00   0.00   430,304  0.00  0.00   -1.73
Elm                       2   0.00   0.00   430,304  0.00  0.00   -1.41
MoonScript                2   0.00   0.00   430,304  0.00  0.01   -3.74  ***
Opa                       2   0.00   0.00   430,304  0.00  0.00   -2.00  *
Note. * p < 0.05. ** p < 0.01. *** p < 0.001.
Within most programming language groups, the null hypothesis that there is no
difference between experts and non-experts in their amount of conspicuous connection
signaling is rejected. Of the 103 language groups included in this analysis, 59 languages (57.3
percent) are significant at the p < 0.05 level or better, with the expert group having a larger
mean than the non-expert group. Among only the 76 language groups that have enough connections
for a mean and standard deviation to be calculated for both groups, that share increases to 77.6
percent. There is evidence in many language groups, particularly those with many users, that
there are differences in signaling.
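Welch's t statistic can be computed directly from the group sizes, means, and standard deviations reported in Table 5.6. The sketch below is illustrative and is not the code used for the analysis; the function name is hypothetical, and because the table reports rounded means and standard deviations, the result only approximates the published t values.

```python
import math

def welch_t(n1, mean1, sd1, n2, mean2, sd2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom from summary statistics."""
    v1 = sd1 ** 2 / n1  # squared standard error of the first group mean
    v2 = sd2 ** 2 / n2  # squared standard error of the second group mean
    if v1 + v2 == 0:
        # Both groups have zero variance, so the statistic is undefined.
        raise ValueError("t is undefined when both standard deviations are zero")
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# JavaScript row of Table 5.6: experts first, then non-experts.
t_javascript, df_javascript = welch_t(19232, 2.29, 10.55, 411074, 0.20, 0.99)
```

The zero-variance guard mirrors the situation in the smallest language groups, where no meaningful comparison can be made.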
To more systematically uncover whether there is evidence that experts send more conspicuous
connection signals than non-experts across all languages, a meta-analysis can be performed to
synthesize evidence across the t-tests (Del Re, 2015) and to determine whether the size of the
group influences the effect size. A meta-analysis often involves comparing two “interventions,”
the experimental and a control (Schwarzer, Carpenter, & Rücker, 2015). In this study there is not
an experimental group and a control group; instead there is an expert group and a non-expert
group. The first step of a meta-analysis usually involves finding many different studies that
examine a particular outcome of interest, utilizing inclusion and exclusion criteria to determine
which studies should be included in the analysis (Del Re, 2015). While that step is not necessary
here, many of the expert groups are very small, and either the mean number of expert connections
or the standard deviation of expert connections cannot be calculated or is equal to zero; as
such, these groups are eliminated from the meta-analysis, resulting in a total of 76
languages included in the analysis.
The R package “meta” was used to conduct the meta-analysis. The mean difference in
expert signaling is used as the effect measure of interest (rather than the standardized mean
difference, since all “studies,” in this case the different language groups, are on the same
scale) (Schwarzer et al., 2015). The results from the different language groups are weighted
using the inverse-variance method, which uses the inverse of each language group’s variance as
its weight in the meta-analysis (C. H. Lee, Cook, Lee, & Han, 2016). A fixed effects model
assumes that a single true effect size exists across all studies (C. H. Lee et al.,
2016), whereas the random effects model incorporates some “unexplained heterogeneity” in
effects as studies are “assumed to be drawn from a normal distribution” of studies (Higgins,
Thompson, & Spiegelhalter, 2009, p. 137). Because it is assumed that there could be some
unexplained heterogeneity and different language groups’ true effect of expertise signaling may
vary within a distribution, the results from the random effects model are examined. As such,
the between-study variance is estimated as a parameter rather than assumed to be zero as in the
case of the fixed effects model. Table 5.7 provides the mean difference estimates for each
language group, the subgroup analysis based on size of the expert group, as well as the overall
estimate of the random effects model.
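The inverse-variance random effects calculation can be sketched as follows. This is a minimal illustration, not the "meta" package's implementation: it assumes the DerSimonian-Laird estimator for the between-study variance (the text does not state which estimator was used), it recovers standard errors from the published 95 percent confidence intervals, and all function names are hypothetical.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Recover a standard error from a symmetric 95 percent confidence interval."""
    return (upper - lower) / (2.0 * z)

def random_effects_pool(effects, std_errors, z=1.96):
    """Pool effects with inverse-variance weights and a DerSimonian-Laird tau^2 (assumed)."""
    k = len(effects)
    w = [1.0 / se ** 2 for se in std_errors]  # fixed effects weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0  # between-study variance
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]  # random effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    return pooled, (pooled - z * se_pooled, pooled + z * se_pooled), tau2

# Three rows of Table 5.7 (JavaScript, Matlab, Assembly): estimate, CI lower, CI upper.
rows = [(2.0916, 1.9425, 2.2407), (0.1214, 0.0631, 0.1796), (0.0642, 0.0088, 0.1196)]
effects = [row[0] for row in rows]
std_errors = [se_from_ci(row[1], row[2]) for row in rows]
pooled, interval, tau2 = random_effects_pool(effects, std_errors)
```

Adding the between-study variance to every study's own variance makes the random effects weights more equal than the fixed effects weights, which fits the nearly uniform weight column in Table 5.7.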
Table 5.7 Random Effects Model Meta-Analysis for H2, Effect: Mean Difference Between Expert and Non-Expert Signaling
Language   Mean Difference   95% Confidence Interval (lower, upper)   Weight %   Grouping
JavaScript 2.0916 1.9425 2.2407 1.4 Expert N > 71
null 1.0553 0.9558 1.1548 1.4
Ruby 2.0309 1.9000 2.1619 1.4
Python 1.0319 0.9578 1.1059 1.5
Java 0.6238 0.5637 0.6840 1.5
PHP 1.1233 1.0345 1.2121 1.4
C 0.6180 0.5564 0.6795 1.5
C++ 0.4069 0.3714 0.4424 1.5
CSS 0.8478 0.7272 0.9685 1.4
Shell 0.5694 0.4956 0.6432 1.5
Objective-C 1.8368 1.5785 2.0952 1.3
C# 0.4804 0.4146 0.5462 1.5
Perl 0.8418 0.7398 0.9439 1.4
CoffeeScript 0.5618 0.4560 0.6675 1.4
Go 1.2059 0.9748 1.4369 1.3
Scala 1.1792 1.0291 1.3293 1.4
VimL 0.6416 0.4649 0.8184 1.4
Puppet 0.4193 0.3105 0.5282 1.4
Clojure 1.1716 0.9360 1.4073 1.3
Haskell 1.2854 1.0759 1.4949 1.4
Lua 0.5991 0.4841 0.7142 1.4
Emacs Lisp 0.8027 0.5945 1.0109 1.4
Erlang 0.8158 0.6661 0.9654 1.4
Groovy 0.3967 0.2723 0.5211 1.4
R 0.8354 0.5673 1.1034 1.3
ActionScript 0.2565 0.1767 0.3362 1.5
OCaml 1.0417 0.7917 1.2917 1.3
Arduino 0.2174 0.0865 0.3483 1.4
TypeScript 0.2353 0.1425 0.3281 1.4
Matlab 0.1214 0.0631 0.1796 1.5
Common Lisp 1.2606 1.0061 1.5151 1.3
Dart 1.0102 0.6395 1.3809 1.2
PowerShell 0.2639 0.1448 0.3830 1.4
D 0.2864 0.1561 0.4168 1.4
Scheme 0.3704 0.1842 0.5566 1.4
ColdFusion 0.4441 0.2468 0.6413 1.4
Assembly 0.0642 0.0088 0.1196 1.5
Processing 0.3957 0.1923 0.5991 1.4
Group R.E. Estimate 0.7673 0.6360 0.8987 I² = 98.5%
Julia 0.4714 0.2628 0.6799 1.4 Expert N <= 71
XSLT 0.1278 0.0394 0.2163 1.4
Visual Basic 0.0816 -0.0015 0.1648 1.4
Logos 0.2052 0.0923 0.3181 1.4
Rust 0.2997 0.1539 0.4454 1.4
Haxe 2.1100 1.5577 2.6624 1.0
F# 1.4258 1.0487 1.8029 1.2
FORTRAN 0.1664 0.0507 0.2821 1.4
Elixir 0.9504 0.4747 1.4262 1.1
Racket 0.4755 0.2414 0.7096 1.3
TeX 0.0841 0.0124 0.1558 1.5
ASP 0.1313 -0.0197 0.2823 1.4
Delphi 0.4462 0.0895 0.8029 1.2
Prolog 0.1425 0.0354 0.2496 1.4
HaXe 0.8053 0.3788 1.2317 1.1
Vala 0.0666 -0.0242 0.1573 1.4
Apex 0.1904 0.0183 0.3625 1.4
Verilog 0.2726 0.0089 0.5363 1.3
AutoHotkey 0.1480 0.0114 0.2845 1.4
Tcl 0.1666 0.0143 0.3189 1.4
LiveScript 0.5998 0.1010 1.0985 1.0
Smalltalk 0.5831 0.0737 1.0925 1.0
Eiffel 0.9999 0.3584 1.6415 0.9
XQuery 0.1249 -0.0425 0.2923 1.4
Pascal 0.6663 -0.2159 1.5486 0.6
Nemerle 0.9228 0.4041 1.4414 1.0
Lasso 0.1817 -0.0573 0.4208 1.3
Ada 0.5555 0.2112 0.8998 1.2
Objective-J 0.1111 -0.1067 0.3289 1.4
AppleScript 0.1537 -0.0504 0.3579 1.4
Nimrod 0.4999 0.0616 0.9382 1.1
Ceylon 1.2500 0.3117 2.1882 0.6
Gosu 0.2000 -0.1920 0.5919 1.2
ooc 0.1666 -0.1601 0.4932 1.2
Scilab 0.1666 -0.1600 0.4933 1.2
Awk 0.2499 -0.2401 0.7399 1.0
Squirrel 0.2000 -0.1920 0.5920 1.2
Factor 0.7497 0.2597 1.2397 1.0
Group R.E. Estimate 0.3224 0.2489 0.3959 I² = 80.0%
Overall R.E. Estimate 0.5949 0.5013 0.6885 I² = 97.4%; z = 12.45, p < 0.001
The meta-analysis is conducted within two subgroups based on the number of experts
within the programming language to see whether the size of the expertise group, based on a median
split, makes a difference in outcomes, as this appeared to be an important difference based on
visual inspection. The inconsistency index, the I² statistic, provides the percentage of true
heterogeneity between effect sizes in different studies (or in this case, different language groups)
(Higgins & Thompson, 2002; Higgins, Thompson, Deeks, & Altman, 2003; Huedo-Medina,
Sánchez-Meca, Marín-Martínez, & Botella, 2006). The statistic is calculated using Cochran’s Q
(an earlier measure of heterogeneity that is calculated as the weighted sum of squared deviations
of each study’s estimate from the meta-analytic estimate, but is known as a poor statistic for
finding true heterogeneity (Higgins et al., 2003)) and the degrees of freedom (df); the equation
is $I^2 = 100\% \times (Q - df)/Q$, and the statistic lies between 0 percent and 100 percent.
An I² of 25 percent is considered to be low, showing that there is consistency among results,
whereas an I² of 75 percent or more is considered high, showing heterogeneity in results. There
is considerable heterogeneity among the results for the different language groups in this study
(I² = 97.4%). The mean difference between experts and non-experts in each language group
provides visual confirmation of this heterogeneity, with languages like JavaScript and
Objective-C having a much larger mean difference than others, while languages like TeX, FORTRAN,
and Visual Basic have much lower mean differences.
Even when taking the size of the language expertise group into account by performing the
analysis on two groups determined by a median split (median = 71 users in the language group),
there is still considerable heterogeneity (I² = 98.5% for large N; I² = 80.0% for small N). This
inconsistency in results means that there may be some other moderator not accounted for in this
study that is playing a role in why some language groups have a greater mean difference than
other language groups. The discussion in Chapter 6 details some proposed explanations that
should be examined in future studies.
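The I² formula and the verbal thresholds used in this paragraph can be sketched as follows. The function names are hypothetical, the Q values in the example are made up for illustration, and the "moderate" label for values between the two thresholds is an assumption added here; the text itself only names the low and high ends.

```python
def i_squared(q, df):
    """I^2 = 100% * (Q - df) / Q, truncated at zero when Q is smaller than df."""
    if q <= 0:
        return 0.0
    return max(0.0, 100.0 * (q - df) / q)

def describe_heterogeneity(i2):
    """Label an I^2 value using the 25 and 75 percent thresholds from the text."""
    if i2 <= 25.0:
        return "low"
    if i2 >= 75.0:
        return "high"
    return "moderate"  # label for the middle range is an assumption

# Hypothetical Q values, for illustration only.
example_low = i_squared(100.0, 75)    # 25.0
example_none = i_squared(10.0, 20)    # 0.0, since Q is smaller than df
label_overall = describe_heterogeneity(97.4)
```

The truncation at zero is what keeps the statistic between 0 and 100 percent even when Q falls below its degrees of freedom.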
The table also shows that the overall estimated mean difference between expert and non-
expert signaling is between 0.5013 and 0.6885 with 95 percent confidence. The meta-analysis
shows that, despite the heterogeneity, the difference between expert and non-expert signaling
tends to be greater in larger groups (mean difference in number of signals between 0.6360 and
0.8987 with 95 percent confidence) than in smaller groups (mean difference in number of signals
between 0.2489 and 0.3959 with 95 percent confidence). This makes sense: members of small groups
have fewer opportunities to signal within their own group because there are not as many people
with whom to connect. Despite the random effects model showing a significant difference in
signaling between experts and non-experts in both large and small groups and in all language
groups overall, the practical difference in number of signals is quite small, less than one
signal difference between experts and non-experts, and does not provide sufficient evidence to
support H2. More research must be done to determine if experts and non-experts across different
size groups signal differently.
Hypothesis 3 states that as the proportion of non-experts increases in comparison to
experts in a knowledge area, experts will send more conspicuous connection signals. This is
based on Spence’s (2002) discussion of payoff depending on number of experts compared to
non-experts (experts will get high payoff in groups where the proportion of non-experts is low,
so there is no need to signal to distinguish themselves). As such, language groups are examined
rather than individuals, and amount of expert signaling within a group can be regressed on the
proportion of non-experts to experts within a group. The 119 languages on GitHub that have
expert signaling are examined with mean expert signaling as the dependent variable (mean = 0.4;
SD = 0.51; min = 0; max = 2.29) and the proportion of non-experts to experts as the independent
variable of interest (mean = 1; SD = 0.01; min = 0.96; max = 1). However, because the data do
not fit the normal linear model assumptions, particularly in terms of the residuals plot showing
bias and heteroscedasticity and the quantile-quantile (Q-Q) plot showing non-normality (see
Figure 5.4), the analysis technique must be reevaluated.
Figure 5.4 Residuals and Q-Q plot for H3 Normal Linear Model
Transforming by excluding cases where the mean is zero (no signaling occurs in these
language groups; n = 76) and using a Box-Cox transformation (𝜆 = 0.22, estimated via
maximum likelihood), the Q-Q plot looks closer to normal but the residuals still exhibit
heteroscedasticity (see Figure 5.5).
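For reference, the standard one-parameter Box-Cox transformation can be sketched as below. This is an assumption about the exact form applied, since the text reports only the estimated λ; the function names are hypothetical. The transformation is defined only for strictly positive values, which fits the exclusion of language groups whose mean signaling is zero.

```python
import math

def box_cox(y, lam):
    """Standard Box-Cox transform: (y^lam - 1) / lam, or log(y) when lam is 0."""
    if y <= 0:
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam

def inverse_box_cox(z, lam):
    """Map a transformed value back to the original scale."""
    if lam == 0:
        return math.exp(z)
    return (lam * z + 1.0) ** (1.0 / lam)

# Largest mean expert signaling value reported in the text, with the estimated lambda.
transformed_max = box_cox(2.29, 0.22)
recovered_max = inverse_box_cox(transformed_max, 0.22)
```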
[Figure 5.4: two panels, “Residuals vs Fitted” (x-axis: Fitted values; y-axis: Residuals) and “Normal Q-Q” (x-axis: Theoretical Quantiles; y-axis: Standardized residuals).]
Figure 5.5 Residuals and Q-Q plot for H3 Box-Cox Transformed Linear Model
As such, a robust regression is fitted using iteratively reweighted least squares (IWLS) with
Huber weights, which weight cases with small residuals more heavily than those with large
residuals (lmrob function from the robustbase package in R).
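The idea of down-weighting cases with large residuals can be illustrated with a small Huber IWLS loop for a single predictor. This is a simplified sketch and not a reproduction of lmrob, whose estimator and defaults may differ; the tuning constant k = 1.345, the MAD-based scale, the function names, and the toy data are all assumptions made for illustration.

```python
import statistics

def huber_weight(scaled_residual, k=1.345):
    """Huber weight: 1 inside the threshold k, k/|u| outside it."""
    size = abs(scaled_residual)
    return 1.0 if size <= k else k / size

def weighted_line(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def huber_irls(x, y, k=1.345, iterations=50):
    """Refit the line repeatedly, reweighting each case by its scaled residual."""
    intercept, slope = weighted_line(x, y, [1.0] * len(x))
    for _ in range(iterations):
        residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
        # Robust scale estimate: median absolute residual divided by 0.6745.
        scale = statistics.median(abs(r) for r in residuals) / 0.6745
        if scale == 0:
            break
        weights = [huber_weight(r / scale, k) for r in residuals]
        intercept, slope = weighted_line(x, y, weights)
    return intercept, slope

# Hypothetical data: a clean line y = 2 + 3x with one gross outlier at the end.
x = list(range(10))
y = [2 + 3 * xi for xi in x]
y[-1] = 100
ols_fit = weighted_line(x, y, [1.0] * len(x))
robust_fit = huber_irls(x, y)
```

In this toy example the ordinary least-squares slope is pulled toward the outlier, while the reweighted fit stays closer to the slope of the clean points.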
[Figure 5.5: two panels, “Residuals vs Fitted” (x-axis: Fitted values; y-axis: Residuals) and “Normal Q-Q” (x-axis: Theoretical Quantiles; y-axis: Standardized residuals).]
Figure 5.6 Plot of Non-Expert Proportion against Mean Number of Expert CC Signals with
linear regression line (in blue)
After taking these measures to adjust for heteroscedasticity, along with the Box-Cox
transformation to adjust for non-normality, the estimate for the intercept is 10.372 (SE = 1.265;
p < 0.001) and the estimated coefficient on the proportion of non-experts is -9.555 (SE = 1.277;
p < 0.001). The adjusted R², which provides a measure of goodness-of-fit and can be interpreted
as the proportion of variance explained, equals 0.1934, or 19.34 percent of variance explained.
Although the null hypothesis that there is no relationship between expert signaling and
proportions of non-experts can be rejected, the slope of the line is opposite to the direction
hypothesized, and so the evidence does not provide support for this hypothesis.
[Figure 5.6 axes: x = Proportion Non-Experts in the Language Group; y = Mean Number of Expert CC Signals.]
The results from these first several hypotheses provide some evidence that experts embed
in their respective expert communities by using conspicuous connection signals on GitHub,
particularly for certain programming languages that are prominent within the online community.
The next important question to understand conspicuous connection signaling is how these visible
displays of embeddedness impact other outcomes.
The next hypothesis, Hypothesis 4, states that the post signal performance of those
individuals that send high visibility and/or high information signals will be better than those that
send low visibility and/or low information signals. As stated in the previous chapter, post signal
performance can be operationalized as the number of times an individual contributes to
repositories that are not their own (mean = 7.63; SD = 43.2; min = 0; max = 9906). As with
previous hypotheses, either a Poisson or a negative binomial model is appropriate depending on the
dispersion of the dependent variable, post signal performance. A Poisson regression was performed,
and a dispersion test was conducted. The null hypothesis that the variance is equal to the mean
can be rejected (dispersion parameter c = 216.176; p < 0.001), showing evidence of overdispersion.
As such, three negative binomial regressions were performed, one with only visibility, one with
only information, and one with both independent variables, along with the null model to assess
model fit. Table 5.8 summarizes all four evaluated models.
Table 5.8 Comparing Negative Binomial Models for H4, DV = post signal performance, n = 318,783
             Model 1            Model 2            Full Model         Null Model
Intercept    2.293 (0.004)***   2.171 (0.004)***   2.162 (0.004)***   2.032 (0.003)***
Visibility   0.014 (0.000)***   --                 0.002 (0.000)***   --
Information  --                 0.434 (0.002)***   0.411 (0.003)***   --
AIC          952672             945893             945831             1739065
Note. * p < 0.05. ** p < 0.01. *** p < 0.001.
The model with the lowest AIC is the full model, with both independent variables
included (AIC for full model = 945,831). This provides evidence that both visibility and
information impact post signal performance, with a coefficient on visibility = 0.002, p < 0.001
and a coefficient on information = 0.411, p < 0.001. This means that for every one-unit
increase in visibility, expected post signal performance is multiplied by $e^{0.002} = 1.002$,
or a 0.2 percent increase in post signal performance, all else equal. More dramatically, for
every one-unit increase in information, expected post signal performance is multiplied by
$e^{0.411} = 1.508$, or a 50.8 percent increase in post signal performance, all else equal.
While the effect size for visibility is small, as in the other models in this analysis, the
effect size for information is much larger and shows the impact of conspicuous connection signal
information on outcomes. The evidence provides support for both independent variables impacting
post signal performance as expected in Hypothesis 4.
While these hypotheses give some insight into how experts signal within their respective
groups, as well as individual-level outcomes, the next step is to examine hypotheses at the
system and network levels. The next section examines hypotheses five and six.
5.3 Results for System- and Network-Level Hypotheses
Hypothesis 5 is the first network-level hypothesis, and it states that people will have
more connections with others in the person-to-person (PP) network who claim the same expertise
area than with those who claim different expertise areas. Homophily is the concept that people
develop relationships with people who are similar to them (McPherson et al., 2001). This
hypothesis assessing homophily can be examined in different ways. First, Newman’s
assortativity coefficient r is a measure that provides information about the level of homophily
between people in a network based on a specific attribute of those individuals (M. E. Newman,
2002, 2003). In this case, the attribute of interest is being an expert in the same programming
language. A coefficient of 1 is perfectly assortative, meaning that all connections in the network
are between people of the same type; in this study, this would mean that all experts connect with
only other experts, all non-experts connect with only other non-experts, and there are no
connections between experts and non-experts. A coefficient of -1 is perfectly dissortative,
meaning all connections are between people of different types; in this study, all experts connect
with non-experts, and there are no expert-to-expert or non-expert-to-non-expert connections. A
coefficient of 0 means that the connections are random and the characteristic does not impact
whether people are connected; there will be some expert-to-expert connections, some non-
expert-to-non-expert connections, and some non-expert-to-expert mixing.
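The coefficient for a categorical attribute can be computed from the share of edges within each type and the share expected by chance. The sketch below is illustrative only: it treats the network as undirected, whereas follows on GitHub are directed, and the function name, node labels, and toy edge lists are hypothetical.

```python
from collections import defaultdict

def assortativity(edges, attribute):
    """Newman's assortativity coefficient for a categorical attribute, undirected edges."""
    mixing = defaultdict(float)
    for u, v in edges:
        # Count each undirected edge in both directions so the mixing matrix is symmetric.
        mixing[(attribute[u], attribute[v])] += 1.0
        mixing[(attribute[v], attribute[u])] += 1.0
    total = sum(mixing.values())
    types = {t for pair in mixing for t in pair}
    # Share of edge ends attached to each type.
    share = {t: sum(c for (i, _), c in mixing.items() if i == t) / total for t in types}
    within = sum(mixing.get((t, t), 0.0) for t in types) / total  # observed same-type share
    expected = sum(share[t] ** 2 for t in types)                  # same-type share by chance
    if expected == 1.0:
        raise ValueError("coefficient is undefined when every node has the same type")
    return (within - expected) / (1.0 - expected)

# Toy network: nodes 1 and 2 are experts (E), nodes 3 and 4 are non-experts (N).
expertise = {1: "E", 2: "E", 3: "N", 4: "N"}
r_same = assortativity([(1, 2), (3, 4)], expertise)   # only same-type ties
r_cross = assortativity([(1, 3), (2, 4)], expertise)  # only cross-type ties
```

The two toy networks correspond to the perfectly assortative and perfectly dissortative cases described above.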
Several scholars examining discrete assortativity suggest interpreting Newman’s
assortativity coefficient as follows: r > .35 is assortative; r between .26 and .34 is moderately
assortative; r between .25 and .15 is minimally assortative; and r ≤ .15 is dissortative (Doherty,
Schoenbach, & Adimora, 2009; Fujimoto & Williams, 2015; Schneider et al., 2013). Figure 5.7
is a histogram that shows the frequencies of assortativity coefficients for the different language
groups.
Figure 5.7 Assortativity Coefficient Histogram
[Figure 5.7 axes: x = Assortativity; y = Frequency.]
The histogram shows that most languages have an assortativity coefficient that hovers
around zero, which shows that the network is not assortative based on that attribute. Of the 121
languages with enough connections to calculate the assortativity coefficient, 34 languages have
coefficients less than zero (but very close to zero), meaning that people in those languages are
slightly more dissortative in their connections (i.e., they connect more with people of different
languages than of their own language) than people in languages with coefficients of zero or more.
Eleven languages have coefficients of zero, meaning being an expert in the same language has no
bearing on whether people connect within that language group. The remaining 76 languages have
assortativity coefficients greater than zero, but only four language groups have an assortativity
coefficient in the minimally assortative or moderately assortative range, meaning they connect
more often with others who are experts in their own language group than with people who are not
experts in their language group. The mean assortativity coefficient across all language groups is
r = 0.0352. Because so few languages show evidence of assortativity, the evidence does not
provide support for Hypothesis 5.
Hypothesis 6 states that team projects that request more contributions from individuals
who conspicuously connect more often will be more successful. The dependent variable of
success is operationalized as the number of watchers on the repository (mean = 52.15; SD =
372.08; min = 0; max = 62708), where the repository is the unit of analysis (n = 172,341).
Watchers are those who follow the progress of that repository by being notified of conversations,
so the number of watchers provides a count measure of success based on popularity or interest.
The independent variable is operationalized as the average number of conspicuous connections
of all contributors to that project. Either a Poisson or a negative binomial model is appropriate
depending on the dispersion of the count dependent variable. A Poisson regression was performed,
and a dispersion test was conducted. The null hypothesis that the variance is equal to the mean
can be rejected (c = 2636.59; p < 0.001), showing evidence of overdispersion. As such, a negative
binomial regression was performed along with the null model to assess model fit.
Results suggest that the average conspicuous connections per individual that contributes
to a repository impacts success of a repository (number of watchers), with a coefficient on the
independent variable = 0.013, p < 0.001. This means that for every one unit increase in average
conspicuous connections, there will be e
0.013
= 1.013 unit increase in watchers, or a 1.3 percent
increase in success. To evaluate the relative performance of this model, the AIC was examined
for both this and the null model, and the full model’s is lower (AIC for full model = 1,220,468;
AIC for null model = 1,223,387). This shows that the average number of conspicuous
connections does help with the performance of the model in explaining success, though the
influence on success is small (1.3%). As such, there is evidence providing support for Hypothesis
6.
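The conversion from the reported coefficient to a percent change, and the AIC comparison, can be worked through in a few lines of Python. This is only an arithmetic sketch using the values reported above; the function names are illustrative and do not come from the original analysis.

```python
import math


def incidence_rate_ratio(coef):
    """Exponentiate a count-model coefficient to get a multiplicative effect."""
    return math.exp(coef)


def percent_change(coef, delta=1.0):
    """Percent change in the expected count for a `delta`-unit increase in the predictor."""
    return (math.exp(coef * delta) - 1.0) * 100.0


coef = 0.013  # reported coefficient on average conspicuous connections
irr = incidence_rate_ratio(coef)
pct = percent_change(coef)
print(f"IRR = {irr:.3f}, percent change per unit = {pct:.1f}%")

# Because the effect is multiplicative, it compounds over larger increases.
print(f"percent change per 10 units = {percent_change(coef, 10):.1f}%")

# Reported AIC values; the lower value indicates the better-fitting model.
aic_full, aic_null = 1_220_468, 1_223_387
aic_drop = aic_null - aic_full
print(f"AIC improvement of full model over null model = {aic_drop}")
```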
Table 5.9 provides a summary of the results of all hypotheses, including the dependent
variable, independent variable(s), unit of analysis, whether it was supported or not, and the effect
size. For incident rate ratios and odds ratios, effects less than 25 percent change are labeled as
“small,” effects between 25 and 50 percent are labeled as “medium,” and effects greater than 50
percent are labeled as “large.” The hypothesis that is supported but not as an IRR or OR, H2, is
labeled as having a small effect because the difference between groups is less than one signal.
Table 5.9 Summary of Results for All Hypotheses
Hypothesis | DV | IV(s) | Unit of Analysis | Supported? | Effect Size
H1(a) | Contribution Requests | Visibility | Individual | No | --
H1(a) | Contribution Requests | Information | Individual | Yes | Small (IRR = +8.4%)
H1(b) | Future Connections | Visibility | Individual | Yes | Small (IRR = +1.1%)
H1(b) | Future Connections | Information | Individual | Yes | Large (IRR = +102.6%)
H1(c) | Expert (Yes/No) | Visibility | Individual, within Groups | No | --
H1(c) | Expert (Yes/No) | Information | Individual, within Groups | Mixed / No | Small (OR = +3.0%)
H1(d) | Content Spread | Visibility | Individual | Yes | Small (IRR = +1.3%)
H1(d) | Content Spread | Information | Individual | Yes | Medium (IRR = +27.0%)
H2 | Mean Number of Signals | Expert (Yes/No) | Language Groups | Yes | Small (MD between 0.5013 and 0.6885)
H3 | Mean Number of Signals | Proportion of Non-Experts to Experts | Language Groups | No | --
H4 | Post-Signal Performance | Visibility | Individuals | Yes | Small (IRR = +0.2%)
H4 | Post-Signal Performance | Information | Individuals | Yes | Large (IRR = +50.8%)
H5 | Connections | Similar Expertise Areas | Individuals, within Groups | No | --
H6 | Success | Mean # of CC signals of all contributors | Repository (Software Project) | Yes | Small (IRR = +1.3%)
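The labeling rule used for the effect sizes in Table 5.9 can be sketched as a small Python function. The thresholds come from the text (less than 25 percent is small, 25 to 50 percent is medium, greater than 50 percent is large); the function name and the treatment of the exact boundary values are assumptions made for illustration.

```python
def effect_label(percent_change):
    """Label an IRR or OR percent change using the thresholds stated for Table 5.9.

    Assumption: exactly 25 and exactly 50 percent are both treated as "medium",
    since the text only says "between 25 and 50 percent".
    """
    magnitude = abs(percent_change)
    if magnitude < 25:
        return "small"
    if magnitude <= 50:
        return "medium"
    return "large"


# Percent changes for the information variable, taken from Table 5.9.
information_effects = {"H1(a)": 8.4, "H1(b)": 102.6, "H1(d)": 27.0, "H4": 50.8}
labels = {h: effect_label(p) for h, p in information_effects.items()}
for hypothesis, label in labels.items():
    print(hypothesis, label)
```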
This chapter analyzed the hypotheses raised in the last chapter by focusing on GitHub.
The next chapter provides a discussion of implications, limitations, future work, and a summary
conclusion.
CHAPTER 6: DISCUSSION
This chapter contextualizes the previous chapters, first by outlining how the empirical
findings relate back to the theory and how these findings matter in practical terms. The next
section discusses the limitations of this study. The third section focuses on future work that can
address these limitations, as well as theoretical and empirical work that could advance research
about conspicuous connection signaling. The final section summarizes and concludes this study.
6.1 Implications
This study theorizes changes in individual, group, and network outcomes based on how
people perceive the expertise of others according to their visible network connections, or
conspicuous connection signals. The work of Donath and boyd (2007; 2004) was the first related
to the communication discipline that discussed network connections on social media as signals,
focusing on those signals’ reliability related to cost as indicators of identity. They did this
through theoretical discussion of evolutionary biology signaling research and communication
self-presentation literature, and anecdotal evidence from different social media sites. Others have
found empirical evidence that network connections as signals matter, but without theoretically
building on the implications of network connections as signals in signaling theory (M. Lin,
Prabhala, et al., 2013; Shami et al., 2009). This dissertation grounds itself in the traditions of information theory,
signaling games, social network theory, and transactive memory theory, to develop a theoretical
and empirical foundation on which to build a conspicuous connection signaling theory. This
study distinguishes itself by expanding on the concept of network connections as signals in five
important areas, including (a) utilizing the concept and measure of information in signals
(Skyrms, 2010), (b) examining previous literature on visibility (Brighenti, 2007; Leonardi, 2014;
Stohl et al., 2016; Treem & Leonardi, 2012) and developing a clear conceptual definition of
visibility of signals, (c) categorizing different conspicuous connection signals based on the type
and direction of the tie (Borgatti et al., 2014, 2009; Kane et al., 2014) and developing a rationale
for why two of those types are perceived differently (incoming and outgoing social relation
connections as signals of expertise and interest, respectively), (d) relating this to transactive
memory systems theory (Hollingshead, 1998; Wegner, 1995) by exploring how ‘who knows
who’ is related to ‘who knows what,’ and (e) empirically examining the influence of conspicuous
connection signals on outcomes at different levels of analysis in an online community.
Different types of conspicuous connection signals are theorized to exist, but incoming
and aggregated social relation conspicuous connections are expected to be particularly important
for perceptions of expertise in knowledge-intensive networks. As such, this study explores the
development of large scale transactive memory (TM) systems that would generally not be
theorized to work because of the difficulty of normal communication practices among such large
and dispersed groups (Palazzolo, 2005). This study utilizes the concepts of visibility (Leonardi,
2014; Stohl et al., 2016; Treem & Leonardi, 2012) and information (Skyrms, 2010) to examine
how signals of varying levels of these concepts impact outcomes and the development of these
systems. Information theory and signaling theory combined with social network theory provide a
framework for examining large networks where communication can take many forms and
coordination can occur at different levels. In addition, the theory proposes that conspicuous
connection signaling is a mechanism by which TM systems can develop, resulting in positive
outcomes for large groups.
Information theory, signaling theory, and social network theory are combined to develop
conspicuous connection signaling theory and to explain how it relates to TM systems theory.
Specifically, conspicuous connection signals are proposed as one causal mechanism explaining
the development of TM systems, particularly how individuals learn ‘who knows who’ and how
who they know is related to what they know (Leonardi, 2015; Moreland, 1999; Stasser, Stewart,
& Wittenbaum, 1995). Communication is key not only in the development of TM systems as
individuals learn about others, but also in their function as people share knowledge and retrieve
information (Hollingshead, 1998; Hollingshead & Brandon, 2003; Palazzolo, 2005; Wegner,
1995).
In the same vein, recent work has found that the communication network structure of
small groups changes the performance of those groups when turnover occurs, concluding that
centralized structures (where individuals are only connected to one central individual) are more
effective at integrating new members than fully connected structures (where individuals are all
connected to everyone) (Argote, Aven, & Kush, 2018). Different types of communication can
aid in the development of TM systems, and this study focuses particularly on defining and
examining one type of communicative mechanism, conspicuous connection signals. In this sense,
the network structure as visibly associated with particular individuals in the network provides
information about those individuals. Analogous to conspicuous peacock tails, experts’
connections to other experts act as visible signals to those who may not otherwise be able to
assess knowledge content or specialized language.
Focusing on specific types of communication may be a useful way to explore TM
systems in the future, where, rather than examining communication broadly defined, different
types of communication can be understood distinctly as operating in and influencing the system
in unique ways. Much TM systems research treats communication as an intentional and
motivated action by individuals (Wittenbaum, Hollingshead, & Botero, 2004), but this study
instead focuses on a communicative mechanism that is at least partially system-generated (X.
Lin, Spence, & Lachlan, 2016; Walther et al., 2008) and sometimes or often unintentional but
still useful for both the sender and receiver of the signal. Although the signals themselves are not
easily manipulable by the sender, people do wish to gain followers in online
communities as a signal of credibility (Jin & Phua, 2014; J. Y. Lee & Sundar, 2013; Westerman
et al., 2012), and they take actions to gain followers, such as creating more interesting content or even
purchasing followers (Confessore et al., 2018; K. Lee, 2014). This study also proposes that TM
systems can develop in larger groups if simple communicative mechanisms like conspicuous
connection signals are appropriate and effective.
Additionally, this study provides a more concise and measurable network-based
definition of visibility in order to accurately conceptualize how visibility of signals impacts
outcomes. This definition contrasts with previous work, which conceptualizes visibility as an
affordance and does not clearly or concisely define the term (Brighenti, 2007; Leonardi,
2014; Stohl et al., 2016; Treem & Leonardi, 2012). The definition suggested in this study instead
conceptualizes visibility as a property of a signal in a network, but is broad enough that, with
slight adjustments, it can be used in other contexts and outside of signaling as well. Along with
visibility, information in signals (Skyrms, 2010) is an important addition to communication and
organization science literature as both a useful concept and a good measure that helps explain
outcomes.
The empirical findings from GitHub, the social software coding site, suggest that
conspicuous connection signals and their associated measures of visibility and information
influence several of the hypothesized outcomes of interest at different levels of analysis, and in
particular impact outcomes of interest for TM systems theory at the individual and system levels.
The outcomes associated with visibility of signals (volume as raw count) and information in
signals (calculated through probabilities via contingency tables) vary in terms of effect size and
which independent variables are influential, depending on the hypothesis. The results with the
largest effects and most obvious implications are discussed first, and then the remaining
outcomes are discussed by grouping the results according to similar themes or trends.
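To make the idea of information in a signal "calculated through probabilities via contingency tables" more concrete, the sketch below computes Skyrms's (2010) pointwise quantity, log2 of P(state | signal) divided by P(state), from the cells of a contingency table. It is offered only as an illustration under stated assumptions: the exact calculation used in this study is described in earlier chapters and may differ, the function name is hypothetical, and the counts are invented.

```python
import math


def information_bits(n_signal_and_state, n_signal, n_state, n_total):
    """Information (in bits) that a signal carries in favor of a state.

    Computes log2( P(state | signal) / P(state) ) from contingency-table counts:
      n_signal_and_state: users who carry the signal and are in the state
      n_signal:           users who carry the signal
      n_state:            users who are in the state (e.g., experts)
      n_total:            all users
    """
    p_state_given_signal = n_signal_and_state / n_signal
    p_state = n_state / n_total
    return math.log2(p_state_given_signal / p_state)


# Hypothetical counts: 1,000 users, 100 of them experts; 50 users carry the
# signal, and 20 of those 50 are experts.
bits = information_bits(20, 50, 100, 1000)
print(f"information in the signal = {bits:.2f} bits")
```

In this invented example the signal moves the probability of expertise from 0.1 to 0.4, a fourfold increase, which corresponds to about 2 bits. A signal that leaves the probability unchanged carries 0 bits.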
The first outcomes discussed here are those with the most obviously substantial and large
effects, which include those where information in signals impacts future followers, content
spread, and post signal performance. There is expected to be a doubling in future followers for a
one unit increase in information, all else equal, for H1(b). In this study, information in signals
makes a large difference for what is known as preferential attachment in network theory (M. E. J.
Newman, 2001), where those who are popular or highly connected become even more popular
over time. However, rather than raw number of followers (visibility) where there is only about a
one percent change in future followers, it is the information in the signals that makes the biggest
difference in who is more popular. This shows that conspicuous connection signaling impacts the
structure of the network, and this network structure has been shown to influence outcomes in TM
systems research (Argote et al., 2018). In particular, if someone is highly central or acts as a
bridge in a local part of the network, they can hold much power or social capital and play a more
important role in terms of the functioning of the system because their removal would result in
full disconnection between many people (Burt, 2002).
Information in signals also has a medium-to-large impact on content spread for
individuals in H1(d), where there is a 27 percent increase in content spread for every one unit
increase in information, all else equal, while only about a one percent increase in content spread
with a one unit increase in visibility. Individuals on GitHub believe that the number of watchers
on their repositories is important because it indicates that their repository is useful in some way
(Dabbish et al., 2012). This individual level outcome is associated with the success of those
individuals’ projects, and so has implications for TM systems theory in that those who have more
conspicuous connection signals are more likely to own repositories that have more eyes on them.
As such, conspicuous connection signaling by owners is important for team outcomes.
For H4, people also are expected to have better post signal performance when they have
increases in information (about a 50 percent increase), while raw number of signals or visibility
makes a much smaller impact (0.2 percent increase). Post signal performance is discussed in
signaling theory literature as essential for understanding signaling equilibria and whether the
signaling system is working properly (Bergh et al., 2014). Post signal performance was examined
here as how often individuals contributed to repositories across the site. That information in
signals has a much larger impact than visibility suggests that the measurement of
information is a more useful and explanatory variable than the measurement of visibility
as a raw count. These hypotheses show that the conceptualization and measurement of
information in signals contributes substantially to the understanding of these outcomes. The next
section discusses possible reasons visibility plays a comparatively small role.
While information makes a large difference for those three hypotheses, many of the other
effects are smaller, generally less than a 10 percent change in the dependent variable for
every one unit increase in the independent variable. While the variables do make a significant
difference, visibility has quite a small effect in H1(a), with about a one percent increase in the
number of contribution requests, all else equal, and information’s influence is also much smaller
compared to the other hypotheses, with about an eight percent increase in contribution requests, all else
equal. In a working signaling system, those who signal more should receive more contribution
requests from receivers because those receivers more effectively and easily understand who is an
expert in the system. This would be a requirement for the system to be considered a working TM
system, as people need to be able to locate and access the specialized expertise of others by
knowing who knows what (Hollingshead, 2011; Wegner, 1995). For the language groups with
significant coefficients in H1(c), the effects are also small, with the average expected odds of
being an expert about one percent lower for a one unit increase in visibility (holding information
constant) and about eight percent higher for a one unit increase in information (holding visibility
constant).
However, small effects can still be important. Analyzing large-scale observational data
allows researchers “to identify weak but reliable signals in a complicated world” (Matz,
Gladstone, & Stillwell, 2017, p. 548). Also, when aggregated, even small effects can amount to
having an influence on sometimes hundreds, thousands, and even up to hundreds of thousands to
millions of people depending on the size of the network of interest (Bond et al., 2012; Kramer,
Guillory, & Hancock, 2014; Prentice & Miller, 1992). For example, a small effect of a change of
one percent applied to the population of the United States would be expected to influence more
than three million people (“Population Clock,” 2018). Of the 430,306 users who were followed
on GitHub in 2013, a one percent difference would amount to a change for about 4,303 people.
Those 4,303 people could make a substantial difference in the functioning of projects and could
also impact those that they are connected to in the network with their behavior. Large effects
cannot always be expected and small effects cannot be completely disregarded. If visibility
and/or information in conspicuous connection signals make even a small difference in how
people across a large network perceive the expertise of others, this could make a sizable
difference in the overall functioning of the website and what is proposed in this study as a large-
scale TM system.
The evidence does not provide support for H3 in this study, which states that as the
proportion of non-experts increases in a group, the amount of signaling should increase. This
could be because the proportion of and payoff for experts are more diluted in a large online
community where it may not be easy to tell that the proportion of non-experts to experts is
comparatively high. Additionally, because this study examines incoming social relation conspicuous
connection signals, also called acknowledgement CC signals, as signals of expertise, there is less
direct control over the number of signals sent by the sender. While the signals are still sent by the
sender because the number is displayed visibly on their profile page and communicates to others
their expertise level, the actual number cannot be directly manipulated but rather only indirectly
controlled by the sender, for example by having high quality content, because others find them
interesting and worthy to follow. This is likely to positively influence the reliability of the signal,
as previous research has illustrated by examining how increasing number of followers positively
impacts perceptions of source credibility (Jin & Phua, 2014; J. Y. Lee & Sundar, 2013;
Westerman et al., 2012), but also means that senders cannot easily adjust the number of signals if
they do notice differences in the proportion of non-experts to experts and potential payoffs from
signaling when the number of experts is low versus high compared to non-experts. This is similar
to other signaling situations where an individual’s signaling cannot be easily adjusted over time,
like the plume of a peacock’s tail as discussed in previous chapters.
When performing sub-group analyses on specific languages, like for H1(c), H2, and H5,
the results are mixed. Fewer than half of the language groups (48.7 percent) have significant
coefficients on information in signals playing a role in the odds of being an expert for H1(c), and
fewer of the language groups (28.9 percent) have significant coefficients on visibility.
Overall, however, the evidence does not support Hypothesis 1(c), that visibility and information
significantly change the odds of being an expert, since many of the languages’ coefficients are
not significant. Additionally, there could be something particular about the specific languages
that makes it more or less likely that visibility and information impact the odds of being an expert.
Perhaps the culture of certain language groups makes it more likely that people embed and get
followed by others. A post written in a technology and culture blog provides some anecdotal
evidence that programming languages have cultures, or at least cultural stereotypes that may or
may not be accurate (Yang & Rabkin, 2015). Whether this translates to empirical connection
patterns and signaling strategies by programmers on GitHub is yet to be determined.
For H2, the meta-analysis shows that size does not impact outcomes as much as it
appears from an initial visual inspection of results. As stated in the previous chapter, there could
be another moderator or third variable that impacts whether visibility and information
significantly influence if someone is an expert in that language. For example, perhaps it is not the
size of the language group but rather how embedded or active that language group has been over
time in the GitHub community, or something like the culture of the language as postulated for
H1(c). However, the meta-analysis does provide some evidence that there are some small
signaling differences between experts and non-experts, though the difference is less than one
signal between groups. The meta-analysis highlights the heterogeneity of results, showcasing
that different languages again have different trends within GitHub. Size does not appear to be a
driving factor in why the different languages have different signaling differences between
experts and non-experts.
It was hypothesized in H5 that those with expertise in the same languages would be more
likely to connect in the network than those with expertise in different languages. The
assortativity coefficient is in the disassortative range for most languages, meaning that most
language groups do not tend to have connections with those of the same language group more
often than random. The network, however, is very large and it may be that certain parts of the
network involve more assortative relationships than others. This is something to be examined in
the future.
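For readers unfamiliar with the assortativity coefficient, the following Python sketch computes Newman's coefficient for a categorical attribute from a list of directed ties, using r = (sum of e_ii minus sum of a_i b_i) / (1 minus sum of a_i b_i), where e_ij is the fraction of ties running from category i to category j. The tiny follower networks and the expert/non-expert labels are hypothetical, and the sketch is not a reconstruction of how the per-language coefficients in this study were computed.

```python
from collections import defaultdict


def attribute_assortativity(edges, group_of):
    """Newman's assortativity coefficient for a categorical node attribute.

    edges:    list of directed ties (source, target)
    group_of: dict mapping each node to its category
    Returns a value from -1 (ties only across categories) to 1 (ties only
    within categories); values near 0 indicate mixing close to random.
    """
    total = len(edges)
    if total == 0:
        raise ValueError("no edges")
    e = defaultdict(float)  # e[(i, j)]: fraction of ties from category i to j
    for u, v in edges:
        e[(group_of[u], group_of[v])] += 1.0 / total
    groups = {group_of[n] for edge in edges for n in edge}
    a = {g: sum(e[(g, h)] for h in groups) for g in groups}  # outgoing share
    b = {g: sum(e[(h, g)] for h in groups) for g in groups}  # incoming share
    within = sum(e[(g, g)] for g in groups)
    expected = sum(a[g] * b[g] for g in groups)
    if expected == 1:
        raise ValueError("undefined when all ties fall within one category")
    return (within - expected) / (1 - expected)


# Hypothetical users in one language group, labeled by expertise.
group_of = {"u1": "expert", "u2": "expert", "u3": "non-expert", "u4": "non-expert"}

# Ties only within categories: perfectly assortative.
within_only = [("u1", "u2"), ("u2", "u1"), ("u3", "u4"), ("u4", "u3")]
r_within = attribute_assortativity(within_only, group_of)

# Ties only across categories: perfectly disassortative.
across_only = [("u1", "u3"), ("u3", "u1"), ("u2", "u4"), ("u4", "u2")]
r_across = attribute_assortativity(across_only, group_of)

print(r_within, r_across)
```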
Also, the few languages that have high assortativity are languages that have high
information in signals. The assortativity coefficients for F# (r = 0.17), Ada (r = 0.18), and Eiffel
(r = 0.25) are in the assortative range, and all three of these groups also have more than one bit of
information in their signals (F# = 1.27 bits, Ada = 2.76 bits, Eiffel = 1.83 bits). This means that
language groups that show strong assortment within the same language group
also seem to be groups where the signal of a conspicuous connection is more
informative. Additionally, these are also some of the languages that have a large change in the
odds of being an expert due to information for H1(c) (F# = 18.1% change, Ada = 24.4% change,
Eiffel = 29.6% change). While these are only three of the more than 100 languages examined on
the site in 2013, it does show that these few languages have expert users who may behave
differently due to conspicuous connection signals than people in other languages.
Finally for H6, the hypothesis that examines how increasing numbers of
acknowledgement CC signals impact the success of large-scale TM systems, there also is about a
one percent increase in success of a project (number of watchers) for a unit increase in
conspicuous connections. This effect is small, but again could be impactful for the whole
community when aggregated. Additionally, this is just one way to operationalize success of
teams in an open source software development environment. While number of watchers is an
observable outcome that researchers can access and is already quantified, success may be
examined according to other less observable outcomes like satisfaction of users or quality of
code (Crowston, Annabi, & Howison, 2003; S.-Y. T. Lee, Kim, & Gupta, 2009) or other
observable outcomes like downloads (Grewal, Lilien, & Mallapragada, 2006). Conspicuous
connection signals may have a larger impact on success if operationalized in these ways. The
next section discusses other opportunities for measuring success.
While these findings provide some evidence supporting the theoretically-grounded
hypotheses, there are limitations. The next section outlines those limitations.
6.2 Limitations
This study examines observational behavioral data from GitHub, and while this kind of
digital trace information has its advantages, like no researcher interference on variables of
interest, there are also limitations. In the case of this study where people’s perceptions of
expertise and perceptions of conspicuous connection signals are important for the theory, the
only inferences that can be made are based on behavioral and observable measures. This study
cannot make inferences about what people are perceiving or what people actually think of the
signals of interest. All cognitive mechanisms are assumed rather than empirically examined. In
particular, the assumption is that incoming social relation conspicuous connection signals (also
called acknowledgment CC signals) are understood by receivers as indicators of expertise while
outgoing social relation conspicuous connection signals are perceived as indicators of interest.
Additionally, volume of signals is assumed to be a good proxy for visibility. These assumptions
cannot be verified in this study, but the next section discusses future work that will begin to
untangle whether this is the case. Finally, the empirical setting has limitations in terms of
availability of data. Some control variables that would be important and interesting as other
signals of expertise, particularly in terms of other kinds of stereotypes (gender revelation, profile
picture), location, website listing, and badge of organizations, among others, were not examined
in this study.
GitHub is also a unique social media site, where a very specific kind of expert, software
language experts, join to make a specific product, open source software. As a case study, the
empirical results cannot be generalized to other settings, especially settings outside of software
development. However, the results do provide guidelines on how to proceed when looking at
new locations of interest, such as focusing on information in signals.
In terms of the analyses, there may be issues with the independence of observations, one
of the assumptions for conducting generalized linear models, particularly because the analysis
involves the entire dataset of follow events over time as well as other behaviors like
contributions. As previously discussed, the data are collapsed and aggregated across the year so
that it is treated as a cross section, and there is just one observation per user or per repository,
depending on the unit of analysis. However, there could still be dependency problems. It is
possible that when people are friends, they are more likely to have similar numbers of total
friends. The degree assortativity (M. E. Newman, 2002) can determine if people tend to connect
to others with a similar number of connections or degree (degree defined as the number of ties a
node in the network has) more often than those with a different number of connections. The
assortativity coefficient was calculated, and it is negative but close to zero (r = -0.0461), which
means that people do not tend to associate more with people who have the same degree. As such,
it is less likely that there is a problem with dependency in the aggregated follower data. Despite
these precautions, care should be taken when interpreting the standard errors and p-values, and
future work should replicate and extend in different contexts and with different methodologies
that can ensure independent observations. However, as stated previously, this is a large dataset so
the p-values are already not very informative. The coefficients from the models can be
interpreted accurately despite the potential dependence issue.
Additionally, most of the findings produce small effects, generally under 10 percent
change in count or odds. In research that involves a large number of observations, the significance of
coefficients (p-values) is usually viewed as not very informative, while effect sizes are
advocated as better indicators of the importance of a variable in explaining outcomes (M. Lin,
Lucas Jr, et al., 2013; Matz et al., 2017; Sullivan & Feinn, 2012). However, as discussed in
section 6.1, effect sizes do not always need to be large to make a difference in outcomes for large
networks of hundreds of thousands of people. As such, though the effects may be small, future
work discussed in the next section should begin to uncover when small effects have a big impact.
Additionally, the operationalizations of some variables of interest may have tempered the effects.
One set of operationalizations that should be reexamined includes both spread of content
and success of a repository, with both currently utilizing the number of watchers as the measure
(total watchers for the individual’s repositories or watchers on each repository, respectively).
Watching a repository involves keeping track of the conversations and the progress of a
repository. Another possible measure would be the stars on a repository, which allow users to
bookmark a repository to easily find it again in the future; starring is much less intrusive to the
user than watching because the conversations are not tracked. However, only 610,462 of the over
three million pull requests had the number of stargazers on repositories. Future work should
untangle if the number of watchers or stargazers on a repository is a better indicator of spread
and/or success.
Another limitation is the operationalization of visibility, which in this study is measured
as the aggregated number of acknowledgement (incoming social relation) conspicuous
connection signals at the beginning of the 2013 year. While this operationalization works as a
good proxy for visibility as discussed in previous chapters, more research must be done to
understand if volume of acknowledgement CC signals does in fact map on to the perception of
the visibility of those signals, as conceptualized for receivers as the amount of effort that must be
expended to locate, access, and interpret information (Leonardi, 2014; Stohl et al., 2016; Treem
& Leonardi, 2012) and for the sender as the noticeability or observability of the information that is
sent (Connelly et al., 2011; Olson, 1965). To truly understand the amount of effort that must be
expended or the observability of information, other studies must examine the cognitive
mechanisms at play for both senders and receivers of conspicuous connection signals and how
that is related to perceived visibility of those signals.
The next section previews several potential future studies related to these cognitive
mechanisms as well as other related work to expand on conspicuous connection signaling.
6.3 Future Work
This study established the foundation of the theory of conspicuous connection signaling
and examined behavioral digital data from the GitHub social software coding website. However,
there is much work to do to continue to explore and expand on this theory as well as understand
how this theory relates to empirical outcomes.
First, exploring the causal mechanisms at the psychological level will be an important
step to understand how people perceive conspicuous connection signals and the associated
visibility of those signals. To do this, experimental studies isolating those cognitive mechanisms
of interest must be conducted. A first important step will be to uncover how different types of
conspicuous connections are perceived by individuals. Specifically, incoming and outgoing
social relation conspicuous connection signals should be tested against each other to understand
if the assumption of this study (that incoming social relation conspicuous connection signals are
perceived as indicators of expertise while outgoing social relation conspicuous connection
signals are perceived as indicators of interest) is valid. This could be done in a survey
experiment, where participants are shown the profile of an individual with different types of
conspicuous connection signals while holding everything else constant on the profile, and are then
asked questions about their perceptions of that individual’s interest and expertise,
as well as their behavioral intentions toward that individual. To examine how receivers perceive
the visibility of signals, future studies can also display varying volume levels of conspicuous
connections and develop a method for measuring perceptions of visibility. Other future
experimental studies will recruit groups to study how the perceptions of conspicuous connection
signals and the signals’ visibility and information can impact outcomes like success for large and
small groups, as is expected by this theory.
Additionally, previous research has found a curvilinear, inverted-U relationship between
the number of connections and outcomes of interest. For example, Westerman et al. (2012) found
that an increasing number of connections on Twitter positively impacts perceptions of source
credibility only up to a certain point, at which the relationship inverts and more connections
actually negatively impact perceptions of source credibility. Their interpretation of this finding
was that “having too many followers may cause people to think that the page owner is spending
too much time amassing followers, rather than actually providing useful content” (p. 204). This
is similar to a finding from scholars examining Facebook, where people with too many friends
were viewed as less socially attractive (Tong, Van Der Heide, Langwell, & Walther, 2008). Future
research should examine whether an inverted-U relationship is at work or whether there is a
threshold number of conspicuous connection signals at which the relationship changes.
This may particularly be the case in different contexts outside of GitHub, where amassing
followers may not indicate expertise or high quality as it has been shown to represent on that site
(Dabbish et al., 2012).
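One way such an inverted-U pattern could be probed is to fit a quadratic term and check whether the estimated turning point falls inside the observed range of connection counts. The sketch below is illustrative only: the data are synthetic, the variable names (`log_connections`, `credibility`) and the helper functions are hypothetical, and a negative quadratic coefficient by itself would not establish an inverted-U relationship without further testing.

```python
import random


def solve_linear_system(a, b):
    """Solve a small dense system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x**2 via the normal equations."""
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    normal_matrix = [[s[i + j] for j in range(3)] for i in range(3)]
    return solve_linear_system(normal_matrix, t)  # [b0, b1, b2]


def inverted_u_check(xs, ys):
    """Report the quadratic fit and whether it peaks inside the observed x range."""
    b0, b1, b2 = fit_quadratic(xs, ys)
    turning_point = -b1 / (2 * b2) if b2 != 0 else float("nan")
    inside_range = min(xs) < turning_point < max(xs)
    return {
        "b1": b1,
        "b2": b2,
        "turning_point": turning_point,
        "inverted_u": b2 < 0 and inside_range,
    }


# Synthetic illustration only: these values are NOT drawn from the GitHub data.
rng = random.Random(0)
log_connections = [10 * i / 199 for i in range(200)]
credibility = [
    2.0 + 1.2 * x - 0.1 * x * x + rng.gauss(0, 0.3) for x in log_connections
]

result = inverted_u_check(log_connections, credibility)
print(result)
```

A threshold effect, by contrast, would call for a piecewise specification rather than a single quadratic term; the same turning-point logic does not apply there.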
In TM systems theory, individuals become more specialized, and this specialization benefits the
group: it reduces the cognitive load of others, increases interdependence among team
members, and results in a more efficient and effective team (Brandon & Hollingshead, 2004;
Hollingshead & Brandon, 2003; Wegner, 1987). Future hypotheses tied directly to TM systems
theory could test whether people who are experts
in one or very few areas (specialists) have more conspicuous connection signals than those who
are experts in many areas (generalists), and whether groups that request contributions from
specialist signalers are more successful than groups that request contributions from
generalist signalers. Most users in the 2013 database owned repositories in only one
programming language (113,673 users). The counts drop off quickly, with 19,141 users owning
repositories in two programming languages, 4,982 users owning repositories in three languages,
1,546 users owning repositories in four programming languages, and so on. This shows that the
vast majority (about 81 percent) of GitHub users specialize in one programming language.
Examining how specialists interact in this community and others may provide more information
relevant to TM systems theory, because TM systems rely on people with increasingly specialized
expertise to function effectively (Hollingshead, 1998; Hollingshead & Brandon, 2003; Wegner,
1995).
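The specialization share reported above can be checked arithmetically from the counts given in the text. The sketch below is a rough check only: because the counts for users owning repositories in five or more languages are not listed here, the sum of the four listed counts understates the true total, so the computed share is an upper bound rather than the exact figure. The variable names are illustrative.

```python
# Users by number of programming languages in which they own repositories,
# as listed in the text (counts beyond four languages are not given there).
users_by_language_count = {1: 113_673, 2: 19_141, 3: 4_982, 4: 1_546}

listed_total = sum(users_by_language_count.values())
single_language_users = users_by_language_count[1]

# The true denominator also includes users with five or more languages,
# so this ratio is an upper bound on the single-language share.
upper_bound_share = single_language_users / listed_total

print(f"Listed users: {listed_total:,}")
print(f"Single-language share (upper bound): {upper_bound_share:.1%}")
```

With the four listed counts, the bound comes to roughly 81.6 percent, which is consistent with the reported figure of about 81 percent once users with five or more languages are added to the denominator.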
A research question of interest would be how different kinds of users display their
conspicuous connection signals in conjunction with their offline reputations. On GitHub, some of
those with a large number of conspicuous connection signals were associated with their offline
personas, and thus may be more invested in signaling and portraying themselves in a certain
light. For example, the user ‘mojombo,’ whose profile was used as an example in Chapter 3 and
again examined in Chapter 5 as the most followed user in 2013, clearly associates his online
profile with his offline and other online reputations by linking to his website, showing that he
was a GitHub employee in 2013, and displaying his location. His conspicuous connection patterns,
along with those of other highly followed users, can be examined in relation to their personal
choices on the site. This could be compared to those who do not signal as often on the site to see
if there are differences in the communication behaviors. Additionally, other signals that were not
examined in this study as controls, like profile pictures and revealed gender, can be examined in
the future to see how those signals interact with conspicuous connection signals and impact the
perception and behaviors of individuals, teams, and networks.
Future studies should examine how this theory operates in other contexts. Websites like
Quora, Stack Overflow, and Twitter provide an opportunity to see if the same expectations and
results hold in environments that are not directly related to open source software coding.
Perhaps those who are rich in knowledge resources (experts) interact differently on these sites
than those who are poor in knowledge resources (novices) (Bighash et al., 2018). As in the case
of conspicuous consumption, where those who are not wealthy sometimes purchase goods that
are “fakes” but look similar to luxury goods to deceptively communicate wealth and status,
novices may try to emulate experts by displaying connections that superficially appear to
indicate expertise but prove deceptive when examined more closely. For example, many
people have started to buy followers on Twitter to look more popular, famous, or authoritative
(Confessore et al., 2018), and some underground groups have started selling hacked profiles,
with “some accounts still [having] the original users' name so their friends may believe the
information is coming from a reputable source” (Mikhailova, 2018, para. 6). Presumably part of
the advantage these groups have in cultivating and selling online profiles is that they display
some sort of legitimacy through their conspicuous connections and other signals.
There are many other contexts, outside of online communities or at the intersection
of online and offline contexts, in which it would be interesting to examine conspicuous connection
signaling. In particular, examining cases where conspicuous connection signals are deceptive may
be fruitful. For example, pump-and-dump stock fraud often involves making false statements to
inflate the value of “microcap” or “penny” (low value) stocks in order to get people to purchase
that stock; the fraudsters then sell their own shares at the higher price before the stock price
plummets and the fraud becomes apparent (“‘Pump-and-Dumps’ and Market Manipulations,”
2013). Fraudsters sometimes claim that the company has an endorsement from a highly credible
source, like a financial guru, deceptively utilizing a conspicuous connection signal unbeknownst
to the receivers of that signal, some of whom purchase the stock. A Scottish man, for example,
was indicted for manipulating markets by creating Twitter accounts that looked authentic, then
tweeting about stocks and companies in order to artificially raise stock prices, allegedly causing
investors to lose more than 1.6 million dollars (“Scottish Citizen Indicted for Twitter-Based
Stock Manipulation Scheme,” 2015). Additionally, cases have been documented where either
legitimate or illegitimate endorsements by celebrities, media personalities, or other financially
savvy individuals cause stocks to rise or fall (Brown, 2017). The organization apparently or
actually receives this acknowledgement conspicuous connection signal from these individuals,
and this can impact markets, just as media coverage can impact financial markets (Engelberg &
Parsons, 2011). Different types of conspicuous connection signals as developed in Table 3.1 in
Chapter 3 can be examined in these different contexts to see how both senders’ and receivers’
perceptions and behaviors may change based on the conspicuous connection signals sent.
Context also may play a role in whether conspicuous connection signals actually make a
difference at all. In some communities, having and exhibiting many connections
may be viewed as unattractive or as a front for other inadequacies. For example, Tong et al.
(2008) found that having too many friends on sites like Facebook or the now-defunct Friendster
actually comes across as friend “whor[ing]” (p. 538), where people have no real interest
in the friends themselves and simply want to increase their friend count. In some cases,
having fewer friends, followers, or follows may instead indicate
exclusivity. For example, singing icon Beyoncé famously follows no one on the photo-sharing
application Instagram, much to the dismay of other celebrities who court her follow (Iasimone,
2017). In other cases, conspicuous connection signaling may not be normative in
the community of interest. These contexts and cases will be interesting to examine as counterexamples to
the logic that more signaling or more visibility is better. However, conspicuous connection
signaling is always at work. Choosing not to exhibit one’s connections is also
signaling. Just as Watzlawick, Bavelas, and Jackson (1967) said that “no matter how one may try, one
cannot not communicate” (p. 30), one cannot not signal through conspicuous connections.
A receiver will always perceive the number of connections that an individual exhibits, whether
non-existent, small, or large, as communicating something about that individual’s qualities. How
this works in different contexts is a question ripe for future research.
This study also pointed out the limitations in previous research examining visibility, and
proposed a networked definition of visibility as a property of a signal, with the conceptualization
incorporating not only the individual and the system but also those other individuals in the
network who may interact with that signal. Future work should continue to develop this
networked visibility concept outside of the affordances paradigm to be able to more clearly and
accurately examine and measure the phenomenon and determine how it influences outcomes.
Other measures associated with conspicuous connection signals should be developed and
explored in order to fully understand signaling of this kind.
6.4 Conclusion
Finding experts in large systems is difficult (Fulk & Yuan, 2013), but it is important in
order to fulfill the needs of projects that require the knowledge and skills of these experts. The
conspicuous connection signaling theory developed in this study, as well as the empirical
evidence provided, highlights that the information and visibility of conspicuous connection
signals matter for outcomes at the individual, group, and network levels. Conspicuous
connection signaling provides a framework for researchers to study how third parties, or
receivers, perceive network connection signals of senders and make judgements about those
individuals’ qualities, like their expertise. In addition to explaining how TM systems develop at
the group level, conspicuous connection signaling theory may also be helpful in understanding
other outcomes and perceptions at the individual level based in different theoretical traditions.
For example, cognitive social structures research examines the accuracy (and often more
importantly, the inaccuracy) of individuals’ mental representations of their social networks and
the consequences of these potential perceptual biases (Brands, 2013; Krackhardt, 1987).
Conspicuous connection signaling may also explain how people develop these cognitions about
their networks.
This study expands on previous research examining connections as signals (Donath, 2007;
Donath & boyd, 2004) by developing this theory of conspicuous connection signaling,
particularly as it relates not just to individuals but to groups and systems. Visibly displayed
friendships, professional relationships, and other associations are theorized to provide critical
information about individuals’ identity, and in this study one particular type of identity: claimed
expertise. Donath and boyd (2004) utilized signaling theory to discuss how visible connections
can verify one’s identity in an online world where it is sometimes not clear who the offline
individual is behind the profile. This study additionally treats conspicuous connection
signals as providing information in the context of large, distributed online systems where people
are attempting to coordinate and find experts to build products. Beyond impacting only whether
someone is telling the “truth” about their identity, conspicuous connection signals provide
information about the embeddedness of an individual in a community, and in this study, in a
community of experts. This work utilizes information theory and signaling games (Skyrms,
2010) to more formally untangle the influence of signals in networks, about networks.
Conspicuous connection signaling theory shifts the theoretical discussion of networks
away from the connections within the network and how the structure of the network itself
impacts outcomes, and instead focuses on what people think the network represents and how
those perceptions of the network impact outcomes. Information and signaling theory provide the
foundation to understand how communication impacts these perceptions about networks, and one
type of communication within TM systems is classified in a meaningful and quantifiable manner.
People must make decisions about others, and often those others are strangers. Many different
heuristics facilitate those decisions, and some of those heuristics are biased (Tversky &
Kahneman, 1974) while others are useful and evolutionarily advantageous (Gigerenzer &
Gaissmaier, 2011). In this study, judgements and decisions made based on conspicuous
connection signals are expected to be more informed and “better” than those made without
access to those signals and the information they contain, and group outcomes are also expected to
be better. More work needs to be done to determine when and how signals about networks
positively or negatively impact individual and group outcomes, but this first step sheds light on
how conspicuous connection signaling influences individuals, groups, and teams in the context
of a social open-source software coding community.
REFERENCES
Akaike, H. (1998). Information theory and an extension of the maximum likelihood principle. In
Selected Papers of Hirotugu Akaike (pp. 199–213). Springer.
Akerlof, G. A. (1970). The Market for “Lemons”: Quality Uncertainty and the Market
Mechanism. The Quarterly Journal of Economics, 84(3), 488–500.
Akgün, A. E., Byrne, J., Keskin, H., Lynn, G. S., & Imamoglu, S. Z. (2005). Knowledge
networks in new product development projects: A transactive memory perspective.
Information & Management, 42(8), 1105–1120. https://doi.org/10.1016/j.im.2005.01.001
Argote, L., Aven, B. L., & Kush, J. (2018). The Effects of Communication Networks and
Turnover on Transactive Memory and Group Performance. Organization Science.
https://doi.org/10.1287/orsc.2017.1176
Bagwell, L. S., & Bernheim, B. D. (1996). Veblen Effects in a Theory of Conspicuous
Consumption. The American Economic Review, 86(3), 349–373.
Bergh, D. D., Connelly, B. L., Ketchen, D. J., & Shannon, L. M. (2014). Signalling Theory and
Equilibrium in Strategic Management Research: An Assessment and a Research Agenda.
Journal of Management Studies, 51(8), 1334–1360. https://doi.org/10.1111/joms.12097
Bighash, L., Oh, P., Fulk, J., & Monge, P. (2018). The Value of Questions in Organizing:
Reconceptualizing Contributions to Online Public Information Goods. Communication
Theory, 28(1), 1–21. https://doi.org/10.1111/comt.12123
Bliege Bird, R., & Smith, E. (2005). Signaling Theory, Strategic Interaction, and Symbolic
Capital. Current Anthropology, 46(2), 221–248. https://doi.org/10.1086/427115
Bloch, F., Rao, V., & Desai, S. (2004). Wedding Celebrations as Conspicuous Consumption:
Signaling Social Status in Rural India. The Journal of Human Resources, 39(3), 675–695.
https://doi.org/10.2307/3558992
Bond, R. M., Fariss, C. J., Jones, J. J., Kramer, A. D. I., Marlow, C., Settle, J. E., & Fowler, J. H.
(2012). A 61-million-person experiment in social influence and political mobilization.
Nature, 489(7415), 295–298. https://doi.org/10.1038/nature11421
Borgatti, S. P. (2005). Centrality and network flow. Social Networks, 27(1), 55–71.
https://doi.org/10.1016/j.socnet.2004.11.008
Borgatti, S. P., Brass, D. J., & Halgin, D. S. (2014). Social Network Research: Confusions,
Criticisms, and Controversies. In Contemporary Perspectives on Organizational Social
Networks (Vol. 40, pp. 1–29). Emerald Group Publishing Limited.
Borgatti, S. P., & Cross, R. (2003). A Relational View of Information Seeking and Learning in
Social Networks. Management Science, 49(4), 432–445.
https://doi.org/10.1287/mnsc.49.4.432.14428
Borgatti, S. P., Mehra, A., Brass, D. J., & Labianca, G. (2009). Network analysis in the social
sciences. Science, 323(5916), 892–895.
Bourdieu, P. (1977). Outline of a Theory of Practice (Vol. 16). New York, NY: Cambridge
University Press.
Bourdieu, P. (1984). Distinction: A social critique of the judgement of taste. Cambridge, MA:
Harvard University Press.
Boyd, D. M., & Ellison, N. B. (2007). Social Network Sites: Definition, History, and
Scholarship. Journal of Computer-Mediated Communication, 13(1), 210–230.
https://doi.org/10.1111/j.1083-6101.2007.00393.x
Brandon, D. P., & Hollingshead, A. B. (2004). Transactive memory systems in organizations:
Matching tasks, expertise, and people. Organization Science, 15(6), 633–644.
https://doi.org/10.1287/orsc.1040.0069
Brands, R. A. (2013). Cognitive social structures in social network research: A review. Journal
of Organizational Behavior, 34(S1), S82–S103. https://doi.org/10.1002/job.1890
Braun, O. L., & Wicklund, R. A. (1989). Psychological antecedents of conspicuous
consumption. Journal of Economic Psychology, 10(2), 161–187.
https://doi.org/10.1016/0167-4870(89)90018-4
Brighenti, A. (2007). Visibility: A Category for the Social Sciences. Current Sociology, 55(3),
323–342. https://doi.org/10.1177/0011392107076079
Brown, A. (2017, January 31). How Social Media Affects the Markets. Retrieved April 20, 2018,
from https://www.financemagnates.com/forex/bloggers/social-media-affects-markets/
Burnham, K. P., & Anderson, D. R. (2002). Model Selection and Multimodel Inference: A
Practical Information-Theoretic Approach (2nd ed.). New York: Springer-Verlag.
Burt, R. S. (2002). Bridge decay. Social Networks, 24(4), 333–363.
https://doi.org/10.1016/S0378-8733(02)00017-5
Cadsby, C. B., Frank, M., & Maksimovic, V. (1990). Pooling, Separating, and Semiseparating
Equilibria in Financial Markets: Some Experimental Evidence. Review of Financial
Studies, 3(3), 315–342. https://doi.org/10.1093/rfs/3.3.315
Cameron, A. C., & Trivedi, P. K. (1990). Regression-based tests for overdispersion in the
Poisson model. Journal of Econometrics, 46(3), 347–364.
https://doi.org/10.1016/0304-4076(90)90014-K
Chaudhuri, H. R., Mazumdar, S., & Ghoshal, A. (2011). Conspicuous consumption orientation:
Conceptualisation, scale development and validation. Journal of Consumer Behaviour,
10(4), 216–224. https://doi.org/10.1002/cb.364
Confessore, N., Dance, G., Harris, R., & Hansen, M. (2018, January 27). The Follower Factory.
The New York Times. Retrieved from
https://www.nytimes.com/interactive/2018/01/27/technology/social-media-bots.html
Connelly, B. L., Certo, S. T., Ireland, R. D., & Reutzel, C. R. (2011). Signaling theory: A review
and assessment. Journal of Management, 37(1), 39–67.
Contractor, N. S., & Monge, P. R. (2002). Managing Knowledge Networks. Management
Communication Quarterly, 16(2), 249–258.
Corneo, G., & Jeanne, O. (1997). Conspicuous consumption, snobbism and conformism. Journal
of Public Economics, 66(1), 55–71. https://doi.org/10.1016/S0047-2727(97)00016-9
Coxe, S., West, S. G., & Aiken, L. S. (2013). Generalized linear models. Oxford Handbook of
Quantitative Methods, 26–51.
Craig, R. T. (1999). Communication Theory as a Field. Communication Theory, 9(2), 119–161.
https://doi.org/10.1111/j.1468-2885.1999.tb00355.x
Crowston, K., Annabi, H., & Howison, J. (2003). Defining Open Source Software Project
Success. ICIS 2003 Proceedings, 28.
Dabbish, L., Stuart, C., Tsay, J., & Herbsleb, J. (2012). Social Coding in GitHub: Transparency
and Collaboration in an Open Software Repository. In Proceedings of the ACM 2012
Conference on Computer Supported Cooperative Work (pp. 1277–1286). New York, NY,
USA: ACM. https://doi.org/10.1145/2145204.2145396
Del Re, A. C. (2015). A practical tutorial on conducting meta-analysis in R. The Quantitative
Methods for Psychology, 11(1), 37–50.
Delacre, M., Lakens, D., & Leys, C. (2017). Why Psychologists Should by Default Use Welch’s
t-test Instead of Student’s t-test. International Review of Social Psychology, 30(1).
https://doi.org/10.5334/irsp.82
Doherty, I. A., Schoenbach, V. J., & Adimora, A. A. (2009). Sexual Mixing Patterns and
Heterosexual HIV transmission among African Americans in the Southeastern United
States. Journal of Acquired Immune Deficiency Syndromes (1999), 52(1), 114–120.
https://doi.org/10.1097/QAI.0b013e3181ab5e10
Doll, B. (2012, December 19). The Octoverse in 2012. Retrieved December 14, 2017, from
https://github.com/blog/1359-the-octoverse-in-2012
Donath, J. (2007). Signals in social supernets. Journal of Computer-Mediated Communication,
13(1), 231–251.
Donath, J. (2011). Signals, truth and design. Manuscript available:
http://smg.media.mit.edu/people/Judith/SignalsTruthDesign.html
Donath, J., & boyd, danah. (2004). Public displays of connection. BT Technology Journal,
22(4), 71–82.
Ellison, N. B., Gibbs, J. L., & Weber, M. S. (2015). The use of enterprise social network sites for
knowledge sharing in distributed organizations: The role of organizational affordances.
American Behavioral Scientist, 59(1), 103–123.
https://doi.org/10.1177/0002764214540510
Engelberg, J. E., & Parsons, C. A. (2011). The causal impact of media in financial markets. The
Journal of Finance, 66(1), 67–97.
Ericsson, K. A., & Smith, J. (1991). Prospects and limits of the empirical study of expertise: An
introduction. In K. A. Ericsson & J. Smith (Eds.), Toward a general theory of expertise:
Prospects and limits (pp. 1–38). Cambridge, UK: Cambridge University Press.
Evans, R. (2008). The Sociology of Expertise: The Distribution of Social Fluency. Sociology
Compass, 2(1), 281–298. https://doi.org/10.1111/j.1751-9020.2007.00062.x
Evetts, J., Mieg, H. A., & Felt, U. (2006). Professionalization, scientific expertise, and elitism: A
sociological perspective. In K. Anders Ericsson, N. Charness, P. J. Feltovich, & R. R.
Hoffman (Eds.), The Cambridge Handbook of Expertise and Expert Performance (pp.
105–126). Cambridge, UK: Cambridge University Press.
Faraj, S., & Sproull, L. (2000). Coordinating Expertise in Software Development Teams.
Management Science, 46(12), 1554–1568.
https://doi.org/10.1287/mnsc.46.12.1554.12072
Flanagin, A. J., & Metzger, M. J. (2013). Trusting expert-versus user-generated ratings online:
The role of information volume, valence, and consumer characteristics. Computers in
Human Behavior, 29(4), 1626–1634.
Fujimoto, K., & Williams, M. L. (2015). Racial/ethnic differences in sexual network mixing: A
log-linear analysis of HIV status by partnership and sexual behavior among most at-risk
MSM. AIDS and Behavior, 19(6), 996–1004.
Fulk, J. (2016). Conceptualizing multilevel expertise. In J. Treem & P. M. Leonardi (Eds.),
Expertise, communication, and organizing (pp. 251–270). New York, NY: Oxford
University Press.
Fulk, J., Heino, R., Flanagin, A. J., Monge, P. R., & Bar, F. (2004). A test of the individual
action model for organizational information commons. Organization Science, 15(5), 569–
585. https://doi.org/10.1287/orsc.1040.0081
Fulk, J., & Yuan, Y. C. (2013). Location, motivation, and social capitalization via enterprise
social networking. Journal of Computer-Mediated Communication, 19(1), 20–37.
https://doi.org/10.1111/jcc4.12033
Gibson, J. J. (1986). The Ecological Approach To Visual Perception. Taylor & Francis Group.
Gigerenzer, G., & Gaissmaier, W. (2011). Heuristic Decision Making. Annual Review of
Psychology, 62(1), 451–482. https://doi.org/10.1146/annurev-psych-120709-145346
Gintis, H. (2009). Game Theory Evolving: A Problem-centered Introduction to Modeling
Strategic Behavior. Princeton, NJ: Princeton University Press.
Gintis, H., Smith, E. A., & Bowles, S. (2001). Costly signaling and cooperation. Journal of
Theoretical Biology, 213(1), 103–119.
GitHub Archive. (n.d.). Retrieved September 2, 2017, from https://www.githubarchive.org/
GitHub Octoverse 2017. (n.d.). Retrieved December 14, 2017, from
https://octoverse.github.com/
Grace, D., & Griffin, D. (2006). Exploring conspicuousness in the context of donation behaviour.
International Journal of Nonprofit and Voluntary Sector Marketing, 11(2), 147–154.
https://doi.org/10.1002/nvsm.24
Grace, D., & Griffin, D. (2009). Conspicuous donation behaviour: scale development and
validation. Journal of Consumer Behaviour, 8(1), 14–25. https://doi.org/10.1002/cb.270
Grafen, A. (1990). Biological signals as handicaps. Journal of Theoretical Biology, 144(4), 517–
546.
Grewal, R., Lilien, G. L., & Mallapragada, G. (2006). Location, Location, Location: How
Network Embeddedness Affects Project Success in Open Source Systems. Management
Science, 52(7), 1043–1056.
Griskevicius, V., Tybur, J. M., & Van den Bergh, B. (2010). Going green to be seen: Status,
reputation, and conspicuous conservation. Journal of Personality and Social Psychology,
98(3), 392–404. https://doi.org/10.1037/a0017346
Guilford, T., & Dawkins, M. S. (1995). What are conventional signals? Animal Behaviour, 49(6),
1689–1695. https://doi.org/10.1016/0003-3472(95)90090-X
Halgin, D. (2008). All in the family: Network ties as determinants of reputation and identity in
NCAA basketball. In Academy of Management Proceedings (Vol. 2008, pp. 1–6).
Academy of Management. Retrieved from
http://proceedings.aom.org.libproxy1.usc.edu/content/2008/1/1.298.short
Hello World · GitHub Guides. (2016, April 7). Retrieved November 14, 2017, from
https://guides.github.com/activities/hello-world/
Higgins, J. P. T., & Thompson, S. G. (2002). Quantifying heterogeneity in a meta-analysis.
Statistics in Medicine, 21(11), 1539–1558. https://doi.org/10.1002/sim.1186
Higgins, J. P. T., Thompson, S. G., Deeks, J. J., & Altman, D. G. (2003). Measuring
inconsistency in meta-analyses. BMJ : British Medical Journal, 327(7414), 557–560.
Higgins, J. P. T., Thompson, S. G., & Spiegelhalter, D. J. (2009). A re-evaluation of random-
effects meta-analysis. Journal of the Royal Statistical Society: Series A (Statistics in
Society), 172(1), 137–159.
Hilbe, J. M. (2011). Negative Binomial Regression. New York, NY: Cambridge University
Press.
Hollingshead, A. B. (1998). Communication, learning, and retrieval in transactive memory
systems. Journal of Experimental Social Psychology, 34(5), 423–442.
Hollingshead, A. B. (2001). Cognitive interdependence and convergent expectations in
transactive memory. Journal of Personality and Social Psychology, 81(6), 1080–1089.
https://doi.org/10.1037/0022-3514.81.6.1080
Hollingshead, A. B. (2011). Transactive memory system. In J. Levine & M. Hogg (Eds.),
Encyclopedia of group processes & intergroup relations (pp. 932–934). Thousand Oaks,
CA: SAGE Publications, Inc.
Hollingshead, A. B., & Brandon, D. P. (2003). Potential Benefits of Communication in
Transactive Memory Systems. Human Communication Research, 29(4), 607–615.
https://doi.org/10.1111/j.1468-2958.2003.tb00859.x
Hollingshead, A. B., & Fraidin, S. N. (2003). Gender stereotypes and assumptions about
expertise in transactive memory. Journal of Experimental Social Psychology, 39(4), 355–
363. https://doi.org/10.1016/S0022-1031(02)00549-8
Huedo-Medina, T. B., Sánchez-Meca, J., Marín-Martínez, F., & Botella, J. (2006). Assessing
heterogeneity in meta-analysis: Q statistic or I² index? Psychological Methods, 11(2),
193.
Iasimone, A. (2017, September 28). Ellen DeGeneres Vies to Become the One Person Beyonce
Follows on Instagram. Retrieved June 1, 2018, from
https://www.billboard.com/articles/columns/pop/7981701/ellen-degeneres-beyonce-instagram-follow-single-ladies-dance
Jarvenpaa, S. L., & Majchrzak, A. (2008). Knowledge Collaboration Among Professionals
Protecting National Security: Role of Transactive Memories in Ego-Centered Knowledge
Networks. Organization Science, 19(2), 260–276. https://doi.org/10.1287/orsc.1070.0315
Jin, S.-A. A., & Phua, J. (2014). Following Celebrities’ Tweets About Brands: The Impact of
Twitter-Based Electronic Word-of-Mouth on Consumers’ Source Credibility Perception,
Buying Intention, and Social Identification With Celebrities. Journal of Advertising,
43(2), 181–195. https://doi.org/10.1080/00913367.2013.827606
Kane, G. C., Alavi, M., Labianca, G., & Borgatti, S. P. (2014). What’s Different About Social
Media Networks? A Framework and Research Agenda. MIS Quarterly, 38(1), 275–304.
Krackhardt, D. (1987). Cognitive social structures. Social Networks, 9(2), 109–134.
Kramer, A. D. I., Guillory, J. E., & Hancock, J. T. (2014). Experimental evidence of massive-
scale emotional contagion through social networks. Proceedings of the National Academy
of Sciences, 111(24), 8788–8790. https://doi.org/10.1073/pnas.1320040111
Krippendorff, K. (1993). The Past of Communication’s Hoped-For Future. Journal of
Communication, 43(3), 34–44. https://doi.org/10.1111/j.1460-2466.1993.tb01274.x
La, A. (2016, September 14). The State of the Octoverse. Retrieved December 14, 2017, from
https://github.com/blog/2257-the-state-of-the-octoverse
Lachmann, M., Szamado, S., & Bergstrom, C. T. (2001). Cost and conflict in animal signals and
human language. Proceedings of the National Academy of Sciences, 98(23), 13189–
13194.
Lee, C. H., Cook, S., Lee, J. S., & Han, B. (2016). Comparison of Two Meta-Analysis Methods:
Inverse-Variance-Weighted Average and Weighted Sum of Z-Scores. Genomics &
Informatics, 14(4), 173–180. https://doi.org/10.5808/GI.2016.14.4.173
Lee, J. Y., & Sundar, S. S. (2013). To Tweet or to Retweet? That Is the Question for Health
Professionals on Twitter. Health Communication, 28(5), 509–524.
https://doi.org/10.1080/10410236.2012.700391
Lee, J.-Y., Bachrach, D. G., & Lewis, K. (2014). Social Network Ties, Transactive Memory, and
Performance in Groups. Organization Science, 25(3), 951–967.
https://doi.org/10.1287/orsc.2013.0884
Lee, K. (2014, May 27). 6 Research-Backed Ways To Get More Followers On Any Social
Media. Retrieved April 23, 2018, from
https://www.fastcompany.com/3031006/6-research-backed-ways-to-get-more-followers-on-any-social-media
Lee, S.-Y. T., Kim, H.-W., & Gupta, S. (2009). Measuring open source software success.
Omega, 37(2), 426–438. https://doi.org/10.1016/j.omega.2007.05.005
Leonardi, P. M. (2014). Social media, knowledge sharing, and innovation: Toward a theory of
communication visibility. Information Systems Research, 25(4), 796–816.
https://doi.org/10.1287/isre.2014.0536
Leonardi, P. M. (2015). Ambient Awareness and Knowledge Acquisition: Using Social Media to
Learn “Who Knows What” and “Who Knows Whom.” MIS Quarterly, 39(4), 747–762.
Leonardi, P. M., & Treem, J. (2012). Knowledge management technology as a stage for strategic
self-presentation: Implications for knowledge sharing in organizations. Information and
Organization, 22(1), 37–59. https://doi.org/10.1016/j.infoandorg.2011.10.003
Lewis, D. (1969). Convention: A philosophical study. Malden, MA: Harvard University Press.
Lewis, K., & Herndon, B. (2011). Transactive Memory Systems: Current Issues and Future
Research Directions. Organization Science, 22(5), 1254–1265.
https://doi.org/10.1287/orsc.1110.0647
Lin, L., Geng, X., & Whinston, A. B. (2005). A Sender-Receiver Framework for Knowledge
Transfer. MIS Quarterly, 29(2), 197–219.
Lin, M., Lucas Jr, H. C., & Shmueli, G. (2013). Research commentary—too big to fail: large
samples and the p-value problem. Information Systems Research, 24(4), 906–917.
Lin, M., Prabhala, N. R., & Viswanathan, S. (2013). Judging Borrowers by the Company They
Keep: Friendship Networks and Information Asymmetry in Online Peer-to-Peer Lending.
Management Science, 59(1), 17–35. https://doi.org/10.1287/mnsc.1120.1560
Lin, X., Spence, P. R., & Lachlan, K. A. (2016). Social media and credibility indicators: The
effect of influence cues. Computers in Human Behavior, 63, 264–271.
https://doi.org/10.1016/j.chb.2016.05.002
linguist: Language Savant. (2018). Ruby, GitHub. Retrieved from
https://github.com/github/linguist (Original work published 2011)
Majchrzak, A., Faraj, S., Kane, G. C., & Azad, B. (2013). The contradictory influence of social
media affordances on online communal knowledge sharing. Journal of Computer-
Mediated Communication, 19(1), 38–55. https://doi.org/10.1111/jcc4.12030
Matz, S. C., Gladstone, J. J., & Stillwell, D. (2017). In a World of Big Data, Small Effects Can
Still Matter: A Reply to Boyce, Daly, Hounkpatin, and Wood (2017). Psychological
Science, 28(4), 547–550. https://doi.org/10.1177/0956797617697445
Maynard-Smith, J., & Harper, D. (2003). Animal Signals. Oxford, England: Oxford University
Press.
McGlone, M. S., & Giles, H. (2011). Language and interpersonal communication. In M. L.
Knapp & J. A. Daly (Eds.), The SAGE Handbook of Interpersonal Communication.
SAGE Publications.
McPherson, M., Smith-Lovin, L., & Cook, J. M. (2001). Birds of a Feather: Homophily in Social
Networks. Annual Review of Sociology, 27(1), 415–444.
https://doi.org/10.1146/annurev.soc.27.1.415
Mieg, H. A. (2006). Social and sociological factors in the development of expertise. In K. A.
Ericsson, N. Charness, P. J. Feltovich, & R. R. Hoffman (Eds.), The Cambridge
Handbook of Expertise and Expert Performance (pp. 743–760). Cambridge, UK:
Cambridge University Press.
Mikhailova, A. (2018, April 8). Hacked Facebook and Twitter accounts sold online for as little
as £1, Telegraph investigation finds. The Telegraph. Retrieved from
https://www.telegraph.co.uk/news/2018/04/08/hacked-facebook-twitter-accounts-sold-online-little-1-telegraph/
Mitchell, M. (2009). Complexity: A guided tour. New York, NY: Oxford University Press.
mojombo (Tom Preston-Werner). (n.d.). Retrieved November 14, 2017, from
https://github.com/mojombo
Monge, P., & Contractor, N. S. (2003). Theories of communication networks. New York, NY:
Oxford University Press.
Moreland, R. L. (1999). Transactive memory: Learning who knows what in work groups and
organizations. In L. L. Thompson, J. M. Levine, & D. M. Messick (Eds.), Shared
cognition in organizations: The management of knowledge (pp. 3–31). Mahwah, NJ, US:
Lawrence Erlbaum Associates Publishers.
Nelissen, R. M., & Meijers, M. H. (2011). Social benefits of luxury brands as costly signals of
wealth and status. Evolution and Human Behavior, 32(5), 343–355.
Newman, M. E. (2002). Assortative mixing in networks. Physical Review Letters, 89(20),
208701.
Newman, M. E. (2003). Mixing patterns in networks. Physical Review E, 67(2), 026126.
Newman, M. E. J. (2001). Clustering and preferential attachment in growing networks. Physical
Review E, 64(2), 025102. https://doi.org/10.1103/PhysRevE.64.025102
NOAA National Centers for Environmental Information. (2015). Annual Climatological
Summary PDF: Los Angeles, CA.
Norman, D. A. (2008). The way I see it: Signifiers, not affordances. Interactions, 15(6), 18–19.
Olson, M. (1965). The logic of collective action: Public goods and the theory of groups.
Cambridge, MA: Harvard University Press.
Ott, M., Cardie, C., & Hancock, J. (2012). Estimating the prevalence of deception in online
review communities. In Proceedings of the 21st international conference on World Wide
Web (pp. 201–210). ACM.
Palazzolo, E. T. (2005). Organizing for Information Retrieval in Transactive Memory Systems.
Communication Research, 32(6), 726–761. https://doi.org/10.1177/0093650205281056
Palazzolo, E. T., Serb, D. A., She, Y., Su, C., & Contractor, N. S. (2006). Coevolution of
Communication and Knowledge Networks in Transactive Memory Systems: Using
Computational Models for Theoretical Development. Communication Theory, 16(2),
223–250. https://doi.org/10.1111/j.1468-2885.2006.00269.x
Pierce, J. R. (1980). An introduction to information theory: Symbols, signals and noise (2nd rev.
ed.). New York, NY: Dover Publications, Inc.
Plummer, L. A., Allison, T. H., & Connelly, B. L. (2015). Better Together? Signaling
Interactions in New Venture Pursuit of Initial External Capital. Academy of Management
Journal, amj.2013.0100. https://doi.org/10.5465/amj.2013.0100
Population Clock. (2018, April 2). Retrieved April 4, 2018, from
https://www.census.gov/popclock/
Prentice, D. A., & Miller, D. T. (1992). When small effects are impressive. Psychological
Bulletin, 112, 160–164.
“Pump-and-Dumps” and Market Manipulations. (2013, June). Retrieved April 20, 2018, from
https://www.sec.gov/fast-answers/answerspumpdumphtm.html
Puts, D. A., Gaulin, S. J. C., & Verdolini, K. (2006). Dominance and the evolution of sexual
dimorphism in human voice pitch. Evolution and Human Behavior, 27(4), 283–296.
https://doi.org/10.1016/j.evolhumbehav.2005.11.003
Ramel, D. (2016, September 15). Microsoft Beats Facebook as Open Source Champion.
Retrieved December 14, 2017, from
https://adtmag.com/articles/2016/09/15/github-octoverse.aspx
Ren, Y., & Argote, L. (2011). Transactive Memory Systems 1985–2010: An Integrative
Framework of Key Dimensions, Antecedents, and Consequences. The Academy of
Management Annals, 5(1), 189–229. https://doi.org/10.1080/19416520.2011.590300
Riley, J. G. (2001). Silver Signals: Twenty-Five Years of Screening and Signaling. Journal of
Economic Literature, 39(2), 432–478.
Ruxton, G. D. (2006). The unequal variance t-test is an underused alternative to Student’s t-test
and the Mann–Whitney U test. Behavioral Ecology, 17(4), 688–690.
https://doi.org/10.1093/beheco/ark016
Schneider, J. A., Cornwell, B., Ostrow, D., Michaels, S., Schumm, P., Laumann, E. O., &
Friedman, S. (2013). Network Mixing and Network Influences Most Linked to HIV
Infection and Risk Behavior in the HIV Epidemic Among Black Men Who Have Sex
With Men. American Journal of Public Health, 103(1), e28–e36.
https://doi.org/10.2105/AJPH.2012.301003
Schwarzer, G., Carpenter, J. R., & Rücker, G. (2015). Meta-analysis with R. Switzerland:
Springer. Retrieved from https://doi.org/10.1007/978-3-319-21416-0
Scottish Citizen Indicted for Twitter-Based Stock Manipulation Scheme. (2015, November 5).
[Press Release]. Retrieved April 20, 2018, from
https://www.fbi.gov/contact-us/field-offices/sanfrancisco/news/press-releases/scottish-citizen-indicted-for-twitter-based-stock-manipulation-scheme
Sexton, S. E., & Sexton, A. L. (2014). Conspicuous conservation: The Prius halo and willingness
to pay for environmental bona fides. Journal of Environmental Economics and
Management, 67(3), 303–317. https://doi.org/10.1016/j.jeem.2013.11.004
Shami, N. S., Ehrlich, K., Gay, G., & Hancock, J. T. (2009). Making Sense of Strangers’
Expertise from Signals in Digital Artifacts. In Proceedings of the SIGCHI Conference on
Human Factors in Computing Systems (pp. 69–78). New York, NY, USA: ACM.
https://doi.org/10.1145/1518701.1518713
Shannon, C. E. (1948). A Mathematical Theory of Communication. The Bell System Technical
Journal, 27, 379–423, 623–656. https://doi.org/10.1145/584091.584093
Shannon, C. E., & Weaver, W. (1949). The Mathematical Theory of Communication. Urbana,
Illinois: University of Illinois Press.
Shumate, M., & Contractor, N. S. (2013). Emergence of Multidimensional Social Networks. In
L. L. Putnam & D. K. Mumby (Eds.), The SAGE Handbook of Organizational
Communication (pp. 449–474). Thousand Oaks, CA: SAGE Publications, Inc.
Shumate, M., Pilny, A., Atouba, Y. C., Kim, J., Pena-y-Lillo, M., Cooper, K. R., … Yang, S. (2013).
A Taxonomy of Communication Networks. Annals of the International Communication
Association, 37(1), 95–123. https://doi.org/10.1080/23808985.2013.11679147
Siewert, C. (2016). Consciousness and Intentionality. In E. N. Zalta (Ed.), The Stanford
Encyclopedia of Philosophy (Fall 2016). Retrieved from
http://plato.stanford.edu/archives/fall2016/entries/consciousness-intentionality/
Skyrms, B. (2009). Evolution of signalling systems with multiple senders and receivers.
Philosophical Transactions of the Royal Society B: Biological Sciences, 364(1518), 771–
779. https://doi.org/10.1098/rstb.2008.0258
Skyrms, B. (2010). Signals: Evolution, Learning, and Information. Oxford University Press.
Spence, M. (1973). Job market signaling. The Quarterly Journal of Economics, 87(3), 355–374.
Spence, M. (2002). Signaling in retrospect and the informational structure of markets. American
Economic Review, 92(3), 434–459.
Stasser, G., Stewart, D. D., & Wittenbaum, G. M. (1995). Expert roles and information exchange
during discussion: The importance of knowing who knows what. Journal of Experimental
Social Psychology, 31(3), 244–265.
Stasser, G., & Titus, W. (2003). Hidden profiles: A brief history. Psychological Inquiry, 14(3–4),
304–313.
Stiglitz, J. E. (2002). Information and the Change in the Paradigm in Economics. American
Economic Review, 92(3), 460–501. https://doi.org/10.1257/00028280260136363
Stohl, C., Stohl, M., & Leonardi, P. M. (2016). Digital Age | Managing opacity: Information
visibility and the paradox of transparency in the digital age. International Journal of
Communication, 10, 15.
Sullivan, G. M., & Feinn, R. (2012). Using Effect Size—or Why the P Value Is Not Enough.
Journal of Graduate Medical Education, 4(3), 279–282.
https://doi.org/10.4300/JGME-D-12-00156.1
Sundie, J. M., Kenrick, D. T., Griskevicius, V., Tybur, J. M., Vohs, K. D., & Beal, D. J. (2011).
Peacocks, Porsches, and Thorstein Veblen: Conspicuous consumption as a sexual
signaling system. Journal of Personality and Social Psychology, 100(4), 664–680.
https://doi.org/10.1037/a0021669
Számadó, S. (2011). The cost of honesty and the fallacy of the handicap principle. Animal
Behaviour, 81(1), 3–10.
Tong, S. T., Van Der Heide, B., Langwell, L., & Walther, J. B. (2008). Too Much of a Good Thing?
The Relationship Between Number of Friends and Interpersonal Impressions on
Facebook. Journal of Computer-Mediated Communication, 13(3), 531–549.
https://doi.org/10.1111/j.1083-6101.2008.00409.x
Treem, J. (2012). Communicating Expertise: Knowledge Performances in Professional-Service
Firms. Communication Monographs, 79(1), 23–47.
https://doi.org/10.1080/03637751.2011.646487
Treem, J., & Leonardi, P. (2012). Social media use in organizations: Exploring the affordances
of visibility, editability, persistence, and association. Communication Yearbook, 36, 143–
189. https://doi.org/10.1080/23808985.2013.11679130
Trigg, A. B. (2001). Veblen, Bourdieu, and conspicuous consumption. Journal of Economic
Issues, 35(1), 99–115.
Tversky, A., & Kahneman, D. (1974). Judgment under Uncertainty: Heuristics and Biases.
Science, 185(4157), 1124–1131. https://doi.org/10.1126/science.185.4157.1124
Understanding the GitHub Flow · GitHub Guides. (n.d.). Retrieved December 19, 2017, from
https://guides.github.com/introduction/flow/
Utz, S. (2010). Show me your friends and I will tell you what type of person you are: How one’s
profile, number of friends, and type of friends influence impression formation on social
network sites. Journal of Computer-Mediated Communication, 15(2), 314–335.
https://doi.org/10.1111/j.1083-6101.2010.01522.x
Veblen, T. (1899). The theory of the leisure class. New York, NY: The New American
Library.
Walther, J. B., Van Der Heide, B., Kim, S., Westerman, D., & Tong, S. T. (2008). The Role of
Friends’ Appearance and Behavior on Evaluations of Individuals on Facebook: Are We
Known by the Company We Keep? Human Communication Research, 34(1), 28–49.
https://doi.org/10.1111/j.1468-2958.2007.00312.x
Wasserman, S., & Faust, K. (1994). Social Network Analysis: Methods and Applications. New
York, NY: Cambridge University Press.
Watzlawick, P., Bavelas, J. B., & Jackson, D. D. (1967). Pragmatics of Human Communication:
A Study of Interactional Patterns, Pathologies and Paradoxes. New York, NY: W. W.
Norton & Company.
Wegner, D. M. (1987). Transactive memory: A contemporary analysis of the group mind. In
Theories of group behavior (pp. 185–208). Springer.
https://doi.org/10.1007/978-1-4612-4634-3_9
Wegner, D. M. (1995). A computer network model of human transactive memory. Social
Cognition, 13(3), 319–339.
Westerman, D., Spence, P. R., & Van Der Heide, B. (2012). A social network as information:
The effect of system generated reports of connectedness on credibility on Twitter.
Computers in Human Behavior, 28(1), 199–206.
https://doi.org/10.1016/j.chb.2011.09.001
Wittenbaum, G. M., Hollingshead, A. B., & Botero, I. C. (2004). From cooperative to motivated
information sharing in groups: Moving beyond the hidden profile paradigm.
Communication Monographs, 71(3), 286–310.
Yang, J., & Rabkin, A. (2015, January 20). C is Manly, Python is for “n00bs”: How False
Stereotypes Turn Into Technical “Truths.” Retrieved April 24, 2018, from
https://modelviewculture.com/pieces/c-is-manly-python-is-for-n00bs-how-false-stereotypes-turn-into-technical-truths
Yuan, Y. C., Fulk, J., & Monge, P. R. (2007). Access to information in connective and
communal transactive memory systems. Communication Research, 34(2), 131–155.
https://doi.org/10.1177/0093650206298067
Yuan, Y. C., Fulk, J., Monge, P. R., & Contractor, N. (2010). Expertise directory development,
shared task interdependence, and strength of communication network ties as multilevel
predictors of expertise exchange in transactive memory work groups. Communication
Research, 37(1), 20–47. https://doi.org/10.1177/0093650209351469
Zahavi, A. (1975). Mate selection—a selection for a handicap. Journal of Theoretical Biology,
53(1), 205–214.
Asset Metadata
Creator
Bighash, Leila (author)
Core Title
Conspicuous connections as signals of expertise in networks
School
Annenberg School for Communication
Degree
Doctor of Philosophy
Degree Program
Communication
Publication Date
07/20/2018
Defense Date
05/08/2018
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
expertise,networks,OAI-PMH Harvest,online communities,organizations,organizing,signaling,social media,software development
Format
application/pdf (imt)
Language
English
Contributor
Electronically uploaded by the author
(provenance)
Advisor
Monge, Peter (committee chair), Ananny, Mike (committee member), Hollingshead, Andrea (committee member)
Creator Email
bighash@usc.edu,leilabighash@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c89-21404
Unique identifier
UC11671684
Identifier
etd-BighashLei-6435.pdf (filename),usctheses-c89-21404 (legacy record id)
Legacy Identifier
etd-BighashLei-6435.pdf
Dmrecord
21404
Document Type
Dissertation
Rights
Bighash, Leila
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA