UNDERSTANDING VIRALITY OF YOUTUBE VIDEO ADS: DYNAMICS, DRIVERS, AND EFFECTS

by Yanwei Zhang

A Dissertation Presented to the FACULTY OF THE GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (BUSINESS ADMINISTRATION)

August 2015

Copyright 2015 Yanwei Zhang

This dissertation is dedicated to my parents and my wife for their immense love, understanding, support and faith.

Acknowledgments

I would like to thank my advisor, Professor Gerry Tellis. He has been a tremendous mentor for me. His advice on both my research and my career has been invaluable. I truly appreciate his support throughout my doctoral study. I am also very grateful to Professor Debbie MacInnis for her many insightful discussions and suggestions on my research. I would also like to thank my committee members, Professors Lan Luo and Jinchi Lv, for serving on the committee and for their helpful suggestions in general.

Contents

Acknowledgments
List of Tables
List of Figures
Abstract
1 Introduction
2 Modeling the Viral Diffusion of YouTube Video Ads
2.1 Introduction
2.2 Literature and theoretical basis
2.2.1 Diffusion context
2.2.2 Viral diffusion
2.2.3 Role of word-of-mouth
2.2.4 Dynamic relationship between adoption and sharing
2.2.4.1 Forgetting
2.2.4.2 Dynamic effect of shares on adoptions
2.2.4.3 Dynamic effect of adoptions on shares
2.3 Model
2.3.1 Model development
2.3.1.1 Model for number of views (V) and goodwill (G)
2.3.1.2 Model for potential sharers (P) and number of shares (S)
2.3.1.3 Dynamic Coefficients
2.3.2 Statistical models
2.3.3 Model for non-share diffusion
2.3.4 Hierarchical modeling
2.4 Data
2.4.1 Data collection
2.4.2 Measures
2.4.3 Modeling sample
2.5 Results
2.5.1 Model comparison
2.5.2 Diffusion characteristics
2.5.3 Dynamic interrelationship between views and shares
2.5.4 Best time to seed the ad
2.6 Discussion
2.6.1 Summary of results
2.6.2 Discussion issues
2.6.2.1 Rapid peak and fast decay
2.6.2.2 Existence of virality
2.6.2.3 Generalizability
2.6.3 Limitations for future research
3 Content Drivers of Virality for YouTube Video Ads
3.1 Introduction
3.2 Literature
3.2.1 Drivers of viral ads
3.2.2 Theory of social sharing
3.2.3 Structure of causal relationships
3.2.4 Extent of arguments
3.2.5 Extent to which the ad arouses emotions
3.2.6 Extent of humor
3.2.7 Extent of surprise
3.2.8 Brand prominence
3.2.9 Sources and types of sources
3.2.10 Length of the ad
3.3 Data and coding
3.3.1 Data sampling
3.3.2 Content coding
3.3.3 Exploratory analysis and descriptive statistics
3.4 Results
3.4.1 Indirect effects
3.4.2 Direct effects
3.5 Discussion and conclusions
3.5.1 Implications
3.5.2 Comparison to existing findings
3.5.3 Future research
4 Quantifying the Effects of YouTube Video Ads
4.1 Introduction
4.2 Literature
4.2.1 Research on ad effectiveness
4.2.2 Effect of virality
4.2.3 Effect of appeal methods
4.2.4 Measures of ad effect
4.3 Model
4.4 Data
4.4.1 Data sampling
4.4.2 Descriptive statistics
4.4.3 Content coding
4.5 Estimation
4.5.1 Parameter estimates
4.5.2 Channel subscription estimates
4.5.3 Effect of virality
4.5.4 Effect of appeal methods
4.5.5 Additional analysis on abnormal stock returns
4.6 Conclusions
4.6.1 Limitations
References
A Technical Appendix to Chapter 2
B Technical Appendix to Chapter 4

List of Tables

1.1 Summary of the research questions, contributions and findings of the three studies.
2.1 Comparison of the studies on word-of-mouth dynamics. “Y” indicates “yes”.
2.2 Classification of research on viral diffusion of digital information products.
2.3 Summary of the representative studies on word-of-mouth.
2.4 The list of variables and notation in the model formulation.
2.5 The eight most viewed video ads in the data.
2.6 Sample quantiles of major video statistics across all collected ads.
2.7 Correlations of various social engagement measures.
2.8 Results of model comparison.
2.9 Estimated population-level diffusion parameters from the three models. Existing studies of diffusions of durable goods report that the innovation parameter is in [0.0007, 0.030], and that the imitation parameter is in [0.38, 0.53].
2.10 The correlation matrix for the population-level diffusion parameters.
2.11 Estimated population-level diffusion characteristics. Speed of diffusion is measured as the difference between the time points at which 10% and 90% of the estimated ultimate views are reached.
2.12 Estimated dynamic effects of share effectiveness and sharing propensity.
2.13 Estimated short-term and long-term share effectiveness and sharing propensity by day.
3.1 Summary of studies on drivers of virality.
3.2 Representative studies on ad cues.
3.3 Summary of the potential effects of ad cues on self-enhancement from the theoretical discussion.
3.4 Sample quantiles of the number of social shares.
3.5 Rating scales and definitions used in the content coding.
3.6 Factor analysis on individual emotions and related cues.
3.7 The frequency table of the scales. The last column shows the estimated coefficient from the univariate regression of the logarithmic shares on each ad cue.
3.8 Mediation analysis for ad cues. Columns 3–4 report the estimated effects of the ad cues on the mediator. Columns 5–6 report the estimated indirect effects of the ad cues on social shares through the mediator, along with the estimated confidence intervals based on bootstrapping.
3.9 Estimated direct effects. We started with the mixed-effects model of logarithmic shares over all the ad cues included in this study.
Ad cues completely mediated by emotion or argument were dropped to obtain better estimates of the other cues. The $R^2$ statistic is 0.51 when calculated using only the fixed effects (ad cues), and 0.66 when calculated using both the fixed effects (ad cues) and random effects (channel effects). $R^2 = 1 - \mathrm{Var}(\text{Residuals})/\mathrm{Var}(\text{Response})$.
3.10 Contrast analysis based on the estimated parameters in Table 3.9. The first row shows the effect of argument in an ad that introduces a new product, that is, the main effect of argument plus the interaction effect in Table 3.9.
3.11 Summary of the findings of various ad cues on different measures of consumer responses. The notation of + and – indicates a positive or negative effect, respectively.
4.1 Descriptive statistics for the channels included in the study.
4.2 Parameter estimates for one brand channel.
4.3 Distribution of the estimated median effectiveness parameter and subscriptions across all video ads.
4.4 Test of effect of virality on ad effectiveness.
4.5 Test of effect of appeal methods on ad effectiveness.
4.6 Test of effect of ad views on cumulative abnormal stock returns. Ad views are measured using lagged values of 0, 1, 2, and 5 days, and cumulative abnormal returns are based on a window of 1, 3, and 5 days.

List of Figures

2.1 Diagram illustrating the diffusion process of the video ad. Blue rectangle indicates observed and red trapezium indicates latent variables.
2.2 Plot of the observed hourly views (red) and shares (green) for four sample video ads. The hourly views (shares) are normalized by the largest observed hourly views (shares) for each video so that the y-axes are comparable. The white and gray strips in the background separate different days.
2.3 Scatterplot of the estimated innovation and imitation parameters over the estimated total views for each video ad. The line in each plot represents the linear regression fit.
2.4 Estimated hourly effects. Dashed lines indicate the 95% credible intervals.
2.5 Estimated population-level evolution patterns of views and shares.
2.6 The estimated population-level share effectiveness and sharing propensity parameters over time.
2.7 Part (a) illustrates the dynamic effect of a promotion at the 15th hour, indicated by the red arrow. The total effect is the gray area. Part (b) shows the total effect of a promotion that is carried out at different times. In both cases, the promotion is a purchase of 1000 views.
2.8 The estimated effect of uploading hour (a) and uploading day (b) on the ultimate views. The vertical bars in (b) correspond to two standard deviations from the mean.
3.1 Structure of causal relationships.
3.2 Estimated relationship between social shares and ad length. The dashed line indicates the optimal ad length based on this estimate.
3.3 Identified drivers of virality and the way they influence social shares.
4.1 Metrics that may be used for measuring the return of ad campaigns.
4.2 Illustration of the spike-and-slab prior.
4.3 Plot of the observed incremental channel subscriptions for two sample channels. The arrows on the axis mark the uploading time of a video ad, with red and black colors indicating viral and non-viral videos, respectively.

Abstract

Recent years have seen a proliferation of branded video ads on YouTube. Brands are increasingly willing to bypass traditional mass media and place ads on YouTube. If a video ad goes viral, it creates huge short-term brand exposure. However, creating viral videos is difficult, and most video ads are duds. This is because on YouTube, consumers decide what to watch, whether to share, and whether to subscribe to the brand channel. These unique features call for a careful analysis of YouTube video ads to help brands better design and place their ads on YouTube.

We scrape hourly data from YouTube for a large number of branded video ads. Our collected data include various measures of consumers’ social engagement (views, shares, likes/dislikes, comments). We also collect channel subscriptions and abnormal stock returns to gauge the effect of advertising. We examine the virality of video ads from three aspects.

First, we study the process by which a video ad spreads. We construct a dynamic system that jointly models the process of product adoption and share generation. We also model the dynamic interrelationship between adoption and sharing that results from changes in consumers’ adoption and sharing behavior over time. The empirical inference of the model sheds light on the patterns of adoptions and shares, characteristics of viral diffusion, intra-day seasonality, economic values of social shares, and dynamics of share effectiveness and sharing propensity.
These results provide insights on the strategy of ad promotion, share stimulation, and ad seeding.

Second, we analyze the drivers of viral video ads. We develop an instrument to rate the content of a large number of ads on over 30 executional cues drawn from the behavioral literature on advertising. These cues cover argument, emotion, endorsement, surprise, humor, branding, ad length, sex and many other variables. We analyze social shares as a function of the executional cues and identify the important drivers of virality. Our empirical findings provide important implications for the current practice of advertising via video ads.

Third, we investigate the effects of video ads on channel subscriptions and abnormal stock returns of the brand. We build a dynamic model that links the views of video ads to the channel subscriptions of the brand. We quantify the effect of a view on channel subscription, identify the effective ads, test the effectiveness of viral ads compared to non-viral ads, compare different appeal methods used in ads, and assess the effects of video ads on abnormal stock returns.

Chapter 1 Introduction

Characterized by its wide reach, fast speed, low cost, and rapid evolution, the internet has become a primary medium through which people create and share information and ideas today. It has stimulated an exponential growth of digital information products [DIP]. These are products that exist in digital form and contain some form of information or entertainment broadly construed, such as ideas, tweets, news, images, songs, or videos. In particular, the internet has revolutionized the advertising industry. Brands are increasingly willing to bypass traditional mass media and rely on online ads. For example, advertisers have widely adopted YouTube to deliver their video ads through branded channels.
A recent report shows that from 2009 through 2013, more than six thousand brands released more than 11,500 advertising campaigns and 179,900 video ads on YouTube, which generated more than 19 billion video views (Visible Measures 2013).

Besides the potentially large viewership, advertising through YouTube channels has several features that distinguish it from other online and offline advertising. First, it is highly cost efficient. In its most basic form, the advertising is free because publishing videos on YouTube is costless. The only cost to the advertiser arises from making the video. Also, advertising through YouTube is unlimited: an advertiser can upload as many videos as it wishes at minimal cost. Second, there is almost no length restriction on the video ads. Advertisers can create long ads with rich content. Long ads can tell a story or portray a drama that can arouse strong emotions. Additionally, long ads allow flexibility in terms of ad message format, encoding, and delivery. Such ads may affect viewers in a different way than does traditional TV advertising. Third, unlike other advertising methods, viewership is voluntary. The ad is viewed only if a viewer chooses to watch it. Indeed, if the brand does not purchase additional services, the exposure of the video ad is mostly limited to the channel subscribers. Additional exposure is created when the video is shared by the viewers. Sharing creates fresh exposure as the video reaches new viewers across other platforms such as Facebook, Twitter, and Google Plus. Fourth, YouTube complements TV advertising in new and important ways. Advertisers can publish ads on YouTube as a test platform before placing them in paid TV channels. Conversely, advertisers can use paid TV channels as a seed for gaining wide viewership when they post the ad on YouTube. In sum, advertising on YouTube has high reach, is flexible and low cost, is optional and sharable, and complements TV advertising.
If a massive network of consumers views and then shares the ad, virality may occur. Virality is the rapid spread of a video ad through viewing and sharing in a massive social network, leading to a very large number of views in a short period of time. If an ad goes viral, it creates huge short-term brand exposure that equals or exceeds that available through TV advertising. However, creating viral video ads is difficult. Most video ads are duds and do not generate many views.

The topic of virality has gained increasing interest in marketing research. This dissertation extends this stream of research in three ways.

First, recent studies have focused on the type of content that may become viral (e.g., Berger and Milkman 2012). However, little is known about the temporal process that leads to viral diffusion. The use of online distribution media may entail adoption behavior and a diffusion process for new digital information products that are distinct from those of traditional physical products. We study the process by which a video ad spreads. We construct a dynamic system that jointly models the process of product adoption and share generation. We also model the dynamic interrelationship between adoption and sharing that results from changes in consumers’ adoption and sharing behavior over time. The empirical inference of the model sheds light on the patterns of adoptions and shares, characteristics of viral diffusion, intra-day seasonality, economic values of social shares, and dynamics of share effectiveness and sharing propensity. These results provide insights on the strategy of ad promotion, share stimulation, and ad seeding.

Second, most existing studies have investigated the role of emotion in driving virality (Berger and Milkman 2012, Nelson-Field et al. 2013). However, video ads are rich stimuli that include cues beyond emotions.
Ads might include arguments, brand prominence, and cues that may not be fully captured by emotion, such as the type of sources used or the use of surprise. These additional cues may also affect virality in important ways. For this reason, we analyze the drivers of viral video ads. We develop an instrument to rate the content of a large number of ads on over 30 executional cues drawn from the behavioral literature on advertising. These cues cover argument, emotion, endorsement, surprise, humor, branding, ad length, sex and many other variables. We analyze social shares as a function of the executional cues and identify the important drivers of virality. Our empirical findings provide important implications for the current practice of advertising via video ads.

Third, little is known about the effect of viral video ads. Advertisers can publish unlimited video ads and enjoy the “free” brand exposures generated from ad views. However, the success of an advertising campaign is measured beyond short-term brand exposures. Managers may wish to understand how the campaign affects consumers’ behaviors following exposure. We investigate the effects of video ads on channel subscriptions and abnormal stock returns of the brand. We build a dynamic model that links the views of video ads to the channel subscriptions of the brand. We quantify the effect of a view on channel subscription, identify the effective ads, test the effectiveness of viral ads compared to non-viral ads, compare different appeal methods used in ads, and assess the effects of video ads on abnormal stock returns.

Table 1.1 summarizes the research questions, contributions to marketing research and practices, and findings of the three studies. We discuss each in turn.

Study: Dynamics
Research questions:
- How to jointly model the process of ad viewing and share generation?
- What is the dynamic interrelationship between the viewing and sharing of a video ad?
- What are the patterns of views and shares?
- What is the economic value of social shares?
- What is the best time to promote an ad?
- What is the best time to seed an ad?
Contributions:
- Construct a viral diffusion model that jointly describes the evolution of views and shares.
- Study the dynamics of share effectiveness and sharing propensity.
- Model the diffusion of video ads at the hourly level.
- Parsimoniously model the seasonality effect using the Fourier form.
- Estimate the economic value of a social share.
Findings:
- Diffusion of video ads is fast. Most peak within days. Shares peak earlier and decay much faster than views.
- Viral ads have a smaller innovation and a larger imitation coefficient than non-viral ads.
- Sharing propensity wears out over time. Share effectiveness generally wears out, but wear-in occurs due to the volume effect of shares.
- Hourly-level diffusion exhibits strong intra-day seasonality, with higher views after noon than before noon and a peak around 4-6 PM Pacific Time.
- One share generates about 6.6 views and corresponds to $0.18.
- The best promotion time is the beginning of the second day after release. The best uploading time is around 6 PM Monday Pacific Time.

Study: Drivers
Research questions:
- What ad cues are influential on social shares?
- How do the ad cues affect social shares, through emotion and argument, or directly?
- Why are the identified cues effective in creating social shares?
Contributions:
- Study a large number of ad cues beyond emotion.
- Identify important drivers of virality of video ads.
- Provide theoretical justification for important ad cues based on the self-enhancement effect.
Findings:
- Argument positively impacts sharing only when the ad relates to a newly introduced product; otherwise, its effect is negative.
- Emotional appeals are more effective than argument. High ratings of love, warmth, pride and excitement stimulate social sharing.
- The use of surprise is effective. A surprising end evokes more sharing than a surprising beginning. Humorous ads are shared more.
- Late placement of brand results in better sharing propensity than early placement. The relationship between shares and ad length is an inverted U curve.

Study: Effects
Research questions:
- How to quantify the effect of YouTube video ads?
- Are viral ads more effective in generating channel subscriptions than non-viral ads?
- Are ads using emotional appeals more effective in generating channel subscriptions than ads using argument appeals?
- What is the effect of video ads on abnormal stock returns?
Contributions:
- First investigation of the relationship between views and channel subscriptions.
- First investigation of the relationship between ad views and abnormal stock returns.
- Incorporate the spike-and-slab priors into dynamic linear models.
Findings:
- About 5.8 channel subscriptions are generated for every 1000 views.
- Tests support that viral ads are more effective in generating subscriptions.
- Tests support that ads using emotional appeals are more effective in generating subscriptions.
- No evidence of influence of viral video ads on abnormal stock returns.

Table 1.1: Summary of the research questions, contributions and findings of the three studies.

Chapter 2 Modeling the Viral Diffusion of YouTube Video Ads

2.1 Introduction

The topic of virality has gained increasing interest in marketing research. Recent studies have focused on the type of content that may become viral (e.g., Berger and Milkman 2012). However, little is known about the temporal process that leads to viral diffusion. The use of online distribution media may entail adoption behavior and a diffusion process for new digital information products that are distinct from those of traditional physical products. Notably, the spread of digital information products is mainly spurred by consumers’ word-of-mouth such as social shares. It is critical to understand the process of both product adoption and word-of-mouth generation. Further, the online environment is characterized by a fast pace and rapid change in topics discussed.
Such changes may influence consumers’ adoption and social sharing behavior, resulting in a dynamic interrelationship between adoption and sharing. These unique features call for a careful analysis of the diffusion process of digital information products to help advertisers optimize social marketing strategy and business performance. For example, understanding the dynamic diffusion of video ads enables advertisers to predict advertising outcomes after releasing the ad, make investment decisions on stimulating social shares, and optimize the strategy to promote their ads.

This essay investigates the diffusion of digital information products over time by studying the time series data of views and shares of branded video ads on YouTube. We scraped the websites of YouTube and major social networks to collect data on hourly views and shares of YouTube video ads from the time they were uploaded. We constructed a theoretical model that specifies the joint evolution of the views and shares, as well as the dynamic interrelationship between them. We formulated the theoretical model as a dynamic (non)linear model (West and Harrison 1997). In modeling multiple video ads, we used a hierarchical structure to reflect the variation of parameters among video ads and account for their heterogeneity. Through this study, we seek answers to the following questions:

1. How should one model the viral diffusion of video ads? That is, how should one jointly model the ad viewing process and the social sharing process?
2. What is the dynamic interrelationship between the viewing and sharing of a video ad? Is there wear-in/out of the effect of a social share? Is there wear-in/out of the propensity of consumers to engage in social sharing?
3. What are the diffusion characteristics of video ads? What is the intra-day seasonality? How do the evolution patterns of the views and shares compare?
4. What is the economic value of social shares in temporal diffusion?
5. What is the best time to promote an ad? That is, when is an exposure to the ad most effective in spurring additional views?
6. What is the best time to seed an ad?

Columns: Study; Diffusion of product; Evolution of WOM; Dynamic effect of WOM on adoption; Dynamic effect of adoption on WOM; Seasonality; Data
Liu (2006) Y Weekly
Dellarocas et al. (2007) Y Weekly
Duan et al. (2008) Y Daily
Trusov et al. (2009) Y Daily
Bruce et al. (2012) Y Weekly
Current study Y Y Y Y Y Hourly

Table 2.1: Comparison of the studies on word-of-mouth dynamics. “Y” indicates “yes”.

We contribute to the marketing literature in several ways. First, we jointly model the diffusion of the digital information product and the dynamic evolution of social shares. Our model describes the temporal process of product adoption and social share formation following the release of the product. It enables the study of the evolution pattern of product adoption as well as that of social shares. It also allows predictions of the diffusion curve at any time point. In comparison, most prior studies on word-of-mouth examine the relationship between adoption and word-of-mouth and do not model the takeoff or decay of the adoption curve (e.g., Liu 2006, Trusov et al. 2009). Although Dellarocas et al. (2007) investigated the effect of word-of-mouth in the diffusion context, they did not model the evolution of word-of-mouth.

Second, we study the dynamic interrelationship between social shares and ad views. We examine the potential dynamic effect of sharing on adoption and model the wear-in/out pattern of share effectiveness. In addition, we allow the effect of adoption on sharing to be time varying and model the wear-in/out of sharing propensity. Bruce et al. (2012) studied the dynamic effectiveness of online reviews, but did not model the dynamic effect of adoption on word-of-mouth generation. Using the modeled dynamics of both share effectiveness and sharing propensity, we provide insights on the optimal strategy of ad promotion.
Third, the temporal granularity of the data for the study of video ads differs substantially from existing studies. The adoption and spread of video ads occur at a markedly rapid speed, with many video ads reaching their peaks on the very first day. Hourly data are required for the study of video ads. In contrast, past diffusion analyses have used data aggregated at the yearly, monthly (Peers et al. 2012), or weekly level (Sawhney and Eliashberg 1996). Prior studies on word-of-mouth dynamics have used data at the weekly (Liu 2006) or daily level (Duan et al. 2008).

Fourth, we employ the Fourier form to model the intra-day seasonality in diffusion. Compared to existing methods using dummy indicators, the Fourier method provides a smooth and robust estimate of the intra-day seasonality in a very parsimonious way. Table 2.1 summarizes the scope of the current study and compares it with existing studies that examined the dynamics of product adoption and word-of-mouth.

Our main findings are the following. The diffusion of video ads is rapid, with most ads reaching 90% of the ultimate views within two weeks. On average, it takes 1.3 days to reach the peak and 12.7 days from the peak time to reach 90% saturation of the ultimate views. The estimated Bass imitation coefficient is substantially smaller than that of durable goods. Viral ads have a smaller innovation coefficient and a larger imitation coefficient than non-viral ads. There is strong intra-day seasonality, with more views in the afternoon than in the morning. The peak occurs between 4 and 6 PM and the trough between 4 and 6 AM Pacific Time. The evolution of shares is characterized by a higher and earlier peak and a much faster decay than that of views. The interrelationship between views and shares changes over time. The effect of a share on view generation generally decays over time, but it can also be boosted by a high volume of shares. 
These two effects yield dynamic share effectiveness over time with an initial increase and a subsequent steady drop. In addition, the propensity that a consumer shares the ad decays over time and this decay is much faster than that of share effectiveness. Based on these results, we find that the best time to promote the ad is at the beginning of the second day after release of the ad. We also find that one share generates about 6.6 views in general, which corresponds to a monetary value of about $0.18. Lastly, additional analysis suggests that the best time to upload the video ad is around 6 PM Pacific Time and the best day is Monday.

The rest of the essay is organized as follows. Section 2.2 reviews the literature and discusses the theoretical basis for the diffusion process of digital information products. Section 2.3 presents the proposed theoretical and statistical model. Section 2.4 describes data collection and sampling. Section 2.5 reports the estimated results. Section 2.6 presents our conclusions, discusses some issues, and lists limitations and directions for future research.

Characteristic of diffusion | Object of diffusion: consumer/print media generated | Object of diffusion: firm generated
Network | Yoganarasimhan (2012); Susarla et al. (2012) |
Driver | Berger and Milkman (2012); Nelson-Field et al. (2013) | Nelson-Field et al. (2013)
Structure | Goel et al. (2014) |
Dynamics | | Current study
Table 2.2: Classification of research on viral diffusion of digital information products.

2.2 Literature and theoretical basis

2.2.1 Diffusion context

Rogers (2003) defines diffusion as the process in which an innovation is communicated through certain channels over time among the members of a social system. In the context of digital information products, a new video ad is considered an innovation and a consumer’s decision to watch the video ad is regarded as its adoption. 
To consumers, the quality of the video ad, e.g., how entertaining it is and how much information it contains, is uncertain before its adoption. Although consumers do not pay for the adoption, they must invest time to watch the video ad, which is a non-negligible cost of the adoption. So, video ads can be regarded as a product for which the cost that consumers pay is time. In reality a consumer can view a video ad multiple times. Not all of these views are necessarily counted, however, because YouTube implements algorithms that comb through cookies and IP addresses to make sure that the video views are not inflated by multiple clicks from the same consumer. Because of this screening of data, the diffusion of video ads on YouTube can be roughly regarded as a one-time adoption process.

The diffusion of an innovation is a function of media and distribution channels. For digital information products such as video ads, one such critical medium is interpersonal word-of-mouth such as social shares, in which personal influence can directly affect consumers’ adoption decisions. Abundant shares in a massive social network can produce a very large number of adoptions in a short period of time, leading to viral diffusion of a digital information product.

2.2.2 Viral diffusion

In marketing, a vast literature exists on the diffusion of consumer durables (Bass 1969, Chandrasekaran and Tellis 2007), movies (Sawhney and Eliashberg 1996), and consumer non-durables (Bucklin and Sengupta 1993). In contrast, research on the diffusion of digital information products and viral diffusion is very limited. We classify the literature on virality into four streams (see Table 2.2). First is research on the content drivers of diffusion. For example, Berger and Milkman (2012) look at the drivers of viral diffusion in newspaper stories. Second is research on the pattern of network effects in diffusion. For example, Yoganarasimhan (2012) and Susarla et al. 
(2012) examine how the network structure affects the diffusion of YouTube videos. Third is research on the structure of virality in diffusion. For example, Goel et al. (2014) distinguish two ways of a product gaining popularity in the diffusion process, through a single large broadcast or via multiple generations of individual sharing. Fourth is research on the temporal dynamics of diffusion. To the best of our knowledge, no studies exist on this fourth topic. Also, the first three topics are predominantly cross-sectional. The fourth topic is inherently dynamic or time based. In addition, most existing studies focus on user or media generated content as opposed to marketer generated content such as branded video ads. While consumer generated content creates free exposure to product and brand information, such content is more unpredictable and harder for firms to control than that for branded video ads.

2.2.3 Role of word-of-mouth

There has been considerable research on the role of word-of-mouth in driving product adoptions (e.g., sales). Existing studies have found that product adoptions can be influenced by the volume (Chevalier and Mayzlin 2006, Liu 2006, Dellarocas et al. 2007, Duan et al. 2008), valence (Dellarocas et al. 2007, Duan et al. 2008) or entropy (Godes and Mayzlin 2004) of word-of-mouth. Increasing attention has focused on the dynamics of word-of-mouth. Several studies have examined the factors that affect the dynamic formation of word-of-mouth (Liu 2006, Duan et al. 2008, Berger and Schwartz 2011, Moe and Trusov 2011, Godes and Silva 2012), as well as the dynamic relationship between word-of-mouth and product adoptions (Duan et al. 2008, Moe and Trusov 2011). Representative studies and their findings are summarized in Table 2.3. It is worth noting that most of these studies use online reviews to measure word-of-mouth. Yet, online reviews are different from social shares in several ways. 
First, the valence of online reviews can be positive, negative or neutral. Social shares serve the purpose of recommendation and are intrinsically positively valenced. Second, social shares broadcast the recommendation or message only to friends in the network. In contrast, reviews are generally accessible to most users on the internet. Third, it is generally much easier to share content on a social network than to post a review. For this reason, the evolution of social shares may be quite different from that of online reviews. To the best of our knowledge, no study has employed social shares as the measure of word-of-mouth and examined their impact on product adoptions.

Study | Type of WOM | Product | Dependent variable | Finding
Godes and Mayzlin (2004) | Online posts (entropy) | TV shows | Ratings | The measure of the dispersion of conversations across communities has explanatory power in a dynamic model of TV ratings.
Chevalier and Mayzlin (2006) | Online reviews (volume, rating) | Book | Sales | An improvement in a book’s reviews leads to an increase in relative sales at that site.
Liu (2006) | Online reviews (volume, valence) | Movie | Box office sales | The volume of WOM offers significant explanatory power on box office revenue.
Dellarocas et al. (2007) | Online reviews (volume, valence, entropy) | Movie | Box office sales | The inclusion of online product review metrics increases the forecasting accuracy of diffusion models.
Duan et al. (2008) | Online reviews (volume, valence) | Movie | Box office sales | Both a movie’s box office revenue and WOM valence significantly influence WOM volume. WOM volume in turn leads to higher box office performance.
Trusov et al. (2009) | WOM referrals | Website | Sign-ups | WOM referrals have longer carryover effects than traditional marketing actions and produce higher response elasticities. The monetary value of a WOM referral is calculated.
Moe and Trusov (2011) | Online reviews (rating) | Bath, fragrance, beauty products | Rating & Sales | Ratings behavior is significantly influenced by previous ratings and can directly improve sales, but the effects are relatively short lived.
Godes and Silva (2012) | Online reviews (rating) | Book | Rating | Online ratings change systematically over both order and time.
Current study | Shares on social networks | Video ads | Views of ads |
Table 2.3: Summary of the representative studies on word-of-mouth.

2.2.4 Dynamic relationship between adoption and sharing

Content on the internet and social networks is constantly in flux, with topics and trends changing quickly. Consumers’ behavior on adoption and sharing is likely to be affected by such changes. This can lead to complex dynamic relationships between adoption and sharing. We examine three different dynamic effects: forgetting, time-varying share effectiveness, and time-varying sharing propensity. We now discuss these different types of dynamic effects and their possible causes.

2.2.4.1 Forgetting

In the online environment, numerous rival ideas or digital information products are created and published continuously. For example, every minute over 100 hours of video are uploaded to YouTube.¹ Such strong competition makes digital information products such as video ads highly perishable. Consumers’ memory of the ad or a share of the ad may be very short. This implies that after exposure to the ad or a share of the ad, the consumer may become less likely to share the watched ad or watch the shared ad due to memory decay.

¹ Source: https://www.youtube.com/yt/press/statistics.html

2.2.4.2 Dynamic effect of shares on adoptions

In addition to driving adoptions in the short term, word-of-mouth may influence adoptions in future periods and therefore have a long-term impact (Duan et al. 2008, Moe and Trusov 2011). In our context, social sharing spurs the spread of a video ad by exposing it to a network of friends. 
Friends may attend to and watch the shared video ad hours or days after the time of sharing. However, due to consumer forgetting, the effect of that share may decay over time and ultimately become ineffective. In other words, a social share creates goodwill that has a diminishing impact on ad views over time (similar to the goodwill created by advertising; Nerlove and Arrow 1962, Bass et al. 2007).

In addition to the dynamic effect due to forgetting, another type of dynamic effect results from time-varying share effectiveness. This pertains to the possibility that the effectiveness of a new social share changes over time (Bruce et al. 2012). Such dynamic share effectiveness can be further classified as the temporal effect and the volume effect, as explained below.²

The temporal effect refers to the change of share effectiveness with the passage of time that is independent of the number of shares. Several factors may contribute to this effect. First, early shares may be less effective when consumers are uncertain about the quality of the ad. The effectiveness of a share may increase as consumers gather more information about the ad from other sources. This may cause temporal wear-in of share effectiveness. Second, a late share may be less effective than an early one. The later the share, the more likely that potential viewers in the network have already watched the ad (from other sources). Third, consumers may be more interested in fresh materials and are thus less likely to respond to late shares. The latter two factors can lead to temporal wearout of share effectiveness.

The volume effect occurs because the number of shares in each period has an impact on the share effectiveness. Such an effect may take place in two ways. First, there can be overlap among different sharers’ social networks, which reduces the effective number of new exposures a share creates. This leads to declining returns to volume. This is more likely to occur when many people share. 
Second, extensive sharing in a short time period may create a buzz and indicate the good quality of the ad (Trusov et al. 2009). A consumer receiving many recommendations of the same ad from network friends may be more likely to watch the ad. This is likely to lead to increasing returns to volume.

² They are analogous to the copy and repetition effects, which were used in Naik et al. (1998) and Bass et al. (2007) to study the wearout of ad effectiveness.

[Figure 2.1 nodes: Goodwill (G), Views (V), Potential Sharers (P), Shares (S); arrows labeled (1a), (1b), (2a), (2b), (3), (4a), (4b).]
Figure 2.1: Diagram illustrating the diffusion process of the video ad. Blue rectangle indicates observed and red trapezium indicates latent variables.

2.2.4.3 Dynamic effect of adoptions on shares

Consumers can share the video ad on social networks after watching it. They are likely to share immediately after watching the ad, but they may also share the ad at a later time. As a result, adoptions also exert a long-term effect on social shares. But due to consumer forgetting, the effect decreases over time.

In addition to the carryover dynamics, the propensity that a new viewer shares the ad may also change with the passage of time, leading to sharing propensity wearout. Prior research suggests that consumers engage in social sharing to create a positive image of themselves (De Angelis et al. 2012, Packard and Wooten 2013, Barasch and Berger 2014). Adopters are more willing to share the ad when other network friends lack knowledge about the ad or the information it contains. Sharing content worth knowing may add social cachet to the viewer who appears to be “in the know”. For this reason, there may be a temporal wearout of the sharing propensity.

2.3 Model

In this section, we develop the theoretical model for the diffusion process of a video ad. The model is characterized by the joint evolution of views and shares of the video ad over time, as well as the dynamic changes of share effectiveness and sharing propensity. 
We then formulate the statistical model based on the theoretical development.

2.3.1 Model development

The dynamic model specifies the evolution path of the (cumulative) number of views (V), the (cumulative) number of shares (S), the goodwill created by social shares (G), and the number of potential sharers (P). The latter two are latent variables that capture the potential long-term effect due to forgetting. Table 2.4 lists the model’s notation. We note that all these variables depend on time. Here we suppress the time dimension to avoid cumbersome notation. Figure 2.1 summarizes the entire system and the interdependence of the four individual components. We discuss each process in turn.

Notation | Explanation
$V$ | The cumulative number of views of a video ad. $V_t$ is the total number of views by time $t$.
$\Delta V$ | The incremental number of views of a video ad. $\Delta V_t = V_t - V_{t-1}$ is the number of views between $t-1$ and $t$.
$S$ | The cumulative number of shares of a video ad. $S_t$ is the total number of shares by time $t$.
$\Delta S$ | The incremental number of shares of a video ad. $\Delta S_t = S_t - S_{t-1}$ is the number of shares between $t-1$ and $t$.
$G$ | The goodwill generated by social shares.
$P$ | The potential sharers of the ad.
$\alpha$ | Sharing propensity: the propensity that a potential sharer shares the ad.
$\beta$ | Share effectiveness: the number of (short-term) views that a social share generates.
Table 2.4: The list of variables and notation in the model formulation.

2.3.1.1 Model for number of views (V) and goodwill (G)

Existing diffusion models distinguish two groups of adopters: the imitator is influenced only by others (e.g., via word-of-mouth), while the innovator is influenced only by marketer initiated communications. The innovator and imitator effects are modeled analytically through differential equations. When the measure of social shares is available, we separate it from the innovator and other sources of word-of-mouth and model its impact explicitly. 
As discussed, social sharing creates goodwill (G), which exerts a long-term influence on the viewing of video ads. We thus specify a model that distinguishes the share goodwill from the non-share diffusion as:

$$\frac{dV}{dt} = f(V) + G. \tag{1}$$

In this specification, goodwill ($G$) measures the effect of social sharing on ad viewing (Arrow (1b) in Figure 2.1). The term $f(V)$ measures the non-share diffusion, and may take the form of a Bass-like diffusion process as specified in section 2.3.3. It encapsulates all the other effects not captured by explicit social shares (Arrow (1a)). This could include the effect from innovators that are not affected by word-of-mouth (e.g., consumers’ search). It could also include possible platform recommendations that are functions of the views. Finally, it may include other unobserved forms of word-of-mouth such as oral conversations, emails and blog entries that must be implicitly modeled (via the imitation effect).

Goodwill created by social shares changes over time. Consumers tend to forget a share on the social platform as new information and other shared content is received. This leads to a decay of goodwill. The value of goodwill is enhanced by new shares of the ad. The dynamic model for the goodwill is:

$$\frac{dG}{dt} = -\delta_G G + \beta \frac{dS}{dt}. \tag{2}$$

In the above, $\delta_G$ measures the decay of goodwill due to forgetting ($1 - \delta_G$ is thus the carryover effect), $\beta$ is the “share effectiveness” parameter that measures the immediate effect of a social share in creating goodwill, and $dS/dt$ is the (instantaneous) number of new shares (to be specified below).

We point out the link of the models in (1) and (2) to some existing models. First, if there is no share goodwill, i.e., $G \equiv 0$, the model reduces to a simple diffusion model. For example, we obtain the standard Bass model if $f(V) = (m - V)(p + qV/m)$. 
Second, if there is only the share effect, i.e., $f(V) \equiv 0$, the model reduces to the Nerlove-Arrow model used to capture the dynamic effect of advertising (Nerlove and Arrow 1962, Naik et al. 1998, Bass et al. 2007). Our model captures both the dynamic effect of social shares due to forgetting and the non-share diffusion due to innovator and implicit non-share word-of-mouth.

2.3.1.2 Model for potential sharers (P) and number of shares (S)

Consumers may share the video ad on social networks after watching it. At any time, the potential consumers who may share the ad are a subset of those who have viewed it, since not all viewers are active throughout the life span of the ad. Viewers become inactive when they forget the ad, or have already shared the ad. These viewers no longer contribute to new social shares.

The potential sharers ($P$) are the repository of viewers who will potentially share the ad on social networks. The rate of change of the potential sharers is described by:

$$\frac{dP}{dt} = -\delta_P P - \alpha P + \frac{dV}{dt}. \tag{3}$$

The change comes from three sources. First, some potential sharers ($\delta_P P$) drop out because they forget the ad (Arrow (2a)). Second, some of the potential sharers ($\alpha P$) indeed share the ad (Arrow (3)). Third, the size of the repository is increased by the number of new viewers ($dV/dt$) who can share the ad following their adoption of it (Arrow (2b)). Based on the above dynamic model on potential sharers, the number of new shares is

$$\frac{dS}{dt} = \alpha P. \tag{4}$$

In the above, $\alpha$ is the “sharing propensity” parameter that measures the likelihood a potential sharer is willing to share the ad at a given time. The models in (3) and (4) are analogous to the compartment model used to describe epidemic outbreaks (see, e.g., Dukic et al. 2012).

2.3.1.3 Dynamic Coefficients

We now specify the dynamics for share effectiveness (parameter $\beta$) and sharing propensity (parameter $\alpha$). 
We continue to use differential equations to describe the potential wear-in or wearout:

$$\frac{d\beta}{dt} = \beta \left( b_1 + b_2 \frac{dS}{dt} \right), \tag{5}$$

$$\frac{d\alpha}{dt} = b_3 \alpha. \tag{6}$$

In the above, $b_1$ captures the temporal dynamics of share effectiveness, $b_2$ the volume dynamics of share effectiveness, and $b_3$ the temporal dynamics of sharing propensity. Equation (5) is similar to the model for advertising or word-of-mouth wearout in Naik et al. (1998), Bass et al. (2007) and Bruce et al. (2012). Because of the dependence on the sharing information, the estimated pattern of share effectiveness can be flexible with both wear-in and wearout. The differential equation (6) dictates an exponential growth or decay in sharing propensity, depending on the sign of $b_3$.

2.3.2 Statistical models

For estimating the parameters in the system of differential equations, we formulate the specified theoretical model into the dynamic (non)linear model framework (West and Harrison 1997, Bass et al. 2007). We can then make inferences about the parameters by sampling from their posterior distributions.

We add the subscript $t$ to each variable to indicate the time period. For example, $V_t$ denotes the cumulative number of views by time $t$. We construct the statistical models for the incremental views $\Delta V_t$ and the incremental shares $\Delta S_t$.³ They correspond to the rates of change $dV/dt$ and $dS/dt$ in the theoretical development. The observed incremental views and shares are perturbed series of the underlying dynamic system specified in (1) – (5), by measurement errors. Because the data reflect discrete time points, we work with the discrete analogue of the system. We note that there is a natural order in which viewing and sharing occur: one views a video ad first and then decides whether to share. This means that in the discrete analogue, a view affects sharing contemporaneously but a share affects views in one subsequent period. We therefore specify the model of 
We therefore specify the model of 3 There are alternative ways of formulating the statistical model based on the dynamical system. For example, we can work with the cumulative measures directly. DenoteY = (V;S;G;E;;). The dynamical system specifies that dY dt =g(Y). The natural dynamic (non)linear model isYt =Yt1 +g(Yt1)+ Y . 14 views conditional on the lagged value of shares and the model of shares conditional on the current value of views. The dynamic linear model derived from the system is: V t =f (V t1 ) +G t + V t ; G t = (1 G )G t1 + t S t1 + G t ; t = (1 +b 1 +b 2 S t1 ) t1 + t ; 9 > > > = > > > ; Views S t = t P t + S t ; P t = (1 P t )P t1 +V t + P t ; t = (1 +b 3 ) t1 + t : 9 > > > = > > > ; Shares (7) In the above, m t N(0; 2 m ) is the error term in each component of the system,m2fV;G;;S;P;g. The dynamic process is completed by specifying the initial distributions of the unobserved variable (at t = 0). Since our observation period starts from the uploading of the video ads at which point no views and shares have taken place, we can simply setG 0 = P 0 = 0, without any uncertainty. We specify the initial distributions for and as 0 N(0; 2 0 ) and 0 N(0; 2 0 ). 2.3.3 Model for non-share diffusion The existing literature on product diffusion offers many possibilities for the choice of the non-share diffusion f (V ). The widely used Bass model (Bass 1969) specifies: dF=dt 1F =p +qF: (8) In the above,F = V=m is the proportion that has adopted the product (watched the video ad), which is a measure of the implicit word-of-mouth (non-share word-of-mouth in our context). The parametersp andq are the innovation and imitation coefficients, andm is the market potential (ultimate views). The Bass model assumes that the probability of adoption from potential adopters increases over time because of the accumulation of word-of-mouth. 
As discussed in the theory section, strong competition of digital information products may lead to a temporal decay of the adoption probability over time. To accommodate such an effect, we consider a variant of the Bass model:

$$\frac{dF/dt}{1 - F} = (1 - F)(p + qF). \tag{9}$$

In this specification, the hazard of adoption becomes a quadratic function of $F$. This results in a much faster decay in the tail than the Bass model.⁴ We refer to this as the DIP Bass model. Based on the model of the cumulative proportion $F$, we can derive the non-share diffusion model as

$$f(V_t) = m (1 - V_t/m)^{\kappa} (p + qV_t/m), \tag{10}$$

where $\theta = (m, p, q)^T$ is the vector of diffusion parameters, and $\kappa = 1, 2$ for the Bass and the DIP Bass model, respectively.

When studying the dynamic diffusion at the hourly level, the hourly effects can drive intra-day oscillation and create intra-day seasonal patterns. A parsimonious method for modeling the intra-day seasonality is via the Fourier representation, a combination of trigonometric terms that can describe any discrete periodic effects. To account for the systematic dependence of the seasonal effect on the level of the series, we specify the seasonal effect to be multiplicative with the underlying trend of the hourly views. The non-share diffusion model with the intra-day seasonality is:

$$
\begin{aligned}
f(V_t) &= m (1 - V_t/m)^{\kappa} (p + qV_t/m)(1 + h_t), \\
h_t &= c_1 \cos\left(\frac{2\pi}{24} t\right) + c_2 \sin\left(\frac{2\pi}{24} t\right).
\end{aligned} \tag{11}
$$

In the above, $h_t$ is the first harmonic component of the Fourier representation. It measures the hourly effect and has a period of 24 hours. With this specification, we have $\theta = (m, p, q, c_1, c_2)^T$ where $c_1$ and $c_2$ are the Fourier coefficients. We note that the mean of the intra-day seasonal effects $h_t$ is zero, i.e., $\sum_{t=0}^{23} h_t = 0$. They therefore capture purely the intra-day seasonal effects.

There are alternative ways of modeling seasonality in product diffusion in the literature. Using the Generalized Bass model (Bass et al. 1994), we can represent the intra-day seasonality $h_t$ with 23 dummies. 
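To make the first-harmonic term concrete, the short Python sketch below evaluates $h_t = c_1 \cos(2\pi t/24) + c_2 \sin(2\pi t/24)$, checks that it averages to zero over 24 hours, and recovers the hour at which it peaks from the identity $h_t = A \cos(2\pi t/24 - \varphi)$ with $A = \sqrt{c_1^2 + c_2^2}$ and $\varphi = \operatorname{atan2}(c_2, c_1)$. The function names and coefficient values are hypothetical, and the hour index is simply the $t$ of the model, whatever clock it is aligned to.

```python
import math


def intraday_effect(hour, c1, c2):
    """First-harmonic seasonal effect h_t with a 24-hour period."""
    angle = 2.0 * math.pi * hour / 24.0
    return c1 * math.cos(angle) + c2 * math.sin(angle)


def amplitude(c1, c2):
    """Largest deviation of h_t from zero."""
    return math.hypot(c1, c2)


def peak_hour(c1, c2):
    """Hour in [0, 24) at which h_t attains its maximum."""
    phase = math.atan2(c2, c1)
    return (phase * 24.0 / (2.0 * math.pi)) % 24.0


if __name__ == "__main__":
    # Hypothetical Fourier coefficients, for illustration only.
    c1, c2 = 0.2, -0.3
    total = sum(intraday_effect(h, c1, c2) for h in range(24))
    print("sum over one day:", total)
    print("amplitude:", amplitude(c1, c2), "peak hour:", peak_hour(c1, c2))
```

Two coefficients thus determine both the size and the timing of the daily cycle, which is the parsimony referred to in the text.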
Compared to this approach, the Fourier method has two advantages. First, it is much more parsimonious with only two parameters to be estimated. Second, it provides a smooth estimate of the intra-day seasonality. Estimates using dummies can be highly oscillatory, lacking interpretability.

⁴ When the diffusion is close to saturation, i.e., $F \approx 1$, we can approximate both models by ignoring higher order terms. The Bass model becomes $dF/dt \approx (p + q)(1 - F)$ and the modified Bass is $dF/dt \approx (p + q)(1 - F)^2$, which is quadratic and has a faster decay.

2.3.4 Hierarchical modeling

In the above formulation, we have described how to model the dynamic diffusion of one video ad. When modeling multiple video ads, it is necessary to construct a hierarchical model that jointly describes the diffusion processes of all video ads. Such a model enables the estimation of the population-level parameters and accounts for the variation across the population of video ads. We add the subscript $i$ to reference the observations and parameters from the $i$th video ad.

The hierarchical model allows the diffusion parameters $(m, p, q)$, seasonality coefficients $(c_1, c_2)$, forgetting rates $(\delta_G, \delta_P)$, and wear-in/out parameters $(b_1, b_2, b_3)$ to vary by video ads. Denote

$$\phi_i = \left( \log m_i, \log p_i, \log q_i, c_{1i}, c_{2i}, \operatorname{logit}(\delta_{G_i}), \operatorname{logit}(\delta_{P_i}), b_{1i}, b_{2i}, b_{3i} \right)^T$$

the ten-dimensional vector of transformed parameters, with the transformation imposed to ensure the estimated parameters fall in the correct range. We use the logit transformation of the forgetting parameters $\delta_G$ and $\delta_P$ so that the effect decays exponentially over time. However, no restriction is applied to the wear-in/out parameters $(b_1, b_2, b_3)$. We specify a multivariate normal distribution on the transformed video-level parameters as:

$$\phi_i \sim N(\phi_0, \Sigma), \tag{12}$$

where $\phi_0$ is the population-level estimate of the transformed parameters, and $\Sigma$ captures the variation of the parameters across the video ads. 
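The role of the transformations can be sketched in a few lines of Python: video-level vectors are drawn from a multivariate normal on the transformed scale and then mapped back, so that $m$, $p$, $q$ stay positive and the forgetting rates stay in $(0, 1)$. The population mean, the diagonal covariance, and all names below are hypothetical placeholders rather than estimates from this study, and the sketch only draws from the population distribution; it does not perform any posterior inference.

```python
import numpy as np

NAMES = ["m", "p", "q", "c1", "c2", "delta_g", "delta_p", "b1", "b2", "b3"]


def expit(x):
    """Inverse of the logit transformation."""
    return 1.0 / (1.0 + np.exp(-x))


def to_natural(phi):
    """Map a ten-dimensional transformed vector back to the natural scale."""
    phi = np.asarray(phi, dtype=float)
    values = np.concatenate([
        np.exp(phi[:3]),     # m, p, q: log-transformed, hence positive
        phi[3:5],            # c1, c2: unrestricted
        expit(phi[5:7]),     # forgetting rates: logit-transformed, in (0, 1)
        phi[7:],             # b1, b2, b3: unrestricted
    ])
    return dict(zip(NAMES, values))


# Hypothetical population-level mean and covariance on the transformed scale.
rng = np.random.default_rng(0)
phi0 = np.array([np.log(1e5), np.log(0.01), np.log(0.05),
                 0.1, -0.2, 0.0, 0.0, -0.01, 0.0, -0.02])
cov = np.diag(np.full(10, 0.25))

draws = rng.multivariate_normal(phi0, cov, size=5)   # one row per video ad
ads = [to_natural(row) for row in draws]

if __name__ == "__main__":
    for ad in ads:
        print({k: round(float(v), 4) for k, v in ad.items()})
```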
To complete the specification of our model, we assign diffuse hyperprior distributions, e.g., a diffuse multivariate normal for $\phi_0$, a proper but non-informative inverse Wishart distribution for $\Sigma$, and non-informative inverse gamma distributions for the variances $\sigma^2_{m_i}$, $m \in \{V, G, \beta, S, P, \alpha, \beta_0, \alpha_0\}$.

2.4 Data

This section covers data collection, measures, and sampling. The context of the study is online video ads uploaded on YouTube in branded channels. A branded channel is an account on YouTube through which a brand uploads video ads, communicates with other users, and manages video information and other activities.

2.4.1 Data collection

The study of diffusion of YouTube video ads requires time series data on the statistics about viewing and sharing of the video from each period since it was uploaded. To collect such data, we targeted branded channels on YouTube and tracked any new videos uploaded in these channels. We wrote a program that automatically tracked the branded channels every hour to extract the following metrics:
- Whether a new video was uploaded on the channel in the past hour. If so, we added the video to the database and recorded the basic information of the video, including the title, description, duration of the video, and upload time.
- The number of views, likes, dislikes and comments of each tracked video on YouTube.
- The number of shares of the videos across various social networks.

We extracted video statistics including views, likes, dislikes, and comments from the YouTube API.⁵ We relied on the APIs provided by major social networks to extract the number of shares of the video on these networks. Requests to these APIs return the number of times the URL of a given video has been shared on these networks. The major social networks are Facebook, Twitter, Google+ and LinkedIn. 
⁶ While we also tracked other social platforms such as StumbleUpon and Pinterest, the shares on these networks were quite small compared to the four major networks and their statistics seemed to be unstable. The number of shares of a video ad is then defined as the sum of the shares across the four major social networks. We tracked each video for 60 days, by which point there were no substantive changes in the measured metrics.

The number of branded channels and video ads on YouTube is enormous. We had to sample judiciously. We selected the target brands through several criteria. First, we selected the top 100 advertisers in 2012 in the US by expenditure.⁷ Second, for these brands, we looked up their names on YouTube. If there was a channel with descriptions that closely matched the target brand, we recorded that brand’s channel name on YouTube and used it in our sample. Third, we included additional brands that were historically active on YouTube. This process resulted in 109 brand channels considered in our data collection process.

Our program tracked videos uploaded from these brands between November 25, 2013 and March 4, 2014, which spanned about 100 days. There were a total of 1,962 advertisement videos uploaded from the 109 brands, so on average each channel uploaded a video ad about every 5.6 days. Table 2.5 lists the eight most viewed video ads in the data.

Table 2.6 shows the sample quantiles of the views, likes, dislikes, comments, social shares, and uploading hour across all collected videos. We see that the distributions of views and shares across the video ads are highly skewed. The skewness measures of the views and shares are 13.22 and 22.15, respectively, with the distribution of the shares being more skewed than that of the views. 
The sample median of the views is 7,764 while that of the shares is 158. More than 5% of the videos are not shared at all, while the most shared video generates about 1.6 million shares. The uploading times of video ads seem to be fairly uniform within the day.

Table 2.5: The eight most viewed video ads in the data.
Video ID | Views | Shares | Title | Brand
uQB7QRyF4p4 | 49,681,264 | 1,621,300 | Budweiser Super Bowl 2014 – Puppy Love | Budweiser
Lv-sY z8MNs | 30,979,691 | 218,185 | Google Zeitgeist: Here’s to 2013 | Google
-XseHZyvGtg | 25,740,178 | 344,963 | Samsung GALAXY S5 - Official Introduction | Samsung
57e4t-fhXDs | 18,708,141 | 584,691 | P&G Thank You, Mom | Pick Them Back Up | Procter & Gamble
ns-p0BdUB5o | 17,628,505 | 123,858 | 2014 Volkswagen Game Day Commercial: Wings | Volkswagen
dRIgmKGDqFM | 15,920,143 | 527,272 | Pepsi MAX & Jeff Gordon Present: “Test Drive 2” | Pepsi
98BIu9dpwHU | 13,988,431 | 314,111 | Amazon Prime Air | Amazon
EC-zB2aSXXM | 12,803,930 | 320,821 | #GALAXY11: The Beginning | Samsung

Table 2.6: Sample quantiles of major video statistics across all collected ads.
Statistic | Min | 5% | 25% | Median | 75% | 95% | Max
Views | 0 | 206 | 1,425 | 7,764 | 91,108 | 1,589,598 | 49,681,264
Likes | 0 | 0 | 5 | 26 | 293 | 2,748 | 207,753
Dislikes | 0 | 0 | 0 | 3 | 17 | 140 | 10,590
Shares | 0 | 0 | 11 | 158 | 1,058 | 16,574 | 1,621,272
Comments | 0 | 0 | 0 | 3 | 19 | 257 | 19,383
Uploading hour | 0 | 2 | 7 | 10 | 14 | 20 | 23

Footnotes
5 YouTube API: https://developers.google.com/youtube/v3/
6 Facebook API: https://developers.facebook.com; Twitter API: https://dev.twitter.com; Google+ API: https://developers.google.com/+/api/; LinkedIn API: http://developer.linkedin.com/apis
7 Source: http://www.adbrands.net/us/top_us_advertisers.htm.

2.4.2 Measures

Likes, dislikes, comments and social shares all provide measures of consumers’ social engagement and word-of-mouth that may spur more views of the ads. Table 2.7 shows the sample correlations among these measures. We see that these measures are highly correlated. In the subsequent analysis, we will focus only on social shares.
Compared to likes, dislikes and comments, which mainly take place on YouTube, social shares on major social networks may influence larger networks of audiences beyond YouTube users. In addition, brand channels may disable liking, disliking and commenting on YouTube for some video ads, which leads to missing data for these measures. Shares from consumers are generally not affected by such manipulations from brands.

Table 2.7: Correlations of various social engagement measures.
 | Likes | Dislikes | Shares | Comments
Likes | 1.00 | 0.73 | 0.93 | 0.84
Dislikes | 0.73 | 1.00 | 0.64 | 0.87
Shares | 0.93 | 0.64 | 1.00 | 0.74
Comments | 0.84 | 0.87 | 0.74 | 1.00

2.4.3 Modeling sample

For estimating the model, we drew a random sample from all video ads. Our sample is stratified based on the number of shares. This is because the distribution of the shares across the video ads is highly skewed, with half of the ads shared no more than 158 times (see Table 2.6). We study the dynamic processes of both views and shares. Using a simple random sampling procedure would result in a sample that contains a large portion of non-shared ads. Such a sample would not provide much information for identifying the interrelationship of views and shares and the diffusion characteristics of viral ads. In the stratified sampling, we divided all video ads into three groups and sampled from each group randomly. The break points for the three strata were based on the 75% and 90% quantiles of the shares. We then drew 50 video ads from each group randomly, resulting in a total of 150 video ads. We believe the sample size is sufficiently large to provide a reasonable representation of the population of video ads, while still allowing the model to be estimated within a reasonable amount of time. We included the time series up to 15 days (360 hours) for each video, because the tail lacks patterns of interest. As a result, our sample consists of 150 videos, each with 360 observations.
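The stratified draw described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the nearest-rank quantile rule, and the synthetic share counts are all hypothetical, and only the break points (75% and 90% quantiles), the 50 ads per stratum, and the 360 hourly observations per ad come from the text.

```python
import random


def stratified_sample(shares_by_video, per_stratum=50, cuts=(0.75, 0.90), seed=0):
    """Split videos into strata at the given share quantiles and draw equally from each."""
    rng = random.Random(seed)
    ordered = sorted(shares_by_video.values())
    # Nearest-rank empirical quantiles (one of several common conventions; an assumption).
    thresholds = [ordered[min(len(ordered) - 1, int(q * len(ordered)))] for q in cuts]
    strata = [[] for _ in range(len(cuts) + 1)]
    for video_id, shares in shares_by_video.items():
        # Count how many thresholds the video reaches: 0 = low, 1 = middle, 2 = high.
        index = sum(shares >= t for t in thresholds)
        strata[index].append(video_id)
    return [rng.sample(group, min(per_stratum, len(group))) for group in strata]


data_rng = random.Random(1)
# Hypothetical share counts for 1,962 ads (synthetic, heavy-tailed; not the thesis data).
shares = {f"video_{i}": int(data_rng.lognormvariate(5.0, 2.5)) for i in range(1962)}

sample = stratified_sample(shares)
n_videos = sum(len(group) for group in sample)
n_observations = n_videos * 360  # 15 days x 24 hourly observations per video
```

Drawing the same number of ads from each stratum deliberately over-represents heavily shared ads relative to a simple random sample, which is the stated purpose of the design.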
This yields a data set of 54,000 observations in total. Figure 2.2 shows the time series of the (normalized) hourly views and shares for four sample video ads. Nonlinear diffusion, seasonality and interdependency of views and shares are evident from these plots.

Figure 2.2: Plot of the observed hourly views (red) and shares (green) for four sample video ads (EC-zB2aSXXM, fiCJ7lgn0Z0, 86wGkXjgEjI, RzRm9--UmPw). The hourly views (shares) are normalized by the largest observed hourly views (shares) for each video so that the y-axes are comparable. The white and gray strips in the background separate different days. Horizontal axis: hours (days) elapsed; vertical axis: hourly views/shares (normalized).

2.5 Results

We organize this section as follows. First we describe and compare the estimated models. We next present the results on diffusion characteristics and the estimated dynamic effect of share effectiveness and sharing propensity. We then show the implications of our dynamic estimates for optimizing ad promotion strategy. Finally, we analyze the best time to seed an ad.

To estimate the parameters of the model, we simulate the full posterior distribution by using Markov chain Monte Carlo methods (Gelman et al. 2003). The ‘Metropolis-within-Gibbs’ scheme sequentially samples parameters from their lower dimensional full conditional distributions over many iterations. Details of the derivation of the full conditional distributions used in the simulation schemes are in Appendix A. In drawing the samples from the posterior distributions in the three models, we ran 50,000 iterations in three parallel chains, discarding the burn-in period of the first 30,000 iterations, at which point approximate convergence was achieved. To reduce auto-correlation, we used every 20th iteration of each chain.
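The bookkeeping behind burn-in and thinning can be checked with a few lines. This is only a sketch of the arithmetic implied by the figures above (50,000 iterations, 30,000 burn-in, every 20th draw, three chains); the variable names are hypothetical, and it does not reproduce the sampler itself.

```python
iterations = 50_000  # iterations per chain
burn_in = 30_000     # initial iterations discarded
thin = 20            # keep every 20th iteration after burn-in
chains = 3           # parallel chains

# Indices of the retained iterations within one chain.
kept_indices = range(burn_in, iterations, thin)
kept_per_chain = len(kept_indices)
# Count if the retained draws from all chains were pooled (an assumption about usage).
kept_pooled = kept_per_chain * chains
```

Under these settings each chain retains 1000 draws; whether inference pools the chains is not stated in this section, so the pooled count is shown only for reference.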
This process resulted in 1000 simulated draws for each model, based on which we draw inferences about the quantities of interest.

2.5.1 Model comparison

In addition to the specified dynamic model in (7), we estimated two other models, both of which model the time series of views and shares separately. In Model 1, the basic seasonal Bass model is applied to both the views and shares. Model 2 is identical to Model 1 except that the seasonal DIP Bass model is employed. We estimated these two additional models to test the need for including the temporal decay factor in the DIP Bass model. Fitting the Bass-like model to the shares data also helps us understand the shape of the share evolution. Model 3 is the full model (7) with joint evolution of views and shares and dynamic parameters.

The three models are summarized in Table 2.8. We compare these models using the Deviance Information Criterion (DIC) based on the sample estimation results. From the results in Table 2.8, we see that the DIP Bass model has a lower DIC than the basic Bass model and is therefore preferred, justifying the inclusion of the temporal decay factor. The full dynamic model outperforms the other two by a significant margin.

Table 2.8: Results of model comparison.
 | Model 1 | Model 2 | Model 3
Structure | Independent Bass | Independent DIP Bass | Full dynamic model
DIC | -262,932 | -285,095 | -428,307

2.5.2 Diffusion characteristics

Table 2.9: Estimated population-level diffusion parameters from the three models. Existing studies of diffusions of durable goods report that the innovation parameter is in [0.0007, 0.030], and that the imitation parameter is in [0.38, 0.53].
Model | Potential (m in 1000): Mean | 95% interval | Innovation (p): Mean | 95% interval | Imitation (q): Mean | 95% interval
Model 1 | 141 | (3, 11273) | 0.0036 | (0.0003, 0.0240) | 0.0031 | (0.0002, 0.0472)
Model 2 | 140 | (3, 13206) | 0.0028 | (0.0002, 0.0358) | 0.0137 | (0.0009, 0.1249)
Model 3 | 153 | (4, 12797) | 0.0010 | (0.0007, 0.0015) | 0.0143 | (0.0010, 0.1859)

Diffusion Coefficients.
Table 2.9 reports the posterior means and 95% credible intervals of the population-level diffusion parameters for the views from the three models. Existing diffusion studies of durable goods report that the innovation parameter is in [0.0007, 0.030], and that the imitation parameter is in [0.38, 0.53] (Chandrasekaran and Tellis 2007). Compared to the coefficients for durable goods, the estimated imitation parameter for video ads is much smaller. This leads to much faster peak and diffusion speed for video ads.

Table 2.10 reports the estimated correlation matrix among the diffusion parameters. The innovation parameter is significantly negatively correlated with the total views. On the other hand, there is a strong positive association between the imitation coefficient and the total views. Figure 2.3 shows the scatterplot of the estimated innovation and imitation parameters over the estimated total views for each video ad. The same trend is observed in these plots. These results indicate that viral ads have a smaller innovation coefficient and a larger imitation coefficient than non-viral ads. A possible explanation is that word-of-mouth plays a more important role for viral ads. In contrast, there is not much word-of-mouth generated for non-viral ads, so innovators may be the more important source of adoption.

Table 2.10: The correlation matrix for the population-level diffusion parameters.
 | log(m) | log(p) | log(q)
log(m) | 1.000 | -0.310** | 0.270**
log(p) | | 1.000 | -0.166**
log(q) | | | 1.000

Figure 2.3: Scatterplot of the estimated innovation and imitation parameters over the estimated total views for each video ad. The line in each plot represents the linear regression fit. Panels: log(p) and log(q); horizontal axis: log market potential (log(m)); vertical axis: log diffusion parameters.

Diffusion Speed. To get a quantitative measure of the diffusion speed of video ads, we compute the median time to peak, the time from the peak to 90% saturation, and the speed of diffusion, measured by the difference between the time points at which 10% and 90% of the estimated ultimate views are reached. These estimates, along with the 95% credible intervals, are in Table 2.11. From these estimates, we see that the views of video ads peak rather fast, generally within one and a half days. The successful estimation of this metric would not have been possible had we not tracked the video ads from the time of uploading and collected the data at an hourly level. In addition, the median diffusion speed is about 13 days to go from 10% to 90% of the estimated ultimate views.

Table 2.11: Estimated population-level diffusion characteristics. Speed of diffusion is measured as the difference between the time points at which 10% and 90% of the estimated ultimate views are reached.
Measure | Median | 95% interval
Days to peak | 1.29 | (0.24, 9.35)
Peak to 90% saturation | 12.69 | (2.54, 29.48)
Speed of diffusion | 13.41 | (3.95, 31.85)

Intra-Day Seasonality. Figure 2.4 shows the estimated population-level hourly effects for the views. We converted all time stamps to Pacific Standard Time. Because we lack information about individual viewers, we cannot identify the country or region from which a view comes and hence cannot adjust the hours to the viewing time zone. The hours used here are averages across all countries and regions of the world. However, a large portion of views are US based (footnote 8).

Footnote 8: According to YouTube, about 20% of the traffic comes from the US. See http://www.youtube.com/yt/press/statistics.html

Figure 2.4: Estimated hourly effects. Dashed lines indicate the 95% credible intervals. Horizontal axis: hour (0:00 to 24:00); vertical axis: hourly effects.

The plot indicates that there are more views in the afternoon (12:00 - 24:00) than in the morning (0:00 - 12:00). In addition, the highest intra-day views occur between 4 PM and 6 PM Pacific Time. On the other hand, the lowest views happen between 4 AM and 6 AM Pacific Time, when most people in the US are asleep.

Comparing Patterns of Views and Shares.
We applied a Bass-like diffusion model on shares to estimate its evolution pattern. Figure 2.5 shows the estimated population-level evolution patterns for both views and shares based on the DIP Bass model. It is evident that the curve for the shares has a higher and earlier peak and a much faster decay than that of the views. The shares distribution has more of its mass concentrated on the left than the views distribution does. This implies that, compared to actual ad adoption, a much larger portion of word-of-mouth is generated in the early stage of the diffusion.

Figure 2.5: Estimated population-level evolution patterns of views and shares. Horizontal axis: time elapsed; vertical axis: marginal distribution (dF).

2.5.3 Dynamic interrelationship between views and shares

Forgetting and Wearout Effects. Table 2.12 reports the estimated dynamic effects of share effectiveness, those of sharing propensity, and the forgetting effects on goodwill and potential sharers. Several results are noteworthy. First, there is a temporal wearout of share effectiveness, since the temporal effect $b_1$ is negative and significant. This suggests that the effect of a share in generating ad views generally decays over time.

Table 2.12: Estimated dynamic effects of share effectiveness and sharing propensity.
Parameter | Median | SD | 95% interval
Temporal wearout of share effectiveness ($b_1$) | -1.358 | 0.048 | (-1.449, -1.262)
Volume wearin of share effectiveness ($b_2$) | 1.350 | 0.334 | (0.759, 2.043)
Temporal wearout of sharing propensity ($b_3$) | -0.941 | 0.028 | (-0.997, -0.885)
Goodwill decay ( G ) | 0.704 | 0.036 | (0.629, 0.779)
Potential sharers decay ( P ) | 0.797 | 0.033 | (0.725, 0.855)

Second, the volume effect $b_2$ is significantly positive, implying strong volume wearin of share effectiveness. A large number of ad shares in a short period creates positive buzz and signals superior ad quality. This may make the ad more attractive to consumers. Third, we find strong temporal wearout of sharing propensity.
This suggests that consumers become less motivated to share the ad over time. With the passage of time, consumers are more likely to have watched the ad or to know about it already, making it less newsworthy. Late shares are less likely to reflect positively on the sender when the recipients find little value in the share. Indeed, sharing old content may make the sender look outdated. On the other hand, early shares make the sender appear to be “in the know” and promote a good impression.

Fourth, the rate of forgetting is moderately large. The rate of goodwill decay is about 0.704 and the rate of potential sharers decay is 0.797. These estimates are substantially larger than the rate of forgetting in TV advertising (e.g., Bass et al. (2007) estimate the forgetting rate to be around 0.03). The online environment is characterized by intense competition from numerous rival digital information products. Members of a network may be motivated to publish or share something not already discussed, to become an opinion leader. This makes an ad or a share published days or even hours ago quickly obsolete and ineffective.

Figure 2.6: The estimated population-level share effectiveness and sharing propensity parameters over time. Panels: share effectiveness and sharing propensity; horizontal axis: hours (days) elapsed, 0 to 360 (15); vertical axis: parameter estimates.

Sharing Propensity and Share Effectiveness. Figure 2.6 shows the estimated population-level sharing propensity and share effectiveness parameters over time (footnote 9). Consistent with the analysis of the wearout effect, the sharing propensity parameter decreases over time. The decay is markedly rapid. By the end of day 4, the sharing propensity is reduced to only 3% of its initial level. The share effectiveness parameter decreases after an initial increase. This initial increase in share effectiveness is likely due to the volume wearin.
The rate of decay is much smaller compared to that of sharing propensity. The fluctuation of the estimated share effectiveness is mainly due to the dependence of the dynamic change on the social shares (volume wearin), which tends to fluctuate from hour to hour (see Figure 2.2).

Footnote 9: The sharing propensity and share effectiveness parameters are estimated for each ad and for each hour. These population-level estimates are averaged across the video ads.

Table 2.13: Estimated short-term and long-term share effectiveness and sharing propensity by day.
Day | Sharing propensity (%): Short-term | Long-term | Share effectiveness: Short-term | Long-term | Monetary value ($)
1 | 5.23 | 6.56 | 5.15 | 7.31 | 0.20
2 | 3.40 | 4.26 | 5.89 | 8.36 | 0.23
3 | 0.81 | 1.01 | 5.78 | 8.21 | 0.23
4 | 0.53 | 0.66 | 5.81 | 8.26 | 0.23
5 | 0.36 | 0.45 | 6.90 | 9.80 | 0.27
6 | 0.26 | 0.32 | 5.50 | 7.81 | 0.22
7 | 0.25 | 0.31 | 5.30 | 7.52 | 0.21
8 | 0.33 | 0.41 | 4.60 | 6.53 | 0.18
15-day average | 0.91 | 1.15 | 4.65 | 6.60 | 0.18

Table 2.13 shows the average estimates of the sharing propensity and share effectiveness parameters by day. We also compute the approximate long-term effect that takes into account the carryover effect (footnote 10). The long-term sharing propensity measures the percentage of adopters that are willing to share the ad. For example, 6.56% of the viewers on the first day will share the ad in the long run. In comparison, only 0.41% of the viewers will share in the long run if they watch the ad on day 8 (0.33% in the short term). The long-term share effectiveness measures the number of views that a share may generate. For example, a share on the first day is estimated to create 7.31 views in the long run. The 15-day average short-term sharing propensity is about 0.91% (1.15% in the long term), and the 15-day average long-term effect of a share is about 6.6 views.

We can also compute the equivalent monetary value of a share based on the long-term effect of a share. This may help managers make investment decisions on stimulating additional social shares (Trusov et al. 2009). For this purpose, we need an estimate of the cost that a brand pays for one view of a video ad.
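The relation between the short-term and long-term columns of Table 2.13 can be illustrated with a geometric carryover sum. The sketch below assumes the short-term effect stays constant while carryover decays at the rates reported in Table 2.12 (0.704 for goodwill, 0.797 for potential sharers), in which case the infinite sum reduces to the short-term value divided by the decay rate. This constant-effect reading is an assumption made for illustration, not a statement of how the table was computed, and the function name is hypothetical.

```python
def long_term_effect(short_term, decay_rate, horizon=None):
    """Sum short_term * (1 - decay_rate)**i for i = 0..horizon-1.

    With horizon=None the infinite geometric series is used: short_term / decay_rate.
    """
    if horizon is None:
        return short_term / decay_rate
    keep = 1.0 - decay_rate  # fraction carried over to the next period
    return sum(short_term * keep ** i for i in range(horizon))


GOODWILL_DECAY = 0.704  # Table 2.12, goodwill decay (median)
SHARER_DECAY = 0.797    # Table 2.12, potential sharers decay (median)

# Day-1 short-term values from Table 2.13.
views_per_share_day1 = long_term_effect(5.15, GOODWILL_DECAY)
propensity_day1 = long_term_effect(5.23, SHARER_DECAY)

# A finite horizon converges quickly because the carryover fraction is small.
views_per_share_50h = long_term_effect(5.15, GOODWILL_DECAY, horizon=50)
```

Under this assumption the day-1 values come out at roughly 7.3 views per share and 6.6% sharing propensity, within about 0.01 of the long-term entries in Table 2.13.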
There are multiple sources for such an estimate. For example, according to Nielsen (footnote 11), the cost per mille (CPM) of an advertisement shown on TV during prime time is $25, which suggests the cost per view is about $0.025. A Wall Street Journal report (footnote 12) shows that the average cost per view for thirty seconds of ad time during the Super Bowl is about $0.03. In the following computation, we use the average of the two numbers, $0.028, as the cost per view for an ad shown on TV. Multiplying this by the long-term views per share yields the monetary value of a share, which is shown in the last column of Table 2.13. The 15-day average value of a share is about $0.18.

Footnotes
10 For example, the long-term effect of a share at time t is the sum, over i = 0, 1, 2, ..., of (1 - goodwill decay rate)^i times the share effectiveness at time t+i.
11 Source: http://www.tvb.org/trends/4718/4709
12 Source: http://blogs.wsj.com/economics/2012/02/04/number-of-the-week-super-bowl-ads-cost-3-cents-per-viewer

Figure 2.7: Part (a) illustrates the dynamic effect of a promotion at the 15th hour, indicated by the red arrow. The total effect is the gray area. Part (b) shows the total effect of a promotion that is carried out at different times. In both cases, the promotion is a purchase of 1000 views. Panel (a), dynamic effect of one promotion: hours elapsed against number of views. Panel (b), total effect of promotion by hour: days elapsed against additional views.

Best time to promote the ad. Advertisers may purchase YouTube services to promote their video ads. They can purchase the desired number of views of the ad on a “cost-per-view” basis. Some viewers reached by the promotion may also share the ad, generating additional views beyond the intended number of the promotion. In other words, social sharing engenders a long-term impact of promotion. Figure 2.7(a) illustrates the dynamic effect of a purchase of 1000 views at 15 hours after the video is uploaded. We see that an immediate effect of 1000 views occurs at the time of promotion.
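The monetary value of a share discussed above is a one-line product, and the last column of Table 2.13 can be reproduced from the long-term views per share. The sketch below takes the $0.028 cost per view adopted in the text and the long-term share effectiveness values from Table 2.13; the dictionary and variable names are hypothetical.

```python
# Cost per view adopted in the text: the average of $0.025 (TV prime time)
# and $0.03 (Super Bowl), reported as $0.028.
COST_PER_VIEW = 0.028

# Long-term views generated per share, by day, from Table 2.13.
long_term_views_per_share = {
    "1": 7.31, "2": 8.36, "3": 8.21, "4": 8.26,
    "5": 9.80, "6": 7.81, "7": 7.52, "8": 6.53,
    "15-day average": 6.60,
}

# Dollar value of one share = views it generates x cost of buying one view.
value_per_share = {
    day: round(views * COST_PER_VIEW, 2)
    for day, views in long_term_views_per_share.items()
}
```

The resulting values range from $0.18 to $0.27 and match the table's last column, including the 15-day average of $0.18.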
Potential social sharing from these viewers causes subsequent viewing and sharing until no more shares are generated from this process. The subsequent views and shares are estimated based on the population-level estimates of share effectiveness, sharing propensity and rate of forgetting. The total impact of the promotion includes all the views measured by the gray area.

The dynamic nature of the viewing and sharing process implies that the total impact of the promotion depends on the time it takes place. A managerially relevant question is when to promote the ad in order to maximize the return on that investment. To answer this question, we compute the total effect of the promotion (1000 views) launched at each hour after the video is uploaded. Figure 2.7(b) plots the estimated total effect of promotion by the time of promotion. This result suggests that it is better to promote the video ad in the early rather than the late stage of the diffusion. In addition, the best time to promote is around 24 hours, the beginning of the second day. This is preferred to the time of uploading because by then a share has become more effective owing to the volume wearin, which outweighs the lower sharing propensity (due to the temporal wearout).

Figure 2.8: The estimated effect of uploading hour (a) and uploading day (b) on the ultimate views. The vertical bars in (b) correspond to two standard deviations from the mean. Panel (a): upload hour (0:00 to 24:00) against estimated hour effect. Panel (b): upload day (Monday to Sunday) against estimated day effect.

2.5.4 Best time to seed the ad

The hierarchical structure can be extended to include video-level predictors such as the physical information of the video ad. For example, we can link the uploading time and day to the estimated ultimate views of the video and give guidance on the best time for seeding a video ad.
Because of the complexity of the hierarchical model, we include a sample of only 150 videos. Even with a highly efficient MCMC algorithm, estimating the current model on this sample takes days. A robust estimate of the effect of seeding time on ultimate views requires a much larger sample. For this reason, we resort to an approximate method, in which we use all videos in the collected data, treat the observed largest cumulative views as the approximate ultimate views, and regress the logarithmic ultimate views against the uploading hour and uploading day of the video. We also include brands as random effects. The estimates for the uploading hour and uploading day are in Figure 2.8. We see that, consistent with the estimated hourly effect in the model, the best uploading hour is around 6 PM Pacific Time. In addition, videos uploaded on Monday seem to correspond to more views.

2.6 Discussion

Recent years have seen a surge of digital information products over the internet. The adoption of these products is distinct in many ways from that of traditional products. On one hand, because of the use of the internet for distribution, these products have a large potential market and a rapid diffusion speed. A successful product may generate substantial interest and social shares within a short time period, becoming a viral product. On the other hand, these products can face strong competition due to a large number of rival ideas published continuously. The fast-paced change of topics and trends in the online environment may engender consumers’ adoption and sharing behavior that also change over time. These unique features of digital information products require a careful study of the process of product adoption and share generation. Such an analysis can help to better understand the dynamics of human information consumption and help in improving product launch and social marketing strategies.
We constructed and estimated a dynamic model that describes the joint evolution of product adoption and share generation, as well as the dynamic interrelationship between adoption and sharing. The application of the model to YouTube video ads demonstrates excellent performance in capturing the complex diffusion pattern as well as virality when it occurs. This section summarizes the main results, discusses some questions, and lists limitations for future research.

2.6.1 Summary of results

Our main findings from the empirical analysis are the following. The speed of diffusion of video ads is rather rapid, with most ads reaching 90% of the ultimate views within two weeks. On average, it takes 1.3 days to reach the peak and 12.7 days from the peak time to reach 90% saturation of the ultimate views. The estimated Bass imitation coefficient is substantially smaller than that of durable goods. Viral ads have a smaller innovation coefficient and a larger imitation coefficient than non-viral ads. There is strong intra-day seasonality, with more views in the afternoon than in the morning. The peak occurs between 4 and 6 PM and the trough between 4 and 6 AM Pacific Time. The evolution of shares is characterized by a higher and earlier peak and a much faster decay than that of views. The interrelationship between views and shares changes over time. The effect of a share on view generation generally decays over time, but it can also be boosted by a high volume of shares. These two effects yield dynamic share effectiveness over time with an initial increase and a subsequent steady drop. In addition, the propensity of a consumer to share the ad decays over time, and this decay is much faster than that of share effectiveness. Based on these results, we find that the best time to promote the ad is at the beginning of the second day after release of the ad. We also find that one share generates about 6.6 views in general, which corresponds to a monetary value of about $0.18.
Lastly, additional analysis suggests that the best time to upload the video ad is around 6 PM Pacific Time and the best day is Monday.

2.6.2 Discussion issues

The study raises several questions. What are the specific causes of the rapid peak (first few days) and rapid decay (first few weeks) of digital information products? What are the reasons for the existence of virality? How generalizable are the results for online video ads to digital information products?

2.6.2.1 Rapid peak and fast decay

We postulate that the rapid peak and fast decay of online video ads may be due to five essential characteristics that distinguish them from physical products or long form information products such as e-books and full length movies. First, being short, their consumption is quick. Second, they exist in digital form, making viewing and sharing easy, unlike physical products that have to be shipped. Third, they use the channel of broadband internet service or digital phone networks, both of which are currently widely available in most countries of the world. These channels enable the existence of huge interconnected networks of potential viewers and sharers. Fourth, such channels provide consumers with hundreds of products to simultaneously view and share, increasing the rival options available to consumers. Fifth, in this information and entertainment rich environment, consumers still have limited time. Indeed, the abundance of options has led to shrinkage of experienced time. Abundant rival options and limited time have led to information-rich but time-poor consumers, resulting in the rapid adoption and decay of any single digital information product.

2.6.2.2 Existence of virality

Virality is the rapid spread of a digital information product through viewing and sharing in a social network, leading to a very large number of views in a short period of time. Because of consumers’ limited time and the abundant choices, not all products can go viral. Only a few do.
We find that viral ads have a higher imitation coefficient. This suggests that word-of-mouth plays an important role in viral diffusion. In particular, explicit social shares seem primarily responsible for the virality of a video ad.

2.6.2.3 Generalizability

We speculate that the patterns and estimates of diffusion for online video ads are generalizable, with relatively small variations, to other digital information products such as ideas, tweets, news, images, songs, or user generated videos. The reason is that all these products share the same five characteristics described above. Nevertheless, replication of this model on other digital information products to confirm their similarity and explicate their differences would be beneficial.

2.6.3 Limitations for future research

This research has several limitations that could be addressed in future research. First, because we collect the data by tracking publicly available video ads, we are unable to gather private information such as whether and when the video ads are aired on television or pre-embedded into other videos due to the brand’s purchase of promotion services on YouTube. The diffusion behavior of the video ads under these two activities may be different from that of ads independent of these activities. However, because we use randomly sampled videos from a relatively large number of brands, our population-level inference is still valid, with the population now consisting of video ads with and without additional promotional activities. Of course, when information on such promotional activities is available, we can readily incorporate it into the model and investigate its impact on the diffusion. Second, because our tracking of the video ads is URL based and only focuses on video ads uploaded by the focal brands, it ignores any identical copies of a video ad that are uploaded by other users. These copies may create additional views and shares that benefit the brands.
Another type of video is the derivative of a branded video ad that is modified and edited by other users. Future research may include these two types of videos and investigate how virality affects the number of copies and derivatives. Third, the primary focus of this essay is inference about the diffusion characteristics of information products and understanding the dynamics of consumers’ consumption of information. We did not analyze the forecasting perspective, although the model can be readily used to generate predictions. In this context, on-line forecasting is perhaps more useful: the model parameters are updated sequentially as new data points arrive, and predictions are generated for the next period using Bayesian updating. On-line forecasting, however, requires sequential Monte Carlo methods such as the particle filter to update the filtering distribution efficiently (Liu and West 2001). All these limitations present promising opportunities for future research.

Chapter 3

Content Drivers of Virality for YouTube Video Ads

3.1 Introduction

Most existing studies on virality have investigated the role of emotion in driving virality. For example, Berger and Milkman (2012) looked at the drivers of virality in newspaper stories and found that valence and arousal affect virality. Nelson-Field et al. (2013) investigated the role of emotions in driving the virality of YouTube videos. Table 3.1 summarizes the existing studies in this area (adapted from Nelson-Field et al. (2013)). However, video ads are rich stimuli that include cues beyond emotions. Ads might include arguments, brand prominence, and cues that may not be fully captured by emotion, such as the type of sources used or the use of surprise. These additional cues may also affect virality in important ways. A better understanding of these ad cues will help advertisers design more effective ad content.
We conducted an empirical analysis to examine how the use of various executional cues in the ad influences the virality of YouTube video ads. We collected the number of social shares of 345 branded video ads and developed an instrument to rate the content of these ads on over 30 cues drawn from the behavioral literature on advertising. These cues cover the use of arguments, the type of emotions used in the ad, the type of sources, the extent and location of surprise, the extent of humor, the timing of the brand’s presence, and the use of narration, dramatization, sexual appeals, and others. We then analyzed social shares as a function of the cues of ad content and identified important drivers of virality by assessing both their direct effects and their indirect effects through possible mediators.

Our analysis of the relationship between the shares of video ads and ad cues suggests the following major results. The use of argument appeals positively influences sharing only when the ad relates to a newly introduced product; otherwise, its effect is negative. Emotional appeals are more effective than argument appeals at promoting sharing. In particular, high ratings of love, warmth, pride, and excitement stimulate social sharing. Ads rated as high in surprise have a strong positive effect on social shares. The location of surprise also matters. A surprising end evokes more sharing compared to a surprising beginning. A related finding is that humorous ads are generally more likely to be shared. Further, evidence suggests that brand prominence is influential. The frequency with which brand names are shown in the ad has a negative effect. Late placement of brand names and logos enhances sharing propensity compared to early display. Pulsing brand names through the ad does not increase virality. The relationship between social shares and ad length is characterized by an asymmetric inverted U curve, with ads between 1 and 1.5 minutes being most likely to be shared. Lastly, the type of sources influences sharing. Celebrities, babies, and animals have strong positive effects on social shares, both directly and indirectly through their effects on emotions.

Table 3.1: Summary of studies on drivers of virality.
Study | Stimuli type | n stimuli | Independent variable | Dependent variable | Significant drivers
Dobele et al. (2007) | Viral campaigns | 9 | Extent of aroused emotions | Not an empirical study |
Southgate et al. (2010) | YouTube video ads | 102 | Involvement, enjoyment, branding, distinctiveness, celebrity | YouTube views | Involvement, enjoyment & celebrity
Eckler and Bolls (2011) | YouTube video ads | 12 | Emotional tone | Intent to forward | Positive tone
Berger and Milkman (2012) | NY Times articles | 7000 | Emotion valence & arousal | On most emailed list | Positive & high arousal emotions
Nelson-Field et al. (2013) | YouTube videos | 800 | Emotion valence & arousal | Facebook shares | Positive & high arousal emotions
Current study | YouTube video ads | 345 | 30+ variables of content | Number of shares on Facebook, Twitter, Google Plus, LinkedIn | Use of argument in new product ads, high aroused emotion of love/warmth, pride, excitement, high degree of humor and surprise, low narration, high dramatization, low brand prominence, end placement of surprise and brand name, length of ad (inverted U shape), and use of celebrity and baby/animal

In the following sections, we review the related literature, discuss the theoretical framework, describe the collected data and the content coding procedure, and present the estimation results. We then provide concluding comments and implications for marketing practice.

3.2 Literature

In this section, we discuss the theoretical framework for the analysis of drivers of viral video ads.
We review the literature on various executional cues and their possible effects on social shares.

3.2.1 Drivers of viral ads

A large body of literature has examined the effects of various executional cues in TV and print advertising. Lab experiments or empirical data are used to investigate the link between different ad cues and ad effectiveness, often measured in terms of brand name or message recall, attitude toward the ad or brand, purchase intentions, or actual product sales. The abundant research in this area has greatly improved our understanding of the effect of different executional cues on these outcome measures. Representative studies and their findings are in Table 3.2.

| Study | Cues | Dependent variable | Context | Findings |
| --- | --- | --- | --- | --- |
| Petty et al. (1983) | Argument strength | Attitude | Experimental | There are two routes to persuasion. Argument quality (central route) has a greater impact under high involvement, and product endorser (peripheral route) has a greater impact under low involvement. |
| Snyder and DeBono (1985) | Argument (claims of product quality), product image | Evaluation of ads | Experimental | Low self-monitoring individuals react more favorably to product-quality-oriented ads, and high self-monitoring individuals react more favorably to image-oriented ads. |
| Munch and Swasy (1988) | Argument (rhetorical question, summarization frequency and argument strength) | Recall | Experimental | Rhetorical argument reduces recall. |
| Pechmann and Stewart (1990) | Argument (type of comparative claims) | Attention, memory and purchase intentions | Experimental | Comparative argument increases attention and purchase intention for low-share brands. |
| Dens and De Pelsmacker (2010) | Argument, emotion | Attitude, purchase intention | Experimental | The type of advertising strategy (informational vs emotional) matters for new brands, and such an impact is moderated by the involvement of product category. |
| Batra and Ray (1986) | Emotion (aroused feelings in respondents) | Attitude and purchase intention | Experimental | Affective responses are significant in determining the attitude toward the ad, the brand and purchase intention. |
| Holbrook and Batra (1987) | Many cues including fact, emotion, humor, surprise, sex | Emotion, attitude to ad and brand | Experimental | Emotion serves as a mediator in advertising. |
| Edell and Burke (1987) | Emotion (intensity and types of emotion aroused in subjects) | Attitude | Experimental | Both positive and negative feelings are important predictors of ad effectiveness, and the relative importance of feelings and judgments depends on the informational aspect of the ad. |
| Burke and Edell (1989) | Emotion (intensity and types of emotion aroused in subjects) | Attitude | Experimental | Feelings affect attitude toward the ad and attitude toward the brand directly and indirectly. |
| Olney et al. (1991) | Many cues including fact, emotion, sex, uniqueness | Emotion, attitude to ad, viewing time | Experimental | Ad content has various effects on actual viewing behaviors through emotional reactions and attitude toward the ad. |
| Shimp and Stuart (2004) | Disgust | Attitude and purchase intention | Experimental | Disgust mediates the effect of ad content on purchase intentions. |
| Aaker et al. (1986) | Warmth | Arousal, attitude and purchase likelihood | Experimental | Warmth improves attitude, recall and purchase likelihood. |
| Teixeira et al. (2012) | Joy and surprise | Ad avoidance | Experimental | Joy and surprise increase attention and reduce ad avoidance. |
| Woltman Elpers et al. (2003) | Information and entertainment value | Viewing time | Experimental | Viewing time is positively related to entertainment value and negatively related to information value. |
| Zhang and Zinkhan (1991) | Humor | Perceived humor, attitude, recall | Experimental | Humor influences attitude and recall. |
| Spotts et al. (1997) | Humor types and intentional relatedness | Starch score | | The effect of humor varies by product, and incongruity-based humor performs well. |
| Lee and Mason (1999) | Humor | Attitude | Experimental | Humor has a favorable effect in ads with unexpected-irrelevant information, but not in ads with unexpected-relevant information. |
| Woltman Elpers et al. (2004) | Surprise location | Perceived humor | | Later peak of surprise leads to higher peak of humor. |
| Derbaix and Vanhamme (2003) | Surprise | Frequency of word-of-mouth | Experimental | Surprise increases frequency of word-of-mouth and is not completely mediated by emotion. |
| Friedman and Friedman (1979) | Celebrity | Attitude | Experimental | Effectiveness of endorser varies by product type. |
| Atkin and Block (1983) | Celebrity | Ad ratings | Experimental | The use of celebrity produces favorable impact on ad ratings. |
| Freiden (1984) | Endorser | Attitude | Experimental | Different endorser types have differential effects on consumers’ attitudes. |
| Amos et al. (2008) | Celebrity | | Meta-analysis | Negative celebrity information is extremely detrimental to an advertising campaign. |
| Baker et al. (2004) | Brand location | Attitude | Experimental | Early placement of brand is better. |
| Teixeira et al. (2010) | Branding activity | Ad avoidance | | Pulsing reduces ad avoidance. |
| Singh and Cole (1993) | Ad length | Recall | Experimental | 30s ads are as effective as 15s ones, and the effectiveness depends on the appeal method used. |
| Deighton et al. (1989) | Narration, character, plot | Emotion, belief | Experimental | Emotion and persuasion are affected by the dramatization scale. |
| Severn et al. (1990) | Sexual appeal | Recall, attitude, behavioral intentions | Experimental | Sexual appeal leads to superior attitudes and purchase intentions. |
| Chandy et al. (2001) | Argument, emotion, endorsers | Sales | Empirical | Impacts of different ad cues depend on the market age. |
| MacInnis et al. (2002) | Rational cues, emotion, length | Sales | Empirical | Cues affect the impact of media weight on sales. |

Table 3.2: Representative studies on ad cues.

However, the same executional cues may have a different effect in the context of YouTube video ads for several reasons.
First, the exposure context is different. Consumers are forced to watch TV ads unless they choose to avoid them through zipping or zapping. Moreover, advertisers have great control over the number of exposures, as the number of ad exposures is a function of ad spending and the potential audience size. Prior research focuses on how the cues change consumers’ behavior following exposure to the ad. In contrast, consumers are generally not exposed to a YouTube ad unless they choose to view it. The number of ad exposures is driven mainly by word-of-mouth and sharing using social networks. Advertisers have little control over the social sharing process. The main objective is to identify the executional cues that are likely to generate social shares and create ad exposures.

Second, prior research has focused on consumers’ memory, recall, or attitudes because consumers usually do not purchase at the time of exposure. Their memory of the ad content influences the future purchase decisions. While the same factors may operate in the context of shared YouTube ads, YouTube consumers have the opportunity to consider an additional decision, specifically, whether to share the ad with others immediately following ad exposure. Measures such as memory and recall may play less important roles in driving social shares.

Third, consumers’ purchase decisions are often made individually. In contrast, sharing is a means of social interaction that influences other individuals within the network. Cues that engage consumers in social sharing are likely to exert social influence. For this reason, additional theory is needed to understand the phenomenon of social sharing.

3.2.2 Theory of social sharing

Sharing on social networks is a means of social interaction. The content that individuals share can influence the nature of such interactions. Prior research has suggested that people often engage in information sharing to satisfy the need for self-enhancement (De Angelis et al.
2012, Packard and Wooten 2013, Barasch and Berger 2014). Self-enhancement refers to the basic human need to feel good about oneself (De Angelis et al. 2012). Individuals tend to enhance the self by promoting a good impression and positive image of themselves in social interactions. On social networks, individuals express themselves through the content they share. To bolster the self, individuals tend to share the content that helps them create a good impression and obtain positive recognition from others. For example, sharing interesting things should make someone look better than sharing mundane ones.

For this reason, video ads containing executional cues that help individuals achieve a positive self-image are more likely to become viral. In the following, we provide a detailed discussion of various ad cues and the ways in which they may satisfy the need for self-enhancement. Table 3.3 provides the summary of the potential effects of ad cues on self-enhancement based on the theoretical discussion.

| Cues | Dimension | Effect on self-enhancement | Reason |
| --- | --- | --- | --- |
| Argument | Extent | – | More message in argument requires high cognitive effort, and can make the ad dry, uninteresting and even irritating. Sharing such ads is unlikely to elicit self-enhancement effects. |
| Argument | Extent | + | Messages may contain valuable information that benefits viewers, and may provide novel information worthy of knowing (e.g., ads for new product). Sharing such ads makes senders look “in the know”. |
| Emotion | Extent | + | Stimuli are generally more interesting in emotion-based ads. Sharing interesting content can reflect positively on the sender. |
| Emotion | Valence | Positive (+) | Positive emotions can create a positive image of the sender. |
| Humor | Extent | + | Humorous ads put the audience in a pleasant mood and may create a humorous image of the sender. |
| Surprise | Extent | + | a) Surprise may arouse emotions. b) Consumers crave unexpected stimuli. Sharing surprising ads satisfies that need, thus eliciting positive responses. |
| Surprise | Location | End (+) | Longer window between the beginning of the ad and the location of surprise may lead to stronger aroused emotions. |
| Brand prominence | Duration | – | Extensive brand prominence makes the ad more like a traditional ad. Consumers are unlikely to feel that others will look at them favorably by sharing a traditional, marketer-driven message. |
| Brand prominence | Location | End (+) | a) Early placement increases the perception of the information value of the ad. This may reduce ad liking and increase ad avoidance. b) End placement allows consumers to focus on the content. This makes the ad more like an entertaining video and less like a traditional commercial. These imply that end placement may elicit a self-enhancement effect, while early placement may not. |
| Sources | Celebrity | + | a) Celebrities are generally attractive and/or likable. This may arouse emotions from the audience. b) Celebrities can make claims trustworthy, making the product information more valuable. c) Celebrities can be a source of entertainment. Sharing ads endorsed by celebrities may make the sender look informed and valuable. |
| Sources | Cute sources | + | Babies and animals add a cute factor to the ad. They may arouse emotions like warmth or love. |
| Ad length | | Inverted U shape | a) Long ads may contain more information, be more persuasive and arouse stronger emotions. b) Long ads may be boring and fail to sustain viewers’ interest. c) Short ads may not convey enough information or arouse strong emotions. |
| Other cues | Sex, suspense, narration, dramatization | | These cues may have indirect effects through the extent of argument or emotion. |

Table 3.3: Summary of the potential effects of ad cues on self-enhancement from the theoretical discussion.

Figure 3.1: Structure of causal relationships.

3.2.3 Structure of causal relationships

The presence of a large number of cues in advertising (e.g., see Table 3.2) can make the study of causal relationships between cues and their outcomes a challenging task.
We relied on the existing theories to establish the structure of causal relationships.

Advertisers commonly persuade the viewers by appealing through message arguments and evoking viewers’ emotions (Tellis 2004). Various cues in the ad may be employed to deliver messages in argument form or to arouse emotions from the viewers. These cues may therefore influence the advertising outcome indirectly through the strength of argument or the extent of emotion. For example, the use of narration (Deighton et al. 1989) and claims about products are ad cues that may influence the strength of an argument. Ad cues that may arouse emotion include surprise (Alden et al. 2000, Derbaix and Vanhamme 2003), dramatization (Deighton et al. 1989), sexual appeals (Holbrook and Batra 1987), suspense (Alwitt 2002), and endorsers (Petty et al. 1983). These cues (hereafter referred to as independent variables) may drive social sharing in two ways: (1) via indirect influence through the extent of argument or the aroused emotion (the mediators) and (2) via additional direct influence beyond the indirect effect via the mediators. Other cues, such as the length of the ad, may not have a strong relation with argument or emotion. These cues were also included, and their direct influence on social shares was tested.

Accordingly, the ad cues could be classified into three groups: (1) the mediators: the extent of argument and emotion; (2) the independent variables that may be mediated; and (3) the other cues that are not linked to the mediators. Their potential relationships with the social shares are summarized in Figure 3.1.

Given the established structure, we used statistical analysis to identify important cues that drive social sharing as well as the form in which such influence takes place (e.g., direct, indirect or both). Specifically, we sought to answer the following questions:

1. What is the role of emotion in influencing sharing? What ad cues influence virality indirectly through their effects on emotions? (Arrow (1))
2. What is the role of argument in influencing sharing? What ad cues influence virality indirectly through arguments? (Arrow (2))
3. What ad cues directly drive virality beyond the effect of arguments or emotions? (Arrow (3))

3.2.4 Extent of arguments

Message arguments persuade a viewer by using logical reasoning and factual claims that credibly convey the benefits of the brand to the user. The strength of message arguments can affect the valence and duration of consumers’ attitude and their resistance to persuasion (Petty et al. 1983, Petty and Cacioppo 1986). Strong or unusual messages may also influence brand memory (recognition, recall), purchase intentions (e.g., Pechmann and Stewart 1990), and product sales (Chandy et al. 2001). The effect of the use of arguments depends on consumers’ ad and issue involvement (Petty et al. 1983), product category involvement (Dens and De Pelsmacker 2010), and the extent to which consumers engage in self-monitoring. Some research has suggested that the use of message arguments may also be more effective in driving consumers’ behavior compared to the use of emotional appeals (Golden and Johnson 1983, Millar and Millar 1990).

However, logical reasoning and extensive processing of factual claims require substantial cognitive effort from the viewer (MacInnis and Jaworski 1989). Moreover, factual claims can make the ad dry and uninteresting (Tellis 2004), limiting consumers’ motivation to process these arguments (MacInnis et al. 1991). Indeed, extensive ad information can irritate the viewer (Pasadeos 1990) and increase the likelihood that s/he will stop viewing the ad (Woltman Elpers et al. 2003). This will decrease the liking of the ad, the likelihood of a positive response, and therefore the propensity of sharing.
On the other hand, viewers might be willing to share some messages that contain arguments when ads provide novel or interesting information that is newsworthy, such as that from a new product launch (Moldovan et al. 2011). When products are new and information about them is limited, the use of arguments can add persuasive power to the ad (Petty et al. 1983, Dens and De Pelsmacker 2010) and foster willingness to share it with others. In addition, the ad can contain valuable information regarding the product that can benefit the recipients. Sharing persuasive information about a new product and sharing content worthy of knowing may add social cachet to the viewer, who appears to be “in the know”.

3.2.5 Extent to which the ad arouses emotions

Ads that aim to arouse viewers’ emotions employ various tactics to do so. Such ads have been demonstrated to enhance attitude toward the ad and brand (Edell and Burke 1987, Holbrook and Batra 1987, Olney et al. 1991), increase purchase intention (Aaker et al. 1986, Shimp and Stuart 2004) and recall (Stayman and Batra 1991), reduce ad avoidance (Olney et al. 1991, Teixeira et al. 2012), and affect sales (Chandy et al. 2001, MacInnis et al. 2002).

Several studies have examined the role of emotion in driving ad virality (Dobele et al. 2007, Poels and Dewitte 2006). The existing findings suggest that people are more likely to share content when the ad uses positive emotions (Eckler and Bolls 2011), particularly emotions that are high in arousal (e.g., excitement, surprise, humor, pride) (Berger and Milkman 2012, Nelson-Field et al. 2013).

Compared to logic and facts associated with the use of arguments in ads, the stimuli used to arouse emotions (e.g., drama) are generally more interesting and require less cognitive effort to process. Furthermore, appealing through emotion may be more effective in driving consumers’ action (Edwards 1990).
Consumers may wish to share positively valenced and highly arousing ads with others because they want to make others feel good. Such social sharing creates a potential reciprocity effect through affecting other individuals who might adopt similar behaviors when they find similar “feel good” content. Sharing content that is emotionally evocative also suggests that one is “in the know” by scouting out content that might be of interest to others. Sharing interesting and entertaining content is therefore more likely to reflect positively on the sender. In addition, people may be more motivated to share positive emotions, as they can put the recipients in a positive mood and create a positive image of the sender.

3.2.6 Extent of humor

The extent of humor is defined here as a subjective judgment of the extent to which the ad is likely to be regarded as funny. Humorous ads are often enjoyable, involving, and memorable (Weinberger and Gulas 1992). The effective use of humor in the ad can put the audience in a pleasant mood and improve the attitude towards the ad (Zhang and Zinkhan 1991, Lee and Mason 1999). Humor also creates ads of good attention-grabbing quality (Sternthal and Craig 1973, Zhang and Zinkhan 1991). Video ads that attract attention are more likely to be viewed; therefore, they are more likely to deliver the message the sharer intends to send. Further, sharing a humorous ad can create a humorous image of the sender. The recipients may credit the humorous content to the direct source from which they obtained it, that is, the sender. Receiving a humorous ad may therefore increase the recipients’ liking of the sender.

3.2.7 Extent of surprise

Surprise in our context is elicited when the ad content is contrary to a common viewer’s prior beliefs or expected behaviors. While surprise itself has no valence attached, it is often accompanied by some subsequent emotions.
For this reason, prior research has considered surprise as a potential driver of emotions (Alden et al. 2000, Woltman Elpers et al. 2004).

The effect of surprise may go well beyond the aroused emotions. A previous study in neuroscience (Berns et al. 2001) utilized MRIs to measure changes in human brain activity in response to a sequence of pleasurable stimuli that are either predictable or completely unpredictable. The results indicated that the human brain responds most strongly to unpredictable stimuli and suggested that “people are designed to crave the unexpected” (Redick 2013). The findings implied that ads that contain strong surprises are more likely to be shared. These ads are more likely to satisfy people’s need for the unexpected. Sharing them should therefore elicit positive responses from recipients within one’s network and foster self-enhancement.

When using surprise in an ad, marketers need to decide where the surprising element should be placed. The location may be less material for TV ads, since they are usually short. However, video ads on YouTube can be long (more than half of the ads we studied were longer than 60 seconds), making the location of the surprise a potentially relevant consideration. The late placement of surprise may be more effective compared to the early placement. Prior research has suggested that a longer window between the beginning of the ad and the location of surprise allows more opportunities for surprise to be transformed into other emotions. This can lead to higher emotional arousal (e.g., humor) (Woltman Elpers et al. 2004). Additionally, a later surprise can be more gratifying by suggesting that one’s time investment in the ad has been worth the wait.

3.2.8 Brand prominence

Brand prominence is an important element of advertising that affects consumers’ comprehension of the ad.
In this study, we were interested in the effect of two particular aspects of brand prominence in ads on social sharing: the duration of brand name exposure in the ad and the timing of first mention of brand.

Stewart and Furse (1986) argued that the length of the brand name appearing on screen could improve ad comprehension, memory, and persuasion. However, when ad exposure is not forced, the presence and longer duration of brand names in the ad can increase the likelihood of ad avoidance (Teixeira et al. 2010). This occurs because YouTube ads become less like entertainment and more like traditional ads when brand names become prominent. Consumers are unlikely to feel that others will look at them favorably by sharing a traditional, marketer-driven message. In other words, sharing ads that contain extensive brand names may not induce a self-enhancement effect.

Prior studies on the effect of the location of the brand name in the ad have yielded conflicting findings. The existing evidence favors placing the brand name early (Baker et al. 2004), late (Fazio et al. 1992), or intermittently (Teixeira et al. 2010) in an ad. Since a brand name contains information, early placement of the brand may increase the perception of the ad’s information value. The increased level of information may lower ad liking and create ad avoidance (Woltman Elpers et al. 2003). Consumers are unlikely to achieve self-enhancement if the recipients do not view the shared ads. On the other hand, end placement of the brand name allows the viewers to focus on the content of the ad, regardless of whether it is a story or a drama. Consumers have the option of being exposed to the brand name at the end. In this sense, consumers may perceive the video ad more like an entertaining video and less like a traditional commercial. By the same reasoning as above, sharing such ads may elicit a self-enhancement effect.

3.2.9 Sources and types of sources

Various sources used in ads can arouse audience emotions.
For example, celebrity endorsers generally are attractive and/or likable (Friedman and Friedman 1979, Atkin and Block 1983). They may evoke a sense of excitement. The use of babies and animals adds a cute factor to the ad. Such factors might arouse emotions like warmth or love.

In addition, the use of celebrities can engender effects beyond emotions. The appearance of celebrities in the ad may make an ad popular for several reasons. First, the endorsement from a celebrity may increase the trustworthiness of the claims in the ads (Goldsmith et al. 2000), making the product information more valuable. Second, celebrities can be a source of entertainment, gossip, and news (Southgate et al. 2010). They attract attention to the ad and they are perceived as more entertaining (Atkin and Block 1983). Ads endorsed by celebrities are likely to generate more interest, attention, and response from the viewers. Sharing such ads can help bolster the self because the ads may make the senders appear valuable and informed.

3.2.10 Length of the ad

The length of the ad can also affect viewers’ comprehension of the ad. Prior studies have found that longer ads are generally more effective on recall (Singh and Cole 1993) and product sales (MacInnis et al. 2002) compared to shorter ones. However, because of the high cost of TV advertising, ads are historically short. While the typical commercial length is 30 seconds, commercial lengths can be as brief as 5 seconds. The use of 15-second ads can be as high as 40% of all commercials (Singh and Cole 1993).

In marked contrast, almost no restrictions are placed on the length of video ads on YouTube. Advertisers tend to use longer ads compared to those in traditional advertising on TV (the median is 60 seconds in our sample). The existing findings may not apply to YouTube video ads that can be much longer.
Longer ads may contain more information, facilitate information processing, or strengthen persuasion through repetitive information. They may also tell a story or portray a drama that arouses stronger emotions from the audience. However, such ads may fail to sustain the viewers’ interest if not well executed. A very long ad can become boring because consumers are generally impatient. On the other hand, a very short ad may fail to convey enough information to be persuasive, or it may contain insufficient emotional cues to be engaging and/or tell a story. For this reason, we expect an inverted U shape between social shares and ad length.

3.3 Data and coding

This section covers data sampling and content coding. The data we used are from the first essay, and the data collection procedure was described in Section 2.4.

3.3.1 Data sampling

Because of the challenges in coding all video ads, we selected a sample to work with. Our random sampling procedure was based on stratified sampling because the distribution of the shares across the video ads is highly skewed. Table 3.4 shows the sample quantiles of the observed shares. We see that about 10% of the ads are not shared at all, and more than 50% are shared fewer than 15 times. Using a simple random sampling procedure would result in a sample that contains a large portion of non-shared ads. Such a sample will not provide much information for identifying the drivers of virality.

In the stratified sampling, we divided all video ads into four groups and sampled from each group randomly. The break points for the four strata are based on the 50%, 75%, and 90% quantiles of the shares. We then drew 90 video ads from each group randomly, resulting in 360 video ads. After rating these video ads, we found some duplicates where the advertiser uploaded the same ad multiple times. We excluded 15 duplicates from the data and obtained the final sample of 345 video ads.
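
The stratified draw described above can be illustrated with a minimal Python sketch. It is not the code used in this study: the function names, the linear-interpolation quantile rule, the convention that a value equal to a break point falls into the upper stratum, and the fixed random seed are all assumptions made for illustration.

```python
import random
from bisect import bisect_right


def quantile(sorted_values, q):
    """Quantile of an already sorted list by linear interpolation (assumed rule)."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def stratified_sample(shares, per_stratum=90, probs=(0.50, 0.75, 0.90), seed=0):
    """Draw `per_stratum` ads at random from each stratum of the share counts.

    `shares` is a list of share counts, one per ad. Strata are bounded by the
    quantiles of `shares` at `probs`. Returns (sampled ad indices, break points).
    """
    rng = random.Random(seed)  # hypothetical fixed seed, for reproducibility only
    cuts = [quantile(sorted(shares), p) for p in probs]

    # Assign each ad to a stratum; a value equal to a break point goes to the
    # upper stratum (an assumption, the text does not specify tie handling).
    strata = [[] for _ in range(len(cuts) + 1)]
    for idx, count in enumerate(shares):
        strata[bisect_right(cuts, count)].append(idx)

    sample = []
    for members in strata:
        # Guard against strata smaller than the requested draw.
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample, cuts
```

With four strata and 90 draws per stratum, the sketch returns 360 ad indices whenever every stratum holds at least 90 ads. Heavily shared ads are deliberately over-represented relative to a simple random sample, which is the purpose of stratifying on a skewed outcome.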
| Probability | Min | 10% | 25% | 50% | 75% | 90% | Max |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Quantiles | 0 | 1 | 11 | 158 | 1,058 | 6,574 | 1,621,272 |

Table 3.4: Sample quantiles of the number of social shares.

3.3.2 Content coding

We adapted the scales from Chandy et al. (2001) to code the content of the video ads. Table 3.5 presents the scales used in this study along with some brief definitions. The following discusses the ad cues.

Content that affected the rating of the extent of arguments included the following. First, the selected ad delivers messages using logical reasoning, for example, by comparing the target brand to some competitive brand. Second, the ad makes factual claims by listing positive attributes of a product. Third, the ad offers certain benefits to users. These aspects are considered together when rating the argument of the ad. Coders used a 6-point scale to rate the extent to which the ad used these elements (0 = not at all; 5 = very strong). Raters also rated whether the ad refers to a price aspect of the product or uses promotional activities (0 = no; 1 = yes). In addition, they recorded whether the ad concerns the launch of a new product or service that does not yet exist in the current market (0 = no; 1 = yes).

For emotional cues, coders were asked to rate the extent to which the ad arouses emotions (0 = not at all; 5 = very strong). These scales were based on the ability of the ad to arouse raters emotionally. We also asked them to visualize how other viewers would respond emotionally to the ad. However, the ratings of these scales might not reflect the level of aroused emotions from the viewers. The coders rated the overall level of emotional arousal as well as the degree of individual emotions that were aroused in the ad. The individual emotions included both positive emotions (love, pride, courage, joy, triumph, warmth and excitement) and negative emotions (sadness, shame and fear).
For the negative emotions, we also included ratings on anger, disgust and hatred; however, these were not present in the sampled video ads and were thus dropped from the analysis.

| Cue | Scale | Explanation |
|---|---|---|
| Argument | 0 – 5 | To what extent does the ad use logical reasoning, factual claims or offer benefits? |
| Price | 0 = no, 1 = yes | Does the ad focus on the list price aspect of the product, or describe some deal or promotion such as coupon, rebate, percentage off? |
| New product | 0 = no, 1 = yes | Is the ad about the introduction of a new product/service that currently does not exist? |
| Emotion | 0 – 5 | To what extent does the ad arouse any emotion overall? |
| Love | 0 – 5 | To what extent does the ad arouse love? |
| Pride | 0 – 5 | To what extent does the ad arouse pride? |
| Courage | 0 – 5 | To what extent does the ad arouse courage? |
| Joy | 0 – 5 | To what extent does the ad arouse joy? |
| Triumph | 0 – 5 | To what extent does the ad arouse triumph? |
| Warmth | 0 – 5 | To what extent does the ad arouse warmth? |
| Excitement | 0 – 5 | To what extent does the ad arouse excitement? |
| Sadness | 0 – 5 | To what extent does the ad arouse sadness? |
| Shame | 0 – 5 | To what extent does the ad arouse shame? |
| Fear | 0 – 5 | To what extent does the ad arouse fear? |
| Humor | 0 – 5 | To what extent is the ad funny? |
| Surprise | 0 – 5 | To what extent does the ad run contrary to a common viewer’s prior belief or expectations? |
| Surprise location | None, Beginning, End | Where in the ad does the surprising outcome occur? |
| Suspense | 0 – 5 | To what extent does the ad build desire to know an outcome? |
| Ad length | numeric | The total duration of the ad in seconds. |
| Brand frequency | [0, 1] | Proportion of the ad during which any brand name/logo is present. |
| Brand location | None, Beginning, End, Intermittent | Where does the brand symbol appear, or where is the brand mentioned in voice, in the ad? |
| Celebrity | 0 = no, 1 = yes | Does the ad involve a celebrity, someone famous for the lives they lived or the character they played? |
| Baby/Animal | 0 = no, 1 = yes | Does the ad use babies or animals to deliver the message? |
| Cartoon | 0 = no, 1 = yes | Does the ad use cartoons to deliver the message? |
| Narration | 0 – 5 | To what extent does the ad use third party voice or text that tells the audience what is going on and what to do? |
| Dramatization | 0 – 5 | To what extent is the ad dramatized? |
| Timeliness | 0 = no, 1 = yes | Is the ad related to a contemporary event? |
| Sex | 0 = no, 1 = yes | Does the ad rely on sexual appeal? |

Table 3.5: Rating scales and definitions used in the content coding.

Humor was measured by how funny the ad appeared to the viewers. Surprise was assessed by the extent to which the ad is inconsistent with a common viewer’s prior belief. The intensity of humor and the extent of surprise were measured on a 6-point scale (0 = not at all; 5 = very strong). The location where the surprising outcome appears in the ad was identified as “none”, “beginning” or “end”.

Ad length refers to the duration of the ad in seconds, which was collected directly from YouTube. The raters counted, in seconds, the total time the brand name/logo was present or the brand was mentioned in the ad. We operationalized the duration of brand prominence as a frequency measure. That is, we normalized the duration of brand name appearance by the length of the ad to account for differences in ad length. The location where the brand name appeared was recorded as “none”, “beginning”, “end” or “intermittent”. It was coded as “intermittent” if the brand name appeared in multiple places in the middle of the ad.

Coders also identified whether any source was used to deliver the ad message. The sources we focused on were celebrities, cute sources (babies/animals), and animated sources. We created a binary indicator for each source type, with 1 indicating that a certain type was used and 0 indicating that it was not used. Multiple sources could appear in the same ad. We originally used separate indicators for babies and animals, but combined them given their limited occurrence in our sample.
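Two of the derived cues described above, the brand frequency measure and the combined baby/animal indicator, can be illustrated with a minimal Python sketch. The function names and the example values are hypothetical and are not taken from the study; the sketch only shows the arithmetic implied by the description.

```python
def brand_frequency(brand_seconds, ad_length_seconds):
    """Share of the ad's duration in which the brand is shown or mentioned."""
    if ad_length_seconds <= 0:
        raise ValueError("ad length must be positive")
    if not 0 <= brand_seconds <= ad_length_seconds:
        raise ValueError("brand time must lie between 0 and the ad length")
    return brand_seconds / ad_length_seconds


def baby_or_animal(has_baby, has_animal):
    """Combined 0/1 indicator: 1 if the ad uses a baby, an animal, or both."""
    return int(bool(has_baby) or bool(has_animal))


# Hypothetical example: the brand is present for 12 seconds in two ads.
short_ad = brand_frequency(12, 30)     # 30-second ad
long_ad = brand_frequency(12, 120)     # 120-second ad
cute_source = baby_or_animal(has_baby=0, has_animal=1)
```

In this example the same 12 seconds of brand presence gives a higher frequency in the shorter ad, which is the sense in which the measure adjusts for differences in ad length.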
The use of character and plot in the ad determined the scale of dramatization (Deighton et al. 1989). Character refers to the use of a person or personalization with distinct traits, and plot is a sequence of causal events for telling a story. Coders rated both character and plot on a 6-point scale (0 = not at all; 5 = very strong). The scale of dramatization was then defined as the average of character and plot. The use of a narration in the ad (e.g., a voice-over) was coded separately. It constitutes a third-party voice or text that describes what is going on in the ad (see footnote 1).

We included additional ad cues that may be relevant. Some ads employed suspense, which builds a desire for the viewer to know the outcome. Coders rated the extent of suspense in the ad (0 = not at all; 5 = very strong). Ads can also contain content pertinent to a contemporary event or season, such as the Winter Olympics, the World Cup, the Super Bowl, and the like. Hence, coders indicated whether the ad was (= 1) or was not (= 0) relevant to a contemporary event. Sexual appeals are commonly used in certain product categories, such as fashion and cosmetics. Coders used a 0/1 scale to indicate whether the ad contained (= 1) or did not contain (= 0) sexual appeals.

Three paid raters who were blind to the purpose of this research coded the data independently. We explained the rating scales, trained the coders using test video ads unrelated to the selected sample, discussed their results on the test cases, and clarified the definitions so that they understood the meaning of each scale.

Footnote 1: Our use and definition of “narration” follow those in Deighton et al. (1989).

Raters were provided with copies of each ad, which were downloaded from YouTube. Only the title of each video ad and the brand channel that published it were available to raters. Raters were instructed to base their ratings only on the information provided. They were instructed not to search for or gather any additional information.
Following these instructions, the three raters rated the sampled video ads independently. After all coders finished the ratings, we computed the inter-rater reliability measures. The proportion of inter-rater agreement was 0.76, and the Kappa and Tau coefficients were 0.67 and 0.63, respectively. All of these measures indicated good reliability of the coded scales. To determine the final scale for each cue used in the analysis, we set the scale of the cue to the agreed-upon value when at least two raters gave the same rating on a cue. Otherwise, we used the mean of the three ratings.

3.3.3 Exploratory analysis and descriptive statistics

The inclusion of many individual emotions may create high correlations among them. To reduce this collinearity, we ran an exploratory factor analysis using varimax rotation on the individual emotions and related cues. Table 3.6 shows the loadings for the first five factors with eigenvalues greater than one. Based on the loadings of these factors, we created three derived scales: triumph/courage, love/warmth, and sadness/fear. Each derived scale is the average of its individual items. We use these aggregated emotions in the analysis that follows.

| Variable | Factor1 | Factor2 | Factor3 | Factor4 | Factor5 |
|---|---|---|---|---|---|
| Triumph | 0.772 | -0.157 | 0.206 | -0.060 | -0.081 |
| Courage | 0.738 | -0.203 | 0.182 | -0.018 | -0.077 |
| Warmth | 0.638 | 0.175 | -0.382 | 0.181 | -0.034 |
| Love | 0.589 | 0.342 | -0.301 | 0.023 | 0.249 |
| Pride | 0.535 | -0.049 | 0.051 | -0.070 | 0.468 |
| Joy | 0.139 | 0.764 | -0.226 | -0.003 | -0.049 |
| Humor | -0.223 | 0.650 | 0.154 | 0.014 | -0.031 |
| Surprise | -0.059 | 0.612 | 0.319 | -0.066 | -0.015 |
| Suspense | 0.067 | 0.234 | 0.645 | 0.206 | 0.055 |
| Excitement | 0.041 | -0.011 | 0.555 | -0.070 | -0.037 |
| Sadness | 0.034 | -0.068 | -0.204 | 0.773 | -0.061 |
| Fear | -0.040 | 0.023 | 0.264 | 0.744 | 0.047 |
| Shame | -0.072 | -0.058 | -0.020 | 0.009 | 0.890 |

Table 3.6: Factor analysis on individual emotions and related cues.

Table 3.7 shows the frequency of each scale for all executional cues (except for the numeric ones).
The median ad length is about 60 seconds, with the first and third quartiles being about 30 and 120 seconds. The last column shows the result of a simple regression of the logarithmic shares on the rated scale of each cue.

| Binary variable | 0 | 1 | Estimate |
|---|---|---|---|
| new product | 304 | 41 | 1.460** |
| price | 319 | 26 | -1.898** |
| celebrity | 254 | 91 | 1.302** |
| baby/animal | 332 | 13 | 3.133** |
| cartoon | 317 | 28 | 0.292 |
| timeliness | 298 | 47 | 1.349** |
| sex | 332 | 13 | -0.258 |

| Numeric variable | Frequencies (in increasing order of rating level, 0 – 5) | Estimate |
|---|---|---|
| argument | 157, 36, 44, 58, 45, 5 | -0.602** |
| emotion | 190, 35, 52, 43, 21, 4 | 0.634** |
| pride | 332, 4, 4, 2, 3 | 1.180** |
| joy | 275, 19, 25, 24, 1, 1 | 0.697** |
| excitement | 308, 19, 16, 2 | 0.955** |
| love/warmth | 294, 19, 20, 7, 5 | 0.604** |
| courage/triumph | 297, 20, 16, 6, 4, 2 | 0.482** |
| shame | 343, 1, 1 | -0.061 |
| sadness/fear | 337, 5, 3 | -1.377* |
| humor | 251, 23, 27, 37, 6, 1 | 0.513** |
| surprise | 312, 3, 9, 13, 7, 1 | 1.014** |
| suspense | 330, 3, 5, 5, 2 | 0.661** |
| narration | 173, 38, 35, 24, 47, 28 | -0.300** |
| dramatization | 107, 87, 48, 66, 29, 8 | 0.478** |
| ad length | | 0.005** |
| brand frequency | | -1.220** |

| Categorical variable | none | beginning | end | intermittent | Estimate |
|---|---|---|---|---|---|
| brand location | 3 | 104 | 123 | 115 | |
| surprise location | 315 | 11 | 19 | | |

Table 3.7: The frequency table of the scales. The last column shows the estimated coefficient from the univariate regression of the logarithmic shares on each ad cue.

3.4 Results

Our empirical analysis follows the structure laid out in section 3.2.3. Our discussion in the theory section indicates that certain cues may influence social sharing by stimulating emotion in the audience or by delivering message arguments. To appropriately assess the roles of these cues, we estimated both their indirect effects through the extent of the rated emotion or the extent to which the ad uses logical arguments, and any additional direct effect they exert on social shares. For this reason, we carried out the empirical analysis in two stages. In the first stage, we examined the cues that we hypothesized would arouse emotions or those that would be associated with the use of factual arguments.
Through a mediation analysis, we investigated which of these cues exert indirect effects on social shares via emotion or argument appeal. In the second stage, we tested any additional direct effect of these cues. We also assessed the effects of individual emotions as well as other ad cues that did not influence sharing through emotion or argument. Based on the two analyses, we can draw conclusions about the important drivers of social shares and the potential ways their influence takes place.

3.4.1 Indirect effects

From Figure 3.1, we see that cues that are linked to aroused emotions include the degree of dramatization, surprise, suspense, sexual appeal, and the type of source used in the ad (celebrity, baby/animal and cartoon). Cues that are linked to the extent of argument include narration, brand frequency, and the use of a price appeal.

To assess the indirect effect of these cues on social sharing, we performed a mediation analysis, estimating a sequence of regressions following the procedure by Baron and Kenny (1986). We denote $M_1$ as the rated level of emotion and $M_2$ as the rated extent of argument, $X_1$ and $X_2$ as the independent variables that may work through $M_1$ and $M_2$, respectively, and $Q$ as all other relevant cues. We used the logarithmic shares as the response variable ($Y$) to account for skewness in the sharing data. We used the overall extent of aroused emotion judged by raters, as opposed to the extent of aroused individual emotions, as the mediator to avoid the complexity of a large number of mediators in a given model (the effects of individual emotions were examined in the second stage).
We estimated the following models:

$$Y \sim \sum_{i=1}^{7} c_{i1} X_{i1} + \sum_{i=1}^{3} c_{i2} X_{i2} + Q, \qquad (1)$$

$$M_1 \sim \sum_{i=1}^{7} a_{i1} X_{i1}, \qquad (2)$$

$$M_2 \sim \sum_{i=1}^{3} a_{i2} X_{i2}, \qquad (3)$$

$$Y \sim \sum_{i=1}^{7} c'_{i1} X_{i1} + \sum_{i=1}^{3} c'_{i2} X_{i2} + b_1 M_1 + b_2 M_2 + Q. \qquad (4)$$

In the above, $a_{i1}$ is the effect of ad cue $i$ on the aroused emotion, and $b_1$ is the effect of the aroused emotion on social shares after controlling for all other ad cues. The indirect effect of ad cue $i$ through emotion is then $a_{i1} b_1$. Similarly, $a_{i2}$ is the effect of ad cue $i$ on the rated argument, and $b_2$ is the effect of argument on social shares after controlling for all other ad cues. The indirect effect of ad cue $i$ through argument is then $a_{i2} b_2$.

To make a statistical inference from the estimated indirect effects, we used a bootstrapping approach (Bollen and Stine 1990, Preacher and Hayes 2008). That is, we randomly sampled the observations with replacement, refit the above models based on each new sample, and computed and stored the indirect effects for each sample. We ran the above steps 5000 times, which yielded 5000 estimates of the indirect effects. Bias-corrected and accelerated confidence intervals were computed based on the sample of bootstrapped indirect effects.

Table 3.8 reports the estimated effects of ad cues on the mediators ($a$) and the estimated indirect effects ($ab$) of ad cues through the mediators. Several results are noteworthy. First, greater use of narration, higher brand frequency, and the use of a price appeal are associated with a greater extent of message argument. Second, the indirect effects of narration, brand frequency, and price appeal were all significantly negative. Third, the results on the rated ad emotion suggest that greater use of dramatization, celebrity sources, and cute sources, such as babies and animals, is associated with greater emotions, as rated by the judges. Fourth, these cues also have strong and positive indirect effects on social sharing through rated aroused emotions.
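The bootstrap procedure for the indirect effects described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the analysis used in the study: it uses one cue and one mediator instead of the full set of cues and controls, synthetic data with made-up coefficients, 2000 resamples, and a plain percentile interval in place of the bias-corrected and accelerated interval. All function and variable names are hypothetical.

```python
import random


def centered_cross(u, v):
    """Sum of cross-products of two sequences around their means."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


def indirect_effect(x, m, y):
    """Estimate a*b: a from regressing m on x, b from regressing y on x and m."""
    sxx, smm = centered_cross(x, x), centered_cross(m, m)
    sxm, sxy, smy = centered_cross(x, m), centered_cross(x, y), centered_cross(m, y)
    a = sxm / sxx                                         # slope of m on x
    b = (smy * sxx - sxy * sxm) / (sxx * smm - sxm ** 2)  # slope of m in y ~ x + m
    return a * b


def bootstrap_ci(x, m, y, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the indirect effect (not BCa)."""
    rng = random.Random(seed)
    n = len(x)
    draws = []
    for _ in range(n_boot):
        # Resample observations (rows) with replacement, then refit.
        idx = [rng.randrange(n) for _ in range(n)]
        draws.append(indirect_effect([x[i] for i in idx],
                                     [m[i] for i in idx],
                                     [y[i] for i in idx]))
    draws.sort()
    lower = draws[int((1 - level) / 2 * n_boot)]
    upper = draws[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper


# Synthetic data (assumed values): cue -> mediator slope 0.8, mediator -> response slope 0.5.
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(300)]
m = [0.8 * xi + rng.gauss(0, 1) for xi in x]
y = [0.1 * xi + 0.5 * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]

estimate = indirect_effect(x, m, y)
ci_low, ci_high = bootstrap_ci(x, m, y)
```

An indirect effect is treated as significant when the interval excludes zero. The percentile interval is used here only to keep the sketch short; a bias-corrected and accelerated interval additionally adjusts the percentile cut-offs for bias and skewness in the bootstrap distribution.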
The use of sexual appeals, on the other hand, is negatively related to social shares through emotion.

| Mediator | Independent variable | Effect on mediator (a): Estimate | Effect on mediator (a): 95% confidence interval | Effect through mediator (ab): Estimate | Effect through mediator (ab): 95% confidence interval |
|---|---|---|---|---|---|
| Argument | Extent of narration | 0.278 | (0.181, 0.376)** | -0.044 | (-0.084, -0.018)** |
| Argument | Brand frequency | 0.095 | (0.003, 0.192)** | -0.015 | (-0.041, 0.000)** |
| Argument | Use of price appeal | 0.926 | (0.555, 1.297)** | -0.147 | (-0.272, -0.053)** |
| Emotion | Extent of dramatization | 0.338 | (0.188, 0.488)** | 0.031 | (0.008, 0.070)** |
| Emotion | Extent of surprise | 0.045 | (-0.096, 0.186) | 0.004 | (-0.007, 0.023) |
| Emotion | Use of celebrity | 0.301 | (0.010, 0.620)** | 0.028 | (0.001, 0.083)** |
| Emotion | Use of baby/animal | 1.578 | (0.844, 2.311)** | 0.145 | (0.035, 0.303)** |
| Emotion | Use of cartoon | -0.290 | (-0.804, 0.224) | -0.027 | (-0.088, 0.008) |
| Emotion | Use of sexual appeal | -0.582 | (-1.307, 0.143) | -0.053 | (-0.135, -0.005)** |
| Emotion | Extent of suspense | 0.000 | (-0.146, 0.146) | 0.000 | (-0.014, 0.014) |

Table 3.8: Mediation analysis for ad cues. Columns 3 – 4 report the estimated effects of the ad cues on the mediator. Columns 5 – 6 report the estimated indirect effects of the ad cues on social shares through the mediator, along with the estimated confidence intervals based on bootstrapping.

3.4.2 Direct effects

We now extend the analysis to examine the direct link between ad cues and social shares. In the above, we used the overall level of emotion for the purpose of mediation analysis. We now include the scales measuring individual emotions because different types of emotions may have a different direct effect on social shares (Berger and Milkman 2012, Nelson-Field et al. 2013). We started with a model that included all executional cues, i.e., the independent variables, the mediators, and other cues. However, high correlations among the independent variables and the mediators might potentially mask the effect of certain cues.
For this reason, we used a stepwise variable selection procedure to eliminate the insignificant independent variables. These are the cues that are either completely mediated by argument or emotion focus or do not influence social shares.

Substantial brand differences may affect the popularity of the ad. We controlled for such brand differences in two ways. First, we included the logarithm of channel subscribers to control for the observed popularity heterogeneity. Second, we used a brand-level random effect to account for any unobserved heterogeneity. That is, our analysis relied on a mixed-effects model of the logarithmic shares over the rated scales (see footnote 2). We standardized the response and the numeric scales so that the magnitude of the coefficients can be compared. No scaling was applied to binary or categorical covariates. The estimated coefficients together with the estimated standard deviations and p-values are reported in Table 3.9.

Several results are noteworthy. First, the coefficient for the number of subscribers is positive and significant, implying that the video ads from a brand channel with more subscribers tend to generate more shares. The number of subscribers reflects the exposure the ad can generate without word-of-mouth. More exposure leads to more shares, ceteris paribus. Further, subscribers are more likely to share the ad if subscription to the channel implies their liking of the brand and channel.

Second, the extent to which the ad uses arguments is significantly and negatively related to the virality of video ads. This implies that ads focusing on facts are generally less likely to be shared. Such ads may be perceived as discouraging to self-enhancement capital, as individuals are motivated to watch video ads primarily for entertainment and social connection (vs. information acquisition) purposes. Ads introducing new products are generally shared more often, as the main effect of new product was positive and significant.
The interaction between the argument focus of the ad and the new product indicator was also positive and significant. Based on this, we can compute the estimated effect of argument in ads with new product introduction. Table 3.10 shows that this effect is positive and marginally significant. The result indicates that for new products, greater use of arguments in ads could indeed facilitate social sharing. This may be because the information about new products is likely to be novel, valuable and thus of greater benefit to recipients.

Footnote 2: We actually estimated the model using MCMC to produce better estimates of the uncertainty. The p-value in Table 3.9 is based on the posterior distribution of parameter estimates.

| Variable | Mean | SD | p-value |
|---|---|---|---|
| Intercept | -0.636 | 0.417 | 0.133 |
| log(subscribers) | 0.378 | 0.059 | 0.000** |
| Extent of Argument | -0.103 | 0.053 | 0.047** |
| New product | 0.379 | 0.125 | 0.001** |
| Argument * New product | 0.278 | 0.113 | 0.015** |
| Extent of love/warmth | 0.114 | 0.050 | 0.024** |
| Extent of pride | 0.140 | 0.042 | 0.000** |
| Extent of courage/triumph | 0.038 | 0.045 | 0.406 |
| Extent of joy | 0.058 | 0.046 | 0.189 |
| Extent of excitement | 0.131 | 0.042 | 0.000** |
| Extent of sadness/fear | -0.051 | 0.038 | 0.192 |
| Extent of shame | 0.003 | 0.038 | 0.940 |
| Extent of humor | 0.166 | 0.050 | 0.001** |
| Extent of surprise | 0.156 | 0.081 | 0.051* |
| Surprise none | 0 | – | – |
| Surprise beginning | -0.430 | 0.338 | 0.184 |
| Surprise end | 0.168 | 0.297 | 0.574 |
| Brand none | 0 | – | – |
| Brand early | 0.481 | 0.416 | 0.248 |
| Brand end | 0.802 | 0.415 | 0.056* |
| Brand intermittent | 0.508 | 0.415 | 0.228 |
| Ad length | 0.093 | 0.045 | 0.043** |
| Ad length^2 | -0.080 | 0.034 | 0.017** |
| Use of celebrity | 0.270 | 0.094 | 0.004** |
| Use of baby/animal | 0.394 | 0.215 | 0.057* |
| Timeliness | -0.136 | 0.136 | 0.317 |

Table 3.9: Estimated direct effects. We started with the mixed-effects model of logarithmic shares over all the ad cues included in this study. Ad cues completely mediated by emotion or argument were dropped to obtain better estimates of the other cues.
The $R^2$ statistic is 0.51 when calculated using only the fixed effects (ad cues), and 0.66 when calculated using both the fixed effects (ad cues) and random effects (channel effects). $R^2 = 1 - \mathrm{Var}(\text{Residuals})/\mathrm{Var}(\text{Response})$.

Third, looking at the estimates for the individual emotion types, we found that ads with positive emotions tend to generate more shares. Ads arousing stronger emotions are also more likely to be shared. Among the different types of emotions, ads that evoke pride, love/warmth, and excitement are most likely to be shared. None of the coefficients for the negative emotions were significant in the analysis. From Table 3.7, we see few ads that evoke negative emotions. This may be because advertisers have already anticipated the positive effect from positive emotions. A more conclusive analysis would require the inclusion of more ads rated as arousing negative emotions.

| Contrast | Mean | SD | p-value |
|---|---|---|---|
| Extent of argument for new product | 0.175 | 0.112 | 0.055* |
| Surprise end – Surprise beginning | 0.598 | 0.271 | 0.015** |
| Brand end – Brand early | 0.321 | 0.101 | 0.000** |
| Brand end – Brand intermittent | 0.294 | 0.102 | 0.001** |

Table 3.10: Contrast analysis based on the estimated parameters in Table 3.9. The first row shows the effect of argument in an ad that introduces a new product, that is, the main effect of argument plus the interaction effect in Table 3.9.

Fourth, humor and surprise play important roles in driving virality. Funny and surprising ads are significantly more likely to be shared on the networks. This may be because they contain elements that are unexpected and interesting. In addition, the location of surprise matters (“Surprise none” is set to be the reference level). Table 3.10 reports the estimated contrast between end and beginning placements. The estimate is positive and significant, indicating that if social sharing is the objective of communication, holding the surprise component until the end of the ad is better than placing it earlier.
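As a quick arithmetic check, the contrast means in Table 3.10 can be recovered from the coefficient means in Table 3.9. The short Python sketch below does only that; the dictionary keys are hypothetical labels chosen for illustration. The SDs and p-values of the contrasts depend on the joint posterior distribution of the parameters and cannot be reproduced from the tabulated means alone.

```python
# Coefficient means copied from Table 3.9 (the key names are illustrative labels).
coef = {
    "argument": -0.103,
    "argument_x_new_product": 0.278,
    "surprise_beginning": -0.430,
    "surprise_end": 0.168,
    "brand_early": 0.481,
    "brand_end": 0.802,
    "brand_intermittent": 0.508,
}

contrasts = {
    # Effect of argument in an ad for a new product: main effect plus interaction.
    "argument_for_new_product": coef["argument"] + coef["argument_x_new_product"],
    # Differences between levels of the same categorical cue.
    "surprise_end_vs_beginning": coef["surprise_end"] - coef["surprise_beginning"],
    "brand_end_vs_early": coef["brand_end"] - coef["brand_early"],
    "brand_end_vs_intermittent": coef["brand_end"] - coef["brand_intermittent"],
}

for name, value in contrasts.items():
    print(f"{name}: {value:.3f}")
```

Because every level of a categorical cue is measured against the same reference level, the reference cancels in each difference, so a contrast between two non-reference levels can be significant even when neither level differs significantly from the reference.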
Fifth, the location of brand appearance seems to have an influence (“Brand none” is set to be the reference level). Table 3.10 shows the contrasts of “Brand end” against “Brand beginning” and “Brand intermittent”. Based on the estimates of the contrasts, it is evident that showing the brand at the end of the ad is significantly better than placing it at the beginning or intermittently for the purpose of promoting social shares. This result is contrary to the existing finding (Teixeira et al. 2010) that the best strategy for placing brand names is pulsing.

Sixth, the length of the ad is also crucial. We operationalized ad length on the logarithmic scale and included a quadratic term to test for potential non-linearity. Both the linear and quadratic terms were significantly different from zero. The coefficient of the quadratic term was negative, which implies an inverted U shape between social shares and ad length. One disadvantage of the quadratic polynomial is that it implies a symmetric relationship that may be too strict. For this reason, we replaced the quadratic polynomial of ad length with a penalized spline term (Eilers and Marx 1996), keeping other aspects of the model unchanged. Using a nonparametric spline function of ad length allows flexible patterns between shares and ad length to be estimated from the data. A penalty on the spline coefficients was imposed to avoid overfitting.

Figure 3.2 shows the estimated relationship between social shares and ad length from the penalized spline model. It still displays an inverted U shape, with the maximum value achieved at 1.15 minutes. From the figure, we see that ads between 1 and 1.5 minutes are more likely to be shared. The asymmetry of the curve indicates that, compared to very short ads, consumers are more likely to share long ads. For example, based on the estimates, a two-minute ad is three times more likely to be shared than a 15-second ad. The 15-second ads are the least shared among the ads of different lengths.
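To make the symmetry restriction of the quadratic specification concrete, the following Python sketch locates the peak implied by the linear and quadratic ad length coefficients in Table 3.9 and checks that the fitted effect is the same at equal distances on either side of it. It assumes that the ad length term is the standardized logarithm of ad length, as the standardization described earlier suggests; the function names are hypothetical. The peak is therefore expressed in standard deviations of log length and cannot be converted to minutes here, because the sample mean and standard deviation of log length are not reported.

```python
def quadratic_peak(linear, quadratic):
    """Location of the maximum of linear*z + quadratic*z**2 (needs quadratic < 0)."""
    if quadratic >= 0:
        raise ValueError("an inverted U requires a negative quadratic coefficient")
    return -linear / (2 * quadratic)


def quadratic_effect(z, linear, quadratic):
    """Fitted contribution of ad length at standardized log length z."""
    return linear * z + quadratic * z ** 2


LINEAR, QUADRATIC = 0.093, -0.080  # ad length and ad length^2 means from Table 3.9

peak_z = quadratic_peak(LINEAR, QUADRATIC)

# One unit below and one unit above the peak give the same fitted effect:
# this is the symmetry that the penalized spline relaxes.
below = quadratic_effect(peak_z - 1.0, LINEAR, QUADRATIC)
above = quadratic_effect(peak_z + 1.0, LINEAR, QUADRATIC)
```

A penalized spline replaces the two polynomial terms with a flexible smooth function, so the fitted curve is free to fall more steeply on one side of the peak than on the other.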
[Figure 3.2: Estimated relationship between social shares and ad length (horizontal axis: ad length in minutes; vertical axis: effect). The dashed line indicates the optimal ad length based on this estimate.]

Lastly, both celebrity endorsers and baby/animal sources still have significant and positive effects on social shares beyond their positive indirect effects through emotion. This may be because the use of celebrities can capture attention and make the ad interesting. Figure 3.3 summarizes the results from the two analyses of indirect and direct effects.

3.5 Discussion and conclusions

Advertising on YouTube using video ads has gained increasing interest over the years. This new medium offers advertisers great opportunities to create effective ad campaigns at relatively low cost. The primary distinction from traditional TV advertising is that the exposure to the video ads is generally voluntary and driven largely by social sharing. Advertisers seek to design video ads that produce interest and stimulate social shares. Therefore, it is of great importance to better understand the relationship between various executional cues that advertisers can control and the shares on social networks. We address this question by collecting social sharing data on a large number of video ads and rating the content of the collected video ads. We provide possible explanations of identified drivers of virality based on the theory of self-enhancement and prior findings on attitude/persuasion.

[Figure 3.3: Identified drivers of virality and the way they influence social shares.]

Our major empirical findings are as follows. The use of argument appeals positively influences sharing only when the ad relates to a newly introduced product; otherwise, its effect is negative. Emotional appeals are more effective than are argument appeals at promoting sharing. In particular, high ratings of love, warmth, pride, and excitement stimulate social sharing.
Ads rated as high in surprise have a strong positive effect on social shares. The location of surprise also matters. A surprising end evokes more sharing compared to a surprising beginning. A related finding is that humorous ads are generally more likely to be shared. Further, evidence suggests that brand prominence is influential. The frequency with which brand names are shown in the ad has a negative effect. Late placement of brand names and logos enhances sharing propensity compared to early display. Pulsing brand names through the ad does not increase virality. The relationship between social shares and ad length is characterized by an asymmetric inverted U curve, with ads between 1 and 1.5 minutes being most likely to be shared. Lastly, the type of source influences sharing. Celebrities, babies, and animals have strong positive effects on social shares, both directly and indirectly through their effects on emotions. These results are robust; the other variables are either not significant or not robust.

3.5.1 Implications

These results provide important implications for the current practice of advertising via video ads. First, in our sample, about 55% of the ads (see Table 3.7) used argument appeals to deliver ad messages and did not use emotional appeals. This number would likely be even higher had we not used a stratified sample to eliminate ads that failed to be shared. However, our results imply that the use of argument appeals is in general negatively correlated with social shares and is only effective for new product introductions. Emotional appeals with positive emotions are generally more effective in driving social sharing. To engage consumers in more social sharing, advertisers may be better off reducing logical reasoning and factual claims shown in the ad, except when new products are introduced. Emotional appeals may be used as the main approach for generating maximum social sharing.
In the current sample, 45% of the ads were rated as using emotional appeals, and only 7% of the ads were rated as emotionally strong appeals (rated emotion scale ≥ 4).

Second, our study found that the use of babies/animals and strong drama are effective in arousing emotions and creating social shares. This does not seem to be widely understood by advertisers: fewer than 3% of the ads in the sample used babies/animals as main sources. In comparison, more than 26% of the ads use celebrities as endorsers. While the use of celebrities is effective in generating shares, it can be costly. In comparison, babies and animals are much less expensive. Appropriate use of these sources can help achieve a higher return on the ad campaign. This can be a viable strategy for smaller companies with tight budget constraints on marketing.

Third, advertisers can take certain simple actions to improve social sharing. According to the results, it is better to place the brand name or logo at the end of the ad. Currently, only 30% of the ads in the sample used late placement. Additionally, the length of the ad is easy to control. Our results suggest that the most shared ads are generally between 1 and 1.5 minutes. In contrast, about 50% of the ads were shorter than 1 minute and about 25% were longer than 2 minutes in the sample. While the length of the ad can improve storytelling, it can also detract from the viewing experience. Viewers of ads are often impatient, and an ad that is too long can become less engaging. A short ad may not be able to tell a full story to arouse strong emotions, and it may become overwhelming if the rate of events in the plot is too fast. In designing the ad, it can be fruitful to manage the length of the ad so that it tells an interesting story but still sustains viewers’ interest.
Lastly, based on the standardized coefficients in Table 3.9, it appears that the extent of humor and the extent of surprise are the two largest effects among the numeric scales. For improving social sharing, it can be helpful to first explore the use of humor and surprise in designing video ads. In the current sample, only 28% of the ads used humor and 10% elicited surprise.

| Cues | Recall, Attitude, Purchase intention | Sales | Shares |
|---|---|---|---|
| Argument | + (main effect, high involvement consumer and product, low market-share brand) | + (main effect, new market) | – (main effect); + (new product) |
| Emotion | + (main effect, low involvement product, positive emotion); – (negative emotion) | + (main effect, older markets) | + (love/warmth, pride, excitement) |
| Humor | + (main effect, favorable prior brand evaluation, low involvement product, existing product) | | + (main effect) |
| Surprise | | | + (main effect, end placement) |
| Celebrities | + (main effect); – (negative celebrity information) | | + (main effect) |
| Cute sources (babies/animals) | | | + (main effect) |
| Brand prominence | + (beginning, end, intermittent); + (length) | | – (length); + (end) |
| Ad length | 30s is more effective than 15s for emotional ads and as effective as 15s for informational ads | + (long ads) | asymmetric inverted U: best length is between 1 and 1.5 minutes |
| Narration | + (through argument) | | – (through argument) |
| Dramatization | + (through emotion) | | + (through emotion) |
| Sexual appeal | mixed results: depends on products and consumer segments | | – (through emotion) |
| Suspense | + (main effect) | | no effect found |

Table 3.11: Summary of the findings of various ad cues on different measures of consumer responses. The notation of + and – indicates a positive or negative effect, respectively.

3.5.2 Comparison to existing findings

YouTube represents a new advertising medium. In this new context, the success of video ads depends primarily on the social shares generated by consumers.
The effects of various ad cues on social shares may be different from those on recall, attitude, purchase intention and product sales that have been found in the context of traditional advertising. Table 3.11 summarizes and compares the findings of ad cues on the various measures of ad effectiveness. Several points are noteworthy.

First, the use of argument appeal has been found to positively affect recall, attitude, purchase intention, and sales in many studies. For video ads, however, it is generally negatively related to social shares. The use of argument appeal is only effective for ads that introduce new products.

Second, the existing research has investigated the role of surprise as a driver of humor and other emotions. Our result on the effect size of the examined ad cues indicates that surprise can be highly effective compared to other ad cues, even after controlling for the effect of humor.

Third, for YouTube videos, there is almost no restriction on the length of the ad. Advertisers tend to use longer ads on YouTube compared to traditional advertising on TV. The location of surprise has been found to be material in this context, with end placement being more effective in driving social shares compared to early placement.

Fourth, another important difference between our finding and those from the existing studies concerns the effect of ad length. Because of the high cost of TV advertising, ads are historically short. The existing findings may not be informative for YouTube video ads, which can be much longer. Indeed, we found an asymmetric inverted U shape between social shares and ad length. This contrasts with prior findings that generally support longer ads.

Fifth, prior research on the location of brand names has been inconclusive, with evidence in favor of brand placement at the beginning, at the end, or intermittently. Our research provides the first investigation regarding the effect of brand name placement on social shares.
Our results suggest that displaying the brand name at the end is preferred to other locations when the goal is to drive social shares.

3.5.3 Future research

The success of an ad campaign is often measured by both the exposure and the persuasiveness of the ad. In this essay, we focused on how different ad cues affect the ad exposure driven by social shares. Another important topic is the effect of executional cues on the persuasiveness of the ad in the YouTube context (e.g., Tucker 2015). Cues that generate social shares may have less effect on persuasion. The roles of the cues in persuasion can be different from those they play in stimulating social shares. Finally, it will be interesting to examine how the executional cues influence the ad return measured by product sales, taking into account both exposure and persuasiveness.

Chapter 4 Quantifying the Effects of YouTube Video Ads

4.1 Introduction

YouTube has become a new medium for advertising because of its popularity and relatively low cost. Advertisers can publish unlimited video ads and enjoy the “free” brand exposures generated from ad views. However, the success of an advertising campaign is measured beyond short-term brand exposures. Managers may wish to understand how the campaign affects consumers’ behaviors following exposure. For example, it may be more relevant to consider the effect on the top or bottom line of the company when planning future campaigns.

Quantifying the ad effect beyond exposures can be challenging. A recent survey (see footnote 1) of chief marketing officers found that nearly half said they were not able to quantify the impact of social media on their companies, while 36 percent said they had a good sense of qualitative, though not quantitative, results. The first objective of this study is to provide quantitative measures of the effect of video ads. This enables advertisers to quantify the effect of a video ad and identify the more effective ones.
Advertisers can further promote the effective ads on YouTube or air them on TV to get better returns on the ad investment.

One strategy that has gained increasing interest in both marketing research and advertising practice is to make viral ads. Through viral ads, brands can gain enormous exposure at very low cost. Existing studies in marketing have examined how to make online content go viral (e.g., Berger and Milkman 2012). However, little is known about the effect of viral ads beyond short-term exposures compared to non-viral ones. Our second goal is to investigate whether viral ads are also more effective than non-viral ads in driving consumers’ behaviors (e.g., subscribing to the channel).

Two major methods that advertisers commonly use to persuade viewers are appealing through message arguments and through evoking viewers’ emotions (Tellis 2004). Existing research has used TV ads to study the effect of different appeal methods on attitude toward the ad and brand, purchase likelihood, and actual sales. However, these findings may not extend to YouTube video ads because of the substantial differences in the context of exposure and the medium of ad delivery. So, our third objective is to provide a fresh understanding of the relative effectiveness of argument and emotional appeals in video ads.

[1] Source: http://www.forbes.com/sites/dorieclark/2013/09/12/cmos-on-social-media-wheres-the-roi/

We scrape the YouTube website to collect the hourly views of video ads and subscriptions to channels for 21 brands. We then build a dynamic model that links the views of video ads to the channel subscriptions of the brand. This allows us to quantify the effect of a view on channel subscription, identify the effective ads, compare the effect of viral ads to that of non-viral ads, and test the effectiveness of different appeal methods. Our main results are the following.
First, the empirical analysis of 152 video ads suggests that every 1000 views create about 5.8 channel subscriptions. Second, we find strong evidence that viral ads generate more channel subscriptions than non-viral ads. Third, our analysis supports the use of emotion as a more effective appeal method for stimulating channel subscriptions.

The rest of the essay is organized as follows. Section 4.2 reviews the literature and provides the motivation for the study. Section 4.3 presents the proposed dynamic model. Section 4.4 describes data collection and sampling. Section 4.5 reports the estimation results. Section 4.6 provides concluding comments and lists limitations and directions for future research.

4.2 Literature

In this section, we first review the related literature on ad effectiveness. We then discuss the possible effect of virality, the effects of different appeal methods, and the various measures of ad effect.

4.2.1 Research on ad effectiveness

A large number of studies in the marketing literature have investigated the effect of advertising. Such studies relate advertising measures such as ad spending, gross rating points, or ad exposures directly to measures of consumers’ purchasing behavior such as sales, market share, and brand choice (Vakratsas and Ambler 1999). These studies have examined the elasticity of advertising (Assmus et al. 1984), the carryover effect of advertising (Bass and Clarke 1972), the shape of the advertising response function (Vakratsas et al. 2004), and advertising wearin and wearout (Bass et al. 2007).

However, most of the existing research and findings are based on TV advertising. Consumers are forced to watch TV ads unless they choose to avoid them through zipping or zapping. In contrast, consumers are generally not exposed to a YouTube ad unless they choose to view it. By choosing to watch the ads on YouTube, consumers are more likely to attend to the ad content and process the messages conveyed in the ad.
For this reason, video ads may be more effective at driving consumers’ behaviors.

Another major difference is that the number of exposures of YouTube video ads is mainly driven by consumers’ sharing of the ads through social networks. A share from a social network friend functions as a recommendation of the ad, product, or brand. It can increase the persuasive power of the ad and affect the viewers’ attitude favorably. Prior research has found that word-of-mouth can influence consumers’ behaviors and product sales (Chevalier and Mayzlin 2006, Liu 2006).

Because of these differences, consumers may respond differently to YouTube video ads than to TV ads. This may lead to effects of video ads that are different from those found in prior research using TV ads.

4.2.2 Effect of virality

There has been growing interest in virality research in marketing. Several studies in marketing have examined the factors that make online content viral (e.g., Berger and Milkman 2012, Nelson-Field et al. 2013). However, few studies have investigated the effect of virality. Undoubtedly, viral content attracts substantial consumer attention and generates enormous brand exposure. But to what extent can viral ads affect the brand beyond ad exposures, e.g., by prompting exposed consumers to subscribe to the channel, visit the website, or purchase the product?

On one hand, viral ads may be expected to create a greater effect than nonviral ones. First, viral ads generate more exposures. If ads are equally effective in driving consumers’ behaviors, viral ones will have a larger impact. Second, viral ads often generate substantial word-of-mouth such as social shares. Consumer-generated word-of-mouth may spur product sales (Chevalier and Mayzlin 2006, Liu 2006). Third, viral content may create a favorable attitude toward the brand, which may lead to more engagement with the brand.

On the other hand, more exposures of the ad may not necessarily correspond to an increase in revenues (Tellis and Weiss 1995).
In addition, viral ads may be less persuasive. Indeed, in the context of YouTube video ads, Tucker (2015) finds that on average, video ads that have received one million more views are 10% less persuasive in terms of purchase intent. For these reasons, it is unclear whether viral ads can be more effective than nonviral ones in changing the behavior of consumers.

4.2.3 Effect of appeal methods

Two major methods that advertisers commonly use to persuade viewers are appealing through message arguments and through evoking viewers’ emotions (Tellis 2004). Message arguments persuade a viewer by using logical reasoning and factual claims that credibly convey the benefits of the brand to the user. Ads that appeal through emotions employ various tactics to arouse emotions in the audience.

[Figure 4.1: Metrics that may be used for measuring the return of ad campaigns.]

Extant research finds that emotions that are positively valenced and high in arousal are effective at making ads viral (Berger and Milkman 2012, Nelson-Field et al. 2013). The use of argument appeal generally has a negative impact on virality (second essay). However, little is known about the effect of the different appeal methods used in YouTube video ads on channel subscriptions, product sales, or stock returns. While emotional appeals may be more important in driving virality, they may be less effective than arguments at driving consumers’ behaviors (Golden and Johnson 1983, Millar and Millar 1990). A recent study by Teixeira et al. (2014) finds that entertainment in advertisements has an inverted U-shaped relationship with purchase intent. For these reasons, a video ad that appeals through emotions and focuses on entertainment can be less effective in driving purchases than a less popular ad using arguments.

4.2.4 Measures of ad effect

For video ads on a social media platform, a variety of metrics can be used for measuring the effect of an ad campaign.
They include measures of ad exposure (video views), ad liking (likes, shares, ratings, etc.), pre-conversion activities (e.g., web visits), consumer conversion (actual sales), and company performance (stock returns). Figure 4.1 summarizes the list of potential metrics.

Video-ad-level metrics such as views, likes, and shares provide important information on brand exposure, word-of-mouth, and consumer sentiment. They can often be tracked for each video ad. However, managers may be more interested in understanding how the ad affects consumers’ behaviors toward the product or brand. For example, it may be more relevant to know whether the ad “converts” consumers by prompting them to subscribe to the channel, sign up for the email list, or purchase the product. These measures of conversion are often not observed at the ad level. They can also be influenced by multiple ads. As a result, the assessment of the ad-level return using these measures must be based on an appropriate attribution of the observed outcomes to each ad. Such attribution can be very important for measuring the performance of video ads on YouTube, because advertisers tend to release video ads frequently due to the low cost incurred.

In this essay, we focus on the metric of channel subscribers on YouTube. We adopt this metric for several reasons. First, it provides a measure of conversion beyond ad exposure, liking, and sharing. Subscribing to the brand channel means the consumer is willing to receive more messages from the brand in the future. It signals a favorable brand attitude and may be indicative of future actions such as product purchase. Second, it is publicly available on YouTube. In contrast, product sales information is mostly private, and stock prices are not available for companies that are not publicly traded. Channel subscriptions enable us to make statistical inferences using a large number of brands, making the results and conclusions more robust.
A third advantage of using subscribers is that the data are available by day or by day part. We can therefore build dynamic models at a more granular level for the purpose of attribution. To our knowledge, no prior research has analyzed the effect of online marketing on channel subscriptions.

4.3 Model

In this section, we describe a dynamic model that links the subscribers of a channel to the views of video ads in a given period. To fix ideas, we focus on a single brand channel. Suppose $y_t$ is the number of incremental channel subscribers and $x_{it}$ is the number of incremental views of video ad $i$ at time $t$. If the $i$th video ad has not been uploaded by time $s$, then $x_{it} = 0$ for $0 \le t \le s$. The number of video ads uploaded in the observation period is $I$. The number of incremental channel subscribers is linked to the number of exposures of each video ad dynamically as follows:

$$y_t = \mu_t + \epsilon_{y,t}, \qquad \mu_t = \lambda \mu_{t-1} + \sum_{i=1}^{I} x_{it}\beta_i + \epsilon_{\mu,t}. \tag{1}$$

In the above, $\mu_t$ is the expected number of incremental subscribers at time $t$. It is composed of the carryover effect $\lambda\mu_{t-1}$ as well as the effect due to views of the focal video ads in the current period, $\sum_{i=1}^{I} x_{it}\beta_i$. The coefficient $\beta_i$ measures the effectiveness of video ad $i$ in generating immediate new subscriptions to the channel, so that $x_{it}\beta_i$ is the expected number of subscriptions due to viewing the $i$th video ad at time $t$. The coefficient $\lambda \in (0, 1)$ measures the rate of carryover from the last period. The error terms are specified as $\epsilon_{y,t} \sim N(0, \sigma_y^2)$ and $\epsilon_{\mu,t} \sim N(0, \sigma_\mu^2)$. The dynamic model is completed by specifying the initial distribution of the series, e.g., $\mu_0 \sim N(0, 10^4)$. This specification resembles the Koyck model and allows multiple ads to be included.

Priors. We now specify the priors for all parameters in the model. For the effectiveness parameters, we employ the spike-and-slab priors (Ishwaran and Rao 2005) as follows:

$$\beta_i \sim (1 - q_i)\, I_0 + q_i\, TN_{(0,\infty)}(0, \tau_i^2). \tag{2}$$

[Figure 4.2: Illustration of the spike-and-slab prior.]
In the above, $I_0$ denotes the degenerate distribution at the origin, $TN_{(0,\infty)}(0, \tau_i^2)$ denotes the normal distribution with mean zero and standard deviation $\tau_i$ truncated to the interval $(0, \infty)$, and $q_i \in [0, 1]$ is the prior probability that the parameter is nonzero. Both $q_i$ and $\tau_i$ are pre-specified constants. This is a mixed-type prior that has a probability mass at the origin accompanied by a continuous distribution on the positive real line, thus restricting the parameters to be non-negative. Figure 4.2 illustrates the shape of the above spike-and-slab prior.

We adopt this spike-and-slab prior for the following reasons. First, it is intuitively appealing. An advertisement is usually classified as effective or ineffective. For example, Vakratsas et al. (2004) specify a model that allows the ad to switch between an effective and an ineffective state. Unless the ad is very controversial or arouses extremely negative emotions, it is unusual to expect negative effects. Indeed, most YouTube ads are designed to create spontaneous viewing and sharing of the ad, and in our data, fewer than 5% of the observations correspond to negative subscriptions. Second, when a channel publishes many ads in a short time period, the resulting time series of ad views may be highly correlated, which can produce spuriously negative estimates. Third, allowing a probability mass at the origin enables the identification of effective ads: they are the ads with a high posterior probability of a nonzero coefficient. Simply restricting all coefficients to be positive, e.g., using a LogNormal distribution, prohibits the identification of effective ads. Lastly, a non-negative effectiveness measure facilitates the attribution of the aggregate subscriptions to each video ad (see below).

We assign diffuse hyperprior distributions to the other parameters, e.g., a uniform prior for $\lambda$ and non-informative inverse gamma distributions for the variances $\sigma_y^2$ and $\sigma_\mu^2$.
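As an illustration only, the model in (1) and the prior in (2) can be sketched in Python with NumPy. This is a minimal simulation under stated assumptions, not the estimation code used in this essay. The names `beta`, `lam`, `tau`, `q`, and `mu0` are our own labels for the effectiveness parameters, the carryover rate, the slab standard deviation, the prior inclusion probability, and the initial level, and the numerical settings are hypothetical. The last lines check a property implied by the carryover structure: with no noise, a one-time pulse of views produces a geometric series of expected subscriptions whose sum is the immediate effect divided by one minus the carryover rate.

```python
import numpy as np


def draw_spike_and_slab(q, tau, size, rng):
    """Draw effectiveness values from the spike-and-slab prior in (2).

    With probability 1 - q a draw is exactly zero (the spike); otherwise it is
    |N(0, tau^2)|, i.e. a mean-zero normal truncated to the positive real line.
    """
    is_nonzero = rng.random(size) < q
    slab = np.abs(rng.normal(0.0, tau, size))
    return np.where(is_nonzero, slab, 0.0)


def simulate_channel(x, beta, lam, sigma_y, sigma_mu, rng, mu0=0.0):
    """Simulate the dynamic model in (1) for one channel.

    x has shape (T, I): incremental views (in thousands) per period and ad.
    Returns the observed subscriptions y and the latent expected level mu.
    """
    n_periods = x.shape[0]
    mu = np.empty(n_periods)
    previous = mu0
    for t in range(n_periods):
        # Carryover from the last period plus the effect of current views.
        mu[t] = lam * previous + x[t] @ beta + rng.normal(0.0, sigma_mu)
        previous = mu[t]
    y = mu + rng.normal(0.0, sigma_y, n_periods)
    return y, mu


rng = np.random.default_rng(0)

# Hypothetical prior settings: half of the ads are expected to be effective.
beta_prior_draws = draw_spike_and_slab(q=0.5, tau=2.0, size=10_000, rng=rng)

# Noise-free pulse: 1,000 views of a single ad in the first period only.
pulse = np.zeros((200, 1))
pulse[0, 0] = 1.0
_, mu_path = simulate_channel(pulse, np.array([2.3]), 0.6, 0.0, 0.0, rng)
long_run = mu_path.sum()  # geometric series, close to 2.3 / (1 - 0.6)
print(round(long_run, 2))
```

Taking the absolute value of a mean-zero normal draw is one simple way to sample the truncated normal slab; it relies on the symmetry of the normal distribution around zero and would not carry over to a slab with a nonzero mean.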
4.4 Data

4.4.1 Data sampling

Our unit of analysis is the YouTube channel. We selected a sample of 21 channels from all the channels using the following process. First, for estimating the virality effect, the channel should have at least one video ad that is viral in the observation period. Following Eckler and Bolls (2011), we classify a video ad as viral if it generates more than 1 million views. Only 44 channels satisfy this criterion. Second, we exclude channels that have uploaded an exceedingly large number of videos. For example, one channel uploads about three videos per day. After the above selection process, we arrive at a sample of 21 branded channels, with a total of 152 video ads, 43 of which are viral. The number of video ads for each channel is shown in Table 4.1.

While our time series is at the hourly level, further inspection of the channel subscription data indicates that this information may not be updated hourly. It may be helpful to aggregate the time series to avoid the collection bias. We thus aggregate the data into four day parts: post late (0 – 6), morning (6 – 12), afternoon (12 – 18), and night (18 – 24). In what follows, the views of an ad and the subscriptions of a channel both refer to incremental changes within these day parts.

4.4.2 Descriptive statistics

The number of observations and sample statistics of subscriptions for each channel are reported in Table 4.1. Figure 4.3 shows the time series of the observed incremental channel subscriptions for two sample channels. There are several things to note. First, by simply eyeballing the patterns corresponding to the first viral video of each channel and comparing them to the patterns of the non-viral ones, one may reasonably expect that virality creates more value in terms of subscriptions. Second, however, not all viral videos seem to bring in abnormally many subscriptions, as signaled by the second viral video in (b).
Third, there are many situations in which several videos are closely associated with the same period of abnormal subscriptions. Assessing the number of subscriptions created by a video must take into account the effect of other videos.

                                              Subscriptions
Channel            # Videos  # Observations   Min   Mean  Median    Max
Amazon                    7             413    -5     46       5   2575
Budweiser                 4             390   -19    131       3   3327
Butterfinger              2             209    -6      2       0    126
Chrysler                  8             290    -6     20       2   1016
Dove                     12             402   -23     16       6    255
Google                    8             370     0    636     624   2839
H&M                       7             189    -5     18      11    149
Hyundai Motors            5             396    -4      5       3     48
Kia Motors                6             226    -5     13       6    126
Kmart                     7             419    -7     10       1    487
M&M                       3             213    -6     15       6    378
Microsoft                 6             326   -11     52      24   1861
MINI                      5             476    -4     10       7    113
Nike                     15             398   -68    119      65   3116
Old Spice                 4             231  -190     70      39    762
Pepsi                    16             361  -268     57      26   1161
Procter & Gamble         10             426    -8     39      12   2584
Samsung                   8             431   -10     42      13    723
Visa                      8             390    -7     11       2   1152
Volkswagen                6             250   -56    283      40   9808
Walmart Stores            5             366    -5      3       1     33

Table 4.1: Descriptive statistics for the channels included in the study.

4.4.3 Content coding

The details of the content coding are described in Essay 2. While we could examine the influence of all ad cues on channel subscriptions, we incorporate only the two rating scales of most interest for the current essay. This is because only 70 of the ads in the current study were rated. The relatively small sample may lower the statistical power when examining a large number of ad cues. Specifically, we use the rating of the overall emotion (0 = none; 5 = very strong) to indicate the extent to which the ad arouses any emotion. The rating of argument (0 = none; 5 = very strong) measures the extent to which the ad uses logical reasoning, makes factual claims, or offers benefits.

4.5 Estimation

This section describes the results from estimating the above dynamic model using the sample data. The statistical inference was carried out using Markov chain Monte Carlo methods (Gelman et al. 2003).
The details of the algorithm are derived in Appendix B. The critical step is to draw the effectiveness parameter $\beta_i$ from its posterior, which is based on the complete blocking method of Geweke (1996). In drawing the samples from the posterior distributions via MCMC, we ran 50,000 iterations in three parallel chains, discarding the first 30,000 iterations as burn-in, by which point approximate convergence was achieved. To reduce autocorrelation, we used every 20th iteration of each chain. This process resulted in 1000 simulated draws for each model, on which we base our inferences about the quantities of interest.

[Figure 4.3: Plot of the observed incremental channel subscriptions for two sample channels, panels (a) and (b); horizontal axis: time, vertical axis: incremental subscribers. The arrows on the axis mark the uploading time of a video ad, with red and black colors indicating viral and non-viral videos, respectively.]

4.5.1 Parameter estimates

Because there are many video ads in the sample, the number of effectiveness parameters is large. Table 4.2 reports the parameter estimates for one representative brand channel. The median coefficient is the sample median of the posterior draws of the effectiveness parameter $\beta_i$. The number of views was divided by 1000 before estimating the model, so these parameter estimates represent the number of channel subscriptions generated by 1000 views. The “p-value” reports the posterior probability that the estimated effectiveness parameter is zero. Based on this information, there is only one ad (video 5) whose effectiveness parameter is not zero. Its estimate of 0.13 implies that out of about 10,000 people who view the ad, only 1.3 will subscribe to the channel.
video    nviews   median coefficient   p-value   mean subscribers
    1      2153                 0.00      0.92              13.46
    2      2219                 0.00      0.90              18.49
    3     24590                 5.91      0.12             919.03
    4    502357                 0.01      0.47             186.42
    5   4967458                 0.13      0.00            4579.57
    6      8021                 2.26      0.46             253.71
    7      1539                 0.00      0.91              10.55
    8      1735                 0.00      0.89              12.03
    9      2299                 0.00      0.73              39.68
   10      3803                 0.00      0.66              64.08
   11      1858                 0.00      0.70              28.43
   12      2209                 0.00      0.75              21.36

Table 4.2: Parameter estimates for one brand channel.

We estimate the model for each brand channel and obtain the effectiveness estimate for each video ad. Table 4.3 reports the distribution of the estimated effectiveness parameter across all video ads. The median effectiveness parameter across all video ads is zero, which implies that most of the video ads are not effective in creating channel subscriptions. The average estimate is about 2.3. The average estimate of the carryover parameter is 0.6. This suggests that out of a thousand views, approximately $2.3/(1 - 0.6) \approx 5.8$ subscriptions are generated in the long run.

                           min   25%   median   mean     75%     max
Effectiveness parameter      0     0        0   2.32    0.18   78.31
Estimated subscriptions      0     0        0   2806   993.1   82890

Table 4.3: Distribution of the estimated median effectiveness parameter and subscriptions across all video ads.

4.5.2 Channel subscription estimates

From the estimated effectiveness parameters, we can estimate the total number of subscriptions created by each video ad. The estimation is based on the allocation of the observed incremental subscriptions to each relevant video ad. From the model estimates, we can estimate the expected number of subscriptions created by each video ad in each period. These estimates are then used to compute the proportion of subscriptions assigned to each video ad.
Specifically, we estimate the total number of subscriptions attributable to the $i$th video ad, $\hat{y}_i$, as

$$\hat{y}_i = \sum_{t=1}^{T} \underbrace{y_t}_{\text{observed subscription}} \; \underbrace{\frac{\sum_{s=1}^{t} x_{is}\beta_i\lambda^{t-s}}{\sum_{j=1}^{I}\sum_{s=1}^{t} x_{js}\beta_j\lambda^{t-s} + \mu_0\lambda^{t}}}_{\text{fraction attributable to ad } i}. \tag{3}$$

In the above, $\sum_{s=1}^{t} x_{is}\beta_i\lambda^{t-s}$ is the expected number of subscriptions generated by the $i$th video in period $t$, reflecting the impact of all historical views of the $i$th ad. The denominator $\sum_{j=1}^{I}\sum_{s=1}^{t} x_{js}\beta_j\lambda^{t-s} + \mu_0\lambda^{t}$ is the total number of expected subscriptions from all video ads, plus the effect of the original subscription rate. So the fraction is the portion of the total subscriptions at $t$ that is attributable to the $i$th video.

The last column of Table 4.2 shows the estimated channel subscribers generated by each ad using (3). The most viral video (video 5) seems to create the largest number of subscribers among all the video ads. The distribution of the estimated channel subscriptions generated by each ad is shown in Table 4.3. We see that, consistent with the distribution of the effectiveness parameter, most ads are not effective in generating channel subscriptions.

4.5.3 Effect of virality

We now ascertain the effect of virality in terms of the generation of channel subscriptions. We look at two measures of interest for each video ad. The first is the posterior probability of $\beta_i > 0$. A positive effectiveness parameter estimate indicates that the video ad is effective in generating subscriptions. The second is the estimated channel subscriptions created by each video ad as in (3). In the following, we first compute the two measures for each video, and then use a mixed model to compare the differences between viral and non-viral video ads.

Posterior probability of effectiveness. Let $\beta_{ci}$ be the effectiveness parameter of the $i$th video ad from channel $c$. We estimate the posterior probability that an ad is effective as

$$\hat{p}_{ci} = \sum_{s=1}^{1000} I\big(\beta_{ci}^{(s)} > 0\big) / 1000, \tag{4}$$

where $\beta_{ci}^{(s)}$ indicates the $s$th draw of $\beta_{ci}$.
In other words, we compute the proportion of posterior samples in which the effectiveness parameter is positive.

Estimated subscriptions. For each posterior draw, we estimate the subscriptions of the $i$th video of channel $c$ according to (3), and then average across all the samples to get the estimated subscriptions, denoted by $\hat{y}_{ci}$.

We then fit a mixed-effect model that regresses each of the above two measures on an intercept and the virality indicator. We allow all parameters to vary by channel to account for the unobserved channel effect. The results are shown in Table 4.4. We see that the coefficients associated with virality are positive and significant for both measures. This provides statistical evidence that a viral ad not only creates more brand exposures than a nonviral one, but is also more effective in changing consumers’ behaviors in terms of promoting channel subscriptions.

             Posterior Effectiveness    Estimated Subscriptions
Parameter    Estimate    SD             Estimate    SD
Intercept    0.458       0.029**        5.902       0.364**
Virality     0.325       0.044**        2.032       0.351**

Table 4.4: Test of effect of virality on ad effectiveness.

4.5.4 Effect of appeal methods

We now compare the effect of different methods of appeal in generating channel subscriptions. The analysis is similar to the one above. That is, we fit mixed-effect models that regress the two measures (posterior effectiveness and estimated subscriptions) on an intercept and the rating scales for emotion and argument. The estimation results are reported in Table 4.5. We see that only the coefficient for the emotional appeal is positive and significant. The use of argument appeal does not yield statistically significant results.

                    Posterior Effectiveness    Estimated Subscriptions
Parameter           Estimate    SD             Estimate    SD
Intercept           0.566       0.054**        6.108       0.359**
overall emotion     0.046       0.020**        0.480       0.131**
overall argument    0.015       0.025          0.305       0.166

Table 4.5: Test of effect of appeal methods on ad effectiveness.
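The attribution rule in (3) and the posterior probability in (4), which produce the two measures compared in Tables 4.4 and 4.5, can be sketched in Python with NumPy as follows. This is a minimal illustration under stated assumptions rather than the code used in the essay: the function names are hypothetical, `beta`, `lam`, and `mu0` are our own labels for the effectiveness parameters, the carryover rate, and the initial level, and the inputs are taken to be one set of parameter values (for example, a single posterior draw).

```python
import numpy as np


def expected_subscriptions(x, beta, lam):
    """Expected subscriptions per period and ad.

    Row t holds sum_{s <= t} x[s, i] * beta[i] * lam**(t - s) for each ad i,
    computed recursively from the previous row.
    """
    n_periods, n_ads = x.shape
    expected = np.zeros((n_periods, n_ads))
    for t in range(n_periods):
        carry = lam * expected[t - 1] if t > 0 else 0.0
        expected[t] = carry + x[t] * beta
    return expected


def attribute_subscriptions(y, x, beta, lam, mu0):
    """Allocate observed subscriptions to ads by their expected share, as in (3)."""
    expected = expected_subscriptions(x, beta, lam)
    n_periods = len(y)
    baseline = mu0 * lam ** np.arange(1, n_periods + 1)  # mu0 * lam**t, t = 1..T
    denom = (expected.sum(axis=1) + baseline)[:, None]
    # Periods with a zero denominator contribute nothing to any ad.
    share = np.divide(expected, denom, out=np.zeros_like(expected), where=denom != 0)
    return (np.asarray(y, dtype=float)[:, None] * share).sum(axis=0)


def posterior_prob_effective(draws):
    """Share of posterior draws with a positive coefficient, as in (4).

    draws has shape (n_draws, n_ads).
    """
    return (np.asarray(draws) > 0).mean(axis=0)


# Toy example: two ads, only the first has a nonzero coefficient.
x = np.array([[1.0, 1.0], [0.0, 2.0], [3.0, 0.0]])
y = np.array([10.0, 4.0, 6.0])
attributed = attribute_subscriptions(y, x, np.array([2.0, 0.0]), lam=0.5, mu0=0.0)
prob = posterior_prob_effective([[0.0, 1.0], [2.0, 0.0], [3.0, 0.0], [0.0, 0.0]])
```

Because the spike-and-slab prior keeps every coefficient non-negative, the shares in each period lie between zero and one, which is what makes this proportional allocation well defined.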
4.5.5 Additional analysis on abnormal stock returns

Prior research has found evidence that experts’ reviews and user-generated content influence the stock returns of the firm (Tirunillai and Tellis 2012). The popularity of the video ad reflects the firm’s ability to engage consumers, which may translate into further success of the advertised products and improved performance of the firm. Investors foresee such improved performance and bid up the price of the firm’s stock. For this reason, the success of an ad campaign can also be measured by its influence on the stock market performance of the firm. In this subsection, we investigate the link between ad views and abnormal stock returns of the firm.

We use the upload date of the video as the event date. We collected the daily stock return data for 11 of the channels, whose firms are publicly traded in the US market. We used the data from the 250 days prior to the event date to calculate the abnormal stock returns as follows:

$$R_t - R_{ft} = \alpha_0 + \alpha_1 (R_{mt} - R_{ft}) + \alpha_2\, SMB_t + \alpha_3\, HML_t + \alpha_4\, MOM_t + \epsilon_t. \tag{5}$$

In the above, $R_t$ is the observed stock return of the firm at time $t$, $R_{mt}$ is the market portfolio return, $R_{ft}$ is the risk-free rate of return, $R_t - R_{ft}$ is the adjusted return, $SMB$ is the small minus big capitalization factor, $HML$ is the high minus low book-to-market equity factor, and $MOM$ is the momentum factor. We estimated the coefficients, which allowed us to predict the adjusted return in the period following the first ad release. The abnormal stock return, $\hat{\epsilon}_t$, is then the difference between the observed and the predicted adjusted return.

We then regressed the daily abnormal stock return of a channel on the daily views of the ads in the channel. We used cumulative returns over windows of 1, 3, and 5 days from the event date. We also tested the effect of 1-day, 2-day, and 5-day lagged ad views on these abnormal returns.
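The abnormal-return calculation built on (5) can be sketched in Python with NumPy as follows. This is a minimal illustration on synthetic, noise-free data and not the code behind the reported estimates. It assumes that the coefficients are estimated by ordinary least squares on the 250-day window before the event, which the text does not state explicitly, and all function names and numerical values are hypothetical. The factor columns are taken to be ordered as market excess return, SMB, HML, and MOM.

```python
import numpy as np


def design_matrix(factors):
    """Prepend an intercept column to the (n_days, 4) factor matrix."""
    factors = np.asarray(factors, dtype=float)
    return np.column_stack([np.ones(len(factors)), factors])


def fit_factor_model(excess_return, factors):
    """Least squares estimate of the five coefficients in (5)."""
    coef, *_ = np.linalg.lstsq(design_matrix(factors), excess_return, rcond=None)
    return coef


def abnormal_returns(excess_return, factors, coef):
    """Observed minus predicted adjusted return for each day."""
    return np.asarray(excess_return) - design_matrix(factors) @ coef


def cumulative_abnormal_return(ar, window):
    """Sum of abnormal returns over the first `window` days after the event."""
    return float(np.sum(ar[:window]))


# Synthetic data purely for illustration.
rng = np.random.default_rng(1)
true_coef = np.array([0.001, 1.1, 0.3, -0.2, 0.05])
est_factors = rng.normal(0.0, 0.01, size=(250, 4))   # 250-day estimation window
est_excess = design_matrix(est_factors) @ true_coef

event_factors = rng.normal(0.0, 0.01, size=(5, 4))   # 5 days after the upload
shock = np.array([0.02, 0.0, -0.01, 0.0, 0.0])       # injected abnormal return
event_excess = design_matrix(event_factors) @ true_coef + shock

coef = fit_factor_model(est_excess, est_factors)
ar = abnormal_returns(event_excess, event_factors, coef)
car = {w: cumulative_abnormal_return(ar, w) for w in (1, 3, 5)}
```

In this noise-free setting the recovered abnormal returns equal the injected shock, which gives a quick check that the estimation window and the event window are kept separate.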
Table 4.6 reports the estimated parameters of ad views using the different measures of ad views and cumulative abnormal returns. We did not find evidence of strong links between virality and stock market performance, since none of the coefficients on ad views is significant.

Cumulative Abnormal Returns   Measure of Ad Views   Estimate of Ad Views   SD
1-day window                  current period                       0.000   0.008
1-day window                  lag1                                 0.002   0.008
1-day window                  lag2                                -0.002   0.008
1-day window                  lag5                                 0.003   0.007
3-day window                  current period                       0.000   0.008
3-day window                  lag1                                -0.003   0.014
3-day window                  lag2                                -0.002   0.013
3-day window                  lag5                                 0.007   0.012
5-day window                  current period                       0.000   0.008
5-day window                  lag1                                 0.005   0.018
5-day window                  lag2                                 0.007   0.017
5-day window                  lag5                                 0.009   0.016

Table 4.6: Test of effect of ad views on cumulative abnormal stock returns. Ad views are measured using lagged values of 0, 1, 2, and 5 days, and cumulative abnormal returns are based on a window of 1, 3, and 5 days.

In addition, we tested a threshold specification on ad views, where we used 1 million as a cutoff point. The estimated coefficient of the threshold ad views is -0.182 with a standard deviation of 0.196. Still, no significant relationship was found between ad views and abnormal stock returns.

4.6 Conclusions

In this essay, we propose a method that enables advertisers to determine the effect of ad campaigns. Our method relies on a novel introduction of spike-and-slab priors into the dynamic model so that we can obtain sensible estimates while appropriately accounting for model uncertainty. We apply the proposed method to the context of YouTube video ads to describe the relationship between channel subscribers and views of video ads. Based on the model estimates, we estimate the number of channel subscribers generated by each video ad, which advertisers can further use for the calculation of social return. Our result suggests that every 1000 views create about 5.8 channel subscriptions.
To our knowledge, this is the first attempt to formally quantify the effect of YouTube advertising. In addition, we compare the effects of viral and non-viral videos, and find support for the view that viral videos not only generate huge exposure but also create more subscriptions than non-viral ones. Our analysis also supports the use of emotion as a more effective appeal method for stimulating channel subscriptions.

4.6.1 Limitations

In our model, we have restricted the effectiveness coefficient for each ad to be constant. This assumption can be relaxed so that the effectiveness of an ad can vary over time. Research on traditional TV advertising has found wearout effects: an ad becomes less effective as it is shown repeatedly over time. Unlike TV advertising, video ads spread primarily through social sharing. Later viewers are more likely to be generated by social shares. If social sharing increases the persuasiveness of the ad, then one may see wearin instead of wearout. In future research, the model can be extended to allow a time-varying response effect and to examine the shape of the response over time.

References

Aaker, David A., Douglas M. Stayman, Michael R. Hagerty. 1986. Warmth in advertising: Measurement, impact, and sequence effects. Journal of Consumer Research 12(4) 365–381.
Alden, Dana L., Ashesh Mukherjee, Wayne D. Hoyer. 2000. The effects of incongruity, surprise and positive moderators on perceived humor in television advertising. Journal of Advertising 29(2) 1–15.
Alwitt, Linda F. 2002. Suspense and advertising responses. Journal of Consumer Psychology 12(1) 35–49.
Amos, C., G. Holmes, D. Strutton. 2008. Exploring the relationship between celebrity endorser effects and advertising effectiveness: A quantitative synthesis of effect size. International Journal of Advertising 27(2) 209–234.
Assmus, Gert, John U. Farley, Donald Lehmann. 1984.
How advertising affects sales: Meta analysis of econometric results. Journal of Marketing Research 21 65–74.
Atkin, C., M. Block. 1983. Effectiveness of celebrity endorsers. Journal of Advertising Research 23(1) 57–61.
Baker, W. E., H. Honea, C. A. Russell. 2004. Do not wait to reveal the brand name: The effect of brand-name placement on television advertising effectiveness. Journal of Advertising 33(3) 77–85.
Barasch, Alixandra, Jonah Berger. 2014. Broadcasting and narrowcasting: How audience size affects what people share. Journal of Marketing Research 51 286–299.
Baron, Reuben M., David A. Kenny. 1986. The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology 51(6) 1173–1182.
Bass, F. M., T. V. Krishnan, D. C. Jain. 1994. Why the Bass model fits without decision variables. Marketing Science 13(3) 203–223.
Bass, Frank M., Norris Bruce, Sumit Majumdar, B. P. S. Murthi. 2007. Wearout effects of different advertising themes: A dynamic Bayesian model of the advertising-sales relationship. Marketing Science 26(2) 179–195.
Bass, Frank M., Darral G. Clarke. 1972. Testing distributed lag models of advertising effects. Journal of Marketing Research 9 298–308.
Bass, Frank M. 1969. A new product growth for model consumer durables. Management Science 15(5) 215–227.
Batra, Rajeev, Michael L. Ray. 1986. Affective responses mediating acceptance of advertising. Journal of Consumer Research 13(2) 234–249.
Berger, Jonah, Katherine L. Milkman. 2012. What makes online content viral? Journal of Marketing Research 49(2) 192–205.
Berger, Jonah, Eric M. Schwartz. 2011. What drives immediate and ongoing word of mouth? Journal of Marketing Research 48 869–880.
Berns, Gregory S., Samuel M. McClure, Giuseppe Pagnoni, P. Read Montague. 2001. Predictability modulates human brain response to reward. The Journal of Neuroscience 21(8) 2793–2798.
Bollen, K. A., R. Stine.
1990. Direct and indirect effects: Classical and bootstrap estimates of variability. Sociological Methodology 20 115–140. Bruce, Norris I., Natasha Zhang Foutz, Ceren Kolsarici. 2012. Dynamic effectiveness of advertising and word of mouth in sequential distribution of new products. Journal of Marketing Research 49(4) 469– 486. Bucklin, Louis P., Sanjit Sengupta. 1993. The co-diffusion of complementary innovations: Supermarket scanners and upc symbols. Journal of Product Innovation Management, 10(2) 148–160. Burke, Marian Chapman, Julie A. Edell. 1989. The impact of feelings on ad-based affect and cognition. Journal of Marketing Research 26(1) 69–83. Chandrasekaran, Deepa, Gerard J. Tellis. 2007. Diffusion of new products: A critical review of models, drivers, and findings. Review of Marketing Research 39–80. Chandy, Rajesh, Gerard J. Tellis, Debbie MacInnis, Pattana Thaivanich. 2001. What to say when: Advertis- ing appeals in evolving markets. Journal of Marketing Research 38(4) 399–414. Chevalier, Judith A., Dina Mayzlin. 2006. The effect of word of mouth on sales: Online book reviews. Journal of Marketing Research 43(3) 345–354. De Angelis, Matteo, Andrea Bonezzi, A. M. Peluso, D. D. Rucker, Michele Costabile. 2012. On braggarts and gossips: A self-enhancement account of word-of-mouth generation and transmission. Journal of Marketing Research XLIX 551–563. Deighton, John, Daniel Romer, Josh McQueen. 1989. Using drama to persuade. Journal of Consumer Research 16(3) 335–343. 74 Dellarocas, C., X. Zhang, N. F. Awad. 2007. Exploring the value of online product reviews in forecasting sales: the case of motion pictures. Journal of Interactive Marketing 21 23–45. Dens, Nathalie, Patrick De Pelsmacker. 2010. Consumer response to different advertising appeals for new products: The moderating influence of branding strategy and product category involvement. Journal of Brand Management 18 50–65. Derbaix, Christian, Jo¨ elle Vanhamme. 2003. 
Inducing word-of-mouth by eliciting surprise – a pilot investi- gation. Journal of Economic Psychology 24 99–116. Dobele, Angela, Adam Lindgreen, Michael Beverland, Joelle Vanhamme, Robert van Wijk. 2007. Why pass on viral messages? because they connect emotionally. Business Horizon 50(4) 291–304. Duan, Wenjing, Bin Gub, Andrew B. Whinstonb. 2008. The dynamics of online word-of-mouth and product sales–an empirical investigation of the movie industry. Journal of Retailing 84(2) 233–242. Dukic, Vanja, Hedibert F. Lopes, Nicholas G. Polson. 2012. Tracking epidemics with google flu trends data and a state-space seir model. Journal of the American Statistical Association 107 327–342. Eckler, Petya, Paul Bolls. 2011. Spreading the virus: Emotional tone of viral advertising and its effect on forwarding intentions and attitudes. Journal of Interactive Advertising 11(2) 1–15. Edell, Julie A., Marian Chapman Burke. 1987. The power of feelings in understanding advertising effects. Journal of Consumer Research 14(3) 421–433. Edwards, Kari. 1990. The interplay of affect and cognition in attitude formation and change. Journal of Personality and Social Psychology 59(2) 202–216. Eilers, P.H.C., B.D. Marx. 1996. Flexible smoothing with b-splines and penalties (with comments and rejoinder). Statistical Science 11(2) 89–121. Fazio, R.H., P.M. Herr, M.C. Powell. 1992. On the development and strength of category-brand associations in memory: The case of mystery ads. Journal of Consumer Pshychology 1(1) 1 – 13. Freiden, Jon B. 1984. Advertising spokesperson effects: An examination of endorser type and gender on two audiences. Journal of Advertising Research 24(5) 33–41. Friedman, Hershey H., Linda. Friedman. 1979. Endorser effectiveness by product type. Journal of Adver- tising Research 19(5) 63–71. Gelman, Andrew, John B. Carlin, Hal S. Stern, Donald B. Rubin. 2003. Bayesian Data Analysis. 2nd ed. CRC Press, Boca Raton. Geweke, J. 1996. 
Variable selection and model comparison in regression. Bayesian Statistics 5. Oxford Press. Godes, David, Dina Mayzlin. 2004. Using online conversations to study word-of-mouth communication. Marketing Science 23(4) 545–60. 75 Godes, David, Jos´ e C. Silva. 2012. Sequential and temporal dynamics of online opinion. Marketing Science 31(3) 448–473. Goel, Sharad, Ashton Anderson, Jake Hofman, Duncan Watts. 2014. The structural virality of online diffu- sion . Golden, Linda L., K.A. Johnson. 1983. The impact of sensory preference and thinking versus feeling appeals on advertising effectiveness. Advances in Consumer Research. Ml: Association for Consumer Research. Goldsmith, Ronald E., Barbara A. Lafferty, Stephen J. Newell. 2000. The impact of corporate credibility and celebrity credibility on consumer reaction to advertisements and brands. Journal of Advertising 29(3) 43–54. Holbrook, Morris B., Rajeev Batra. 1987. Assessing the role of emotions as mediators of consumer responses to advertising. Journal of Consumer Research 14(3) 404–420. Ishwaran, Hemant, J. Sunil Rao. 2005. Spike and slab variable selection: Frequentist and bayesian strategies. The Annals of Statistics 33(2) 730–773. Lee, Yih Hwai, Charlotte Mason. 1999. Responses to information incongruency in advertising: The role of expectancy, relevancy, and humor. Journal of Consumer Research 26(2) 156–169. Liu, J., M. West. 2001. Combined parameters and state estimation in simulation-based filtering. Sequential Monte Carlo Methods in Practice. New York: Springer-Verlag. Liu, Yong. 2006. Word of mouth for movies: Its dynamics and impact on box office revenue. Journal of Marketing 70(3) 74–89. MacInnis, Deborah J., Bernard J. Jaworski. 1989. Information processing from advertisements: Toward an integrative framework. Journal of Marketing 53(4) 1–23. MacInnis, Deborah J., Christine Moorman, Bernard J. Jaworski. 1991. 
Enhancing and measuring con- sumers’ motivation, opportunity, and ability to process brand information from ads. Journal of Mar- keting 55(4) 32–53. MacInnis, Deborah J., Ambar G. Rao, Allen M. Weiss. 2002. Assessing when increased media weight of real-world advertisements helps sales. Journal of Marketing Research XXXIX 391–407. Millar, Murray G., Karen U. Millar. 1990. Attitude change as a function of attitude type and argument type. Journal of Personality and Social Psychology 52 (2) 217–228. Moe, Wendy W, Michael Trusov. 2011. The value of social dynamics in online product ratings forums. Journal of Marketing Research (JMR) 48(3) 444 – 456. 76 Moldovan, Sarit, Jacob Goldenberg, Amitava Chattopadhyay. 2011. The different roles of product origi- nality and useful- ness in generating word of mouth. International Journal of Research in Marketing 28(2) 109–119. Munch, James M., John L. Swasy. 1988. Rhetorical question, summarization frequency, and argument strength effects on recall. Journal of Consumer Research 15(1) 69–76. Naik, P. A., M. K. Mantrala, A. G. Sawyer. 1998. Planning media schedules in the presence of dynamic advertising quality. Marketing Science 17(3) 214–235. Nelson-Field, Karen, Erica Riebe, Kellie Newstead. 2013. The emotions that drive viral video. Australasian Marketing Journal 21 205–211. Nerlove, Marc, Kenneth J. Arrow. 1962. Optimal advertising policy under dynamic conditions. Economica 29(114) 129–142. Olney, Thomas J., Morris B. Holbrook, Rajeev Batra. 1991. Consumer responses to advertising: The effects of ad content, emotions, and attitude toward the ad on viewing time. Journal of Consumer Research 17 440–453. Packard, Grant, David Wooten. 2013. Compensatory communication: Knowledge discrepancies and knowl- edge signaling in word of mouth communications. Journal of Consumer Psychology 23(4). Pasadeos, Yorgo. 1990. Perceived informativeness of and irritation with local advertising. Journalism and Mass Communication Quarterly 67(1) 35–39. 
Pechmann, Cornelia, David W. Stewart. 1990. The effects of comparative advertising on attention, memory, and purchase intentions. Journal of Consumer Research 17(2) 180–191. Peers, Yuri, Dennis Fok, Philip Hans Franses. 2012. Modeling seasonality in new product diffusion. Mar- keting Science 31(2) 351–364. Petty, Richard E., John T. Cacioppo. 1986. Communication and Persuasion: Central and Peripheral Routes to Attitude Change. New York: Springer-Verlag. Petty, Richard E., John T. Cacioppo, David Schumann. 1983. Central and peripheral routes to advertising effectiveness:the moderating role of involvement. Journal of Consumer Research 10 135–146. Poels, Karolien, Siegfried Dewitte. 2006. How to capture the heart? reviewing 20 years of emotion mea- surement in advertising. Journal of Advertising Research 46(1). Preacher, K. J., A. F. Hayes. 2008. Asymptotic and resampling strategies for assessing and comparing indirect effects in multiple mediator models. Behavior Research Methods 40(3) 879–891. Redick, Scott. 2013. Surprise is still the most powerful marketing tool. Harvard Business Review . Rogers, Everett M. 2003. Diffusion of innovations (5th ed.). New York: Free Press. 77 Sawhney, Mohanbir S., Jehoshua Eliashberg. 1996. A parsimonious model for forecasting gross box-office revenues of motion pictures. Marketing Science 15(2) 113–131. Severn, Jessica, George E Belch, Michael A Belch. 1990. The effects of sexual and non-sexual advertising appeals and information level on cognitive processing and communication effectiveness. Journal of Advertising 19(1) 14–22. Shimp, T. A., E. W. Stuart. 2004. The role of disgust as an emotional mediator of advertising effects. Journal of Advertising 33(1) 43–53. Singh, Surendra N., Catherine A. Cole. 1993. The effects of length, content, and repetition on television commercial effectiveness. Journal of Marketing Research 30(1) 91–104. Snyder, Mark, Kenneth G. DeBono. 1985. 
Appeals to image and claims about quality: Understanding the psychology of advertising. Journal of Personality and Social Psychology 49(3) 586–597. Southgate, Duncan, Nikki Westoby, Graham Page. 2010. Creative determinants of viral video viewing. International Journal of Advertising 29(3) 2–14. Spotts, Harlan E., Marc G. Weinberger, Amy L. Parsons. 1997. Assessing the use and impact of humor on advertising effectiveness: A contingency approach. Journal of Advertising 26(3) 17–32. Stayman, Douglas M., Rajeev Batra. 1991. Encoding and retrieval of ad affect in memory. Journal of Marketing Research 28(2) 232–239. Sternthal, Brian, C. Samuel Craig. 1973. Humor in advertising. Journal of Marketing 37(4) 12–18. Stewart, D. W., D. H. Furse. 1986. Effective Television Advertising. Lexington Books, Lexington, MA. Susarla, Anjana, Jeong-Ha Oh, Yong Tan. 2012. Social networks and the diffusion of user-generated content: Evidence from youtube. Information Systems Research 23(1) 23–41. Teixeira, Thales, Rosalind Picard, Rana el Kaliouby. 2014. Why, when, and how much to entertain con- sumers in advertisements? a web-based facial tracking field study. Marketing Science . Teixeira, Thales, Michel Wedel, Rik Pieters. 2010. Moment-to-moment optimal branding in tv commercials: Preventing avoidance by pulsing. Marketing Science . Teixeira, Thales, Michel Wedel, Rik Pieters. 2012. Emotion-induced engagement in internet video adver- tisements. Journal of Marketing Research XLIX 144–159. Tellis, Gerard J. 2004. Effective Advertising: Understanding When, How, and Why Advertising Works. SAGE Publications. Tellis, Gerard J., Doyle L. Weiss. 1995. Does tv advertising really affect sales? the role of measures, models, and data aggregation. Journal of Advertising 24(3) 1–12. 78 Tirunillai, Seshadri, Gerard J. Tellis. 2012. Does chatter really matter? dynamics of user-generated content and stock performance. Marketing Science 31(2) 198–215. Trusov, Michael, Randolph E. Bucklin, Koen Pauwels. 
Appendix A Technical Appendix to Chapter 2

To estimate the parameters of the model, we simulate the full posterior distribution using Markov chain Monte Carlo methods.
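The “Metropolis-within-Gibbs” scheme samples the parameters sequentially from their lower-dimensional full conditional distributions over many iterations. The full conditionals are derived from the joint posterior density, which is proportional to the product of the prior and the likelihood (given $I$ video ads, each with $T$ observations):

\[
\begin{aligned}
&\prod_{i=1}^{I}\prod_{t=1}^{T} p(V_{it} \mid V_{i,t-1}, G_{it}, \theta_i, \sigma^2_{V_i})\, p(S_{it} \mid \gamma_{it}, P_{it}, \sigma^2_{S_i}) && \text{(conditional likelihood)} \\
&\times \prod_{i=1}^{I}\prod_{t=1}^{T} p(G_{it} \mid G_{i,t-1}, \beta_{it}, S_{i,t-1}, \delta_{G_i}, \sigma^2_{G_i}) \prod_{i=1}^{I}\prod_{t=1}^{T} p(P_{it} \mid P_{i,t-1}, \gamma_{it}, V_{it}, \delta_{P_i}, \sigma^2_{P_i}) \\
&\times \prod_{i=1}^{I}\Big[\prod_{t=1}^{T} p(\beta_{it} \mid \beta_{i,t-1}, S_{i,t-1}, b_{1i}, b_{2i}, \sigma^2_{\beta_i})\Big]\, p(\beta_{0i} \mid \sigma^2_{\beta_{0i}}) \\
&\times \prod_{i=1}^{I}\Big[\prod_{t=1}^{T} p(\gamma_{it} \mid \gamma_{i,t-1}, b_{3i}, \sigma^2_{\gamma_i})\Big]\, p(\gamma_{0i} \mid \sigma^2_{\gamma_{0i}}) && \text{(system evolution)} \\
&\times \prod_{i=1}^{I} p(\theta_i \mid \theta_0, \Sigma) && \text{(hierarchical structure)} \\
&\times \prod_{i=1}^{I} p(\sigma^2_{V_i}, \sigma^2_{S_i}, \sigma^2_{G_i}, \sigma^2_{P_i}, \sigma^2_{\beta_i}, \sigma^2_{\beta_{0i}}, \sigma^2_{\gamma_i}, \sigma^2_{\gamma_{0i}})\, p(\theta_0)\, p(\Sigma) && \text{(hyperprior)}
\end{aligned}
\tag{1}
\]

We now describe the sampling scheme for each parameter in turn.

Sample $G_{it}$, $P_{it}$, $\beta_{it}$, $\gamma_{it}$. The full conditional distributions of these variables are normal and therefore easy to simulate. For example, the full conditional distribution of $G_{it}$ has the following mean and variance:

\[
E(G_{it} \mid \cdot) = \frac{[V_{it} - f_i(V_{i,t-1})]\,\sigma^2_{G_i} + [(1-\delta_{G_i})(G_{i,t-1} + G_{i,t+1} - \beta_{i,t+1} S_{it}) + \beta_{it} S_{i,t-1}]\,\sigma^2_{V_i}}{\sigma^2_{G_i} + [1 + (1-\delta_{G_i})^2]\,\sigma^2_{V_i}},
\]

\[
\mathrm{Var}(G_{it} \mid \cdot) = \frac{\sigma^2_{G_i}\,\sigma^2_{V_i}}{\sigma^2_{G_i} + [1 + (1-\delta_{G_i})^2]\,\sigma^2_{V_i}}.
\]

We need to specify boundary values so that these formulas hold for all $t = 1, \dots, T$. When $t = 1$, we set $G_{i0} = S_{i0} = 0$ and replace the denominator of both equations by $\sigma^2_{G_i} + (1-\delta_{G_i})^2\sigma^2_{V_i}$. When $t = T$, we set $G_{i,T+1} = \beta_{i,T+1} = 0$ and replace the denominator of both equations by $\sigma^2_{G_i} + \sigma^2_{V_i}$. The distributions of $P_{it}$, $\beta_{it}$, and $\gamma_{it}$ can be derived similarly.

Sample $\theta_i$. The full conditional does not correspond to a standard distribution. We simulate each component using the random walk Metropolis algorithm.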
In the implementation, we used the first 10,000 iterations to tune the standard deviation of the normal proposals so as to reach an acceptance rate of around 40%. These samples were discarded in the analysis.

Sample $\theta_0$. The full conditional of $\theta_0$ is a normal distribution:

\[
\theta_0 \sim N\!\left( \left(I\Sigma^{-1} + C^{-1}\right)^{-1} \Sigma^{-1} \sum_{i=1}^{I} \theta_i,\; \left(I\Sigma^{-1} + C^{-1}\right)^{-1} \right),
\]

where $C$ is the prior covariance matrix for $\theta_0$.

Sample the variance components. The posterior distributions of the variance components $\sigma^2_{m_i}$, $m \in \{V, G, \beta, S, P, \gamma, \beta_0, \gamma_0\}$, and of $\Sigma$ are easy to derive since we specify conjugate priors. For example, $\sigma^2_{G_i}$ follows an inverse-Gamma distribution:

\[
\sigma^{2}_{G_i} \sim \text{Inverse-Gamma}\!\left( \frac{T}{2} + C_1,\; \frac{1}{2}\sum_{t=1}^{T} \left[G_{it} - (1-\delta_{G_i})\,G_{i,t-1} - \beta_{it} S_{i,t-1}\right]^2 + C_2 \right),
\]

where $C_1$ and $C_2$ are the shape and scale parameters in the prior. The distributions of the other variance components can be derived similarly.

Appendix B Technical Appendix to Chapter 4

Suppose our model is

\[
y_t = \alpha_t + \varepsilon_y, \qquad \alpha_t = \rho\,\alpha_{t-1} + \sum_{i=1}^{I} x_{it}\beta_i + \varepsilon_\alpha, \tag{1}
\]

where $y_t$ is the number of incremental subscribers at time $t$ and $x_{it}$ is the number of incremental views of video $i$ at time $t$. Here $\beta_i$ measures the immediate effectiveness of ad $i$ in generating new subscriptions to the channel, and $\rho$ measures the carryover effect. The error terms are specified as $\varepsilon_y \sim N(0, \sigma_y^2)$ and $\varepsilon_\alpha \sim N(0, \sigma_\alpha^2)$. To ensure that $\beta_i$ is nonnegative, we use the following spike-and-slab prior:

\[
\beta_i \sim (1-q)\,\delta(0) + q\,\mathrm{TN}_{(0,\infty)}(0, \tau^2). \tag{2}
\]

In the above, $\delta(0)$ denotes the degenerate distribution at the origin, $\mathrm{TN}_{(0,\infty)}(0, \tau^2)$ denotes the normal distribution truncated to the interval $(0, \infty)$, and $q \in [0, 1]$ is the probability that the coefficient comes from the truncated normal. Both $q$ and $\tau$ are pre-specified constants. Given this, we can write the likelihood, multiplied by the priors, as

\[
\left(\sqrt{2\pi}\,\sigma_y\right)^{-T} \exp\!\left( -\frac{\sum_{t=1}^{T} (y_t - \alpha_t)^2}{2\sigma_y^2} \right)
\left(\sqrt{2\pi}\,\sigma_\alpha\right)^{-T} \exp\!\left( -\frac{\sum_{t=1}^{T} \left(\alpha_t - \rho\,\alpha_{t-1} - \sum_{i=1}^{I} x_{it}\beta_i\right)^2}{2\sigma_\alpha^2} \right)
\prod_{i=1}^{I} p(\beta_i)\; p(\rho)\, p(\sigma_y^2)\, p(\sigma_\alpha^2).
\]

In the following, we derive the Gibbs sampler used to simulate from the posterior of the model. The posterior distribution is proportional to the joint distribution of the data and the priors.
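To make model (1) and prior (2) of Appendix B concrete before the sampler is derived, the following minimal sketch simulates from them. It is an illustration only, not the dissertation's code: the function names `draw_spike_slab` and `simulate`, the argument names, and the choice to start the latent state at zero are assumptions made here. The slab draw uses the fact that a zero-mean normal truncated to the positive half-line has the same distribution as the absolute value of that normal.

```python
import random


def draw_spike_slab(q, tau, rng):
    """Draw one effectiveness coefficient from the prior in (2).

    With probability 1 - q the coefficient is exactly zero (the spike);
    otherwise it is a N(0, tau^2) draw truncated to (0, inf) (the slab).
    """
    if rng.random() >= q:
        return 0.0
    # |N(0, tau^2)| has the same law as N(0, tau^2) truncated to (0, inf).
    return abs(rng.gauss(0.0, tau))


def simulate(x, beta, rho, sd_y, sd_state, rng):
    """Simulate incremental subscriptions from model (1).

    x[t][i] holds the incremental views of ad i at time t; the latent
    state starts at zero (an assumption of this sketch).
    """
    state, ys = 0.0, []
    for row in x:
        inflow = sum(xi * bi for xi, bi in zip(row, beta))
        state = rho * state + inflow + rng.gauss(0.0, sd_state)
        ys.append(state + rng.gauss(0.0, sd_y))
    return ys


if __name__ == "__main__":
    rng = random.Random(1)
    beta = [draw_spike_slab(0.5, 0.01, rng) for _ in range(5)]
    views = [[rng.randint(0, 1000) for _ in beta] for _ in range(24)]
    print(simulate(views, beta, rho=0.6, sd_y=1.0, sd_state=1.0, rng=rng)[:3])
```

Simulated data of this kind, with known coefficients, can be used to check that a sampler recovers which ads have a nonzero effect.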
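In the Gibbs sampler, we sequentially sample each parameter from its full conditional distribution given all the other parameters and the data. We now describe the sampling scheme for each parameter in turn. We use $\cdot$ to indicate everything other than the focal parameter, including the other parameters and the data to be conditioned upon.

Sample $\alpha_t$. The full conditional of $\alpha_t$ is a normal distribution with the following mean and variance:

\[
E(\alpha_t \mid \cdot) = \frac{y_t\,\sigma_\alpha^2 + \left[\rho\,\alpha_{t-1} + \sum_{i=1}^{I} x_{it}\beta_i + \rho\left(\alpha_{t+1} - \sum_{i=1}^{I} x_{i,t+1}\beta_i\right)\right]\sigma_y^2}{\sigma_\alpha^2 + (1+\rho^2)\,\sigma_y^2},
\]

\[
\mathrm{Var}(\alpha_t \mid \cdot) = \frac{\sigma_\alpha^2\,\sigma_y^2}{\sigma_\alpha^2 + (1+\rho^2)\,\sigma_y^2}.
\]

In the above, when $t = 1$, we set $\alpha_0 = 0$. When $t = T$, there is no transition to $\alpha_{T+1}$, so we drop the look-ahead term $\rho\,(\alpha_{T+1} - \sum_{i=1}^{I} x_{i,T+1}\beta_i)$ (equivalently, we set $\alpha_{T+1} = \sum_{i=1}^{I} x_{i,T+1}\beta_i$) and replace $1+\rho^2$ by $1$ in the denominator, so that the expressions hold for all $t = 1, \dots, T$.

Sample $\beta_j$. Since $\beta_j$ has a mixed prior distribution, we need to compute the posterior probability of $\beta_j = 0$. To simplify notation, in the following we denote $z_t = \alpha_t - \rho\,\alpha_{t-1} - \sum_{i \neq j} x_{it}\beta_i$. The posterior probability of $\beta_j = 0$ is

\[
p(\beta_j = 0 \mid \cdot) = \frac{p(\cdot \mid \beta_j = 0)\,p(\beta_j = 0)}{p(\cdot \mid \beta_j = 0)\,p(\beta_j = 0) + \int_0^\infty p(\cdot \mid \beta_j)\,p(\beta_j)\,d\beta_j}. \tag{3}
\]

When $\beta_j = 0$, we have, ignoring terms common to $\beta_j = 0$ and $\beta_j \neq 0$,

\[
p(\cdot \mid \beta_j = 0) = \exp\!\left( -\frac{\sum_{t=1}^{T} z_t^2}{2\sigma_\alpha^2} \right). \tag{4}
\]

When $\beta_j \neq 0$, the integrand in the denominator is

\[
p(\cdot \mid \beta_j)\,p(\beta_j) = \exp\!\left( -\frac{\sum_{t=1}^{T} (z_t - x_{jt}\beta_j)^2}{2\sigma_\alpha^2} \right) \frac{2q}{\sqrt{2\pi}\,\tau} \exp\!\left( -\frac{\beta_j^2}{2\tau^2} \right). \tag{5}
\]

Completing the square in $\beta_j$ then gives

\[
\int_0^\infty p(\cdot \mid \beta_j)\,p(\beta_j)\,d\beta_j = \frac{2Aq}{\tau} \exp\!\left( -\frac{B - C^2}{2A^2} \right) \left[1 - \Phi(-C/A)\right], \tag{6}
\]

with

\[
A^2 = \frac{\sigma_\alpha^2}{\sum_{t=1}^{T} x_{jt}^2 + \sigma_\alpha^2/\tau^2}, \qquad
B = \frac{\sum_{t=1}^{T} z_t^2}{\sum_{t=1}^{T} x_{jt}^2 + \sigma_\alpha^2/\tau^2}, \qquad
C = \frac{\sum_{t=1}^{T} z_t x_{jt}}{\sum_{t=1}^{T} x_{jt}^2 + \sigma_\alpha^2/\tau^2},
\]

where $\Phi$ is the standard normal distribution function. Noting that (4) equals $\exp(-B/(2A^2))$ and plugging (4) and (6) into (3), we find that the posterior conditional probability of $\beta_j = 0$ given all other parameters and the data is

\[
p(\beta_j = 0 \mid \cdot) = \frac{1-q}{(1-q) + q\,\dfrac{2A}{\tau} \exp\!\left( \dfrac{C^2}{2A^2} \right) \left[1 - \Phi(-C/A)\right]}. \tag{7}
\]

If we use $q = 0.5$, i.e., assign equal probabilities to the mass at zero and to the positive values, the expression simplifies to

\[
p(\beta_j = 0 \mid \cdot) = \frac{1}{1 + \dfrac{2A}{\tau} \exp\!\left( \dfrac{C^2}{2A^2} \right) \left[1 - \Phi(-C/A)\right]}. \tag{8}
\]

So we have the following sampling scheme for $\beta_j$, for $j = 1, \dots, I$:
1. Sample $u \sim U(0, 1)$. If $u < p(\beta_j = 0 \mid \cdot)$ in (7), set $\beta_j = 0$.
2. Otherwise, sample $\beta_j \sim \mathrm{TN}_{(0,\infty)}(C, A^2)$.

Sample $\rho$.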
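The log posterior of $\rho$ (we use a uniform prior on $\rho$) is, up to an additive constant,

\[
\log p(\rho \mid \cdot) = -\frac{\sum_{t=1}^{T} \left(\alpha_t - \rho\,\alpha_{t-1} - \sum_{i=1}^{I} x_{it}\beta_i\right)^2}{2\sigma_\alpha^2} + \text{const}, \qquad \rho \in [0, 1].
\]

This is a concave function of $\rho$, so we sample $\rho$ using adaptive rejection sampling (ARS).

Sample the variance components. The posterior distributions of the variance components $\sigma_y^2$ and $\sigma_\alpha^2$ are easy to derive since we specify conjugate priors. Both follow an inverse-Gamma distribution:

\[
\sigma_y^{2} \sim \text{Inverse-Gamma}\!\left( \frac{T}{2} + C_1,\; \frac{1}{2}\sum_{t=1}^{T} (y_t - \alpha_t)^2 + C_2 \right),
\]

\[
\sigma_\alpha^{2} \sim \text{Inverse-Gamma}\!\left( \frac{T}{2} + C_1,\; \frac{1}{2}\sum_{t=1}^{T} \Big(\alpha_t - \rho\,\alpha_{t-1} - \sum_{i=1}^{I} x_{it}\beta_i\Big)^2 + C_2 \right),
\]

where $C_1$ and $C_2$ are the shape and scale parameters in the prior.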
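The spike-and-slab update for a single ad coefficient, i.e. the posterior probability of an exact zero in (7) followed by a truncated-normal draw, can be sketched as below. This is a hedged illustration under stated assumptions rather than the author's implementation: the helper names (`log_norm_cdf`, `slab_moments`, `prob_zero`, `sample_coefficient`) and argument names are hypothetical, `sig2_state` stands for the variance of the state-equation error, `tau` for the slab scale, and `q` is assumed to lie strictly between 0 and 1. The odds are computed on the log scale to avoid overflow in the exponential term, and the truncated normal is drawn by inverse CDF from the lower tail, which loses accuracy only in extreme cases.

```python
import math
import random
from statistics import NormalDist

_STD = NormalDist()


def log_norm_cdf(x):
    """log Phi(x); erfc keeps accuracy in the lower tail."""
    if x > -30.0:
        return math.log(0.5 * math.erfc(-x / math.sqrt(2.0)))
    # Leading-order tail approximation once erfc approaches underflow.
    return -0.5 * x * x - math.log(-x) - 0.5 * math.log(2.0 * math.pi)


def slab_moments(z, xj, sig2_state, tau):
    """Return (A, C) as defined after (6); z are the partial residuals."""
    d = sum(v * v for v in xj) + sig2_state / tau ** 2
    a = math.sqrt(sig2_state / d)
    c = sum(zt * v for zt, v in zip(z, xj)) / d
    return a, c


def prob_zero(a, c, q, tau):
    """Posterior probability of an exact zero, formula (7), for 0 < q < 1."""
    # 1 - Phi(-C/A) equals Phi(C/A), evaluated here on the log scale.
    log_odds = (math.log(q / (1.0 - q)) + math.log(2.0 * a / tau)
                + c * c / (2.0 * a * a) + log_norm_cdf(c / a))
    if log_odds > 700.0:  # exp would overflow; the slab dominates
        return 0.0
    return 1.0 / (1.0 + math.exp(log_odds))


def sample_coefficient(z, xj, sig2_state, q, tau, rng):
    """One Gibbs draw of a coefficient under the spike-and-slab prior."""
    a, c = slab_moments(z, xj, sig2_state, tau)
    if rng.random() < prob_zero(a, c, q, tau):
        return 0.0
    # N(C, A^2) truncated to (0, inf): write the draw as C - A * W with
    # W standard normal truncated to (-inf, C/A), drawn by inverse CDF.
    upper = math.exp(log_norm_cdf(c / a))
    v = (1.0 - rng.random()) * upper
    v = min(max(v, 1e-300), 1.0 - 1e-16)  # keep inv_cdf's argument in (0, 1)
    return max(0.0, c - a * _STD.inv_cdf(v))


if __name__ == "__main__":
    rng = random.Random(0)
    views = [1.0, 2.0, 3.0, 4.0]
    resid = [0.5 * v for v in views]
    print(sample_coefficient(resid, views, 0.01, 0.5, 1.0, rng))
```

In a full sweep this function would be called once per ad, recomputing the partial residuals each time with the current values of the other coefficients.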
Abstract
Recent years have seen a proliferation of branded video ads on YouTube. Brands are increasingly willing to bypass traditional mass media and place ads on YouTube. If a video ad goes viral, it creates huge short-term brand exposure. However, creating viral videos is difficult, and most video ads are duds. This is because on YouTube, consumers decide what to watch, whether to share, and whether to subscribe to the brand channel. These unique features call for a careful analysis of YouTube video ads to help brands better design and place their ads on YouTube.

We scrape hourly data from YouTube for a large number of branded video ads. Our collected data include various measures of consumers' social engagement (views, shares, likes/dislikes, comments). We also collect channel subscriptions and abnormal stock returns to gauge the effect of advertising.

We examine the virality of video ads from three aspects. First, we study the process by which a video ad spreads. We construct a dynamic system that jointly models the process of product adoption and share generation. We also model the dynamic interrelationship between adoption and sharing that results from changes in consumers' adoption and sharing behavior over time. The empirical inference of the model sheds light on the patterns of adoptions and shares, the characteristics of viral diffusion, intra-day seasonality, the economic value of social shares, and the dynamics of share effectiveness and sharing propensity. These results provide insights into the strategy of ad promotion, share stimulation, and ad seeding.

Second, we analyze the drivers of viral video ads. We develop an instrument to rate the content of a large number of ads on over 30 executional cues drawn from the behavioral literature on advertising. These cues cover argument, emotion, endorsement, surprise, humor, branding, ad length, sex, and many other variables.
We analyze social shares as a function of the executional cues and identify the important drivers of virality. Our empirical findings have important implications for the current practice of advertising via video ads.

Third, we investigate the effects of video ads on channel subscriptions and abnormal stock returns of the brand. We build a dynamic model that links the views of video ads to the channel subscriptions of the brand. We quantify the effect of a view on channel subscription, identify the effective ads, test the effectiveness of viral ads compared to non-viral ads, compare the different appeals used in ads, and assess the effects of video ads on abnormal stock returns.
Asset Metadata
Creator
Zhang, Yanwei (author)
Core Title
Understanding virality of YouTube video ads: dynamics, drivers, and effects
School
Marshall School of Business
Degree
Doctor of Philosophy
Degree Program
Business Administration
Publication Date
07/06/2015
Defense Date
05/11/2015
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
abnormal stock returns,diffusion,executional cues,OAI-PMH Harvest,self-enhancement,social shares,virality
Format
application/pdf (imt)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Tellis, Gerard J. (committee chair), Luo, Lan (committee member), Lv, Jinchi (committee member), MacInnis, Deborah J. (committee member)
Creator Email
actuary_zhang@hotmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c3-584540
Unique identifier
UC11300568
Identifier
etd-ZhangYanwe-3541.pdf (filename),usctheses-c3-584540 (legacy record id)
Legacy Identifier
etd-ZhangYanwe-3541.pdf
Dmrecord
584540
Document Type
Dissertation
Rights
Zhang, Yanwei
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA