Professional Documents
Culture Documents
Marketing Science
Publication details, including instructions for authors and subscription information:
http://pubsonline.informs.org
This article may be used only for the purposes of research, teaching, and/or private study. Commercial use
or systematic downloading (by robots or other automatic processes) is prohibited without explicit Publisher
approval, unless otherwise noted. For more information, contact permissions@informs.org.
The Publisher does not warrant or guarantee the article’s accuracy, completeness, merchantability, fitness
for a particular purpose, or non-infringement. Descriptions of, or references to, products or publications, or
inclusion of an advertisement in this article, neither constitutes nor implies a guarantee, endorsement, or
support of claims made of that product, publication, or service.
With 12,500 members from nearly 90 countries, INFORMS is the largest international association of operations research (O.R.)
and analytics professionals and students. INFORMS provides unique networking and learning opportunities for individual
professionals, and organizations of all types and sizes, to better understand and use O.R. and analytics tools and methods to
transform strategic visions and achieve better outcomes.
For more information on INFORMS, its publications, membership, or meetings visit http://www.informs.org
MARKETING SCIENCE
Vol. 40, No. 5, September–October 2021, pp. 964–984
http://pubsonline.informs.org/journal/mksc ISSN 0732-2399 (print), ISSN 1526-548X (online)
Received: January 6, 2020 Abstract. As live streaming of events gains traction, pay what you want (PWYW) pricing
Revised: July 21, 2020; November 16, 2020 strategies are emerging as critical monetization tools. We assess the viability of PWYW by ex-
Accepted: December 27, 2020 amining the relationship between popularity (i.e., audience size) of a live streaming event and
Published Online in Articles in Advance: the revenue it generates under a PWYW scheme. On the one hand, increasing audience size
August 19, 2021 may enhance voluntary payment/tips if social image concerns are important because larger
https://doi.org/10.1287/mksc.2021.1292 audiences amplify the utility pertaining to social image. On the other hand, increasing audi-
ence size may reduce tips if gaining the broadcaster’s reciprocal acts motivates tipping be-
Copyright: © 2021 INFORMS cause larger audiences are associated with fiercer competition for reciprocity. To examine
these trade-offs in the relationship between audience size and revenue under PWYW, we ma-
nipulate audience size by exogenously adding synthetic viewers in live streaming shows on a
platform in China. The results reveal a mostly positive relationship between audience size
and average tip per viewer, which suggests that social image concerns dominate seeking reci-
procity. In support of herding, adding synthetic viewers also increases the number of real
viewers. Social image concerns and herding together explain the finding that adding one ad-
ditional viewer improves the tipping revenue per minute by approximately 0.01 yuan (1% of
the mean level). Further, famous female broadcasters who use recognition-related words fre-
quently during the event benefit the most from an increase in audience size. Overall, the re-
sults indicate that revenues under PWYW do not scale linearly and support the relevance of
social image concerns in driving individual payment decisions under PWYW.
History: K. Sudhir served as the senior editor and Juanjuan Zhang served as associate editor for this article.
Funding: This research is supported by the National Natural Science Foundation of China [Grant
71872115], the Natural Science Foundation of Guangdong Province, China [Grant 2020A1515011201],
and National University of Singpore Research [Grant R-316-000-104-133].
Supplemental Material: The data and online appendix are available at https://doi.org/10.1287/mksc
.2021.1292.
Keywords: live streaming • pay what you want • social media • user-generated content • tipping • field experiment • video analysis
964
Lu et al.: Audience Size and Live Streaming Revenues Under Pay What You Want
Marketing Science, 2021, vol. 40, no. 5, pp. 964–984, © 2021 INFORMS 965
live video content free of charge and pay/tip broad- others (Lampel and Bhalla 2007). As such, a represen-
casters in real time by sending voluntary payments in tative viewer’s tip amount should increase as audi-
the form of virtual cash and virtual gifts. The revenue ence size grows. On the other hand, if seeking reciproci-
collected from viewers is later split between the live ty is the primary motive for tipping, as a session
streaming platform and broadcasters. Prominent live becomes more crowded, the chance of a viewer gain-
streaming platforms, such as YouTube Live and ing reciprocal acts from the broadcaster in the form of
Twitch, fully embrace the PWYW strategy with Face- social interactions should reduce, which should lead
Downloaded from informs.org by [68.101.74.150] on 24 March 2023, at 08:12 . For personal use only, all rights reserved.
book Live adopting this pricing strategy for live to a lower tip per viewer because of the more intense
streaming of video games (Roettgers 2018). (perceived) competition for reciprocity (referred to as
The prominence of the PWYW revenue model leads the N-effect in social psychology; Garcia and Tor 2009).
to astronomical growth of the live streaming industry The negative relationship between audience size and
in both the United States and China over the past tip amount per viewer might also occur because of
few years. According to Streamlabs and Goldman free-riding: as group size grows, the individual contri-
Sachs research, live streaming broadcasters within the bution declines (e.g., Olson 1965, Andreoni 1988).
United States received $129 million in the form of dis- Thus, the net effect of audience size on tip amount per
cretionary tips in 2017 (Hays 2018). It also predicts viewer is a priori unclear. It could be either positive,
that the “tipping market” will reach $372 million in negative, or null, depending on the relative strengths
2022, suggesting a compound annual growth rate of of multiple underlying forces (e.g., social image, reci-
23.6% from 2017 to 2022. For the live streaming mar- procity, free-riding).
ket in China, the user base has increased from 310 mil- Two features of the live streaming industry make it
lion in 2016 to 504 million in 2019, suggesting that ap- suitable for us to study the scalability of the PWYW
proximately two out of five Chinese watched live revenue model. First, both the payment decisions and
streaming shows in 2019 (iiMedia Research 2020). audience size are public information, which enable
With the booming live streaming industry, a number status-signaling and herding by viewers. These behav-
of Chinese live streaming service providers (e.g., Bili- ioral mechanisms are likely absent for products and
bili, DouYu, Huya, Momo) have become billion-dollar services that do not disclose consumers’ payment to
companies that are publicly traded in the United others (e.g., shows on Netflix). Second, a live stream-
States. ing event is hosted over the internet and, therefore,
Despite the wide adoption of PWYW in the live does not have a capacity constraint on the number of
streaming industry, the viability and efficacy of this consumers who can simultaneously use the service.
revenue model remain unclear. In this research, we as- This is different from off-line events (e.g., sports
sess the scalability of PWYW by addressing the fol- games, concerts) for which the number of consumers
lowing question: do larger audiences generate greater is restricted by the physical space.
revenues in live streaming with PWYW schemes? As To understand how audience size affects revenues
tipping revenue in a live streaming session depends under PWYW, we conduct a field experiment on a
on the number of viewers and tip amount per viewer, large live streaming platform in China. In the experi-
we first ask how popularity information (i.e., audience ment, we first randomize the displayed audience size
size) affects viewer participation. In settings in which (based on treatment and control condition allocation)
product quality is unknown to consumers a priori, across broadcasters. Then, using the allocation of
popularity can serve as a signal of quality to draw in broadcasters to one of the three conditions (two treat-
new viewers (e.g., Tucker and Zhang 2011, Zhang and ment and one control), we randomize the audience
Liu 2012). Because of the self-reinforcement of popu- size within broadcaster, within session (as most
larity information, we expect herding to occur in the broadcasters have multiple sessions of around an
context of live streaming, suggesting a positive effect hour each), and at every minute of a session (after the
of audience size on viewer participation. first 10 minutes as we discuss subsequently), for
Further, we examine whether larger audiences en- which deviation of displayed audience size from the
hance or reduce the average tip amount per viewer. actual audience size depends on a random draw. Spe-
We posit that the direction of this effect depends on a cifically, for each of the two treatment conditions, at
central trade-off. On the one hand, theory of signaling every minute, we draw a random number of synthetic
and status seeking (e.g., Lampel and Bhalla 2007, viewers to add from a distribution with a mean of two
Gneezy et al. 2012, Toubia and Stephen 2013) and the or four, respectively.
public nature of viewers’ payment in live streaming Empirical identification of the treatment effect of
suggest that a viewer’s utility of tipping should in- adding synthetic viewers proceeds in three steps.
crease as audience size increases because of an up- First, we begin with a mean comparison by exploiting
ward bump in the viewer’s social image, defined as an variation at the broadcaster level across treatment
individual’s social status and prestige perceived by groups. The results from the mean comparison
Lu et al.: Audience Size and Live Streaming Revenues Under Pay What You Want
966 Marketing Science, 2021, vol. 40, no. 5, pp. 964–984, © 2021 INFORMS
suggest that adding synthetic viewers indeed signifi- norm (Croson and Shang 2008, Simonsohn and Ariely
cantly improves tipping revenues per minute; howev- 2008) rather than observational learning (Banerjee
er, such a benefit from audience augmentation is sub- 1992, Bikhchandani et al. 1992). Furthermore, the theo-
ject to diminishing returns as the difference is not ry of intrasexual competition from evolutionary psy-
statistically significant when increasing the average chology predicts that status-signaling motivation
number of synthetic viewers per minute from two to among male customers is more likely to appear in the
four. presence of a female rather than a male broadcaster
Downloaded from informs.org by [68.101.74.150] on 24 March 2023, at 08:12 . For personal use only, all rights reserved.
Second, though intuitive and convenient, the mean (Sundie et al. 2011). Given the majority of male audi-
comparison at the broadcaster level is subject to ag- ences on our platform (66%), our finding of a stronger
gregation bias mainly because the treatment strength effect on tipping to female than male broadcasters
varies across sessions and within sessions over time; suggests the relevance of social image concerns in live
thus, averaging could result in aggregation bias. To streaming. We further provide corroborating evidence
tackle this problem, we resort to the slope comparison of social image–related utility in tipping by examining
by testing whether the slope of tipping revenues the moderating effects of broadcaster performances. If
against time differs across treatment conditions be- social image concerns are indeed important, we expect
cause the average treatment strength increases over the broadcaster’s tendency to express recognition of
time from our manipulation (i.e., we add an average individual viewers to strengthen the positive effect of
of two or four viewers every minute based on treat- audience size on tipping. We employ state-of-the-art
ment condition). The slope analysis yields qualitative- speech recognition techniques to measure the broad-
ly similar results with increased statistical significance caster’s use of recognition-related words during a live
of identified effects. streaming event and find the positive moderating ef-
Third, the panel structure of the data allows us to fect predicted by the model with social image–related
precisely account for the time-varying treatment utility.
strength randomized at the minute level to obtain an With this research, we aim to make several impor-
unbiased estimate of the treatment effect. We also con- tant contributions. Substantively, we find an overall
trol for unobserved session heterogeneity with session positive and concave causal relationship between au-
fixed effects (which alleviates the need for broadcast- dience size and tipping revenue, suggesting that the
er-level fixed effects) to identify the effect of audience
revenue under PWYW is less scalable than that under
size based on within-session variation across time.
other monetization tools (e.g., advertising, product
Several findings emerge from the linear panel regres-
placement) that sell viewers’ eyeballs or clicks, which
sions. Increasing the audience size by one unit im-
increase linearly with the audience size. To our
proves the tipping revenue per minute by approxi-
knowledge, this research is among the first few field
mately 0.01 yuan, which is 1% of the mean level. Such
data–based as opposed to survey-based studies in the
a positive effect on tipping revenue is subject to di-
live streaming literature. With our finding of a posi-
minishing returns, and the effect turns negative if the
tive effect of audience size on average tip amount per
number of displayed viewers exceeds 567 (96.6 per-
viewer, this research also contributes to the PWYW lit-
centile). By breaking down the revenue to the number
erature by confirming the relevance of social image–-
of real viewers times tip amount per viewer, we find
that two forces drive the mostly positive relationship related utility in commercial contexts. Although re-
between average treatment strength and tipping reve- search shows the relevance of social image motive in
nue: a positive treatment effect on viewer size and a charity contexts (Harbaugh 1998, Ariely et al. 2009,
mostly positive treatment effect on tip per viewer. The DellaVigna et al. 2012), no evidence exists of social im-
former effect on viewer size seems to rely on theories age concerns or social pressure in PWYW employed
related to herding, and the latter effect on tip per in commercial contexts. Our data from live streaming
viewer suggests the dominance of social image con- provide new evidence of the importance of social im-
cerns over seeking reciprocity in driving tipping age concerns in driving individual payment decisions
decisions. under PWYW.
We explore the potential heterogeneity in the effect
of increasing audience size in subsequent analyses. 2. Related Literature
We find that female and more famous broadcasters As live streaming is an emerging form of UGC, the lit-
tend to enjoy greater benefits from larger audiences erature on UGC is of relevance. Previous research
than male and less famous broadcasters. Such an im- shows the importance of social influence in driving in-
provement in tipping revenue comes from both a dividual contribution on various types of UGC plat-
stronger effect on audience size and tip per viewer. forms. For example, Toubia and Stephen (2013) con-
These results are generally consistent with the data duct a field experiment by adding synthetic followers
pattern predicted by herding resulting from social on Twitter and find that social image–related utility
Lu et al.: Audience Size and Live Streaming Revenues Under Pay What You Want
Marketing Science, 2021, vol. 40, no. 5, pp. 964–984, © 2021 INFORMS 967
(utility associated with the perception of others) is off-line settings (e.g., theme parks, restaurants) in
generally more important than intrinsic utility (direct which previous PWYW experiments typically oc-
utility from posting content) in motivating users to curred. The public nature of payment in PWYW in the
contribute content; thus, users’ content contribution is live streaming industry suggests the importance of so-
likely to increase with the augmentation of followers. cial image, which drives an individual’s desire to im-
Zhang and Zhu (2011) find social benefit in content prove the social status and prestige perceived by
generation by examining users’ posting and editing others (Lampel and Bhalla 2007). We contribute to the
Downloaded from informs.org by [68.101.74.150] on 24 March 2023, at 08:12 . For personal use only, all rights reserved.
efforts on Wikipedia. Shriver et al. (2013) advance the PWYW research by providing suggestive evidence of
literature by showing the reverse causal effect of con- the existence of social image–related utility in driving
tent generation on user engagement and social ties in consumers’ payment in the noncharity contexts. This
the context of an online windsurfing community. Live novel finding of social image–related utility under
streaming differs from conventional UGC in its core PWYW in the live streaming industry is by and large
business model. Almost all UGC platforms studied consistent with the prominence of social image in mo-
(e.g., online forums, review sites, social network) tivating user contributions in front of others on online
monetize online traffic through advertisements. By platforms (Toubia and Stephen 2013).
contrast, most live streaming platforms rely on view- The total revenue in live streaming depends not
ers’ voluntary payments/tips to generate revenues. only on the individual tip amount but also on the size
Our study differs from previous research on UGC by of viewers. Thus, we draw from previous herding lit-
exploring the scalability of revenues against the size erature to theorize the impact of popularity informa-
of consumers when the UGC platform monetizes traf- tion on viewer participation (e.g., Banerjee 1992, Bikh-
fic through PWYW. chandani et al. 1992). Before entering a live stream,
As voluntary payments/tips facilitate the pricing of viewers have only limited information about its quali-
live streaming, our study closely relates to the litera- ty and, thus, may resort to other participants to ratio-
ture on tipping and PWYW. Tipping is a common and nalize their decisions, resulting in herding behavior,
important component of service marketing (for a re- which predicts a positive effect of popularity informa-
view, see Azar 2007); for example, annual tipping in tion on viewer participation. Previous literature fur-
the U.S. food industry amounts to $46.6 billion (Azar ther suggests that herding occurs for at least two rea-
2011). There is increasing interest from both industry sons: (1) social norm, in which a potential viewer
and academia in exploring whether tipping, in lieu of refers to existing viewers’ choices as a descriptive so-
a fixed price, can serve as an alternative business cial norm and then passively mimics their decisions
model to generate revenues (e.g., Natter and Kauf- (e.g., Croson and Shang 2008, Simonsohn and Ariely
mann 2015). The English rock band Radiohead is a fa- 2008), and (2) observational learning, in which a po-
mous application of this idea. Radiohead announced tential viewer infers higher quality from a session
in 2007 that it would allow fans to set their own price, with a larger number of existing viewers and, thus, in-
if anything, for downloading its seventh album In creases the likelihood of participation (e.g., Cai et al.
Rainbows. The live streaming industry is also adopting 2009, Zhang 2010). In this study, we find data patterns
and experimenting with the viability of this PWYW that are more consistent with herding driven by social
business model. norm rather than observational learning.
Such PWYW pricing schemes, in which consumers Finally, our research builds on the growing litera-
decide the payment amount (including zero), are be- ture on live streaming. Extant research is largely com-
ginning to receive scholarly scrutiny (e.g., Schmidt puter science focused (e.g., Pires and Simon 2015, He
et al. 2015, Jung et al. 2016, Chen et al. 2017). Kim et al. et al. 2016) with only a handful of studies examining
(2009) find that PWYW transactions evoke concerns the reasons behind the production and popularity of
about reciprocity and fairness. Gneezy et al. (2012) live streaming. With a survey of Twitch viewers,
identify the importance of self-identity and self-image Sjöblom and Hamari (2017) identify tension release,
in PWYW and show that self-image concerns drive interpersonal bonding, and entertainment as three key
people to pay more but also make them less likely to motivators to watch live streaming of video games.
buy. Research also shows that the strategy of shared Lee et al. (2019) survey live streaming viewers in
social responsibility, a modified version of PWYW China and find that interaction and content appear to
with a fixed percentage of consumers’ payment going be the two most important reasons that motivate
to support a charity cause, is more profitable than a viewers to tip. For interactions, they further find that
fixed price (Gneezy et al. 2010) and insensitive to the viewers are motivated by both reciprocity, that is,
percentage of payment allocated to the charity (Jung viewers expect broadcasters to engage in social inter-
et al. 2017). Our study differs from previous studies of actions with tippers in return for tips received, and so-
PWYW in that the tip by consumers is publicly ob- cial image, that is, viewers tip to grab attention from
servable in live streaming instead of anonymous as in the crowd and enjoy standing out from the others.
Lu et al.: Audience Size and Live Streaming Revenues Under Pay What You Want
968 Marketing Science, 2021, vol. 40, no. 5, pp. 964–984, © 2021 INFORMS
Another survey from an industrial report of Chinese Figure 1. (Color online) A Snapshot of a Live Streaming
live streaming platforms also highlights the impor- Session
tance of social image concerns in viewers’ tipping (ii-
Media Research 2019). According to this survey, re-
ceiving social recognition is among the top five
Number of current viewers, Tips in the form of
reasons why viewers tip in live streaming. The other virtual gifts
including both real and
four reasons include content uniqueness, content rele- synthetic viewers
Downloaded from informs.org by [68.101.74.150] on 24 March 2023, at 08:12 . For personal use only, all rights reserved.
Means
C T1 T2 p-values of F-test
groups. These metrics included the average value of setting to identify the effect of audience size alone
tips, the average number of likes and chats, and the rather than a combined effect of audience size and en-
average number of chatters per session. We also com- gagement activities. It is also technically challenging
pared the number of sessions, session length, and pro- to generate synthetic viewers who can meaningfully
portion of female broadcasters across the three send tips, chats, and likes as a regular viewer. Thus,
groups. As Table 1 shows, the p-values of the F-test in- we manipulated only the audience size but not any
dicate no significant evidence to reject the null hy- engagement activity in this experiment. This design is
pothesis that the means of the pretreatment metrics by similar to Toubia and Stephen (2013) in which the au-
broadcasters in each group are the same, thus con- thors manipulated only the number of followers rath-
firming the success of our randomization procedure. er than their activities (e.g., replying, sharing, men-
For each session by a broadcaster in T1, we asked tioning) on Twitter when studying the effect of
the platform to add an average of two synthetic view- followers on content creation.
ers at the end of each minute after the 10th minute.
We treat sessions in T2 similarly to those in T1 but 3.3. Data Description
double the strength of the treatment by adding an av- Our data include 2,222 sessions by 153 broadcasters
erage of four synthetic reviews at each minute after during the experiment and 2,226 sessions by these
the 10th minute. The platform added x synthetic broadcasters before the experiment.5 We have 813,
viewers at the end of each minute; x is drawn from a 660, and 749 sessions in C, T1, and T2, respectively,
normal distribution, and · is the floor function. during the experiment. The number of sessions in C,
When x is negative, up to x number of synthetic T1, and T2 are 794, 678, and 754, respectively, before
viewers leave the session. We did not add synthetic
viewers at the beginning of a session to avoid poten- Figure 2. (Color online) Histogram of Number of Added
tial suspicion from the broadcaster and real viewers. Synthetic Viewers
The platform also used only dormant accounts—those
generated by real users who were no longer active—
as synthetic viewers in this study. Thus, the platform
did not create any new accounts for this study, which
could be troublesome if a savvy viewer noticed any
unauthentic profile information such as a relatively re-
cent registration date. The synthetic viewers we added
did not interact with any party during a session. Fig-
ure 2 shows the distribution of the number of added
synthetic viewers at each minute in T1 (M 2.08) and
T2 (M 4.12), respectively. This manipulation allows
us to have exogenous data variation at the level of mi-
nutes though we conduct initial randomization at the
level of broadcasters.
Recall that the goal of our research is to investigate
the relationship between audience size and tipping
revenue, conditional on everything else, including en-
gagement activities (i.e., tips, chat, likes). We, there-
fore, did not allow synthetic viewers to engage in a
live stream so that we can have a relatively clean
Lu et al.: Audience Size and Live Streaming Revenues Under Pay What You Want
970 Marketing Science, 2021, vol. 40, no. 5, pp. 964–984, © 2021 INFORMS
Standard
Definition Group Mean deviation Minimum Maximum
the experiment. In contrast with session-level data col- Section 5.2, we describe how we generate four perfor-
lected before the experiment, we have minute-level mance-related metrics by analyzing video content
observations during the experiment. In particular, we (i.e., Recognition, FaceTime, HandTime, and Emotion).
observe the number of likes and chats and the value We present the summary statistics of key variables for
of tips sent by real viewers during each minute t of a viewers and broadcasters in Tables 2 and 3, respec-
session k. We also observe the number of real and syn- tively. On average, each session lasts approximately
thetic viewers at minute t. For broadcasters, we ob- 50 minutes, and a broadcaster in C, T1, and T2 re-
serve session length and infer broadcaster gender on ceives 0.61, 1.03, and 1.15 yuan of tips per minute,
the basis of saved video recordings of all streams. In respectively.
Standard
Definition Group Mean deviation Minimum Maximum
Tip 1
TipRate 0.733** 1
ChatRate –0.054** –0.007* 1
LikeRate –0.000 0.023** 0.111** 1
Nreal 0.171** 0.040** –0.259** –0.055** 1
Downloaded from informs.org by [68.101.74.150] on 24 March 2023, at 08:12 . For personal use only, all rights reserved.
Table 4 reports the correlation matrix of key varia- not significantly different between the T1 and T2 con-
bles in our data. We find a positive correlation between ditions in which treatment strength doubles from T1
the tip amount and the audience size measured by the to T2. This result suggests that the positive effect of
number of displayed viewers (r 0.059, p < 0.01). This treatment on Tip is nonlinear and subject to diminish-
positive association seems to be driven by the positive ing returns. By decomposing Tip into Nreal times Tip-
correlation between audience size and the number of Rate, we show that two factors drive the positive treat-
real viewers (r 0.326, p < 0.01) and the positive corre- ment effect on tipping revenue: a positive treatment
lation between audience size and the average tip per effect on viewer participation and a positive treatment
viewer measured by the TipRate (r 0.016, p < 0.01). effect on TipRate. With regard to the length and fre-
We show how the distributions of the three reve- quency of streams, we find no significant difference
nue-related variables (i.e., Tip, TipRate, and Nreal ) vary between the control and treated broadcasters. We
by the number of synthetic viewers in Figure 3. We speculate that the null effect of audience size manipu-
use a heat map to visualize the distribution; the darker lation on session length might depend on broadcast-
areas represent the higher density. The scatterplots of ers’ schedules. Most broadcasters on our live stream-
our raw data reveal an inverted U-shaped associations ing platform are using the streaming services either as
between Nsyn and Tip and between Nsyn and TipRate a hobby or a part-time job. They typically have other
and a mostly positive and concave relationship be- duties (e.g., full-time worker/student) that may affect
tween Nreal and Nsyn when the number of synthetic their flexibility of broadcasting. We also observe that
viewers is not too large (Nsyn ≤ 1, 200). Figure 3 also broadcasters often announce how much time they will
depicts outliers in which the number of added syn- broadcast at the beginning of a session. The lack of
thetic viewers is extremely large (Nsyn > 1, 200) be- flexibility and predetermined broadcasting schedules
cause of a few unusually long sessions. might explain the similar session length between the
control and treated conditions.
4. Main Analysis Despite the convenience, the treatment effect re-
4.1. Mean Comparison vealed from the aggregate-level mean comparison suf-
We describe the raw treatment effect by comparing the fers from at least two sources of bias. The first results
means of revenue-related variables across C, T1, and from our experimental manipulation in which the first
T2. Specifically, we examine whether the value of tips 10 minutes in sessions of T1 and T2 were actually un-
per minute, the session length, and the streaming fre- treated. Given this nontreatment for the first 10 minutes,
quency, all aggregated at the broadcaster level, vary the aggregation based on the whole sample can result in
by treatment conditions. We decompose the value of an underestimate of the treatment effect. The second
tips to the number of real viewers (Nreal ) and average source of bias arises from the varying strength of treat-
tip per real viewer (TipRate) to explore the potential ment over time. We provide an illustrative example
mechanism through which audience size may affect here to show the potential bias in treatment effect in-
tipping revenue. As previously theorized, the popular- ferred from the mean comparison at the aggregate
ity information may affect tipping revenues through broadcaster level. For simplicity, we assume treatment
the impact on viewer participation because of herding strength is linear in time and each broadcaster only
and through the impact on individual tipping because streams once. We describe the outcome variable yit of
of the potential coexistence of viewers’ motivations of broadcaster i at time t as follows:
improving social image and seeking reciprocity. yit (Treati 1) α + f (t) + ξi + it , (1)
We report the results of the broadcaster-level mean
yit (Treati 0) α + ξi + it , (2)
comparison in Table 5. We note that the value of tips
per minute (i.e., Tip) increases significantly when we where f (t) is the time-varying treatment effect and ξi
add synthetic viewers. However, the value of tips is denotes broadcaster heterogeneity with mean zero.
Lu et al.: Audience Size and Live Streaming Revenues Under Pay What You Want
972 Marketing Science, 2021, vol. 40, no. 5, pp. 964–984, © 2021 INFORMS
Figure 3. (Color online) Relationship Between the Number of Synthetic Viewers and Revenue Metrics
Downloaded from informs.org by [68.101.74.150] on 24 March 2023, at 08:12 . For personal use only, all rights reserved.
Notes. Each line represents the data fitted by a cubic spline and the surrounding areas represent the 95% confidence interval. The gap on the Y-
axis in Figure 3(a) is caused by the discrete nature of the tip amount: the smallest tip is 10 cents (1 cent 0.01 yuan), which causes the gap be-
tween log(10) 2.30 and 0 on the Y-axis.
Based on Equation (1), the mean of yit for broadcast- is independent and identically distributed, and there-
i fore, we have
er i from treatment groups is ȳ i α + Tt1 f (t)=Ti +
Ti
ξi + t1 it =Ti , where Ti represents the number of
Ti
f (t)
Ei ȳ i α + ETi t1
: (3)
treated periods (session length). As we do not find a Ti
treatment effect on session length, we assume that Ti
Lu et al.: Audience Size and Live Streaming Revenues Under Pay What You Want
Marketing Science, 2021, vol. 40, no. 5, pp. 964–984, © 2021 INFORMS 973
Mean p-value
Consider a typical between-subjects design with groups. As the top panel of Table 6 shows, the means
multiple trials in which the treatment strength is cons- of variables of interest in the treatment groups all
tant (i.e., f (t) β) across
trials (i.e., minutes).
In this
move toward the expected direction (e.g., Tip in T1 and
scenario, we have E ȳ i |Treati 1 − E ȳ i |Treati 0 T2 both move up compared with the values in Table 5).
We cannot reject the hypothesis that Tip in C is smaller
t1 f (t)=Ti β, suggesting that the mean com-
Ti
ETi
than those in T1 and T2 at the 5% level. In addition,
parison estimate works. Nevertheless, the number of TipRate significantly increases from C to T1 (p 0.018)
synthetic viewers added in our experiment accu- and then decreases from T1 to T2 (p 0.034), suggest-
mulates over time, and therefore, the treatment ing an inverted U-shaped relationship. Furthermore,
strength is not identical across trials. If the treatment the differences in the number of real viewers between
effect is perfectly linear in strength (i.e., f (t) βt), we C and T1 and between T1 to T2 are statistically signifi-
i
have E ȳ i |Treati 1 − E ȳ i |Treati 0 ETi Tt1 f (t)=Ti cant at the 10% level, suggesting an overall positive
ETi (Ti +1) treatment effect on Nreal . We also report the results of
2 β ∝ β, suggesting that the broadcaster-level mean the mean comparisons in the bottom panel of Table 6
comparison estimate is proportionally unbiased. Howev- when excluding the first 10 minutes of data from both
er, if the treatment effect is nonlinear (e.g., f (t) βt + γt2 ), control and treatment groups. We find qualitatively
we have E y i | Treati 1 − E y i | Treati 0 ETi Ti 2+ 1 β + similar patterns. To tackle the second source of bias re-
(Ti +1)(2Ti +1) E (T +1) E [(T +1)(2T +1)] sulting from varying treatment strength, we need to ac-
6 γ Ti 2i β + Ti i 6 i γ, suggesting that
the mean comparison estimate is no longer proportional- count for the variation in treatment strength over time
ly unbiased, and the extent of the bias is determined by and across sessions as we discuss next.
the convexity of treatment effect and the distribution of
session length. 4.2. Slope Comparison
To alleviate the first source of bias resulting from the One intuitive approach of exploiting the variation in
nontreatment for the first 10 minutes, we rerun the treatment strength over time is to compare the slope
mean comparison by excluding observations associated of variables of interest with time across treatment con-
with the first 10 minutes from the two treatment ditions. For example, if the addition of synthetic
Mean p-value
Note. In Figures 4–6, the minute-level mean estimates and error bars with 95% confidence interval are shown in the left figure, and the minute-
level mean estimates and a linear fit line with 95% confidence interval are shown in the right figure.
viewers indeed increases the number of real viewers, We further plot the dynamics in Tip in Figure 5 and
we should expect the slope of Nreal to be higher for T2 TipRate in Figure 6. Again, we are not able to reject the
than T1 than C because the number of synthetic view- hypothesis that either Tip or TipRate is the same across
ers on average increases faster over time in T2 than T1 C, T1, and T2 at t 1, : : : , 10 at the 5% level with the
than C. In addition, given that the addition of synthet- exception of TipRate at t 2 (p 0.026). By comparing
ic viewers proceeded after the 10th minute, we should the slopes during the nontreated period (t ≤ 10), we
not expect a significant difference in Nreal across C, T1, cannot reject the hypothesis that the slope of Tip is the
and T2 during the first 10 minutes. same across groups (p 0.670), nor can we reject the
The dynamics of Nreal over time confirm our expect- same-slope hypothesis for TipRate (p 0.571). During
ations. As Figure 4(a) shows, the number of real view- the treated period (10 < t ≤ 120), we find that the slope
ers gradually increases during the first 10 minutes, but of Tip in the control group is generally downward
there is no substantial difference across the three while the slope of Tip is either flat or slightly upward
groups. The p-values of the F-test that Nreal is the same in two treatment groups, suggesting a positive treat-
across the three groups at t 1, : : : , 10 have a mean of ment effect on Tip. Although the slopes for both T1 (p
0.559 and a minimum of 0.184. In addition, when 0.028) and T2 (p 0.012) are significantly steeper
t ≤ 10, we are unable to reject the hypothesis that the than C, there is no significant difference in the slopes
slope in group C is the same as the slope in T1 and T2 between T1 and T2 (p 0.552). This data pattern again
(p 0.571). These tests provide additional support for suggests that the treatment effect on tip amount is
the success of our field experiment manipulations. By nonlinear and subject to diminishing returns. For the
comparing the slope of Nreal across groups when slope of TipRate shown in Figure 6(b), the downward
10 < t ≤ 120, we find that the slope of T1 is steeper slope for T1 is significantly less salient than C (p <
than C (p < 0.001) and the slope of T2 is steeper than 0.001) and significantly more salient than T2 (p
T1 (p < 0.001), which is consistent with the positive 0.043), and the slope for T2 is not significantly differ-
treatment effect on Nreal . As Figure 3 shows, the data ent from C (p 0.110), which implies an inverted U-
with Nsyn > 1, 200 result from a few unusually long ses- shaped relationship. Overall, the results from our
sions, and t 120 is approximately associated with the slope comparisons provide additional support for the
threshold of Nsyn 1, 200 (i.e., the maximum of Nsyn success of our manipulation in the experiment and
when t ≤ 120 is 1,153). Thus, we use the data truncated confirm similar treatment effects on Tip, Nreal , and Tip-
at 120 minutes in subsequent analyses to prevent dis- Rate found in previous mean comparisons with in-
proportionate influence from outliers. creased statistical power.
Lu et al.: Audience Size and Live Streaming Revenues Under Pay What You Want
Marketing Science, 2021, vol. 40, no. 5, pp. 964–984, © 2021 INFORMS 975
4.3. Regression Analysis viewers across groups (i.e., Nsyn grows faster in T2
A unique design of our field experiment is the ran- than T1 than C), but it also allows us to observe the
domization of synthetic viewers over time and across exact treatment strength (i.e., Nsyn ) at each minute.
sessions in addition to the first level of randomization This additional level of randomization enables us to
across broadcasters. This manipulation not only gives exploit the richer temporal variation to identify the ef-
rise to the differential growth rates of synthetic fect of audience size on tipping revenues.
To estimate the causal impact of audience size, we em- error term kt if there are unobserved factors (e.g., am-
ploy a linear panel regression with session fixed effects. bience) that affect both viewers’ participation and tip-
Compared with the mean comparison estimate, we are ping decisions. To address this potential endogeneity
syn
able to explicitly account for the variation in treatment issue, we use Nkt and its squared term as instrument
strength in the regression, and therefore, the estimate is disp
variables (IVs) for Nkt and its squared term. These are
no longer subject to aggregation bias. An additional ben- syn disp
valid IVs because Nkt is correlated with Nkt by defini-
efit of the regression analysis is that the coefficient esti-
Downloaded from informs.org by [68.101.74.150] on 24 March 2023, at 08:12 . For personal use only, all rights reserved.
Table 7. Regression Results of Models of Tip, TipRate, and Number of Real Viewers
(3.34e-4) (1.09e-5)
Lag Nreal 0.757** 0.754**
(0.016) (0.016)
Nsyn 0.021** 0.028**
(0.004) (0.003)
(Nsyn )2 −1.64e-5*
(6.83e-6)
t −2.57** −3.11** –0.057** –0.067** –0.040** –0.045**
(0.770) (0.819) (0.020) (0.022) (0.008) (0.007)
Tenure −4.97** −5.40** –0.103 –0.111 –0.238** –0.246**
(1.70) (1.82) (0.061) (0.064) (0.029) (0.028)
Session fixed effects Yes Yes Yes Yes Yes Yes
Endogeneity correction Yes Yes Yes Yes
Number of observations 53,893 53,893 53,893 53,893 52,833 52,833
Notes. Robust standard errors clustered by broadcasters in parentheses. Model 1 considers linear effect only;
Model 2 includes both N and N2.
**p < 0.01; *p < 0.05.
where we include the number of real viewers from TipRate, suggesting the dominance of social image–
the last period to capture the inertia in viewers’ partic- related utility over seeking reciprocity in driving view-
ipation decisions. We also use Nsyn rather than Ndisp as ers’ tipping when Ndisp is not too large. When Ndisp
the independent variable because Ndisp Nreal + Nsyn , exceeds a certain threshold (0.022/[2 × 2.32e-5] 474,
disp
suggesting that Nkt perfectly correlates with kt and, which is the 94.2 percentile), the relationship reverses,
therefore, cannot serve as an independent variable. perhaps because of the dominance of the negative com-
As Table 7 shows, adding synthetic viewers draws petition effect.
more real viewers to live streaming sessions.8 This is
supported by the positive and significant coefficients of
5. Model Extensions and
Nsyn (0.021, p < 0.01), which suggests the existence of
herding in a live streaming context. Adding 50 synthet- Robustness Checks
ic viewers results in approximately one additional real We first extend our main analysis by further exploring
viewer. Although we find diminishing returns of the ef- the heterogeneity in the treatment effect of increasing
2
fect, the relatively small coefficient of (Nsyn ) suggests audience size. We then conduct a series of robustness
that the effect of audience size on viewer participation checks to show that our findings are not sensitive to
is mostly positive in our data range. That is, the effect is alternative model specifications and assumptions.
positive until Nsyn exceeds 0.028/(2 × 1.64e-5) 854,
which is the 99.2 percentile in our data. 5.1. The Moderating Effect of Broadcaster
Characteristics
4.3.2. Does a Larger Audience Encourage or Discour- 5.1.1. Does the Effect of Audience Size Differ by
age Average Tip Amount per Viewer? As audience Broadcaster Gender? Previous research suggests that
size increases, a viewer tends to obtain greater social herding occurs if a viewer refers to existing viewers’
image–related utility from tipping and, thus, might in- choices as a descriptive social norm and, thus, pas-
crease the tip amount. Nevertheless, a larger audience sively mimics their decisions to follow well-attended
may also suggest greater competition for the broad- sessions (Croson and Shang 2008, Simonsohn and
caster’s reciprocal acts, which results in a lower Ariely 2008). Our pretreatment data show that live
marginal utility of tipping and further reduces the streaming sessions by female broadcasters are better
willingness to tip. To empirically test the direction of attended than sessions by male broadcasters as indi-
the effect, we estimate the model of TipRate using cated by the significant difference (8.00, p < 0.01) in
Equation (4) with ykt TipRatekt . We report the results the number of chatters per session between female
from the IV estimation in Table 7 under the column and male broadcasters. The theory of social norm sug-
DV TipRate. The coefficient estimates reveal an over- gests that a viewer’s choice of a mainstream product
all positive relationship between the audience size and is more justifiable than the choice of a niche product,
Lu et al.: Audience Size and Live Streaming Revenues Under Pay What You Want
978 Marketing Science, 2021, vol. 40, no. 5, pp. 964–984, © 2021 INFORMS
and thus, other viewers are more likely to follow this rather than a male broadcaster (Sundie et al. 2011).
viewer to adopt the mainstream rather than the niche Given the male-dominant audience on our platform,
product. Following this reasoning, as female broad- we expect the utility related to social status and pres-
casters are the mainstream on our platform, we expect tige to be greater in sessions by female than male
the effect of popularity information on viewer partici- broadcasters, which predicts a stronger effect of audi-
pation to be stronger in sessions by female broadcast- ence size on average tip amount (a form of conspicu-
ers than male broadcasters. ous consumption) in female broadcasters’ sessions.
Nevertheless, previous research also indicates that To test the moderating effect of broadcaster gender,
consumers often engage in observational learning to we divide the sample into female versus male broad-
draw quality inference from the observation of others’ casters and reestimate Equations (4) and (5). Table 8 re-
decisions (e.g., Banerjee 1992, Zhang and Liu 2012). For ports the results. Although the main effects of audience
example, Tucker and Zhang (2011) show that niche size on Tip are both significant and positive for female
products with narrow appeal benefit more from popu- and male broadcasters, the coefficient of Ndisp is smaller
larity information than broad-appeal products in e- for male broadcasters. Next, we compare the effect of
commerce because consumers infer higher quality audience size on Nreal and TipRate between female and
from a narrow-appeal product than an equally popular male broadcasters, respectively. We find that the coeffi-
broad-appeal product. Given the majority of male cient of Nsyn is larger in the model of Nreal for female
viewers and female broadcasters, we can classify ses- broadcasters (0.032, p < 0.01) than for male broadcasters
sions by male broadcasters as niche products. Thus, if (0.016, p < 0.01). This result provides suggestive evi-
observational learning drives herding in our data, male dence that social norm rather than observational learn-
broadcasters benefit more than female broadcasters ing drives the herding in viewers’ participation in live
from the increase in popularity in drawing viewers. streaming. Regarding the regression results of TipRate,
Broadcaster gender may also moderate an individu- we find only a significant effect of Ndisp for female broad-
al’s motivation for tipping. Theories from evolution- casters (0.025, p < 0.01) although the effect is not statisti-
ary psychology suggest that men are likely to display cally significant for male broadcasters (0.011, p > 0.05).
financial resources as a tactic to signal status and gain This result implies that the motivation of signaling social
attention from female mates, which results in a higher status only exists in female broadcasters’ sessions, which
level of conspicuous consumption by men than wom- is consistent with the theory of intrasexual competition.
en (Buss 1988, Griskevicius et al. 2007, Kenrick and We show in Online Appendix C that the differential ef-
Griskevicius 2013). In our context of live streaming, fects of audience size on Tip, TipRate, and Nreal between
the theory of intrasexual competition predicts that female and male broadcasters still hold when we test
such status-signaling motivation among male custom- the moderating effect using interaction terms between
ers is more likely to appear in the presence of a female audience size and gender indicator variable.
Lu et al.: Audience Size and Live Streaming Revenues Under Pay What You Want
Marketing Science, 2021, vol. 40, no. 5, pp. 964–984, © 2021 INFORMS 979
5.1.2. Does the Effect of Audience Size Vary by Broad- approximately 180% (4.03e-4 × [6,600 – 403]/1.40
caster Fame? Viewers on the live streaming platform 1.78) more than an average broadcaster from the in-
do not know all broadcasters equally. From a viewer’s crease in audience size. The coefficient of Ndisp × Fame in
perspective, tipping a star broadcaster rather than a the regression of TipRate is positive but statistically in-
nonstar broadcaster can improve the viewer’s social significant, which does not support the enhanced social
image because of the perceived affiliation with a more image from tipping to top-ranked broadcasters. The
prestigious group in front of other viewers, holding positive moderation of Fame in the effect of audience
Downloaded from informs.org by [68.101.74.150] on 24 March 2023, at 08:12 . For personal use only, all rights reserved.
everything else equal. Thus, we might expect a greater size on Tip seems to be driven by the stronger effect on
effect of audience size on tipping revenue in sessions viewer participation as suggested by the positive and
by more famous broadcasters. significant coefficient of Nfake × Fame in the regression of
We operationalize the fame of each broadcaster by Nreal . As sessions by more famous broadcasters’ exhibit
the dollar value of all tips received during the one- more viewers, this result is consistent with our previous
month pretreatment period because the platform pub- finding that social norm rather than observational learn-
licizes the rank of broadcasters on the basis of their ing drives herding on our platform.
monthly earnings with the top broadcaster earning the
most. The variable Famei , defined as the monthly tip 5.2. The Moderating Effect of Broadcaster
amount, has a mean of 403.1 yuan and a maximum of Performances
6,600 yuan in our sample, suggesting that none of the We further investigate the potential moderating effect
broadcasters are superstars in the conventional sense. of broadcaster performances. As time-variant perfor-
Nevertheless, we still have a large variation in the mance metrics are potentially endogenous to the treat-
monthly earnings of broadcasters (SD 809.7), which ment, the results derived from this model extension
enables us to identify the moderating effect of fame. are better interpreted as exploratory rather than con-
We add interaction terms N × Famei and N2 × Famei clusive. Our main analysis reveals a positive causal im-
to Equations (4) and (5) and report the regression results pact of audience size on viewers’ tipping when the au-
in Table 9. The coefficient of Ndisp × Fame is significant dience size is not too large, which is consistent with
and positive in the regression of Tip, suggesting a stron- the existence of social image–related utility in tipping.
ger effect of audience size in sessions by more famous In a live steaming session, a broadcaster’s recognition
broadcasters. The most earned broadcaster benefits of a viewer’s action would likely garner attention from
other viewers, thus enhancing the social image–related
Table 9. The Moderating Effect of Broadcaster Fame utility of the focal viewer. For example, a broadcaster
may welcome a viewer when this viewer joins a ses-
DV Tip DV TipRate DV Nreal sion or express gratitude to the viewer for the tip re-
Ndisp 1.40** 0.021** ceived. In line with this logic, a viewer is more likely
(0.293) (0.008) to tip in front of other viewers in anticipation of a
(N disp )2
–0.001** −2.23e-5* broadcaster’s recognition.9 We, therefore, expect a
(2.73e-4) (1.04e-5)
broadcaster’s recognition tendency to positively mod-
Ndisp × Fame 4.03e-4** 3.87e-6
(1.17e-4) (2.66e-6) erate the effect of audience size on viewers’ tipping.
(Ndisp )2 × Fame −2.73e-7 −2.68e-9 To test this hypothesis, we operationalize a broad-
(1.85e-7) (6.80e-9) caster’s recognition tendency by the frequency of say-
Lag Nreal 0.709** ing words, as suggested by the platform, related to
(0.024)
greeting and gratitude toward a specific viewer dur-
Nsyn 0.032**
(0.003) ing a session: “welcome,” “hello,” “thank you,” and
(Nsyn )2 −1.70e-5** “many thanks” (in Chinese). We create Recognitionkt ,
(5.58e-6) the count of these four words by the broadcaster in
Nsyn × Fame 7.03e-6** session k during minute t, by employing the state-of-
(1.75e-6)
(Nsyn )2 × Fame
the-art speech recognition algorithms detailed in On-
1.89e-9
(7.19e-9) line Appendix B.
t −3.01** –0.066** –0.054** We also create three other metrics (i.e., FaceTimekt ,
(0.863) (0.022) (0.008) HandTimekt , and Emotionkt ) from the video content as
Tenure −5.19** –0.109 –0.275** proxies of the intensity and quality of the broadcast-
(1.76) (0.065) (0.033)
er’s performance (for definitions and summary statis-
Session fixed effects Yes Yes Yes
Endogeneity correction Yes Yes tics, see Table 3). Specifically, we measure the intensi-
Number of observations 53,893 53,893 52,833 ty of performance according to the percentage of
frames in which the broadcaster’s face and hands ap-
Notes. Robust standard errors clustered by broadcasters in parenthe-
ses. Ndisp , Nsyn , and Fame are mean-centered. pear in front of the screen within a minute. Intuitively,
**p < 0.01; *p < 0.05. the longer a broadcaster disappears from the screen,
Lu et al.: Audience Size and Live Streaming Revenues Under Pay What You Want
980 Marketing Science, 2021, vol. 40, no. 5, pp. 964–984, © 2021 INFORMS
the lower the performance intensity perceived by audience size. The stronger effect on Tip results from
viewers. We assume that the quality of a session is as- the moderating effect of Recognition on TipRate as indi-
sociated with a broadcaster’s emotion, which we infer cated by the positive and significant coefficient of
from facial expressions using Microsoft Azure Emo- Ndisp × Recognitionkt in the model of TipRate. These
tion. This program detects and analyzes facial expres- findings again provide evidence for the existence of
sions using deep convolutional neural networks. The social image–related utility in tipping.
output of these algorithms is a vector of scores associ- Further analyses provide additional evidence in
Downloaded from informs.org by [68.101.74.150] on 24 March 2023, at 08:12 . For personal use only, all rights reserved.
ated with eight types of positive and negative emo- support of our proposed mechanism through the so-
tions (see Online Appendix B). We aggregate these cial image–related utility. We examine whether broad-
scores to create an emotion score, Emotionit , to mea- casters’ behaviors (i.e., FaceTimekt , HandTimekt ,
sure the extent of positivity in a broadcaster’s facial Emotionkt ) that are unlikely to be related to the signal-
expression within a minute. ing value of a viewer’s tipping also moderate this pos-
To evaluate the moderating effects, we add interac- itive effect. Note that these types of broadcaster be-
tion terms N × Recognitionkt and N2 × Recognitionkt to haviors differ from Recognitionkt as a broadcaster’s
Equations (4) and (5). We also include Recognitionkt as body language and facial expressions are often made
an independent variable because of the variation of to the general audience rather than to a particular
Recognitionkt over time and across sessions, which ses- viewer. Therefore, we do not expect FaceTimekt ,
sion fixed effects do not capture. We present estima- HandTimekt , or Emotionkt to moderate the impact of
disp
tion results in Table 10. The results show that the posi- Nkt on social image–related utility of tipping; thus,
tive impact of audience size on Tip is stronger in a we propose a falsification test. To conduct the pro-
session in which the broadcaster is more likely to ex- posed falsification test, we reestimate the model of Tip
press recognition. Broadcasters whose frequency of in a similar way as we did for the test for Recognition.
using recognition-related words is one standard devi- The results in Table 11 confirm that neither of these
ation above population mean enjoy 10.0% (6 × 0.023/ three performance metrics moderates the effect of au-
1.38 0.100) more revenue from the increase in dience size on tipping revenue. Aside from broadcast-
er performances, we explore a few other potential
moderators in Online Appendix C and find that past
Table 10. The Moderating Effect of Broadcaster’s
Recognition Tendency tips and the number of real viewers moderate the ef-
fect of audience size on tipping, but viewer engage-
DV Tip DV TipRate DV Nreal ment does not moderate this effect.
Ndisp 1.38** 0.021**
(0.295) (0.008) 5.3. Robustness Checks
(N disp )2 We conduct a series of analyses to ensure the robust-
–0.001** −2.35e-5*
(2.90e-4) (1.16e-5) ness of our main findings in Online Appendix D. We
Ndisp × Recognition 0.023* 7.99e-4* first show that the positive effect of audience size on
(0.010) (3.22e-4)
(Ndisp )2 × Recognition
tipping revenue still holds after controlling for poten-
−1.22e-5 −4.61e-7
(3.38e-5) (8.22e-7) tial confounding factors, including the proportion of
Lag Nreal 0.726** synthetic viewers, the proportion of new viewers, and
(0.021) the potential spillover effects of other concurrent
Nsyn 0.031** streams.
(0.004)
(Nsyn )2
We then consider an alternative model specification
−2.11e-5**
(6.38e-6) without session fixed effects but with the time-of-day
Nsyn × Recognition −1.44e-4 and social norm effects. We still find a mostly positive
(1.51e-4) effect of audience size on each of the three outcome
(Nsyn )2 × Recognition 1.23e-6** variables (Tip, TipRate, Nreal ). In addition, total tips
(3.22e-7)
tend to be higher in sessions by female and famous
Recognition 0.069 –0.021 3.52e-4
(1.04) (0.026) (0.005) broadcasters. Viewers tend to tip more in talk shows
t −2.98** –0.065** –0.051** than other genres. Regarding the timing of broadcast-
(0.814) (0.022) (0.009) ing, sessions that start in the early morning (8 a.m.)
Tenure −4.67** –0.084 –0.275** and late afternoon (4 p.m.) receive significantly fewer
(1.64) (0.059) (0.034)
tips than sessions that start at midnight.
Session fixed effects Yes Yes Yes
Endogeneity correction Yes Yes We exclude all short sessions that last no more than
Number of observations 53,893 53,893 52,833 10 minutes in our regressions because of the lack of
variation in the number of synthetic viewers during
Notes. Robust standard errors clustered by broadcasters in parenthe-
ses. Ndisp , Nsyn , and Recognition are mean-centered. the first 10 minutes. In total, 482 out of 2,222 sessions
**p < 0.01; *p < 0.05. are short sessions and contribute to 0.47% of total
Lu et al.: Audience Size and Live Streaming Revenues Under Pay What You Want
Marketing Science, 2021, vol. 40, no. 5, pp. 964–984, © 2021 INFORMS 981
DV Tip
revenues. The exclusion of short sessions might give tipping is through changes in service quality mani-
rise to a sample selection problem if the social benefits fested in broadcaster behavior. Specifically, none of
of tipping systematically differ between short and the four lag broadcaster performance metrics has a
long sessions. We provide evidence that our findings statistically significant relationship with either Tip or
do not seem to be subject to the sample selection bias. TipRate, suggesting that broadcaster behavior does
We not only observe data across the three treatment not mediate the positive main effect of audience size
groups but also have data before and after the experi- on tipping. For broadcaster performance, we find that
ment, which allows us to check the robustness of find- a larger audience size generally increases the duration
ings using a difference-in-differences (DID) analysis. of a broadcaster’s face appearing in front of the cam-
We present corroborating evidence of the positive and era (i.e., FaceTime). This finding suggests that broad-
nonlinear treatment effect on tipping revenue from casters tend to increase their broadcasting effort when
the estimation of a DID model. having a larger audience. Although we did not find a
To further test the proposed behavioral mechanism significant relationship between FaceTime and view-
through social image, we investigate the effects of au- ers’ immediate tips, an increased frequency of show-
dience size on chats and likes as a falsification test. We ing faces to viewers might improve the viewer-broad-
find a negative main effect of audience size on Cha- caster relationship in the long run and, therefore, lead
tRate and a null effect on LikeRate, which are consis- to a greater scalability of PWYW. Given the relatively
tent with the absence of social image in these nonmon- short span of our data, we are unable to test this hy-
etary engagement activities. One possible explanation pothesis in this study. We, therefore, leave these inter-
for the main negative effect on ChatRate is the substi- esting questions pertaining to the long-run benefits of
tution between tipping and chatting. When audience a large audience to future research.
size grows, the increasing social image–related utility Finally, we conduct a descriptive analysis by examin-
in tipping may drive viewers to switch from chatting ing the correlations between audience size and key out-
to tipping, which explains the downward trend in come variables using data from the control group
ChatRate against audience size. Notably, as nonmonet- alone, in which all viewers are real. Consistent with our
ary engagement activities, such as chats and likes, are main findings from the experiment, we observe a most-
often used as key health indicators by online commu- ly positive relationship between audience size (Ndisp )
nities and platforms, the decrease in chats against au- and total tips (Tip) and a mostly positive relationship
dience size could be a potential downside of the between audience size and average tip per real viewer
PWYW model. (TipRate). These observations suggest that not only hav-
We also empirically rule out an alternative mecha- ing more synthetic viewers, but also having more regu-
nism that the positive effect of audience size on lar real viewers, may improve tipping revenue.
Lu et al.: Audience Size and Live Streaming Revenues Under Pay What You Want
982 Marketing Science, 2021, vol. 40, no. 5, pp. 964–984, © 2021 INFORMS
6. Discussion and Conclusion on the assumption that either the tipping rate does not
Fueled by the advances in mobile technology and in- depend on the broadcaster’ performance quality or the
ternet penetration, the live streaming industry has broadcasters’ performance quality does not depend on
grown to be a billion-dollar market and has become a the audience size. When this assumption does not
viable source of income for thousands of people hold, splitting viewers into smaller groups might low-
across the globe. As major players in the social media er the broadcaster’s performance quality, which, in
industry embrace live streaming, the question regard- turn, lowers the tipping amount. Although we find
Downloaded from informs.org by [68.101.74.150] on 24 March 2023, at 08:12 . For personal use only, all rights reserved.
ing the scalability of the PWYW-based revenue model support for this assumption in our empirical data, we
(i.e., whether larger audiences generate greater reve- caution firms that they should scrutinize this assump-
nues) remains unanswered. tion before following our recommendation of splitting
We find a mostly positive casual effect of audience viewers.
size on tipping revenues using data from a field ex- A caveat of using PWYW in live streaming is that it
periment run on a live streaming platform that uses does not scale linearly with the viewership as other
PWYW. However, the effect of audience size is con- monetization methods do, such as advertising and
cave, and it can even turn negative when audience subscription. Thus, a live streaming firm might con-
size is relatively large. By decomposing the net effect sider diversifying its monetization methods to accom-
on revenue to the effect on viewer size and on tip per modate advertising and/or subscriptions when it
viewer, we find that the positive effect on total reve- reaches a certain stage. Although we used synthetic
nue results from a positive effect of popularity infor- viewers as a part of the research design, we do not
mation on viewer size and a mostly positive effect on recommend platforms to add synthetic viewers to live
individual payment. The former finding is consistent streaming sessions to increase revenues because of the
unethical nature of this practice. Instead, given the im-
with theories related to herding; the latter finding sug-
portance of social image–related utility in PWYW, live
gests the dominance of social image motive over seek-
streaming platforms could offer additional status-
ing reciprocity in driving a viewer’s tipping in live
seeking devices, such as badges, avatars, or ranking
streaming. We also find that famous female broadcast-
systems, to further improve revenues. We also caution
ers who use recognition-related words frequently ben-
firms that intentionally manipulating audience size
efit the most from an increase in audiences. In terms
may backfire because the revenue will actually de-
of mechanisms, we provide suggestive evidence for
crease rather than increase when the number of dis-
the importance of social image–related utility in
played viewers is relatively large.
PWYW through moderation analyses and falsification We focus on the context of live streaming in this re-
tests. We also scrutinize the possibility that audience search. However, the question pertaining to scalability
size can improve the performance quality of broad- extends to other important formats of virtual commu-
casters and, thus, have an indirect effect on tipping. nication, such as massive open online courses
The lack of a statistically significant relationship be- (MOOC). As consumers are increasingly migrating
tween broadcasters’ recent performances and tipping from the traditional off-line communication to the
suggests the absence of such an indirect effect. “new normal” of online/virtual communication, it is
Our research adds to the PWYW literature by pro- imperative for firms to understand and quantify the
viding the first assessment of the scalability of reve- marginal benefit of expanding the audience given that
nues generated by PWYW, which sheds light on the the marginal cost of holding a larger audience is negli-
viability of PWYW as a key monetization tool for gible in the virtual space. We take the first step in this
firms, especially for social media firms that rely on research to examine the scalability of PWYW as a
UGC. In addition, we complement previous PWYW business model for live streaming. Future research
studies by showing that an important motive for con- could extend our research design to explore the scal-
sumers to pay under PWYW is to improve their social ability of other forms of virtual communication. For
images, at least in the live streaming context. We also example, it will be interesting to study how the overall
contribute to the burgeoning live streaming literature learning effectiveness of a MOOC might vary with the
by presenting one of the first few studies to use field number of enrolled students.
rather than survey data. Our study has limitations, which suggest avenues
Our findings provide several important implications for future research. One limitation of this study is the
to marketing practitioners. Because of the concave re- relatively small sample size; thus, larger sample size
lationship between audience size and tipping revenue, exploration should help establish robustness. In addi-
platforms might consider splitting viewers into multi- tion, we do not observe individual-level data and,
ple chat rooms to alleviate their declining marginal therefore, are unable to assess the potential heteroge-
benefit of tipping, especially for extremely populous neity in treatment effects across viewers. For example,
sessions. The validity of this recommendation hinges viewers who tipped a lot in the past might gain more
Lu et al.: Audience Size and Live Streaming Revenues Under Pay What You Want
Marketing Science, 2021, vol. 40, no. 5, pp. 964–984, © 2021 INFORMS 983
benefit from broadcasters’ attention than those mar- recognition tendency can be driven by a broadcaster’s politeness
rather than reciprocity.
ginal viewers who rarely tipped. Should individual-
level data be available, researchers could extend our
analysis to investigate heterogeneous treatment effects References
across viewers. We started the treatment after the first Andreoni J (1988) Privately provided public goods in a large econo-
10 minutes, which might give rise to the sample selec- my: The limits of altruism. J. Public Econom. 35(1):57–73.
tion problem. Although we find no strong evidence of Arellano M, Bond S (1991) Some tests of specification for panel data:
Downloaded from informs.org by [68.101.74.150] on 24 March 2023, at 08:12 . For personal use only, all rights reserved.
He Q, Liu J, Wang C, Li B (2016) Coping with heterogeneous video Qin A (2016) China’s viral idol: Papi Jiang, a girl next door with atti-
contributors and viewers in crowdsourced live streaming: A tude. The New York Time Online (August 24), https://www
cloud-based approach. IEEE Trans. Multimedia 18(5):916–928. .nytimes.com/2016/08/25/arts/international/chinas-viral-idol
iiMedia Research (2019) 2018-2019 China online live broadcasting in- -papi-jiang-a-girl-next-door-with-attitude.html.
dustry research report. Accessed May 25, 2021, https://www. Roettgers J (2018) Facebook adds ability to tip live streamers to mo-
iimedia.cn/c400/63478.html. bile apps. Variety Online (April 16), https://variety.com/2018/
iiMedia Research (2020) 2019-2020 China online live streaming mar- digital/news/facebook-tipping-in-app-purchases-1202754904/.
ket research report. Accessed May 25, 2021, https://www. Schmidt KM, Spann M, Zeithammer R (2015) Pay what you want as
iimedia.cn/c400/69017.html.
Downloaded from informs.org by [68.101.74.150] on 24 March 2023, at 08:12 . For personal use only, all rights reserved.