IE Business School
TABLE OF CONTENTS
1. INTRODUCTION
2. BUSINESS OPPORTUNITY
3. MODERATION OF USER GENERATED CONTENT
3.1 USER MODERATION TECHNIQUES
3.2 IMPOSED MODERATION TECHNIQUES
4. TECHNOLOGICAL APPROACH
5. CONCLUSIONS
APPENDIX
1. INTRODUCTION
The rise of social computing and online communities has ushered in a new era of content
delivery, in which information can be easily shared and accessed. A large number of
applications have emerged that facilitate collective action for content generation and
knowledge sharing. Examples include blogs, online product reviews, wiki applications such
as Wikipedia, and online forums such as slashdot.org. Because Internet users are
anonymous, however, ensuring information quality and inducing quality contributions
remains a challenge.
Information sharing and user-generated content have become ubiquitous online phenomena.
For example, Wikipedia, a free online encyclopedia, is dedicated to massive distributed
collaboration by allowing visitors to add, remove, edit and change content. In online
product reviews, like the ones on Amazon.com, any user can post reviews on any item,
even if he or she has not bought it on Amazon. As these applications have gained
popularity and importance, the quality of their content has become a concern. Wikipedia
readers may be presented with content that is misleading or even incorrect. Product reviews
on Amazon can be manipulated by sellers or book publishers to boost their products. On
Slashdot, commentators may post biased or useless comments; for example, advertisers
from hardware companies may promote their own products.
People everywhere are getting together via the Internet in unprecedented ways. Millions
create content, inform each other about global issues, and build new communications
channels in a connected, always-on society. The rise of user-generated journalism is
generally attributed to blogs. Users and readers want to be heard. They want to influence
what they are reading. Unfortunately this revolution in user generated content means that
not all information on the Internet is suitable for all of its users. While many companies
maintain high standards of decency and refuse to allow any indecent material on their
computer network, not all companies have the capabilities to do so. Internet malcontents
have turned too many of the “group sites” into cyber-‐graffiti walls, filled with offensive
comments. Malcontent on the websites threaten to spoil the members’ experience on the
websites. To deal with this menace, this report introduces a new way to automate the
moderation of user-generated content. The system identifies not only the bad behavior of
users but also the good.
2. BUSINESS OPPORTUNITY
Publishers and blog owners are looking for ways to effectively monitor the content on their
websites. Effective monitoring not only maintains the editorial standard of the publishing
house but also positively engages users on the website. With the market for user-generated
content set to rocket over the next few years, the key question for publishers and
advertisers remains: how best to monetize the rapid growth of and demand for UGC whilst
ensuring a brand-safe environment?
The importance of this market, in comparison to the traditional advertising market, is clear.
As budgets tighten due to global financial uncertainties, brand marketers are looking to
maximize the ROI potential of social media ahead of other digital media channels. Rich
media such as user-generated content that is uploaded and viewed from social media
communities is going to be a major global trend for brands to adopt and commercialize, and
for advertisers to take advantage of in terms of reaching new audiences.
Major enterprise publishers will look to harness this potentially highly lucrative revenue
opportunity. Every major brand will create and manage its own digital community and
every consumer will be able to share their voice. However, the potential of this demand will
only be realized if advertisers can be assured that their brand is safe within the user
generated content market.
[Figure 1 diagram: a cycle in which user-generated content is moderated, content
moderation improves the end-user experience, the enhanced user experience strengthens the
user community, and advertisers generate additional revenue by monetizing the UGC.]
Figure 1: The cycle in which effective moderation improves the user experience and
generates additional revenue from content monetization.
Indeed, for any user-generated content to be of significant value to the advertising
community, and for that community to harness the rapid growth in user-generated content,
it is absolutely critical that the environment is moderated and therefore brand-safe.
Moving forward, the owners and publishers of social media and social networking
sites should be compelled to take responsibility for the content that is displayed on their
sites, especially if effective monetization is a business aim. A lack of moderation may be
acceptable amongst the 'not for profit' sector but as soon as sites look to monetize their
inventory through advertising, the question of moral and corporate responsibility is very
real.
With the world market for user-generated content predicted to grow rapidly in the coming
years -- rising from $200m at present to $2.46bn by 2012, according to ABI Research --
the necessity for publishers to offer advertisers the safeguard of a moderated, brand-safe
method of harnessing this growth is clear. Magnified by the current economic concerns, the
notion of ignoring this potentially critical revenue stream whilst competitors snatch the
opportunity is unthinkable.
3. MODERATION OF USER GENERATED CONTENT
Nearly every UGC project has built-in tools for moderating content that allow the owner,
the client, or a trained team of outsourcers to remove offensive content, and perhaps even
drive participation. Moderation techniques can be classified into two categories:
1. User moderation – Publishers set up their own editorial standards and communicate
them to users through a user agreement. These techniques rely on users to post
comments that respect the community's values and follow the terms and conditions
of the publisher's website.
2. Imposed moderation – Publishers use additional tools, such as filters and human
moderation, to effectively monitor and control the quality of user-generated
content.
• Craft the guidelines – Publishing websites often have their own terms and conditions
that users have to accept before registering on the website. These terms and
conditions set the rules and regulations that users are bound to adhere to.
• Enlist the users - In social groups the majority of the group’s members are
interested only in a positive experience. Given the opportunity, many community
members are more than willing to lend a hand and help protect the safety and
quality of a project.
• Make moderation actions visible - When moderation controls are completely hidden,
an implicit invitation is given to online trolls to try to abuse the system, which in
turn creates extra work for the moderators. If the community knows that
inappropriate content will be removed quickly because they’ve seen clear signs of
that very thing happening, there will be less reason to test the boundaries.
• Automated filters - The first line of defense against malcontent is automating the
moderation process through smart filters. Filters can help ensure that certain types
of content never even appear to users of the site.
• Human moderation - In order to assure more complete brand protection, a human is
going to have to verify most or even all of a site’s content. Some sites operate on a
pre-moderated method (nothing goes live before specifically being approved), while
others operate on a post-moderation method (content goes live immediately, but is
ultimately reviewed by a moderator who may accept, reject or edit the content
according to client guidelines).
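The imposed-moderation tools above can be tied together in a simple pipeline: an automated filter acts as the first line of defense, filter hits are held for pre-moderation, and everything else goes live subject to post-moderation review. The sketch below illustrates this under assumed names; the word list and function names are illustrative, not part of any specific product.

```python
# Minimal sketch of a moderation pipeline: automated filter first,
# then pre-moderation (held until approved) or post-moderation
# (live immediately, reviewed later). The word list is an assumption
# standing in for a client-specific blocklist.

BLOCKED_WORDS = {"offensive", "abusive"}

def submit(text, pending_queue, review_queue):
    """Route a submission through the filter to the right human queue."""
    if any(word in text.lower().split() for word in BLOCKED_WORDS):
        # Filter hit: hold for pre-moderation; nothing goes live yet.
        pending_queue.append(text)
        return {"text": text, "live": False}
    # Passed the filter: publish now, review later (post-moderation).
    review_queue.append(text)
    return {"text": text, "live": True}

pending, review = [], []
a = submit("a perfectly normal comment", pending, review)
b = submit("an offensive comment", pending, review)
print(a["live"], b["live"])  # True False
```

A pre-moderated site would simply route every submission into the pending queue; a post-moderated site routes everything into the review queue.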
Right now, most publishers rely on a combination of keyword filters and human
moderators to maintain their editorial standards. Unfortunately, there are problems with
both of these approaches. Keyword filters are a notoriously poor defense, as users can beat
them by simply replacing a letter with a symbol. Human moderators, on the other hand, are
expensive and occasionally biased. Given this poor choice of options, many publishers
have chosen to avoid UGC entirely. However, given the need for publishers to maintain
relevance in an increasingly competitive marketplace, this is no longer an option: UGC is
quickly becoming a necessity.
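The letter-for-symbol trick that defeats keyword filters, and the normalization that counters it, can be sketched as follows. The substitution table and blocklist are illustrative assumptions, not a production word list.

```python
# Sketch of why naive keyword filters fail, and how normalizing
# common letter-for-symbol substitutions ("leetspeak") helps.
# LEET_MAP and BLOCKLIST are illustrative assumptions.

LEET_MAP = str.maketrans({"@": "a", "$": "s", "0": "o", "1": "i", "3": "e"})
BLOCKLIST = {"spam", "scam"}

def naive_filter(text):
    """Match blocklisted words against the raw, lowercased text."""
    return any(word in text.lower().split() for word in BLOCKLIST)

def normalised_filter(text):
    """Undo symbol substitutions before matching."""
    cleaned = text.lower().translate(LEET_MAP)
    return any(word in cleaned.split() for word in BLOCKLIST)

print(naive_filter("this is $p@m"))       # False: the symbols beat the filter
print(normalised_filter("this is $p@m"))  # True: normalization catches it
```

Normalization raises the bar but does not eliminate evasion, which is why the report argues for a trained classifier rather than keyword matching alone.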
4. TECHNOLOGICAL APPROACH
CoMo breaks malcontent down into sub-categories such as "Discriminatory",
"Inflammatory", and "Violent Threats" that are easier for a classifier to learn.
The same approach can be used to identify quality contributions, with sub-categories
such as "Congenial", "Insightful", and "Informative".
CoMo has the following technological features:
• CoMo is trained according to the client's editorial standards using historical data.
• CoMo is updated on a regular basis using feedback from the client's own user
community.
• CoMo moderates content in real time and tends to eliminate backlogged and
pending content on even the busiest sites.
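The train-then-update loop described in the features above can be sketched as follows. CoMo's internals are not described in this report, so the class, method names, and the simple word-count scoring below are illustrative assumptions standing in for the real model.

```python
# Sketch of the CoMo-style lifecycle: initial training on the client's
# labelled history, ongoing updates from community feedback, and a
# real-time score. The word-count model is a deliberate simplification.

from collections import Counter

class ModerationModel:
    def __init__(self):
        self.counts = {"ok": Counter(), "bad": Counter()}

    def train(self, labelled_history):
        """Initial training from the client's historical, labelled UGC."""
        for text, label in labelled_history:
            self.counts[label].update(text.lower().split())

    def feedback(self, text, correct_label):
        """Regular updates using feedback from the client's own users."""
        self.counts[correct_label].update(text.lower().split())

    def score(self, text):
        """Crude real-time score: fraction of words seen more often in 'bad'."""
        words = text.lower().split()
        bad = sum(1 for w in words if self.counts["bad"][w] > self.counts["ok"][w])
        return bad / len(words) if words else 0.0

model = ModerationModel()
model.train([("great insightful review", "ok"), ("buy cheap pills now", "bad")])
print(model.score("cheap pills here"))  # 'bad' vocabulary dominates
```

Because scoring is a lookup over word counts, moderation happens at submission time, which is what lets a system of this shape avoid a backlog of pending content.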
5. CONCLUSIONS
User-generated content has exploded in recent times. UGC has not only created the
challenge of maintaining editorial standards for publishers but has also generated an
opportunity for publishers to monetize UGC for advertisers. Successful monetization
requires content moderation, both to protect the publisher's brand and to effectively
control the menace of malcontent. Indeed, brand protection has always been a challenge
for content moderation systems. Finding users with positive intent can help reduce costs,
as companies may find that users with a positive reputation do not need to be moderated as
intensively as those with no history or a less-than-stellar reputation. A user who abuses
the rules of a website by uploading offensive or abusive material to a community will
receive a seriously reduced rating, which could result in that user's online reputation
being tarnished. Automated filters and human moderation are the standard ways to moderate
content, but both suffer from limitations. An automated system such as the one discussed
here will not only safeguard the online brand but also provide a more positive and safer
online experience for consumers, improving online safety while rewarding responsible
users. It not only helps deter cyberbullying and online abuse but also enables companies
to identify users who take a positive, active role in their communities.
APPENDIX
1. Bayesian Filtering
After training, the word probabilities (also known as likelihood functions) are used to
compute the probability that a piece of UGC containing a particular set of words belongs
to either the good or the bad category. Each word in the UGC contributes to the probability
that the content is good or abusive. This contribution is called the posterior probability
and is computed using Bayes' theorem. The overall malcontent probability is then computed
over all words in the UGC, and if the total exceeds a certain threshold (say 95%), the
filter marks the UGC as malcontent and deletes it.
The initial training can usually be refined when wrong judgments from the software are
identified (false positives or false negatives). That allows the software to dynamically adapt
to the ever-evolving nature of abusive content.
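The computation described above can be made concrete with a small worked example. It assumes equal priors for the two categories and the usual "naive" independence assumption between words; the per-word probabilities below are illustrative, as if estimated from labelled training data.

```python
# Worked sketch of the Bayesian filter: combine per-word likelihoods
# via Bayes' theorem (equal priors assumed) and flag the UGC when the
# posterior exceeds a threshold. The word probabilities are assumed
# values, not estimates from any real corpus.

import math

# P(word | class), as would be learned from labelled training data
p_word_bad = {"cheap": 0.8, "pills": 0.9, "review": 0.1}
p_word_good = {"cheap": 0.1, "pills": 0.05, "review": 0.7}

def malcontent_probability(text, threshold=0.95):
    """Return (posterior P(bad | words), flagged?) for a piece of UGC."""
    log_bad = log_good = 0.0  # log-space avoids underflow on long texts
    for word in text.lower().split():
        if word in p_word_bad:  # ignore words never seen in training
            log_bad += math.log(p_word_bad[word])
            log_good += math.log(p_word_good[word])
    # Bayes' theorem with equal priors: P(bad) = L_bad / (L_bad + L_good)
    p_bad = math.exp(log_bad) / (math.exp(log_bad) + math.exp(log_good))
    return p_bad, p_bad > threshold

prob, flagged = malcontent_probability("cheap pills")
print(round(prob, 3), flagged)  # 0.993 True
```

Feedback on false positives and false negatives, as described above, would adjust the per-word probability tables, which is what lets the filter adapt over time.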