
Fair and Balanced – Summary – Group 3

It is widely believed that media outlets exhibit ideological slant, yet very few studies attempt the cumbersome task of quantifying it. This paper is a distinctive addition to the literature on quantifying media bias because of its methodology: the authors use a hybrid approach that combines machine learning with crowdsourcing. The paper is also notable for challenging earlier findings of systematic bias in the U.S. media.

The paper starts with a discussion of the mechanisms through which bias can operate: issue filtering and issue framing. Previously, audience-based and content-based approaches were used to quantify media bias. The former yields a sensible ideological ordering of outlets, but only as a relative rather than absolute measure; the latter, although adopted by several economists, applies only to a subset of articles and thus limits the scope of research. In response to these limitations, the authors combine machine learning with crowdsourcing, pairing statistical methods with direct human judgments in a way that lets them investigate media bias directly and systematically at a scale that was previously infeasible.

The primary data are the articles published in 2013 by the top thirteen U.S. newspapers and two popular political blogs. The authors examined the complete web-browsing records of U.S.-located users who had installed the Bing Toolbar; for each of the fifteen news sites, they recorded every unique URL visited by at least ten toolbar users and estimated an article's popularity by tallying its views among those users. This yielded 803,146 articles across the 15 sites. The next step was to filter out political news articles, for which the authors built binary classifiers using large-scale logistic regression; given the scale of the classification task, the models were fit with stochastic gradient descent (SGD). Two classifiers were used, one for news and one for political content. Training a classifier requires both labeled articles and features. For features, they used each article's title and first hundred words, which are strongly indicative of its content. For labels, they used the popular crowdsourcing platform Amazon Mechanical Turk, hiring workers who passed thorough screening: good Mechanical Turk standing, a 98% approval rate, and a political-knowledge test. In the news classification task, workers were shown an article's title and first hundred words and asked to assign it to one of nine categories, roughly corresponding to the sections of a newspaper; to simplify the classification, these nine categories were then collapsed into two, "news" and "non-news".
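To make the classification pipeline concrete, here is a minimal numpy-only sketch of binary logistic regression trained with SGD, in the spirit of the classifiers described above. This is an illustrative toy, not the authors' implementation: the real system used far richer text features extracted from each article's title and first hundred words, and the toy term-count matrix below is invented for the example.

```python
import numpy as np

def sgd_logistic(X, y, lr=0.1, epochs=20, seed=0):
    """Fit binary logistic regression with plain stochastic gradient descent."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for i in rng.permutation(len(y)):            # one example at a time: SGD
            p = 1.0 / (1.0 + np.exp(-(X[i] @ w + b)))  # predicted P(label = 1)
            g = p - y[i]                              # gradient of the log loss
            w -= lr * g * X[i]
            b -= lr * g
    return w, b

def predict(X, w, b):
    return (X @ w + b > 0).astype(int)

# Toy "articles": rows are made-up term counts standing in for features
# built from a title plus the first hundred words of the body.
X = np.array([[3, 0, 1],   # heavy on political terms -> news (1)
              [2, 1, 0],
              [0, 3, 1],   # heavy on other terms -> non-news (0)
              [0, 2, 2]], dtype=float)
y = np.array([1, 1, 0, 0])

w, b = sgd_logistic(X, y)
print(predict(X, w, b))  # → [1 1 0 0]
```

The per-example updates are what make SGD practical at the scale of hundreds of thousands of articles: each step touches only one row of the feature matrix.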
For the training set, workers categorized 10,005 randomly selected articles stratified across the 15 outlets, roughly 667 articles per outlet. Applying the trained classifier to the full corpus yielded 340,191 articles classified as news. The authors then trained the political classifier, asking workers to categorize another randomly selected sample of 10,005 articles as political or non-political; applying this classifier to the full corpus yielded 114,814 articles classified as political news stories. They also evaluated both the news and political classifiers on a fresh random sample of 10,005 articles and found the results consistent, accurate, and precise.

The authors then moved to their central goal: categorizing articles by topic (e.g., gay rights, health care) and quantifying each article's political slant. Even with crowdsourcing, classifying over 100,000 articles is a daunting task, so they used a readership-weighted sample of approximately 11,000 political news articles. To preserve randomness, they selected two political articles at random from each of the 15 outlets for every day in 2013. They also controlled for workers' possible preconceptions about an outlet's ideological slant by designing an experiment with blinded and unblinded conditions under which workers categorized articles. Each worker performed three tasks: first, assigning a primary and a secondary topic from a list of 15; second, determining whether the article was descriptive news or opinion; and third, measuring ideological slant by answering the question "Is this article generally positive, neutral, or negative toward members of the Democratic / Republican Party?". Answers to the last question were given on a 5-point scale. Finally, the authors assigned each article a partisanship score between -1 and 1, defined as the average of the two party ratings. To check the reliability of the primary data, they ran additional tests, computing inter-rater reliability and finding it consistent. The authors then report overall outlet-level slant, outlet-level slant separately for opinion and descriptive news, and Democratic and Republican slant for each news outlet.
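The scoring step above can be sketched in a few lines. The summary only says the score lies in [-1, 1] and averages the two per-party ratings, so the exact 5-point mapping and the sign convention below (positive = leans right) are assumptions made for illustration, not details taken from the paper.

```python
def to_unit(rating_1_to_5):
    """Map a 5-point answer (1 = very negative ... 5 = very positive)
    onto the interval [-1, 1]."""
    return (rating_1_to_5 - 3) / 2.0

def partisanship(dem_rating, rep_rating):
    """Average the two per-party ratings into one score in [-1, 1].

    Sign convention (an assumption, not stated in the summary):
    positive = leans right, so positivity toward Republicans and
    negativity toward Democrats both push the score upward.
    """
    return (to_unit(rep_rating) - to_unit(dem_rating)) / 2.0

# An article rated "negative" toward Democrats (2) and "positive"
# toward Republicans (4) gets a right-leaning score:
print(partisanship(dem_rating=2, rep_rating=4))  # → 0.5
```

Under this convention a balanced article (neutral toward both parties) scores exactly 0, and the extremes ±1 are reached only when the two ratings disagree maximally.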
Interestingly, an outlet's net ideological leaning is identified more by the extent of its criticism of each party than by its support. The authors then turn to issue filtering; to measure it, they estimate the proportion of articles that the human judges assigned to each topic. They also discuss the difficulty of distinguishing consumption choices from production choices in news, since no definitive measure can capture the difference. Finally, the authors probed the natural variation in their analysis by using two different sets of sample weights and found nearly identical outlet-level slants, which supports the robustness of their results. In terms of both coverage and slant, they find that the major online news outlets, ranging from The New York Times on the left to Fox News on the right, have surprisingly similar, and largely neutral, descriptive reporting on U.S. politics.
