
AN ALGORITHMIC INTERVENTION INTO NEWS MEDIA AS AN EXPLORATION OF MACHINE-LEARNING SYSTEMS
Independent Research Project
Autumn 2014
By Anuradha Reddy
Student ID: 33272145
MA Interaction Design
Goldsmiths, University of London

Overview
Algo News is a by-product of news, language and algorithms.
We constantly rely on machines to make accurate decisions for us, whether
we are purchasing airline tickets online, trading stocks or searching for
Chinese takeaway. Our reliance on machines has left us unmindful of
their potential, where even the slightest inaccuracy is readily dismissed as a
malfunction or as bad data.
Algo News subverts this notion by putting the machine through various
stages of malfunction. The computer is programmed to source news content
through different media channels with an algorithm that repeatedly
generates alternate versions of each news item. The new content is published
back into mass media, i.e. Twitter, YouTube and the newspaper. Through
these misrepresentations, Algo News opens up spaces for contemplation on
different modes of machine-thought against a backdrop of our own
intentions for the machine.
Tutors: Alex Wilkie, Matthew Plummer-Fernandez and Jimmy Loizeau
Special thanks to the workshop team and Pete Rogers
Credits: Matthew Plummer-Fernandez, Naho Matsuda and Shih-Yuan Huang

Table of Contents
Overview
Introduction
Background study
Proposal
Prototyping
Outcomes
Reflections
Conclusion
References

Introduction
Machine learning systems are a huge part of our lives today. We depend
on them for a majority of our daily interactions like reading the news,
checking the weather, connecting with people, searching for information,
looking for directions, booking tickets, banking or trading stocks. For the
most part, we are only concerned with the success of our transactions,
unperturbed by what's happening inside and outside such systems.
Automation has no doubt made our lives easier, but it has been a subject of
debate and controversy. While it brought employment opportunities to
countries like India and China, it deprived others of their jobs and
livelihoods (Fig. 1). Furthermore, it has raised questions around ownership,
access and privacy of data. Despite claims for an open and free web, the
Snowden report shows evidence of bias by people who have specialized access
to people's private data. This also raises questions about the filter bubble
and the authenticity of information delivered to us. These larger issues are a
consequence of today's ecology of machine-learning systems. While
computer scientists have come a long way in inventing new applications for
machine-learning technologies, design still needs to address these issues.

Figure 1: Jobs at risk of being replaced by bots

The aim of a machine-learning system is to learn from old and new data sets
and predict future outcomes accurately. It is a valuable asset for assessing the
risks of a future natural calamity, the spread of a disease or an upcoming
financial crisis. Arguably, this technology has crept its way into our social and
domestic lives, coaxing us to shop for articles or subscribe to services based
on the predictions it makes for us. There isn't a direct way to explain a
machine's predictions, and this lack of understanding distances us from
conversing with the machine agreeably. Technologists believe that machines
are only as good as they are programmed to be. Even the slightest
inaccuracy is treated as bad data or as data of low standards. In this sense,
design holds the potential to argue for less stringent, more ambiguous ways of
interacting with these systems. The project explores this dimension of
machine learning through a series of interventions.

Background study
Many prevalent technologies such as Google, Facebook and Twitter are
built using machine-learning algorithms. Most of their content is
user-generated, and what we see on our computer screens is often filtered by
vague criteria. These filters may be geographical, cultural or racial,
incorporating a layer of randomness that is hard to pin down. Luciana Parisi,
the author of Contagious Architecture, claims that randomness has become
the condition of programming culture (Fig. 2). However, she does not
suggest that exposing this randomness would start explaining culture and
aesthetics. Instead, it sensitizes us to different modes of computational
thought in generating new possibilities. It breaks open existing ideas behind
algorithmic expectation.

Figure 2: Contagious Architecture

Other contemporary renditions of machine-learning systems are common
among technologists who experiment within the boundaries of design,
politics, literature and culture. James Bridle coined the term "The New
Aesthetic" to highlight the increasing appearance of the digital in our physical
environment (Fig. 3). According to him, machine glitches, misrepresentations
and unintended occurrences (e.g. the "rainbow plane") act as agents that
sensitize us to machine visions and interpretations.

Figure 3: The New Aesthetic

Another technologist, Matthew Plummer-Fernandez, maintains a blog named
Algopop, which captures instances of how technology and culture remix to
reveal new experiences and interesting outcomes (Fig. 4).

Figure 4: Algopop, Tumblr blog

Proposal
Drawing from literature and contemporary discussions on machine
learning, it was necessary to highlight the algorithmic processes behind these
systems. One way of addressing this may entail revealing the changes in
attributes (e.g. age and gender in a profiling system, Fig. 5). This exposes the
machine's predictions in real time, allowing us to arrive at different
interpretations.

Figure 5: Revealing age, gender and political orientation in Machine Learning

Another approach is to automate an algorithm (by creating a bot) to
influence interaction with other algorithms. For example, Rob Dubbin
created a Twitter bot of a young girl named Olivia Taters. She sometimes
talks to other bots on Twitter (Fig. 6). In this sense, the machine reveals its
process of making random associations by interacting with other algorithms.

Figure 6: Bank of America intervenes in a conversation between two Twitter bots

The agency of the machine slowly becomes apparent as it moves away from
doing what is expected. This idea of programming machines to perform
something unintended started to tie in with the analogy of taking drugs to
alter the function of the human body. This led to an exploration of how these
"algorithmic drugs" could play out in the real world and the meanings
generated by the production of alternate content.
There are plenty of machine-learning applications in various contexts in
which to experiment with an algorithmic intervention. For example, online
shopping is a huge area that keeps track of individual profiles, their browsing
histories and purchase orders. Similarly, social-networking sites know who our
friends are, where we went on vacation and who is stalking us. In addition,
machine learning is used for high-frequency trading systems, transportation,
brand loyalty cards and security systems. A large share of machine-learning
effort is invested in search engines, multilingual translation applications and
email filtering applications.
In the beginning, it was interesting to think about applying machine learning
to cultural artifacts like the Holy Bible or Shakespearean sonnets. Initial ideas
revolved around performing a "faith analysis" on Twitter feeds using the Bible
as training data. This idea was spurred by existing sentiment analysis
applications built using machine-learning techniques. However, the success
of this experiment was dependent on the use of keywords such as "David",
"Judah", "Thee" or "Thou". The other option was to alter the content of the
Bible itself, using machine learning. However, the Bible felt static compared
to the amount of dynamic content available online, i.e. websites,
e-commerce, blogs, Twitter feeds, Facebook data, YouTube videos etc. It
seemed intriguing to explore machine-learning associations in areas such as
online shopping (e.g. eBay, Amazon) and social networks (e.g. Facebook,
Twitter).

Prototyping
Early on, basic prototypes were created using simple tools to visualize
ideas. For example, online recommendations from Amazon's browsing
history were downloaded as images and converted into a fast-paced GIF
image. The idea was to randomize the act of browsing and purchasing items
while commenting on Amazon's recent one-click-buy feature. Here users
may click on any item in the GIF and purchase it without notice. This
prototype spurred conversations on making dashboards that display users'
browsing histories and machine recommendations in momentary flashes. It
also questioned what other people's GIFs may look like and who has access
to them. Can a person's GIF image get sold back on Amazon (Fig. 7)?

Figure 7: A GIF image on Amazon for accidentally purchasing an item

The same idea was replicated with Facebook images. A person may
unintentionally "Like" somebody's photo in a GIF image on his or her timeline
(Fig. 8). Such an intervention would start showing visible changes in their
future recommendations and Facebook wall activity.

Figure 8: A GIF image on Facebook for accidentally liking someone's photo

Another prototype was created using Mozilla's X-Ray Goggles. At the Mozilla
Festival 2014, a workshop was held to educate people about web literacy by
asking them to remix the Tate website. Mozilla's Webmaker tools like X-Ray
Goggles and Popcorn Maker allow anyone to break down the HTML of a
webpage and remix it to create different content. Attending this
workshop triggered the idea of changing the original content of a BBC
news page into a mistranslated one (Fig. 9). Google Translate was used to
translate this page from English to Telugu and back to English. Google
Translate is a popular multilingual translation service that uses machine
intelligence to learn from all kinds of human-translated text content on the
web. It attempts to make intelligent approximations of the actual sentence
through context recognition and pattern matching. However, it is still quite
difficult for someone to explain why the machine chooses specific words or
sentences over the many other possibilities.

Figure 9: The mistranslated BBC news page

This idea of a mistranslation was initially limited to just text content. In the
next iteration, it was combined with finding substitute images using Google's
Search by Image tool (Fig. 10). It allows a user to find similar images by
comparing them to the attributes of the original image. It uses an
image-recognition algorithm based on machine learning.

Figure 10: Original image and the image returned by Google's Search by Image

The third prototype included the use of IFTTT (If This Then That) and
Facebook. IFTTT is a web-based service that allows other services to talk to
each other by creating algorithmic "recipes" between them. For example,
IFTTT was programmed to search for the terms "eBay" and "Wearables" on
Twitter and to post a new item containing the keyword on a Facebook profile
page (Fig. 11). A fake Facebook profile called Anna Roberts was created in
order to experiment with different keywords and outcomes. However, there
was no interaction with this profile or with the posts generated through IFTTT.
In fact, Facebook's algorithms do not allow more than 25 posts per day from
the same source (IFTTT). This prototype was considered a failure because the
idea was to understand whether social media interaction could influence
e-commerce through ambiguous methods like this one, and no such
interaction occurred.

Figure 11: Results from IFTTT recipe

Another intervention could entail using technology in unintended ways. A
hands-free reading tool, for instance, was applied to a travel-booking
website. It intentionally disrupted the user's focus on finding a suitable
booking. The page constantly scrolls down, obstructing the user from
successfully booking a ticket.
These ideas were presented to researchers and students. It was pointed out
that the code snippets could be repeated over and over, until a noticeable
change occurs in the machine's output. This may be compared to
encountering the effect of consuming drugs more than once. They also
suggested that the project could result in multiple outcomes, each in a
different domain, such as e-commerce, news websites, social networks etc.

Given the time frame for the project, it was necessary to decide how much
work to take on. Drawing from the discussion and feedback, the most
promising prototype was the mistranslation of the BBC news story. This idea
carried the potential to put the content through several stages of
mistranslation while also tying it to Google Translate's machine-learning
system. The next stage of prototyping was to push this idea by translating a
piece of content several times, until its meaning starts to change. In order to
put this into a working logic, a news headline was translated into languages
ranging from the east (i.e. Chinese, Japanese, Hindi, Arabic, Russian) to the
west (German, Spanish, Italian, French, Portuguese). Each time, the sentence
was translated from English to the language and back to English. When this
idea was explained to peers, they related it to the game Chinese Whispers. The
outcome of this experiment was amusing while being critical of the content
we consume as users of the web. To automate the recipe, a program was
written in Python (the programming language) with the help of open-source
APIs. In this case, Goslate was used, which is a free Python API for Google's
translation services (Fig. 12).

Figure 12: Python code for Google Translate API

With the code, it became extremely simple to input a text and get instant
results at different stages of translation (Fig. 13).

Figure 13: Mistranslations
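
The round-trip procedure described above can be sketched in a few lines of Python. This is a minimal illustration and not the project's actual code (Fig. 12): the language codes, the name `mistranslate` and the `translate` callable are assumptions introduced here, with `translate` standing in for whatever translation client is used.

```python
from typing import Callable, List, Sequence

# Assumed language codes, in the east-to-west order described in the text.
LANGUAGES = ("zh", "ja", "hi", "ar", "ru", "de", "es", "it", "fr", "pt")

# Hypothetical translator signature: (text, target_language, source_language) -> text
Translator = Callable[[str, str, str], str]


def mistranslate(headline: str, translate: Translator,
                 languages: Sequence[str] = LANGUAGES) -> List[str]:
    """Return the headline followed by its English version after each round trip."""
    stages = [headline]
    for code in languages:
        foreign = translate(stages[-1], code, "en")    # English -> language
        stages.append(translate(foreign, "en", code))  # language -> back to English
    return stages
```

Keeping every intermediate stage, rather than only the final sentence, is what makes it possible to print the ten-step lists shown in the Reflections section. A Goslate client's `translate` method could be passed in as `translate`, provided its arguments follow the same (text, target, source) order, which is assumed here.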

Outcomes
At this point, it was interesting to start imagining how these mistranslations
play out in the real world. The outcomes could act as agents for spurring
conversations around machine capacities. Moreover, the context of news also
brought with it the potential to implement the algorithm in different channels
of news media.
The first outcome originated in Twitter. Twitter is a micro-blogging service
that allows people to post short pieces of text with a 140-character limit.
Twitter also relies on machine learning to recommend relevant profiles to
follow. Although many Twitter profiles are real, there are a fair number of
Twitter bots that post tweets, reply to tweets or retweet existing
tweets using automated algorithms. In the Twitter universe, keywords are
crucial for prompting interaction between users. This insight resulted in
experimenting with the algorithm on Twitter. The Twitter profile was titled
Algo News and the result of 10 mistranslations was posted as a tweet (Fig. 15).
To accomplish this, a new Twitter profile was created with a developer
identity. This helped procure an API key and the required authentication
codes. These codes were used in a Twitter API for Python called Twython,
which allows a Twitter user to post tweets from a computer program (Fig. 14).

Figure 14: Python code for Twitter API
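
The posting step can be sketched as follows. This is an illustrative outline under stated assumptions, not the code shown in Fig. 14: `fit_tweet` and `post_final_stage` are hypothetical helper names, the trimming rule is a choice made here, and `client` is any object exposing an `update_status(status=...)` method, as a Twython client does.

```python
TWEET_LIMIT = 140  # character limit mentioned in the text


def fit_tweet(text: str, limit: int = TWEET_LIMIT) -> str:
    """Trim text to the limit, cutting at a word boundary where possible."""
    if len(text) <= limit:
        return text
    cut = text[:limit - 3]          # leave room for the ellipsis
    if " " in cut:
        cut = cut[:cut.rfind(" ")]  # drop the trailing partial word
    return cut + "..."


def post_final_stage(stages, client) -> str:
    """Post the last mistranslation through the client and return the text sent."""
    status = fit_tweet(stages[-1])
    client.update_status(status=status)
    return status


# Assumed wiring with Twython; the four credentials are placeholders:
# from twython import Twython
# client = Twython(APP_KEY, APP_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET)
# post_final_stage(stages, client)
```

Passing the client in as an argument keeps the credentials out of the mistranslation logic and lets the posting step be tried out with a stand-in object before anything is published.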

The source for original Twitter content is The Guardian. The Guardian's tone
is neutral and unobtrusive compared to other news sources. It was interesting
to examine how the machine starts to interpret news in varied tones after
several mistranslations. Algo News started receiving immediate attention
with the keywords it generated. It produced about 400 tweets, of which
many were retweeted and favorited by unfamiliar users. However, the
number of followers has been fluctuating based on how many users realized
Algo News is actually an algorithmic bot (Fig. 16).

Figure 15: Algo News on Twitter

Figure 16: Reactions on Twitter for Algo News

The second outcome was a direct result of the tutoring sessions. One of the
tutors felt strongly about how this algorithm plays out in the real world
through physical artifacts. As the context was news media, a mistranslated
newspaper was an obvious potential outcome of the project. In
this case, even though a newspaper is physical, the mistranslated content
suggests something that's gone terribly wrong in the printing system or
before the content was processed. This seemingly familiar but bizarre sense
of reading a mistranslated newspaper was intriguing to pursue. It also
provided the opportunity to present not just text content but also images,
using the Google Search by Image tool. To take it further, more than one
newspaper may be presented, each one different from the other by a layer of
mistranslation (Fig. 17). The idea was to suggest more than "something gone
terribly wrong" and instead reveal a machine's capacity for generating many
versions of the same content in a mundane format like the newspaper.

Figure 17: Printed Newspapers of Algo News

Figure 18: Reading Algo News

The third project outcome consisted of mistranslating the voiceover text of a
video. In the initial stages of this idea, a YouTube video was selected from
the BBC news channel. The video content was related to Islamic terrorism
and the beheading of a British hostage. The mistranslated text was funny
because it transformed such serious content into something comical and
bizarre.

A volunteer attempted to vocally emulate a newsreader by recording the
mistranslation. When the mistranslated video prototype was tested amongst
peers, it seemed as though the visual language of the video was too
imposing. The viewers were more concerned with the visual content than the
audio, and their reactions shifted away from what was intended. The tutors
suggested choosing a video that was less heavy-handed. Based on this
feedback, the idea of using someone's voice was rejected because it felt
almost too natural and believable. The strangeness in the content was
difficult to extract with a natural voice. Looking back at YouTube, there was
an opportunity to play around with YouTube's recent automated captioning
feature. Closed captioning is a tool that allows hearing-impaired viewers to
read dialog in a video. Most often, closed captioning is created in the same
language as the video, assuming viewers speak the same language. Videos
that use closed captioning are also easier to discover on the web, because
the machine taps into keywords used in the captioning system. YouTube's
automated captioning tool is not very accurate, prompting users to poke fun
at the system. A YouTube channel by Rhett & Link is specially dedicated to
YouTube videos with inaccurate captions (Fig. 19).

Figure 19: Caption Fail on YouTube

A set of mistranslated captions could represent an alternate way for machines
to make sense of our world. To demonstrate this, a YouTube video titled
"Rosetta scientist criticized for shirt" was mistranslated using the algorithm.

Figure 20: "Rosetta scientist criticized for shirt" YouTube video

The content of the video was about a male scientist who wore a shirt showing
scantily clad women. The scientist, Matt Taylor, apologized for his conduct
and spoke about his priorities for the space probe that had just landed on a
comet. The video suggests undertones of issues such as feminism, sexism
and men in scientific positions. It was interesting to explore how machines
interpret this content, and what it means for us to see mistranslations against
the background visual. The final video was divided into four videos showing
the same visual content but with different mistranslated captions.

Figure 21: Showing four versions of captioning for the same video
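
One way to produce several caption tracks for the same footage is sketched below. The report does not say how the captions were authored, so everything here is an assumption for illustration: the SubRip (.srt) caption format, the cue timings, and the helper names `srt_time`, `to_srt` and `caption_versions`.

```python
from typing import Dict, Iterable, List, Tuple

Cue = Tuple[float, float, str]  # (start in seconds, end in seconds, caption text)


def srt_time(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used in SubRip files."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(cues: Iterable[Cue]) -> str:
    """Render cues as numbered SubRip blocks separated by blank lines."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(f"{index}\n{srt_time(start)} --> {srt_time(end)}\n{text}\n")
    return "\n".join(blocks)


def caption_versions(cues: List[Cue],
                     versions: Dict[str, List[str]]) -> Dict[str, str]:
    """Keep the original timings and swap in each version's caption texts."""
    return {
        label: to_srt([(start, end, text)
                       for (start, end, _), text in zip(cues, texts)])
        for label, texts in versions.items()
    }
```

Because the timings stay fixed and only the text changes, each version plays against exactly the same visuals, which is the comparison the four videos set up.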

For exhibiting the prototypes, it was important to address how all three
outcomes were tied together by the same mistranslation algorithm. One of the
tutors suggested that I buy a Raspberry Pi and let the Python algorithm run
from it. The hope was that the physical aspect of the Raspberry Pi would
provoke discussion about what these algorithmic bots look like and where they
reside. However, there was a danger in not being able to explain what the
Raspberry Pi does. A well-illustrated diagram of how the algorithm works with
the Raspberry Pi was a possible solution to address the issue. Furthermore, the
initial idea was to spread the newspapers out on the table provided, but the
tutors suggested other interesting ways of exhibiting them. It was also
recommended to make a scrapbook of cutouts containing original news
content used for making the algorithmic newspaper. This was suggested to
make the connection between the mistranslated newspaper and the original
content clearer. The mistranslation algorithm was also combined with a Flickr
API so that it posts tweets with the first image it finds upon searching the
translated text (Fig. 22). This was a welcome addition to the text content that
the algorithm was already generating.

Figure 22: Flickr Test and output on Twitter

Reflections
During the process, several biases and challenges were brought to the
forefront. To begin with, it was difficult to judge whether certain word
choices by the machine were due to foreign grammar or because of the
machine's associations. For example, in the following set of mistranslations,
the word "heroines" is lost by the second line, i.e. after the round trip
through Chinese. Is this the machine's doing or does the word "heroine" not
exist in Chinese?
Ellie Irving's top 10 quiet heroes and heroines
Ten quiet hero by Ellis Owen
10 quiet hero in Ellis Owen
Ellis Owen 10 quiet hero
Ellis Owen 10 quiet hero
10 Ellis Owen quiet hero
10 Ellis Owen silent hero
10 Ellis Owen unsung heroes
10 Ellis Owen unsung heroes
10 Ellis Owen unsung heroes
10 Ellis Owen unsung heroes

Similarly, the word "Alibaba", which refers to the Chinese e-commerce
company, gets translated to the word "Pope". Is this a consequence of
moving from Eastern to Western languages? Or is it that the machine does
not acknowledge "Alibaba" as a proper noun?
Alibaba: China's answer to Amazon makes 4.4bn thanks to Singles' Day
Alibaba: China's answer to the Amazon 4.4 billion, due to the Singles' Day
Alibaba: China's answer to the Amazon 44 is, for the singles of the day
Alibaba: 44 China's answer to the Amazon for a single day, is
Alibaba: response 44 Amazon China in one day, it
Alibaba: response Amazon China 44 pounds in one day
Alibaba: response Amazon China 44 pounds in one day
Alibaba: Answer Amazon China 44 pounds in one day
Alibaba: Response Amazon China 44 pounds in one day
Pope: response Amazon China 44 in one day
Pope: response Amazon China 44 pounds in one day

However, there were certain instances where the machine-learning algorithm
behind Google Translate revealed its random associations. For example, the
phrase "Dumb and Dumber" was automatically transformed to "Jim Carrey",
the actor. In an ideal situation, the translation should have been "Stupid and
more stupid" or similar.
Dumb and Dumber To review the bottom line: it's still funny
Dumb and Dumber To see - the bottom line: it's still fun
Jim Carrey is to see Mr. Dama - The Bottom Line: It's still fun
Jim Carrey is Mr. Dama to see - Bottom line: it's still fun
Jim Carrey is Mr. Lady to see - Bottom line: it's still fun
Jim Carrey is Mr. Lady to see - in short, it's fun
Jim Carrey is Mr. Lady to see - in short, it's fun
Jim Carrey is to see Mr. Lady - in short, it's fun
Jim Carrey is to see Mr. Dame - in short, it's fun
Jim Carrey is to see Mr. Notre Dame - in short, it's fun
Jim Carrey's see Mr. Notre Dame - in short, it's fun

In another case, the word "horrible" was translated to the word "awesome".
This happened around the third translation, which is Hindi.
Going back to Google Translate, it seemed as though the word "horrible" was
used in the sense of being fierce, which resulted in the word "awesome".
Apocalypse now: horrifying scenes from our ravaged planet
Apocalypse: Horrifying scenes from our ravaged planet
Apocalypse: our horrible scene from the devastated planet
Doom: our awesome view of the devastated planet
Doom: our stunning view of the devastation of the planet
Destination: our breathtaking view of the devastation of the planet
Destination: our breathtaking view of the devastation of the planet
Destination: our breathtaking view of the devastation of the planet
Destination: our view breathtaking devastation of the planet
Destination: We have a breathtaking view of the destruction of the planet Earth
Destination: We have a stunning view of the destruction of the planet Earth

This sort of reflection requires much more observation and documentation
to be able to point out what machines are really doing. However, having this
process of moving from one language to the next in a working order helped
in analyzing and debugging those mistranslations where the output felt strange
or ambiguous.
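As an illustration of this kind of stage-by-stage inspection, the sketch below locates the step at which a word disappears and lists the words that changed between two consecutive stages. It is a hypothetical helper written for this report, not part of the original program; the names `first_stage_without` and `changed_words` are introduced here, and the comparison is a plain word-level diff using Python's standard `difflib` module.

```python
import difflib
from typing import List, Optional, Tuple


def first_stage_without(stages: List[str], word: str) -> Optional[int]:
    """Return the index of the first stage that no longer contains the word."""
    needle = word.lower()
    for index, text in enumerate(stages):
        if needle not in text.lower():
            return index
    return None  # the word survived every stage


def changed_words(before: str, after: str) -> Tuple[List[str], List[str]]:
    """Return (removed, added) words between two consecutive stages."""
    diff = list(difflib.ndiff(before.split(), after.split()))
    removed = [line[2:] for line in diff if line.startswith("- ")]
    added = [line[2:] for line in diff if line.startswith("+ ")]
    return removed, added


# Two consecutive stages taken from the Alibaba example above.
stages = [
    "Alibaba: response Amazon China 44 pounds in one day",
    "Pope: response Amazon China 44 in one day",
]
stage = first_stage_without(stages, "Alibaba")
removed, added = changed_words(stages[0], stages[1])
```

Since stage 0 is the original headline and each later index corresponds to one language in the chain, the returned index points directly at the language whose round trip introduced the change.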
When the Twitter bot was created, it was encouraging to see many other
Twitter profiles retweet or favorite the tweets posted by Algo News. However,
it seemed like these other Twitter profiles had automated actions to favorite
or retweet tweets with selected tags. For example, a Twitter profile with the
name "About Mental Health" favorited a tweet about mental health (Fig. 23).
It was difficult to tell whether there is a human behind this Twitter profile.
Most non-automated Twitter profiles following Algo News started to unfollow
Algo News within a span of 7 days.

Figure 23: Reactions on Twitter

From an exhibition viewpoint, it was hard to approach the topic of machine
learning, algorithms or bots, assuming the audience has no idea of what
it all means. Where does the explanation begin and end? Also, there were
certain assumptions made about viewers being able to understand this idea
of machine-altered mistranslations without providing a reference to the
original content. However, with the tutors' guidance, a reference to the
original was provided.
Furthermore, exploring a topic like machine learning through design requires
a focus on ongoing technological research in combination with
prototyping. The outcomes may or may not bear potential for devising future
interactions. In this sense, the outcomes are more likely to be research
devices in the way they allow subjectivities to be performed by people who
interact with them.

Conclusion
The three outcomes of this project, i.e. the newspapers, the video and the
Twitter bot, are experiments in exploring how machine-learning interventions
could play out in real life and what potential interactions they can offer. In the
future, these news-generating bots can be pushed further to create content
such as news websites or to create AI systems that interact with other
news bots or with other users. They also have a future in creating interfaces
that allow users to interact with multiple versions of a single story. How does
a user choose from them? The randomness of the machine reveals new
potential as better tools are invented to interact with these systems. It is also
important to mention the critical role of open-source developers in allowing
designers and technologists to creatively engage with technology and to
imagine and prototype ideas for future expectations. Luciana Parisi mentions
that "randomness is a condition of programming culture". However, in the
context of this project, this randomness extends beyond programming
culture, and into our physical lives, into who we are, and what we know. It is
indeed crucial to accept randomness as a part of our mundane lives and
design ways to define the scope and expectations for these technologies
through such explorations.

References
[1] Kaminska, Isabella. FT Alphaville. "FT Alphaville Robots Jobs and TFL Strikes Comments."
N.p., 29 Apr. 2014. Web. 05 Dec. 2014.
[2] Parisi, Luciana. Contagious Architecture: Computation, Aesthetics, and Space. MIT Press,
2013.
[3] Bridle, James. "The New Aesthetic." Web log post. The New Aesthetic. N.p., n.d. Web.
05 Dec. 2014.
[4] Bridle, James. "On the Rainbow Plane." Web log post. Booktwo.org. N.p., 4 July 2014.
Web. 05 Dec. 2014.
[5] Plummer-Fernandez, Matthew. "#algopop." Web log post. #algopop. N.p., n.d. Web. 05
Dec. 2014.
[6] Madrigal, Alexis C. "That Time 2 Bots Were Talking, and Bank of America Butted In." The
Atlantic. Atlantic Media Company, 07 July 2014. Web. 05 Dec. 2014.
[7] "Amazon.co.uk: Low Prices in Electronics, Books, Sports Equipment & More."
Amazon.co.uk: Low Prices in Electronics, Books, Sports Equipment & More. N.p., n.d. Web.
05 Dec. 2014.
[8] "The Queen Sends First Tweet to Launch Science Museum Gallery." BBC News. BBC, 24
Oct. 2014. Web. 05 Dec. 2014.
[9] "Put the Internet to Work for You." IFTTT / Put the Internet to Work for You. N.p., n.d.
Web. 05 Dec. 2014.
[10] "Rosetta Scientist Criticized for Shirt." YouTube. YouTube, n.d. Web. 05 Dec. 2014.
[11] "Taylor Swift Caption Fail." YouTube. YouTube, n.d. Web. 05 Dec. 2014.
[12] "Islamic State Hostage John Cantlie Reports from Kobane." YouTube. YouTube, n.d. Web.
05 Dec. 2014.
[13] "Google Translate." Google Translate. N.p., n.d. Web. 03 Dec. 2014.