Summer placement report

22nd August, 2014

Title: The Machine Learning Issue

Anuradha V Reddy
Student ID: 33272145
Organisation: Microsoft Research, Cambridge
Researching and Designing in the field, Summer 2014
MA Design: Interaction Research, 2013-14
Goldsmiths, University of London

Table of Contents

1. Introduction
   1.1 Problem Statement
   1.2 Background Research

2. Research Process
   2.1 Comparing existing services
   2.2 Personality tests
   2.3 Inspiration from other areas
   2.4 Probes
   2.5 Interviews

3. Design Explorations
   3.1 Development

4. Prototyping
   4.1 Demo

5. Feedback

6. Reflection

References









The Machine Learning Issue


1. Introduction
We use computing devices all the time, whether we are in an office, on the move or in the company of others. More often than not, applications on our devices run complex algorithms designed to learn from our regular usage. For example, a company like ‘Google’ keeps track of our browsing patterns to deliver better services, and a smart thermostat like ‘Nest’ learns from our domestic behaviour patterns to regulate indoor temperature to a comfortable level. Such systems are broadly known as ‘machine learning’ systems, and this report is an investigation into machine learning through a process of experimentation and research through design.




1.1 Problem statement
Machine learning is currently an area of great relevance in the way we transact with the
world around us. It has applications in computer vision (Kinect), web-search & advertising,
profiling & recommendation systems, home automation, transportation systems, navigation,
security, finance, medicine, environment and even biology. As machine learning continues to grow and play a greater role in our lives, it is crucial to ask what the machine is doing. How do users know what machines are capable of? And how do we enter into a dialogue with machines?


1.2 Background research
Machine learning is a relatively new topic of research, but it has its roots in computer science from before computers existed as a mainstream consumer product. In 1959, Arthur Samuel pioneered ‘machine learning’ as a field of computer science, defining it as giving computers the ability to learn without being explicitly programmed. He tested this idea by creating a program that learned to play checkers and improve at it.



The idea is to make predictions, or ‘probabilistic inferences’, in order to draw the best possible result from a set of multiple options. Machine learning is also a subject that emulates how humans learn and assimilate knowledge. In this sense, it is often paralleled with the term ‘artificial intelligence’. Alan Turing was instrumental in trying to distinguish humans from machines by testing whether both could engage in a conversation without the human knowing that s/he was talking to a machine. Such experiments led to a fascination with robots and machines that could replace humans. This is evident in science fiction literature and cinema, as well as in long-term investments by organisations such as DARPA and Honda, who created humanoid robots like ‘Asimo’.
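As a rough sketch of what ‘drawing the best possible result from a set of options’ means in practice, the Python fragment below simply picks the option with the highest estimated probability. The option labels and numbers are purely illustrative, not drawn from any real system.

```python
# Probabilistic inference in its simplest form: given several options
# and an estimated probability for each, choose the most probable one.
# The labels and probabilities below are illustrative only.

def best_option(probabilities):
    """Return the option with the highest estimated probability."""
    return max(probabilities, key=probabilities.get)

guess = best_option({"checkers move A": 0.7, "checkers move B": 0.3})
```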



Over the years, the term ‘artificial intelligence’ has largely been replaced by the term ‘machine intelligence’. Fantastic visions of robots were reduced to domestic machines embedded with intelligence, like the vacuum cleaner ‘Roomba’. Such devices came with the tag of being ‘smart’. The ‘Tamagotchi’, on the other hand, became a huge success in the late nineties for the opposite reason. It seemed as though humans perceived intelligence in objects that were embedded with certain behaviours.



Alex Taylor, a principal researcher at Microsoft Research, Cambridge, has attempted to understand how humans perceive machines imbued with differing levels of ‘intelligence’. He found that humans require intelligible feedback from machines, especially ones that are complex and hard to understand. ‘Interpretation’, therefore, seemed crucial for interacting with complex systems.

Today, we leave our traces on multiple services such as Google, Facebook and Twitter without being conscious of the learning algorithms behind the interface. Recent work by Microsoft Research, Cambridge and the University of Pennsylvania showed that it is possible to accurately infer our demographic and psychometric profiles from our Facebook or Twitter activity. They accomplished this by gathering Facebook data from about 35,000 users in order to extrapolate and deduce the profile of a new user. Even though machines don’t necessarily profile users in this way, it is information worth considering for both organisations and users. How does a user interpret such information?






2. Research process
The research process combined background research with hands-on experimentation with machine learning technologies. More specifically, it involved interviews, probes, personal investigation, and inspiration from other areas.

2.1 Comparing existing services
It was useful to compare similar services in order to understand how each differs in its learning mechanism. For example, both Spotify and Facebook provide music recommendations, but Spotify is content-driven and Facebook is socially driven. For most music enthusiasts, Spotify would be more appropriate, whereas Facebook would be ideal for someone who likes to discover music based on what is socially popular.



Similarly, Amazon and Flipkart are two different services that sell books online. The account used on Flipkart had a purchase history, while the Amazon account had none. In this particular context, Flipkart was more accurate in recommending content than Amazon. This helped emphasise the relevance of ‘data’ in machine learning systems.



2.2 Personality tests
The process also involved taking multiple personality tests online. On Facebook, there were several tests that used the big five personality traits (openness, conscientiousness, extraversion, agreeableness and neuroticism) to reveal a user’s personality. Here are some of the test results.
YouAreWhatYouLike







Five Labs


Colour personality test


What ‘Game of Thrones’ character are you?


What country best fits your personality?


Social network prediction app


What does the internet think of you?

The Google Ad test was particularly bizarre because its results were far removed from the person’s actual demographic profile.

It turned out that abstract tests like the Game of Thrones character test and the colour test were more effective than the big five personality test. In other words, it is harder to accept being 87% neurotic than to accept being an eccentric character from a TV series. This was a driving insight for testing how machine learning categorisations may be usefully translated into meaningful information for the user.


2.3 Inspiration from other areas

Gene Weaver
Gene Weaver is a project by Iona Inglesby, a recent graduate of the Royal College of Art. It is a set of hand-woven objects depicting the genetic make-up of the designer and her family. She colour-coded her family’s genetic data to identify similarities and differences between herself and her sister. This led to the idea of conducting a similar experiment with data from the big five personality test: results were converted into a binary code and visualised through colour. These were presented as probes to people who volunteered to take the test and share their results.





Generated Man
‘Generated Man’ is a project by an RCA alumnus, Diego T. Pisanty. The project embodies the sense of being a game character with defined characteristics. These characteristics are fed into a search engine algorithm, which finds related keywords. The keywords are then fed into Google SketchUp’s 3D Warehouse to retrieve 3D models matching those keywords. The retrieved models then become a character sketch of that person. This particular project helped me investigate how other forms of digital content may represent a person’s character.



Personas
Personas is a project by Aaron Zinman, an alumnus of the MIT Media Lab. The project illustrates a person’s online persona through an algorithm that searches the web for traces of the person’s name in various places. It eventually outputs a coloured band of categories showing where a person’s digital persona dominates the web.





2.4 Probes

Social DNA
Taking inspiration from the Gene Weaver project, results from the big five personality test were converted into a binary format and colour-coded, as shown in the image below.







This set of individual colour codes was shown to the volunteers who took the test, without revealing the names of the other people who had taken it. It was interesting to see how similarities and differences were drawn between people. One reaction from a volunteer: ‘My test said I was 92% neurotic; I am guessing this person (pointing at another colour code) must be similar to me.’


There was positive feedback on this probe experiment. However, it seemed as though important data from the personality test was getting lost in the abstraction through colour. In other words, people couldn’t comprehend the degree of conscientiousness or neuroticism from the colour codes alone. Further explorations were performed, as shown in the image below, with the idea of retaining the message of the personality test by expressing it in different forms (shape and colour).



Unicode and Emoji: Picture + Character
The ‘Generated Man’ project prompted thoughts about capturing the persona or character of a person through digital representations. Diego used 3D models to resonate with a person’s character. Are there other forms of digital representation that could frame someone’s personality? A little research showed that emoji are ideograms or ‘smileys’ encoded in the Unicode standard, making them a readily available form of digital representation.


These emoji were used to represent a person’s character, juxtaposed against a music service (Spotify) to get a sense of how a person may interpret their persona resting on a computer screen.





In another experiment, a couple of emoji were selected to represent each of the five personality traits, as shown in the image below. The collages were created to understand what people interpret from such imagery. For example, some people reacted to ‘neuroticism’ by suggesting it depicted ‘sadness’ or being ‘stressed’. This feedback is valuable because it shows that being ‘neurotic’ is not an absolute attribute of a person’s personality; it is only the machine’s understanding of a user, and its judgement is not always true. In this sense, it showed that breaking down heavy-handed terms like ‘conscientious’ or ‘neurotic’ helps in getting a better sense of a person’s character.
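The trait-to-emoji mapping behind the collages can be sketched as a simple lookup that renders a profile’s dominant trait as pictures rather than as a percentage. The particular emoji below are illustrative stand-ins, not the ones used in the actual probe.

```python
# Sketch of the emoji-collage idea: each big-five trait is paired with
# a couple of emoji, and the dominant trait in a profile is shown as
# pictures instead of a number. Emoji choices here are illustrative.

TRAIT_EMOJI = {
    "openness": ["🎨", "🌍"],
    "conscientiousness": ["📋", "⏰"],
    "extraversion": ["🎉", "🗣"],
    "agreeableness": ["🤝", "🌻"],
    "neuroticism": ["🌧", "😰"],
}

def dominant_trait(scores):
    """Return the trait with the highest score."""
    return max(scores, key=scores.get)

def emoji_summary(scores):
    """Describe a profile via the emoji of its dominant trait."""
    trait = dominant_trait(scores)
    return trait, " ".join(TRAIT_EMOJI[trait])
```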





2.5 Interviews

Thore Graepel
Thore is a principal researcher at Microsoft Research, where he led the project that created the Facebook personality test. Thore felt strongly about the need for ‘interpretation’ in machine learning systems. He believes in letting people know why they have been recommended something instead of black-boxing the technology. His viewpoint prompted a follow-up question about what ideal explanations should look like. He responded that most explanations we receive today are cheats; he felt it was absolutely important to have explanations but could not decide between giving accurate explanations and more interpretable ones. When asked about reverse engineering algorithms, he responded that the algorithms are too complex to be reverse-engineered and still be useful. To him, the big five personality categories were sufficiently ‘interpretable’ traits.

Natasha Milic-Frayling
Natasha, like Thore, is a principal researcher at Microsoft Research, where she leads the Integrated Systems group. Her recent work has involved interface mechanisms that reveal how websites track users and link their profiles to third-party websites providing ads and recommendation services. She was very knowledgeable about how websites track people and how they often make money at a user’s expense. She believes that companies like Google and Facebook only give the illusion that their services are free. While discussing the idea of revealing profiling mechanisms, she responded that these companies are not interested in individual profiles but only in the tracking algorithms that benefit their business model. In her own words, she feels the only way to protect ourselves from being tracked is to have a machine learning algorithm for every individual. Such an algorithm would hypothetically be tied into a contract with the services and companies the individual subscribes to.

Diego T. Pisanty – RCA alumnus
Diego is a graduate of the RCA’s Design Interactions master’s programme and is currently a researcher at Culture Lab, Newcastle. In this interview, Diego spoke about his project and how he went about it, discussing the kinds of thoughts and interactions he wanted to capture through it. He suggested asking people to describe what they think happens in the background of a profiling system. He also warned against getting lost in too much technicality, and instead recommended falling back on individual experiences of understanding technology.

3. Design explorations
The research suggested it was worth exploring interfaces that reveal a user’s personality and demographic traits in real time. This way, a person may choose to accept the machine’s judgement, or s/he may choose to correct the machine towards a more suitable profile. A few quick sketches were drawn over website screenshots to visualise what such an interface might look like and how users would interact with it. The idea behind these design explorations is to provoke thought around what machines learn.

1. Predicting a person’s gender on a website

2. Predicting a person’s age
A ball that rolls along the base of the webpage (the horizontal line being the scale)


3. Predicting a person’s political orientation
An unbalanced plank over a boulder suggesting a left or right political inclination

4. Predicting a person’s gender on a shopping website
Check boxes on Amazon to determine at which point gender changes


5. Predicting a person’s personality on a social networking site
Text analysis of Facebook status messages to determine personality

6. Predicting a person’s emotional state on a social networking site
Text analysis of Facebook statuses to predict emotional states


7. Predicting a person’s racial profile
Text analysis of tweets to predict a person’s race

3.1 Development
As each of the above proposals afforded a different graphic language, a reference template was created to support the design explorations.

The descriptions below are prototype visualisations that focus on revealing what the machine is learning in an interesting way.

Random generation
The idea is to use Pinterest to generate images on the webpage. For example, if the machine
learns that a user is female, it randomly generates an image from a Pinterest board tagged
with the term ‘woman’.
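A minimal sketch of this behaviour is shown below, with the Pinterest boards simulated by a local dictionary of placeholder image paths; no real Pinterest API call is made, and the board contents are invented for illustration.

```python
import random

# Sketch of the 'random generation' idea: once the machine decides a
# label (e.g. 'woman'), an image is drawn at random from a board tagged
# with that term. Pinterest's real API is not used here; boards are
# simulated with a local dictionary of placeholder paths.

BOARDS = {
    "woman": ["img/woman_01.jpg", "img/woman_02.jpg", "img/woman_03.jpg"],
    "man": ["img/man_01.jpg", "img/man_02.jpg"],
}

def random_board_image(label, rng=random):
    """Pick a random image from the board tagged with `label`."""
    board = BOARDS.get(label)
    if not board:
        return None
    return rng.choice(board)
```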





Natural language
If a machine learns that the user is ‘neurotic’ or is a ‘republican’, it may be represented
through natural language inside a word balloon.







Using interactive badges
The machine learning interface may be a badge that sits on a webpage. It may be interactive
and movable, allowing users to readjust its position to a convenient location. This may be
more playful than having something static at a corner of the page.





Personality trait ‘Open’ as a badge




Male gender representation as a badge

Democratic political inclination as a badge




Female gender representation as a badge (combined with random generation)

To further capture attention and increase a user’s anticipation, a user-interface loader was included as an animated effect, indicating that the machine was thinking or processing data.




4. Prototyping
The final prototyping stage consisted of creating badges for representing a person’s political
inclination, age and gender, whereas personality traits were revealed through the use of
natural language.



Flash and Premiere Pro were used to create working prototypes that demonstrate how these
profiles are revealed. The screenshots below show how interactivity was programmed into
the designed graphic language using Flash and ActionScript 3.0.








Similarly, for the personality traits demo, Premiere Pro was used to edit a screen capture of typing a status message on Facebook. This video was then imported into Flash, where a small script was written to advance the video ten frames at a time whenever a key is pressed. Here are some screenshots:
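The frame-stepping logic used in the Flash demo can be sketched in a framework-agnostic way as follows; the frame count and step size are illustrative, and the original was implemented in ActionScript 3.0 rather than Python.

```python
# Sketch of the demo's playback control: each keystroke advances the
# imported video by a fixed number of frames, clamped to the last frame.

class FrameStepper:
    def __init__(self, total_frames, step=10):
        self.total_frames = total_frames
        self.step = step
        self.frame = 0  # current playhead position

    def on_keystroke(self):
        """Advance playback by `step` frames, stopping at the end."""
        self.frame = min(self.frame + self.step, self.total_frames - 1)
        return self.frame
```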





4.1 Demo

Gender scenario
To create the gender scenario, a machine learning word cluster was used to justify why the website economist.com would be classified as male-leaning: the word ‘economy’ appears in the male cluster.
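The matching logic behind these demo scenarios can be sketched as counting a page’s words against the word clusters for each profile label; the cluster words below are illustrative stand-ins for the published clusters, not the real ones.

```python
# Sketch of the demo's cluster matching: words on a page are counted
# against word clusters associated with each profile label, and the
# label with the most hits wins. Cluster contents are illustrative.

CLUSTERS = {
    "male": {"economy", "football", "engineering"},
    "female": {"fashion", "wedding", "recipes"},
}

def classify_page(text):
    """Return (label, hit count) for the best-matching cluster."""
    words = set(text.lower().split())
    scores = {label: len(words & cluster)
              for label, cluster in CLUSTERS.items()}
    # ties are not handled in this sketch
    return max(scores.items(), key=lambda kv: kv[1])
```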



Here are some screenshots for the gender scenario using the website – economist.com






Age scenario

Similarly, the age scenario was based on the machine learning cluster shown below. The word ‘history’ was chosen to demonstrate that a BBC History webpage would be associated with a school-going student.


Below are screenshots of the age scenario in the ‘history’ page of www.bbc.co.uk






Political orientation scenario

For this scenario, Obama’s Twitter page was used to show how a person would be categorised as a Democrat.



Personality traits

To demonstrate the personality traits, the machine learning cluster in the image below was used to create sentence structures that reflect particular personality types.
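The natural-language reveal can be sketched as a small set of templated sentences triggered when a trait score passes a threshold; the template wording and the threshold value are assumptions for illustration, not the exact phrasing used in the demo.

```python
# Sketch of the natural-language reveal: each personality trait above a
# threshold is turned into a templated sentence. Templates and the
# threshold are illustrative assumptions.

TEMPLATES = {
    "neuroticism": "You seem a little stressed today.",
    "extraversion": "You sound like the life of the party.",
    "conscientiousness": "You come across as very organised.",
}

def reveal(scores, threshold=70):
    """Return a sentence for every trait whose score exceeds the threshold."""
    return [TEMPLATES[t] for t, s in scores.items()
            if t in TEMPLATES and s > threshold]
```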





Scenario 1


Scenario 2



Scenario 3



5. Feedback
The prototype was demonstrated to several people, including peers and researchers working on machine learning. For some computer scientists, the project felt incomplete, as they did not agree with this sort of framing of machine learning. In fact, one of the responses was: “Are you trying to scare people with this interface? It is very misleading. This is not how machine learning works.” On the other hand, machine learning experts seemed positive and excited on seeing the prototype. They felt it raised a lot of questions about the ‘interpretable’ capacity of machine learning algorithms. One machine learning expert compared the prototype to a running ‘subtext’, an interesting reaction because it recalls real-life situations in which we feel we could use subtexts to read people’s minds. Despite the provocation instilled by the prototype, it was pointed out that such interfaces are currently being built by computer science researchers; for example, a machine can tell whether a user is being extremely negative in his/her speech. In this sense, it is useful to know how such systems are seen in a positive light as well as sitting off the ethical radar. Another machine learning expert commented that interpretability is key to unpredictable systems. He suggested pushing emoji further in new directions, like the image shown below:


He also commented on the usefulness of such an interface if there were a way to save cookies from it and send them to Amazon or another recommendation service to get tailored recommendations. For the prototype revealing personality traits, he imagined a scenario in which a user could ask the machine to make him sound more extroverted or agreeable. In the longer run, he was sceptical about how such a machine learning interface would affect changes in its recommendation behaviour.

In addition, there were some pressing questions about how such an interface would work. For example, a computer science researcher asked whether the interface and its algorithm sit as a layer on top of the actual browser interface, and if so, whether they really affect what the machine learns. In an ideal scenario, the machine should learn when it is corrected, but given a chance to play around with different types of profiles (age, gender, politics, religion, race or Starbucks fan), it may be more interesting to explore how users make use of this tool. For the purposes of the project, it seemed sufficient to use the interface as a mechanism to open up machine learning to users and invite responses from them.

6. Reflection
The project ended on an open-ended note, leaving loosely framed questions for further inquiry. The prototype demonstrations had a very pragmatic, and yet equally debatable, side to them. In doing so, they appealed to computer scientists, who saw potential in the demonstrations, and also drew varied reactions from average users. In this sense, there may be tolerance for experimenting with, and imagining, how machine learning systems work on both sides of the community, i.e. the expert and the average user. This may be a great advantage for designers and artists shaping future systems that rely completely on machine learning.

Another interesting area may be to delve into what machine learning interpretations look like. As machine learning systems lend themselves to abstraction, it may be interesting to explore how to meaningfully engage with abstractions, either using existing representations or developing completely new forms.

In terms of the usefulness of such systems, we may think of interpretation as a necessary step in building a relationship with a machine learning system. It may be possible to explore machine learning behaviours and how they affect the user. What sort of behaviour may be desired?

As a concluding note, this project was tied to the investigation of machine learning systems, and in doing so it showed that this has been an under-explored territory from a human-interface perspective. Despite known issues of interpretability and feedback, few websites have interfaces that support users in this way. Within the field of HCI, it is important to push research towards successful renditions of machine learning systems that bring about engaging experiences, without users feeling threatened by or suspicious of what is behind the interface. As the prototypes show, there is immense scope for turning machine learning systems around to shape entirely new ways of interacting with them.


References

Schwartz, H. A., Eichstaedt, J. C., Kern, M. L., Dziurzynski, L., Ramones, S. M., Agrawal, M.,
& Ungar, L. H. (2013). Personality, gender, and age in the language of social media: The
open-vocabulary approach. PloS one, 8(9), e73791.
Bachrach, Y., Kosinski, M., Graepel, T., Kohli, P., & Stillwell, D. (2012, June). Personality and patterns of Facebook usage. In Proceedings of the 3rd Annual ACM Web Science Conference (pp. 24-32). ACM.
Pennacchiotti, M., & Popescu, A. M. (2011). A Machine Learning Approach to Twitter User
Classification. ICWSM, 11, 281-288.
Hu, R., & Pu, P. (2010). A study on user perception of personality-based recommender
systems. In User Modeling, Adaptation, and Personalization (pp. 291-302). Springer Berlin
Heidelberg.
Tintarev, N., & Masthoff, J. (2007, October). Effective explanations of recommendations:
user-centered design. In Proceedings of the 2007 ACM conference on Recommender
systems (pp. 153-156). ACM.
Papatheodorou, C. (2001). Machine learning in user modeling. In Machine Learning and Its
Applications (pp. 286-294). Springer Berlin Heidelberg.
Turnbull, D. (1999). Interacting with Recommender Systems. In Proc. ACM SIGCHI 1999
Workshop on Recommender Systems.
Bilgic, M., & Mooney, R. J. (2005, January). Explaining recommendations: Satisfaction vs.
promotion. In Beyond Personalization Workshop, IUI (Vol. 5).
Kosinski, M., Bachrach, Y., Kohli, P., Stillwell, D., & Graepel, T. (2014). Manifestations of
user personality in website choice and behaviour on online social networks. Machine
Learning, 95(3), 357-380.
Das, S., & Lavoie, A. (2014, May). The effects of feedback on human behavior in social media:
An inverse reinforcement learning model. In Proceedings of the 2014 international
conference on Autonomous agents and multi-agent systems (pp. 653-660). International
Foundation for Autonomous Agents and Multiagent Systems.
Taylor, A. S. (2009, April). Machine intelligence. In Proceedings of the SIGCHI Conference
on Human Factors in Computing Systems (pp. 2109-2118). ACM.
Wilks, Y. (Ed.). (2010). Close engagements with artificial companions: key social,
psychological, ethical and design issues (Vol. 8). John Benjamins Publishing.
Helmes, J., Taylor, A. S., Cao, X., Höök, K., Schmitt, P., & Villar, N. (2011, January).
Rudiments 1, 2 & 3: design speculations on autonomy. In Proceedings of the fifth
international conference on Tangible, embedded, and embodied interaction (pp. 145-152).
ACM.
Pisanty, D. (n.d.). Characters. Retrieved August 20, 2014, from
http://www.generatedman.com/characters.html
Iona Inglesby. (n.d.). Retrieved August 20, 2014, from http://www.rca.ac.uk/students/iona-
inglesby/
Zinman, A. (n.d.). Personas | Metropath(ologies) | An installation by Aaron Zinman.
Retrieved August 20, 2014, from http://personas.media.mit.edu/
📠™ Emojipedia - 😃 Emoji Meanings 💠•ðŸ‘ ŒðŸ™ŠðŸŽ•ðŸ˜•.
(n.d.). Retrieved August 20, 2014, from http://emojipedia.org/
See The Personality Behind Your Facebook Posts | Five Labs. (n.d.). Retrieved August 20,
2014, from http://labs.five.com/
You Are What You Like. (n.d.). Retrieved August 20, 2014, from
http://youarewhatyoulike.com/