
Turkopticon: motivation, design, status, lessons, questions

[Slide 1.]

0. Mechanical Turk.
0.1. Mechanical Turk is a web site run by Amazon.
0.2. It is a market for small information tasks, for "external crowdsourcing." For
example, "what is in this picture?", "are these two directory entries for a
business the same?", "rewrite this sentence in your own words", "transcribe this
audio clip", and so on.
0.3. The prices for the tasks range from 1 cent to a few dollars. Most work on
Mechanical Turk is precarious and for low pay. Experienced workers can earn more,
but most earn a few US dollars per hour.
0.4. You can be paid in US dollars, Indian rupees, or Amazon gift card points.
0.5. Amazon says there are between 250,000 and 500,000 workers on Mechanical Turk.
This is sort of a meaningless number though, because workers can be more or less
active. Some do one task a month when they are bored; some work on it 40 hours a
week.
0.6. One researcher estimates that there are about a thousand workers on it at any
time.
0.7. In a survey we ran from 2008 to 2010, a little less than half the workers were
in the US, a little less than half were in India, and the rest were from elsewhere.
0.8. In the same survey, about 20% said they relied on income from Mechanical Turk
to meet basic needs.
0.9. This survey is out of date, but if anything we underestimated that number --
because, we were told later by workers who criticized our paper, our survey paid
too little to attract the "professional" crowd workers.
0.10. There were diverse motivations for doing work on Mechanical Turk, but most
respondents to the survey said their main motivation for working on Mechanical Turk
was money, not fun.
0.11. Employers can "reject" work -- that is, not pay.

[Slide 2.]

1. Motivation for Turkopticon.


1.1. The design of Mechanical Turk is not fair for workers.
1.2. Employers do not have to pay for work.
1.3. They do not have to give a reason for not paying.
1.4. Workers can complain, but employers do not have to answer. They don't even
have to read the emails.
1.5. Amazon charges employers for *posting* tasks. So it does not hurt Amazon if
employers do not pay workers.
1.6. Mechanical Turk keeps track of workers' approval rates. A worker's "approval
rate" is the percentage of their work that employers have paid for.
1.7. Employers use approval rates to screen workers. They can allow only workers
with an approval rate above a certain number to work on their tasks. The default is
95%. The approval rate is assumed to be a proxy for worker quality.
1.8. But of course this is wrong, because employers may refuse to pay for any
reason -- for example, because they feel like it.
1.9. So there is an inaccurate reputation system for workers.
1.10. And there is no reputation system for employers.
1.11. For example, if you are a worker choosing a task you might want to know how
frequently the employer who posted the task has refused to pay other workers. You
might call this the employer's approval rate. You might even want to see a list of
all tasks from employers who have an approval rate above a certain number. But this
is not currently possible.
1.12. I want to pause for a moment. You might think, well, that is too bad for the
crowd workers. But this has nothing to do with me. My job is safe. I am a skilled
worker. Maybe. But you should know that unless your job involves doing something
with your body, there is probably a researcher in a computer science department in
the US, or a programmer in Silicon Valley, who is right now trying to figure out
how to crowdsource your job -- to save your employer money and take part of the
savings for themselves. I'm not saying this to stir up fear, but you should know
about it. There was a project at Carnegie Mellon University where researchers and
professional journalists tried to crowdsource the production of science journalism.
I hear it didn't work so well, which I am pretty happy about. But next time it
might work. And it's not far from crowdsourcing science journalism to crowdsourcing
academic writing. Now, I think writing produced that way will not be high quality.
But it may be good enough that universities will see an opportunity to cut costs --
and certainly in the US universities are trying very hard to cut costs (except in
administrative pay and athletics budgets). I think the idea is that if research can
be crowdsourced, then any "information work" -- anything that can be done with only
a computer and a brain -- can be crowdsourced. So the issue here is not only about
establishing fair conditions for current crowd workers. The issue is that many of
us who have secure "information work" jobs now may be crowd workers sooner than we
expect.
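
A minimal sketch, in Python, of the approval-rate arithmetic described in 1.6 through 1.11. The function names and data shapes are hypothetical, chosen only for illustration; nothing here is Amazon's API. The employer-side filter is exactly the feature that 1.11 says does not exist, so it is an illustration of what a worker might want, not a description of anything Mechanical Turk offers.

```python
from typing import Optional

DEFAULT_WORKER_THRESHOLD = 95.0  # the default screening level mentioned in 1.7


def approval_rate(paid: int, rejected: int) -> Optional[float]:
    """Percentage of judged work that was paid for; None when there is no history."""
    judged = paid + rejected
    if judged == 0:
        return None
    return 100.0 * paid / judged


def worker_passes_screen(rate: Optional[float],
                         threshold: float = DEFAULT_WORKER_THRESHOLD) -> bool:
    """How an employer screens workers by approval rate.

    Assumptions: the cutoff is inclusive, and a worker with no history fails.
    """
    return rate is not None and rate >= threshold


def tasks_above_employer_rate(tasks, employer_counts, min_rate):
    """Hypothetical worker-side filter: keep tasks whose employer pays often enough.

    tasks: list of (task_title, employer_id) pairs
    employer_counts: dict mapping employer_id to (paid, rejected) counts
    """
    kept = []
    for title, employer_id in tasks:
        paid, rejected = employer_counts.get(employer_id, (0, 0))
        rate = approval_rate(paid, rejected)
        if rate is not None and rate >= min_rate:
            kept.append(title)
    return kept
```

Note that approval_rate only counts outcomes. It cannot tell a fair rejection from an arbitrary one, which is the problem raised in 1.8.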

[Slide 3.]

2. Turkopticon, our tiny stand against this future.


2.1. Turkopticon is a third-party employer reputation system for Mechanical Turk.
2.1.1. By "third-party" I mean it is not associated with Amazon at all. We have no
special access to their data or anything like that.
2.2. The original goals of Turkopticon were (a) to call attention to the unfairness
of Mechanical Turk and (b) to pressure Amazon to build an employer reputation
system into it.
2.3. Turkopticon has two parts: a browser add-on and a web database application.
2.3.1. The web application lets workers review employers.
2.3.2. The browser add-on adds these reviews to the Mechanical Turk interface.

[Slide 4.]

2.4. Here are some pictures.


2.4.1. This is what Mechanical Turk looks like normally.

[Slide 5.]

2.4.2. This is what it looks like when you have Turkopticon.

[Slide 6.]

2.4.3. If you mouse over one of the arrows, you see this.
2.4.4. We have four scores for each employer:
2.4.4.1. "Communicativity": how well do they respond to worker communications?
2.4.4.2. "Generosity": how well do their tasks pay?
2.4.4.3. "Fairness": do they reject fairly, or do they reject without good reason?
2.4.4.4. "Promptness": how fast do they pay? Employers in Mechanical Turk have up to
30 days to pay. But workers prefer faster pay.
2.4.5. If you click on the link for the number of reviews, you can see the
individual reviews.
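
A rough sketch, in Python, of how individual reviews could be rolled up into the four per-employer scores listed in 2.4.4. This is not Turkopticon's actual code. The field names, the use of a plain average, and the idea that a reviewer may leave a score blank are all assumptions made for illustration.

```python
from statistics import mean

# The four dimensions from 2.4.4; the lowercase keys are hypothetical field names.
DIMENSIONS = ("communicativity", "generosity", "fairness", "promptness")


def summarize_employer(reviews):
    """Average each dimension over the reviews that rated it.

    reviews: list of dicts, each mapping a dimension name to a number,
    or to None when the reviewer left that score blank (an assumption).
    """
    summary = {"review_count": len(reviews)}
    for dim in DIMENSIONS:
        values = [r[dim] for r in reviews if r.get(dim) is not None]
        # A dimension nobody rated has no score at all.
        summary[dim] = mean(values) if values else None
    return summary
```

Skipping blank scores matters for something like promptness, because a worker whose task is still pending cannot yet say how fast the employer pays.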

[Slide 7.]

2.4.6. And you can leave your own review.

[Slide 8.]

[Slide 9.]

2.5. Here are some numbers.
2.5.1. We have about 20,000 users.
2.5.2. We have almost 100,000 reviews, covering almost 22,000 employers.
2.5.3. These have been posted by about 8,000 workers.
2.5.4. Almost 16% of the reviews have been posted in the last three months, so the
users are quite active.
2.5.5. We have about 12,000 daily visits to the different parts of the service.
2.5.6. And we have reviews for most of the employers on Mechanical Turk.

[Slide 10.]

2.6. But so what? Did we achieve what we set out to do?


2.6.1. Not really.
2.6.2. We did call attention to the unfairness of Mechanical Turk.
2.6.3. But Amazon did not build an employer reputation system.
2.6.4. In fact, I was at an event where somebody asked an Amazon executive why
there is no employer reputation system in Mechanical Turk. She said "the community
handles the problems" with employer misbehavior.
2.6.4.1. Now, I want to point out how important this answer is. Imagine if Siemens
had a factory. And every once in a while the machines would break down and they had
to hire an outside contractor to come repair them. But the workers had to keep a
pot of money on the side, collected out of their own paychecks, to pay the
contractor. And somebody asked management, why doesn't the company pay for this out
of their profits? And management said, well, the workers handle the problem. It's
really not a satisfactory answer. But this is the spirit of the American technology
industry these days. It's very opportunistic and it is in love with anything "self-
organizing" or "user-generated" because those are free inputs. So Turkopticon,
combined with other tools and forums made by workers, makes Mechanical Turk *less
unfair*. But it is a free input. We do a free service for workers by keeping this
up, but we also do a free service for Amazon by keeping this up. We make the
situation a little less intolerable, so it goes on longer like this, without Amazon
having to take responsibility. So, is that success? I don't know.
2.6.5. Also, a built-in reputation system would be much better because it could
display objective data and let workers search or automatically screen by employer
statistics, like how often they reject work. With Turkopticon it is all manual,
because it can only be integrated so much.
2.6.6. There are other problems with Turkopticon. I won't talk about these in
detail. But they all come from the fact that our day jobs are to do research, not
to maintain or improve Turkopticon.
2.6.7. There is also one thing that we could not fix even if we could work on
Turkopticon full time, which is that people do not trust each other on Mechanical
Turk. I think a new crowd work market needs to be built to fix this, but that is
just my opinion.

[Slide 11.]

2.7. What have we learned?


2.7.1. I learned some things about making software, like:
2.7.1.1. If you listen to people and build what they want, they will use it. I
should add, "sometimes." But the point is, it is possible, but you must really
listen and respond to people's concerns, and you have to keep listening. Amazon
gets away with not listening to workers because it is fairly unique and workers
need the money. But if there were a serious competitor to Mechanical Turk that
really addressed workers' issues but still managed to attract employers, I think it
would do well.
2.7.1.2. Maintenance is both technical and social. And both parts are time
consuming.
2.7.2. I learned some things about markets, which some American economists still
think are god-given entities like atoms or squirrels, but are really institutions
created by people.
2.7.2.1. Most workers and employers have good intentions, but not all of them.
2.7.2.2. The small fraction of participants with selfish intentions affects the
market. You have to account for them in the design of the market or they will mess
it up for everyone else.
2.7.2.3. No system can solve all problems, so you need a human administrator
around. This is obvious to most people here, but it is not obvious to programmers.
Remember that I said that Silicon Valley loves things that are, or at least seem to
be, "self-organizing"? They also love to have systems solve problems for them so
they don't have to deal with them. So the idea that a perfect system can be built
that will not require any human oversight is a popular fantasy. I think it is a
dangerous fantasy, so I have to keep making this claim that it is not possible. It
may sound silly to you but it is an important reminder for technologists, and I am
telling you so if you ever run into anybody who believes this you won't be
surprised. If they are an American programmer who maybe studied some economics in
school, but no sociology, you should be even less surprised.
2.7.2.4. To maintain trust, there should be a record of why judgments were made the
way they were. We have had bugs that have made people ask things like "Has
Turkopticon sold out?" These were not even things that we did on purpose! They were
accidents! And people got worried. We also did some things, early on, on purpose
without talking to workers about it first, or explaining our motivations. Bad
mistake.
2.7.2.5. There is one more thing, which is not on the slide. That is that in a
complex system, sometimes you cannot see the consequences of your actions. So even
well-intentioned people can harm others by accident. So in very distributed systems
like Mechanical Turk, or even in more traditional outsourcing arrangements, we need
to establish ways for people to communicate what is going on with them. One of the
problems with Mechanical Turk is that people are almost anonymous to each other.
Workers don't have names or photos or anything, they just have numbers. They are
long numbers, like ten or twelve digits long. So they all look the same to the
employer. Actually, they all look like robots to the employer, so the employer
doesn't feel bad when she or he chooses not to pay them. But my point is that we need
more lines of communication if we want to support market participants to achieve
fair outcomes. We need more lines of communication so that market participants can
think of each other as human beings and treat each other like human beings.
2.7.3. I also learned something about Amazon, which is that they are not all-
powerful. In 2008 I really thought that after we made Turkopticon -- we made the
first version in a weekend -- they would be so ashamed that two grad students could
just throw this thing together that they would get the point and make a good one
themselves. It would be a thousand times better than ours and have real data and
workers would be able to screen and search by employers' rejection rates and pay
speeds and all of this, all of which Amazon has, or could easily track... It didn't
happen. None of it happened. Nothing has changed in terms of workers' ability to
judge employers inside Mechanical Turk itself. Not one thing, in five years. So,
obviously they do not have infinite technical resources to work on this sort of
thing, and it is not high on their to-do list. Somebody else is just going to have
to make a new market. Maybe somebody in this room.

[Slide 12.]

2.8. What now for Turkopticon?


2.8.1. Somebody asked me if there will be a commercial version. No. There are three
reasons for this:
2.8.1.1. First, Turkopticon started off non-profit and we would like to keep it
that way.
2.8.1.2. Second, the people who need Turkopticon the most could not really afford
to pay for it.
2.8.1.3. Third, the workers would hate us and would stop using it.
2.8.2. There are some improvements we could make to Turkopticon. Some are more
organizational and some are straightforward and technical. This is just part of my
to-do list for Turkopticon. But there is no timeline for this to-do list because of
the day job. It is possible that after I finish grad school we will make a non-
profit organization and ask for some grants to keep this going.
2.8.3. But really, I would like to avoid having to do all of this, because I would
like Turkopticon to become unnecessary. Turkopticon shouldn't need to exist. This
should all be built in to Mechanical Turk...or into whatever replaces it.

[Slide 13.]

2.9. I want to steal the slogan from the World Social Forum -- you know, "another
world is possible" -- I really believe that; if I didn't believe that I would have
been too depressed to work on this for five years -- and say, much less ambitiously
perhaps but I think part of the bigger picture, "another crowd work is possible."
Or, in keeping with the theme of the conference, a more cooperative crowd work is
possible. Here is a first try at a to-do list for building another crowd work.
Anybody can sign up to any part of this to-do list: workers, employers, system
builders, trade unionists, policy makers, researchers, ...
2.9.1. First, we need to understand what is going on. The situation in crowd work
is very complex. For example, many US crowd workers don't actually want minimum
wage, because they are afraid that will mean less work. They get very angry when
you argue that government should regulate crowd work. I think this anger is based
in fear. I have not ever said this in the US, but I think the basic income idea is
very interesting and relevant here. The point here is we cannot jump to simple
solutions like "oh, just make a minimum wage." People don't want it and it would
probably not even be enforceable for technical reasons. So, obviously I would say
this because I am a researcher, but I think we need more research. We need research
from different perspectives though, not just from academics. I would love to see
cross-sector work groups -- collaborations between workers, employers, system
builders, trade unionists, policy makers, researchers, and so on. It would be a big
pain to manage, but I think it is definitely possible and could be very beneficial.
2.9.2. We need to build a community around the dream of a more ethical future for
crowd work. This is starting to emerge now but it is far from mature.
2.9.3. And finally, we need to build and maintain new models, new systems, and new
cross-sector conversations that lead to, and sustain, learning and action.

2.10. Thank you.
