CHI 436: Paper Reading #6: TurKit: human computation algorithms on mechanical turk

Reference Information:
TurKit: human computation algorithms on mechanical turk, by Greg Little, Lydia B. Chilton, Max Goldman, and Robert C. Miller. Published in UIST '10: Proceedings of the 23rd annual ACM symposium on User interface software and technology.
Author Bios:
Greg Little attended Arizona State University for a short time before dropping out to work with a video game company. He eventually went back to ASU and graduated, then was accepted into the PhD program at MIT. He is currently part of the User Interface Design Group under Rob Miller. Lydia Chilton is currently a computer science graduate student at the University of Washington. She also attended MIT from 2002 to 2009. Max Goldman is a graduate student at MIT studying user interfaces and software development. He also spent time at the Israel Institute of Technology. Robert C. Miller is an associate professor in the EECS department at MIT and leads the User Interface Design Group. His research interests include web automation and customization, automated text editing, end-user programming, usable security, and other issues in HCI.
Summary
Hypothesis
The authors believe that TurKit, their toolkit for prototyping and exploring algorithmic human computation, can make work done through Mechanical Turk more efficient and more effective.
Methods
The paper describes several example applications of the toolkit. The first is iterative writing: one turker writes a paragraph toward a stated goal, and subsequent turkers try to improve on it. Between iterations the paragraph is subjected to a 'cleaning' that removes parts that are not relevant or useful. Another iterative task presented as an example is recognizing blurry text: over several iterations, turkers add new guesses about what the blurred text says and revise earlier ones. The paper also explores decision theory experimentation, using TurKit to simulate human decision making in a random guessing scenario, and psychophysics experimentation, in which turkers sort and classify various stimuli in an effort to determine the salient dimensions among those stimuli.
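The iterative writing task described above can be pictured as a simple loop. The following is only a minimal sketch under stated assumptions: the "turkers" are simulated by ordinary Python functions, the 'cleaning' step is reduced to whitespace normalization, and the names `iterate_text` and `clean` are hypothetical and do not come from the paper or from TurKit.

```python
from typing import Callable, List


def clean(text: str) -> str:
    """Stand-in for the 'cleaning' step: here it only collapses stray whitespace."""
    return " ".join(text.split())


def iterate_text(initial: str,
                 improvers: List[Callable[[str], str]],
                 cleaner: Callable[[str], str] = clean) -> List[str]:
    """Pass the text through each improver in turn, cleaning between iterations.

    Returns every intermediate version so the evolution can be inspected.
    """
    history = [cleaner(initial)]
    for improve in improvers:
        # Each simulated worker sees only the latest cleaned version.
        history.append(cleaner(improve(history[-1])))
    return history


# Simulated workers (assumptions for illustration; real workers are people).
workers = [
    lambda t: t + "   It runs on Mechanical Turk.",
    lambda t: t.replace("a toolkit", "a prototyping toolkit"),
]

versions = iterate_text("TurKit is a toolkit.", workers)
for number, version in enumerate(versions):
    print(number, version)
```

In a real deployment each call to an improver would be a paid task that can take minutes or hours to come back, which is why the programming model matters.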
Results
From the first example, the authors noted that most paragraph improvements involve making the paragraph longer. Additionally, people tend to keep to the original style and formatting. In the blurry text recognition, the guesses evolved over several iterations and the final result was nearly perfect. In the decision theory experimentation, TurKit was useful for coordinating the iterative nature of the process, but not necessarily very good at simulating actual human behavior. Lastly, TurKit proved useful and effective for psychophysics experimentation, since calls to MTurk were embedded within a larger application. Overall, the TurKit crash-and-rerun programming model made it easy to write simple scripts, but was far from perfect. Users were often unclear about certain critical details and shortcomings of TurKit, and some did not know about its parallel features.
Contents
The paper introduces TurKit, a toolkit for prototyping algorithmic tasks on MTurk. It offers several concepts and tools, including the idea of 'crash-and-rerun programming'. This is a programming model suited to long-running processes where local computation is cheap and remote work is costly. Its benefits include incremental programming, easy implementation, and retroactive print-line debugging. The paper goes on to describe certain highlights of TurKit in detail, such as the TurKit script and the web interface. It also gives several example applications with real-world feedback and discusses some of the user reactions to the tools.
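As far as I can tell, the core of crash-and-rerun is that the results of costly remote steps are recorded, so the whole script can be stopped and rerun from the top at any time without repeating work that has already been paid for. The sketch below illustrates that idea only. It is not TurKit's actual API: `Journal`, `Crash`, `FakeRemote`, and `run_until_done` are hypothetical names, and the slow human work is simulated by a fake service that is ready only on the second poll.

```python
import json
import os
import tempfile


class Crash(Exception):
    """Raised when a costly result is not ready; the script is simply rerun later."""


class Journal:
    """Records results of costly calls, in call order, in a file that survives reruns."""

    def __init__(self, path):
        self.path = path
        self.cursor = 0  # position of the next costly call in this run
        self.entries = []
        if os.path.exists(path):
            with open(path) as f:
                self.entries = json.load(f)

    def once(self, costly_call):
        if self.cursor < len(self.entries):
            result = self.entries[self.cursor]  # replayed: no remote work
        else:
            result = costly_call()  # may raise Crash before anything is stored
            self.entries.append(result)
            with open(self.path, "w") as f:
                json.dump(self.entries, f)
        self.cursor += 1
        return result


class FakeRemote:
    """Simulated slow worker pool: each task is ready only on its second poll."""

    def __init__(self):
        self.polls = {}
        self.completed = 0

    def ask(self, task):
        self.polls[task] = self.polls.get(task, 0) + 1
        if self.polls[task] < 2:
            raise Crash(task)
        self.completed += 1
        return task.upper()


def script(journal, remote):
    # Two dependent costly steps; the second needs the first one's result.
    first = journal.once(lambda: remote.ask("draft"))
    return journal.once(lambda: remote.ask(first + " improved"))


def run_until_done(path, remote):
    runs = 0
    while True:
        runs += 1
        try:
            # Every rerun starts the script from the top with a fresh cursor.
            return script(Journal(path), remote), runs
        except Crash:
            continue


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        print(run_until_done(os.path.join(folder, "journal.json"), FakeRemote()))
```

Replaying by call order only works if the script makes the same costly calls in the same order on every run, so a sketch like this assumes the script is deterministic between recorded steps. It also shows why print-line debugging can be retroactive: a print statement added later still sees the recorded results on the next rerun.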
Discussion
I am impressed with the contents of this paper, but I cannot say that I fully understand everything. It sounds like the authors have created a very useful tool with a lot of potential to grow and become even better. However, I have never dealt with Mechanical Turk myself, and I think that makes it hard for me to appreciate exactly what they have accomplished. Based on what I read, it seems they need to communicate a little better with users about how to use TurKit and where it might trip them up. Several people were unaware of the parallel aspect of the program, and the authors mentioned that people expressed some concern over potential problems and dangers that were not immediately obvious.

Monday, September 12, 2011
