Luis von Ahn

Workshop 9: Human Computation Workshop (HCOMP 2010)
July 25, 9:00 AM
Most research in data mining and knowledge discovery relies heavily on the availability of datasets. With the rapid growth of user-generated content on the internet, there is now an abundance of sources from which data can be drawn. Compared to the amount of work in the field on techniques for pattern discovery and knowledge extraction, little effort has been directed at studying effective methods for collecting data and evaluating its quality.



Human computation is a relatively new research area that studies how to channel the vast internet population into performing tasks or providing data toward solving difficult problems for which no efficient computer algorithm is yet known. Several genres of human computation applications are available today. Games with a purpose (e.g., the ESP Game) specifically target online gamers who, in the process of playing an enjoyable game, generate useful data (e.g., image tags). Crowdsourcing marketplaces (e.g., Amazon Mechanical Turk) coordinate workers to perform tasks in exchange for monetary rewards. In identity verification tasks, users must perform some computation in order to access online content; one example is reCAPTCHA, which leverages the millions of users who solve CAPTCHAs every day to correct words in books that optical character recognition (OCR) programs fail to recognize with certainty.



Topics:


  • Abstraction of human computation tasks into taxonomies of mechanisms
  • Theories about what makes some human computation tasks fun and addictive
  • Differences between collaborative and competitive tasks
  • Programming languages, tools and platforms to support human computation
  • Domain-specific implementation challenges in human computation games
  • Cost, reliability, and skill of labelers
  • Benefits of one-time versus repeated labeling
  • Game-theoretic mechanism design of incentives for motivation and honest reporting
  • Design of manipulation-resistant mechanisms in human computation
  • Effectiveness of CAPTCHAs
  • Concerns regarding the protection of labeler identities
  • Active learning from imperfect human labelers
  • Creation of intelligent bots in human computation games
  • Utility of social networks and social credit in garnering data
  • Optimality in the context of human computation
  • Focus on tasks where crowds, not individuals, have the answers
  • Limitations of human computation