How ChatWithTurk used to work:

  1. A user says a phrase A to Turk
  2. Turk remembers phrase A for later.
  3. Turk thinks of things he has previously said that resemble A. (These are the candidate phrases B.)
  4. For each B, Turk checks what responses he received when he said it, and picks one of them. This is phrase C. (If Turk has no good historic B phrases, he uses an untried one, something he’s heard but never said, which becomes his phrase C.)
  5. Turk responds with phrase C, which hopefully shares some context with phrase A, or maybe is a wild guess.
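
The five steps above can be sketched in a few lines of Python. This is a rough reconstruction, not the original source: the class name `Turk`, the `reply` method, the use of `difflib` for "similar", the 0.6 threshold, and the seeded random pick are all assumptions made for illustration. It also assumes that "remembers phrase A for later" means filing A as a response to whatever Turk said on the previous turn.

```python
import random
from collections import defaultdict
from difflib import SequenceMatcher


class Turk:
    """Hypothetical sketch of the reply loop; names and thresholds are assumed."""

    def __init__(self, seed=None, threshold=0.6):
        self.heard = []                     # every phrase a user has said
        self.said = set()                   # every phrase Turk has said
        self.responses = defaultdict(list)  # phrase Turk said -> user replies
        self.last_said = None
        self.threshold = threshold          # assumed cutoff for "similar"
        self.rng = random.Random(seed)

    @staticmethod
    def similarity(a, b):
        # Assumed similarity measure: character-level ratio from difflib.
        return SequenceMatcher(None, a.lower(), b.lower()).ratio()

    def reply(self, phrase_a):
        # Step 2: remember A, filed as a response to Turk's previous line.
        if self.last_said is not None:
            self.responses[self.last_said].append(phrase_a)
        if phrase_a not in self.heard:
            self.heard.append(phrase_a)

        # Step 3: candidate phrases B, things Turk has said that resemble A.
        candidates = [b for b in self.responses
                      if self.similarity(phrase_a, b) >= self.threshold]

        # Step 4: pool the responses Turk received to those B phrases.
        pool = [c for b in candidates for c in self.responses[b]]
        if pool:
            phrase_c = self.rng.choice(pool)
        else:
            # No good B: fall back to something heard but never said.
            untried = [p for p in self.heard if p not in self.said]
            phrase_c = self.rng.choice(untried or self.heard)

        # Step 5: respond with C.
        self.said.add(phrase_c)
        self.last_said = phrase_c
        return phrase_c
```

With an empty memory, the first few replies are pure echo: Turk has nothing to say except what he has just heard, which matches the echo chamber behaviour described below.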

Right off the bat, we have a system with a lot of inherent randomness, even though it doesn’t add any entropy of its own: the page just collects user input and regurgitates it according to the steps above. It does get tuned with use, insofar as the list of phrases (and the human-given responses to them) grows over time. Of course, Turk doesn’t track context at all, and doesn’t even distinguish one conversation from the next on his own. From close up, it’s very naive.

However, it works as sort of a conversational echo chamber – the user dictates the course of the conversation, whether they are greeting the Turk (he often returns the greeting), insulting him (he usually responds in kind to profanity), or asking questions about the nature of the page. He often accuses the user of being a bad chatbot.

At best, the conversations generated by a mature Turk system more closely resemble the ones found inside the front cover of an old yearbook than a live conversation, and that makes sense, since every line is a remark recycled from someone else. I’ve since let the experiment go, but email me if you’d like the source code, which is free.