
Frequently, blogs copy a news article, reword it, and repost it wholesale, without adding any new research. They very often lose information or veracity in the rewording. This is a useless practice that needs to stop. Marco Arment, the Instapaper developer, says that you should cite your sources visibly – not only does it make you more honest, it also gives the reader more raw information.

There is a tendency among the spurious hacks of the world to present someone else’s idea as their own. As much as I hate the wrong and harmful idea of patents (the exclusive right granted by a government to an inventor to manufacture, use, or sell an invention for a certain number of years), plagiarism is a different beast – in news media especially. The crime here is part lack of attribution and part loss of information.

Arment has this to say about the phenomenon:

The most ethically and professionally sound practice when you have little value to add to the source story is the linked-list approach. Give a teaser quote and a prominent link. Make it clear that you didn’t write the target article, there’s more to be read there, and here’s how to get to it. Don’t replace it. Send your readers there. If you’re truly providing value, you should have the confidence to send your audience away, knowing that they’ll come back to you. If that’s not the case, don’t bother publishing.

It’s true. If you want to pontificate, great! If you want to build on someone else’s point, fine. Cite your sources and don’t massage facts, or you aren’t a journalist. And credit the original research, or you’re a plagiarist. Easy.

Collaborative storytelling

Abrupt Goodbye is a collaborative chatting game released by an indie game studio. The whole thing is browser-based and all of the content is user-generated. I think it’s possibly a first foray into an entirely new type of game.

The premise is supplied: A blind man is waiting for a train, a woman approaches him and talks.

Abrupt Goodbye is cool for a number of reasons:

– It is infinitely replayable – each completed game extends the content of the game a little bit, so the next game is longer and more varied.

– It’s totally asynchronous, but puts two ‘sides’ against each other. Each side is several players working together without communicating.

– The system is set up to be self-improving – as you choose your conversational options, you vote for the most interesting ones. So there’s a constant positive reform going on there.

You can crowdsource communication the wrong way (as with some blog comments), or you can do something really great with it, like Abrupt Goodbye. Go play, it rules.

How ChatWithTurk used to work:

  1. A user says a phrase A to Turk
  2. Turk remembers phrase A for later.
  3. Turk thinks of some ‘similar’ things that Turk has previously said that resemble A. (These are potential phrases B)
  4. For each candidate B, Turk checks what responses he received when he said it. Turk picks one of these. This is phrase C. (If Turk has no good historic B phrases, he uses an untried one, something he’s heard but never said, which is his phrase C.)
  5. Turk responds with phrase C, which hopefully shares some context with phrase A, or maybe is a wild guess.
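
The five steps above can be sketched roughly as follows. This is a minimal illustration, not the original ChatWithTurk source: the class and method names, the use of Python’s difflib for ‘similar’, the 0.6 cutoff, and the previous_turk_phrase argument (the caller hands back whatever Turk said last, since Turk keeps no per-conversation state) are all assumptions made for the example.

```python
import difflib
import random


class Turk:
    """Hypothetical sketch of the phrase A -> B -> C lookup described above."""

    def __init__(self, seed=None):
        self.heard = []        # every distinct phrase a user has said, in order
        self.said = set()      # every phrase Turk himself has used
        self.responses = {}    # Turk phrase -> list of human replies to it
        self.rng = random.Random(seed)

    def reply(self, phrase_a, previous_turk_phrase=None):
        # Step 2: remember phrase A. If the caller tells us what Turk said
        # last, also record A as a human response to that phrase.
        if phrase_a not in self.heard:
            self.heard.append(phrase_a)
        if previous_turk_phrase is not None:
            self.responses.setdefault(previous_turk_phrase, []).append(phrase_a)

        # Step 3: candidate phrases B are things Turk has said that resemble A.
        # (Assumption: string similarity via difflib with a 0.6 cutoff.)
        candidates_b = difflib.get_close_matches(
            phrase_a, sorted(self.said), n=3, cutoff=0.6
        )

        # Step 4: gather the human responses previously given to each B.
        options = [r for b in candidates_b for r in self.responses.get(b, [])]
        if options:
            phrase_c = self.rng.choice(options)
        else:
            # No useful history: fall back to something heard but never said.
            untried = [p for p in self.heard
                       if p not in self.said and p != phrase_a]
            # With nothing untried left, this sketch simply echoes A back.
            phrase_c = self.rng.choice(untried) if untried else phrase_a

        # Step 5: respond with phrase C and remember having said it.
        self.said.add(phrase_c)
        return phrase_c


if __name__ == "__main__":
    turk = Turk(seed=0)
    last = None
    for line in ["hello", "hi there", "hello", "hello!"]:
        last = turk.reply(line, previous_turk_phrase=last)
        print(f"user: {line!r} -> turk: {last!r}")
```

In this sketch, the only randomness comes from picking among several stored responses, so the replies are shaped entirely by what earlier users typed.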

Right off the bat we have a system that has a lot of inherent randomness, even though it doesn’t have any entropy – the page just collects user input and regurgitates it according to the above. It does get tuned with use, insofar as the list of phrases (and appropriate human-given responses) grows over time. Of course, Turk doesn’t track context at all, and doesn’t even distinguish one conversation from another. From close up, it’s very naive.

However, it works as sort of a conversational echo chamber – the user dictates the course of the conversation, whether they are greeting the Turk (he often returns the greeting), insulting him (he usually responds in kind to profanity), or asking questions about the nature of the page. He often accuses the user of being a bad chatbot.

At best, the conversations generated by a mature Turk system more closely resemble the ones found in the front cover of an old yearbook than a live conversation, and that makes sense. Since doing this experiment, I’ve let it go, but email me for free source code.

There’s a little Internetology experiment I’d like to share with you. What if I paired you up with a random person on the Internet for a conversation? Actually, that’s been done before.

What if I paired you up with a random person, but told you that it was a chatbot? Finally, what if that chatbot was stateless and just replied the best it could to everything you said to it? Would the result be interesting or just confusing?

(Coarse language warning: Like I said, this is an experiment in chatting with strangers on the Internet…)

Try Chat with A Turk. Go ahead, I’ll post the results and an overview of the algorithm here soon.

In keeping with his long tradition of thoughtfully incendiary blogging, Jeff Atwood of Coding Horror posted on the necessity/efficiency of English as a lingua franca for software developers.

Advocating the adoption of English as the de-facto standard language of software development is simple pragmatism, the most virtuous of all hacker traits. If that makes me an ugly American programmer, so be it.

So it goes, and it was, I’m sure, a reminder to native English speakers that they are fortunate to not have to learn another language in order to communicate with other software developers around the world. As you would expect if you were a frequent Coding Horror reader, the comments were full of offended and well-meaning developers who were angry at connotations of cultural imperialism. While communication would be streamlined if everyone spoke a common language, it would certainly be a shame to lose the world’s other languages.

When the story was posted to Slashdot – presumably a more rational and international community – the comments seemed to me to revolve more around the use of English as a necessary common medium for code, comments, and technical documentation. There were vivid handfuls of stories from non-native English speakers on how they learned English – whether from Sesame Street, the Internet, or grade school. There were snarky debates about the linguistic heritage of English, its relative usability, and so on, with fine points and spirited back and forth. All in all, it was nearly the opposite of the comments on Jeff’s blog. The Slashdot echo chamber has a different shape, and on this occasion it reverberated with the sounds of logical discussion. Particularly interesting to me was this post:

However, the main reason why finns speak pretty decent english is our school system. Studying english is mandatory from grades 3 to 9 in the elementary school and any route you continue from there also requires you to study english. We believe that in the modern world it is just a basic requirement for everyone to understand the same language.

I agree, a common language is necessary. Already the ability to speak the de facto language of the internet is a huge asset, for the individual and for all of us. More intriguingly, why the huge difference? The medium is similar.