
[Duolingo Introduction] Part 1: From CAPTCHA to Digitizing Books


Duolingo is a startup that focuses on language learning. And it’s a nice one.

Founded by the creator of reCAPTCHA, Duolingo shares the same vision of using the power of community to make the world better. So, before the story of Duolingo, I’ll tell you the tale of how Luis von Ahn created reCAPTCHA.

Have you ever registered on a website? Did you see the little words that you must type to prove that you are not a bot? “If you type the weird words in the image right, you may get in.” If you have seen that, you know what a CAPTCHA is. It’s a simple mechanism, based on the fact that no matter how far technology goes, it’s still nearly impossible for a machine to tell distorted writing apart from a random image full of meaningless dots.

OK, but have you ever asked yourself why you sometimes have to type two words instead of one?

The reason is simple. When Luis saw that we all type CAPTCHAs every day, he wanted to make that effort more useful. He told himself: we still have lots of books and other materials in paper form. If we scan them, they will still be pure images, and we won’t be able to search them like a text ebook. Current technology doesn’t allow machines to convert images to text accurately. But what if people could help in the process?

So reCAPTCHA was born. Of the two word images reCAPTCHA gives you, it knows the answer to only one. If you type that word right, reCAPTCHA recognizes you as a human and lets you in. And yes! Because you typed one word right, the second word is probably right too! You have just turned an image into a word!
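To make the two-word trick concrete, here is a minimal sketch in Python. It is only an illustration of the idea described above, not reCAPTCHA’s real code: the names `Challenge`, `TranscriptionStore`, `submit` and `best_guess` are made up for this post, and the `min_votes` threshold (waiting until a couple of people agree before trusting a word) is my own assumption rather than something stated here.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Challenge:
    control_word: str      # the word whose text the system already knows
    unknown_image_id: str  # hypothetical ID of a scanned word nobody has typed yet


@dataclass
class TranscriptionStore:
    # unknown_image_id -> Counter of guesses from users who passed the control word
    votes: dict = field(default_factory=dict)

    def submit(self, challenge, typed_control, typed_unknown):
        """Return True if the user passes; record their guess for the unknown word."""
        if typed_control.strip().lower() != challenge.control_word.lower():
            return False  # wrong on the known word: treat as a bot, discard the guess
        guess = typed_unknown.strip().lower()
        self.votes.setdefault(challenge.unknown_image_id, Counter())[guess] += 1
        return True

    def best_guess(self, image_id, min_votes=2):
        """Return the most common guess once min_votes users agree (assumed rule), else None."""
        counter = self.votes.get(image_id)
        if not counter:
            return None
        word, count = counter.most_common(1)[0]
        return word if count >= min_votes else None
```

A user who gets the control word right is let in and their second word is stored as a vote. A user who gets it wrong is rejected and their second word is ignored.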


The result is that a lot of books in Google Books were digitized that way.