Stop Spam, Read Books

Published 08 June 07 12:34 AM | adrian

This is the motto of reCAPTCHA, a web service that handles CAPTCHA requests. reCAPTCHA is unusual because it not only provides CAPTCHA services but also helps with OCR conversion.

How does it work?

reCAPTCHA displays two words instead of the single word shown by a standard CAPTCHA. One word is known to the reCAPTCHA system and the other is not. The user has to type both words, although only the known word is validated. If the user's answer matches the known word, the user is validated (as a real human), and the user's answer to the unknown word is stored in the database. The more matching answers an unknown word collects, the higher the confidence in its reading. Once a certain threshold is reached, the unknown word becomes a known word.
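To make the idea concrete, here is a minimal Python sketch of the two-word check described above. It is not reCAPTCHA's actual implementation: the class name, the image ids, the case-insensitive comparison, and the threshold value are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical value: how many matching answers promote an unknown word.
PROMOTION_THRESHOLD = 3


class WordPairChecker:
    """Toy model of the known-word / unknown-word scheme."""

    def __init__(self, threshold=PROMOTION_THRESHOLD):
        self.threshold = threshold
        self.known = {}                    # image id -> confirmed text
        self.votes = defaultdict(Counter)  # image id -> answers seen so far

    def add_known(self, image_id, text):
        self.known[image_id] = text.strip().lower()

    def submit(self, known_id, known_answer, unknown_id, unknown_answer):
        # Only the known word decides whether the user passes.
        if known_answer.strip().lower() != self.known[known_id]:
            return False

        # The user passed, so their reading of the other word counts as a vote.
        if unknown_id not in self.known:
            guess = unknown_answer.strip().lower()
            self.votes[unknown_id][guess] += 1
            # Enough identical votes: the unknown word becomes a known word.
            if self.votes[unknown_id][guess] >= self.threshold:
                self.known[unknown_id] = guess
                del self.votes[unknown_id]
        return True
```

Note that a failed answer to the known word returns early, so answers from users who fail the check never count as votes for the unknown word.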

Confused? Let's walk through an example:

You would type "itself shrine" to answer correctly. What you don't know is that the system only knows, and only validates, one of the words. Generally speaking, the harder-to-read word is usually the one unknown to the system. For this example, I purposely answered "its shrine" and the system accepted it as a valid answer. Of course, an answer like that only helps the "Stop Spam" part, not the "Read Books" part.

Which brings us to the question: how does a CAPTCHA system help us read books?

Well, apparently, those unknown words are taken from document and book scans where the OCR algorithm failed to recognize the word. We humans can recognize such words better than a computer can, so by answering we actually help digitize those scans. Digitized documents and books are easier to distribute than printed versions (provided they are released under an open license).

How hard is it to integrate reCAPTCHA to my site?

Well, it's as simple as embedding a JavaScript snippet in your page and then validating the user's answer against the reCAPTCHA server. That's it: no more coding the CAPTCHA generation yourself. In fact, you're also offloading your CAPTCHA processing onto reCAPTCHA's servers. That means one less thing for your servers to worry about.
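For the server-side half, a rough Python sketch might look like the following. The endpoint URL, the form field names, and the two-line "true"/"false" reply format are assumptions based on the verification API as documented around the time of this post, so check the current reCAPTCHA documentation before relying on any of them. The helper function names are made up for this example.

```python
from urllib.parse import urlencode

# Assumed endpoint of the verification API; verify against the official docs.
VERIFY_URL = "http://api-verify.recaptcha.net/verify"


def build_verify_request(private_key, remote_ip, challenge, response):
    """Build the POST body for a verification call (field names are assumed)."""
    return urlencode({
        "privatekey": private_key,  # your site's private key
        "remoteip": remote_ip,      # IP address of the user who answered
        "challenge": challenge,     # challenge id submitted with the form
        "response": response,       # the words the user typed
    }).encode("utf-8")


def parse_verify_response(body):
    """Parse the assumed reply: 'true' or 'false' on line 1, error code on line 2."""
    lines = body.splitlines()
    ok = bool(lines) and lines[0].strip() == "true"
    error = lines[1].strip() if (not ok and len(lines) > 1) else None
    return ok, error
```

To actually send the request you could pass the body to `urllib.request.urlopen(VERIFY_URL, data=body)` and feed the decoded reply to `parse_verify_response`. Keeping the network call out of the helpers makes them easy to test without contacting the service.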

Is reCAPTCHA free?

Absolutely. But if you're going to use it on a high-traffic web site, please notify the team first so they can prep their servers for your helping hands!

Visit reCAPTCHA at www.recaptcha.net for more information.
