Tech Dreams

Google Acquires reCaptcha – A Powerful Book Scanning Technology In Google Hands Now

The scrambled text that we see on the registration or status update pages of websites such as Facebook and Orkut is called a CAPTCHA. It is used to protect websites from spammers and bots.

reCAPTCHA is one of the biggest providers of CAPTCHA technology, and now Google owns it. Google announced the acquisition in a blog post. reCAPTCHA is currently used on over 100,000 websites worldwide (we even used reCAPTCHA in a couple of ASP.NET websites we developed recently).

Why did Google acquire this company, and what is it going to do with the technology? Google explains:

many of the CAPTCHAs provided by reCAPTCHA come from scanned archival newspapers and old books. Computers find it hard to recognize these words because the ink and paper have degraded over time, but by typing them in as a CAPTCHA, crowds teach computers to read the scanned text.

In this way, reCAPTCHA’s unique technology improves the process that converts scanned images into plain text, known as Optical Character Recognition (OCR). This technology also powers large scale text scanning projects like Google Books and Google News Archive Search.
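To picture how "crowds teach computers to read", here is a minimal sketch in Python. It assumes a simple majority vote over what several people typed for the same scanned word. The function name resolve_word and the thresholds min_votes and min_share are hypothetical, chosen only for illustration; this is not reCAPTCHA's actual algorithm.

```python
from collections import Counter


def resolve_word(answers, min_votes=3, min_share=0.6):
    """Return the agreed transcription of a scanned word, or None if there is no consensus yet.

    Hypothetical helper: the thresholds are illustrative assumptions.
    """
    # Normalize so that "Morning " and "morning" count as the same answer.
    cleaned = [a.strip().lower() for a in answers if a.strip()]
    if len(cleaned) < min_votes:
        return None  # not enough people have typed this word yet
    word, votes = Counter(cleaned).most_common(1)[0]
    # Accept the word only when a clear majority typed the same thing.
    if votes / len(cleaned) >= min_share:
        return word
    return None


# Three of four people agree, so the word is accepted.
print(resolve_word(["morning", "Morning", "mornlng", "morning "]))
# Only two answers so far, so the word stays unresolved.
print(resolve_word(["upon", "npon"]))
```

Once a word is accepted this way, it can be stored next to the scanned image as the text that OCR failed to read on its own.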

That’s a nice move by Google. They now have a brilliant technology that helps them digitize old books and newspapers and make the web more searchable (and also earn millions of dollars through advertising).

On the other hand, the worrying thing is Google’s ever-growing footprint on the net.