Open Source OCR: Russian OCR engine to be published as FOSS

OCR is one of the few markets that are not yet fully internationalized. For now, an OCR engine that can process Cyrillic text decently can only come from Russia, and there are no more than two of them at the moment: ABBYY FineReader and Cognitive Cuneiform.

Both trace their origins to late Soviet-era government research projects that were commercialized in the nineties. However, Cuneiform started to lose its position in the consumer market by the end of the decade, the application has seen very little progress since 2000, and it is now generally unknown among end users. Cognitive, which has by now shifted to the systems integration market, has finally decided to open up Cuneiform: to make it available as freeware immediately on a dedicated website and to publish it under an open source license in March 2008.

What makes this interesting is that Cuneiform will be the second OCR system to be published as Open Source after years of development inactivity, following Tesseract, which HP published in 2005. Thus, the market of Open Source OCR will quite unexpectedly become competitive.

The most probable idea behind the decisions of both Cognitive and HP is to put idle resources to work so that they start producing at least some benefit. It looks like a simple ‘let’s see’ move, and no clear business model seems to lie behind it.

But with the Russian authorities' recently growing interest in the use of Free Software at middle schools, the demand for the liberated Cuneiform could become considerable. However, until the government’s plan to shift all schools to Free Software by 2009 is at least partially fulfilled, it is very difficult to say what this state-supported middle-school FOSS market will look like and what its rules will be. If it does become a reality, Cognitive has every chance of being a player there simply by having used its available resources in a smart way at the right moment.

Technorati Tags: oss, ocr, Cognitive, ABBYY, Tesseract, Cuneiform, Russia, schools