An Extensible Optical Character Recognition Framework

Philosophy

The philosophy behind the design of Conjecture differs somewhat from that of most other OCR projects. OCR is a difficult problem, and no single strategy for character recognition is optimal, given the vast amount of variation possible in input (differing fonts, font sizes, noise levels, orientation angles, interspersed text and pictures, and numerous other factors that affect OCR accuracy). One strategy might excel on one document, while a very different strategy may work better on another.

Most existing open-source OCR systems provide a single hard-coded solution and make it difficult for individuals to contribute or to experiment with variations in algorithms without becoming intimately familiar with the entire architecture. In contrast, Conjecture provides a modular design that makes it possible to start experimenting and exploring incremental improvements almost immediately.

Quick Links

Downloads	:	V-0.06 Repository
Howto	:	Install Modules Implementations
Community	:	Mailing List Wiki SVN
To Do	:	Questions Easy Design Implementation Infrastructure
