The philosophy behind the design of Conjecture differs somewhat from most
other OCR projects. OCR is a difficult problem, and no one strategy for
character recognition is optimal, given the vast amount of variation
possible in input (differing fonts, font sizes, noise levels, orientation
angles, existence of interspersed text and pictures, and numerous other
factors that affect OCR accuracy). One strategy might excel for one
document, while a very different strategy may be better for another
document.
Most existing open-source OCR systems provide a single hard-coded solution,
which makes it difficult for individuals to contribute and experiment with
variations in algorithms without becoming intimately familiar with the
entire architecture. In contrast, Conjecture provides a modular design
that makes it possible to start experimenting and exploring incremental
improvements almost immediately.