#include <Page.h>
[Inheritance diagram for Conjecture::Page]
This class is one of the most important in the Part hierarchy. It may be sub-divided into Regions, Lines, Words and/or Glyphs, although only Glyphs are crucial. It plays a special role relative to other Part subclasses:
Much of the high-level segmenting functionality is defined on this class and can be redefined by subclasses. Segmenting includes region detection, picture detection, noise removal, line detection, glyph detection, and so on, but does not involve glyph-to-character identification (see Glyph for the identification interface).
Each Page is associated with a file (representing the input image to be analyzed) and an Image (describing the in-memory representation of the file).
Public Member Functions
    Page (Env *env, const std::string &file)
    Page (Env *env, const std::string &imagefile, const std::string &validfile)
    virtual bool process ()
    virtual bool segment ()
    virtual bool identify ()
    virtual bool format ()
    virtual Page * asPage ()
    virtual const Page * asPage () const
    virtual void printSummary (std::ostream &os=std::cerr, const std::string &indent="", int index=-1) const
    virtual void writeText (std::ostream &os) const
    const Env * env () const
    const std::string & file () const
    const Image * image () const

Static Public Member Functions
    static void test (int argc=0, const char *argv[]=NULL)

Protected Member Functions
    OCRModule * algorithms () const
    void envIs (Env *env)
    void fileIs (const std::string &file)
    virtual int type () const
    void imageIs (Image *image)

Friends
    class Env
    const Image * Element::pageImage () const

virtual Page * asPage ()
Allows Image instances to be converted to Page instances safely without requiring explicit down-casting. Reimplemented from Conjecture::Element.
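
The conversion described above can be illustrated with a small, self-contained sketch. The classes below are hypothetical stand-ins, not the real Conjecture::Element and Conjecture::Page. The sketch also assumes that the base-class version returns a null pointer, which the source does not state.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; the real Conjecture
// classes carry far more state and behaviour.
class Page;

class Element {
public:
    virtual ~Element() {}
    // Assumed default: a generic element is not a Page.
    virtual Page* asPage() { return nullptr; }
    virtual const Page* asPage() const { return nullptr; }
};

class Page : public Element {
public:
    // A Page returns itself, so callers never need an explicit cast.
    Page* asPage() override { return this; }
    const Page* asPage() const override { return this; }
};

// Returns true when the element is really a Page.
bool isPage(const Element& e) { return e.asPage() != nullptr; }
```

Under these assumptions, a caller holding only a base-class pointer can test the result of asPage() against null instead of using dynamic_cast.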

virtual bool format ()
Write identified text to outfile. This method is only applicable after 'identify' has been invoked, and is responsible for any final grouping of Glyphs into Words, Words into Lines, Lines into Regions, and Regions into Pages, although such groupings may (optionally) have been performed during 'segment' or 'identify' as well. The output ...

virtual bool identify ()
Perform identification of each individual Glyph. The collection of Glyphs may be modified during this process. For example, if the segmenting algorithm mis-identified a region as containing a single character, but recognition suggests it holds more than one character, or only part of one, then the Glyph in question may be split or merged as appropriate.
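
virtual void printSummary (std::ostream &os=std::cerr, const std::string &indent="", int index=-1) const
Print out information about this Part. Reimplemented from Conjecture::Element.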

virtual bool process ()
Perform a full analysis: dust removal, line detection, character detection, character recognition, etc.
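
As a rough illustration of how a full analysis might be composed from the other public stages, the sketch below chains segment(), identify() and format() and stops at the first stage that fails. This is an assumption made for illustration: the source does not say how process() is implemented, and MockPage is a hypothetical class, not Conjecture::Page.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical mock used only to show the staging; not the real class.
class MockPage {
public:
    explicit MockPage(bool hasContent) : hasContent_(hasContent) {}
    virtual ~MockPage() {}

    // Each stage records that it ran; segment() fails on an empty page.
    virtual bool segment()  { log_.push_back("segment");  return hasContent_; }
    virtual bool identify() { log_.push_back("identify"); return true; }
    virtual bool format()   { log_.push_back("format");   return true; }

    // Assumed behaviour: run the stages in order and stop at the
    // first one that reports failure.
    virtual bool process() { return segment() && identify() && format(); }

    const std::vector<std::string>& log() const { return log_; }

private:
    bool hasContent_;
    std::vector<std::string> log_;
};
```

Because every stage is virtual, a subclass could replace a single stage (for example a different segmenting strategy) and still reuse the same process() driver.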

virtual bool segment ()
Divide the Page into Glyphs. Depending on configuration values, the Glyphs may also be grouped into Words and/or Lines and/or Regions within the Page. Returns false if no appropriate content was found, true otherwise.

static void test (int argc=0, const char *argv[]=NULL)
Unit testing method. This static method should create instances of the class (and instances of any other class necessary) and perform tests to ensure that all methods within the class are working as expected. Reimplemented from Conjecture::Element.

virtual int type () const
Returns an integer establishing how "small" this image type is, relative to other image types. It has nothing to do with width or height, but instead with the conceptual size of the type itself. All instances of a particular subtype will always return the same value. The code is designed so that Glyph returns a larger number than Word, which is larger than Line, which is larger than Region, which is larger than Page. This allows us to perform some sanity checks on hierarchical decompositions to ensure that we don't make silly structures in which Lines have Glyphs as parents, etc. FUTURE FIX: This method should be pure-virtual, but making it pure-virtual causes compilation failure (pure virtual method invoked in constructor). Reimplemented from Conjecture::Element.
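
The sanity check mentioned above could look roughly like the following sketch. The numeric values and the names TypeRank and validNesting are hypothetical. The source fixes only the relative order (Page < Region < Line < Word < Glyph), not the actual numbers or the checking code.

```cpp
#include <cassert>

// Hypothetical rank values; only their relative order comes from the
// documentation (Page < Region < Line < Word < Glyph).
enum TypeRank { PAGE = 0, REGION = 1, LINE = 2, WORD = 3, GLYPH = 4 };

// A child is acceptable only if its type is conceptually smaller than
// its parent's type, i.e. its rank is strictly larger.
bool validNesting(int parentType, int childType) {
    return childType > parentType;
}
```

With a strict comparison, a Page may hold Glyphs directly (intermediate levels are optional), while a Line with a Glyph as its parent is rejected.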

virtual void writeText (std::ostream &os) const
Writes to 'outfile' a textual representation of this Page and its contained sub-elements. Proper formatting only occurs if the Page contains Lines that contain Words that contain Glyphs. Reimplemented from Conjecture::Element.