Conjecture::Page Class Reference

#include <Page.h>

Inheritance diagram for Conjecture::Page:

Conjecture::Element, Conjecture::Root

Detailed Description

An Image and associated meta-data representing an entire page of to-be-scanned data.


This class is one of the most important in the Part hierarchy. It may be sub-divided into Regions, Lines, Words and/or Glyphs, although only Glyphs are crucial. It plays a special role relative to other Part subclasses:

Much of the high-level segmenting functionality is defined on this class, and can be redefined by subclasses. Segmenting includes region detection, picture detection, noise removal, line detection, glyph detection, and so on, but does not involve glyph-to-character identification (see Glyph for the identification interface).

Each Page is associated with a file (representing the input image to be analyzed) and an Image (describing the in-memory representation of the file).


Public Member Functions

 Page (Env *env, const std::string &file)
 Page (Env *env, const std::string &imagefile, const std::string &validfile)
virtual bool process ()
virtual bool segment ()
virtual bool identify ()
virtual bool format ()
virtual Page * asPage ()
virtual const Page * asPage () const
virtual void printSummary (std::ostream &os=std::cerr, const std::string &indent="", int index=-1) const
virtual void writeText (std::ostream &os) const
const Env * env () const
const std::string & file () const
const Image * image () const

Static Public Member Functions

static void test (int argc=0, const char *argv[]=NULL)

Protected Member Functions

OCRModule * algorithms () const
void envIs (Env *env)
void fileIs (const std::string &file)
virtual int type () const
void imageIs (Image *image)

Friends

class Env
const Image * Element::pageImage () const


Member Function Documentation

virtual Page* Conjecture::Page::asPage ( ) [inline, virtual]
 

Allows Image instances to be converted to Page instances safely without requiring explicit down-casting.

Reimplemented from Conjecture::Element.
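
The conversion described above follows a common pattern: the base class returns a null pointer and the subclass returns itself. The following is a minimal sketch of that pattern, not the Conjecture implementation. The classes are hypothetical stand-ins that model only asPage() and file(), and the helper fileOf() is invented for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-ins for Conjecture::Element and Conjecture::Page.
// Only the asPage() conversion pattern is modeled here.
class Page;

class Element {
public:
    virtual ~Element() = default;
    // Assumed default: an arbitrary Element is not a Page.
    virtual Page* asPage() { return nullptr; }
    virtual const Page* asPage() const { return nullptr; }
};

class Page : public Element {
public:
    explicit Page(std::string file) : file_(std::move(file)) {}
    // A Page converts to itself, so callers never need an explicit cast.
    Page* asPage() override { return this; }
    const Page* asPage() const override { return this; }
    const std::string& file() const { return file_; }
private:
    std::string file_;
};

// Hypothetical helper: returns the file name if the element is a Page,
// or an empty string otherwise.
std::string fileOf(const Element& e) {
    const Page* p = e.asPage();
    return p ? p->file() : std::string();
}
```

A caller holding only an Element reference can test the returned pointer instead of using dynamic_cast, which keeps the type check in one virtual call.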

bool Conjecture::Page::format ( ) [virtual]
 

Write identified text to outfile.

This method is only applicable after 'identify' has been invoked, and is responsible for any final grouping of Glyphs into Words, Words into Lines, Lines into Regions, and Regions into Pages, although such groupings may (optionally) have been performed during 'segment' or 'identify' as well.

The output ...

bool Conjecture::Page::identify ( ) [virtual]
 

Perform identification of each individual Glyph.

The collection of Glyphs may be modified during this process. For example, if the segmenting algorithm mis-identified a region as containing a character, but recognition suggests it is more than one character, or only part of a character, then the Glyph in question may be split or merged as appropriate.
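
To make the split and merge operations concrete, here is a minimal sketch under strong assumptions: a glyph is reduced to a horizontal pixel range, and the glyph collection is a std::vector. GlyphBox, splitGlyph and mergeGlyphs are hypothetical names and do not appear in the Conjecture API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical glyph: a horizontal pixel range [left, right) on the page.
struct GlyphBox { int left; int right; };

// Replaces glyphs[i] with two glyphs divided at column `cut`.
// Returns false (and changes nothing) if the cut is not strictly inside the box.
bool splitGlyph(std::vector<GlyphBox>& glyphs, std::size_t i, int cut) {
    if (i >= glyphs.size()) return false;
    const GlyphBox whole = glyphs[i];
    if (cut <= whole.left || cut >= whole.right) return false;
    glyphs[i] = GlyphBox{whole.left, cut};
    glyphs.insert(glyphs.begin() + static_cast<std::ptrdiff_t>(i) + 1,
                  GlyphBox{cut, whole.right});
    return true;
}

// Merges glyphs[i] and glyphs[i + 1] into one glyph spanning both ranges.
// Returns false (and changes nothing) if there is no following glyph.
bool mergeGlyphs(std::vector<GlyphBox>& glyphs, std::size_t i) {
    if (i + 1 >= glyphs.size()) return false;
    glyphs[i].left = std::min(glyphs[i].left, glyphs[i + 1].left);
    glyphs[i].right = std::max(glyphs[i].right, glyphs[i + 1].right);
    glyphs.erase(glyphs.begin() + static_cast<std::ptrdiff_t>(i) + 1);
    return true;
}
```

Both helpers leave the collection untouched on failure, so a recognizer could try a split or merge speculatively and continue if it is rejected.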

virtual void Conjecture::Page::printSummary ( std::ostream & os = std::cerr, const std::string & indent = "", int index = -1 ) const [virtual]
 

Print out information about this Part.

Reimplemented from Conjecture::Element.

bool Conjecture::Page::process ( ) [virtual]
 

Perform a full analysis: dust removal, line detection, character detection, character recognition, etc.
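
One plausible reading of the full analysis is that process() chains the three virtual stages and stops at the first failure. The sketch below illustrates that reading only. The stage ordering and the short-circuit behaviour are assumptions, and PageSketch and BlankPageSketch are hypothetical stand-ins that record which stages ran.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for Conjecture::Page; records which stages ran.
class PageSketch {
public:
    virtual ~PageSketch() = default;

    // Assumed ordering: segment, then identify, then format,
    // stopping at the first stage that reports failure.
    virtual bool process() {
        return segment() && identify() && format();
    }
    virtual bool segment()  { stages.push_back("segment");  return true; }
    virtual bool identify() { stages.push_back("identify"); return true; }
    virtual bool format()   { stages.push_back("format");   return true; }

    std::vector<std::string> stages;  // trace of the stages that ran
};

// A subclass redefining segment(): a page with no appropriate content,
// for which segment() returns false.
class BlankPageSketch : public PageSketch {
public:
    bool segment() override { stages.push_back("segment"); return false; }
};
```

Because each stage is virtual, a subclass can replace one stage, as BlankPageSketch does with segment(), without touching the driver in process().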

bool Conjecture::Page::segment ( ) [virtual]
 

Divide the Page into Glyphs.

Depending on configuration values, the Glyphs may also be grouped into Words and/or Lines and/or Regions within the Page.

Returns false if no appropriate content was found, true otherwise.

void Conjecture::Page::test ( int argc = 0, const char * argv[] = NULL ) [static]
 

Unit testing method.

This static method should create instances of the class (and instances of any other class necessary) and perform tests to ensure that all methods within the class are working as expected.

Reimplemented from Conjecture::Element.

virtual int Conjecture::Page::type ( ) const [inline, protected, virtual]
 

Returns an integer establishing how "small" this image type is, relative to other image types. It has nothing to do with width or height, but instead with the conceptual size of the type itself. All instances of a particular subtype will always return the same value. The code is designed so that Glyph returns a larger number than Word, which is larger than Line, which is larger than Region, which is larger than Page. This allows us to perform some sanity checks on hierarchical decompositions to ensure that we don't make silly structures in which Lines have Glyphs as parents, etc.

FUTURE FIX: This method should be pure-virtual, but making it pure-virtual causes compilation failure (pure virtual method invoked in constructor).

Reimplemented from Conjecture::Element.
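
The sanity check described for type() can be sketched as a simple comparison of the two values. The numeric ranks below are hypothetical, since the documentation gives only the ordering Page < Region < Line < Word < Glyph. The struct names and canContain() are likewise invented for illustration, and rejecting a parent and child of the same type is an assumption.

```cpp
#include <cassert>

// Hypothetical rank values; only their relative order comes from the text.
enum Rank { PageRank = 0, RegionRank = 1, LineRank = 2, WordRank = 3, GlyphRank = 4 };

// Hypothetical stand-in for the Element hierarchy. type() can be pure
// virtual in this sketch because no constructor calls it.
struct ElementSketch {
    virtual ~ElementSketch() = default;
    virtual int type() const = 0;
};
struct PageSketch   : ElementSketch { int type() const override { return PageRank; } };
struct RegionSketch : ElementSketch { int type() const override { return RegionRank; } };
struct LineSketch   : ElementSketch { int type() const override { return LineRank; } };
struct WordSketch   : ElementSketch { int type() const override { return WordRank; } };
struct GlyphSketch  : ElementSketch { int type() const override { return GlyphRank; } };

// A child is acceptable only if it is conceptually smaller than its parent,
// that is, if its type() value is strictly larger.
bool canContain(const ElementSketch& parent, const ElementSketch& child) {
    return parent.type() < child.type();
}
```

With this check, a Line may hold a Glyph directly (skipping Words), while a Glyph can never be the parent of a Line.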

virtual void Conjecture::Page::writeText ( std::ostream & os ) const [virtual]
 

Writes to 'outfile' a textual representation of this Page and its contained sub-elements.

Proper formatting only occurs if the Page contains Lines that contain Words that contain Glyphs.

Reimplemented from Conjecture::Element.
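
The nesting requirement above (Lines containing Words containing Glyphs) can be illustrated with a small sketch. The structs are hypothetical simplifications and not the Conjecture classes. The output layout (one text line per Line, single spaces between Words) is also an assumption, as is the helper makeWord().

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, simplified containers for the Page > Line > Word > Glyph nesting.
struct GlyphSketch { char character; };
struct WordSketch  { std::vector<GlyphSketch> glyphs; };
struct LineSketch  { std::vector<WordSketch> words; };
struct PageSketch  { std::vector<LineSketch> lines; };

// Hypothetical helper: builds a Word with one Glyph per character.
WordSketch makeWord(const std::string& text) {
    WordSketch word;
    for (char c : text) word.glyphs.push_back(GlyphSketch{c});
    return word;
}

// Writes one text line per Line, with single spaces between Words
// (assumed formatting).
void writeText(const PageSketch& page, std::ostream& os) {
    for (const LineSketch& line : page.lines) {
        bool firstWord = true;
        for (const WordSketch& word : line.words) {
            if (!firstWord) os << ' ';
            firstWord = false;
            for (const GlyphSketch& glyph : word.glyphs) os << glyph.character;
        }
        os << '\n';
    }
}
```

In this sketch word and line boundaries exist only where the hierarchy records them, which suggests why a Page holding bare Glyphs would not produce properly formatted output.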


The documentation for this class was generated from the following files:
Generated on Thu Jun 15 19:56:12 2006 for Conjecture by  doxygen 1.4.6