guidelines.h

00001 ' so that the
00016    class/method/field documentation is available to doxygen. The UML
00017    diagrams are to be kept up-to-date. Overall changes in design are
00018    kept up-to-date.
00019 
00020  - Every class should have a static 'Test(int argc, char* argv[])'
00021    method that acts as a unit test. This method should create one or
00022    more instances of the class (and any other classes needed to get
00023    the class to work) and test the public interface methods, checking
00024    return results/output against expected results to verify that it is
00025    working as expected. The 'main' program provides special support
00026    for invoking these unit-testing methods instead of performing
00027    normal OCR processing, allowing convenient per-class testing.
00028 
00029  - All fields are defined in private scope, and their names begin
00030    with an underscore.  Accessors are defined for each field.  See the
00031    more detailed discussion on 'Accessors' for more.
00032     - getters can be public, but should ALWAYS have read-only 
00033       semantics, and should be moved out of 'public' scope if the
00034       field isn't critical to the public interface.
00035     - setters should never be at public scope.  Use 'friend's
00036       when classes need access to the non-public functionality
00037       of other classes.
00038 
00039  - Classes should be fully initialized by their constructors. Partial
00040    initialization requires class clients to remember to perform
00041    additional initialization before the class is safe to be used, and
00042    completely defeats the point of having an initializer to begin
00043    with! Furthermore, partial-initialization constructors require the
00044    introduction of public access to internal state (directly or
00045    indirectly via public read-write accessors). Note that this mandate
00046    implies that classes should NOT have default constructors unless
00047    the default values assigned to fields are not only intuitive and
00048    meaningful, but also useable as-is.
00049 
00050  - Avoid symbol conflicts. 
00051 
00052     - All classes are placed in the Conjecture
00053       namespace. 
00054 
00055     - No global variables. 
00056 
00057     - Do NOT place 'using namespace std' statements in header files.
00058       Yes, I know it allows you to avoid putting 'std::' in front of
00059       all sorts of types, but putting such a statement in header files
00060       forces third-party code to fully-qualify class names that
00061       conflict with those in 'std'.
00062 
00063     - The use of 'using namespace std' in source files is completely ok.
00064 
00065  - Avoid assuming there will be only one instance of any particular
00066    class. Support for multi-threading/parallel/distributed algorithms
00067    suggests that even classes that seem like Singletons may not be.
00068 
00069  - Write code in a thread-safe manner. Avoid non-const static
00070    variables where possible.
00071 
00072  - Document important changes in the $CONJECTUREROOT/ChangeLog file.
00073    Especially important to document is the introduction, changing or
00074    removal of public interface methods in any kernel or utils class.
00075    Concise notes on *everything* are probably better than detailed
00076    notes on a few things. Detailed notes can be placed in appropriate
00077    system/class/method documentation.
00078 
00079  - All methods should be virtual by default. It is unlikely that the
00080    minor gains in efficiency from having a non-virtual method will
00081    significantly improve performance. The exception to this rule is
00082    setters and getters, which can incur a significant efficiency hit
00083    if made virtual because they can then no longer be inlined. Other
00084    exceptions include methods that are expected to be executed
00085    millions of times.
00086 
00087  - If a method is virtual, explicitly specify the 'virtual' keyword in
00088    the class header even when doing so is redundant. This convention
00089    allows everyone to establish whether a method is virtual or not by
00090    looking in a single header file, instead of having to search every
00091    header file up the entire inheritance path when virtualness is in
00092    doubt. Although some programming environments make this convention
00093    unnecessary, program for all environments, not just your own.
00094 
00095  - The "appearance" of code can have a significant impact on
00096    encouraging or discouraging others to contribute. If the code is
00097    "open" and "accessible", it is more likely people will be motivated
00098    to modify and extend. If the code is obscure, cluttered, or
00099    undocumented, it is much less likely that people will contribute.
00100    Make efforts to make your code esthetically pleasant. Document
00101    classes and methods, of course. Within each method, document each
00102    conceptual "block" of code. One line of natural-language
00103    description can make 10 lines of code much more accessible!
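
Several of the guidelines above can be seen together in one small class. The following is only a sketch: the class 'Rect' is hypothetical and not part of Conjecture, and the 'int' return type of 'Test' (0 meaning success) is an assumption, since the guideline only fixes the method's name and arguments.

```cpp
#include <cassert>

namespace Conjecture {

// Hypothetical class, used only to illustrate the guidelines.
class Rect {
  public:
    // Fully initializing constructor. No default constructor is offered,
    // because a Rect without dimensions would not be usable as-is.
    Rect(int width, int height) : _width(width), _height(height) {
        assert(width >= 0 && height >= 0);
    }
    virtual ~Rect() {}

    // Getters are non-virtual so that they can be inlined.
    inline int width() const { return _width; }
    inline int height() const { return _height; }

    // Other methods are virtual by default, with the keyword spelled out.
    virtual int area() const { return _width * _height; }

    // Unit test: builds an instance and checks the public interface.
    // Assumed convention: returns 0 on success, non-zero on failure.
    static int Test(int argc, char* argv[]) {
        (void)argc;
        (void)argv;
        Rect r(3, 4);
        if (r.width() != 3 || r.height() != 4) return 1;
        if (r.area() != 12) return 2;
        return 0;
    }

  private:
    // Fields are private and begin with an underscore.
    int _width;
    int _height;
};

}  // namespace Conjecture
```

A driver program could then call 'Conjecture::Rect::Test(argc, argv)' and report a failure when the result is non-zero.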
00104 
00105 <hr>
00106 \subsection accessors Accessors
00107 
00108 Suppose we have a field called 'name', of type 'Type' (for example,
00109 'Type' might be 'int' or 'A*'). One way to define it would be:
00110 
00111 \code
00112    class A {
00113      public:
00114        Type name;
00115    };
00116 \endcode
00117 
00118 And then an individual would read/write the field using:
00119 
00120 \code
00121    A* a = new A;
00122    a->name = 10;
00123    cout << "name = " << a->name << endl;
00124 \endcode
00125 
00126 A public field is a terrible implementation decision because:
00127   1) All code, everywhere, is allowed to modify (and thus potentially
00128      corrupt) its value.  There is no "safety" available and
00129      modification-at-a-distance is difficult to debug.
00130   2) If, during the evolution of the code, the type of the field
00131      (or the existence of the field) comes into question, changes
00132      are both difficult, and guaranteed to break any third-party
00133      code that relies on this field.
00134 
00135 However, using a simple strategy can address both of these problems
00136 and add some additional capability. And in C++, this comes at
00137 absolutely no runtime performance cost, as we'll see below.
00138 
00139 Instead of defining our field as shown above, we do the following:
00140 
00141 \code
00142    class A {
00143      public:
00144         inline const Type& name() const { return this->_name; }
00145 
00146      protected:
00147         inline void nameIs(Type val) { this->_name = val; }
00148         inline Type& nameRef() { return this->_name; }
00149 
00150      private:
00151        Type _name;
00152    };
00153 \endcode
00154 
00155 In the above, we have moved the field definition itself into private
00156 scope, and placed an underscore in front of it.  We have also added
00157 three tiny inlined methods providing get/set access to the field.
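
As a concrete instance of this pattern, here is a minimal sketch with 'Type' taken to be 'std::string'. The classes 'Document' and 'Editor' and the function 'renameDemo' are hypothetical names chosen for illustration. The sketch also shows how a 'friend' class reaches the non-public setter and reference accessor.

```cpp
#include <cassert>
#include <string>

class Editor;  // forward declaration so Document can name it as a friend

// Hypothetical class; not part of the original sources.
class Document {
  public:
    explicit Document(const std::string& title) : _title(title) {}

    // Public getter with read-only semantics.
    inline const std::string& title() const { return this->_title; }

  protected:
    // Setter and reference accessor stay out of public scope.
    inline void titleIs(const std::string& val) { this->_title = val; }
    inline std::string& titleRef() { return this->_title; }

  private:
    friend class Editor;  // the one tightly-coupled class allowed to modify us
    std::string _title;
};

// Hypothetical friend class that is allowed to modify a Document.
class Editor {
  public:
    void rename(Document& doc, const std::string& title) const {
        doc.titleIs(title);        // allowed: Editor is a friend of Document
    }
    void appendSuffix(Document& doc, const std::string& suffix) const {
        doc.titleRef() += suffix;  // in-place modification through the reference
    }
};

// Exercises both modification paths and returns the resulting title.
inline std::string renameDemo() {
    Document doc("draft");
    Editor editor;
    editor.rename(doc, "report");
    editor.appendSuffix(doc, "-v2");
    assert(doc.title() == "report-v2");
    return doc.title();
}
```

Code outside 'Document' and 'Editor' can still read the title through 'title()', but a call such as 'doc.titleIs("x")' from unrelated code does not compile.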
00158 
00159 <hr>
00160 \subsubsection features Features worth noting:
00161 
00162   - The 'name()' method returns a const version of the underlying
00163     field. It allows 'read-only' access to the field, but does not
00164     allow it to be modified. It is often at public scope, but should
00165     be placed in protected scope if the field is not part of the
00166     public interface. Obviously, the fewer field accessors in the
00167     public interface, the more separation between implementation and
00168     interface.
00169 
00170   - The 'nameIs()' method sets the value of the field to its single
00171     argument. This method should almost never be placed at public
00172     scope. Classes should be designed so that they, or a small set of
00173     tightly-coupled friend classes, are responsible for modifying
00174     themselves. In this way, corruption of state is much easier to
00175     identify, because only a small number of classes can be
00176     responsible, and ultimately the change had to have occurred via
00177     this setter method.
00178 
00179   - The 'nameRef()' method has exactly the same code block as
00180     'name()', but its return value and non-const status make it
00181     much more powerful.  It too should almost never be placed at
00182     public scope, as it provides a second means of modifying the
00183     field.  For example, if Type is 'int', we can write:
00184 \code
00185         a->nameRef() = 10;
00186 \endcode
00187     because 'nameRef()' returns a reference to the 'int' field. This
00188     accessor is useful because anything we could do with the
00189     underlying field 'a->_name', we can also do with 'a->nameRef()'.
00190     For example:
00191 \code
00192          cout << a->_name << endl;
00193          a->_name += 10;
00194          ++a->_name;
00195 \endcode
00196     is the same as
00197 \code
00198          cout << a->nameRef() << endl;
00199          a->nameRef() += 10;
00200          ++a->nameRef();
00201 \endcode
00202     except that using 'nameRef' provides for some separation of
00203     interface and implementation.  Note that since 'nameRef' is
00204     inline, the above two sets of lines really are identical
00205     after compilation.
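
One practical consequence of routing writes through the setter is that checks or bookkeeping can be added in a single place. The sketch below assumes a hypothetical class 'Page' with an 'int' field. The write counter and the positivity check are illustrative additions, not something the guidelines require.

```cpp
#include <cassert>

// Hypothetical class; not part of the original sources.
class Page {
  public:
    // The constructor also goes through the setter, so the check in
    // dpiIs() applies to the initial value as well.
    explicit Page(int dpi) : _dpi(0), _writes(0) { dpiIs(dpi); }

    inline int dpi() const { return _dpi; }
    inline int writes() const { return _writes; }

    // Members that change state do so through the setter.
    void doubleResolution() { dpiIs(dpi() * 2); }

  protected:
    // Single choke point for writes: a good place for an invariant
    // check, a breakpoint, or (as here) a simple write counter.
    inline void dpiIs(int val) {
        assert(val > 0);
        _dpi = val;
        ++_writes;
    }

  private:
    int _dpi;
    int _writes;
};
```

After 'Page p(150); p.doubleResolution();', 'p.dpi()' is 300 and 'p.writes()' is 2: one write from the constructor and one from 'doubleResolution()'. Writes made through a reference accessor such as 'nameRef()' would bypass this check, which is another reason to keep that accessor out of public scope.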
00206 
00207 <hr>
00208 \subsubsection features2 Additional Features
00209 
00210   - The type of the argument in 'nameIs()' can sometimes be made
00211     'const', but when it is a pointer type, it cannot point to 'const'
00212     data unless the underlying field is itself a pointer to 'const'.
00213 
00214   - Although many C++ programmers avoid using 'const', constness
00215     can have a significant positive impact on code quality.  Rather
00216     than solving 'const' issues by not using 'const', it takes only
00217     a little effort to understand the basic principles.  Remember that:
00218 
00219       - If a variable is 'const', it cannot be modified. That is, if
00220         it appears on the left-hand side of an assignment statement,
00221         the compiler will generate an error.
00222   
00223       - 'const'ness is contagious - if one variable is 'const', it
00224         often forces other variables or methods to also be 'const',
00225         which may in turn force other variables or methods to be
00226         'const', etc. etc. Because of this, the strategy of "I'll get
00227         my code working without 'const' first, then add 'const'
00228         semantics in later" is a terrible idea, because it is
00229         difficult to "add in 'const'" incrementally - making one
00230         variable/method 'const' requires many others to be made const
00231         before the code will compile again.  Because of this, it is
00232         much better to always make EVERYTHING (variables and methods)
00233         'const' by default, at least conceptually, then remove
00234         'const' when it is established that the variable/method in
00235         question cannot be 'const' and still retain the desired
00236         semantics.
00237 
00238       - If a variable is 'const', then only 'const' methods can
00239         be invoked with that variable as a receiver.  For example:
00240 \code
00241             const Person p;
00242             p.somefunc();
00243 \endcode
00244         Only if 'somefunc' is marked as 'const' in its definition:
00245 \code
00246             void somefunc() const { ... }
00247 \endcode
00248         will the above 'p.somefunc();' invocation be allowed by
00249         the compiler.
00250       
00251       - A non-static method f() defined within a class A has a
00252         special variable called 'this'.  Since it is a variable,
00253         it must have a type.  The type of 'this', within class A,
00254         is ALWAYS either 'A* const' or 'const A*const'.  Which
00255         one it is depends on whether the method f() is 'const'
00256         or not.  
00257 
00258          - A method is 'const' if the keyword 'const' appears
00259            AFTER its parameter list.
00260 
00261          - If 'this' is of type 'A* const', it means that the 
00262            space for the pointer itself cannot be changed, so the
00263            statement
00264 \code
00265                this = a;
00266 \endcode
00267            is not allowed.  Since the 'this' variable is ALWAYS
00268            'A* const' or 'const A*const', one is never allowed to
00269            modify 'this'.  However, a variable being '*const'
00270            does not affect the ability to modify what the variable
00271            points to, so 
00272 \code
00273                this->some_field = 10;
00274 \endcode
00275            isn't disallowed.
00276 
00277          - If 'this' is of type 'const A* const', it means that
00278            everything above for 'A* const' holds, AND that we
00279            cannot modify what the variable points to, so
00280 \code
00281                this->some_field = 10;
00282 \endcode
00283            becomes a compile-time error.
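
The 'const' rules above can be checked with a short sketch. The class 'Person' here is a hypothetical stand-in (its constructor, fields, and the 'which()' overloads are assumptions made for illustration), and 'constDemo' is likewise an illustrative name. The two 'static_assert' lines restate the pointer rule for setter arguments: a pointer to 'const' data does not convert to a pointer to non-'const' data.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical class; not part of the original sources.
class Person {
  public:
    explicit Person(const std::string& name) : _name(name), _visits(0) {}

    // const method: 'this' points to a const Person, so fields are read-only.
    const std::string& name() const { return _name; }
    int visits() const { return _visits; }

    // Overloaded on constness: the compiler picks by the receiver's type.
    std::string which() const { return "const"; }
    std::string which() { return "non-const"; }

    // Non-const method: may modify fields, cannot be called on a const Person.
    void visit() { ++_visits; }

  private:
    std::string _name;
    int _visits;
};

// A setter taking 'const int*' could not store its argument in an 'int*' field.
static_assert(!std::is_convertible<const int*, int*>::value,
              "const int* must not convert to int*");
static_assert(std::is_convertible<int*, const int*>::value,
              "int* converts to const int*");

// Calls methods through a non-const object and a const reference to it.
inline std::string constDemo() {
    Person p("Ada");
    const Person& view = p;  // read-only view of the same object
    p.visit();               // fine: p is non-const
    // view.visit();         // would not compile: visit() is not const
    assert(view.visits() == 1);
    return p.which() + "/" + view.which();
}
```

Here 'constDemo()' returns "non-const/const": the non-const overload is chosen for 'p' and the const overload for 'view', even though both refer to the same object.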
00284 
00285 
00286 
00287 */

Generated on Mon Jun 12 20:27:15 2006 for Conjecture by  doxygen 1.4.6