
Class design and design principles in C++

If not explicitly stated as copyrighted, materials used in this work are from public domain. Compiled and edited by Sergey Chepurin, December, 2011

Contents

Class Design in C++
    Understanding Interfaces
    Inheritance and Class Design
C++ coding standards
Object Oriented Design rules I
    Stay close to problem domain
    Object discovery vs. object invention
    Pick nouns or noun phrases as classes
    Method names should contain verbs
    Prefix adjectives when naming inheriting classes
    Do not add suffixes to class names
    Avoid one-to-one mapping from structured design
    Replace multiple get-set methods with operations
    Model classes that handle messages as state machines
    Use const whenever possible
    Restrict header file level dependency
    Don't reinvent the wheel; use STL
Object Oriented Design rules II
    Class with just get-set methods points to missed delegation
    Replace an array of structures with an array of objects
    Delegate work to helper class
    Multi-dimensional arrays point to incomplete class identification
    Multiple nested loops point to incomplete delegation
    Class with very large numbers of methods points to incomplete class identification
    Don't go overboard with inheritance
    Prefer delegation to inheritance
    Don't scatter the abstraction
    Consider group of objects to split work amongst team members
    Use nested classes for lightweight helper classes
    Use templates to improve type safety and performance
    Divide your code into framework and application parts
Object-oriented Design Principles
Object-oriented design
    Abstraction
    Data abstraction
    Polymorphism
    Data hiding
    Representation hiding
    Extensibility
    More about extensibility
    A book on C++ design
    Templates vs. classes
References

Class Design in C++


"Defining classes is hard; that's another reason for avoiding writing lots of them." Andrei Alexandrescu and Petru Marginean "Generic: Change the Way You Write Exception-Safe Code - Forever"

Understanding Interfaces
http://www.cprogramming.com/tutorial/class_design.html

When you're designing a class in C++, the first thing you should decide is the public interface for the class. The public interface determines how your class will be used by other programmers (or you), and once designed and implemented it should generally stay pretty constant. You may decide to add to the interface, but once you've started using the class, it will be hard to remove functions from the public interface (unless they aren't used and weren't necessary in the first place).

But that doesn't mean that you should include more functionality in your class than necessary just so that you can later decide what to remove from the interface. If you do this, you'll just make the class harder to use. People will ask questions like, "Why are there four ways of doing this? Which one is better? How can I choose between them?" It's usually easier to keep things simple and provide one way of doing each thing unless there's a compelling reason why your class should offer multiple methods with the same basic functionality.

At the same time, just because adding methods to the public interface (probably) won't break anything, that doesn't mean that you should start off with a tiny interface and plan to grow it later. If anybody has inherited from your class and you then add a function with the same name and signature as one in the subclass, you're in for a boatload of confusion. First, if you don't declare the function virtual, then the version that gets called on a subclass object depends on the static type of the pointer or reference used to make the call. This can be messy. Moreover, if you do declare it virtual, then the subclass's existing function becomes an override, and it might provide a different type of functionality than was intended by the original implementation of that function. Finally, you just can't add a pure virtual function to a class that's already in use because nobody who has inherited from it will have implemented that function. The public interface, then, should remain as constant as possible.
In fact, a good approach to designing classes is to write the interface before the implementation because it's what determines how your class interacts with the rest of the world (which is more important for the program as a whole than how the class is actually implemented). Moreover, if you write the interface first, you can get a feel for how the class will work with other classes before you actually dive into the implementation details.
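The point about the static type of the pointer can be seen in a small sketch. The classes and function names below are hypothetical and exist only for this illustration; the code assumes a C++11 compiler for the override keyword.

```cpp
#include <cassert>
#include <string>

// Hypothetical base class with one non-virtual and one virtual function.
class Base {
public:
    virtual ~Base() {}
    std::string name_nonvirtual() const { return "Base"; }
    virtual std::string name_virtual() const { return "Base"; }
};

class Derived : public Base {
public:
    // Same name as the base function: this hides it, it does not override it.
    std::string name_nonvirtual() const { return "Derived"; }
    // This one overrides the virtual base function.
    std::string name_virtual() const override { return "Derived"; }
};

// Self-check showing which function each call selects.
inline void check_dispatch() {
    Derived d;
    Base* through_base = &d;
    assert(through_base->name_nonvirtual() == "Base");    // static type decides
    assert(through_base->name_virtual() == "Derived");    // dynamic type decides
    assert(d.name_nonvirtual() == "Derived");             // static type is Derived here
}
```

The same object answers differently depending on how it is reached, which is the kind of confusion the text warns about when a name is added to a base class after subclasses exist.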

Inheritance and Class Design


The second issue of your class design is what should be available to programmers who wish to create subclasses. This interface is primarily determined by virtual functions, but you can also include protected methods that are designed for use by the class or its subclasses (remember that protected methods are visible to subclasses while private methods are not). A key consideration is whether it makes sense for a function to be virtual. A function should be virtual when the implementation is likely to differ from subclass to subclass. Vice-versa, whenever a function should not change, then it should be made non-virtual. The key idea is to think about whether to make a function virtual by asking if the function should always be the same for every class. For example, if you have a class designed to allow users to monitor network traffic and you want to allow subclasses that implement different ways of analyzing the traffic, you might use the following interface:
class TrafficWatch
{
public:
    // Packet is some class that implements information about network
    // packets
    void addPacket (const Packet& network_packet);
    int getAveragePacketSize ();
    int getMaxPacket ();

    virtual bool isOverloaded ();
};

In this class, some methods will not change from implementation to implementation; adding a packet should always be handled the same way, and the average packet size isn't going to change either. On the other hand, someone might have a very different idea of what it means to have an overloaded network. This will change from situation to situation and we don't want to prevent someone from changing how this is computed: for some, anything over 10 Mbits/sec of traffic might be an overloaded network, and for others, it would require 100 Mbits/sec on some specific network cables.

Finally, when publicly inheriting from any class or designing for inheritance, remember that you should strive for it to be clear that inheritance models is-a. At heart, the is-a relationship means that the subclass should be able to appear anywhere the parent class could appear. From the standpoint of the user of the class, it should not matter whether a class is the parent class or a subclass. To design an is-a relationship, make sure that every function in the class makes sense for all of its subclasses, so that the class doesn't include functions that subclasses might not actually need.

One example of having an extra function is that of a Bird class that implements a fly function. The problem is that not all birds can fly (penguins and emus, for instance). This suggests that a more prudent design choice might be to have two subclasses of birds, one for birds that can fly and one for flightless birds. Of course, it might be overkill to have two subclasses of bird depending on how complex your class hierarchy will be. If you know that nobody would ever expect to use your class for a flightless bird, then it's not so bad. Of course, you won't always know what someone will use your class for, and it's much easier to think carefully before you start to implement an entire class hierarchy than it will be to go back and change it once people are using it.
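As a rough sketch of how the TrafficWatch interface might be filled in and specialized, consider the following. Everything beyond the four member names is an assumption made for illustration: the Packet struct, the byte-count thresholds (used here instead of Mbits/sec to keep the example short), the protected helper, the const qualifiers, and the StrictTrafficWatch subclass. The code assumes a C++11 compiler.

```cpp
#include <cassert>

// Hypothetical stand-in for the Packet class mentioned in the text.
struct Packet {
    int size_in_bytes;
};

class TrafficWatch {
public:
    TrafficWatch() : packet_count(0), total_bytes(0), max_packet(0) {}
    virtual ~TrafficWatch() {}

    // The same for every subclass, so non-virtual.
    void addPacket(const Packet& network_packet) {
        ++packet_count;
        total_bytes += network_packet.size_in_bytes;
        if (network_packet.size_in_bytes > max_packet) {
            max_packet = network_packet.size_in_bytes;
        }
    }
    int getAveragePacketSize() const {
        return packet_count == 0 ? 0 : total_bytes / packet_count;
    }
    int getMaxPacket() const { return max_packet; }

    // Expected to differ between subclasses, so virtual.
    // The 1000-byte threshold is arbitrary and chosen only for this sketch.
    virtual bool isOverloaded() const { return total_bytes > 1000; }

protected:
    // Helper intended for subclasses (hypothetical).
    int totalBytes() const { return total_bytes; }

private:
    int packet_count;
    int total_bytes;
    int max_packet;
};

// Hypothetical subclass with a stricter idea of "overloaded".
class StrictTrafficWatch : public TrafficWatch {
public:
    bool isOverloaded() const override { return totalBytes() > 100; }
};

inline void check_traffic_watch() {
    StrictTrafficWatch strict;
    TrafficWatch& watch = strict;      // used through the parent interface
    Packet first = {150};
    Packet second = {50};
    watch.addPacket(first);
    watch.addPacket(second);
    assert(watch.getAveragePacketSize() == 100);
    assert(watch.getMaxPacket() == 150);
    assert(watch.isOverloaded());      // 200 bytes exceeds the strict limit
}
```

Code that holds a TrafficWatch reference never needs to know which subclass it is talking to, which is the is-a property described above.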

C++ coding standards


(Excerpts from "Joint strike fighter air vehicle C++ coding standards", 4.12 Templates, December 2005, http://www2.research.att.com/~bs/JSF-AV-rules.pdf)

4.9 Style
Imposing constraints on the format of syntactic elements makes source code easier to read due to consistency in form and appearance. Note that automatic code generators should be configured to produce code that conforms to the style guidelines where possible. However, an exception is made for code generators that cannot be reasonably configured to comply with should or will style rules (safety-critical shall rules must still be followed).
AV Rule 41

Source lines will be kept to a length of 120 characters or less. Rationale: Readability and style. Very long source lines can be difficult to read and understand.
AV Rule 42

Each expression-statement will be on a separate line. Rationale: Simplicity, readability, and style. See AV Rule 42 in Appendix A for examples.
AV Rule 43

Tabs should be avoided. Rationale: Tabs are interpreted differently across various editors and printers. Note: many editors can be configured to map the tab key to a specified number of spaces.
AV Rule 44

All indentations will be at least two spaces and be consistent within the same source file. Rationale: Readability and style.

4.9.1 Naming Identifiers
The choice of identifier names should:
- Suggest the usage of the identifier.
- Consist of a descriptive name that is short yet meaningful.
- Be long enough to avoid name conflicts, but not excessive in length.
- Include abbreviations that are generally accepted.
Note: In general, the above guidelines should be followed. However, conventional usage of simple identifiers (i, x, y, p, etc.) in small scopes can lead to cleaner code and will therefore be permitted. Additionally, the term word in the following naming convention rules may be used to refer to a word, an acronym, an abbreviation, or a number.
AV Rule 45

All words in an identifier will be separated by the _ character. Rationale: Readability and Style.
AV Rule 46 (MISRA Rule 11, Revised)

User-specified identifiers (internal and external) will not rely on significance of more than 64 characters. Note: The C++ standard suggests that a minimum of 1,024 characters will be significant. [10]
AV Rule 47

Identifiers will not begin with the underscore character _.


Rationale: _ is often used as the first character in the name of library functions (e.g. _main, _exit, etc.) In order to avoid name collisions, identifiers should not begin with _.
AV Rule 48

Identifiers will not differ by:
- Only a mixture of case
- The presence/absence of the underscore character
- The interchange of the letter O, with the number 0 or the letter D
- The interchange of the letter I, with the number 1 or the letter l
- The interchange of the letter S with the number 5
- The interchange of the letter Z with the number 2
- The interchange of the letter n with the letter h.
Rationale: Readability.
AV Rule 49

All acronyms in an identifier will be composed of uppercase letters. Note: An acronym will always be in upper case, even if the acronym is located in a portion of an identifier that is specified to be lower case by other rules. Rationale: Readability.

4.9.1.1 Naming Classes, Structures, Enumerated types and typedefs
AV Rule 50

The first word of the name of a class, structure, namespace, enumeration, or type created with typedef will begin with an uppercase letter. All other letters will be lowercase. Rationale: Style.
Example:
    class Diagonal_matrix { };            // Only first letter is capitalized
    enum RGB_colors {red, green, blue};   // RGB is an acronym so all letters are in upper case
Exception: The first letter of a typedef name may be in lowercase in order to conform to a standard library interface or when used as a replacement for fundamental types (see AV Rule 209).
    typename C::value_type s=0;   // value_type of container C begins with a lower case
                                  // letter in conformance with standard library typedefs

4.9.1.2 Naming Functions, Variables and Parameters
AV Rule 51

All letters contained in function and variable names will be composed entirely of lowercase letters. Rationale: Style.
Example:
    class Example_class_name
    {
    public:
        uint16 example_function_name (void);
    private:
        uint16 example_variable_name;
    };

4.9.1.3 Naming Constants and Enumerators


AV Rule 52

Identifiers for constant and enumerator values shall be lowercase.
Example:
    const uint16 max_pressure = 100;
    enum Switch_position {up, down};
Rationale: Although it is an accepted convention to use uppercase letters for constants and enumerators, it is possible for third party libraries to replace constant/enumerator names as part of the macro substitution process (macros are also typically represented with uppercase letters).

4.9.2 Naming Files
Naming files should follow the same guidelines as naming identifiers with a few additions.
AV Rule 53

Header files will always have a file name extension of ".h".


AV Rule 53.1

The following character sequences shall not appear in header file names: ', \, /*, //, or ". Rationale: If any of the character sequences ', \, /*, //, or " appears in a header file name (i.e. <h-char-sequence>), the resulting behavior is undefined. [10], 2.8(2) Note that relative pathnames may be used. However, only / may be used to separate directory and file names.
Examples:
    #include <foo /* comment */ .h>   // Bad: /* prohibited
    #include <foo's .h>               // Bad: ' prohibited
    #include <dir1\dir2\foo.h>        // Bad: \ prohibited
    #include <dir1/dir2/foo.h>        // Good: relative path used
AV Rule 54

Implementation files will always have a file name extension of ".cpp".


AV Rule 55

The name of a header file should reflect the logical entity for which it provides declarations. Example: For the Matrix entity, the header file would be named: Matrix.h
AV Rule 56

The name of an implementation file should reflect the logical entity for which it provides definitions and have a .cpp extension (this name will normally be identical to the header file that provides the corresponding declarations). At times, more than one .cpp file for a given logical entity will be required. In these cases, a suffix should be appended to reflect a logical differentiation.
Example 1: One .cpp file for the Matrix class:
    Matrix.cpp
Example 2: Multiple files for a math library:
    Math_sqrt.cpp
    Math_sin.cpp
    Math_cos.cpp

4.9.3 Classes
AV Rule 57

The public, protected, and private sections of a class will be declared in that order (the public section is declared before the protected section which is declared before the private section). Rationale: By placing the public section first, everything that is of interest to a user is gathered in the beginning of the class definition. The protected section may be of interest to designers when considering inheriting from the class. The private section contains details that should be of the least general interest.

4.9.4 Functions
AV Rule 58

When declaring and defining functions with more than two parameters, the leading parenthesis and the first argument will be written on the same line as the function name. Each additional argument will be written on a separate line (with the closing parenthesis directly after the last argument). Rationale: Readability and style. See AV Rule 58 in Appendix A for examples.

4.9.5 Blocks
AV Rule 59 (MISRA Rule 59, Revised)

The statements forming the body of an if, else if, else, while, do...while or for statement shall always be enclosed in braces, even if the braces form an empty block. Rationale: Readability. It can be difficult to see ; when it appears by itself. See AV Rule 59 in Appendix A for examples.
AV Rule 60

Braces ("{}") which enclose a block will be placed in the same column, on separate lines directly before and after the block.
Example:
    if (var_name == true)
    {
    }
    else
    {
    }
AV Rule 61

Braces ("{}") which enclose a block will have nothing else on the line except comments (if necessary).

4.9.6 Pointers and References
AV Rule 62

The dereference operator * and the address-of operator & will be directly connected with the type-specifier. Rationale: The int32* p; form emphasizes type over syntax while the int32 *p; form emphasizes syntax over type. Although both forms are equally valid C++, the heavy emphasis on types in C++ suggests that int32* p; is the preferable form.
Examples:
    int32* p;      // Correct
    int32 *p;      // Incorrect
    int32* p, q;   // Probably error. However, this declaration cannot occur
                   // under the one name per declaration style required by AV Rule 152.
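The "probably error" case can be checked directly. In the sketch below, the int32 typedef is an assumption (the excerpt does not show how the standard defines it), and the two-name declaration deliberately breaks AV Rule 152 to show what the compiler deduces. It assumes a C++11 compiler for static_assert and decltype.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

typedef std::int32_t int32;  // assumed definition, for illustration only

inline void check_declarations() {
    int32 value = 7;
    // The * binds to p only: p is a pointer, q is a plain int32.
    int32* p = &value, q = value;
    static_assert(std::is_same<decltype(p), int32*>::value, "p is a pointer");
    static_assert(std::is_same<decltype(q), int32>::value, "q is not a pointer");
    assert(*p == q);
}
```

Declaring one name per statement, as AV Rule 152 requires, removes the ambiguity altogether.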

4.9.7 Miscellaneous
AV Rule 63

Spaces will not be used around . or ->, nor between unary operators and operands. Rationale: Readability and style.

4.10 Classes
4.10.1 Class Interfaces
AV Rule 64

A class interface should be complete and minimal. See Meyers [6], item 18. Rationale: A complete interface allows clients to do anything they may reasonably want to do. On the other hand, a minimal interface will contain as few functions as possible (i.e. no two functions will provide overlapping services). Hence, the interface will be no more complicated than it has to be while allowing clients to perform whatever activities are reasonable for them to expect. Note: Overlapping services may be required where efficiency requirements dictate. Also, the use of helper functions (Stroustrup [2], 10.3.2) can simplify class interfaces.

4.10.2 Considerations Regarding Access Rights
Roughly two types of classes exist: those that essentially aggregate data and those that provide an abstraction while maintaining a well-defined state or invariant. The following rules provide guidance in this regard.
AV Rule 65

A structure should be used to model an entity that does not require an invariant.
AV Rule 66

A class should be used to model an entity that maintains an invariant.


AV Rule 67

Public and protected data should only be used in structs, not classes. Rationale: A class is able to maintain its invariant by controlling access to its data. However, a class cannot control access to its members if those members are non-private. Hence all data in a class should be private. Exception: Protected members may be used in a class as long as that class does not participate in a client interface. See AV Rule 88.

4.10.3 Member Functions
AV Rule 68

Unneeded implicitly generated member functions shall be explicitly disallowed. See Meyers [6], item 27. Rationale: Eliminate any surprises that may occur as a result of compiler generated functions. For example, if the assignment operator is unneeded for a particular class, then it should be declared private (and not defined). Any attempt to invoke the operator will result in a compile-time error. Conversely, if the assignment operator is not declared, then when it is invoked, a compiler-generated form will be created and subsequently executed. This could lead to unexpected results. Note: If the copy constructor is explicitly disallowed, the assignment operator should be as well.

4.10.4 const Member Functions
AV Rule 69

A member function that does not affect the state of an object (its instance variables) will be declared const. Member functions should be const by default. Only when there is a clear, explicit reason should the const modifier on member functions be omitted. Rationale: Declaring a member function const is a means of ensuring that objects will not be modified when they should not. Furthermore, C++ allows member functions to be overloaded on their const-ness.

4.10.5 Friends
AV Rule 70

A class will have friends only when a function or object requires access to the private elements of the class, but is unable to be a member of the class for logical or efficiency reasons. Rationale: The overuse of friends leads to code that is both difficult to understand and maintain. AV Rule 70 in Appendix A provides examples of acceptable uses of friends. Note that the alternative to friendship in some instances is to expose more internal detail than is necessary. In those cases friendship is not only allowed, but is the preferable option.

4.10.6 Object Lifetime, Constructors, and Destructors
4.10.6.1 Object Lifetime
Conceptually, developers understand that objects should not be used before they have been created or after they have been destroyed. However, a number of scenarios may arise where this distinction may not be obvious. Consequently, the following object-lifetime rule is provided to highlight these instances.
AV Rule 70.1

An object shall not be improperly used before its lifetime begins or after its lifetime ends. Rationale: Improper use of an object, before it is created or after it is destroyed, results in undefined behavior. See section 3.8 of [10] for details on proper vs. improper use. See also AV Rule 70.1 in Appendix A for examples.

4.10.6.2 Constructors
AV Rule 71

Calls to an externally visible operation of an object, other than its constructors, shall not be allowed until the object has been fully initialized. Rationale: Avoid problems resulting from incomplete object initialization. Further details are given in AV Rule 71 in Appendix A.
AV Rule 71.1

A class's virtual functions shall not be invoked from its destructor or any of its constructors. Rationale: A class's virtual functions are resolved statically (not dynamically) in its constructors and destructor. See AV Rule 71.1 in Appendix A for additional details.
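The static resolution described in AV Rule 71.1 can be observed with a small sketch. The Sensor classes are hypothetical, and the constructor deliberately breaks the rule so that the effect is visible; it also assigns in the constructor body only to keep the demonstration short. A C++11 compiler is assumed for override.

```cpp
#include <cassert>
#include <string>

// Hypothetical base class whose constructor calls a virtual function.
class Sensor {
public:
    // Violates AV Rule 71.1 on purpose: this call binds to Sensor::name,
    // even when a derived object is being constructed.
    Sensor() { name_seen_by_constructor = name(); }
    virtual ~Sensor() {}
    virtual std::string name() const { return "Sensor"; }
    const std::string& get_name_seen_by_constructor() const {
        return name_seen_by_constructor;
    }
private:
    std::string name_seen_by_constructor;
};

class Pressure_sensor : public Sensor {
public:
    std::string name() const override { return "Pressure_sensor"; }
};

inline void check_constructor_dispatch() {
    Pressure_sensor sensor;
    // During base construction the override was not used...
    assert(sensor.get_name_seen_by_constructor() == "Sensor");
    // ...but after construction, dynamic dispatch works as usual.
    assert(sensor.name() == "Pressure_sensor");
}
```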
AV Rule 72

The invariant (see footnote 1) for a class should be:
- a part of the postcondition of every class constructor,
- a part of the precondition of the class destructor (if any),
- a part of the precondition and postcondition of every other publicly accessible operation.
Rationale: Prohibit clients from influencing the invariant of an object through any other means than the public interface.
AV Rule 73

Unnecessary default constructors shall not be defined. See Meyers [7], item 4. (See also AV Rule 143). Rationale: Discourage programmers from creating objects until the requisite data is available for complete object construction (i.e. prevent objects from being created in a partially initialized state). See AV Rule 73 in Appendix A for examples.
AV Rule 74

Initialization of nonstatic class members will be performed through the member initialization list rather than through assignment in the body of a constructor. See Meyers [6], item 12. Exception: Assignment should be used when an initial value cannot be represented by a simple expression (e.g. initialization of array values), or when a name must be introduced before it can be initialized (e.g. value received via an input stream). See AV Rule 74 in Appendix A for details.
AV Rule 75

Members of the initialization list shall be listed in the order in which they are declared in the class. See Stroustrup [2], 10.4.5 and Meyers [6], item 13. Note: Since base class members are initialized before derived class members, base class initializers should appear at the beginning of the member initialization list. Rationale: Members of a class are initialized in the order in which they are declared, not the order in which they appear in the initialization list.
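A minimal sketch of a class that follows AV Rules 74 and 75 is shown below. The Index_range class is hypothetical. Because last_index reads size in its initializer, the example only works because size is declared (and therefore initialized) first; swapping the two declarations would make last_index read an uninitialized value, whatever order the initializer list uses.

```cpp
#include <cassert>

// Hypothetical class whose second member depends on the first.
class Index_range {
public:
    // Initializers are listed in declaration order: size, then last_index.
    explicit Index_range(int element_count)
        : size(element_count), last_index(size - 1) {}
    int get_size() const { return size; }
    int get_last_index() const { return last_index; }
private:
    int size;        // declared first, so initialized first
    int last_index;  // may safely read size in its initializer
};

inline void check_index_range() {
    const Index_range range(10);
    assert(range.get_size() == 10);
    assert(range.get_last_index() == 9);
}
```

Keeping the list in declaration order means the code reads in the same order in which it actually runs.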
AV Rule 76

A copy constructor and an assignment operator shall be declared for classes that contain pointers to data items or nontrivial destructors. See Meyers [6], item 11. Note: See also AV Rule 80 which indicates that default copy and assignment operators are preferable when those operators offer reasonable semantics. Rationale: Ensure resources are appropriately managed during copy and assignment operations. See AV Rule 76 in Appendix A for additional details.
AV Rule 77

A copy constructor shall copy all data members and bases that affect the class invariant (a data element representing a cache, for example, would not need to be copied).

Note: If a reference counting mechanism is employed by a class, a literal copy need not be performed in every case. See also AV Rule 83. Rationale: Ensure data members and bases are properly handled when an object is copied. See AV Rule 77 in Appendix A for additional details.
Footnote 1: A class invariant is a statement-of-fact about a class that must be true for all stable instances of the class. A class is considered to be in a stable state immediately after construction, immediately before destruction, and immediately before and after any remote public method invocation.

AV Rule 77.1

The definition of a member function shall not contain default arguments that produce a signature identical to that of the implicitly-declared copy constructor for the corresponding class/structure. Rationale: Compilers are not required to diagnose this ambiguity. See AV Rule 77.1 in Appendix A for additional details.

4.10.6.3 Destructors
AV Rule 78

All base classes with a virtual function shall define a virtual destructor. Rationale: Prevent undefined behavior. If an application attempts to delete a derived class object through a base class pointer, the result is undefined if the base class's destructor is non-virtual. Note: This rule does not imply that dynamic memory (allocation/deallocation from the free store) will be used. See AV Rule 206.
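The following sketch illustrates AV Rule 78 with two hypothetical classes. It uses new and delete only to make the destructor call visible; as the note above says, the rule itself does not imply that the free store is used. The destructor counter passed to the constructor is an illustration device, not part of the rule.

```cpp
#include <cassert>

// Hypothetical abstract base with a virtual destructor.
class Shape {
public:
    virtual ~Shape() {}
    virtual double area() const = 0;
};

class Square : public Shape {
public:
    Square(double side_length, int* destructor_counter)
        : side(side_length), counter(destructor_counter) {}
    // Runs when deleting through Shape* because ~Shape is virtual.
    ~Square() { ++(*counter); }
    double area() const { return side * side; }
private:
    double side;
    int* counter;  // not owned; only used to record that the destructor ran
};

inline void check_virtual_destructor() {
    int destroyed = 0;
    Shape* shape = new Square(2.0, &destroyed);
    assert(shape->area() == 4.0);
    delete shape;            // well defined: ~Square runs, then ~Shape
    assert(destroyed == 1);
}
```

Without the virtual keyword on ~Shape, the delete expression above would have undefined behavior.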
AV Rule 79

All resources acquired by a class shall be released by the class's destructor. See Stroustrup [2], 14.4 and Meyers [7], item 9. Rationale: Prevention of resource leaks, especially in error cases. See AV Rule 79 in Appendix A for additional details.

4.10.7 Assignment Operators
AV Rule 80

The default copy and assignment operators will be used for classes when those operators offer reasonable semantics. Rationale: The default versions are more likely to be correct, easier to maintain, and more efficient than versions written by hand.
AV Rule 81

The assignment operator shall handle self-assignment correctly (see Stroustrup [2], Appendix E.3.3 and 10.4.4) Rationale: a = a; must function correctly. See AV Rule 81 in Appendix A for examples.
AV Rule 82

An assignment operator shall return a reference to *this. Rationale: Both the standard library types and the built-in types behave in this manner. See AV Rule 81 for an example of an assignment operator overload.
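AV Rules 76, 79, 81 and 82 can be seen working together in one small class. The Int_buffer class below is hypothetical and is not the Appendix A example; it is a sketch of one common way to satisfy the rules for a class that owns a pointer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical class that owns a dynamically allocated array.
class Int_buffer {
public:
    explicit Int_buffer(std::size_t count)
        : length(count), data(new int[count]()) {}
    Int_buffer(const Int_buffer& other)                 // AV Rule 76
        : length(other.length), data(new int[other.length]) {
        std::copy(other.data, other.data + length, data);
    }
    Int_buffer& operator=(const Int_buffer& other) {
        if (this != &other) {                           // AV Rule 81
            int* fresh = new int[other.length];         // allocate before releasing
            std::copy(other.data, other.data + other.length, fresh);
            delete[] data;
            data = fresh;
            length = other.length;
        }
        return *this;                                   // AV Rule 82
    }
    ~Int_buffer() { delete[] data; }                    // AV Rule 79
    int& at(std::size_t index) { return data[index]; }
    std::size_t size() const { return length; }
private:
    std::size_t length;
    int* data;
};

inline void check_assignment() {
    Int_buffer a(3);
    a.at(0) = 42;
    Int_buffer& alias = a;
    a = alias;                     // self-assignment leaves a intact
    assert(a.size() == 3 && a.at(0) == 42);
    Int_buffer b(1);
    Int_buffer c(1);
    c = b = a;                     // chaining works because *this is returned
    assert(c.size() == 3 && c.at(0) == 42);
    c.at(0) = 7;                   // deep copy: a is unaffected
    assert(a.at(0) == 42);
}
```

Without the self-assignment check, a version that deleted data before copying would read from memory it had just released.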
AV Rule 83

An assignment operator shall assign all data members and bases that affect the class invariant (a data element representing a cache, for example, would not need to be copied).

Note: To correctly copy a stateful virtual base in a portable manner, it must hold that if x1 and x2 are objects of virtual base X, then x1=x2; x1=x2; must be semantically equivalent to x1=x2; [10] 12.8(13) Rationale: Ensure data members and bases are properly handled under assignment. See AV Rule 83 in Appendix A for additional details. See also AV Rule 77.

4.10.8 Operator Overloading


AV Rule 84

Operator overloading will be used sparingly and in a conventional manner. Rationale: Since unconventional or inconsistent uses of operator overloading can easily lead to confusion, operator overloads should only be used to enhance clarity and should follow the natural meanings and conventions of the language. For instance, a C++ operator "+=" shall have the same meaning as "+" and "=".
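One common way to keep "+=" consistent with "+" and "=" is to implement operator+ in terms of operator+=. The Vector_2d class below is a hypothetical illustration of that idea, not code from the standard.

```cpp
#include <cassert>

// Hypothetical two-component vector.
class Vector_2d {
public:
    Vector_2d(int x_value, int y_value) : x(x_value), y(y_value) {}
    Vector_2d& operator+=(const Vector_2d& other) {
        x += other.x;
        y += other.y;
        return *this;
    }
    int get_x() const { return x; }
    int get_y() const { return y; }
private:
    int x;
    int y;
};

// operator+ is written in terms of operator+=, so the two cannot drift apart.
inline Vector_2d operator+(Vector_2d left, const Vector_2d& right) {
    left += right;
    return left;
}

inline void check_plus() {
    Vector_2d a(1, 2);
    const Vector_2d b(3, 4);
    const Vector_2d sum = a + b;   // a is unchanged by +
    assert(a.get_x() == 1 && a.get_y() == 2);
    a += b;                        // a = a + b by another route
    assert(sum.get_x() == a.get_x() && sum.get_y() == a.get_y());
    assert(sum.get_x() == 4 && sum.get_y() == 6);
}
```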
AV Rule 85

When two operators are opposites (such as == and !=), both will be defined and one will be defined in terms of the other. Rationale: If operator==() is supplied, then one could reasonably expect that operator!=() would be supplied as well. Furthermore, defining one in terms of the other simplifies maintenance. See AV Rule 85 in Appendix A for an example.

4.10.9 Inheritance
Class hierarchies are appropriate when run-time selection of implementation is required. If run-time resolution is not required, template parameterization should be considered (templates are better-behaved and faster than virtual functions). Finally, simple independent concepts should be expressed as concrete types. The method selected to express the solution should be commensurate with the complexity of the problem. The following rules provide additional detail and guidance when considering the structure of inheritance hierarchies.
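The difference between run-time selection through a class hierarchy and compile-time selection through template parameterization can be sketched as follows. All names here are hypothetical, and the sketch makes no measurement of speed; it only shows the two mechanisms side by side.

```cpp
#include <cassert>

// Run-time selection: the implementation is chosen through a virtual call.
class Filter {
public:
    virtual ~Filter() {}
    virtual int apply(int sample) const = 0;
};

class Doubling_filter : public Filter {
public:
    int apply(int sample) const { return sample * 2; }
};

inline int process_dynamic(const Filter& filter, int sample) {
    return filter.apply(sample);
}

// Compile-time selection: any type with a suitable apply() will do,
// and no common base class is required.
struct Halving_policy {
    int apply(int sample) const { return sample / 2; }
};

template <typename Policy>
int process_static(const Policy& policy, int sample) {
    return policy.apply(sample);
}

inline void check_selection() {
    assert(process_dynamic(Doubling_filter(), 21) == 42);
    assert(process_static(Halving_policy(), 84) == 42);
    // A class from the hierarchy also satisfies the template's requirements.
    assert(process_static(Doubling_filter(), 21) == 42);
}
```

With process_static the concrete type is known at each call site, so the choice is fixed when the program is compiled; with process_dynamic the same compiled function can serve filters that are only chosen while the program runs.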
AV Rule 86

Concrete types should be used to represent simple independent concepts. See Stroustrup [2], 25.2. Rationale: Well designed concrete classes tend to be efficient in both space and time, have minimal dependencies on other classes, and tend to be both comprehensible and usable in isolation.
AV Rule 87

Hierarchies should be based on abstract classes. See Stroustrup [2], 12.5. Rationale: Hierarchies based on abstract classes tend to focus designs toward producing clean interfaces, keep implementation details out of interfaces, and minimize compilation dependencies while allowing alternative implementations to coexist. See AV Rule 87 in Appendix A for examples.
AV Rule 88

Multiple inheritance shall only be allowed in the following restricted form: n interfaces plus m private implementations, plus at most one protected implementation. Rationale: Multiple inheritance can lead to complicated inheritance hierarchies that are difficult to comprehend and maintain. See AV Rule 88 in Appendix A for examples of both appropriate and inappropriate uses of multiple inheritance.
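The restricted form allowed by AV Rule 88 might look like the sketch below: two interfaces plus one private implementation. The class names are hypothetical and this is not the Appendix A example.

```cpp
#include <cassert>

// Two interfaces: only pure virtual functions, no data.
class Readable {
public:
    virtual ~Readable() {}
    virtual int read() const = 0;
};

class Resettable {
public:
    virtual ~Resettable() {}
    virtual void reset() = 0;
};

// A private implementation: reused code that is not part of the public type.
class Counter_impl {
public:
    Counter_impl() : count(0) {}
    void increment() { ++count; }
    void clear() { count = 0; }
    int current() const { return count; }
private:
    int count;
};

class Event_counter : public Readable, public Resettable, private Counter_impl {
public:
    void record_event() { increment(); }
    int read() const { return current(); }
    void reset() { clear(); }
};

inline void check_event_counter() {
    Event_counter counter;
    counter.record_event();
    counter.record_event();
    const Readable& readable = counter;   // used through one interface
    assert(readable.read() == 2);
    Resettable& resettable = counter;     // and through the other
    resettable.reset();
    assert(readable.read() == 0);
}
```

Clients see Event_counter only through Readable and Resettable; Counter_impl stays an implementation detail because it is inherited privately.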
AV Rule 88.1

A stateful virtual base shall be explicitly declared in each derived class that accesses it. Rationale: Explicitly declaring a stateful virtual base at each level in a hierarchy (where that base is used), documents that fact that no assumptions can be made with respect to the exclusive use of the data contained within the virtual base. See AV Rule 88.1 in Appendix A for additional details.
AV Rule 89

A base class shall not be both virtual and non-virtual in the same hierarchy. Rationale: Hierarchy becomes difficult to comprehend and use.
AV Rule 90

Heavily used interfaces should be minimal, general and abstract. See Stroustrup [2] 23.4. Rationale: Enable interfaces to exhibit stability in the face of changes to their hierarchies.
AV Rule 91

Public inheritance will be used to implement is-a relationships. See Meyers [6], item 35. Rationale: Public inheritance and private inheritance mean very different things in C++ and should therefore be used accordingly. Public inheritance implies an is-a relationship. That is, every object of a publicly derived class D is also an object of the base type B, but not vice versa. Moreover, type B represents a more general concept than type D, and type D represents a more specialized concept than type B. Thus, stating that D publicly inherits from B is an assertion that D is a subtype of B. That is, objects of type D may be used anywhere that objects of type B may be used (since an object of type D is really an object of type B as well). In contrast to public inheritance, private and protected inheritance mean is-implemented-in-terms-of. It is purely an implementation technique; the interface is ignored. See also AV Rule 93.
AV Rule 92

Subtypes (publicly derived classes) will conform to the following guidelines with respect to all classes involved in the polymorphic assignment of different subclass instances to the same variable or parameter during the execution of the system:

Preconditions of derived methods must be at least as weak as the preconditions of the methods they override.
Postconditions of derived methods must be at least as strong as the postconditions of the methods they override.

In other words, subclass methods must expect less and deliver more than the base class methods they override. This rule implies that subtypes will conform to the Liskov Substitution Principle. Rationale: Predictable behavior of derived classes when used within base class contexts. See AV Rule 92 in Appendix A for additional details.
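A minimal sketch of "expect less and deliver more" follows. Limiter, StrictLimiter and ApplyLimit are hypothetical names chosen for illustration, and the contracts are written as comments and assertions; this is not the example from Appendix A.

```cpp
#include <cassert>

class Limiter {
public:
    virtual ~Limiter() {}
    // Precondition:  level >= 0
    // Postcondition: 0 <= result <= 100
    virtual int Limit(int level) const {
        assert(level >= 0);
        return level > 100 ? 100 : level;
    }
};

class StrictLimiter : public Limiter {
public:
    // Precondition:  none (weaker: negative levels are accepted too)
    // Postcondition: 0 <= result <= 50 (stronger: a subset of the base range)
    int Limit(int level) const override {
        if (level < 0) return 0;
        return level > 50 ? 50 : level;
    }
};

// Client code written against the base class contract only.
inline int ApplyLimit(const Limiter& limiter, int level) {
    assert(level >= 0);                     // honours the base precondition
    const int result = limiter.Limit(level);
    assert(result >= 0 && result <= 100);   // relies only on the base postcondition
    return result;
}
```

Because StrictLimiter accepts everything Limiter accepts and promises a result inside the range Limiter promises, ApplyLimit behaves predictably with either type.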
AV Rule 93

has-a or is-implemented-in-terms-of relationships will be modeled through membership or non-public inheritance. See Meyers [6], item 40. Rationale: Public inheritance means is-a (see AV Rule 91) while nonpublic inheritance means has-a or is-implemented-in-terms-of. See AV Rule 93 in Appendix A for examples.
AV Rule 94

An inherited nonvirtual function shall not be redefined in a derived class. See Meyers [6], item 37. Rationale: Prevent an object from exhibiting two-faced behavior. See AV Rule 94 in Appendix A for an example.
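The "two-faced behavior" can be sketched as below. Base, Derived and DemonstrateTwoFaces are hypothetical names, and the example is not the one given in Appendix A.

```cpp
#include <cassert>
#include <string>

class Base {
public:
    virtual ~Base() {}
    std::string Name() const { return "Base"; }           // nonvirtual
    virtual std::string Kind() const { return "Base"; }   // virtual
};

class Derived : public Base {
public:
    // Redefines (hides) the inherited nonvirtual function: the practice
    // this rule forbids.
    std::string Name() const { return "Derived"; }
    std::string Kind() const override { return "Derived"; }
};

// The same object answers differently depending on the static type of the
// expression used to reach it.
inline void DemonstrateTwoFaces() {
    Derived d;
    Base& asBase = d;
    assert(d.Name() == "Derived");       // static type Derived
    assert(asBase.Name() == "Base");     // static type Base, same object
    assert(asBase.Kind() == "Derived");  // virtual call: consistent answer
}
```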
AV Rule 95

An inherited default parameter shall never be redefined. See Meyers [6], item 38. Rationale: The redefinition of default parameters for virtual functions often produces surprising results. See AV Rule 95 in Appendix A for an example.
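The surprise comes from default arguments being chosen from the static type while the function body is chosen from the dynamic type. A small sketch, with hypothetical Tone and LoudTone classes (not the Appendix A example):

```cpp
#include <cassert>

class Tone {
public:
    virtual ~Tone() {}
    virtual int Volume(int level = 1) const { return level; }
};

class LoudTone : public Tone {
public:
    // Redefines the inherited default (1 becomes 10): the practice this
    // rule forbids.
    int Volume(int level = 10) const override { return level * 2; }
};

inline void DemonstrateDefaultMismatch() {
    LoudTone loud;
    const Tone& asBase = loud;
    assert(loud.Volume() == 20);    // default 10 from LoudTone, body from LoudTone
    assert(asBase.Volume() == 2);   // default 1 from Tone, body still from LoudTone
}
```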
AV Rule 96

Arrays shall not be treated polymorphically. See Meyers [7], item 3. Rationale: Array indexing in C/C++ is implemented as pointer arithmetic. Hence, a[i] is equivalent to a+i*SIZEOF(array element). Since derived classes are often larger than base classes, polymorphism and pointer arithmetic are not compatible techniques.
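The size mismatch behind this rule, and one possible way around it, can be sketched as follows. Base, Derived and TotalWeight are hypothetical names; an array of pointers is just one alternative and is not prescribed by the standard.

```cpp
#include <cassert>
#include <cstddef>

class Base {
public:
    Base() : m_id(0) {}
    virtual ~Base() {}
    virtual int Weight() const { return 1; }
private:
    int m_id;
};

class Derived : public Base {
public:
    Derived() : m_extra() {}
    int Weight() const override { return 2; }
private:
    int m_extra[4];
};

// A Derived[] accessed through a Base* would be indexed in steps of
// sizeof(Base), which lands inside an element rather than at its start.
static_assert(sizeof(Derived) > sizeof(Base),
              "derived objects are larger than base objects here");

// An array of pointers avoids the problem: every array element has the
// same size (one pointer), whatever type it points to.
inline int TotalWeight(const Base* const* items, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        assert(items[i] != nullptr);
        total += items[i]->Weight();
    }
    return total;
}
```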
AV Rule 97

Arrays shall not be used in interfaces. Instead, the Array class should be used. Rationale: Arrays degenerate to pointers when passed as parameters. This array decay problem has long been known to be a source of errors. Note: See Array.doc for guidance concerning the proper use of the Array class, including its interaction with memory management and error handling facilities.

4.10.10 Virtual Member Functions
AV Rule 97.1

Neither operand of an equality operator (== or !=) shall be a pointer to a virtual member function. Rationale: If either operand of an equality operator (== or !=) is a pointer to a virtual member function, the result is unspecified [10], 5.10(2).

Several other sections have also touched on virtual member functions and polymorphism. Hence, the following cross references are provided so that these rules may be accessed from a single location: AV Rule 71, AV Rule 78, AV Rule 87-AV Rule 97, and AV Rule 221.

Object Oriented Design rules I


http://eventhelix.com/RealtimeMantra/Object_Oriented/object_design_tips.htm
Copyright 2000-2011 EventHelix.com Inc. All Rights Reserved.

Here is an assortment of tips to keep in mind when using object oriented design in embedded systems:

Stay close to problem domain
Object discovery vs. object invention
Pick nouns or noun phrases as classes
Method names should contain a verb
Prefix adjectives when naming inheriting classes
Do not add suffixes to class names
Avoid one-to-one mapping from structured design
Replace multiple get-set methods with operations
Model classes that handle messages as state machines
Use const whenever possible
Restrict header file level dependency
Don't reinvent the wheel; use STL

Stay close to problem domain


Design is a process of modeling the problem domain into programming constructs. Object oriented design simplifies the design process by maintaining a one-to-one mapping between problem domain objects and software objects. To succeed in object oriented design, keep your design as close as possible to problem domain objects. The interactions between your objects should mirror interactions between corresponding problem domain objects. A problem domain object is simply an object that can be found in the problem itself. For example, when developing a text editor, the real-world objects would be Paragraph, Sentence, Word, ScrollBar, TextSelection etc. While developing a call processing module, the objects might be Call, Ringer, ToneDetector, Subscriber etc.

Object discovery vs. object invention


The first step in object oriented analysis is to discover the objects that can be directly identified from the problem itself. In many cases objects can be identified from the requirements. Objects discovered from the problem statement are extremely important. These objects will be the core objects in the design. The next stage in object design is to "invent" objects. These objects are needed to "glue" together objects that have been identified during object discovery. Invented objects generally do not correspond to anything tangible in the problem domain. They are inventions of programmers to simplify design. Consider the following statement from the requirements:

The circuit controller shall support digital and analog circuits. The circuit controller shall contain 32 DSPs. When the circuit controller receives a request to set up a circuit, it shall allocate a DSP to the circuit.

We discover the following objects from the requirement:


CircuitController
DigitalCircuit
AnalogCircuit
DSP

We invent the following objects based on our knowledge of the manager design pattern:

DSPManager: Manages the 32 DSPs on the circuit controller
CircuitManager: Manages the digital and analog circuits

We invent a Circuit base class for DigitalCircuit and AnalogCircuit by filtering properties that are common to DigitalCircuit and AnalogCircuit objects. The relationship between the classes also follows from the requirement. The CircuitController class contains DSPManager and CircuitManager objects. The CircuitManager contains an array of Circuit class pointers. The DSPManager contains an array of DSP objects.
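The relationships just described could be sketched as below. The class names and the 32 DSPs come from the text; everything else (MAX_CIRCUITS, the method names Reserve, Allocate, Register, SetupCircuit and IsDigital, and the allocation strategy) is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>

// The 32 DSPs come from the requirement; MAX_CIRCUITS is an assumed value.
const std::size_t MAX_DSPS = 32;
const std::size_t MAX_CIRCUITS = 64;

class DSP {
public:
    DSP() : m_reserved(false) {}
    // Reserves this DSP; returns false if it was already taken.
    bool Reserve() {
        if (m_reserved) return false;
        m_reserved = true;
        return true;
    }
private:
    bool m_reserved;
};

// Invented base class holding what digital and analog circuits share.
class Circuit {
public:
    virtual ~Circuit() {}
    virtual bool IsDigital() const = 0;
};

class DigitalCircuit : public Circuit {
public:
    bool IsDigital() const override { return true; }
};

class AnalogCircuit : public Circuit {
public:
    bool IsDigital() const override { return false; }
};

// Invented manager: contains the array of DSP objects.
class DSPManager {
public:
    // Returns a newly reserved DSP, or nullptr when all are in use.
    DSP* Allocate() {
        for (std::size_t i = 0; i < MAX_DSPS; ++i) {
            if (m_dsp[i].Reserve()) return &m_dsp[i];
        }
        return nullptr;
    }
private:
    DSP m_dsp[MAX_DSPS];
};

// Invented manager: contains an array of Circuit pointers.
class CircuitManager {
public:
    CircuitManager() : m_pCircuit() {}   // all slots start as null
    void Register(std::size_t id, Circuit* pCircuit) {
        assert(id < MAX_CIRCUITS);
        m_pCircuit[id] = pCircuit;
    }
private:
    Circuit* m_pCircuit[MAX_CIRCUITS];
};

class CircuitController {
public:
    // Setting up a circuit allocates a DSP to it, as the requirement states.
    DSP* SetupCircuit(std::size_t id, Circuit* pCircuit) {
        m_circuitManager.Register(id, pCircuit);
        return m_dspManager.Allocate();
    }
private:
    DSPManager m_dspManager;
    CircuitManager m_circuitManager;
};
```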

Pick nouns or noun phrases as classes


Identifying objects is easy: they should always be nouns. As we have seen in the Circuit Controller example, we picked up nouns from the requirements as classes in our design. Even when you invent classes, keep in mind that they should be nouns. Abstract concepts don't qualify as object names. Naming the objects is extremely important in object oriented design. Chances are that if you name your object correctly, the designers and maintainers will assign it functionality that fits its name. Also note that if you have trouble naming an object, you probably have the wrong object. At this point go back, look at the problem again and see if you can pick an alternative object.

Method names should contain verbs


In any language, actions performed by nouns are specified using verbs. Why should object oriented programming be any different? Make sure that all operation methods contain verbs. The Circuit class we discussed earlier would thus have methods like:

Activate
Deactivate
Block
Unblock
ChangeStatus

Notice that the methods do not include Circuit in the name (ActivateCircuit, BlockCircuit etc.). Since they are methods of Circuit, it is clear that they refer to operations on Circuit.

Prefix adjectives when naming inheriting classes


This one is fairly obvious. When a class inherits from a base class, the name for the new class can be determined just by prefixing the base class name with the appropriate adjective. For example, classes inheriting from Circuit are called AnalogCircuit and DigitalCircuit. Following this convention leads to class names that convey information about the class's inheritance.

Do not add suffixes to class names


Do not add suffixes like Descriptor, ControlBlock, Agent to the class names. For example, DigitalCircuit should not be called DigitalCircuitDescriptor or DigitalCircuitControlBlock. Such names are longer and do not convey the exact role of the class.

Avoid one-to-one mapping from structured design


Many developers moving from structured design just continue with structured design in C++. The classes developed correspond more to similar structured constructs they have used in the past. Similarity between C and C++ confuses developers. Make no mistake, object oriented programming is a completely different technique. The emphasis here is to keep the design process simple by minimizing the difference between the problem domain and software domain.

Replace multiple get-set methods with operations


Developers complain that after moving to object oriented programming, they spend considerable time writing mindless get and set methods. Here is a simple tip on reducing the get and set methods. Consider the code below:

Circuit Status (Multiple Get-Set)


void CircuitManager::GetStatus(CircuitStatusMsg *pMsg) const
{
    // The message is filled in here, so pMsg must not point to const
    for (int i = 0; i < MAX_CIRCUITS; i++)
    {
        pMsg->circuitInfo[i].circuitId = m_pCircuit[i]->GetId();
        pMsg->circuitInfo[i].circuitType = m_pCircuit[i]->GetType();
        pMsg->circuitInfo[i].circuitStatus = m_pCircuit[i]->GetStatus();
        pMsg->circuitInfo[i].circuitCallId = m_pCircuit[i]->GetCallId();
        pMsg->circuitInfo[i].circuitState = m_pCircuit[i]->GetState();
    }
}

The above code can be replaced by moving the field filling in the message to the Circuit class. This way you do not need to define a large number of get operations. Also, any changes in the CircuitInfo field would result only in changes to the Circuit class. CircuitManager would be transparent as it does not look into CircuitInfo.

Circuit Status (Single Operation)


void CircuitManager::GetStatus(CircuitStatusMsg *pMsg) const
{
    // The message is filled in here, so pMsg must not point to const
    for (int i = 0; i < MAX_CIRCUITS; i++)
    {
        m_pCircuit[i]->UpdateStatus(pMsg->circuitInfo[i]);
    }
}

void Circuit::UpdateStatus(CircuitInfo &circuitInfo) const
{
    circuitInfo.circuitId = m_id;
    circuitInfo.circuitType = m_type;
    circuitInfo.circuitStatus = m_status;
    circuitInfo.circuitCallId = m_callId;
    circuitInfo.circuitState = m_state;
}

Model classes that handle messages as state machines


Whenever you encounter a class that has to perform some level of message handling, it is usually better to model it as a state machine. We have discussed this in the article on hierarchical state machines.

Use const whenever possible


C++ provides powerful support for const methods and fields. const should be used in the following cases:

Methods that do not change the value of any variable in the class should be declared const methods.
If a function is supposed to just read information from a class, pass a const pointer or reference to this function. The called function would be restricted to calling const methods and using the class's fields only on the right side of an expression.

Proper and consistent use of const will help you catch several bugs at compile time. So start using const from day one of your project. If const is not used extensively from the beginning of a project, it will be close to impossible to add it later.
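Both uses of const can be sketched as follows. This Circuit class and the ReadActiveId function are simplified, hypothetical versions written for this illustration.

```cpp
#include <cassert>

class Circuit {
public:
    explicit Circuit(int id) : m_id(id), m_active(false) {
        assert(id >= 0);  // assumed: ids are non-negative, so -1 can mean "none"
    }

    // const methods: they do not change any member variable.
    int GetId() const { return m_id; }
    bool IsActive() const { return m_active; }

    // Non-const method: it modifies the object.
    void Activate() { m_active = true; }

private:
    int m_id;
    bool m_active;
};

// This function only reads from the circuit, so it takes a const reference.
// Returns the circuit id if the circuit is active, otherwise -1.
inline int ReadActiveId(const Circuit& circuit) {
    // circuit.Activate();   // would not compile: Activate() is not const
    return circuit.IsActive() ? circuit.GetId() : -1;
}
```

The commented-out call shows the kind of bug that the compiler catches once const is used consistently.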

Restrict header file level dependency


Complex software requires careful header file management even when programming in C. When developers move to C++, header file management becomes even more complex and time consuming. Reduce header file dependency by effective use of forward declarations in header files. Sometimes to reduce header file dependency you might have to change member variables from values to pointers. This might also warrant changing inline functions to out-of-line functions. Every time you use a #include make sure that you have an extremely good reason to do so. For details refer to the header file include patterns article.
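A single-file sketch of the forward declaration technique follows. The comments mark where the file boundaries would be in a real project; the file names and the Attach and AttachedId methods are hypothetical.

```cpp
#include <cassert>

// ---- CircuitManager.h (sketch) ----
class Circuit;   // forward declaration instead of #include "Circuit.h"

class CircuitManager {
public:
    CircuitManager();
    void Attach(Circuit* pCircuit);   // a pointer parameter needs no full type
    int AttachedId() const;           // out of line: the body needs the full type
private:
    Circuit* m_pCircuit;              // pointer member instead of a value member
};

// ---- Circuit.h (sketch) ----
class Circuit {
public:
    explicit Circuit(int id) : m_id(id) {}
    int GetId() const { return m_id; }
private:
    int m_id;
};

// ---- CircuitManager.cpp (sketch): only this file includes Circuit.h ----
CircuitManager::CircuitManager() : m_pCircuit(nullptr) {}

void CircuitManager::Attach(Circuit* pCircuit) {
    assert(pCircuit != nullptr);
    m_pCircuit = pCircuit;
}

// Returns the attached circuit's id, or -1 if nothing is attached.
int CircuitManager::AttachedId() const {
    return m_pCircuit ? m_pCircuit->GetId() : -1;
}
```

Files that include only CircuitManager.h would then not need to be recompiled when Circuit.h changes.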

Don't reinvent the wheel; use STL


The C++ standard template library is extremely powerful. It can save countless hours of coding and testing of complex containers and queues. Details can be found in the STL design patterns and STL design patterns II articles.

Object Oriented Design rules II


http://eventhelix.com/RealtimeMantra/Object_Oriented/object_design_tips_2.htm
Copyright 2000-2011 EventHelix.com Inc. All Rights Reserved. We have already covered object oriented design tips in a previous article. Here we will look at more tips that will help you improve your object oriented design skills:

Class with just get-set methods points to missed delegation
Replace an array of structures with an array of objects
Delegate work to helper class
Multi-dimensional arrays point to incomplete class identification
Multiple nested loops point to incomplete delegation
Class with very large numbers of methods points to incomplete class identification
Don't go overboard with inheritance
Prefer delegation to inheritance
Don't scatter the abstraction
Consider group of objects to split work amongst team members
Use nested classes for lightweight helper classes
Use templates to improve type safety and performance
Divide your code into framework and application parts

Class with just get-set methods points to missed delegation


Many times while developing classes you might find that a particular class you have developed has just get and set based methods. There are no methods to perform any operations on the object. In many cases, this points to inadequate delegation of work by the caller. Examine the caller of the get-set methods. Look for operations that could be delegated to the class with just get-set methods. The example below shows a DSP class that has get and set methods. The Message Handler class was doing most of the processing.

Overworked Message Handler class and a simple Get-Set DSP class


class DSP
{
public:
    Queue *GetWriteBufferQueue();
    Queue *GetReadBufferQueue();
    void SetStatusRegister(int mask);
    void SetCongestionLevel();
    void SetWriteBufferQueue(Queue *pQueue);
    void SetReadBufferQueue(Queue *pQueue);
};

Status MessageHandler::SendMessage(Buffer *pBuffer)
{
    int dspId = pBuffer->GetDSP();
    DSP *pDSP = m_dsp[dspId];
    Queue *pQueue = pDSP->GetWriteBufferQueue();
    int status = pQueue->Add(pBuffer);
    pDSP->SetStatusRegister(WRITTEN_MESSAGE);
    if (pQueue->GetLength() > CONGESTION_THRESHOLD)
    {
        pDSP->SetCongestionLevel();
    }
    return status;
}

The above classes have been transformed to assign most of the DSP queue management to the DSP class itself. This has simplified the design of the Message Handler class. The interfaces of the DSP class have also been simplified.

Here the DSP class does most of the work

class DSP
{
public:
    Status WriteBuffer(Buffer *pBuffer);
    Buffer *ReadBuffer();
};

Status MessageHandler::SendMessage(Buffer *pBuffer)
{
    int dspId = pBuffer->GetDSP();
    DSP *pDSP = m_dsp[dspId];
    int status = pDSP->WriteBuffer(pBuffer);
    return status;
}

Status DSP::WriteBuffer(Buffer *pBuffer)
{
    int status = m_pQueue->Add(pBuffer);
    IOWrite(WRITTEN_MESSAGE);
    if (m_pQueue->GetLength() > CONGESTION_THRESHOLD)
    {
        m_bCongestionFlag = true;
    }
    return status;
}

Replace an array of structures with an array of objects


Whenever you end up with an array of structures in your class, consider whether you should convert the array of structures into an array of objects. Initially the structure array might be simple with only one or two fields. As coding progresses, more and more fields are added to the structure. By that time it might be too late to treat the structure as a class.

Delegate work to helper class


If you find that one of the classes in your design has too many methods and the code size for the class is much greater than your average class, consider inventing helper classes to handle some of the functionality of this class. This will simplify the design of the huge class, making it more maintainable. More importantly, you might be able to split the work amongst different developers.

Consider the following class:

Monolithic Digital Trunk Class


class DigitTrunk
{
    Status m_status;
    Timeslot m_timeslot[MAX_TIMESLOTS_PER_TRUNK];
    int m_signalingTimeslot;
    int m_signalingStatus;
    . . .
    int m_errorThreshold;
    int m_localErrorRate;
    int m_remoteErrorRate;
    . . .
public:
    . . .
    void HandleSignalingRequest();
    void SendSignalingIndication();
    . . .
    void HandleRemoteError();
    void HandleLocalError();
    . . .
};

The above class can be made more maintainable by adding private helper classes SignalingHandler and ErrorHandler.

Digital Trunk Class With Helper Classes


class DigitTrunk
{
    Status m_status;
    Timeslot m_timeslot[MAX_TIMESLOTS_PER_TRUNK];

    class SignalingHandler
    {
        int m_signalingTimeslot;
        int m_signalingStatus;
        . . .
    public:
        void HandleSignalingRequest();
        void SendSignalingIndication();
    };

    class ErrorHandler
    {
        int m_errorThreshold;
        int m_localErrorRate;
        int m_remoteErrorRate;
        . . .
    public:
        void HandleRemoteError();
        void HandleLocalError();
    };

    // Helper classes
    SignalingHandler m_signalingHandler;
    ErrorHandler m_errorHandler;

public:
    . . .
    void HandleSignalingRequest() { m_signalingHandler.HandleSignalingRequest(); }
    void SendSignalingIndication() { m_signalingHandler.SendSignalingIndication(); }
    . . .
    void HandleRemoteError() { m_errorHandler.HandleRemoteError(); }
    void HandleLocalError() { m_errorHandler.HandleLocalError(); }
    . . .
};

Multi-dimensional arrays point to incomplete class identification


If your design contains multi-dimensional arrays, this might point to missed class identification. The following example should clarify this:

Two dimensional DSP array declaration


. . .
// Each Signal Processor Card contains 32 DSPs.
// This is represented by a two dimensional
// array of DSP objects. The first dimension
// is the Signal Processor Card Id and the second
// dimension is the DSP id on the card
DSP m_dsp[MAX_SIGNAL_PROCESSOR_CARDS][MAX_DSPS_PER_CARD];

The above two dimensional array points to missed identification of SignalProcessingCard class. This has been fixed in the following code fragment:

SignalProcessingCard class eliminates two dimensional array


. . .
// The two dimensional array is replaced. We identify
// SignalProcessingCard as an object. This
// object contains 32 DSPs
class SignalProcessingCard
{
    DSP m_dsp[MAX_DSPS_PER_CARD];
public:
    . . .
};

// Array of signal processing card objects, indexed by signal processing card id
SignalProcessingCard m_signalProcessingCard[MAX_SIGNAL_PROCESSOR_CARDS];
. . .

Multiple nested loops point to incomplete delegation


Many times, nested loops point to incomplete delegation. Maybe the inner loop should have been delegated to a lower level object. Consider the above example of SignalProcessingCard and DSP.

Initializing all DSPs (nested loops)


. . .
for (int card = 0; card < MAX_SIGNAL_PROCESSOR_CARDS; card++)
{
    for (int dsp = 0; dsp < MAX_DSPS_PER_CARD; dsp++)
    {
        m_signalProcessingCard[card].GetDSP(dsp)->Initialize();
    }
}

The inner loop in the above code should be replaced with an Initialize method in SignalProcessingCard. Code that initializes SignalProcessingCard objects should not worry about DSP level initialization. This should be delegated to the Initialize method of the SignalProcessingCard.

Initializing all DSPs delegated to SignalProcessingCard class (no nested loop)


. . .
for (int card = 0; card < MAX_SIGNAL_PROCESSOR_CARDS; card++)
{
    m_signalProcessingCard[card].Initialize();
}

void SignalProcessingCard::Initialize()
{
    for (int dsp = 0; dsp < MAX_DSPS_PER_CARD; dsp++)
    {
        m_dsp[dsp].Initialize();
    }
}

Class with very large numbers of methods points to incomplete class identification
A class with a very large number of methods typically means that fine-grained object identification has been missed. At this stage, have a hard look at your design to identify more classes.

Don't go overboard with inheritance


This is a very common mistake made by designers new to object oriented design. Inheritance is such a wonderful concept that it is easy to go overboard and try to apply it everywhere. This problem can be avoided by using the litmus test: X should inherit from Y only if you can say that X is a Y. By this rule it is easy to see that Circle should inherit from Shape, as we can make the statement "Circle is a Shape". Inheritance is the most tightly coupled of all the relationships. Every inheritance relationship causes the derived class to strongly depend upon the base class. That dependency is hard to manage. Also note that the biggest benefits of object oriented design are obtained from composition and not inheritance. In our earlier example, programmers can develop SignalProcessingCard and DSP objects as if there was only one instance of the object. The multiplicity is achieved by just declaring an array of the objects.

Prefer delegation to inheritance


Many times, relationships are better modeled as delegation (see the editor's note below) than inheritance. When in doubt, always consider delegation as an alternative. Sometimes commonality in classes that do not meet the "is a" rule is better implemented by using a common helper class which implements the common functionality. This class can then be included as a member in both the classes. Consider two classes TerminalAllocator and DSPAllocator which use similar resource allocation algorithms. The two classes have completely different types of interfaces. You might be tempted to model this as both the classes inheriting from a common Allocator class which implements the common parts of the allocation algorithm. In many cases, it might be better to model TerminalAllocator and DSPAllocator as standalone classes with a helper class Allocator included as a member.

Modeled as inheritance

class TerminalAllocator : public Allocator
{
    . . .
};

class DSPAllocator : public Allocator
{
    . . .
};

Modeled as delegation
class TerminalAllocator
{
    Allocator m_allocator;
    . . .
};

class DSPAllocator
{
    Allocator m_allocator;
    . . .
};

Don't scatter the abstraction


This is a common mistake when multiple developers are working on a project. Each developer implements his or her part by designing the objects that they need, without considering whether other developers have similar objects. This scatters the abstraction of an object into several different objects which implement pieces of the whole object's functionality. In our example, this would mean that the design contains several objects that represent the SignalProcessingCard and DSP objects in different portions of the code. Each developer implements the parts of the SignalProcessingCard and DSP functionality that are needed in their domain. This results in scattering the functionality of an object over several incomplete objects. Needless to say, such code would be difficult to understand and hard to maintain.

Editor's note: Here, the term delegation is used instead of the more common term composition, meaning that a class contains an object (or objects) of another class. It describes the situation when composition is preferred over inheritance.

Consider group of objects to split work amongst team members


Embedded software developers often split work amongst team members by dividing the functionality into several tasks. With object oriented design, work can be divided in a much more fine-grained way by assigning a group of classes to a developer. In many cases you can implement all the functionality in a single task, thus greatly reducing the effort in designing intra-processor communication.

Use nested classes for lightweight helper classes


Many times you will encounter a situation where a small class might be useful in capturing some of the functionality of a large class. Often developers avoid adding such classes as they would result in a new set of header and source files. This brings associated changes like makefile updates and checking in new elements. Another problem with this approach is that you end up with simply too many classes. There is no way to isolate the important classes from the simple helper classes. The solution to this problem is to use small nested classes that are declared within the parent class. With this approach, the nested class does not appear amongst the top level classes in your design. This greatly reduces the total number of high level classes you have to deal with. (If you are using a tool like Microsoft Visual Studio, the nested classes would appear as tree nodes inside the parent class. Thus adding a new class does not increase the number of classes visible in the outermost nodes of the tree.) Nested classes can be made even more lightweight by letting developers write the code for the nested classes in the parent class source files. This lightweight mechanism would improve the readability of complex classes. The developers can now model the complex class as a set of lightweight helper classes.

DigitalTrunk.h : Nested class declaration


class DigitalTrunk
{
private:
    // Timeslot allocator is a private class used by DigitalTrunk. The full
    // name for the class is DigitalTrunk::TimeslotAllocator.
    class TimeslotAllocator
    {
    public:
        int Allocate();
        void Free(int timeslot);
    };

    // Timeslots in transmit and receive direction can be allocated
    // independently
    TimeslotAllocator m_transmitAllocator;
    TimeslotAllocator m_receiveAllocator;

public:
    . . .
};

DigitalTrunk.cpp : Nested class methods


// Nested class methods are also contained
// in the parent class's CPP file
int DigitalTrunk::TimeslotAllocator::Allocate()
{
    . . .
}

void DigitalTrunk::TimeslotAllocator::Free(int timeslot)
{
    . . .
}

Use templates to improve type safety and performance


Do not restrict yourself to using templates as defined in STL. Templates can be used to provide type safe and efficient code in the following cases:

Classes have common functionality but differ in the size of data structures. Such classes can be modeled as a base template class that takes the data structure sizes as template parameters.

Preprocessor macros are type independent but type unsafe. C++ inline functions are type safe but type dependent. Template functions can be used to replace macros as well as regular inline functions. Template functions are both type safe and type independent.

Pointer and reference based classes where the functionality is the same in classes but the type to operate on is different. In most such cases declaring a template base class with a generic type would solve this problem in an elegant fashion.
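The first two cases can be sketched as below. MaxOf and FixedQueue are hypothetical names invented for this illustration, and the macro is shown only in a comment for comparison.

```cpp
#include <cassert>
#include <cstddef>

// A macro such as
//   #define MAX_OF(a, b) ((a) > (b) ? (a) : (b))
// is type independent but type unsafe, and it evaluates an argument twice.
// A template function is both type safe and type independent.
template <typename T>
inline const T& MaxOf(const T& a, const T& b) {
    return (a > b) ? a : b;
}

// Common functionality with different data structure sizes: the capacity
// is a template parameter, so each size gets its own type-checked class.
template <typename T, std::size_t Capacity>
class FixedQueue {
public:
    FixedQueue() : m_head(0), m_count(0) {}

    // Adds an item at the back; returns false when the queue is full.
    bool Add(const T& item) {
        if (m_count == Capacity) return false;
        m_items[(m_head + m_count) % Capacity] = item;
        ++m_count;
        return true;
    }

    // Removes the front item into 'item'; returns false when empty.
    bool Remove(T& item) {
        if (m_count == 0) return false;
        item = m_items[m_head];
        m_head = (m_head + 1) % Capacity;
        --m_count;
        return true;
    }

    // Returns the front item; the queue must not be empty.
    const T& Front() const {
        assert(m_count > 0);
        return m_items[m_head];
    }

    std::size_t GetLength() const { return m_count; }

private:
    static_assert(Capacity > 0, "a queue needs room for at least one item");
    T m_items[Capacity];
    std::size_t m_head;
    std::size_t m_count;
};
```

A declaration such as FixedQueue<int, 32> and another such as FixedQueue<int, 8> share one implementation while remaining distinct types, so they cannot be mixed up by accident.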

Divide your code into framework and application parts


When developing a new application consider dividing the total application into core application code and framework code. The core application code performs operations that are very specific to the application at hand. All the other code that is needed to support the core application should be modeled as an application framework. This has several benefits:

The application framework developed here might get reused in developing similar applications.
The application framework can be reused much more readily than the core application.
Lower layers of the application framework might be reused in applications that are quite different from the original core application.
The core application can be ported to a different platform by just changing the application framework.
Often developing the core application and framework requires different skills. This application-framework split can simplify staffing the project.

Here are a few examples of possible frameworks:

Tracing framework
Memory management framework
Message management framework
Call processing framework
Operator interface management framework
Fault handling framework

Object-oriented Design Principles


http://www.codeguru.com/forum/showthread.php?t=328034

Q: What are the OOD principles and paradigms?
A: Here is a comprehensive list of the main OOD principles with links to articles.

Open Close Principle (OCP)

Description
Software entities should be open for extension, but closed for modification. (B. Meyer, 1988 / quoted by Robert Martin, 1996)
No significant program can be 100% closed. (R. Martin, The Open-Closed Principle, 1996)
Resources
o OCP by Robert Martin
o OCP at EventHelix.com

Liskov Substitution Principle (LSP)

Description
Inheritance should ensure that any property proved about super-type objects also holds for subtype objects. (Barbara Liskov, 1987)
Functions that use pointers or references to base classes must be able to use objects of derived classes without knowing it. (Robert Martin, 1996)
Heuristics
o It is illegal for a derived class to override a base-class method with a method that does nothing.
Resources
o LSP by Robert Martin
o LSP at EventHelix.com
o LSP

Design by Contract

Description
When redefining a method in a derived class, you may only replace its precondition by a weaker one, and its postcondition by a stronger one. (B. Meyer, 1988)
Resources
o Design by Contract at EventHelix.com

Dependency Inversion Principle (DIP)

Description
I. High-level modules should not depend on low-level modules. Both should depend on abstractions.
II. Abstractions should not depend on details. Details should depend on abstractions. (R. Martin, 1996)
Heuristics
o Design to an interface, not an implementation;
o Avoid Transitive Dependencies;
o When in doubt, add a level of indirection.
Resources
o DIP by Robert Martin
o DIP at EventHelix.com
o DIP
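A minimal sketch of the principle follows. MessageSink, AlarmReporter and MemorySink are hypothetical names invented here; they are not taken from the linked articles.

```cpp
#include <cassert>
#include <string>
#include <vector>

// The abstraction that both sides depend on.
class MessageSink {
public:
    virtual ~MessageSink() {}
    virtual void Write(const std::string& text) = 0;
};

// High-level module: knows only the abstraction, not where text ends up.
class AlarmReporter {
public:
    explicit AlarmReporter(MessageSink& sink) : m_sink(sink) {}
    void ReportAlarm(int code) {
        assert(code > 0);  // assumed: alarm codes are positive
        m_sink.Write("ALARM " + std::to_string(code));
    }
private:
    MessageSink& m_sink;
};

// Low-level detail: one possible sink, keeping lines in memory.
class MemorySink : public MessageSink {
public:
    void Write(const std::string& text) override { m_lines.push_back(text); }
    const std::vector<std::string>& Lines() const { return m_lines; }
private:
    std::vector<std::string> m_lines;
};
```

A serial-port or file sink could replace MemorySink without any change to AlarmReporter, which is the extra level of indirection mentioned in the heuristics.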

The Law of Demeter (LoD)

Description
Any object receiving a message in a given method must be one of a restricted set of objects.
Strict Form: Every supplier class or object to a method must be a preferred supplier.
Minimization Form: Minimize the number of acquaintance classes / objects of each method. (Lieberherr and Holland)
Resources
o Introducing Demeter and its Laws
o LoD
o More about LoD

Interface Segregation Principle (ISP)

Description
Clients should not be forced to depend upon interfaces that they do not use. (R. Martin, 1996)
Resources
o ISP by R. Martin
o ISP
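One way to picture the principle is to split a wide interface into narrow ones, so that each client sees only what it uses. Reader, Writer, SampleStore, ProduceRamp and SumAll are hypothetical names used for this sketch.

```cpp
#include <cassert>
#include <vector>

// Two narrow interfaces instead of one wide one.
class Reader {
public:
    virtual ~Reader() {}
    // Stores the next value in 'value'; returns false when none is left.
    virtual bool Read(int& value) = 0;
};

class Writer {
public:
    virtual ~Writer() {}
    virtual void Write(int value) = 0;
};

// One implementation may still offer both interfaces.
class SampleStore : public Reader, public Writer {
public:
    SampleStore() : m_next(0) {}
    void Write(int value) override { m_samples.push_back(value); }
    bool Read(int& value) override {
        if (m_next == m_samples.size()) return false;
        value = m_samples[m_next++];
        return true;
    }
private:
    std::vector<int> m_samples;
    std::vector<int>::size_type m_next;
};

// A producing client depends on Writer only.
inline void ProduceRamp(Writer& out, int count) {
    assert(count >= 0);
    for (int i = 0; i < count; ++i) out.Write(i);
}

// A consuming client depends on Reader only.
inline int SumAll(Reader& in) {
    int sum = 0;
    int value = 0;
    while (in.Read(value)) sum += value;
    return sum;
}
```

A change to the Writer interface would not force SumAll or its callers to be revisited, since they never see it.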

Reuse/Release Equivalency Principle (REP)

Description
The granule of reuse is the granule of release. Only components that are released through a tracking system can be efficiently reused. (R. Martin, 1996)
Resources
o Principles of OOD

The Common Reuse Principle (CRP)

Description
All classes in a package [library] should be reused together. If you reuse one of the classes in the package, you reuse them all. (R. Martin, Granularity, 1996)
Resources
o CRP

Common Closure Principle (CCP)

Description
The classes in a package should be closed against the same kinds of changes. A change that affects a package affects all the classes in that package. (R. Martin, 1996)
Resources
o CCP

Acyclic Dependencies Principle (ADP)

Description
The dependency structure for released components must be a Directed Acyclic Graph (DAG). There can be no cycles. (R. Martin, 1996)
Resources
o Principles of OOD
o Granularity

Stable Dependencies Principle (SDP)

Description
The dependencies between components in a design should be in the direction of stability. A component should only depend upon components that are more stable than it is. (R. Martin, 1996)
Resources
o Stability by R. Martin
o SDP

Stable Abstractions Principle (SAP)

Description
The abstraction of a package should be proportional to its stability! Packages that are maximally stable should be maximally abstract. Instable packages should be concrete.
R. Martin, 1996

Resources
o SAP

Q: What are other good books and articles about OOD principles? A: Some more good articles are listed here.

An Introduction to Software Architecture, David Garlan and Mary Shaw, January 1994
Assuring Good Style for Object-Oriented Programs, Karl J. Lieberherr and Ian M. Holland
On the Criteria To Be Used in Decomposing Systems into Modules, D.L. Parnas
Design Principles, R. Martin

Object-oriented design
http://www.glenmccl.com/ood_cmp.htm Editor's note: The author "Glen McCluskey has more than 25 years experience in software, and has focused on Java and C++ since 1988. He spent 1990-94 working with AT&T Bell Labs / USL / Novell in the C++ systems group in New Jersey. In this group, he worked on tool development, the design and implementation of a C++ template instantiation environment, and development of a comprehensive test suite for C++. This work involved close cooperation with the designers of C++ and with the ANSI standardization process." (http://www.glenmccl.com/brochure.htm) Take into account that this tutorial was written somewhere between 1995 and 1998. Thus, its time references may look outdated.

Abstraction
Up until now we've largely avoided discussing object-oriented design (OOD). This is a topic with a variety of methods put forward, and people tend to have strong views about it. But there are some useful general principles that can be stated, and we will present some of them in a series of articles. The first point is perhaps the hardest one for newcomers to OOD to grasp. People will ask "How can I decide what classes my program should have in it?" The fundamental rule is that a class should represent some abstraction. For example, a Date class might represent calendar dates, an Integer class might deal with integers, and a Matrix class would represent mathematical matrices. So you need to ask "What kinds of entities does my application manipulate?" Some examples of potential classes in different application areas would include:
GUI/Graphics - Line, Circle, Window, TextArea, Button, Point
Statistics - Mean, ChiSquare, Correlation
Geography - River, Country, Sea, Continent

Another way of saying it would be this. Instead of viewing an application as something that performs steps A, B, and C, that is, looking at the program in terms of its functions, instead ask what types of objects and data the application manipulates. Instead of taking a function-oriented approach, take an object-oriented one. One obvious question with identifying potential classes is what level of granularity to apply. For example, in C++ an "int" is a primitive type, that represents an abstraction of mathematical integers. Should int be a class in the usual C++ sense? Probably not, because a class implies certain kinds of overhead in speed and space and in user comprehension. It's interesting to note that Java(tm), a newer object-oriented language, also has int, but additionally supports a "wrapper" class called Integer that represents an integer value. In this way, an application can manipulate integers either as primitives or as classes. Consider a slightly more ambiguous case. Suppose that you're writing a Date class, and you want to express the concept "day of week". Should this be a class of its own? Besides devising a class for this purpose, at least five other representations are possible:
unsigned int dow : 3; (bit field; unsigned, so that three bits can hold 0-6)
char dow;
short dow;
int dow;
enum Dow {SUN, MON, TUE, WED, THU, FRI, SAT};

The "right" choice in this case is probably the enumeration. It's a natural way of representing a limited domain of values. Direct use of primitive types for representation has its drawbacks. For example, if I choose to represent day of week as an integer, then what is meant by:
int dow;
...
dow = 19;

The domain of the type is violated. As another example, C/C++ pointers are notorious for being misused and thereby introducing bugs into programs. A better choice in many cases is a higher-level abstraction like a string class, found in the C++ and Java standard libraries. On the other end of the scale, it's also possible to have a class try to do too much, or to cover several disparate abstractions. For example, in statistics, it doesn't make sense to mix Mean and Correlation. These statistical methods have little in common. If you have a class "Statistics" with both of these in it, along with an add() member function to add new values, the result will be a mishmash. For example, for Mean, you need a stream of single values, whereas for Correlation, you need a sequence of (X,Y) pairs. We will have more to say about OOD principles. A good book illustrating several object-oriented design principles is "Designing and Coding Reusable C++" by Martin Carroll and Margaret Ellis, published by Addison-Wesley.
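Returning to the day-of-week example, the sketch below shows how an enumeration can protect the domain. It uses a C++11 scoped enum, and the helper names dowFromInt and nextDay are hypothetical, introduced only for this illustration. With an enumeration, an assignment such as "Dow d = 19;" does not compile, and raw integers have to pass through a checked conversion.

```cpp
#include <stdexcept>

enum class Dow { SUN, MON, TUE, WED, THU, FRI, SAT };

// Convert a raw integer (0 = Sunday) to a Dow, rejecting values
// outside the domain instead of silently storing them.
Dow dowFromInt(int n) {
    if (n < 0 || n > 6)
        throw std::out_of_range("day of week must be 0..6");
    return static_cast<Dow>(n);
}

// Advance by one day, wrapping from Saturday back to Sunday.
Dow nextDay(Dow d) {
    return static_cast<Dow>((static_cast<int>(d) + 1) % 7);
}
```

Throwing an exception is only one possible policy; an application might prefer to return an error code or clamp the value.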

Data abstraction
As we said in the previous issue, object-oriented design has many aspects to it, and a variety of strong views about which approach is "right". But there are some general techniques that are useful. One of these, one that constitutes a whole design method in itself, is data abstraction. Simply stated, data abstraction refers to identifying key data types in an application, along with operations that are to be done on those types. What does this mean in practice? Suppose that we are doing graphics of some sort, and are concerned with X,Y points on a screen. Now, at a low enough level, a point might be described via a couple of floating-point numbers X and Y. But with data abstraction, we define a type "Point" that will refer to a point, and we hide from the users of the type just how such a point is implemented. Instead of directly using X,Y values, we present Point as a distinct data type, along with some operations on it. In the case of a Point type, two of those operations are (1) establishing a new Point instance, that describes an actual screen point, and (2) computing the distance between this point and another point. If Point was written out as a C++ class, it might look like:
class Point {
    float x;
    float y;
public:
    Point(float, float);
    float dist(const Point&);
};

We've declared a class Point with a couple of private data members. There is a constructor to create new object instances of Point, and a member function dist() to compute the distance between this point and another one. Suppose that we instead implemented this as C code. We might have:
struct Point {
    float x;
    float y;
};
typedef struct Point Point;
float Point_dist(const Point*, const Point*);   /* distance needs both points */

and so on. The C approach will certainly work, so why all the fuss about data abstraction and C++? There are several reasons for the fuss. One is simply that data abstraction is a useful way of looking at the organization of a software program. Rather than decomposing a program in terms of its functional structure, we instead ask the question "What data types are we operating on, and what sorts of operations do we wish to do on them?" With data abstraction, there is a distinction made between the representation of a type, and public operations on and behavior of that type. For example, I as a user of Point don't have to know or care that internally, a point is represented by a couple of floating-point numbers. Other choices might conceivably be doubles or longs or shorts. All I care about is the public behavior of the type. In a similar vein, data abstraction allows for the formal manipulation of types in a mathematical sense. For example, suppose that we are dealing with screen points in the range 0-1000, typical of windowing systems today. And we are using the C approach, and say:
Point p;
p.x = 125;
p.y = -59;

What does this mean? The domain of the type has been violated, by introduction of an invalid value for Y. This sort of invalid value can easily be screened out in a C++ constructor for Point. Without maintaining integrity of a type, it's hard to reason about the behavior of the type, for example, whether dist() really does compute the distance appropriately. Also, if the representation of a type is hidden, it can be changed at a later time without affecting the users of the type. As another simple example of data abstraction, consider designing a String class. In C, strings are implemented simply as character pointers, that is, of type "char*". Such pointers tend to be error prone, and we might desire a higher-level alternative. In terms of the actual string representation, we obviously have to store the string's characters, and we also might want to store the string length separately from the actual characters. Some of the operations on strings that we might want would include:
- creating a String from a char*
- creating a String from another String
- retrieving a character at a given index
- retrieving the length
- searching for a pattern in a String

Given this very rough idea for a data type, we could write C++ code like so:
class String {
    char* str;
    int len;
public:
    String(const char*);
    String(const String&);
    char charAt(int) const;
    int length() const;
    int search(const String&) const;
};

and so on. In medium-complexity applications, data abstraction can be used as a design technique by itself, building up a set of abstract types that can be used to structure a complete program. It can also be used as part of other design techniques. For example, in some application I might have a calendar date type, used to store the birthdate of a person in a personnel record. Data abstraction could be used to devise a Date type, independent of any other design techniques used in the application. There is an excellent (but out of print) book on data abstraction, with the title "Abstraction and Specification in Program Development", by Barbara Liskov and John Guttag (published 1986 by MIT Press). Note also that data abstraction is only one part of object-oriented design and programming. Some languages (Modula-2, Ada 83) support data abstraction without being fully object-oriented.
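The Point type described earlier in this section can be filled in roughly as follows. This is a sketch under stated assumptions: the 0-1000 range comes from the text, while the choice to throw std::out_of_range and the private helper inRange are illustrative decisions, not part of the original article.

```cpp
#include <cmath>
#include <stdexcept>

class Point {
public:
    // The constructor screens out values outside the assumed 0-1000 domain.
    Point(float x, float y) : x_(x), y_(y) {
        if (!inRange(x) || !inRange(y))
            throw std::out_of_range("coordinate outside 0-1000");
    }

    // Euclidean distance between this point and another one.
    float dist(const Point& other) const {
        return std::hypot(x_ - other.x_, y_ - other.y_);
    }

private:
    static bool inRange(float v) { return v >= 0.0f && v <= 1000.0f; }

    float x_;   // representation is hidden and could later become double
    float y_;
};
```

Because x_ and y_ are private, every Point that exists has passed the range check, which is what makes it possible to reason about dist().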

Polymorphism
The example in the previous section illustrates another aspect of object-oriented design, that of polymorphism. This term means "many forms", and in the context that we are using refers to the ability to call member functions of many object types using the same interface.

The simplest C++ example of this would be:


#include <iostream>
using namespace std;

class A {
public:
    virtual void f() {cout << "A::f" << endl;}
};

class B : public A {
public:
    virtual void f() {cout << "B::f" << endl;}
};

int main()
{
    B b;
    A* ap = &b;
    ap->f();    // dispatched at run time to B::f()
    return 0;
}

which calls B::f(). That is, the base class pointer ap "really" points at a B object, and so B::f() is called. This feature requires some run-time assistance to determine which type of object is really being manipulated, and which f() to call. One implementation approach uses a hidden pointer in each object instance, that points at a table of function pointers (a virtual table or vtbl), and dispatches accordingly. Without language support for polymorphism, one would have to say something like:
#include <iostream>
using namespace std;

class A {
public:
    int type;   // explicit type tag: 1 = A, 2 = B
    A() {type = 1;}
    void f() {cout << "A::f" << endl;}
};

class B : public A {
public:
    B() {type = 2;}
    void f() {cout << "B::f" << endl;}
};

int main()
{
    B b;
    A* ap = &b;
    if (ap->type == 1)
        ap->f();
    else
        ((B*)ap)->f();  // manual dispatch based on the tag
    return 0;
}

that is, use an explicit type field. This is cumbersome. The use of base/derived classes (superclasses and subclasses) in combination with polymorphic functions goes by the technical name of "object-oriented programming". It's interesting to note that in Java, methods (functions) are by default polymorphic, and one has to specifically disable this feature by use of the "final", "private", or "static" keywords. In C++ the default goes the other way.

Data hiding
Another quite basic principle of object-oriented design is to avoid exposing the private state of an object to the world. Earlier we talked about data abstraction, where a user-defined type is composed of data and operations on that data. For example, in C++ a type Date might represent a user-defined type for calendar dates, and operations would include comparing dates for equality, computing the number of days between two dates, and so on. Suppose that in C++, a Date type looks like this:
class Date {
public:
    int m;  // month 1-12
    int d;  // day 1-31
    int y;  // year 1800-1999
};

and I say:
Date dt;
dt.m = 27;

What does this mean? Probably nothing good. So it would be better to rewrite this as:
class Date {
    int m;
    int d;
    int y;
public:
    Date(int, int, int);
};

with a public constructor that will properly initialize a Date object. In C++, data members of a class may be private (the default), protected (available to derived classes), or public (available to everyone). A simple and useful technique for controlling access to the private state of an object is to define some member functions for setting and getting values:
class A {
    int x;
public:
    void set_x(int i) {x = i;}
    int get_x() const {return x;}
};

These functions are inline and have little or no performance overhead. In C++ there is another sort of hiding available, that offered by namespaces. Suppose that you have a program with some global data in it:
int x[100];

and you use a C++ class library that also uses global data:
double x = 12.34;

These names will clash when you attempt to link the program. A simple solution is to use namespaces:
namespace Company1 {
    int x[100];
}
namespace Company2 {
    double x = 12.34;
}

and refer to the values as "Company1::x" and "Company2::x". Note that the Java language has no global variables, and similar usage to this example would involve static data defined in classes. Data hiding is a simple but extremely important concept. Without it, it is difficult to reason about the behavior of an object, given that its state can be arbitrarily changed at any point.
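As a sketch of the Date class with hidden state discussed in this section, the constructor below rejects values outside the ranges given in the earlier comments (month 1-12, day 1-31, year 1800-1999). The accessor names and the use of std::invalid_argument are assumptions made for illustration, and the check is deliberately simplified: it does not validate the day against the length of the particular month.

```cpp
#include <stdexcept>

class Date {
public:
    Date(int month, int day, int year) : m_(month), d_(day), y_(year) {
        if (month < 1 || month > 12)
            throw std::invalid_argument("month must be 1-12");
        if (day < 1 || day > 31)
            throw std::invalid_argument("day must be 1-31");
        if (year < 1800 || year > 1999)
            throw std::invalid_argument("year must be 1800-1999");
    }

    // Read-only access; there are no setters, so a valid Date stays valid.
    int month() const { return m_; }
    int day() const { return d_; }
    int year() const { return y_; }

private:
    int m_;
    int d_;
    int y_;
};
```

With this version, a statement such as "dt.m = 27;" no longer compiles, and "Date dt(27, 1, 1995);" fails at construction rather than leaving a meaningless object behind.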

Representation hiding
In the last issue we talked about data hiding, where the internal state of an object is hidden from the user. We said that one reason for this hiding is so that the state cannot be arbitrarily changed. Another aspect of hiding concerns the representation of an object. For example, consider a class to handle a stack of integers:
class Stack {
    ???
public:
    void push(int);
    int pop();
    int top_of_stack();
};

It's pretty obvious what the public member functions should look like, but what about the representation? At least three representations could make sense. One would be a fixed-length array of ints, with an error given on overflow. Another would be a dynamic int array, that is grown as needed by means of new/delete. Yet a third approach would be to use a linked list of stack records. Each of these has advantages and disadvantages. Suppose that the representation was exposed:
class Stack {
public:
    int vec[10];
    int sp;
    ...
    int top_of_stack();
};

and someone cheats by accessing top of stack as:


obj.vec[obj.sp]

instead of:
obj.top_of_stack()

This will work, until such time as the internal representation is changed to something else. At that point, this usage will be invalidated, and will not compile or will introduce subtle problems into a running program (what if I change the stack origin by 1?). The point is simply that exposing the internal representation introduces a set of problems with program reliability and maintainability.
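For comparison, here is a sketch of the same Stack interface with the representation kept private. It picks the second of the three representations mentioned above (a dynamically growing array) and implements it with std::vector rather than hand-written new/delete; throwing std::logic_error on an empty stack is an assumption of this example, not something the article specifies.

```cpp
#include <stdexcept>
#include <vector>

class Stack {
public:
    void push(int value) { items_.push_back(value); }

    // Removes and returns the top element.
    int pop() {
        int value = top_of_stack();
        items_.pop_back();
        return value;
    }

    // Returns the top element without removing it.
    int top_of_stack() const {
        if (items_.empty())
            throw std::logic_error("stack is empty");
        return items_.back();
    }

private:
    // Hidden representation: could be swapped for a fixed array or a
    // linked list without changing any code that uses Stack.
    std::vector<int> items_;
};
```

Since users can only call push(), pop() and top_of_stack(), an expression like obj.vec[obj.sp] is rejected by the compiler instead of breaking silently later.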

Extensibility
Thus far we've looked at object-oriented design in isolation, focusing on individual classes as abstractions of some real-world entity. But as you're probably already aware, C++ classes (and ones in other languages as well) can be extended by deriving subclasses from them. These classes add functionality to the base class. Suppose that we have a class:
class A {
private:
    int x;
protected:
    int y;
public:
    int z;
};

The declarations of the members indicate that x is available only to member functions of the class itself, y is available to subclasses, and z is available to everyone. How do we decide how to structure a class for extensibility? There are several aspects of this, one of them being the level of protection of individual members. There is not a single "right" answer to this question, but one approach is to ask how the class is likely to be used. For example, with a Date class:
class Date {
private:
    long repr;  // days since 1/1/1800
};

it's unlikely that a derived class will need to directly access repr, because it's in an arcane format and because the Date class can supply a set of functions that will suffice to manipulate Dates. There is a steep learning curve in learning how to directly manipulate the underlying representation, and a consequent ability to mess things up by getting it wrong. On the other hand, for a TreeNode class:
class TreeNode {
protected:
    TreeNode* left;
    TreeNode* right;
    int value;
public:
    TreeNode(TreeNode*, TreeNode*, int);
};

making the internal pointers visible may make sense, to facilitate a derived class walking through the tree in an efficient manner. It's useful to distinguish between developers, who may wish to extend a class, and end users. For example, with the Date class, the representation (number of days since 1/1/1800) is non-standard, and in a hard format to manipulate. So it makes sense to hide the representation completely. On the other hand, for TreeNode, with binary trees as a well-understood entity, giving a developer access to the representation may be a good idea. There's quite a bit more to say about extensibility, which we will do in future issues.

More about extensibility


In the last issue we started talking about extending the functionality of a base class via a derived class. Another aspect of this simply deals with the issue of how far to carry derivation. In other words, how many levels of derived classes make sense? In theoretical terms, the answer is "an infinite number". That is, you can carefully design a set of classes, with each derived class adding some functionality to its base class. There is no obvious stopping point for this process. In practical terms, however, deep class derivations create more problems than they solve. At some point, humans lose the ability to keep track of all the relationships. And there are some hidden performance issues, for example with constructors and destructors. The "empty" constructor for C in this example:
#include <iostream>
using namespace std;

class A {
public:
    A() {cout << "A::A\n";}
    ~A() {cout << "A::~A\n";}
};

class B : public A {
public:
    B() {cout << "B::B\n";}
    ~B() {cout << "B::~B\n";}
};

class C : public B {
public:
    C() {cout << "C::C\n";}
    ~C() {cout << "C::~C\n";}
};

void f()
{
    C c;    // constructs A, B, C in that order; destroys them in reverse
}

int main()
{
    f();
    return 0;
}

in fact causes the constructors for A and B to be called first (base classes are constructed before derived ones), and likewise the destructors for B and A run after the destructor for C. As a simple rule of thumb, I personally try to keep derivations to three levels or fewer. In other words, a base class, and a couple of levels of derived classes from it.

A book on C++ design


In issue #027 the new edition of Bjarne Stroustrup's book "The C++ Programming Language" was mentioned. This book came out a few months ago, and contains about 100 pages on design using C++. It starts out by discussing design at an abstract level, and then moves on to cover specific design topics as they relate to C++. Recommended if you're interested in this topic.

Templates vs. classes


In previous issues we've looked at some of the aspects of template programming. One big issue that comes up with object-oriented design is when to implement polymorphism via a template (a parameterized class or function) in preference to using inheritance or a single class. This is a hard question to answer, but there are several aspects of the issue that can be mentioned. Consider first the nature of the algorithm that is to be implemented. How generally applicable is the algorithm? For example, sorting is used everywhere, and a well-designed sort function template for fundamental or user-defined types would be very handy to have around. On the other hand, consider strings. Strings of characters are very heavily used in programming languages. But what about strings of doubles? For example, does taking a substring of doubles from a string mean very much? In certain applications it might, but clearly this feature is not as generally useful as strings of characters. On the other side of this same argument, if we want to implement an algorithm for a set of types, and some of those types are much more widely used than others (such as a string of chars), then template specializations offer a way to tune performance. For example, a string template might be defined via:
template <class T> class string { ... };

with a specialization for characters:


template <> class string<char> { ... };

that is optimized. Of course, if strings of chars represent 99% of the use of the string template, then perhaps simply devising a string class would make more sense. Another question to ask is whether all the types of interest fit cleanly into a single class hierarchy. For example, a hierarchy for a GUI window system might have:

class Component { ... };
class Container : public Component { ... };
class Window : public Container { ... };
class Frame : public Window { ... };

That is, all types are in one hierarchy. Such a type hierarchy is often best managed via abstract classes and virtual functions, without the use of templates. Note that using virtual functions allows for access to runtime type information, whereas templates are more of a compile-time feature. Newsletter issues #024, #025, and #026 (see ftp://ftp.glenmccl.com/pub/cpplett) give some examples of the use of virtual functions and runtime type identification. But sometimes templates might be useful even in a simple hierarchy such as this one. For example, a hierarchy of GUI classes might be parameterized based on the type of the underlying display device, such as a bit-mapped display, dumb terminal, or touch-screen.
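The two mechanisms compared in this section can be put side by side in a small sketch. Everything here is illustrative: Length is a hypothetical class template standing in for the string example (a generic version plus a specialization for char that delegates to std::strlen), and the name() member added to Component and Container is an assumption made so that the virtual dispatch has something to show.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Compile-time polymorphism: the primary template counts elements up to a
// terminator equal to T(), for any element type T.
template <class T>
struct Length {
    static std::size_t of(const T* s) {
        std::size_t n = 0;
        while (s[n] != T()) ++n;
        return n;
    }
};

// Full specialization for the heavily used char case, tuned by
// delegating to the C library.
template <>
struct Length<char> {
    static std::size_t of(const char* s) { return std::strlen(s); }
};

// Run-time polymorphism: a single hierarchy managed via virtual functions.
class Component {
public:
    virtual ~Component() = default;
    virtual std::string name() const { return "Component"; }
};

class Container : public Component {
public:
    std::string name() const override { return "Container"; }
};

// Chooses the right name() at run time from the dynamic type of c.
std::string describe(const Component& c) { return c.name(); }
```

The template version is resolved entirely by the compiler for each element type, while describe() decides which name() to call only when the program runs.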

References
1. "Class Design in C++", http://www.cprogramming.com/tutorial/class_design.html
2. "Joint strike fighter air vehicle C++ coding standards", 4.12 Templates, December 2005, http://www2.research.att.com/~bs/JSF-AV-rules.pdf
3. Object-Oriented Design Tips I, http://eventhelix.com/RealtimeMantra/Object_Oriented/
4. Object-Oriented Design Tips II, http://eventhelix.com/RealtimeMantra/Object_Oriented/object_design_tips_2.htm
5. Principles of Object-Oriented Design, http://www.codeguru.com/forum/showthread.php?t=328034
6. Object-Oriented Design, http://www.glenmccl.com/ood_cmp.htm
