Large software projects are generally undertaken by correspondingly large teams of developers. For the code produced by large teams to have project-wide measurable quality, the code must be written in accordance with, and be judged against, a standard. It is therefore important for large project teams to establish a programming standard, or set of guidelines.
The use of a programming standard also makes it possible to do the following:
The aim of this text is to present C++ programming rules, guidelines and hints (generically referred to as guidelines) that can be used as the basis for a standard. It is intended for software engineers working in large project teams.
The current version is purposely focused upon programming (although at times it is difficult to draw the line between programming and design); design guidelines will be added at a later date.
The guidelines presented cover the following aspects of C++ development:
They have been collected from a large base of industry knowledge. (See the bibliography for the sources: authors and references.) They are based upon:
Most are based upon a handful of the first category, and large doses of the second and third. Unfortunately, some are also based upon the last category, mainly because programming is a highly subjective activity: there is no widely accepted "best" or "right" way to code everything.
Clear, understandable C++ source code is the primary goal of most of the rules and guidelines: clear, understandable source code being a major contributing factor to software reliability and maintainability. What is meant by clear and understandable code can be captured in the following three simple fundamental principles [Kruchten, 94].
Minimal Surprise - Over its lifetime, source code is read more often than it is written, especially specifications. Ideally, code should read like an English-language description of what is being done, with the added benefit that it executes. Programs are written more for people than for computers. Reading code is a complex mental process that can be eased by uniformity, also referred to in this guide as the minimal-surprise principle. A uniform style across an entire project is a major reason for a team of software developers to agree on programming standards, and it should not be perceived as some kind of punishment or as an obstacle to creativity and productivity.
Single Point of Maintenance - Whenever possible, a design decision should be expressed at only one point in the source, and most of its consequences should be derived programmatically from this point. Violations of this principle greatly jeopardize maintainability and reliability, as well as understandability.
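The single-point-of-maintenance principle can be illustrated with a minimal sketch. The names max_message_length, message_buffer, and is_full are hypothetical and do not come from the guidelines; the point is only that the capacity is stated once and everything else is derived from it.

```cpp
#include <cstddef>

// The design decision (the capacity) is expressed at exactly one point.
const std::size_t max_message_length = 128;

// Everything below is derived from that single constant.
struct message_buffer
{
  // Room for the text plus a terminating '\0'.
  char text[max_message_length + 1];
  std::size_t length;
};

// True when no further characters can be appended.
bool is_full(const message_buffer& buffer)
{
  return buffer.length >= max_message_length;
}
```

Changing the capacity then requires an edit in one place only; the array bound and the fullness test follow automatically.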
Minimal Noise - Finally, as a major contribution to legibility, the minimal-noise principle is applied. That is, an effort is made to avoid cluttering the source code with visual "noise": bars, boxes, and other text with low information content or information that does not contribute to the understanding of the purpose of the software.
The intended spirit of the guidelines expressed herein is not to be overly restrictive, but rather to provide guidance for the correct and safe usage of language features. The key to good software resides in:
The guidelines presented here make a small number of basic assumptions:
Guidelines are not of equal importance; they are weighted using the following scale:
A guideline identified by the above symbol is a tip: a simple piece of advice that can be followed, or safely ignored.
A guideline identified by the above symbol is a recommendation usually based on more technical grounds: encapsulation, cohesion, coupling, portability or reusability may be affected, as well as performance in some implementations. Recommendations must be followed unless there is good justification not to.
A guideline identified by the above symbol is a requirement or restriction; a violation would definitely lead to bad, unreliable, or non-portable code. Requirements or restrictions cannot be violated without a waiver.
When you cannot find an applicable rule or guideline; when a rule obviously does not apply; or when everything else fails: use common sense, and check the fundamental principles. This rule overrides all of the others. Common sense is required even when rules and guidelines exist.
This chapter provides guidance on program structure and layout.
Large systems are usually developed as a number of smaller functional subsystems. Subsystems themselves are usually constructed from a number of code modules. In C++, a module normally contains the implementation for a single, or on rare occasions, a set of closely related abstractions. In C++, an abstraction is normally implemented as a class. A class has two distinct components: an interface visible to the class clients, providing a declaration or specification of the class capabilities and responsibilities; and an implementation of the declared specification (the class definition).
Similar to the class, a module also has an interface and an implementation: the module interface contains the specifications for the contained module abstractions (class declarations); and the module implementation contains the actual implementation of the abstractions (class definitions).
In the construction of the system, subsystems may also be organized into collaborative groups or layers to minimize and control their dependencies.
A module's specification should be placed in a separate file from its implementation; the specification file is referred to as a header. A module's implementation may be placed in one or more implementation files.
If a module implementation contains extensive inline functions, common implementation-private declarations, test code, or platform-specific code, then separate these parts into their own files and name each file after its part's content.
If executable program sizes are a concern, then rarely used functions should also be placed in their own individual files.
Construct a part file name in the following manner:
    File_Name ::= <Module_Name> [<Separator> <Part_Name>] '.' <File_Extension>
The following is an example module partitioning and naming scheme:
inlines file (see "Place module inline function definitions in a separate file").
Module.Test. Declaring the test code as a friend class facilitates the independent development of the module and its test code, and allows the test code to be omitted from the final module object code without source changes.
    SymaNetwork.hh          // Contains the declaration for a
                            // class named "SymaNetwork".
    SymaNetwork.Inlines.cc  // Inline definitions sub-unit
    SymaNetwork.cc          // Module's main implementation unit
    SymaNetwork.Private.cc  // Private implementation sub-unit
    SymaNetwork.Test.cc     // Test code sub-unit
Separating out a module's specification from its implementation facilitates independent development of user and supplier code.
Breaking a module's implementation into multiple translation units provides better support for object code removal, resulting in smaller executable sizes.
Using a regular and predictable file naming and partitioning convention allows a module's content and organization to be understood without inspection of its actual contents.
Passing names through from the code to the file name increases predictability and facilitates the building of file-based tools without requiring complex name mapping [Ellemtel, 1993].
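As a hedged illustration of how a predictable naming rule simplifies file-based tools, the sketch below builds a part file name following the File_Name construction rule given earlier. The function name part_file_name and the default '.' separator are assumptions made for this example (the separator is inferred from names such as SymaNetwork.Inlines.cc).

```cpp
#include <string>

// Builds <Module_Name> [<Separator> <Part_Name>] '.' <File_Extension>.
// An empty part_name selects the form without the optional part.
std::string part_file_name(const std::string& module_name,
                           const std::string& part_name,
                           const std::string& file_extension,
                           char separator = '.')
{
  std::string file_name = module_name;
  if (!part_name.empty())
  {
    file_name += separator;
    file_name += part_name;
  }
  file_name += '.';
  file_name += file_extension;
  return file_name;
}
```

Because the mapping from module and part names to file names is purely mechanical, a build or search tool needs no lookup table to locate a module's files.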
Commonly used file name extensions are: .h, .H, .hh, .hpp, and .hxx for header files; and .c, .C, .cc, .cpp, and .cxx for implementations. Pick a set of extensions and use them consistently.
    SymaNetwork.hh  // The extension ".hh" used to designate
                    // a "SymaNetwork" module header.
    SymaNetwork.cc  // The extension ".cc" used to designate
                    // a "SymaNetwork" module implementation.
The C++ draft standard working paper also uses the extension ".ns" for headers encapsulated by a namespace.
Only upon rare occasions should multiple classes be placed together in a module; and then only if they are closely associated (e.g., a container and its iterator). It is acceptable to place a module's main class and its supporting classes within the same header file if all classes are always required to be visible to a client module.
Keeping a module to a single class reduces the module's interface and others' dependencies upon it.
Aside from class private members, a module's implementation-private declarations (e.g. implementation types and supporting classes) should not appear in the module's specification. These declarations should be placed in the needed implementation files unless the declarations are needed by multiple implementation files; in that case the declarations should be placed in a secondary, private header file. The secondary, private header file should then be included by other implementation files as needed.
This practice ensures that:
    // Specification of module foo, contained in file "foo.hh"
    //
    class foo
    {
    .. declarations
    };
    // End of "foo.hh"

    // Private declarations for module foo, contained in file
    // "foo.private.hh" and used by all foo implementation files.
    ... private declarations
    // End of "foo.private.hh"

    // Module foo implementation, contained in multiple files
    // "foo.x.cc" and "foo.y.cc"

    // File "foo.x.cc"
    //
    #include "foo.hh"          // Include module's own header
    #include "foo.private.hh"  // Include implementation
                               // required declarations.
    ... definitions
    // End of "foo.x.cc"

    // File "foo.y.cc"
    //
    #include "foo.hh"
    #include "foo.private.hh"
    ... definitions
    // End of "foo.y.cc"
Use #include to gain access to a module's specification
A module that uses another module must use the preprocessor #include directive to acquire visibility of the supplier module's specification. Correspondingly, modules should never re-declare any part of a supplier module's specification.
When including files, only use the #include <header> syntax for "standard" headers; use the #include "header" syntax for the rest.
Use of the #include directive also applies to a module's own implementation files: a module implementation must include its own specification and private secondary headers (see "Place module specifications and implementations in separate files").
    // The specification of module foo in its header file
    // "foo.hh"
    //
    class foo
    {
    ... declarations
    };
    // End of "foo.hh"

    // The implementation of module foo in file "foo.cc"
    //
    #include "foo.hh"  // The implementation includes its own
                       // specification
    ... definitions for members of foo
    // End of "foo.cc"
An exception to the #include rule is when a module only uses or contains a supplier module's types (classes) by-reference (using pointer or reference-type declarations); in this case the by-reference usage or containment is specified using a forward declaration (see also "Minimize compilation dependencies") rather than a #include directive.
Avoid including more than is absolutely needed: this means that module headers should not include other headers that are required only by the module implementation.
    #include "a_supplier.hh"

    class needed_only_by_reference;  // Use a forward declaration
                                     // for a class if we only need
                                     // a pointer or a reference
                                     // access to it.

    void operation_requiring_object(a_supplier required_supplier, ...);
    //
    // Operation requiring an actual supplier object; thus the
    // supplier specification has to be #included.

    void some_operation(needed_only_by_reference& a_reference, ...);
    //
    // Some operation needing only a reference to an object; thus
    // should use a forward declaration for the supplier.
This rule ensures that:
When a module has many inline functions, their definitions should be placed in a separate, inline-function-only file. The inline function file should be included at the end of the module's header file.
See also "Use a No_Inline conditional compilation symbol to subvert inline compilation".
This technique keeps implementation details from cluttering a module's header; thus, preserving a clean specification. It also helps in reducing code replication when not compiling inline: using conditional compilation, the inline functions can be compiled into a single object file as opposed to being compiled statically into every using module. Correspondingly, inline function definitions should not be defined in class definitions unless they are absolutely trivial.
Break large modules into multiple translation units to facilitate un-referenced code removal during program linking. Member functions that are rarely referenced should be segregated into separate files from those that are commonly used. In the extreme, individual member functions can be placed in their own files [Ellemtel, 1993].
Linkers are not all equally capable of eliminating un-referenced code within an object file. Breaking large modules into multiple files allows these linkers to reduce executable program sizes by eliminating the linking of whole object files [Ellemtel, 1993].
It may also be worthwhile considering first whether the module should be broken down into smaller abstractions.
Separate out platform-dependent code from platform-independent code; this will facilitate porting. Platform-dependent modules should have file names qualified by their platform name to highlight the platform dependence.
    SymaLowLevelStuff.hh          // "LowLevelStuff"
                                  // specification
    SymaLowLevelStuff.SunOS54.cc  // SunOS 5.4 implementation
    SymaLowLevelStuff.HPUX.cc     // HP-UX implementation
    SymaLowLevelStuff.AIX.cc      // AIX implementation
From an architectural and maintenance viewpoint, it is also good practice to contain platform dependencies in a small number of low level subsystems.
Adopt a standard file content structure and apply it consistently
A suggested file content structure consists of the following parts in the following order:
1. Repeated inclusion protection (specification only).
2. Optional file and version control identification.
3. File inclusions needed by this unit.
4. The module documentation (specification only).
5. Declarations (class, type, constants, objects and functions) and additional textual specifications (preconditions and postconditions, and invariants).
6. Inclusion of this module's inline function definitions.
7. Definitions (objects and functions) and implementation private declarations.
8. Copyright notice.
9. Optional version control history.
The above file content ordering presents the client-pertinent information first, and is consistent with the rationale for the ordering of a class's public, protected and private sections.
Depending upon corporate policy, the copyright information may need to be placed at the top of the file.
Repeated file inclusion and compilation should be prevented by using the following construct in each header file:
    #if !defined(module_name)  // Use preprocessor symbols to
    #define module_name        // protect against repeated
                               // inclusions...

    // Declarations go here

    #include "module_name.inlines.cc"  // Optional inline
                                       // inclusion goes here.

    // No more declarations after inclusion of module's
    // inline functions.
    #endif  // End of module_name.hh
Use the module file name for the inclusion protection symbol. Use the same letter-case for the symbol as for the module name.
Use a "No_Inline" conditional compilation symbol to subvert inline compilation
Use the following conditional compilation construct to control inline versus out-of-line compilation of inline-able functions:
    // At the top of module_name.inlines.hh
    #if !defined(module_name_inlines)
    #define module_name_inlines

    #if defined(No_Inline)
    #define inline  // Nullify inline keyword
    #endif

    ...  // Inline definitions go here
    #endif  // End of module_name.inlines.hh

    // At the end of module_name.hh
    //
    #if !defined(No_Inline)
    #include "module_name.inlines.hh"
    #endif

    // At the top of module_name.cc after inclusion of
    // module_name.hh
    //
    #if defined(No_Inline)
    #include "module_name.inlines.hh"
    #endif
The conditional compilation construct is similar to the multiple inclusion protection construct. If the No_Inline symbol is not defined, then the inline functions are compiled with the module specification and automatically excluded from the module implementation. If the No_Inline symbol is defined, then the inline definitions are excluded from the module specification but included in the module implementation with the keyword inline nullified.
The above technique allows for reduced code replication when inline functions are compiled out-of-line. By using conditional compilation, a single copy of the inline functions is compiled into the defining module; versus replicated code, compiled as "static" (internal linkage) functions in every using module when out-of-line compilation is specified by a compiler switch.
Use of conditional compilation increases the complexity involved in maintaining build dependencies. This complexity is managed by always treating headers and inline function definitions as a single logical unit: implementation files are thus dependent upon both header and inline function definition files.
Consistent indentation should be used to visually delineate nested statements; indentation of between 2 and 4 spaces has been proven to be the most visually effective for this purpose. We recommend using a regular indentation of 2 spaces.
The compound or block statement delimiters ({}) should be at the same level of indentation as surrounding statements (by implication, this means that {} are vertically aligned). Statements within the block should be indented by the chosen number of spaces.
Case labels of a switch statement should be at the same indentation level as the switch statement; statements within the switch statement can then be indented by 1 indentation level from the switch statement itself and the case labels.
    if (true)
    {             // New block
      foo();      // Statement(s) within block
                  // indented by 2 spaces.
    }
    else
    {
      bar();
    }

    while (expression)
    {
      statement();
    }

    switch (i)
    {
    case 1:
      do_something();  // Statements indented by
                       // 1 indentation level from
      break;           // the switch statement itself.
    case 2:
      //...
    default:
      //...
    }
An indentation of 2 spaces is a compromise between allowing easy recognition of blocks, and allowing sufficient nested blocks before code drifts too far off the right edge of a display monitor or printed page.
If a function declaration cannot fit on a single line, then place the first parameter on the same line as the function name; and subsequent parameters each on a new line, indented at the same level as the first parameter. This style of declaration and indentation, shown below, leaves white spaces below the function return type and name; thus, improving their visibility.
    void foo::function_decl( some_type first_parameter,
                             some_other_type second_parameter,
                             status_type and_subsequent);
If following the above guideline would cause line wrap, or parameters to be too far indented, then indent all parameters from the function name or scope name (class, namespace), with each on a separate line:
    void foo::function_with_a_long_name(  // function name is much less visible
        some_type first_parameter,
        some_other_type second_parameter,
        status_type and_subsequent);
See also alignment rules below.
The maximum length of program lines should be limited to prevent loss of information when printed on either standard (letter) or default printout paper size.
If the level of indentation causes deeply nested statements to drift too far to the right, and statements to extend much beyond the right margin, then it is probably a good time to consider breaking the code into smaller, more manageable, functions.
When parameter lists in function declarations, definitions and calls, or enumerators in an enum declaration cannot fit on a single line, break the line after each list element and place each element on a separate line (see also "Indent function parameters from the function name or scope name").
    enum color
    {
      red,
      orange,
      yellow,
      green,
      //...
      violet
    };
If a class or function template declaration is overly long, fold it onto consecutive lines after the template argument list. For example (declaration from the standard Iterators Library, [X3J16, 95]):
    template <class InputIterator, class Distance>
    void advance(InputIterator& i, Distance n);
This chapter provides guidance on the use of comments in the code.
Comments should be used to complement source code, never to paraphrase it:
For each comment, the programmer should be able to easily answer the question: "What value is added by this comment?" Generally, well-chosen names often eliminate the need for comments. Comments, unless they participate in some formal Program Design Language (PDL), are not checked by the compiler; therefore, in accordance with the single-point-of-maintenance principle, design decisions should be expressed in the source code rather than in comments, even at the expense of a few more declarations.
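A small sketch of expressing a design decision in the source rather than in a comment follows. The names task_state and can_start are hypothetical, chosen only for illustration; the commented-out alternative shows the style being replaced.

```cpp
// Instead of:  int state;  // 0 = idle, 1 = running, 2 = stopped
// the legal values are declared, so the compiler can check their use.
enum task_state
{
  idle,
  running,
  stopped
};

// True when the task may be started.
bool can_start(task_state state)
{
  return state == idle || state == stopped;
}
```

The enumeration costs one extra declaration, but the set of legal states is now maintained in a single place that the compiler sees.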
The C++ style "//" comment delimiter should be used in preference to the C-style "/*...*/".
C++ style comments are more visible and reduce the risk of accidentally commenting-out vast expanses of code due to a missing end-of-comment delimiter.
    /* start of comment with missing end-of-comment delimiter
    do_something();
    do_something_else(); /* Comment about do_something_else */
                         // End of comment is here ---> */
                         // Both do_something and
                         // do_something_else
                         // are accidentally commented out!
    Do_further();
Comments should be placed near the code they are commenting upon; with the same level of indentation, and attached to the code using a blank comment line.
Comments that apply to multiple, successive source statements should be placed above the statements, serving as an introduction to the statements. Likewise, comments associated with individual statements should be placed below the statements.
    // A pre-statements comment applicable
    // to a number of following statements
    //
    ...
    void function();
    //
    // A post-statement comment for
    // the preceding statement.
Avoid comments on the same line as a source construct: they often become misaligned. Such comments are tolerated, however, for descriptions of elements in long declarations, such as enumerators in an enum declaration.
Avoid the use of headers containing information such as author, phone numbers, dates of creation and modification: author and phone numbers rapidly become obsolete; while creation and modification dates, and reasons for modification are best maintained by a configuration management tool (or some other form of version history file).
Avoid the use of vertical bars, closed frames or boxes, even for major constructs (such as functions and classes); they just add visual noise and are difficult to keep consistent. Use blank lines to separate related blocks of source code rather than heavy comment lines. Use a single blank line to separate constructs within functions or classes. Use double blank lines to separate functions from each other.
Frames or forms may have the look of uniformity, and of reminding the programmer to document the code, but they often lead to a paraphrasing style [Kruchten, 94].
Use empty comments, rather than empty lines, within a single comment block to separate paragraphs
    // Some explanation here needs to be continued
    // in a subsequent paragraph.
    //
    // The empty comment line above makes it
    // clear that this is another
    // paragraph of the same comment block.
Avoid repeating program identifiers in comments, and replicating information found elsewhere; provide a pointer to the information instead. Otherwise, any program change may require maintenance in multiple places, and failure to make the required comment changes everywhere will result in misleading or wrong comments: these end up being worse than no comments at all.
Always aim to write self-documenting code rather than providing comments. This can be achieved by choosing better names; using extra temporary variables; or re-structuring the code. Take care with style, syntax, and spelling in comments. Use natural language comments rather than telegraphic, or cryptic style.
Replace:
    do
    {
      ...
    } while (string_utility.locate(ch, str) != 0);
    // Exit search loop when found it.
with:
    do
    {
      ...
      found_it = (string_utility.locate(ch, str) == 0);
    } while (!found_it);
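The same idea, a named Boolean taking over the job of an explanatory comment, can be shown in a compilable form. This is only a sketch: position_of is a hypothetical helper, not part of any string_utility interface referred to in the guidelines.

```cpp
#include <cstddef>
#include <string>

// Returns the index of the first occurrence of ch in str,
// or str.size() when ch does not occur.
std::size_t position_of(char ch, const std::string& str)
{
  std::size_t index = 0;
  bool found_it = false;
  while (index < str.size() && !found_it)
  {
    found_it = (str[index] == ch);
    if (!found_it)
    {
      ++index;
    }
  }
  return index;
}
```

The loop condition reads as a sentence, so no trailing comment is needed to explain when the search stops.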
Although self-documenting code is preferred over comments, there is generally a need to provide information beyond an explanation of complicated parts of the code. The information that is needed is documentation of at least the following:
The code documentation in conjunction with the declarations should be sufficient for a client to use the code; documentation is required since the full semantics of classes, functions, types and objects cannot be fully expressed using C++ alone.
This chapter provides guidance on the choice of names for various C++ entities.
Coming up with good names for program entities (classes, functions, types, objects, literals, exceptions, namespaces) is no easy matter. For medium-to-large applications, the problem is made even more challenging: here name conflicts, and lack of synonyms to designate distinct but similar concepts add to the degree of difficulty.
Using a naming convention can lessen the mental effort required for inventing suitable names. Aside from this benefit, a naming convention has the added benefit of enforcing consistency in the code. To be useful, a naming convention should provide guidance on: typographical style (or how to write the names); and name construction (or how to choose names).
It is not so important which naming convention is used as long as it is applied consistently. Uniformity in naming is far more important than the actual convention: uniformity supports the principle of minimal surprise.
Because C++ is a case-sensitive language, and because a number of distinct naming conventions are in widespread use by the C++ community, it will rarely be possible to achieve absolute naming consistency. We recommend picking a naming convention for the project based upon the host environment (e.g., UNIX or Windows) and the principal libraries used by the project, to maximize code consistency:
The careful reader will observe that the examples in this text currently do not follow all the guidelines. This is due in part to the fact that examples are derived from multiple sources, and also to the desire to conserve paper; therefore the formatting guidelines have not been meticulously applied. But the message is "do as I say, not as I do".
Names with a single leading underscore ('_') are often used by library functions ("_main" and "_exit").
Names with double leading underscores ("__"), or a single leading underscore followed by a capital letter, are reserved for compiler internal use.
Also avoid names with adjacent underscores, as it is often difficult to discern the exact number of underscores.
It is hard to remember the differences between type names that differ only by letter case, and thus easy to get confused between them.
Abbreviations may be used if they are either commonly used in the application domain (e.g., FFT for Fast Fourier Transform), or they are defined in a project-recognized list of abbreviations. Otherwise, it is very likely that similar but not quite identical abbreviations will occur here and there, introducing confusion and errors later (e.g., track_identification being abbreviated trid, trck_id, tr_iden, tid, tr_ident, and so on).
The use of suffixes for categorizing kinds of entities (such as type for type, and error for exceptions) is usually not very effective for imparting understanding of the code. Suffixes such as array and struct also imply a specific implementation; in the event of an implementation change (changing the representation from a struct or array), such a suffix would either have an adverse effect upon any client code, or would be misleading.
Suffixes can however be useful in a number of limited situations:
Choose names from the usage perspective; and use adjectives with nouns to enhance local (context specific) meaning. Also make sure that names agree with their types.
Choose names so that constructs such as:
    object_name.function_name(...);
    object_name->function_name(...);
are easy to read and appear meaningful.
Speed of typing is not an acceptable justification for using short or abbreviated names. One-letter and short identifiers are often an indication of poor choice or laziness. Exceptions are well-recognized instances such as using E for the base of the natural logarithms; or Pi.
Unfortunately, compilers and supporting tools sometimes limit the length of names; thus, care should be taken to ensure that long names do not differ only by their trailing characters: the differentiating characters may be truncated by these tools.
    void set_color(color new_color)
    {
      ...
      the_color = new_color;
      ...
    }
is better than:
    void set_foreground_color(color fg)
and:
    void set_foreground_color(color foreground)
    {
      ...
      the_foreground_color = foreground;
      ...
    }
The naming in the first example is superior to the other two: new_color is qualified and agrees with its type, thereby strengthening the semantics of the function.
In the second case, the intuitive reader could infer that fg was intended to mean foreground; however, in any good programming style, nothing should be left to reader intuition or inference.
In the third case, when the parameter foreground is used (away from its declaration), the reader is led to believe that foreground in fact means foreground color. It could conceivably, however, have been of any type that is implicitly convertible to a color.
Forming names from nouns and adjectives, and ensuring that names agree with their types follows natural-language and enhances both code readability and semantics.
Parts of names that are English words should be spelled correctly and conform to the project required form, i.e., consistently English or American, but not both. This is equally true for comments.
For Boolean objects, functions and function arguments, use a predicate clause in the positive form, e.g., found_it, is_available, but not is_not_available.
When negating predicates, double negatives are harder to understand.
If a system is decomposed into subsystems, use the subsystem names as namespace names for partitioning and minimizing the system's global namespace. If the system is a library, use a single outer-most namespace for the whole library.
Give each subsystem or library namespace a meaningful name; in addition, give it an abbreviated or acronym alias. Choose abbreviated or acronym aliases that are unlikely to clash, e.g., the ANSI C++ draft standard library [Plauger, 95] defines std as the alias for iso_standard_library.
If the compiler doesn't yet support the namespace construct, use name prefixes to simulate namespaces. For example, the public names in the interface of a system management subsystem could be prefixed with syma (short for System Management).
Using namespaces to enclose potentially global names helps to avoid name collisions when code is developed independently (by sub-project teams or vendors). A corollary is that only namespace names are global.
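A minimal sketch of a subsystem namespace with a meaningful name and a short alias is given below. The namespace system_management and the alias syma follow the prefix example above; the function node_label is a hypothetical member invented for illustration.

```cpp
#include <string>

// The subsystem namespace carries the meaningful name.
namespace system_management
{
  // Builds the display label for a managed node.
  std::string node_label(int node_number)
  {
    return "node-" + std::to_string(node_number);
  }
}

// The abbreviated alias is what client code would normally write.
namespace syma = system_management;
```

Client code can then refer to syma::node_label, while the full name remains available wherever the abbreviation might be unclear.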
Use a common noun or noun phrase in singular form, to give a class a name that expresses its abstraction. Use more general names for base classes and more specialized names for derived classes.
    typedef ... reference;  // From the standard library
    typedef ... pointer;    // From the standard library
    typedef ... iterator;   // From the standard library

    class bank_account {...};
    class savings_account : public bank_account {...};
    class checking_account : public bank_account {...};
When there is a conflict or shortage of suitable names for both objects and types, use the simple name for the object, and add a suffix such as mode, kind, code, and so on for the type name.
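For instance, assuming a hypothetical pen abstraction whose color is both a type and an attribute, the suffix goes on the type so that the simple name remains free for the object:

```cpp
// The type name carries the "kind" suffix...
enum color_kind
{
  red,
  green,
  blue
};

struct pen
{
  // ...so the object can keep the simple name.
  color_kind color;
};

// True when both pens draw in the same color.
bool have_same_color(const pen& first, const pen& second)
{
  return first.color == second.color;
}
```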
Use a plural form when expressing an abstraction that represents a collection of objects.
typedef some_container<...> yellow_pages;
When additional semantics is required beyond just a collection of objects, use the following from the standard library as behavioral patterns and name suffixes:
Use verbs or action phrases for functions that don't have return values (function declarations with a void return type), or functions that return values by pointer or reference parameters.
Use nouns or substantives for functions that return only a single value by a non-void function return type.
For classes with common operations (a pattern of behavior), use operation names drawn from a project list of choices. For example: begin, end, insert, erase (container operations from the standard library).
Avoid "get" and "set" naming mentality (prefixing functions with the prefixes "get" and "set"), especially for public operations for getting and setting object attributes. Operation naming should stay at the class abstraction and provision of service level; getting and setting object attributes are low-level implementation details that weaken encapsulation if made public.
Use adjectives (or past participles) for functions returning a Boolean (predicates). For predicates, it is often useful to add the prefix is or has before a noun to make the name read as a positive assertion. This is also useful when the simple name is already used for an object, type name, or an enumeration literal. Be accurate and consistent with respect to tense.
void insert(...);
void erase(...);

Name first_name();

bool has_first_name();
bool is_found();
bool is_available();
Don't use negative names, as this can result in expressions with double negations (e.g., !is_not_found), making the code more difficult to understand. In some cases, a negative predicate can also be made positive without changing its semantics by using an antonym, such as "is_invalid" instead of "is_not_valid".
bool is_not_valid(...);
void find_client(name with_the_name, bool& not_found);
Should be re-defined as:
bool is_valid(...);
void find_client(name with_the_name, bool& found);
When operations have the same intended purpose, use overloading rather than trying to find synonyms: this minimizes the number of concepts and variations of operations in the system, and thereby reduces its overall complexity.
When overloading operators, ensure that the semantics of the operator are preserved; if the conventional meaning of an operator cannot be preserved, choose another name for the function rather than overload the operator.
To indicate uniqueness, or to show that this entity is the main focus of the action, prefix the object or parameter name with "the" or "this". To indicate a secondary, temporary, auxiliary object, prefix it with "a" or "current":
void change_name(subscriber& the_subscriber,
                 const subscriber::name new_name)
{
    ...
    the_subscriber.name = new_name;
    ...
}

void update(subscriber_list& the_list,
            const subscriber::identification with_id,
            structure& on_structure,
            const value for_value);

void change(object& the_object,
            const object using_object);
Since exceptions must be used only to handle error situations, use a noun or a noun phrase that clearly conveys a negative idea:
overflow, threshold_exceeded, bad_initial_value
Use one of the words such as bad, incomplete, invalid, wrong, missing, or illegal from a project agreed list as part of the name rather than systematically using error or exception, which do not convey specific information.
The letter 'E' in floating-point literals and the hexadecimal digits 'A' to 'F' should always be uppercase.
This chapter provides guidance on the usage and form of various C++ declaration kinds.
Prior to the existence of the namespace feature in the C++ language, there were only limited means to manage name scope; consequently, the global namespace became rather over-populated, leading to conflicts that prevented some libraries from being used together in the same program. The new namespace language feature solves the global namespace pollution problem.
This means that only namespace names may be global; all other declarations should be within the scope of some namespace.
Ignoring this rule may eventually lead to name collision.
For logical grouping of non-class functionality (such as a class category), or for functionality with much greater scope than a class, such as a library or a subsystem, use a namespace to logically unify the declarations (see "Use namespaces to partition potential global names by subsystems or by libraries").
Express the logical grouping of functionality in the name.
namespace transport_layer_interface { /* ... */ }
namespace math_definitions { /* ... */ }
The use of global and namespace scope data is contrary to the encapsulation principle.
Classes are the fundamental design and implementation unit in C++. They should be used to capture domain and design abstractions, and as an encapsulation mechanism for implementing Abstract Data Types (ADT).
Use class rather than struct for implementing abstract data types
Use the class class-key rather than struct for implementing a class (an abstract data type).
Use the struct class-key for defining plain-old-data-structures (POD) as in C, especially when interfacing with C code.
Although class and struct are equivalent and can be used interchangeably, class has the preferred default access control emphasis (private) for better encapsulation.
Adopting a consistent practice for distinguishing between class and struct introduces a semantic distinction above and beyond the language rules: the class becomes the foremost construct for capturing abstractions and encapsulation, while the struct represents a pure data structure that can be exchanged in mixed programming language programs.
The access specifiers in a class declaration should appear in the order public, protected, private.
The public, protected, private ordering of member declarations ensures that information of most interest to the class user is presented first, reducing the need for the class user to navigate through irrelevant details or implementation details.
The use of public or protected data members reduces a class' encapsulation and affects a system's resilience to change: public data members expose a class' implementation to its users; protected data members expose a class' implementation to its derived classes. Any change to the class' public or protected data members will have consequences upon users and derived classes.
This guideline appears counter-intuitive upon first encounter: friendship exposes one's private parts to friends, so how can it preserve encapsulation? In situations where classes are highly interdependent, and require internal knowledge of each other, it is better to grant friendship rather than exporting the internal details via the class interface.
Exporting internal details as public members gives access to class clients which is not desirable. Exporting protected members gives access to potential descendants, encouraging a hierarchical design which is also not desirable. Friendship grants selective private access without enforcing a subclassing constraint, thus preserving encapsulation from all but those requiring access.
A good example of using friendship to preserve encapsulation is granting friendship to a friend test class. The friend test class, by seeing the class internals, can implement the appropriate test code, but later on, the friend test class can be dropped from the delivered code. Thus, no encapsulation is lost, nor is code added to the deliverable code.
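The friend test class idea can be sketched as follows. The class names account and account_test are hypothetical, and the member functions are defined inside the class declarations only to keep the sketch short.

```cpp
#include <cassert>

class account_test;  // Hypothetical test class, not part of the delivered code.

class account {
public:
    account() : balance_(0) {}
    void deposit(int amount) { balance_ += amount; }
private:
    friend class account_test;  // Grants private access to the test class only.
    int balance_;
};

// Test-only class: it may inspect the private state of account directly,
// so account needs no extra public accessor just for testing.
class account_test {
public:
    static bool deposit_updates_balance() {
        account a;
        a.deposit(25);
        return a.balance_ == 25;
    }
};
```

Clients of account still see only deposit; the friend declaration names a class that can simply be left out of the shipped build.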
Class declarations should contain only function declarations and never function definitions (implementations).
Providing function definitions in a class declaration pollutes the class specification with implementation details, making the class interface less discernible and more difficult to read, and increases compilation dependencies.
Function definitions in class declarations also reduce control over function inlining (see also "Use a No_Inline conditional compilation symbol to subvert inline compilation").
To allow the use of a class in an array, or in any of the STL containers, a class must provide a public default constructor, or allow the compiler to generate one.
An exception to the above rule exists when a class has a non-static data member of reference type; in this case it is often not possible to create a meaningful default constructor. It is therefore questionable to use a reference as an object data member.
If needed, and not explicitly declared, the compiler will implicitly generate a copy constructor and an assignment operator for a class. The compiler-defined copy constructor and assignment operator implement what is commonly referred to in Smalltalk terminology as "shallow-copy": explicitly, memberwise copy with bitwise copy for pointers. For a class that owns memory through pointer members, use of the compiler-generated copy constructor and assignment operator leads to memory leaks and to pointers that refer to already released memory.
// Adapted from [Meyers, 92].
void f()
{
    String hello("Hello");     // Assume String is implemented
                               // with a pointer to a char
                               // array.
    {                          // Enter new scope (block)
        String world("World");
        world = hello;         // Assignment loses world's
                               // original memory
    }                          // Destruct world upon exit from
                               // block;
                               // also indirectly hello
    String hello2 = hello;     // Assign destructed hello to
                               // hello2
}
In the above code, the memory holding the string "World" is lost after the assignment. Upon exiting the inner block, world is destroyed, which also releases the memory referenced by hello. The destructed hello is then assigned to hello2.
// Adapted from [Meyers, 1992].
void foo(String bar) {}

void f()
{
    String lost = "String that will be lost!";
    foo(lost);
}
In the above code, when foo is called with argument lost, lost will be copied into foo using the compiler-defined copy constructor. Since lost is copied with a bitwise copy of the pointer to "String that will be lost!", upon exit from foo, the copy of lost will be destroyed (assuming the destructor is implemented correctly to free up memory) along with the memory holding "String that will be lost!".
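Both problems above disappear once the class defines its own copy constructor and assignment operator. The following is a minimal sketch, assuming a hypothetical String that owns a heap-allocated character array; the member names data_, duplicate and c_str are illustrative and not taken from [Meyers, 92].

```cpp
#include <cassert>
#include <cstring>

// Hypothetical String that owns a heap-allocated character array.
class String {
public:
    String(const char* text = "");
    String(const String& other);             // Deep copy
    String& operator=(const String& other);  // Deep copy
    ~String();
    const char* c_str() const { return data_; }
private:
    static char* duplicate(const char* text);
    char* data_;
};

// Allocates a fresh array and copies the characters into it.
char* String::duplicate(const char* text)
{
    char* copy = new char[std::strlen(text) + 1];
    std::strcpy(copy, text);
    return copy;
}

String::String(const char* text) : data_(duplicate(text)) {}
String::String(const String& other) : data_(duplicate(other.data_)) {}
String::~String() { delete [] data_; }

String& String::operator=(const String& other)
{
    if (this != &other) {                     // Guard against self-assignment
        char* copy = duplicate(other.data_);  // Copy first ...
        delete [] data_;                      // ... then release the old array
        data_ = copy;
    }
    return *this;
}
```

With these definitions each String owns its own array, so destroying one object never releases memory that another object still refers to.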
// Example from [X3J16, 95; section 12.8]
class X {
public:
    X(const X&, int);   // int parameter is not
                        // initialized
    // No user-declared copy constructor, thus
    // compiler implicitly declares one.
};
// Deferred initialization of the int parameter mutates
// constructor into a copy constructor.
//
X::X(const X& x, int i = 0) { ... }
A compiler not seeing a "standard" copy constructor signature in a class declaration will implicitly declare a copy constructor. Deferred initialization of default parameters may, however, mutate a constructor into a copy constructor, resulting in ambiguity when a copy constructor is used. Any use of a copy constructor is thus ill-formed because of the ambiguity [X3J16, 95; section 12.8].
Unless a class is explicitly designed to be non-derivable, its destructor should always be declared virtual.
Deletion of a derived class object via a pointer or reference to a base class type will result in undefined behavior unless the base class destructor has been declared virtual.
// Bad style used for brevity
class B {
public:
    B(size_t size) { tp = new T[size]; }
    ~B() { delete [] tp; tp = 0; }
    //...
private:
    T* tp;
};

class D : public B {
public:
    D(size_t size) : B(size) {}
    ~D() {}
    //...
};

void f()
{
    B* bp = new D(10);
    delete bp;   // Undefined behavior due to
                 // non-virtual base class
                 // destructor
}
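A corrected sketch of the example above is shown below. It assumes that T is int, and it adds a hypothetical counter, d_destructor_calls, purely to make the destructor call observable.

```cpp
#include <cassert>
#include <cstddef>

static int d_destructor_calls = 0;  // Illustrative counter, not in the original

class B {
public:
    explicit B(std::size_t size) : tp(new int[size]) {}
    virtual ~B() { delete [] tp; }  // virtual: deletion through a B* is well defined
private:
    B(const B&);                    // Copying disabled (declared, never defined)
    B& operator=(const B&);
    int* tp;
};

class D : public B {
public:
    explicit D(std::size_t size) : B(size) {}
    ~D() { ++d_destructor_calls; }
};

void f()
{
    B* bp = new D(10);
    delete bp;   // Runs ~D() first, then ~B()
}
```

Because the base class destructor is virtual, the delete expression dispatches to the destructor of the actual object type.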
Single parameter constructors can also be prevented from being used for implicit conversion by declaring them with the explicit specifier.
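A short sketch of the effect of the explicit specifier, using a hypothetical buffer class and a hypothetical function size_of:

```cpp
#include <cassert>
#include <cstddef>

class buffer {
public:
    explicit buffer(std::size_t size) : size_(size) {}
    std::size_t size() const { return size_; }
private:
    std::size_t size_;
};

std::size_t size_of(const buffer& the_buffer) { return the_buffer.size(); }

// size_of(16);          // Does not compile: no implicit conversion to buffer
// size_of(buffer(16));  // Compiles: the conversion is spelled out by the caller
```

Without explicit, the first call would silently construct a temporary buffer from the integer.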
Non-virtual functions implement invariant behavior and are not intended to be specialized by derived classes. Violating this guideline may produce unexpected behavior: the same object may exhibit different behavior at different times.
Non-virtual functions are statically bound; thus, the function invoked upon an object is governed by the static type of the variable referencing the object (pointer-to-A and pointer-to-B respectively in the example below) and not by the actual type of the object.
// Adapted from [Meyers, 92].
class A {
public:
    void f();   // Non-virtual: statically bound
};
class B : public A {
public:
    void f();   // Non-virtual: statically bound
};
void g()
{
    B x;
    A* pA = &x;   // Static type: pointer-to-A
    B* pB = &x;   // Static type: pointer-to-B
    pA->f();      // Calls A::f
    pB->f();      // Calls B::f
}
Since non-virtual functions constrain subclasses by restricting specialization and polymorphism, care should be taken to ensure that an operation is truly invariant for all subclasses before declaring it non-virtual.
The initialization of an object's state during construction should be performed by a constructor initializer (a member initializer list) rather than with assignment operators within the constructor body.
class X {
public:
    X();
private:
    Y the_y;
};

X::X() : the_y(some_y_expression) { }
//
// "the_y" initialized by a constructor-initializer
Rather than this:
X::X() { the_y = some_y_expression; }
//
// "the_y" initialized by an assignment operator.
Object construction involves the construction of all base classes and data members prior to the execution of the constructor body. Initialization of data members requires two operations (construction plus assignment) if performed in a constructor body as opposed to a single operation (construction with an initial value) when performed using a constructor-initializer.
For large nested aggregate classes (classes containing classes containing classes...), the performance overheads of multiple operations (construction plus member assignment) can be significant.
class A {
public:
    A(int an_int);
};
class B : public A {
public:
    int f();
    B();
};

B::B() : A(f()) {}   // undefined: calls member function but A has
                     // not yet been initialized [X3J16, 95].
The result of an operation is undefined if a member function is called directly or indirectly from a constructor initializer before all the member initializers for base classes have completed [X3J16, 95].
Care should be exercised when calling member functions in constructors; be aware that even if a virtual function is called, the one that is executed is the one defined in the constructor's or destructor's class or in one of its bases.
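The behavior of virtual calls during construction can be observed with a small sketch. The classes base and derived, and the member name_at_construction_, are hypothetical; the assignment in the constructor body is used deliberately so that the call clearly happens after the object's members exist.

```cpp
#include <cassert>
#include <string>

class base {
public:
    base() { name_at_construction_ = name(); }  // Resolves to base::name
    virtual ~base() {}
    virtual std::string name() const { return "base"; }
    const std::string& name_at_construction() const {
        return name_at_construction_;
    }
private:
    std::string name_at_construction_;
};

class derived : public base {
public:
    virtual std::string name() const { return "derived"; }
};
```

Constructing a derived object records "base", because the derived part does not yet exist while the base constructor runs; once construction is complete, name() on the same object returns "derived".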
Use static const for integral class constants
When defining integral (integer) class constants, use static const data members rather than #define's or global constants. If static const is not supported by the compiler, use enum's instead.
class X {
    static const int buffer_size = 100;
    char buffer[buffer_size];
};
const int X::buffer_size;
Or this:
class C {
    enum { buffer_size = 100 };
    char buffer[buffer_size];
};
But not this:
#define BUFFER_SIZE 100
class C {
    char buffer[BUFFER_SIZE];
};
This will prevent confusion when the compiler complains about a missing return type for functions declared without an explicit return type.
Also use the same names in both function declarations and definitions; this minimizes surprises. Providing parameter names improves code documentation and readability.
Return statements sprinkled freely over a function body are akin to goto statements, making the code more difficult to read and to maintain.
Multiple returns can be tolerated only in very small functions, when all returns can be seen simultaneously and when the code has a very regular structure:
type_t foo()
{
    if (this_condition)
        return this_value;
    else
        return some_other_value;
}
Functions with void return type should have no return statement.
The creation of functions that produce global side-effects (change unadvertised data other than their internal object state: such as global and namespace data) should be minimized (see also "Minimize the use of global and namespace scope data"). But if unavoidable, then any side effects should be clearly documented as part of the function specification.
Passing in the required objects as parameters makes code less context dependent, more robust, and easier to understand.
The order in which parameters are declared is important from the caller's point of view:
This ordering permits taking advantage of defaults to reduce the number of arguments in function calls.
Arguments for functions with a variable number of parameters cannot be type-checked.
Avoid adding defaults to functions in further re-declarations of the function: apart from forward declarations, a function should only be declared once. Otherwise this may cause confusion for readers who are not aware of subsequent declarations.
Check whether functions have any constant behavior (return a constant value; accept constant arguments; or operate without side effect) and assert the behavior using the const specifier.
const T f(...);             // Function returning a constant
                            // object.
T f(T* const arg);          // Function taking a constant
                            // pointer.
                            // The pointed-to object can be
                            // changed but not the pointer.
T f(const T* arg);          // Function taking a pointer to, and
T f(const T& arg);          // function taking a reference to a
                            // constant object. The pointer can
                            // change but not the pointed-to
                            // object.
T f(const T* const arg);    // Function taking a constant
                            // pointer to a constant object.
                            // Neither the pointer nor pointed-
                            // to object may change.
T f(...) const;             // Function without side-effect:
                            // does not change its object state;
                            // so can be applied to constant
                            // objects.
Passing and returning objects by value may incur heavy constructor and destructor overhead. The constructor and destructor overhead can be avoided by passing and returning objects by reference.
Const references can be used to specify that arguments passed by reference cannot be modified. Typical usage examples are copy constructors and assignment operators:
C::C(const C& aC);
C& C::operator=(const C& aC);
Consider the following:
the_class the_class::return_by_value(the_class a_copy)
{
    return a_copy;
}

the_class an_object;
return_by_value(an_object);
When return_by_value is called with an_object as argument, the copy constructor of the_class is invoked to copy an_object to a_copy. The copy constructor is invoked again to copy a_copy to the function return temporary object. The destructor of the_class is invoked to destroy a_copy upon return from the function. Some time later the destructor will be invoked again to destroy the object returned by return_by_value. The overall cost of the above do-nothing function call is two constructors and two destructors.
The situation is even worse if the_class were a derived class and contained member data of other classes; the constructors and destructors of base classes and contained classes would also be invoked, thus escalating the number of constructor and destructor calls incurred by the function call.
The above guideline may appear to invite developers to always pass and return objects by reference; however, care should be exercised not to return references to local objects, or references when objects are required. Returning a reference to a local object is an invitation for disaster since, upon function return, the returned reference is bound to a destroyed object!
Local objects are destroyed upon leaving function scope; using destroyed objects is inviting disaster.
Violation of this guideline will lead to memory leaks.
class C {
public:
    ...
    friend C& operator+(const C& left, const C& right);
};

C& operator+(const C& left, const C& right)
{
    C* new_c = new C(left..., right...);
    return *new_c;
}

C a, b, c, d;
C sum;
sum = a + b + c + d;
Since the intermediate results of the operator+'s are not stored when computing sum, the intermediate objects cannot be deleted, leading to memory leaks.
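A leak-free alternative returns the result by value. The sketch below assumes a hypothetical class C that wraps a single int; the member value_ and the accessor value are illustrative only.

```cpp
#include <cassert>

class C {
public:
    explicit C(int value = 0) : value_(value) {}
    int value() const { return value_; }
private:
    int value_;
};

// Returns a new object by value: each intermediate result is a temporary
// that is destroyed automatically at the end of the full expression.
C operator+(const C& left, const C& right)
{
    return C(left.value() + right.value());
}
```

An expression such as sum = a + b + c + d then creates and destroys its intermediate objects without any call to new or delete.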
Violation of this guideline violates data encapsulation and may lead to bad surprises.
Use inline functions in preference to #define for macro expansion
But use inlining judiciously: only for very small functions; inlining large functions may cause code bloat.
Inline functions also increase the compilation dependencies between modules, as the implementation of the inline functions needs to be made available for compilation of the client code.
[Meyers, 1992] provides a detailed discussion of the following rather extreme example of bad macro usage:
Don't do this:
#define MAX(a, b) ((a) > (b) ? (a) : (b))
Rather, do this:
inline int max(int a, int b) { return a > b ? a : b; }
The macro MAX has a number of problems: it is not type-safe, and it may evaluate an argument more than once, so the side effects of a call depend on which argument is the larger:
int a = 1, b = 0;
MAX(a++, b);       // a is incremented twice
MAX(a++, b+10);    // a is incremented once
MAX(a, "Hello");   // comparing ints and pointers
Use default parameters rather than function overloading when a single algorithm can be exploited, and the algorithm can be parameterized by a small number of parameters.
Using default parameters helps to reduce the number of overloaded functions, enhancing maintainability, and reduces the number of arguments required in function calls, improving code readability.
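As an illustration of a single algorithm parameterized by defaults, the sketch below uses a hypothetical function pad_left; the name, the default width of 8 and the default fill character are assumptions made for the example.

```cpp
#include <cassert>
#include <string>

// One algorithm, parameterized by a width and a fill character.
// The defaults are placed last so that callers can omit them.
std::string pad_left(const std::string& text,
                     std::string::size_type width = 8,
                     char fill = ' ')
{
    std::string padding;
    if (text.size() < width) {
        padding.assign(width - text.size(), fill);
    }
    return padding + text;
}
```

The calls pad_left("42"), pad_left("42", 4) and pad_left("42", 4, '0') all reach the same implementation, where three overloaded functions would otherwise be needed.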
Use function overloading when multiple implementations are required for the same semantic operation, but with different argument types.
Preserve conventional meaning when overloading operators. Don't forget to define related operators, e.g., operator== and operator!=.
Avoid overloading functions with a single pointer argument by functions with a single integer argument:
void f(char* p);
void f(int i);
The following calls may cause surprises:
f(NULL);
f(0);
Overload resolution resolves to f(int) and not f(char*).
Have operator= return a reference to *this
C++ allows chaining of the assignment operators:
String x, y, z; x = y = z = "A string";
Since the assignment operator is right-associative, the string "A string" is assigned to z, z to y, and y to x. The operator= is effectively invoked once for each expression on the right side of the =, in a right to left order. This also means that the result of each operator= is an object; however, a return choice of either the left hand or the right hand object is possible.
Since good practice dictates that the signature of the assignment operator should always be of the form:
C& C::operator=(const C&);
only the left hand object is possible (rhs is const reference, lhs is non-const reference), thus *this should be returned. See [Meyers, 1992] for a detailed discussion.
Make operator= check for self-assignment
There are two good reasons for performing the check. Firstly, assignment of a derived class object involves calling the assignment operator of each base class up the inheritance hierarchy, and skipping these operations may provide significant runtime savings. Secondly, assignment involves the destruction of the "lvalue" object prior to copying the "rvalue" object. In the case of a self-assignment, the rvalue object is destroyed before it is assigned; the result of the assignment is thus undefined.
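The check, the call to the base class assignment operator, and the return of *this can be sketched together. The classes counter_base and named_counter are hypothetical and are defined inline only for brevity.

```cpp
#include <cassert>

class counter_base {
public:
    explicit counter_base(int id = 0) : id_(id) {}
    counter_base& operator=(const counter_base& other) {
        if (this != &other) {
            id_ = other.id_;
        }
        return *this;
    }
    int id() const { return id_; }
private:
    int id_;
};

class named_counter : public counter_base {
public:
    named_counter(int id = 0, int count = 0)
        : counter_base(id), count_(count) {}
    named_counter& operator=(const named_counter& other) {
        if (this != &other) {                // Skip all work on self-assignment
            counter_base::operator=(other);  // Assign the base class part
            count_ = other.count_;           // Then the derived class part
        }
        return *this;                        // Allows chaining: x = y = z
    }
    int count() const { return count_; }
private:
    int count_;
};
```

Returning *this by reference is what lets x = y = z assign z to y and then y to x.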
Do not write overly long functions, for example over 60 lines of code.
Minimize the number of return statements, 1 is the ideal number.
Strive for a Cyclomatic Complexity of less than 10 (sum of the decision statements + 1, for single exit statement functions).
Strive for an Extended Cyclomatic Complexity of less than 15 (sum of the decision statements + logical operators + 1, for single exit statement functions).
Minimize the mean maximum span of reference (distance in lines between the declaration of a local object and the first instance of its use).
In large projects there is usually a collection of types used frequently throughout the system; in this case it is sensible to collect these types together in one or more low-level global utility namespaces (see example for "Avoid the use of fundamental types").
When a high degree of portability is the objective, or when control is needed over the memory space occupied by numeric objects, or when a specific range of values is required, then fundamental types should not be used. In these situations it is better to declare explicit type names with size constraints using the appropriate fundamental types.
Make sure that fundamental types don't sneak back into the code through loop counters, array indices, and so on.
namespace system_types {
    typedef unsigned char byte;
    typedef short int integer16;            // 16-bit signed integer
    typedef int integer32;                  // 32-bit signed integer
    typedef unsigned short int natural16;   // 16-bit unsigned integer
    typedef unsigned int natural32;         // 32-bit unsigned integer
    ...
}
The representation of fundamental types is implementation dependent.
Use typedef to create synonyms to strengthen local meaning
Use typedef to create synonyms for existing names, to give more meaningful local names and improve legibility (there is no runtime penalty for doing so).
typedef can also be used to provide shorthands for qualified names.
// vector declaration from standard library
//
namespace std {
    template <class T, class Alloc = allocator>
    class vector {
    public:
        typedef typename Alloc::types<T>::reference reference;
        typedef typename Alloc::types<T>::const_reference const_reference;
        typedef typename Alloc::types<T>::pointer iterator;
        typedef typename Alloc::types<T>::const_pointer const_iterator;
        ...
    };
}
When using typedef-names created by typedef, do not mix the use of the original name and the synonym in the same piece of code.
Use named constants in preference.
Use const or enum instead.
Don't do this:
#define LIGHT_SPEED 3E8
Rather, do this:
const double light_speed = 3E8;
Or this for sizing arrays:
enum {
    small_buffer_size = 100,
    large_buffer_size = 1000
};
Debugging is much harder because names introduced by #defines are replaced during compilation preprocessing, and do not appear in symbol tables.
Always initialize const objects at declaration
const objects not declared extern have internal linkage; initializing these constant objects at declaration allows the initializers to be used at compilation time.
Constant objects may exist in read-only memory.
Specify initial values in object definitions, unless the object is self-initializing. If it is not possible to assign a meaningful initial value, then assign a "nil" value or consider declaring the object later.
For large objects, it is generally not advisable to construct the objects, and then later initialize them using assignment as this can be very costly (see also "Use constructor initializers rather than assignments in constructors").
If proper initialization of an object is not possible at the time of construction, then initialize the object using a conventional "nil" value that means "uninitialized". The nil value is to be used only for initialization to declare an "unusable but known value" that can be rejected in a controlled fashion by algorithms: to indicate an uninitialized variable error when the object is used before proper initialization.
Note that it is not always possible to declare a nil value for all types, especially modulo types, such as an angle. In this case choose the least likely value.
This chapter provides guidance on the usage and form of various kinds of C++ expressions and statements.
Avoid nesting expressions too deeply
The level of nesting of an expression is defined as the number of nested sets of parentheses required to evaluate an expression from left to right if the rules of operator precedence were ignored.
Too many levels of nesting make expressions harder to comprehend.
Unless evaluation order is specified by an operator (comma operator, ternary expression, and conjunctions and disjunctions), do not assume any particular evaluation order; assuming one may lead to bad surprises and non-portability.
For example, don't combine the use of a variable in the same statement as an increment or decrement of the variable.
foo(i, i++); array[i] = i--;
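One way to remove the dependence on evaluation order is to move the increment or decrement into its own statement. The sketch below assumes that the old value of i was intended in both places; sum_pair, increment_after_call and store_then_decrement are hypothetical names.

```cpp
#include <cassert>

int sum_pair(int first, int second) { return first + second; }

// Replaces a call of the form foo(i, i++).
int increment_after_call(int& i)
{
    int result = sum_pair(i, i);  // Both arguments read the same, known value
    ++i;                          // The increment is a separate statement
    return result;
}

// Replaces a statement of the form array[i] = i--.
void store_then_decrement(int array[], int& i)
{
    array[i] = i;  // Index and stored value are both the old i
    --i;           // The decrement happens afterwards
}
```

If a different ordering was intended, the same separation applies; the point is that the order is then stated by the code rather than chosen by the compiler.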
Use 0 for null pointers rather than NULL
The use of 0 or NULL for null pointers is a highly controversial topic.
Both C and C++ define any zero-valued constant expression to be interpretable as a null pointer. Because 0 is difficult to read and the use of literals is highly discouraged, programmers have traditionally used the macro NULL as the null pointer. Unfortunately, there is no portable definition for NULL. Some ANSI C compilers have used (void *)0, but this turns out to be a poor choice for C++:
char* cp = (void*)0; /* Legal C but not C++ */
Thus any definition of NULL of the form (T*)0, rather than simply zero, requires a cast in C++. Historically, guidelines advocating the use of 0 for null pointers attempted to alleviate the casting requirement and make code more portable. Many C++ developers, however, feel more comfortable using NULL rather than 0, and also argue that most compilers (more precisely, most header files) nowadays implement NULL as 0.
This guideline rules in favor of 0, since 0 is guaranteed to work irrespective of the value of NULL; however, due to controversy, this point is demoted to the level of a tip, to be followed or ignored as seen fit.
Use the new casting operators (dynamic_cast, static_cast, reinterpret_cast, const_cast) rather than old-style casting.
If you don't have the new cast operators, avoid casting altogether, especially downcasting (converting a base class object to a derived class object).
Use the casting operators as follows:
dynamic_cast: to cast between members of the same class hierarchy (subtypes) using run-time type information (run-time type information is available for classes with virtual functions). Casting between such classes is guaranteed to be safe.
static_cast: to cast between members of the same class hierarchy without using run-time type information; so is not guaranteed to be safe. If the programmer cannot guarantee type-safety, then use dynamic_cast.
reinterpret_cast: to cast between unrelated pointer types and integral (integer) types; is unsafe and should only be used between the types mentioned.
const_cast: to cast away the "constness" of a function argument specified as a const parameter. Note const_cast is not intended to cast away the "constness" of an object truly defined as a const object (it could be in read-only-memory).
Don't use typeid to implement type-switching logic: let the casting operators perform the type checking and conversion atomically; see [Stroustrup, 1994] for an in-depth discussion.
Don't do the following:
void foo(const base& b)
{
    if (typeid(b) == typeid(derived1)) {
        do_derived1_stuff();
    }
    else if (typeid(b) == typeid(derived2)) {
        do_derived2_stuff();
    }
    else if () {
    }
}
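The type-switching logic above can instead let dynamic_cast perform the check and the conversion in one step. This is a sketch only: the classes and the member functions derived1_stuff and derived2_stuff are hypothetical stand-ins, and foo returns an int here merely so that the selected branch can be observed.

```cpp
#include <cassert>

class base {
public:
    virtual ~base() {}  // A virtual function makes run-time type information available
};

class derived1 : public base {
public:
    int derived1_stuff() const { return 1; }
};

class derived2 : public base {
public:
    int derived2_stuff() const { return 2; }
};

int foo(const base& b)
{
    int result = 0;  // Value used when b is neither derived type
    if (const derived1* d1 = dynamic_cast<const derived1*>(&b)) {
        result = d1->derived1_stuff();  // d1 is non-null only for a derived1
    }
    else if (const derived2* d2 = dynamic_cast<const derived2*>(&b)) {
        result = d2->derived2_stuff();
    }
    return result;
}
```

Where the design allows it, a virtual function in base removes the need for any such switching.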
Old-style casting defeats the type system and can lead to hard-to-detect bugs that are not caught by the compiler: the memory management system can be corrupted, virtual function tables can get trampled on, and non-related objects can be damaged when the object is accessed as a derived class object. Note that the damage can be done even by a read access, as non-existent pointers or fields might be referenced.
New-style casting operators make type conversion safer (in most cases) and more explicit.
Don't use the old-style Boolean macros or constants: there is no standard Boolean value true; use the new bool type instead.
Since there was traditionally no standard value for true (1 or !0), comparisons of non-zero expressions to true could fail.
Use Boolean expressions instead.
Avoid doing this:
if (someNonZeroExpression == true) // May not evaluate to true
Better to do this:
if (someNonZeroExpression) // Always evaluates as a true condition.
The result of such operations is nearly always meaningless.
Avoid disaster by setting a pointer to a deleted object to null: repeated deletion of a non-null pointer is harmful, but repeated deletion of a null pointer is harmless.
Always assign a null pointer value after deletion even before a function return, since new code may be added later.
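A small helper makes the habit hard to forget. The function release and the class widget below are hypothetical; the null pointer is written as 0, in line with the tip given earlier.

```cpp
#include <cassert>

class widget {};

// Deletes the object and resets the caller's pointer in one step.
void release(widget*& the_widget)
{
    delete the_widget;  // Deleting a null pointer is harmless
    the_widget = 0;     // The pointer no longer refers to a deleted object
}
```

Calling release twice on the same pointer is safe, because the second call deletes a null pointer.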
Use a switch-statement when branching on discrete values
Use a switch statement rather than a series of "else if" when the branching condition is a discrete value.
Always provide a default branch for switch-statements for catching errors
A switch statement should always contain a default branch, and the default branch should be used for trapping errors.
This policy ensures that when new switch values are introduced, and branches to handle the new values are omitted, the existing default branch will catch the error.
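A sketch of an error-trapping default branch follows. The enumeration color and the function name_of are hypothetical, and throwing std::logic_error is only one possible way of trapping the error; a project may prefer an assertion or its own error reporting.

```cpp
#include <cassert>
#include <stdexcept>

enum color { red, green, blue };

const char* name_of(color the_color)
{
    const char* result = 0;
    switch (the_color) {
    case red:   result = "red";   break;
    case green: result = "green"; break;
    case blue:  result = "blue";  break;
    default:
        // Reached only if a new enumerator is added without a branch,
        // or if an out-of-range value is passed in.
        throw std::logic_error("name_of: unhandled color");
    }
    return result;
}
```

If an enumerator is later added to color and no branch is written for it, the omission shows up the first time that value is passed to name_of.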
Use a for-statement in preference to a while statement when iteration and loop termination are based upon the loop counter.
Avoid the use of jump statements in loops
Avoid exiting (using break, return or goto) from loops other than by the loop termination condition, and avoid prematurely skipping to the next iteration with continue. This reduces the number of flow of control paths, making code easier to comprehend.
Don't use the goto-statement
This seems to be a universal guideline.
This may lead to confusion for the readers and potential risks in maintenance.
This chapter provides guidance on the topics of memory management and error reporting.
The C library malloc, calloc and realloc functions should not be used for allocating object space: the C++ operator new should be used for this purpose.
The only time memory should be allocated using the C functions is when memory is to be passed to a C library function for disposal.
Don't use delete to free memory allocated by C functions, or free on objects created by new.
Using delete on array objects without the empty brackets ("[]") notation has undefined behavior; typically only the first array element is destroyed, and memory may leak or the heap may be corrupted.
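The pairing rules above can be sketched as follows. The Tracked class and both function names are hypothetical; the counter exists only to show that delete[] runs every element's destructor.

```cpp
#include <cstdlib>

// Hypothetical class that counts its live instances.
class Tracked {
public:
    Tracked()  { ++live_count; }
    ~Tracked() { --live_count; }
    static int live_count;
};
int Tracked::live_count = 0;

// new[] is paired with delete[], so every element is destroyed.
int live_objects_after_array_round_trip(int how_many)
{
    Tracked* the_objects = new Tracked[how_many];
    delete[] the_objects;
    return Tracked::live_count;
}

// Memory obtained from the C allocator goes back through std::free,
// never through delete.
void c_allocation_round_trip()
{
    void* the_block = std::malloc(64);
    std::free(the_block);
}
```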
Because not much experience has been gained using the C++ exception mechanism, the guidelines presented here may undergo significant future revision.
The C++ draft standard defines two broad categories of errors: logic errors and runtime errors. Logic errors are preventable programming errors. Runtime errors are defined as those errors due to events beyond the scope of the program.
The general rule for use of exceptions is that the system in normal condition and in the absence of overload or hardware failure should not raise any exceptions.
Use function precondition and postcondition assertions during development to provide "drop-dead" error detection.
Assertions provide a simple and useful provisional error detection mechanism until the final error handling code is implemented. Assertions have the added bonus of being able to be compiled away using the "NDEBUG" preprocessor symbol (see "Define the NDEBUG symbol with a specific value").
The assert macro has traditionally been used for this purpose; however, reference [Stroustrup, 1994] provides a template alternative, see below.
template<class T, class Exception>
inline void assert(T a_boolean_expression, Exception the_exception)
{
    if (!NDEBUG)
        if (!a_boolean_expression)
            throw the_exception;
}
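A usage sketch of the template approach follows, under two stated assumptions: the function is renamed check_that so that it cannot collide with the standard assert macro, and a stand-in symbol PROJECT_NDEBUG is used in place of NDEBUG so that the example does not switch off the standard macro. Both names, and the quotient function, are hypothetical.

```cpp
#include <stdexcept>

// Stand-in for NDEBUG, given an explicit value: 0 keeps the checks,
// non-zero removes them. (Assumption: a separate name is used here so
// the sketch does not disable the standard assert macro.)
#define PROJECT_NDEBUG 0

template<class T, class Exception>
inline void check_that(T a_boolean_expression, Exception the_exception)
{
    if (!PROJECT_NDEBUG)
        if (!a_boolean_expression)
            throw the_exception;
}

// Precondition check: the divisor must be non-zero.
int quotient(int the_dividend, int the_divisor)
{
    check_that(the_divisor != 0,
               std::invalid_argument("quotient: zero divisor"));
    return the_dividend / the_divisor;
}
```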
Do not use exceptions for frequent, anticipated events: exceptions cause disruptions in the normal flow of control of the code, making it more difficult to understand and maintain.
Anticipated events should be handled in the normal flow of control of the code; use a function return value or "out" parameter status code as required.
Exceptions should also not be used to implement control structures: this would be another form of "goto" statement.
This ensures that all exceptions support a minimal set of common operations and can be handled by a small set of high level handlers.
Logic errors (domain error, invalid argument error, length error and out-of-range error) should be used to indicate application domain errors, invalid arguments passed to function calls, construction of objects beyond their permitted sizes, and argument values not within permitted ranges.
Runtime errors (range error and overflow error) should be used to indicate arithmetic and configuration errors, corrupted data, or resource exhaustion errors only detectable at runtime.
In large systems, having to handle a large number of exceptions at each level makes the code difficult to read and to maintain. Exception processing may dwarf the normal processing.
Ways to minimize the number of exceptions are:
Share exceptions between abstractions by using a small number of exception categories.
Throw specialized exceptions derived from the standard exceptions but handle more generalized exceptions.
Add "exceptional" states to the objects, and provide primitives to check explicitly the validity of the objects.
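The second point in the list above might be sketched like this: a specialized exception is thrown, but the caller needs only one general handler. Disk_Full, save_document and try_to_save are hypothetical names.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical project exception, specialized from a standard
// runtime error.
class Disk_Full : public std::runtime_error {
public:
    Disk_Full() : std::runtime_error("disk full") {}
};

// Always fails, for illustration only.
void save_document()
{
    throw Disk_Full();
}

// One general handler covers Disk_Full and any other runtime_error.
std::string try_to_save()
{
    std::string the_outcome = "saved";
    try
    {
        save_document();
    }
    catch (const std::runtime_error& a_failure)
    {
        the_outcome = a_failure.what();
    }
    return the_outcome;
}
```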
Functions originating exceptions (not just passing exceptions through) should declare all exceptions thrown in their exception specification: they should not silently generate exceptions without warning their clients.
During development, report exceptions by the appropriate logging mechanism as early as possible, including at the "throw-point".
Exception handlers should be defined in most-derived to most-base class order to avoid coding unreachable handlers; see the how-not-to-do-it example below. This also ensures that the most appropriate handler catches the exception, since handlers are matched in declaration order.
Don't do this:
class base { ... };
class derived : public base { ... };
...
try
{
    ...
    throw derived(...);  // Throw a derived class exception
}
catch (base& a_base_failure)  // But base class handler "catches"
                              // because it matches first!
{
    ...
}
catch (derived& a_derived_failure)  // This handler is unreachable!
{
    ...
}
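For contrast, a runnable sketch with the handlers in the recommended order is given below. The class and function names are hypothetical; the function simply reports which handler was reached.

```cpp
#include <string>

// Hypothetical exception classes, for illustration only.
class Base_Failure {
public:
    virtual ~Base_Failure() {}
};
class Derived_Failure : public Base_Failure {};

// Reports which handler caught the exception.
std::string handler_reached(bool throw_derived)
{
    std::string the_handler = "none";
    try
    {
        if (throw_derived)
            throw Derived_Failure();
        throw Base_Failure();
    }
    catch (const Derived_Failure&)  // Most-derived handler first
    {
        the_handler = "derived";
    }
    catch (const Base_Failure&)     // Most-base handler last
    {
        the_handler = "base";
    }
    return the_handler;
}
```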
Avoid catch-all exception handlers (handler declarations using ...), unless the exception is re-thrown.
Catch-all handlers should only be used for local housekeeping, then the exception should be re-thrown to prevent masking of the fact that the exception cannot be handled at this level:
try
{
    ...
}
catch (...)
{
    if (io.is_open(local_file))
    {
        io.close(local_file);
    }
    throw;
}
When returning status codes as a function parameter, always assign a value to the parameter as the first executable statement in the function body. Systematically make all statuses a success by default or a failure by default. Think of all possible exits from the function, including exception handlers.
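A minimal sketch of the failure-by-default convention follows. The Status enumeration and the digit_value function are hypothetical.

```cpp
// Hypothetical status type, for illustration only.
enum Status { failure, success };

// Converts a decimal digit character to its numeric value.
void digit_value(char a_character, int& the_value, Status& the_status)
{
    the_status = failure;  // First executable statement: failure by default
    the_value = 0;
    if (a_character >= '0' && a_character <= '9')
    {
        the_value = a_character - '0';
        the_status = success;
    }
}
```

Whatever path leaves the function, the caller never sees an uninitialized status.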
If a function might produce an erroneous output unless given proper input, install code in the function to detect and report invalid input in a controlled manner. Do not rely on a comment that tells the client to pass proper values. It is virtually guaranteed that sooner or later that comment will be ignored, resulting in hard-to-debug errors if the invalid parameters are not detected.
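As a sketch of a local safety check, the hypothetical function below rejects invalid arguments itself rather than trusting a comment. The choice of std::invalid_argument is an assumption; a project may report the error differently.

```cpp
#include <stdexcept>

// Returns the_part as a whole-number percentage of the_whole.
// Invalid input is detected here, not left to the client.
int percentage_of(int the_part, int the_whole)
{
    if (the_whole <= 0 || the_part < 0 || the_part > the_whole)
    {
        throw std::invalid_argument(
            "percentage_of: requires 0 <= part <= whole and whole > 0");
    }
    // Widen before multiplying so the_part * 100 cannot overflow int.
    return static_cast<int>(static_cast<long long>(the_part) * 100 / the_whole);
}
```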
This chapter deals with language features that are a priori non-portable.
Pathnames are not represented in a standard manner across operating systems. Using them will introduce platform dependencies.
#include "somePath/filename.hh" // Unix
#include "somePath\filename.hh" // MSDOS
The representation and alignment of types are highly machine architecture dependent. Assumptions made about representation and alignment may lead to bad surprises and reduced portability.
In particular, never attempt to store a pointer in an int, a long or any other numeric type; this is highly non-portable.
Do not depend on a particular underflow or overflow behavior
Use "stretchable" constants whenever possible
Stretchable constants avoid problems with word-size variations.
const int all_ones = ~0;
const int last_3_bits = ~0x7;
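The same idea can be sketched with unsigned constants, where the bit patterns are easiest to verify. The unsigned types, the name low_3_bits_clear and the round_down_to_8 function are assumptions made for this example.

```cpp
#include <limits>

// Written with ~ so the constants stretch to the width of the type,
// unlike a fixed literal such as 0xFFFF.
const unsigned int all_ones = ~0u;            // Every bit set
const unsigned int low_3_bits_clear = ~0x7u;  // Every bit set except the low three

// Rounds a value down to a multiple of 8 by clearing its low three bits.
unsigned int round_down_to_8(unsigned int a_value)
{
    return a_value & low_3_bits_clear;
}
```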
Machine architectures may dictate the alignment of certain types. Converting from types with more relaxed alignment requirements to types with more stringent alignment requirements may lead to program failures.
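One common way to avoid such a conversion, sketched below, is to copy the bytes into a properly aligned object with std::memcpy rather than casting an unsigned char* to an int*. The function name read_int_at is hypothetical.

```cpp
#include <cstring>

// Copies sizeof(int) bytes out of a byte buffer at any offset into an
// int object, which the compiler guarantees is suitably aligned.
int read_int_at(const unsigned char* the_bytes)
{
    int the_value;
    std::memcpy(&the_value, the_bytes, sizeof the_value);
    return the_value;
}
```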
This chapter provides guidance on reusing C++ code.
If the standard libraries are not available, then create classes based upon the standard library interfaces: this will facilitate future migration.
Use templates to reuse behavior, when behavior is not dependent upon a specific data type.
Use public inheritance to express the "isa" relationship and reuse base class interfaces, and optionally, their implementation.
Avoid private inheritance when reusing implementation or modeling "parts/whole" relationships. Reuse of implementation without redefinition is best achieved by containment rather than by private inheritance.
Use private inheritance when redefinition of base class operations is needed.
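The containment advice above might look like the following sketch, in which a hypothetical Int_Stack reuses std::vector as a contained part and exposes only stack operations.

```cpp
#include <vector>

// Hypothetical stack that reuses std::vector by containment; none of
// the vector interface leaks into the stack's public interface.
class Int_Stack {
public:
    void push(int a_value) { the_items.push_back(a_value); }

    // Precondition: !is_empty()
    int pop()
    {
        int the_top = the_items.back();
        the_items.pop_back();
        return the_top;
    }

    bool is_empty() const { return the_items.empty(); }

private:
    std::vector<int> the_items;
};
```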
Multiple inheritance should be used judiciously as it brings much additional complexity. [Meyers, 1992] provides a detailed discussion on the complexities due to potential name ambiguities and repeated inheritance. Complexities arise from:
Ambiguities: when the same names are used by multiple classes, any unqualified references to the names are inherently ambiguous. Ambiguity can be resolved by qualifying the member names with their class names. However, this has the unfortunate effect of defeating polymorphism and turning virtual functions into statically bound functions.
Repeated inheritance (inheritance of a base class multiple times by a derived class via different paths in the inheritance hierarchy) yields multiple sets of data members from the same base, which raises the problem of which set should be used.
Multiply inherited data members can be prevented by using virtual inheritance (inheritance of virtual base classes). Why not always use virtual inheritance then? Virtual inheritance has the negative effect of altering the underlying object representation and reducing access efficiency.
Enacting a policy to require all inheritance to be virtual, at the same time imposing an all encompassing space and time penalty, would be too authoritarian.
Multiple inheritance therefore requires class designers to be clairvoyant as to the future uses of their classes in order to be able to make the decision to use virtual or non-virtual inheritance.
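A minimal sketch of repeated inheritance resolved by virtual base classes is shown below. All four class names are hypothetical. Because both intermediate classes inherit Device virtually, Terminal holds a single Device part and the call to set_id is unambiguous.

```cpp
// Hypothetical hierarchy, for illustration only.
class Device {
public:
    Device() : the_id(0) {}
    virtual ~Device() {}
    void set_id(int an_id) { the_id = an_id; }
    int id() const { return the_id; }
private:
    int the_id;
};

class Input_Device  : public virtual Device {};
class Output_Device : public virtual Device {};

// Exactly one Device sub-object is shared by both inheritance paths.
class Terminal : public Input_Device, public Output_Device {};
```

Without the virtual keyword, Terminal would contain two Device parts and an unqualified call to set_id would not compile.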
This chapter provides guidance on compilation issues.
Do not include in a module specification other header files that are only required by the module's implementation.
Avoid including header files in a specification for the purpose of gaining visibility to other classes, when only pointer or reference visibility is required; use forward declarations instead.
// Module A specification, contained in file "A.hh"
#include "B.hh" // Don't include when only required by
                // the implementation.
#include "C.hh" // Don't include when only required by
                // reference; use a forward declaration instead.
class C;
class A
{
    C* a_c_by_reference; // Has-a by reference.
};
// End of "A.hh"
Minimizing compilation dependencies is the rationale for certain design idioms or patterns, variously named Handle or Envelope [Meyers, 1992], or Bridge [Gamma] classes. By dividing the responsibility for a class abstraction across two associated classes, one providing the class interface and the other the implementation, the dependencies between a class and its clients are minimized, since any changes to the implementation (the implementation class) no longer cause recompilation of the clients.
// Module A specification, contained in file "A.hh"
class A_implementation;
class A
{
    A_implementation* the_implementation;
};
// End of "A.hh"
This approach also allows the interface class and implementation class to be specialized as two separate class hierarchies.
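A fuller sketch of the idiom is given below, with both files shown in one listing. The implementation file name "A.cc", the value member function and the stored number are assumptions added for illustration.

```cpp
// --- "A.hh": the only part clients compile against ---
class A_implementation;  // Forward declaration; no #include needed

class A {
public:
    A();
    ~A();
    int value() const;
private:
    A(const A&);             // Copying not supported in this sketch
    A& operator=(const A&);  // (declared, not defined)
    A_implementation* the_implementation;
};

// --- "A.cc" (hypothetical name): may change without recompiling clients ---
class A_implementation {
public:
    A_implementation() : the_value(42) {}
    int the_value;
};

A::A() : the_implementation(new A_implementation) {}
A::~A() { delete the_implementation; }
int A::value() const { return the_implementation->the_value; }
```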
Define the NDEBUG symbol with a specific value
The NDEBUG symbol was traditionally used to compile away assertion code implemented using the assert macro. The traditional usage paradigm was to define the symbol when it was desired to eliminate assertions; however, developers were often unaware of the presence of assertions, and therefore never defined the symbol.
We advocate using the template version of the assert; in this case the NDEBUG symbol has to be given an explicit value: 0 if assertion code is desired; non-zero to eliminate. Any assertion code subsequently compiled without providing the NDEBUG symbol a specific value will generate compilation errors, thus bringing the developer's attention to the existence of assertion code.
Here is a summary of all the guidelines presented in this booklet.
Use common sense
Always use #include to gain access to a module's specification
Never declare names beginning with one or more underscores ('_')
Limit global declarations to just namespaces
Always provide a default constructor for classes with explicitly-declared constructors
Always declare copy constructors and assignment operators for classes with pointer type data members
Never re-declare constructor parameters to have a default value
Always declare destructors to be virtual
Never redefine non-virtual functions
Never call member functions from a constructor initializer
Never return a reference to a local object
Never return a de-referenced pointer initialized by new
Never return a non-const reference or pointer to member data
Have operator= return a reference to *this
Have operator= check for self-assignment
Never cast away the "constness" of a constant object
Do not assume any particular expression evaluation order
Don't use old-style casting
Use the new bool type for Boolean expressions
Never compare directly against the Boolean value true
Never compare pointers to objects not within the same array
Always assign a null pointer value to a deleted object pointer
Always provide a default branch for switch-statements for catching errors
Don't use the goto-statement
Avoid mixing C and C++ memory operations
Always use delete[] when deleting array objects created by new
Never use hardcoded file pathnames
Do not assume the representation of a type
Do not assume the alignment of a type
Do not depend on a particular underflow or overflow behavior
Do not convert from a "shorter" type to a "longer" type
Define the NDEBUG symbol with a specific value
Place module specifications and implementations in separate files
Pick a single set of file name extensions to distinguish headers from implementation files
Avoid defining more than one class per module specification
Avoid putting implementation-private declarations in module specifications
Place module inline function definitions in a separate file
Break large modules into multiple translation units if program size is a concern
Isolate platform dependencies
Protect against repeated file inclusions
Use a "No_Inline" conditional compilation symbol to subvert inline compilation
Use a small, consistent indentation style for nested statements
Indent function parameters from the function name or scope name
Use a maximum line length that would fit on the standard printout paper size
Use consistent line folding
Use C++ style comments rather than C-style comments
Maximize comment proximity to source code
Avoid end of line comments
Avoid comment headers
Use an empty comment line to separate comment paragraphs
Avoid redundancy
Write self-documenting code rather than comments
Document classes and functions
Choose a naming convention and apply it consistently
Avoid using type names that differ only by letter case
Avoid the use of abbreviations
Avoid the use of suffixes to denote language constructs
Choose clear, legible, meaningful names
Use correct spelling in names
Use positive predicate clauses for Booleans
Use namespaces to partition potential global names by subsystems or by libraries
Use nouns or noun phrases for class names
Use verbs for procedure-type function names
Use function overloading when the same general meaning is intended
Augment names with grammatical elements to emphasize meaning
Choose exception names with a negative meaning
Use project defined adjectives for exception names
Use capital letters for floating point exponent and hexadecimal digits
Use a namespace to group non-class functionality
Minimize the use of global and namespace scope data
Use class rather than struct for implementing abstract data types
Declare class members in order of decreasing accessibility
Avoid declaring public or protected data members for abstract data types
Use friends to preserve encapsulation
Avoid providing function definitions in class declarations
Avoid declaring too many conversion operators and single parameter constructors
Use non-virtual functions judiciously
Use constructor-initializers rather than assignments in constructors
Beware when calling member functions in constructors and destructors
Use static const for integral class constants
Always declare an explicit function return type
Always provide formal parameter names in function declarations
Strive for functions with a single point of return
Avoid creating functions with global side-effects
Declare function parameters in order of decreasing importance and volatility
Avoid declaring functions with a variable number of parameters
Avoid re-declaring functions with default parameters
Maximize the use of const in function declarations
Avoid passing objects by value
Use inline functions in preference to #define for macro expansion
Use default parameters rather than function overloading
Use function overloading to express common semantics
Avoid overloading functions taking pointers and integers
Minimize complexity
Avoid the use of fundamental types
Avoid using literal values
Avoid using the preprocessor #define directive for defining constants
Declare objects close to their point of first use
Always initialize const objects at declaration
Initialize objects at definition
Use an if-statement when branching on Boolean expressions
Use a switch-statement when branching on discrete values
Use a for-statement or a while-statement when a pre-iteration test is required in a loop
Use a do-while-statement when a post-iteration test is required in a loop
Avoid the use of jump statements in loops
Avoid the hiding of identifiers in nested scopes
Use assertions liberally during development to detect errors
Use exceptions only for truly exceptional conditions
Derive project exceptions from standard exceptions
Minimize the number of exceptions used by a given abstraction
Declare all exceptions thrown
Define exception handlers in most-derived, to most-base class order
Avoid catch-all exception handlers
Make sure function status codes have an appropriate value
Perform safety checks locally; do not expect your client to do so
Use "stretchable" constants whenever possible
Use standard library components whenever possible
Define project-wide global system types
Use typedef to create synonyms to strengthen local meaning
Use redundant parentheses to make compound expressions clearer
Avoid nesting expressions too deeply
Use 0 for null pointers rather than NULL
Report exceptions at first occurrence