<html>
<head>
    <title> libsm : Assert and Abort </title>
</head>
<body>

<a href="index.html">Back to libsm overview</a>

<center>
    <h1> libsm : Assert and Abort </h1>
    <br> $Id: assert.html,v 1.6 2001-08-27 21:47:03 ca Exp $
</center>

<h2> Introduction </h2>

This package contains abstractions
for assertion checking and abnormal program termination.

<h2> Synopsis </h2>

<pre>
#include &lt;sm/assert.h&gt;

/*
**  abnormal program termination
*/

void sm_abort_at(char *filename, int lineno, char *msg);
typedef void (*SM_ABORT_HANDLER)(char *filename, int lineno, char *msg);
void sm_abort_sethandler(SM_ABORT_HANDLER);
void sm_abort(char *fmt, ...);

/*
**  assertion checking
*/

SM_REQUIRE(expression)
SM_ASSERT(expression)
SM_ENSURE(expression)

extern SM_DEBUG_T SmExpensiveRequire;
extern SM_DEBUG_T SmExpensiveAssert;
extern SM_DEBUG_T SmExpensiveEnsure;

#if SM_CHECK_REQUIRE
#if SM_CHECK_ASSERT
#if SM_CHECK_ENSURE

cc -DSM_CHECK_ALL=0 -DSM_CHECK_REQUIRE=1 ...
</pre>

<h2> Abnormal Program Termination </h2>

The functions sm_abort and sm_abort_at are used to report a logic
bug and terminate the program.  They can be invoked directly,
and they are also used by the assertion checking macros.

<dl>
<dt>
    void sm_abort_at(char *filename, int lineno, char *msg)
<dd>
	This is the low level interface for causing abnormal program
	termination.  It is intended to be invoked from a
	macro, such as the assertion checking macros.
<p>
	If filename != NULL then filename and lineno specify the line
	of source code on which the logic bug is detected.  These
	arguments are normally either set to __FILE__ and __LINE__
	from an assertion checking macro, or they are set to NULL and 0.
<p>
	The default action is to print an error message to smioerr
	using the arguments, and then call abort().  This default
	behaviour can be changed by calling sm_abort_sethandler.
<p>
<dt>
    void sm_abort_sethandler(SM_ABORT_HANDLER handler)
<dd>
	Install 'handler' as the callback function that is invoked
	by sm_abort_at.  This callback function is passed the same
	arguments as sm_abort_at, and is expected to log an error
	message and terminate the program.  The callback function should
	not raise an exception or perform cleanup: see Rationale.
<p>
	sm_abort_sethandler is intended to be called once, from main(),
	before any additional threads are created: see Rationale.
	You should not use sm_abort_sethandler to
	switch back and forth between several handlers;
	this is particularly dangerous when there are
	multiple threads, or when you are in a library routine.
<p>
<dt>
    void sm_abort(char *fmt, ...)
<dd>
	This is the high level interface for causing abnormal program
	termination.  It takes printf arguments.  There is no need to
	include a trailing newline in the format string; the handler
	function prints a trailing newline if appropriate.
</dl>
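<p>
    The relationship between the three functions described above can be
    sketched in portable C.  This is a stand-alone model, not the libsm
    source: every demo_ and Demo name is hypothetical, the model prints
    to stderr rather than smioerr, and the buffer sizes are arbitrary.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for SM_ABORT_HANDLER; this is not libsm code. */
typedef void (*DEMO_ABORT_HANDLER)(char *filename, int lineno, char *msg);

/* Build the report line; the location is omitted when filename is NULL. */
int
demo_format_report(char *buf, size_t size, char *filename, int lineno,
    char *msg)
{
    if (filename != NULL)
        return snprintf(buf, size, "%s:%d: %s", filename, lineno, msg);
    return snprintf(buf, size, "%s", msg);
}

/* Default action: print the report to stderr, then call abort(). */
static void
demo_default_handler(char *filename, int lineno, char *msg)
{
    char buf[512];

    (void) demo_format_report(buf, sizeof buf, filename, lineno, msg);
    fprintf(stderr, "%s\n", buf);
    abort();
}

static DEMO_ABORT_HANDLER DemoAbortHandler = demo_default_handler;

/* Intended to be called once, from main(), before threads are created. */
void
demo_abort_sethandler(DEMO_ABORT_HANDLER handler)
{
    DemoAbortHandler = handler;
}

/* Low level interface: pass the location and message to the handler. */
void
demo_abort_at(char *filename, int lineno, char *msg)
{
    (*DemoAbortHandler)(filename, lineno, msg);
}

/* High level interface: format a printf-style message, no location. */
void
demo_abort(char *fmt, ...)
{
    char msg[256];
    va_list ap;

    va_start(ap, fmt);
    (void) vsnprintf(msg, sizeof msg, fmt, ap);
    va_end(ap);
    demo_abort_at(NULL, 0, msg);
}
```

    In this model, a call such as demo_abort("impossible value %d for foo", foo)
    formats the message and hands it to whichever handler was installed;
    a replacement handler is expected to log and terminate, as the
    default one does.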

<h2> Assertions </h2>

    The assertion handling package
    supports a style of programming in which assertions are used
    liberally throughout the code, both as a form of documentation,
    and as a way of detecting bugs in the code by performing runtime checks.
<p>
    There are three kinds of assertion:
<dl>
<dt>
    SM_REQUIRE(expr)
<dd>
	This is an assertion used at the beginning of a function
	to check that the preconditions for calling the function
	have been satisfied by the caller.
<p>
<dt>
    SM_ENSURE(expr)
<dd>
	This is an assertion used just before returning from a function
	to check that the function has satisfied all of the postconditions
	that it is required to satisfy by its contract with the caller.
<p>
<dt>
    SM_ASSERT(expr)
<dd>
	This is an assertion that is used in the middle of a function,
	to check loop invariants, and for any other kind of check that is
	not a "require" or "ensure" check.
</dl>
    If any of the above assertion macros fails, then sm_abort_at
    is called.  By default, a message is printed to stderr and the
    program is aborted.  For example, if SM_REQUIRE(arg &gt; 0) fails
    because arg &lt;= 0, then the message
<blockquote><pre>
foo.c:47: SM_REQUIRE(arg &gt; 0) failed
</pre></blockquote>
    is printed to stderr, and abort() is called.
    You can change this default behaviour using sm_abort_sethandler.
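<p>
    To illustrate where each kind of assertion belongs, here is a
    stand-alone sketch.  The DEMO_ macros and demo_ functions are
    hypothetical stand-ins written for this example; they imitate the
    behaviour described above but are not the libsm definitions.

```c
#include <stdio.h>
#include <stdlib.h>

/* Report a failed check in the style shown above, then abort. */
static int
demo_fail(const char *file, int line, const char *text)
{
    fprintf(stderr, "%s:%d: %s failed\n", file, line, text);
    abort();
    return 0;    /* not reached */
}

/* Hypothetical stand-ins for SM_REQUIRE, SM_ASSERT and SM_ENSURE. */
#define DEMO_CHECK(kind, cond) \
    ((void) ((cond) || demo_fail(__FILE__, __LINE__, kind "(" #cond ")")))
#define DEMO_REQUIRE(cond) DEMO_CHECK("DEMO_REQUIRE", cond)
#define DEMO_ASSERT(cond)  DEMO_CHECK("DEMO_ASSERT", cond)
#define DEMO_ENSURE(cond)  DEMO_CHECK("DEMO_ENSURE", cond)

/* Return the largest of the n values in v. */
int
demo_max(const int *v, int n)
{
    int i, best;

    /* Preconditions: the caller's side of the contract. */
    DEMO_REQUIRE(n > 0);
    DEMO_REQUIRE(v != 0);

    best = v[0];
    for (i = 1; i < n; ++i)
    {
        if (v[i] > best)
            best = v[i];

        /* Loop invariant: best is at least the element just seen. */
        DEMO_ASSERT(best >= v[i]);
    }

    /* Postcondition: the function's side of the contract. */
    DEMO_ENSURE(best >= v[0]);
    return best;
}
```

    Calling demo_max with n &lt;= 0 violates the first precondition, so
    the sketch reports the failed DEMO_REQUIRE with its file and line
    and then aborts.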

<h2> How To Disable Assertion Checking At Compile Time </h2>

    You can use compile time macros to selectively enable or disable
    each of the three kinds of assertions, for performance reasons.
    For example, you might want to enable SM_REQUIRE checking
    (because it finds the most bugs), but disable the other two types.
<p>
    By default, all three types of assertion are enabled.
    You can selectively disable individual assertion types
    by setting one or more of the following cpp macros to 0
    before &lt;sm/assert.h&gt; is included for the first time:
<blockquote>
	SM_CHECK_REQUIRE<br>
	SM_CHECK_ENSURE<br>
	SM_CHECK_ASSERT<br>
</blockquote>
    Or, you can define SM_CHECK_ALL as 0 to disable all assertion
    types, then selectively define one or more of SM_CHECK_REQUIRE,
    SM_CHECK_ENSURE or SM_CHECK_ASSERT as 1.  For example,
    to disable all assertions except for SM_REQUIRE, you can use
    these C compiler flags:
<blockquote>
	-DSM_CHECK_ALL=0 -DSM_CHECK_REQUIRE=1
</blockquote>

    After &lt;sm/assert.h&gt; is included, the macros
    SM_CHECK_REQUIRE, SM_CHECK_ENSURE and SM_CHECK_ASSERT
    are each set to either 0 or 1.
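<p>
    The rules above are consistent with header logic along the following
    lines.  This is a sketch under that assumption, not a copy of
    &lt;sm/assert.h&gt;; the two #define lines at the top simulate the
    compiler flags from the example, and demo_enabled_checks is a
    hypothetical helper that exists only to make the result visible.

```c
/* Simulate: cc -DSM_CHECK_ALL=0 -DSM_CHECK_REQUIRE=1 ... */
#define SM_CHECK_ALL 0
#define SM_CHECK_REQUIRE 1

/*
**  Assumed header logic: SM_CHECK_ALL defaults to 1, and each
**  individual switch defaults to the value of SM_CHECK_ALL.
*/

#ifndef SM_CHECK_ALL
# define SM_CHECK_ALL 1
#endif
#ifndef SM_CHECK_REQUIRE
# define SM_CHECK_REQUIRE SM_CHECK_ALL
#endif
#ifndef SM_CHECK_ENSURE
# define SM_CHECK_ENSURE SM_CHECK_ALL
#endif
#ifndef SM_CHECK_ASSERT
# define SM_CHECK_ASSERT SM_CHECK_ALL
#endif

/* Report which kinds of check are enabled, as a bit mask (demo only). */
int
demo_enabled_checks(void)
{
    int mask = 0;

#if SM_CHECK_REQUIRE
    mask |= 1;    /* "require" checks are compiled in */
#endif
#if SM_CHECK_ENSURE
    mask |= 2;    /* "ensure" checks are compiled in */
#endif
#if SM_CHECK_ASSERT
    mask |= 4;    /* "assert" checks are compiled in */
#endif
    return mask;
}
```

    With the simulated flags, only the "require" bit is set, so
    demo_enabled_checks() returns 1.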

<h2> How To Write Complex or Expensive Assertions </h2>

    Sometimes an assertion check requires more code than a simple
    boolean expression.
    For example, it might require an entire statement block
    with its own local variables.
    You can code such assertion checks by making them conditional on
    SM_CHECK_REQUIRE, SM_CHECK_ENSURE or SM_CHECK_ASSERT,
    and using sm_abort to signal failure.
<p>
    Sometimes an assertion check is significantly more expensive
    than one or two comparisons.
    In such cases, it is not uncommon for developers to comment out
    the assertion once the code is unit tested.
    Please don't do this: it makes it hard to turn the assertion
    check back on for the purposes of regression testing.
    What you should do instead is make the assertion check conditional
    on one of these predefined debug objects:
<blockquote>
	SmExpensiveRequire<br>
	SmExpensiveAssert<br>
	SmExpensiveEnsure
</blockquote>
    By doing this, you bring the cost of the assertion checking code
    back down to a single comparison, unless expensive assertion checking
    has been explicitly enabled.
    By the way, the corresponding debug category names are
<blockquote>
	sm_check_require<br>
	sm_check_assert<br>
	sm_check_ensure
</blockquote>
    What activation level should you check for?
    Higher levels correspond to more expensive assertion checks.
    Here are some basic guidelines:
<blockquote>
	level 1: &lt; 10 basic C operations<br>
	level 2: &lt; 100 basic C operations<br>
	level 3: &lt; 1000 basic C operations<br>
	...
</blockquote>

<p>
    Here's a contrived example of both techniques:
<blockquote><pre>
void
w_munge(WIDGET *w)
{
    SM_REQUIRE(w != NULL);
#if SM_CHECK_REQUIRE
    /*
    **  We run this check at level 3 because we expect to check a few hundred
    **  table entries.
    */

    if (sm_debug_active(&amp;SmExpensiveRequire, 3))
    {
        int i;

        for (i = 0; i &lt; WIDGET_MAX; ++i)
        {
            if (w[i] == NULL)
                sm_abort("w_munge: NULL entry %d in widget table", i);
        }
    }
#endif /* SM_CHECK_REQUIRE */

    /* ... the rest of w_munge ... */
}
</pre></blockquote>
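<p>
    The same run time gating can be tried without libsm.  In the sketch
    below, DEMO_DEBUG_T and demo_debug_active are hypothetical stand-ins
    for a debug object and sm_debug_active; the assumption is that a
    debug object is active at a given level when its activation level is
    at least that level.  The compile time guard is left out to keep
    the sketch short.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a debug object: just an activation level. */
typedef struct
{
    int level;
} DEMO_DEBUG_T;

DEMO_DEBUG_T DemoExpensiveRequire = { 0 };    /* 0: expensive checks off */

/* Assumed semantics: active when the object's level is >= 'level'. */
static int
demo_debug_active(const DEMO_DEBUG_T *debug, int level)
{
    return debug->level >= level;
}

/* The O(n) precondition of an O(log n) routine: is v[0..n-1] sorted? */
int
demo_is_sorted(const int *v, int n)
{
    int i;

    for (i = 1; i < n; ++i)
    {
        if (v[i - 1] > v[i])
            return 0;
    }
    return 1;
}

/* Binary search: returns the index of key in v, or -1 if absent. */
int
demo_find(const int *v, int n, int key)
{
    int lo = 0, hi = n - 1;

    /* Costs one comparison unless checking is enabled at level 3. */
    if (demo_debug_active(&DemoExpensiveRequire, 3) &&
        !demo_is_sorted(v, n))
    {
        fprintf(stderr, "demo_find: table is not sorted\n");
        abort();
    }

    while (lo <= hi)
    {
        int mid = lo + (hi - lo) / 2;

        if (v[mid] == key)
            return mid;
        if (v[mid] < key)
            lo = mid + 1;
        else
            hi = mid - 1;
    }
    return -1;
}
```

    Setting DemoExpensiveRequire.level to 3 or higher turns the sorted
    check on; at the default level of 0 the search pays only for the
    level comparison.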

<h2> Other Guidelines </h2>

    You should resist the urge to write SM_ASSERT(0) when the code has
    reached an impossible place.  It's better to call sm_abort, because
    then you can generate a better error message.  For example,
<blockquote><pre>
switch (foo)
{
    ...
  default:
    sm_abort("impossible value %d for foo", foo);
}
</pre></blockquote>
    Note that the default clause of the switch statement is not guarded
    with #if SM_CHECK_ASSERT ... #endif, because there is
    probably no performance gain to be had by disabling this particular check.
<p>
    Avoid including code that has side effects inside of assert macros,
    or inside of SM_CHECK_* guards.  You don't want the program to stop
    working if assertion checking is disabled.
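<p>
    The side effect problem is easy to reproduce.  The sketch below
    simulates a build with checking disabled; DEMO_ASSERT,
    DEMO_CHECK_ASSERT and the demo_ functions are hypothetical names
    used only for this illustration.

```c
#include <stdlib.h>

#define DEMO_CHECK_ASSERT 0    /* simulate a build with checking disabled */

#if DEMO_CHECK_ASSERT
# define DEMO_ASSERT(cond) ((cond) ? (void) 0 : abort())
#else
# define DEMO_ASSERT(cond) ((void) 0)
#endif

static int DemoNextId = 0;

/* Has a side effect: every call consumes an id. */
static int
demo_take_id(void)
{
    return ++DemoNextId;
}

/* Wrong: the call disappears when checking is disabled. */
int
demo_bad(void)
{
    DEMO_ASSERT(demo_take_id() > 0);
    return DemoNextId;
}

/* Right: do the work unconditionally, then assert on the result. */
int
demo_good(void)
{
    int id = demo_take_id();

    DEMO_ASSERT(id > 0);
    return id;
}
```

    With checking disabled, demo_bad never takes an id, while
    demo_good behaves the same whether or not the check is compiled in.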

<h2> Rationale for Logic Bug Handling </h2>

    When a logic bug is detected, our philosophy is to log an error message
    and terminate the program, dumping core if possible.
    It is not a good idea to raise an exception, attempt cleanup,
    or continue program execution.  Here's why.
<p>
    First of all, to facilitate post-mortem analysis, we want to dump core
    on detecting a logic bug, disturbing the process image as little as
    possible before dumping core.  We don't want to raise an exception
    and unwind the stack, executing cleanup code, before dumping core,
    because that would obliterate information we need to analyze the cause
    of the abort.
<p>
    Second, it is a bad idea to raise an exception on an assertion failure
    because this places unacceptable restrictions on code that uses
    the assertion macros.
    The reason is this: the sendmail code must be written so that
    anywhere it is possible for an exception to be raised, the code
    will catch the exception and clean up if necessary, restoring
    data structure invariants and freeing resources as required.
    If an assertion failure were signalled by raising an exception,
    then every time you added an assertion, you would need to check
    both the function containing the assertion and its callers to see
    if any exception handling code needed to be added to clean up properly
    on assertion failure.  That is far too great a burden.
<p>
    It is a bad idea to attempt cleanup upon detecting a logic bug
    for several reasons:
<ul>
<li>If you need to perform cleanup actions in order to preserve the
    integrity of the data that the program is handling, then the
    program is not fault tolerant, and needs to be redesigned.
    There are several reasons why a program might be terminated unexpectedly:
    the system might crash, the program might receive a signal 9 (SIGKILL),
    the program might be terminated by a memory fault (possibly as a
    side effect of earlier data structure corruption), and the program
    might detect a logic bug and terminate itself.  Note that executing
    cleanup actions is not feasible in most of the above cases.
    If the program has a fault tolerant design, then it will not lose
    data even if the system crashes in the middle of an operation.
<p>
<li>If the cause of the logic bug is earlier data structure corruption,
    then cleanup actions intended to preserve the integrity of the data
    that the program is handling might cause more harm than good: they
    might cause information to be corrupted or lost.
<p>
<li>If the program uses threads, then cleanup is much more problematic.
    Suppose that thread A is holding some locks, and is in the middle of
    modifying a shared data structure.  The locks are needed because the
    data structure is currently in an inconsistent state.  At this point,
    a logic bug is detected deep in a library routine called by A.
    How do we get all of the running threads to stop what they are doing
    and perform their thread-specific cleanup actions before terminating?
    We may not be able to get another thread, B, to clean up and terminate
    cleanly until A has restored the invariants on the data structure it
    is modifying and released its locks.  So, we would raise an exception
    and unwind the stack,
    restoring data structure invariants and releasing locks at each level
    of abstraction, and performing an orderly shutdown.  There are certainly
    many classes of error conditions for which using the exception mechanism
    to perform an orderly shutdown is appropriate and feasible, but there
    are also classes of error conditions for which exception handling and
    orderly shutdown are dangerous or impossible.  The abnormal program
    termination system is intended for this second class of error conditions.
    If you want to trigger orderly shutdown, don't call sm_abort:
    raise an exception instead.
</ul>
<p>
    Here is a strategy for making sendmail fault tolerant.
    Sendmail is structured as a collection of processes.  The "root" process
    does as little as possible, except spawn children to do all of the real
    work, monitor the children, and act as traffic cop.
    We use exceptions to signal expected but infrequent error conditions,
    so that the process encountering the exceptional condition can clean up
    and keep going.  (Worker processes are intended to be long lived, in
    order to minimize forking and increase performance.)  But when a bug
    is detected in a sendmail worker process, the worker process does minimal
    or no cleanup and then dies.  A bug might be detected in several ways:
    the process might dereference a NULL pointer, receive a signal 11
    (SIGSEGV), core dump and die, or an assertion might fail, in which case
    the process commits suicide.  Either way, the root process detects the
    death of the worker, logs the event, and spawns another worker.
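<p>
    The supervision step of this strategy can be sketched with the
    POSIX calls fork and waitpid.  This is an illustration only, not
    sendmail code: demo_run_worker and the two worker functions are
    hypothetical, and a real root process would log the event and
    spawn a replacement in a loop.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
**  Run 'work' in a child process and wait for it.  Returns 0 if the
**  worker exited on its own, the terminating signal number if it was
**  killed by a signal (SIGABRT after abort()), or -1 on error.
*/

int
demo_run_worker(void (*work)(void))
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
    {
        /* Child: do the work, then leave without running atexit code. */
        (*work)();
        _exit(0);
    }

    /* Parent ("root"): detect how the worker ended. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    return 0;
}

/* A worker that detects a logic bug: no cleanup, just abort(). */
void
demo_buggy_worker(void)
{
    abort();
}

/* A worker that finishes its job normally. */
void
demo_good_worker(void)
{
}
```

    Here demo_run_worker(demo_buggy_worker) returns SIGABRT, which is
    the point at which a root process would log the death and start
    another worker.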

<h2> Rationale for Naming Conventions </h2>

    The names "require" and "ensure" come from the writings of Bertrand Meyer,
    a prominent evangelist for assertion checking who has written a number of
    papers about the "Design By Contract" programming methodology,
    and who created the Eiffel programming language.
    Many other assertion checking packages for C also have "require" and
    "ensure" assertion types.  In short, we are conforming to a de facto
    standard.
<p>
    We use the names <tt>SM_REQUIRE</tt>, <tt>SM_ASSERT</tt>
    and <tt>SM_ENSURE</tt> in preference to <tt>REQUIRE</tt>,
    <tt>ASSERT</tt> and <tt>ENSURE</tt> because at least two other
    open source libraries (libisc and libnana) define <tt>REQUIRE</tt>
    and <tt>ENSURE</tt> macros, and many libraries define <tt>ASSERT</tt>.
    We want to avoid name conflicts with other libraries.

</body>
</html>