Finding defects using Holzmann's "Power of 10" rules for writing safety critical code
In safety-critical applications, bugs in software are not just costly
distractions—they can put lives at risk. Consequently, safety-critical
software developers go to great lengths to detect and fix bugs before
they can make it into fielded systems.
Although there are some well-known cases where software defects have
caused disastrous failures, the overall record is good. If the
software controlling medical devices or flight-control systems were as
buggy as most software, the headlines would be truly dreadful.
The methods that safety-critical developers use are undeniably
effective at reducing risk, so there are lessons to be learned for
developers who do not write safety-critical code. Two techniques stand
out as most responsible for that record: advanced static analysis and
rigorous testing.
Static analysis tools have been used for decades. Their appeal is
clear: they can find problems in software without actually executing
it. This contrasts with dynamic analysis techniques (i.e. traditional
testing), which usually rely on running the code against a large set of
test cases. The first generation of static-analysis tools, of which
lint is the most widely known example, was quite limited in capability
and suffered from serious usability problems.
More recently, however, a new generation of advanced static-analysis
tools has emerged. These are capable of finding serious software errors
such as buffer overruns, race conditions, null pointer dereferences, and
resource leaks.
They can also find subtle inconsistencies such as redundant
conditions, useless assignments and unreachable code. These correlate
well with real bugs as they often indicate that the programmer
misunderstood some important aspect of the code.
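As a small illustration of how such an inconsistency can point at a real bug, consider the sketch below. The function names and the 0 to 100 range are hypothetical, chosen only for this example. The first version contains a condition that is true for every possible input, which is the kind of redundancy an analyzer that tracks value ranges could report; the second is what the programmer presumably meant.

```c
#include <stdbool.h>

/* Hypothetical example: is a percentage within 0..100? */

/* Suspect version: every int is either >= 0 or <= 100, so this
 * condition is always true and the check rejects nothing. */
bool percent_ok_suspect(int v)
{
    return v >= 0 || v <= 100;   /* '||' was almost certainly meant to be '&&' */
}

/* Intended version: both bounds must hold. */
bool percent_ok(int v)
{
    return v >= 0 && v <= 100;
}
```

The suspect version compiles and appears to work for every valid input, so a test suite made up only of valid percentages would never expose it. The always-true condition is the visible symptom of the misunderstanding.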
The tenth rule
Using advanced static analysis tools is quickly becoming best practice:
rule ten of "Ten Rules for Writing Safety Critical Code", by Gerard
Holzmann of NASA/JPL, specifies that advanced static analysis tools
should be used aggressively throughout the development process.
The other important technique is systematic testing. The importance
of highly rigorous testing has been recognized by some regulatory
agencies. For flight-control software, the Federal Aviation
Administration is very specific about the level of testing required.
The developer must demonstrate that they have test cases that
achieve full coverage of the code. Developing such test cases can be
very expensive. Advanced static-analysis tools can help reduce this
cost by pointing out parts of the code that make it difficult or even
impossible to achieve full coverage.
Benefits of advanced static analysis
Testing has traditionally been the most effective way to find defects
in code. The best test cases feed in as many combinations of inputs and
conditions as possible such that all parts of the code are exercised
thoroughly. Statement coverage tools can help you develop a test suite
that makes sure that every line of code is executed at least once.
But as all programmers know, just because a statement executes
correctly once does not mean it will always do so—it may trigger an
error only under a very unusual set of circumstances. There are tools
that will measure condition coverage and even path coverage, and these
are all helpful for exercising these corner cases, but achieving full
coverage for non-trivial programs is extraordinarily time-consuming and
expensive.
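The gap between statement coverage and correctness can be seen in a sketch like the following. The helper names and the calibration-range scenario are assumptions made up for this illustration. A single test such as to_percent_unsafe(5, 0, 10) executes every statement of the first function, yet the function still divides by zero whenever hi equals lo.

```c
#include <stddef.h>

/* Hypothetical helper: map a raw reading onto 0..100 for a calibration
 * range [lo, hi]. One ordinary test gives 100% statement coverage, but
 * the division faults when hi == lo. */
int to_percent_unsafe(int raw, int lo, int hi)
{
    return (raw - lo) * 100 / (hi - lo);
}

/* Guarded variant: rejects an empty or inverted range and out-of-range
 * readings, and does the arithmetic in a wider type so the intermediate
 * values cannot overflow. Returns 0 on success, -1 on invalid input. */
int to_percent_checked(int raw, int lo, int hi, int *out)
{
    if (out == NULL || lo >= hi || raw < lo || raw > hi) {
        return -1;
    }
    long long span = (long long)hi - lo;
    long long offset = (long long)raw - lo;
    *out = (int)(offset * 100 / span);
    return 0;
}
```

A test suite would need a case with hi equal to lo to reveal the fault in the first version, and nothing in a statement-coverage report suggests that such a case is missing.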
This is where advanced static analysis shines. The tools examine
paths and consider conditions and program states in the abstract. By
doing so, they can achieve much higher coverage of your code than is
usually feasible with testing. Best of all, they do all this without
requiring you to write any test cases.
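To give a rough sense of what considering program states "in the abstract" can mean, here is a toy sketch of one well-known abstraction, the interval domain, in which a variable is represented by the range of values it might hold rather than by any single value. This is not a description of how any particular tool works; the type and function names are hypothetical, and a real analyzer would use much richer abstractions and handle arithmetic overflow.

```c
#include <stdbool.h>

/* Abstract value: the variable may hold any integer in [lo, hi]. */
typedef struct {
    long lo;
    long hi;
} interval;

static long min_l(long a, long b) { return a < b ? a : b; }
static long max_l(long a, long b) { return a > b ? a : b; }

/* An empty interval means no value is possible, i.e. the program
 * point being analyzed cannot be reached. */
bool iv_is_empty(interval a)
{
    return a.lo > a.hi;
}

/* Merge the states arriving from two branches of the program. */
interval iv_join(interval a, interval b)
{
    if (iv_is_empty(a)) return b;
    if (iv_is_empty(b)) return a;
    interval r = { min_l(a.lo, b.lo), max_l(a.hi, b.hi) };
    return r;
}

/* Refine the state on the true branch of a test "x < k". */
interval iv_assume_lt(interval a, long k)
{
    interval r = { a.lo, min_l(a.hi, k - 1) };
    return r;
}

/* Is every possible index value within an array of len elements? */
bool iv_index_safe(interval idx, long len)
{
    return idx.lo >= 0 && idx.hi < len;
}
```

Suppose an index i is known only to lie in [0, 9] and is used to access an array of 8 elements. iv_index_safe reports a possible overrun for the unguarded access, but after iv_assume_lt(i, 8) models a guard "if (i < 8)", the refined interval [0, 7] is accepted. One abstract check stands in for all ten concrete inputs, which is how this style of analysis covers cases that would each need a separate test.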
This is the most significant way in which static analysis reduces
the cost of testing. The cheapest bug is the one you find earliest.
Because static analysis is a compile-time process, it can find bugs
before you even finish writing the program. This is usually less
expensive than if you have to find them by writing a test case or
debugging a crash. This article describes how these tools work and
then shows how they can also reduce the cost of creating test cases.