De-bugging Strategies.
>>>>>>>> Proper Preparation
Prevents Piss-Poor Performance. <<<<<<<<
This lesson is really an essay about how to go about
writing programs.
I know that by far the best way to greatly reduce
the amount of effort required to get a program going
properly is to avoid making mistakes in the first place!
Now this might seem to be stating the absolute obvious,
and it is, but after looking at many programs it would
seem that there is a very definite need to say it.
So how does one go about reducing the probability
of making mistakes?
There are many strategies, and over the years I have
evolved my own set.
I have found that some of the most important are:
1) Document what you are going to do before - yes, BEFORE -
you write any code.
Set up the source files for the section of the program
you are going to write and put some lines of explanation
as to what you intend to do in this file. Be as precise
as you can, but don't go into the detail of explaining
in English, or your First Language, exactly what every
statement does.
2) Make sure that you keep each file as small as is
sensible. Some program authors say that one should put
only one function in a file. It's my personal opinion
that this is going a little bit over the top, but certainly
you should not have more than one logical activity in
a source file. It's easier to find a needle in a tiny
haystack than in a big one!
3) Always use names for the objects in your program
which are fully descriptive, or at the very least are
meaningful mnemonics. Put yourself in the position of
some poor soul who - a couple of years later, after
you have long finished with the project, and left the
country - has been given the task of adding a small
feature to your exquisite program. Now, in the
rush to get your masterpiece finished, you decided to
use variable names like "a4" and "isb51"
simply so that you could get the line typed a fraction
of a second faster than if you had used something like
"customer_address[POST_CODE]" and
"input_status_block[LOW_FUEL_TANK_3]".
The difference in ease of understanding is obvious,
isn't it? However, judging by some programs which I have
seen published both in magazines and in public domain
program sources, the point has still to be made.
4) ALWAYS take great care with the layout of your
code. It's my opinion that the opening brace of ALL
program structures should be on a new line. Also, if
you put them in the leftmost column for structs, enums,
and initialised tables, as well as functions, then the
'find function' keystrokes ( "[[" and "]]" ) in vi will
find them as well as the functions themselves.
Make sure you have the "showmatch" facility
in vi turned on. ( And watch the cursor jump when you
enter the right hand brace, bracket, or parenthesis. )
5) Try as hard as you can to have as few global variables
as possible. Some people say never have any globals.
This is perhaps a bit too severe but global variables
are a clearly documented source of programming errors.
If it's impossible to perform a logical activity in
an efficient way without having a global or two, then
confine the scope of the globals to just the one file
by marking the defining declaration "static".
This stops the compiler producing a symbol which
the linking loader will make available to all the files
in your source.
6) Never EVER put 'magic numbers' in your source code.
Always define constants in a header file with #define
lines or enum statements.
Here is an example:-
/* ----------------------------------------- */
#include <stdio.h>

enum status_input_names
{
  radiator_temperature,
  oil_temperature,
  fuel_pressure,
  energy_output,
  revolutions_per_minute
};

char *stats[] =
{
  "radiator_temperature",
  "oil_temperature",
  "fuel_pressure",
  "energy_output",
  "revolutions_per_minute"
};

/* Number of entries in stats[], worked out by the compiler. */
#define NUMBER_OF_INPUTS ( (int) ( sizeof ( stats ) / sizeof ( stats[0] ) ) )

int main ( void )
{
  enum status_input_names name;

  printf ( "Number of Inputs is: %d\n", NUMBER_OF_INPUTS );
  for ( name = radiator_temperature; name < NUMBER_OF_INPUTS; name++ )
  {
    printf ( "\n%s", stats[ name ] );
  }
  printf ( "\n\n" );
  return ( 0 );
}
/* ----------------------------------------- */
Note that as a side effect we have available the meaningful
symbols radiator_temperature etc. as indices into the
array of status input names, and the symbol NUMBER_OF_INPUTS
available for use as a terminator in the 'for' loop.
This is quite legal because sizeof, although it looks like a
function, is an operator whose value here is worked out at the
time of compilation and not when the program is executed. This
means that the result of the division in the macro is calculated
at the time of compilation and this result is used as a literal
in the 'for' loop. No division takes place each time the loop
is executed.
To illustrate the point I would like to tell you a
little story which is fictitious, but which has a ring
of truth about it. Your employer has just landed what
seems to be a lucrative contract with an inventor of a
completely new type of engine. We are assured that after
initial proving trials one of the larger Japanese motor
manufacturers is going to come across with umpteen millions
to complete the development
of the design. You are told to write a program which
has to be a simple and straightforward exercise in order
to do the job as cheaply as possible.
Now, the customer - a somewhat impulsive type - realises
that his engine is not being monitored closely enough
when it starts to rapidly disassemble itself under
high speed and heavy load. You have to add a few extra
parameters to the monitoring program by yesterday morning!
You just add the extra parameters into the enum and the
array of pointers to the character strings. So:
enum status_input_names
{
  radiator_temperature,
  radiator_pressure,
  fuel_temperature,
  fuel_pressure,
  oil_temperature,
  oil_pressure,
  exhaust_manifold_temperature
};
Let's continue the story about the Japanese purchase.
Mr. Honda ( jun ) has come across with the money and
the result is that you are now a team leader in the
software section of Honda Software ( YourCountry ) Ltd.
The project of which you are now leader is to completely
rewrite your monitoring program and add a whole lot
of extra channels as well as to make the printouts much
more readable so that your cheap, cheerful, and aesthetic-free
program can be sold as the "Ultimate Engine Monitoring Package"
from the now world famous Honda Real-time Software Systems.
You set to work. Honda et al. imagine that there is
going to be a complete redesign of the software at a
cost of many million Yen. You, being an ingenious type,
have written the code so that it is easy to enhance.
The new features required are that the printouts have
to be printed with the units of measure appended to
the values, which have to be scaled and processed so that
the number printed is a real physical value instead
of the previous arrangement where the raw transducer
output was just dumped onto a screen.
What do you have to do?
Thinking along the line of "Get the Data arranged
correctly first", you take your old code and expand
it so that all the items of information required for
each channel are collected into a struct.
enum status_input_names
{
  radiator_temperature,
  radiator_pressure,
  fuel_temperature,
  fuel_pressure,
  oil_temperature,
  oil_pressure,
  exhaust_manifold_temperature,
  power_output,
  torque
};

typedef struct channel
{
  char *name;            /* Channel Name to be displayed on screen. */
  int nx;                /* position of name on screen x co-ordinate. */
  int ny;                /* ditto for y */
  int unit_of_measure;   /* index into units of measure array */
  char value;            /* raw datum value from 8 bit ADC */
  char lower_limit;      /* For alarms. */
  char upper_limit;
  float processed_value; /* The number to go on screen. */
  float offset;
  float scale_factor;
  int vx;                /* Position of value on screen. */
  int vy;
} CHANNEL;

enum units_of_measure { kPa, degC, kW, rpm, Volts, Amps, Newtons };

char *units { "kPa", "degC", "kW", "rpm", "Volts", "Amps", "Newtons" };

CHANNEL data [] =
{
  { "radiator temperature",
  { "radiator pressure",
  { "fuel temperature",
  { "fuel pressure",
  { "oil temperature",
  { "oil pressure",
  { "exhaust manifold temperature",
  { "power output",
  { "torque",
};

#define NUMBER_OF_INPUTS sizeof (data ) / sizeof ( data[0] )
Now the lesson preparation is to find the single little
bug in the above program fragment, to finish the initialisation
of the data array of type CHANNEL and to have a bit
of a crack at creating a screen layout program to display
its contents. Hint: Use printf(); ( Leave all the values
which originate from the real world as zero. )
Here are some more tips for young players.
1) Don't get confused between the logical equality operator,
  ==
and the assignment operator,
  =
This is probably the most frequent mistake made by
'C' beginners, and has the great disadvantage that,
under most circumstances, the compiler will quite happily
accept your mistake.
2) Make sure that you are aware of the difference
between the logical and bit operators.
  &&  This is the logical AND function.
  ||  This is the logical OR function.
      The result is ALWAYS either a 0 or a 1.
  &   This is the bitwise AND function used for masks etc.
  |   This is the bitwise OR function.
      The result is expressed in all the bits of the word.
3) Similarly to 2), be aware of the difference between
the logical complementation and the bitwise one's complement
operators.
  !   This is the logical NOT operator.
  ~   This is the bitwise one's complement operator.
Some further explanation is required. In deference
to machine efficiency a LOGICAL variable is said to be
true when it is non-zero. So let's set a variable to be TRUE.
  00000000000000000000000000000001   A word representing TRUE.
Now let's do a logical NOT !.
  00000000000000000000000000000000   There is an all-zero word, a FALSE.
  00000000000000000000000000000001   That word again. TRUE.
Now for a bitwise complement ~.
  11111111111111111111111111111110   Now look, we've got a word which
                                     is non-zero, still TRUE.
Is this what you intended?
4) It is very easy to fall into the hole of getting
the '{' & '}'; '[' & ']'; '(' & ')' symbol
pairs all messed up, so that the computer thinks that the
block structure is quite different from that
which you intend. Make sure that you use an editor which
tells you the matching symbol. The UNIX editor vi does
this provided that you turn on the option. Also take
great care with your layout so that the block structure
is absolutely obvious, and whatever style you choose
do take care to stick by it throughout the whole of the
project.
A personal layout paradigm is like this:
Example 1.
function_type function_name ( a, b )
type a;
type b;
{
  type variable_one, variable_two, return_value;

  if ( logical_expression )
  {
    variable_one = A_DEFINED_CONSTANT;
    if ( !( return_value = some_function_or_other ( a,
                                                    variable_one,
                                                    &variable_two
                                                  )
          )
       )
    {
      error ( "function_name" );
      exit ( FAILURE );
    }
    else
    {
      return ( return_value + variable_two );
    }
  }   /* End of "if ( logical_expression )" block */
}     /* End of function */
This layout is easy to do using vi with this initialisation
script in either the environment variable EXINIT or the file
${HOME}/.exrc:-
set showmode autoindent autowrite tabstop=2 shiftwidth=2 showmatch wm=1
Example 2.
void printUandG()
{
  char *format =
"\n\
User is: %s\n\
Group is: %s\n\n\
Effective User is: %s\n\
Effective Group is: %s\n\n";

  ( void ) fprintf ( tty,
                     format,
                     passwd_p->pw_name,
                     group_p->gr_name,
                     epasswd_p->pw_name,
                     egroup_p->gr_name
                   );
}
Notice how it is possible to split up format statements
with a '\' as the last character on the line, and that
it is convenient to arrange for a nice output format
without having to count the field widths. Note, however,
that when using this technique the '\' character MUST be
the VERY LAST one on the line. Not even a space may follow it!
In summary I *ALWAYS* put the opening brace on a new
line and set the tabs so that the indentation is just two
spaces ( use more and you very quickly run out of "line",
especially on an eighty column screen ). If a statement
is too long to fit on a line I break the line up with
the arguments set out one to a line, and I then apply the
indentation rule to the parentheses "()" as well; see
Example 2 above. Probably as a hang-over
from a particular pretty printing program which reset
the indentation position after the printing of the closing
brace "}", I am in the habit of doing it as
well. Long "if" and "for" statements
get broken up in the same way. This is an example of
it all. The fragment of code is taken from a curses
oriented data input function.
/*
** Put all the cursor positions to zero.
*/
for ( i = 0;
      s[i].element_name != ( char *) NULL &&
      s[i].element_value != ( char *) NULL;
      i = ( s[i].dependent_function == NULL )
          ? s[i].next : s[i].dependent_next
    )
{   /* Note that it is the brace and NOT the */
    /* "for" which moves the indentation level. */
  s[i].cursor_position = 0;
}

/*
** Go to start of list and hop over any constants.
*/
for ( i = edit_mode = current_element = 0;
      s[i].element_value == ( char *) NULL ;
      current_element = i = s[i].next
    ) continue;   /* Note EMPTY statement. */

/*
** Loop through the elements, stopping at end of table marker,
** which is an element with neither a pointer to an element_name nor
** one to an element_value.
*/
while ( s[i].element_name != ( char *) NULL &&
        s[i].element_value != ( char *) NULL
      )
{
  int c;   /* Variable which holds the character from the keyboard. */

  /*
  ** Et Cetera for many lines.
  */
}
Note the commenting style. The lefthand comments provide
a general overview of what is happening and the righthand
ones a more detailed view. The double stars make a good
marker so it is easy to separate the code and the comments
at a glance.
The null statement.
You should be aware that the ";" on its
own is translated by the compiler as a no-operation
statement. The usefulness of this is that you can do
little things, such as counting up a list of objects,
or positioning a pointer, entirely within a "for"
or "while" statement. ( See example above. )
There is, as always, a flip side. It is HORRIBLY
EASY to put a ";" at the end of the line after the
closing right parenthesis - after all you do just that
for function calls! The suggestion is both to mark
deliberate null statements with a comment and to use
the statement "continue;". Using the assert macro will
pick up these errors at run time.
The assert macro.
Refer to the Programmers Reference Manual section
3X and find the documentation on this most useful tool.
As usual an example is by far the best way to explain it.
/* ----------------------------------------- */
#ident "@(#) assert-demo.c"
#include <stdio.h>
#include <assert.h>

#define TOP_ROW 10
#define TOP_COL 10

int main ( void )
{
  int row, col;

  for ( row = 1; row <= TOP_ROW; row++);
  {
    assert ( row <= TOP_ROW );
    for ( col = 1; col <= TOP_COL; col++ )
    {
      assert ( col <= TOP_COL );
      printf ( "%4d", row * col );
    }
    printf ( "\n" );
  }
  return ( 0 );
}
/* ----------------------------------------- */
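Which produces the output:-
Assertion failed: row <= TOP_ROW , file assert-demo.c, line 15
ABORT instruction (core dumped)
It does this because the stray ";" after the first "for"
makes the loop body a null statement, so by the time the
block which follows is reached the variable "row" has been
incremented to one greater than the value of TOP_ROW.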
Note two things:
1) The sense of the logical condition. The assertion fails,
and the program is aborted, as soon as the result of the
logical condition is FALSE.
Have a look at the file /usr/include/assert.
Where is the ";" being used as an empty program statement?
2) The unix operating system has dumped out an image
of the executing program for examination using a symbolic
debugger. Have a play with "sdb" in preparation for the
lesson which deals with it in more detail.
Let's remove the errant semi-colon, re-compile and
re-run the program.
   1   2   3   4   5   6   7   8   9  10
   2   4   6   8  10  12  14  16  18  20
   3   6   9  12  15  18  21  24  27  30
   4   8  12  16  20  24  28  32  36  40
   5  10  15  20  25  30  35  40  45  50
   6  12  18  24  30  36  42  48  54  60
   7  14  21  28  35  42  49  56  63  70
   8  16  24  32  40  48  56  64  72  80
   9  18  27  36  45  54  63  72  81  90
  10  20  30  40  50  60  70  80  90 100
Here's the ten times multiplication table, for you
to give to the nearest primary-school child!
While I would agree that it is not possible to compare the
value of a program layout with a real work of fine art
such as a John Constable painting or a Michelangelo
statue, I do think a well laid out and literate example
of programming is not only much easier to read and understand,
but also has a certain aesthetic appeal.