CCF: a Conditional Compilation Facility

It may seem hard to imagine programming without #define,...,#ifdef etc, but, in fact, Java really does not need these constructs.

David Flanagan: Java in a Nutshell

The Glossy Brochure

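As you may be aware, C compilers come with a preprocessor, allegedly to aid portability. The preprocessor has three distinct capabilities:

file inclusion (#include),
macro substitution (#define), and
conditional compilation (#if, #ifdef and friends).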

This third function is typically used in code like:

      #if defined (TESTING)
      #  if defined (__unix)
      ....   (unix test code)
      #  elif defined (__MSDOS__)
      ....   (dos test code)
      #  endif
      #endif
    

Of course, this can rapidly lead to a rats' nest of conditionals within conditionals. But, fortunately, most languages have no such facilities.

Until now.

Yes, with the arrival of CCF you can bring the full illegibility of the #ifdef to your code.


The History

I suppose I just got fed up of trying to maintain various versions of my configuration files. It got to the point where I'd spend longer setting up and commenting out ready for the next test than I'd spend on the tests themselves.

I'd recently read (in the May 1996 edition of Dr. Dobb's Journal) about the proposed "conditional compilation facility" for Fortran, which seemed to do more-or-less what I wanted. After a bit of thought, I threw together something that wasn't quite so heavily tied to Fortran, adding a few features along the way. It was further refined in light of certain comments by a hand-picked audience (basically: "it would be a bit less error-prone if ..."), and a few years of daily use.


The Overview

Although I kept the name CCF, in fact it's nothing to do with compiling. The basic job is to look through a set of files and decide which lines to comment out and which to keep. For instance, part of my AUTOEXEC.BAT looks like this:

      rem ##  CCF:init
      ... (other stuff) ...
      rem ## if networking
      path c:\Dos;c:\Pctcp
      set pctcp=c:\Pctcp\Pctcp.ini
      vxdinit.exe -i -s
      rem ## else
      rem ##! path c:\Dos
      rem ## endif
    

If I ask CCF to process this file to disable networking it would end up as:

      rem ##  CCF:init
      ... (other stuff) ...
      rem ## if networking
      rem ##! path c:\Dos;c:\Pctcp
      rem ##! set pctcp=c:\Pctcp\Pctcp.ini
      rem ##! vxdinit.exe -i -s
      rem ## else
      path c:\Dos
      rem ## endif
    

The lines relating to the network have been commented out, and the alternative PATH activated. Obviously, if I were to enable networking again the comment markers would be removed and things would be back to normal.

Notice that no lines are added or deleted by CCF; they're just shuffled from side to side a little. This means that any error messages you might get from a compiler will relate properly to the source file's line numbers. It also means you don't need to worry about maintaining "CCF-able" and "CCF-ed" versions of the file. A single copy serves for both.
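The side-to-side shuffle can be sketched in a few lines of Python. This is only an illustration, not CCF's actual code: the function name toggle, the marker passed in as an argument, and the restriction to un-nested "if name" / "else" / "endif" are all assumptions made for brevity, and the sketch discards indentation.

```python
def toggle(lines, marker, settings):
    """Comment out or restore lines between if/else/endif directives.

    Hypothetical sketch: handles only un-nested 'if name', 'else', 'endif',
    and strips leading whitespace from every line.
    """
    hide = marker + "!"          # prefix placed on lines that are commented out
    active = True                # are lines currently passed through?
    in_block = False             # are we between an 'if' and its 'endif'?
    out = []
    for line in lines:
        text = line.strip()
        if text.startswith(marker) and not text.startswith(hide):
            words = text[len(marker):].split()
            if words[:1] == ["if"]:
                # An unset variable raises KeyError, mirroring "it is an error".
                in_block, active = True, settings[words[1]] > 0
            elif words[:1] == ["else"]:
                active = not active
            elif words[:1] == ["endif"]:
                in_block, active = False, True
            out.append(text)     # directive lines are never commented out
            continue
        # Strip any earlier hide prefix, then re-apply it if needed.
        if text.startswith(hide):
            text = text[len(hide):].strip()
        if in_block and not active and text:
            text = hide + " " + text
        out.append(text)
    return out
```

Running it twice with opposite settings returns the original text, and the number of lines never changes, which is the property the compiler's line numbers rely on.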


Basic Features

Strange to believe, but CCF knows even less about my files than I do. When it starts out it doesn't even know what a comment looks like. The first thing it does when asked to process a new file is look for a line containing the string "CCF:init". This is assumed to have been commented out, and, preferably, preceded by some identifying string, so CCF comments look different to normal ones. The whole sequence (commenting out characters + CCF identifying string) is taken as the "CCF marker" for the file.

From that point on, any line starting with the CCF marker is fair game.
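One way the marker discovery could work, sketched in Python. The function name find_marker is hypothetical, and taking everything before "CCF:init" (minus surrounding whitespace) as the marker is an assumption; the real program may well be more careful.

```python
def find_marker(lines):
    """Return the CCF marker: the text preceding 'CCF:init' on its line."""
    for line in lines:
        head, found, _ = line.partition("CCF:init")
        if found:
            return head.strip()
    raise ValueError("no CCF:init line found")
```

For the AUTOEXEC.BAT example above this would yield "rem ##", so only lines starting with that string would be treated as CCF commands.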

If, for the sake of discussion, we assume the CCF marker is "%%", the following basic commands are available:

%% set var value

Henceforth the named var is considered to be set to the given value.

%% unset var

Henceforth the given var is considered not to be set. (Using variables that are not set is an error.)

%% if var

If the given var is greater than zero then subsequent lines are passed through (having first been stripped of any %%! they may have picked up earlier). If the var is zero or less, subsequent lines are decorated with a leading %%!, thereby commenting them out.

%% endif

This line terminates the range of the %%if.

Naturally enough, between the %%if and the %%endif you can add the usual collection of %%elsif and %%else lines.


Advanced Features

The above is easily enough to do what I originally wanted, but while I was at it I threw in a collection of other goodies. First, and most important, is that %%if will accept an expression, rather than just a variable. If the expression evaluates to a positive value, it's considered true, with zero or negative values being false. Similarly, %%set can be given an expression, making it much more useful.

In decreasing order of precedence, the following operators are available in expressions:

      -   not             unary minus, logical not
      *   /               multiplication and division
      +   -               addition and subtraction
      <   <=   >   >=     ordering relations
      =   /=              equality relations
      and   or            logical operators

(The relations and logical operators return 1 for true, and 0 for false. They also evaluate everything relevant - unlike, say, C's "&&" and Ada's (rather threatening) "or else".)

Of course, you can override the in-built precedence by judicious use of parentheses.

There is even a pseudo-function you might find handy:

ccf:defined(var) returns 1 if the given variable is set, 0 otherwise
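The precedence rules, the "evaluate everything" behaviour and ccf:defined can be illustrated with a small evaluator in Python. This is a sketch under stated assumptions, not CCF's own parser: the names LEVELS, apply_op and evaluate are hypothetical, values are assumed to be integers (so "/" is integer division), tokens are assumed to be separated by spaces or parentheses, and variable names are looked up in lower case.

```python
import re

LEVELS = [                       # binary operators, loosest binding first
    {"and", "or"},
    {"=", "/="},
    {"<", "<=", ">", ">="},
    {"+", "-"},
    {"*", "/"},
]

def apply_op(op, a, b):
    """Apply one binary operator; relations and logic yield 1 or 0."""
    if op == "+": return a + b
    if op == "-": return a - b
    if op == "*": return a * b
    if op == "/": return a // b           # integer division is an assumption
    if op == "and": return int(a > 0 and b > 0)
    if op == "or": return int(a > 0 or b > 0)
    tests = {"<": a < b, "<=": a <= b, ">": a > b,
             ">=": a >= b, "=": a == b, "/=": a != b}
    return int(tests[op])

def evaluate(text, variables):
    """Evaluate an expression string against a dict of variable values."""
    tokens = re.findall(r"[()]|[^\s()]+", text.lower())
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def binary(level):
        if level == len(LEVELS):
            return unary()
        value = binary(level + 1)
        while peek() in LEVELS[level]:
            op = take()
            # Both operands are always evaluated: no short-circuiting.
            value = apply_op(op, value, binary(level + 1))
        return value

    def unary():
        tok = take()
        if tok == "-":
            return -unary()
        if tok == "not":
            return int(unary() <= 0)
        if tok == "(":
            value = binary(0)
            take()                        # consume the closing ")"
            return value
        if tok == "ccf:defined":
            take()                        # "("
            name = take()
            take()                        # ")"
            return int(name in variables)
        if re.fullmatch(r"-?\d+", tok):
            return int(tok)
        return variables[tok]             # KeyError if the variable is unset

    return binary(0)
```

For example, evaluate("1 + 2 * 3", {}) gives 7 and evaluate("( 1 + 2 ) * 3", {}) gives 9, while evaluate("ccf:defined(unix)", {}) gives 0 rather than failing on the unset name.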

One thing to beware of in these expressions is that, in an attempt to make CCF blend in with its environment, I impose no severe restrictions on what can make up an identifier. In particular, something like "check-passwd" is taken as a single identifier, not an attempted subtraction. Aside from the fact that I actually use names like this in various Scheme files, I'm also too lazy to write a decent lexer. Use spaces liberally and this problem won't bite you too often.
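The effect of such a permissive lexer is easy to see with a tokeniser that splits only on whitespace and parentheses. The function name tokens and the regular expression are assumptions chosen for illustration; CCF's real rules for identifiers may differ in detail.

```python
import re

def tokens(expression):
    """Split on whitespace and parentheses only; hyphens stay inside names."""
    return re.findall(r"[()]|[^\s()]+", expression)

# tokens("check-passwd")    -> ['check-passwd']          (one identifier)
# tokens("check - passwd")  -> ['check', '-', 'passwd']  (a subtraction)
```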

Whilst on the subject of "blending in", there are a few totally unnecessary frills to remove some of the inevitable culture-clash when CCF is embedded in another language's source files:

and it's completely case-independent. Oh, of course, it'll accept either MSDOS or Unix style text files on input, but always writes out the flavour appropriate for the system it's running on.


Yet More Features

There's a default command, which is like a conditional set: if the variable doesn't have a value one is assigned, otherwise it is left unchanged.
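In Python terms, default behaves much like dict.setdefault. The sketch below assumes CCF's variables can be modelled as a plain dictionary; the function name ccf_default is hypothetical.

```python
def ccf_default(variables, name, value):
    """Assign value only if name has no value yet; return the stored value."""
    return variables.setdefault(name, value)
```

Calling it twice for the same name leaves the first value in place, which is what makes it safe to put defaults in a shared header.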

There are four commands just for talking to the screen. The say command will just echo the remainder of the line to the screen. More seriously, warning and error will give a formal reprimand, including the offending filename and line number. The difference between these two is that error will immediately terminate CCF, whereas warning will carry on to the end of the source. (Both will report a "not successful" exit status to the operating system.) As a debugging aid, the report var command will output the current value of the named variable.

Sometimes it's useful to be able to put a comment on a CCF line. Anything after the second CCF marker will be ignored.
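A minimal Python sketch of that rule, assuming the line is already known to start with the marker. The function name strip_comment is hypothetical.

```python
def strip_comment(line, marker):
    """Return the command part of a CCF line, dropping any trailing comment."""
    body = line.strip()[len(marker):]     # text after the leading marker
    command, _, _ = body.partition(marker)
    return command.strip()

# strip_comment("%% set debug 1 %% turn on tracing", "%%") -> 'set debug 1'
```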

All lines that look as though CCF commented them out will be restored to life if they're not covered by a failing if clause. This means you can cut and paste between marked and unmarked areas, and leave CCF to tidy up afterwards. By default CCF will mark any comments it creates with a "!" character, which is supposed to make a nice vertical line for the eye to follow when looking for the if-endif brackets. If this causes problems (because, for example, your language treats exclamation marks specially) you can change it with the hide command.

There are command line options to comment out everything covered by an if, and also to comment out nothing. These are intended mainly to allow different vintages of a file to be put into a "standard form", so diff will talk about your changes, rather than the ones CCF made.

There's a CCF command "settings" which scans the rest of its line on input, looking for variable names, and fills it in with these names and their values on output. This and the date command (which fills the rest of the line with the current date and time) are useful for documenting the CCF options used to generate the current version of the file.

If you indent the CCF marker on an if line, CCF will attempt to similarly indent any comment markers it needs to insert. It can, however, be easily fooled by inconsistent uses of spaces and tabs, so this option isn't as useful as it might be. (If CCF gets confused it errs in the direction of safety, rather than æsthetics, so no damage is done.)

CCF can be asked to scan a file merely for its side effects (by means of set or default commands) prior to actually doing some processing. This allows you to build the equivalent of a "common header file" for a collection of CCFed sources.

The command enum var sets a value for the given variable. The values are chosen in increasing order, and an effort is made to keep them distinct. This is supposed to be useful for simulating an "enumerated type". (It is an error if the var already has a value.)

The enum, report, and unset commands will accept a sequence of names, rather than just the one, with the effect being as you'd expect. In fact, report will even work if you provide it with no names at all, in which case you'll get a complete list of CCF's symbol table.
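The enum command might be modelled as below. This is a guess at the policy, not a description of CCF's internals: the function name ccf_enum is hypothetical, and "one more than the largest value so far, starting at 1" is merely one scheme that is increasing and keeps values distinct.

```python
def ccf_enum(variables, *names):
    """Give each new name the next unused integer; existing names are an error."""
    for name in names:
        if name in variables:
            raise ValueError(name + " already has a value")
        # Assumed policy: one more than the largest value seen so far.
        variables[name] = max(variables.values(), default=0) + 1
```

With an empty table, ccf_enum(v, "red", "green", "blue") leaves v holding 1, 2 and 3 for the three names.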

There's an unset! command, which is just like unset, except that it doesn't complain if the victim is already undefined.

As well as if..else..endif, there's a case..when..else..endcase style of conditional. Be careful, though, because judicious use of case and enum can occasionally lead to easily understood code.

There are a few pre-defined variables. They are initialised by CCF, but not referenced internally, so assigning to them will not have the desired effect. In particular, the command "set Unix 1" will not upgrade your machine, sadly.

Confusingly, perhaps, CCF has two concepts of a "blank line". If there's nothing on a line (other than maybe a few whitespace characters) then CCF will usually pass it through unchanged. This means that nicely laid out blocks of text separated by blanks come through still separated by blanks, which seems to fit with what people expect. On the other hand, a line containing only a CCF marker is always passed through unchanged, which allows you to visually link CCF-controlled regions, if that's what you want.


System Requirements

To use CCF a file format needs to have some way of embedding comment lines. Most initialisation files and scripting languages have this, usually by starting the line with a # or a ;. (The only file format I frequently use that CCF can't properly handle is HTML - the comment convention it inherited from SGML is just too weird.)

If you're lucky, your file format has comments that extend to the end of the line. Alas, not all language designers took this route. Some formats insist on an "end of comment" marker of some sort, and some even allow comments to be nested! CCF is prepared to deal with these evils, by slight modifications to the CCF:init line.

There are three valid forms of CCF:init line:

--## CCF:init

the "correct" style, where comments extend to the end of the line (as in Ada),

"## CCF:init "

the "horrid" style, where comments need to be explicitly terminated (Smalltalk), and

/*## CCF:init nested */
the "even worse" style, where comments need to be both counted and explicitly terminated (Rexx).

When CCF finds you've supplied junk at the end of its initialisation line, it tries quite hard to insert comment terminators as appropriate. (My apologies to those who are stuck with a system needing the word nested as part of its comment terminator.)
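The three forms can be told apart by what follows "CCF:init" on the line. The Python sketch below shows one plausible reading; the function name parse_init and the returned (marker, terminator, nested) tuple are assumptions, not CCF's actual data structures.

```python
def parse_init(line):
    """Split a CCF:init line into (marker, terminator, nested)."""
    head, found, tail = line.partition("CCF:init")
    if not found:
        raise ValueError("not a CCF:init line")
    words = tail.split()
    nested = "nested" in words
    # Whatever junk remains after 'CCF:init' is taken as the comment terminator.
    terminator = "".join(w for w in words if w != "nested")
    return head.strip(), terminator, nested

# parse_init("--## CCF:init")            -> ('--##', '', False)
# parse_init('"## CCF:init "')           -> ('"##', '"', False)
# parse_init("/*## CCF:init nested */")  -> ('/*##', '*/', True)
```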

If by some misfortune you find yourself cursed with a text format that simply has no suitable comment convention, it may be worth inventing one of your very own (eg "any line starting with %%%%") and then piping CCF's output through grep -v. Admittedly you lose some elegance with this approach, but it's still better than trying to maintain multiple versions by hand. (You might want to check out the -b switch if you're doing this, by the way.)


What it's NOT

CCF does not provide the equivalent of #include, nor does it do string substitution within the file. If you need either of these get to know m4. It also doesn't produce chatty reports about how many lines were included/excluded/changed or whatever. Again, there are better tools available. CCF doesn't have a button-bar, it doesn't do drag-and-drop, and it can't be used as a replacement for a DTP package if you just want to run off fancy birthday cards for your spouse. The current version also doesn't include a spelling checker, though this may remain unchanged in the future.


How to use it

Like many of my programs, CCF is intended to be run from a shell script. Unusually, perhaps, I've actually said this for once, rather than leave it to your own common sense. The actual work is done by a compiled binary, ccfbin, CCFBIN.EXE, or whatever, with a smaller, more tweakable, program making policy decisions for it.

Once the script is installed somewhere on your PATH, you can process your files using commands like:

      ccf [options] files ...
    

If the file to be processed is "-", standard input is read, and the output goes to stdout.

The options understood so far are:

-a
Comment out all lines controlled by an if or case, regardless of the success of the condition
-b
Apply a "commented out" marker to all appropriate lines, even blanks. This might be useful if you were about to feed the results to grep -v.
-B
If a line that would normally be commented out by CCF is, in fact, blank, just pass it through unchanged. Since this is the default behaviour, the main use of this flag would probably be to defeat some over-enthusiastic invocation script.
-n
Comment out nothing.
-s
Undoes the effects of a previous -a or -n
-v
Print a version number
-w
Print a version number, plus a warning about how little warranty the software has.
-D var
Execute a "set var 1" before processing the files
-D var=val (Unix only)
Execute a "set var val" before processing the files
-U var
Execute an "unset var" before processing the files
-H hdr
Process the file hdr as a "header file" (but after any -D or -U options)
-V hdr
Process the file hdr as a "header file" (but before any -D or -U options)

Note the difference between the -H and -V options. The intention is that -V files set up symbolic variable names (probably using enums) for use by the -D option, whereas the -H files will actually do interesting work.
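The ordering this implies can be sketched as a stable sort over the start-up options. The function name startup_plan and the file names in the example are hypothetical, and keeping -D and -U in their command-line order relative to each other is an assumption.

```python
def startup_plan(options):
    """Order start-up actions: -V headers, then -D/-U settings, then -H headers.

    options is a list of (flag, argument) pairs in command-line order.
    """
    def rank(flag):
        return {"-V": 0, "-D": 1, "-U": 1, "-H": 2}[flag]
    # sorted() is stable, so options of equal rank keep their given order.
    return sorted(options, key=lambda opt: rank(opt[0]))
```

So a command line giving -H work.ccf, -D os=unix, -V names.ccf and -U debug in that order would have names.ccf scanned first, then the -D and -U settings applied, and work.ccf processed last.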