Common C++ Overview Documentation; Draft 2

Introduction
------------
In writing this document I hope to better explain what the Common C++
library is about and how it may be used in developing your own C++
applications.  This document is intended as an overview and unifying
document to support the already detailed class-by-class function
descriptions found and browsable in the "doc" subdirectory of the Common
C++ distribution.

Common C++ offers a set of "portable" classes that can be used to build
highly portable applications in C++.  In particular, Common C++ offers
classes that abstract threading, sockets, synchronization, serial I/O,
"config" file parsing, class object persistence, shared object module
loading, daemon management, and optimized "block" and memory mapped file
I/O under a set of consistent classes that your application can then
be built from.  The goal is to write your application to use the portable
abstract services and classes of the Common C++ libraries rather than
having to access low level system services directly.

There is a large diversity of views in how one should code a C++
framework.  Since a large number of older C++ compilers remain in everyday
use, I chose to use what I felt was an appropriate set of C++ language
features and practices to provide the greatest compiler compatibility and
to generate the most optimized code for Common C++.  To further reduce the
overhead of writing Common C++ applications, I have split the primary
library image itself into several different shared libraries.  This
allowed me to collect the more obscure and less likely to be used features
into separate libraries which need never be loaded.

Finally, in designing Common C++, I assume that class extension
(inheritance) is the primary vehicle for application development. The 
Common C++ framework, while offering many classes that are usable
directly, is designed for one to create applications by extending Common
C++ "base" classes into application-specific versions of said classes
as needed.

Common C++ Threading Concepts
-----------------------------
Threading was the first part of Common C++ I wrote, back when it was still
the APE library.  My goal for Common C++ threading has been to make
threading as natural and easy to use in C++ application development as
threading is in Java.  With this said, one does not need to use threading
at all to take advantage of Common C++.  However, all Common C++ classes
are designed at least to be thread-aware/thread-safe as appropriate and
necessary.

Common C++ threading is currently built either from the Posix "pthread"
library or using the win32 SDK.  Because the Posix "pthread" draft
has gone through many revisions, and many system implementations are
only marginally compliant, and even then usually in different ways, I
wrote a large series of autoconf macros found in ost_pthread.m4 which
handle the task of identifying which pthread features and capabilities
your target platform supports.  In the process I learned much about what
autoconf can and cannot do for you.

Currently the GNU Portable Thread library (GNU Pth) is not directly
supported in Common C++.  While GNU Pth doesn't offer direct
native threading support or benefit from SMP hardware, many of the design
advantages of threading can be gained from its use, and the Pth pthread
"emulation" library should be usable with Common C++.  In the future,
Common C++ will directly support Pth, as well as OS/2 and BeOS native
threading APIs.

Common C++ itself defines a fairly "neutral" threading model that is
not tied to any specific API such as pthread, win32, etc.  This neutral
thread model is contained in a series of classes which handle threading
and synchronization and which may be used together to build reliable
threaded applications.

Common C++ defines application specific threads as objects which are
derived from the Common C++ "Thread" base class.  At minimum the "Run"
method must be implemented, and this method essentially is the "thread",
for it is executed within the execution context of the thread, and when
the Run method terminates the thread is assumed to have terminated.

Common C++ allows one to specify the running priority of a newly created
thread relative to the "parent" thread which is the thread that is
executing when the constructor is called.  Since C++ has no virtual
constructors, and a virtual method called from a base class constructor
does not dispatch to the derived class, the thread must be "started" after
the constructor returns.  This is done either by defining a "starting"
semaphore object that one or more newly created thread objects can wait
upon, or by invoking an explicit "Start" member function.
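
As a rough sketch of the derive-then-start pattern described above, the
following uses standard C++11 std::thread rather than the Common C++ API.
The names SketchThread, SumThread, start(), join() and run() are all
hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <thread>

// Hypothetical stand-in for a thread base class; NOT the Common C++ API.
class SketchThread {
public:
    SketchThread() = default;
    SketchThread(const SketchThread&) = delete;
    SketchThread& operator=(const SketchThread&) = delete;

    // Derived classes are expected to join() in their own destructor.
    virtual ~SketchThread() { assert(!worker_.joinable()); }

    // Called only after construction has completed, so the virtual call
    // inside the new thread dispatches to the derived class's run().
    void start() {
        assert(!worker_.joinable());
        worker_ = std::thread([this] { run(); });
    }

    // Wait until run() has returned.
    void join() {
        if (worker_.joinable()) worker_.join();
    }

protected:
    virtual void run() = 0;   // the body of the thread

private:
    std::thread worker_;
};

// An application specific thread: sums the integers 1..limit.
class SumThread : public SketchThread {
public:
    explicit SumThread(int limit) : limit_(limit) {}
    ~SumThread() override { join(); }   // stop before members are released
    int total() const { return total_; }

protected:
    void run() override {
        for (int i = 1; i <= limit_; ++i) total_ += i;
    }

private:
    int limit_;
    int total_ = 0;
};
```

One would construct a SumThread, call start(), and read total() only after
join() has returned.  Some toolchains need the -pthread option to build
this.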

Threads can be "suspended" and "resumed".  As this behavior is not defined
in the Posix "pthread" specification, it is often emulated through
signals.  Typically SIGUSR1 will be used for this purpose in Common C++
applications, depending on the target platform.  On Linux, since threads
are indeed processes, SIGSTOP and SIGCONT can be used.  On Solaris, the
Solaris thread library supports suspend and resume directly.

Threads can be canceled.  Not all platforms support the concept of
externally cancelable threads.  On those platforms and API
implementations that do not, threads are typically canceled through the
action of a signal handler.

As noted earlier, threads are considered running until the "Run" method
returns, or until a cancellation request is made.  Common C++ threads can
control how they respond to cancellation, using setCancellation().
Cancellation requests can be ignored, set to occur only when a
cancellation "point" has been reached in the code, or occur immediately.
Threads can also exit by returning from Run() or by invoking the Exit()
method.
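
The "ignored" and "cancellation point" behaviors can be illustrated with a
cooperative flag.  This is only a sketch in standard C++11.  It does not
show immediate cancellation, which needs platform support.  CancelMode,
CancelState and runWorker are hypothetical names, not Common C++ ones.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical cancellation modes for this sketch only.
enum class CancelMode { Disabled, Deferred };

class CancelState {
public:
    void setMode(CancelMode mode) { mode_.store(mode); }

    // Called by another thread to ask the worker to stop.
    void request() { requested_.store(true); }

    // A cancellation "point": true when the worker should exit now.
    bool atCancelPoint() const {
        return mode_.load() == CancelMode::Deferred && requested_.load();
    }

private:
    std::atomic<CancelMode> mode_{CancelMode::Deferred};
    std::atomic<bool> requested_{false};
};

// Performs up to maxSteps units of work, testing for cancellation before
// each one.  Returns the number of steps actually completed.
int runWorker(const CancelState& state, int maxSteps) {
    assert(maxSteps >= 0);
    int done = 0;
    while (done < maxSteps) {
        if (state.atCancelPoint()) break;
        ++done;   // stands in for one unit of real work
    }
    return done;
}
```

A pending request stops the worker at its next cancellation point only
while the mode is Deferred.  With the mode set to Disabled the request is
ignored.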

Generally it is a good practice to initialize any resources the thread may
require within the constructor of your derived thread class, and to purge
or restore any allocated resources in the destructor.  In most cases, the
destructor will be executed after the thread has terminated, and hence
will execute within the context of the thread that requested a join rather
than in the context of the thread that is being terminated.  Most
destructors in derived thread classes should first call Terminate() to
make sure the thread has stopped running before releasing resources.

A Common C++ thread is normally canceled by deleting the thread object.
The process of deletion invokes the thread's destructor, and the
destructor will then perform a "join" against the thread using the
Terminate() function.  This behavior is not always desirable since the
thread may block itself from cancellation and block the current "delete"
operation from completing.  One can alternately invoke Terminate()
directly before deleting a thread object.

When a given Common C++ thread exits on its own through its Run()
method, a "Final" method will be called.  This Final method will be called
while the thread is "detached".  If a thread object is constructed through
a "new" operator, its Final method can be used to "self delete" when
done, and allows an independent thread to construct and remove itself
autonomously.

A special global function, getThread(), is provided to identify the thread
object that represents the current execution context you are running
under.  This is sometimes needed to deliver signals to the correct thread.
Since all thread manipulation should be done through the Common C++ (base) 
thread class itself, this provides the same functionality as things like
"pthread_self" for Common C++.

Common C++ threads are often aggregated into other classes to provide
services that are "managed" from or operate within the context of a
thread, even within the Common C++ framework itself.  A good example of
this is the TCPSession class, which essentially is a combination of a TCP
client connection and a separate thread the user can define by deriving a
class with a Run() method to handle the connected service.  This
aggregation logically connects the successful allocation of a given
resource with the construction of a thread to manage and perform 
operations for said resource.

Threads are also used in "service pools".  In Common C++, a service pool
is one or more threads that are used to manage a set of resources.  While
Common C++ does not provide a direct "pool" class, it does provide a model
for their implementation, usually by constructing an array of thread
"service" objects, each of which can then be assigned the next new
instance of a given resource in turn or algorithmically.
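
The "in turn" assignment mentioned above might be sketched as follows.
The threads themselves are left out so that only the bookkeeping is
shown.  SketchService and SketchPool are hypothetical names and are not
provided by Common C++.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical service object; in a real pool each one would own a thread.
struct SketchService {
    std::vector<int> resources;   // ids of the resources it manages
};

class SketchPool {
public:
    explicit SketchPool(std::size_t count) : services_(count), next_(0) {
        assert(count > 0);
    }

    // Hand the new resource to the next service in turn (round robin)
    // and return the slot that received it.
    std::size_t assign(int resourceId) {
        const std::size_t slot = next_;
        services_[slot].resources.push_back(resourceId);
        next_ = (next_ + 1) % services_.size();
        return slot;
    }

    const SketchService& service(std::size_t slot) const {
        return services_.at(slot);
    }

private:
    std::vector<SketchService> services_;
    std::size_t next_;
};
```

An "algorithmic" policy, such as picking the least loaded service, would
only change the body of assign().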

Threads have signal handlers associated with them.  Several signal types
are "predefined" and have special meaning.  All signal handlers are
defined as virtual member functions of the Thread class which are called
when a specific signal is received for a given thread.  The "SIGPIPE"
event is defined as a "Disconnect" event since it's normally associated
with a socket disconnecting or broken fifo.  The Hangup() method is
associated with the SIGHUP signal.  All other signals are handled through
the more generic Signal().

Incidentally, unlike Posix, the win32 API has no concept of signals, and
certainly no means to define or deliver signals on a per-thread basis.
For this reason, no signal handling is supported or emulated in the win32
implementation of Common C++ at this time.

In addition to TCPStream, there is a TCPSession class which combines a
thread with a TCPStream object.  The assumption made by TCPSession is that
one will service each TCP connection with a separate thread, and this
makes sense for systems where extended connections may be maintained and
complex protocols are being used over TCP.

Common C++ Synchronization
--------------------------
Synchronization objects are needed when a single object can be
potentially manipulated by more than one thread (execution) context
concurrently.  Common C++ provides a number of specialized classes and
objects that can be used to synchronize threads.

One of the most basic Common C++ synchronization objects is the Mutex
class.  A Mutex only allows one thread to continue execution at a given
time over a specific section of code.  Mutexes have Enter and Leave
methods; only one thread can continue from the Enter until the Leave is
called.  The next thread waiting can then get through.  Mutexes are also
known as "CRITICAL SECTIONS" in win32-speak.

The Common C++ mutex is presumed to support recursive locking.  This was
deemed essential because a mutex might be used to block individual file
requests in say, a database, but the same mutex might be needed to block a
whole series of database updates that compose a "transaction" for one
thread to complete together without having to write alternate non-locking
member functions to invoke for each part of a transaction.
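
To illustrate why recursive locking helps with "transactions", here is a
small sketch that uses the standard C++11 std::recursive_mutex in place of
the Common C++ Mutex.  SketchLedger and its methods are hypothetical.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical shared record protected by one recursive lock.
class SketchLedger {
public:
    explicit SketchLedger(int opening) : balance_(opening) {}

    // Individual requests each take the lock.
    void credit(int amount) {
        std::lock_guard<std::recursive_mutex> guard(lock_);
        balance_ += amount;
    }

    void debit(int amount) {
        std::lock_guard<std::recursive_mutex> guard(lock_);
        assert(amount <= balance_);
        balance_ -= amount;
    }

    // A "transaction": the lock is held across both updates, and the
    // nested debit() calls simply re-enter the same lock.
    void payWithFee(int amount, int fee) {
        std::lock_guard<std::recursive_mutex> guard(lock_);
        debit(amount);
        debit(fee);
    }

    int balance() {
        std::lock_guard<std::recursive_mutex> guard(lock_);
        return balance_;
    }

private:
    std::recursive_mutex lock_;
    int balance_;
};
```

With a non-recursive mutex, payWithFee() would need separate non-locking
versions of debit() to avoid locking the same mutex twice.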

Strangely enough, the original pthread draft standard does not directly
support recursive mutexes.  In fact this is the most common "NP" extension
for most pthread implementations.  Common C++ emulates recursive mutex
behavior when the target platform does not directly support it.

In addition to the Mutex, Common C++ supports a rwlock class.  This
implements the X/Open recommended "rwlock".  On systems which do not
support rwlocks, the behavior is emulated with a Mutex; however, the
advantage of a rwlock over a mutex is then entirely lost.  There have been
some clever hacks suggested for "emulating" the behavior of a rwlock with
a pair of mutexes and a semaphore, and one of these will be adapted for
Common C++ in the future for platforms that do not support rwlocks
directly.

Common C++ also supports "semaphores".  Semaphores are typically used as a
counter for protecting or limiting concurrent access to a given
resource, such as permitting at most "x" number of threads to use
resource "y", for example.  Semaphores are also convenient to use as
synchronization objects to rendezvous and signal activity and/or
post pending service requests between one thread and another.
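
The counting behavior can be sketched with a mutex and a condition
variable from standard C++11.  This is an illustration of the idea only.
SketchSemaphore, wait(), tryWait() and post() are hypothetical names and
do not claim to match the Common C++ Semaphore interface.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical counting semaphore built from standard C++11 parts.
class SketchSemaphore {
public:
    explicit SketchSemaphore(unsigned initial) : count_(initial) {}

    // Block until a unit is available, then take it.
    void wait() {
        std::unique_lock<std::mutex> guard(lock_);
        available_.wait(guard, [this] { return count_ > 0; });
        --count_;
    }

    // Take a unit only if one is available right now.
    bool tryWait() {
        std::lock_guard<std::mutex> guard(lock_);
        if (count_ == 0) return false;
        --count_;
        return true;
    }

    // Return a unit and wake one waiting thread, if any.
    void post() {
        {
            std::lock_guard<std::mutex> guard(lock_);
            ++count_;
        }
        available_.notify_one();
    }

private:
    std::mutex lock_;
    std::condition_variable available_;
    unsigned count_;
};
```

Constructed with a count of "x", it admits at most "x" threads between
wait() and post().  Constructed with zero, it acts as a simple signal
from one thread to another.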

In addition to Semaphore objects, Common C++ supports "Event" objects.
Event objects are triggered "events" which are used to notify one thread
of some event it is waiting for from another thread.  These event objects
use a trigger/reset mechanism and are related to low level condition
variables.
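
A trigger/reset object of this kind might be sketched on top of a
standard C++11 condition variable as shown below.  SketchEvent and its
signal(), reset() and wait() methods are hypothetical names used only for
this illustration.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical event object: stays triggered until explicitly reset.
class SketchEvent {
public:
    // Trigger the event and release every waiting thread.
    void signal() {
        {
            std::lock_guard<std::mutex> guard(lock_);
            signaled_ = true;
        }
        ready_.notify_all();
    }

    // Return the event to the untriggered state.
    void reset() {
        std::lock_guard<std::mutex> guard(lock_);
        signaled_ = false;
    }

    // Wait for the trigger; returns false if the timeout expires first.
    bool wait(std::chrono::milliseconds timeout) {
        assert(timeout.count() >= 0);
        std::unique_lock<std::mutex> guard(lock_);
        return ready_.wait_for(guard, timeout,
                               [this] { return signaled_; });
    }

private:
    std::mutex lock_;
    std::condition_variable ready_;
    bool signaled_ = false;
};
```

Unlike a semaphore, the event carries no count: any number of waiters
pass once it is triggered, until reset() is called.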

A special class, the ThreadKey, is used to hold state information that
must be unique for each thread context.  Finally, Common C++ supports a
thread-safe "AtomicCounter" class.  This can often be used for reference
counting without having to protect the counter with a separate Mutex.
This makes for lighter-weight code.
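
Reference counting without a separate lock can be sketched with the
standard C++11 std::atomic, used here as a stand-in for AtomicCounter.
SketchRefCount and retainConcurrently are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical reference count; starts at one for the creating owner.
class SketchRefCount {
public:
    void retain() { refs_.fetch_add(1, std::memory_order_relaxed); }

    // Returns true when the caller has dropped the last reference.
    bool release() {
        return refs_.fetch_sub(1, std::memory_order_acq_rel) == 1;
    }

    int count() const { return refs_.load(std::memory_order_acquire); }

private:
    std::atomic<int> refs_{1};
};

// Retain the object from several threads at once and return the final
// count, to show that no update is lost without any mutex.
int retainConcurrently(SketchRefCount& object, int threads, int perThread) {
    assert(threads > 0 && perThread >= 0);
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&object, perThread] {
            for (int i = 0; i < perThread; ++i) object.retain();
        });
    }
    for (std::thread& worker : workers) worker.join();
    return object.count();
}
```

The owner that sees release() return true is the one responsible for
deleting the shared object.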

Common C++ Sockets
------------------
Common C++ provides a set of classes that wrap and define the operation
of network "sockets".  Much like with Java, there are also a related set
of classes that are used to define and manipulate objects which act as
"hostname" and "network addresses" for socket connections.

The network name and address objects are all derived from a common 
InetAddress base class. Specific classes, such as InetHostAddress,
InetMaskAddress, etc, are defined from InetAddress entirely so that the
manner a network address is being used can easily be documented and
understood from the code and to avoid common errors and accidental misuse 
of the wrong address object.  For example, a "connection" to something
that is declared as an "InetHostAddress" can be kept type-safe from a
"connection" accidentally being made to something that was declared an
"InetBroadcastAddress".

The socket is itself defined in a single base class named, quite
unremarkably, "Socket".  This base class is not directly used, but is
provided to offer properties common to other Common C++ socket classes,
including the socket exception model and the ability to set socket
properties such as QoS, "sockopts" properties like Dont-Route
and Keep-Alive, etc.

The first usable socket class is the TCPStream.  Since a TCP connection
is always a "streamed" virtual circuit with flow control, the standard
stream operators ("<<" and ">>") may be used with TCPStream directly.
TCPStream itself can be formed either by connecting to a bound network
address of a TCP server, or can be created when "accepting" a
network connection from a TCP server.

An implicit and unique TCPSocket object exists in Common C++ to represent
a bound TCP socket acting as a "server" for receiving connection requests.
This class is not part of TCPStream because such objects normally perform
no physical I/O (read or write operations) other than to specify a listen
backlog queue and perform "accept" operations for pending connections.
The Common C++ TCPSocket offers a Peek method to examine where the next
pending connection is coming from, and a Reject method to flush the next
request from the queue without having to create a session.

The TCPSocket also supports an "OnAccept" method which can be called when a
TCPStream related object is created from a TCPSocket.  By creating a
TCPStream from a TCPSocket, an accept operation automatically occurs, and
the TCPSocket can then still reject the client connection through the
return status of its OnAccept method.

In addition to connected TCP sessions, Common C++ supports UDP sockets and
these also cover a range of functionality.  Like a TCPSocket, a UDPSocket
can be created bound to a specific network interface and/or port address,
though this is not required.  UDP sockets also are usually either
connected or otherwise "associated" with a specific "peer" UDP socket.
Since UDP sockets operate through discrete packets, there are no streaming
operators used with UDP sockets.

In addition to the UDP "socket" class, there is a "UDPBroadcast" class.
The UDPBroadcast is a socket that is set to send messages to a subnet as a
whole rather than to an individual peer socket that it may be associated
with.

UDP sockets are often used for building "realtime" media streaming
protocols and full duplex messaging services.  When used in this manner,
typically a pair of UDP sockets are used together; one socket is used to
send and the other to receive data with an associated pair of UDP sockets
on a "peer" host.  This concept is represented through the Common C++
UDPDuplex object, which is a pair of sockets that communicate with another
UDPDuplex pair.

Finally, a special set of classes, "SocketPort" and "SocketService", exist
for building realtime streaming media servers on top of UDP and TCP
protocols.  The "SocketPort" is used to hold a connected or associated TCP
or UDP socket which is being "streamed" and which offers callback methods
that are invoked from a "SocketService" thread.  SocketServices can be
pooled into logical thread pools that can service a group of SocketPorts.
A millisecond accurate "timer" is associated with each SocketPort and can
be used to time synchronize SocketPort I/O operations.

Common C++ Serial I/O
---------------------
Common C++ serial I/O classes are used to manage serial devices and
implement serial device protocols.  From the point of view of Common C++,
serial devices are supported by the underlying Posix specified "termios"
call interface.

The serial I/O base class is used to hold a descriptor to a serial device
and to provide an exception handling interface for all serial I/O classes.
The base class is also used to specify serial I/O properties such as
communication speed, flow control, data size, and parity.  The "Serial"
base class is not itself directly used in application development,
however.

Common C++ Serial I/O is itself divided into two conceptual modes: frame
oriented and line oriented I/O.  Both frame and line oriented I/O make
use of the ability of the underlying tty driver to buffer data and return
"ready" status from select only when either a specified number of bytes or
a newline record has been reached, by manipulating termios c_cc fields
appropriately.  This provides some advantage in that a given thread
servicing a serial port can block and wait rather than have to continually
poll or read each and every byte as soon as it appears at the serial port.

The first application relevant serial I/O class is the TTYStream class.
TTYStream offers a linearly buffered "streaming" I/O session with the
serial device.  Furthermore, traditional C++ "stream" operators (<< and
>>) may be used with the serial device.  A more "true" to ANSI C++ library
format "ttystream" is also available, and this supports an "open" method
in which one can pass initial serial device parameters immediately
following the device name in a single string, as in
"/dev/tty3a:9600,7,e,1", as an example.

The TTYSession aggregates a TTYStream and a Common C++ Thread which is
assumed to be the execution context that will be used to perform actual
I/O operations.  This class is very analogous to TCPSession.
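
The single-string device form accepted by the "ttystream" open method,
described earlier in this section, can be illustrated with a small parser.
This is a sketch only: it assumes the four fields after the colon are
speed, data bits, parity and stop bits, which the text does not spell
out, and SketchSerialSpec and parseSerialSpec are hypothetical names.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical holder for the parsed device string.
struct SketchSerialSpec {
    std::string device;   // e.g. "/dev/tty3a"
    long speed;           // assumed: bits per second
    int dataBits;         // assumed: data size
    char parity;          // assumed: parity letter, such as 'e'
    int stopBits;         // assumed: stop bits
};

// Parses "device:speed,bits,parity,stop".  Returns false when the text
// does not have that shape.
bool parseSerialSpec(const std::string& text, SketchSerialSpec& spec) {
    const std::string::size_type colon = text.find(':');
    if (colon == std::string::npos || colon == 0) return false;
    spec.device = text.substr(0, colon);

    std::istringstream fields(text.substr(colon + 1));
    char comma1 = 0, comma2 = 0, comma3 = 0;
    fields >> spec.speed >> comma1 >> spec.dataBits >> comma2
           >> spec.parity >> comma3 >> spec.stopBits;
    if (fields.fail()) return false;
    return comma1 == ',' && comma2 == ',' && comma3 == ',';
}
```

The parsed values would then be applied through whatever serial property
calls the library provides for speed, data size, and parity.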

The TTYPort and TTYService classes are used to form thread-pool serviced
serial I/O protocol sets.  These can be used when one has a large number
of serial devices to manage, and a single (or limited number of) thread(s)
can then be used to service the tty port objects present.  Each tty port
supports a timer control and several virtual methods that the service
thread can call when events occur.  This model provides for "callback"
event management, whereby the service thread performs a "callback" into
the port object when events occur.  Specific events supported include the
expiration of a TTYPort timer, pending input data waiting to be read, and
"sighup" connection breaks.   

Common C++ Block I/O
--------------------
Common C++ block I/O classes are meant to provide more convenient file
control for paged or random access files portably, and to answer many
issues that ANSI C++ leaves untouched in this area.  A common base class,
RandomFile, is provided for setting descriptor attributes and handling
exceptions.  From this, three kinds of random file access are supported.

ThreadFile is meant for use by a threaded database server where multiple
threads may each perform semi-independent operations on a given database
table stored on disk.  A special "fcb" structure is used to hold file
"state", and pread/pwrite is used whenever possible for optimized I/O.  On
systems that do not offer pread/pwrite, a Mutex lock is used to protect
concurrent lseek and read/write operations.  ThreadFile managed databases
are assumed to be used only by the local server and through a single file
descriptor.
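
The benefit of pread/pwrite is that each call carries its own file
offset, so threads sharing one descriptor do not race over a seek
position.  A minimal sketch using the Posix calls directly follows.
SketchRecord and the three helper functions are hypothetical, and the
scratch file in /tmp exists only to make the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical fixed-size record of a table stored on disk.
struct SketchRecord {
    int id;
    char name[12];
};

// Create an anonymous scratch file; returns -1 on failure.
int openScratchFile() {
    char path[] = "/tmp/sketch-XXXXXX";
    const int fd = mkstemp(path);
    if (fd >= 0) unlink(path);   // file vanishes once fd is closed
    return fd;
}

// Write one record at its slot without touching the shared file offset.
bool writeRecord(int fd, std::size_t index, const SketchRecord& record) {
    assert(fd >= 0);
    const off_t offset = static_cast<off_t>(index * sizeof(SketchRecord));
    return pwrite(fd, &record, sizeof record, offset)
           == static_cast<ssize_t>(sizeof record);
}

// Read one record from its slot; false if the slot is past end of file.
bool readRecord(int fd, std::size_t index, SketchRecord& record) {
    assert(fd >= 0);
    const off_t offset = static_cast<off_t>(index * sizeof(SketchRecord));
    return pread(fd, &record, sizeof record, offset)
           == static_cast<ssize_t>(sizeof record);
}
```

Without pread/pwrite, each helper would have to hold a lock around an
lseek followed by a read or write, as the paragraph above describes.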

SharedFile is used when a database may be shared between multiple
processes.  SharedFile automatically applies low level byte-range "file
locks", and provides an interface to fetch and release byte-range locked
portions of a file.

The MappedFile class provides a portable interface to memory mapped file
access.  One can map and unmap portions of a file on demand, and update
changed memory pages mapped from files immediately through sync().

Common C++ Daemon Support
-------------------------
Daemon support consists of two Common C++ features.  The first is the
"pdetach" function.  This function provides a simple and portable means to
fork/detach a process into a daemon.  In addition, the "slog" object is
provided.

"slog" is an object which behaves very similar to the Standard C++ "clog".
The key difference is that the "slog" object sends its output to the
system logging daemon (typically syslogd) rather than through stderr.
"slog" can be streamed with the << operator just like "clog".  "slog" can
also accept arguments to specify logging severity level, etc.

Common C++ Persistence
----------------------
The Common C++ Persistence library was designed with one thought
foremost - namely that large interlinked structures should be
easily serializable. The current implementation is _NOT_ endian
safe, and so, whilst it should in theory be placed in the "Extras"
section, the codebase itself is considered stable enough to be
part of the main distribution.

The Persistence library classes are designed to provide a quick and
easy way to make your data structures serializable. The only way of
doing this safely is to inherit your classes from the provided class
Persistence::BaseObject. The macros "IMPLEMENT_PERSISTENCE" and
"DECLARE_PERSISTENCE" provide all the function prototypes and 
implementation details you may require to get your code off the ground.

Common C++ Config & Misc
------------------------
There are a number of odd and specialized utility classes found in Common
C++.  The most common of these is the "MemPager" class.  This is basically
a class to enable page-grouped "cumulative" memory allocation; all
accumulated allocations are dropped during the destructor.  This class has
found its way into a lot of other utility classes in Common C++.
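
The idea of page-grouped "cumulative" allocation can be sketched as a
tiny arena in standard C++11.  This is an illustration of the technique
only, not the MemPager implementation.  SketchPager, alloc() and
pageCount() are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical pager: carves small blocks out of fixed-size pages and
// frees every page at once in its destructor.
class SketchPager {
public:
    explicit SketchPager(std::size_t pageSize)
        : pageSize_(pageSize), used_(pageSize) {
        assert(pageSize >= alignof(std::max_align_t));
    }
    SketchPager(const SketchPager&) = delete;
    SketchPager& operator=(const SketchPager&) = delete;

    ~SketchPager() {
        for (char* page : pages_) delete[] page;
    }

    // Returns a block from the current page, starting a new page when
    // the request does not fit.  Blocks are never freed one by one.
    void* alloc(std::size_t size) {
        const std::size_t align = alignof(std::max_align_t);
        size = (size + align - 1) / align * align;   // keep alignment
        if (size == 0 || size > pageSize_) return nullptr;
        if (used_ + size > pageSize_) {
            pages_.push_back(new char[pageSize_]);
            used_ = 0;
        }
        void* block = pages_.back() + used_;
        used_ += size;
        return block;
    }

    std::size_t pageCount() const { return pages_.size(); }

private:
    std::size_t pageSize_;
    std::size_t used_;           // bytes consumed in the current page
    std::vector<char*> pages_;
};
```

This suits objects that share one lifetime, such as the keyword strings
of a loaded config file, since nothing is released until the pager
itself is destroyed.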

The most useful of the misc. classes is the Keydata class.  This class is
used to load and then hold "keyword = value" pairs parsed from a text
based "config" file that has been divided into "[sections]".  Keydata can
also load a table of "initialization" values for keyword pairs that were
not found in the external file.
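
To make the file layout concrete, the following sketch reads the
"keyword = value" pairs of one "[section]" from a stream and then fills
in defaults for anything missing.  It is written in standard C++ and is
not the Keydata implementation.  KeyMap, loadSection and applyDefault
are hypothetical names, and treating '#' lines as comments is an
assumption.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>

typedef std::map<std::string, std::string> KeyMap;

// Strip leading and trailing blanks.
static std::string trim(const std::string& text) {
    const char* blanks = " \t\r\n";
    const std::string::size_type first = text.find_first_not_of(blanks);
    if (first == std::string::npos) return "";
    const std::string::size_type last = text.find_last_not_of(blanks);
    return text.substr(first, last - first + 1);
}

// Collect the pairs found inside the named [section] only.
KeyMap loadSection(std::istream& input, const std::string& section) {
    KeyMap keys;
    bool inside = false;
    std::string line;
    while (std::getline(input, line)) {
        line = trim(line);
        if (line.empty() || line[0] == '#') continue;   // assumed comment
        if (line[0] == '[') {
            inside = (line == "[" + section + "]");
            continue;
        }
        const std::string::size_type equals = line.find('=');
        if (inside && equals != std::string::npos)
            keys[trim(line.substr(0, equals))] =
                trim(line.substr(equals + 1));
    }
    return keys;
}

// Supply a default only where the file gave no value.
void applyDefault(KeyMap& keys, const std::string& keyword,
                  const std::string& value) {
    keys.insert(std::make_pair(keyword, value));   // keeps existing entry
}
```

Loading the file first and the defaults second means a value present in
the file always wins over the built-in one.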

One typically derives an application specific keydata class to load a
specific portion of a known config file and initialize its values.  One
can then declare a global instance of these objects and have
configuration data initialized automatically as the executable is loaded.

Hence, if I have a "[paths]" section in a "/etc/server.conf" file, I might
define something like:

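class KeyPaths : public Keydata
{
public:
	// "/server/paths" selects the [paths] section of /etc/server.conf
	KeyPaths() : Keydata("/server/paths")
	{
		// built-in defaults; the table ends with a NULL pair
		static KEYDEF defvalues[] = {
		{"datafiles", "/var/server"},
		{NULL, NULL}};

		// override with [paths] from "~/.serverrc" if avail.

		Load("~server/paths");

		// fill in defaults for any keywords still not set
		Load(defvalues);
	}
};

KeyPaths keypaths;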

Common C++ Automake Services
----------------------------
Common C++ does a few things special with automake and autoconf.  When the
Common C++ library is built, it saves a number of compiler options in a
"config.def" file that can be retrieved by an application being configured
to use Common C++.  This is done to assure the same compiler options are
used to build your application that were in effect when Common C++ itself
was built.  Since linkage information is also saved in this manner, this
means your application's "configure" script does not have to go through
the entire process of testing for libraries or Common C++ related compiler
options all over again.  Finally, Common C++ saves its own generated
"config.h" file in cc++/config.h.

If you are using automake, you can add the ost_commoncxx.m4 macros to your
project's autoconf "m4" directory and use several CCXX_ macros for your
convenience.  A "minimal" configure.in can be constructed as:

AC_INIT(something...)
AC_PROG_CXX
AC_PROG_CXXCPP
AM_PROG_LIBTOOL
AM_INIT_AUTOMAKE(....)
AM_CONFIG_HEADER(my-local-config.h)
OST_CCXX_COMMON


In addition, if you plan to use classes found in -lccio, you can use
OST_CCXX_FILE, and if you plan to use anything from the "common"
directory, you can define OST_CCXX_STD.  OST_CCXX_HOARD will test for and,
if found, add the SMP optimized Hoard memory allocator to your
application link LIBS.

Common C++ "Extras"
-------------------
At the time of the release of Common C++ 1.0, it was deemed that
several class libraries either were incomplete or still experimental, and
the 1.0 designation seemed very inappropriate for these libraries.  I
also wanted to have a mechanism to later add new Common C++ class
libraries without having to disrupt or add experimental code into the
main Common C++ release.

To resolve this issue, a second package has been created, and is named
GNU "Common C++ Extras".  The extras package simply holds class frameworks
that are still not considered "mature" or "recommended".  This package can
be downloaded, compiled, and installed, after Common C++ itself.  Many of
the class libraries appearing in the extras package are likely to appear
in Common C++ proper at some future date, and should be considered
usable in their current form.  They are made available both to support
continued development of Common C++ proper and because, while not yet
mature, they are considered "useful" in some manner.

The initial Common C++ "extras" package consisted of two libraries; Common
C++ "scripting" and "math".  The scripting library (-lccscript) is the
Bayonne scripting engine which is used as a near-realtime event driven
embedded scripting engine for "callback" driven state-event server
applications.  The Bayonne scripting engine directly uses C++ inheritance
to extend the Bayonne dialect for application specific features and is
used as a core technology in the Bayonne, DBS, and Meridian telephony
servers and as part of a free home automation project.  There has been
some discussion about folding the Bayonne scripting concepts around a more
conventional scripting language, and so this package currently remains in
"extras" rather than part of Common C++ itself.

The other package found in the initial "extras" distribution is the Common
C++ math libraries.  These are still at a VERY early stage of development,
and may well be deprecated if another suitable free C++ math/numerical
analysis package comes along.

Common C++ "serverlets"
-----------------------
Serverlets are a concept popularized with Java and web servers.  There is
a broad abstract architectural concept of serverlets or plugins that one
also finds in my Common C++ projects, though they are not directly
defined as part of Common C++ itself.

A Common C++ "serverlet" comes about in a Common C++ server project, such
as the Bayonne telephony server, where one wishes to define functionality
for alternate hardware or APIs in alternate shared object files that are
selected at runtime, or to add "plugins" to enhance functionality.  A
serverlet is defined in this sense as a "DSO" loaded "-module" object file
which is linked at runtime against a server process that exports its base
classes using "-export-dynamic".  The "server" image then acts as a
carrier for the runtime module's base functionality.

Modules, or "serverlets", defined in this way do not need to be compiled
with position independent code.  The module is only used with a
specific server image and so the runtime address is only resolved once
rather than at different load addresses for different arbitrary processes.

I recommend that Common C++ based "servers" which publish and export base
classes in this manner for plugins should also have a server
specific "include" file which can be installed in the cc++ include
directory.

