Kevin C. O'Kane, Ph.D.
Computer Science Department
University of Northern Iowa
Cedar Falls, IA 50614
okane@cs.uni.edu
http://www.cs.uni.edu/~okane
October 11, 2000

Copyright (C) MM Kevin C. O'Kane

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

A copy of the license appears at the end of this document.
This document describes the implementation of the MumpsVM virtual machines for use with Windows 95/98, Windows NT, Solaris and Linux. Solaris is a trademark of Sun Microsystems. Windows 95/98 and Windows NT are trademarks of Microsoft Corporation. The material herein is preliminary in nature and subject to change. Use of the software described here is at the user's risk. Please send error reports to the e-mail address above.
The purpose of this document is to present an overview of the MumpsVM implementation of the MUMPS language. This package consists of a virtual machine (interpreter) for various operating systems for a dialect of the Mumps language. Usage of the MumpsVM virtual machines is entirely at the user's risk. The MumpsVM software packages described herein are not warranted in any manner whatsoever. The licensor disclaims responsibility for any and all damages, either direct or consequential, which may arise through use of these packages. The Linux and Sun versions were developed with the GNU C++ compiler. The other versions were developed with the Watcom 10.6 C++ compiler.
Installing the Virtual Machine
If you are going to use MumpsVM in standalone mode (i.e., without a web server), copy the executable mumps.exe to a directory that is in your execution path, for example, \windows. Global array files will be created in the directory from which you invoke mumps. The global arrays reside in two files, normally called "key.dat" and "data.dat".
Copy the mumps executable to a directory in your search path. The global array files are created in the directory in which MumpsVM will be run. The global arrays reside in two files, normally called "key.dat" and "data.dat".
When used with a Web server, the Mumps executable or a link to it normally resides in the directory from which the web server will access it. Be certain to give it adequate permissions for web server access. If permissions are not correctly set, the MumpsVM will hang. If the MumpsVM runs as a task of the web server, the global files (and directory) must be read/write accessible to the web server.
The global array files created are called key.dat and data.dat. To recreate the file system, first delete these files. The MumpsVM will recreate them when it next runs. When you recreate the globals, recheck the file permissions and ownerships. Alternatively, the ZG command will delete and rebuild the globals.
If you are using a VM in conjunction with a web server, you need to copy the executable MumpsVM module to the directory in which the server executes cgi-bin programs.
In each directory from which you expect to execute a MumpsVM, you should install a program named init.mps which the MumpsVM will execute if you invoke Mumps without specifying any particular routine to execute. MumpsVM will execute this file only when it is invoked interactively. The file is ignored if Mumps is invoked by a Web server or shell script containing the environment variable QUERY_STRING.
MumpsVM will terminate on any error, without exception, when invoked by a web server. Were it to return to interactive prompt mode after an error, it would hang the invoking web server. Thus, all errors result in termination.
In standalone mode, the init.mps file will execute and control will be returned to your terminal when it has finished. MumpsVM is ready to accept direct mode input as denoted by the ">" prompt. You may exit by the "ze" or "halt" commands. In interactive mode, any MumpsVM command may be entered. Interactive mode input lines may contain multiple commands.
The command to compile the source code for Linux is:
gcc -O2 -o mumps mumps.c mathf.c parse.c sys1.c btree.c zfcn.c sym.c -lm
System parameters are set in the memsize.h header file.
Global arrays are automatically created and initialized when you start the MumpsVM for the first time or when you use MumpsVM for the first time after you have deleted the global array file system. All global arrays reside in the two global array files.
Global array references may contain either string or numeric subscripts. Only printable ASCII characters are permitted as values for global array subscripts. No subscript may consist of a zero or negative number. The global array files will grow in size as elements are added. These files may be copied for backups.
Using MumpsVM with a Web Server
To use MumpsVM with a web server, install the MumpsVM in the server's cgi-bin (or other appropriate directory). From HTML documents, you will reference MumpsVM with lines of the form:
<A HREF="/cgi-bin/mumps.cgi?prog=^pgmname.mps&var1=11111&var2=123"> test </A>
Here, the name of the Mumps interpreter is taken to be mumps.cgi but in some systems it may need to be different. Check your web server documentation to determine if executable files require a specific extension.
The name of the Mumps program to execute is given in the "prog=" field and the starting values of variables are given next. Note the "^" character.
Upon initialization of the MumpsVM, the variables appearing in the HREF (var1 and var2 in the above) will exist in the MumpsVM symbol table and have the values provided by the browser usually as a result of FORMS input.
When the Web server receives a request such as that above, it invokes the MumpsVM interpreter and passes to it the parameters (everything following the "?" character). These are encoded in an operating system environment variable named QUERY_STRING. The total length of the parameter string may not exceed 1024 bytes. The MumpsVM interpreter decodes QUERY_STRING and executes the named program. The value of QUERY_STRING is contained in the MumpsVM variable "%QS" during execution.
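As a rough illustration of the decoding step, the following Python sketch splits a QUERY_STRING value into the program name and the variables that would appear in the symbol table. It is a model of the behavior described here, not MumpsVM's own code. The function name decode_query_string and the dictionary layout are assumptions made for the example, and whether "prog" itself is also stored as a variable is not stated in this document.

```python
from urllib.parse import parse_qsl

MAX_QUERY_LENGTH = 1024  # limit on the parameter string stated in the text


def decode_query_string(query):
    """Return (program, variables) for a QUERY_STRING value (illustrative model)."""
    if len(query.encode("ascii", "replace")) > MAX_QUERY_LENGTH:
        raise ValueError("parameter string exceeds 1024 bytes")
    program = None
    variables = {"%QS": query}  # the raw string stays available as %QS
    for name, value in parse_qsl(query, keep_blank_values=True):
        if name == "prog" and program is None:
            program = value      # the routine to execute, including the "^"
        else:
            variables[name] = value
    return program, variables


program, variables = decode_query_string("prog=^pgmname.mps&var1=11111&var2=123")
print(program)                                  # ^pgmname.mps
print(variables["var1"], variables["var2"])     # 11111 123
```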
When MumpsVM starts, it looks for QUERY_STRING. If it finds it, it suppresses the welcome message and sets an internal flag to indicate that this session appears to be a Web server session. The executing MumpsVM script may also determine this by testing for the existence of "%QS".
You may invoke the MumpsVM in simulated Web server mode by calling it from a shell script such as the following (for the Linux Bash Shell):
Here, MumpsVM will execute the script file progname.mps and pass to it parameters var1 and var2.
If the values of the variables passed through QUERY_STRING need to contain blanks or special characters, they must be encoded in the manner prescribed by the HTML standard. From a MumpsVM program, the MumpsVM builtin function $ZH can be used for this purpose (see below).
Be certain to Halt or ZE at the end of a program in order to terminate the interpreter. If you do not correctly terminate the interpreter, the web server may hang.
The MumpsVM interpreter ensures that only one copy is running at any given time on any given database files. MumpsVM opens the database files for exclusive access. Thus, MumpsVM programs should be short, transaction oriented jobs that do not delay the server. Note: since only one copy of the MumpsVM is ever running for a given database, all file accesses are also exclusive.
In order to prevent the MumpsVM from failing to halt after a user programming error and thereby hanging the web server, the MumpsVM when used with a web server is set to terminate its execution if it detects a user error. In non-Web server mode, MumpsVM returns to interactive mode after an error.
MumpsVM determines that it was invoked by a web server by detecting the QUERY_STRING environment variable. If you create this variable and invoke MumpsVM (as shown above), MumpsVM will assume it has been invoked by a web server.
If you halt a VM with a control-C or other external Kill command, the global array data base may be corrupted. Backup copies of critical data should be maintained. A dump function exists that copies the global arrays to an ASCII text file which can be used to reload the data base. Generally speaking, programs that do not access the globals, or that access them only to read their contents, do not corrupt the global arrays if forcibly terminated.
MumpsVM programs may be created with any standard system editor which does not introduce embedded control codes into the program. They are ordinary ASCII files with the following conventions:
OPEN 1:"MYFILE.DAT,NEW"
OPEN 2:"TABLE.DAT,OLD"
In the first of these, the file will be created and only output (WRITE) operations are permitted. In the second, the file is presumed to exist and only input operations will be permitted. If the NEW option is given, any previous generation of the file is deleted.
1 Multiple adjacent operators
The global arrays are automatically opened when the interpreter
begins execution and they are closed upon normal termination
of the interpreter.
If they are already in use by another instance of the interpreter, your
MumpsVM interpreter will wait.
The global arrays occupy two files. By default these are
key.dat and data.dat. If you want to use other files,
you need to change the defined constants UDAT and UKEY in
the memsize.h file.
You may select any names supported by your system.
Alternatively, the names of the global array files may be
read in when the interpreter is run. If you set the
defined variable UseConfigDB to 1 in memsize.h,
the interpreter will look for and read a file named mps.cfg
when the interpreter is loaded.
The file should contain two lines. One should contain the
keyword DATA followed by a file name and the
other line should contain the keyword KEY followed by
a file name. File names may contain path descriptions. The
file names will be used for the data and key portions of the global
arrays.
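The two-line layout of mps.cfg described above can be pictured with the following Python sketch. It is only a reading of the description: the function name parse_mps_cfg, the sample paths, whitespace as the separator, and the fallback to the default names when a line is missing are all assumptions, not documented behavior of the interpreter.

```python
def parse_mps_cfg(text, defaults=("data.dat", "key.dat")):
    """Read DATA and KEY lines from the text of an mps.cfg file (illustrative)."""
    names = {"DATA": defaults[0], "KEY": defaults[1]}
    for line in text.splitlines():
        fields = line.split(None, 1)   # keyword, then the rest of the line
        if len(fields) == 2 and fields[0] in names:
            names[fields[0]] = fields[1].strip()
    return names


# Hypothetical file contents placing the two files on different disks.
sample = "DATA /disk1/mumps/data.dat\nKEY /disk2/mumps/key.dat\n"
names = parse_mps_cfg(sample)
print(names["DATA"])   # /disk1/mumps/data.dat
print(names["KEY"])    # /disk2/mumps/key.dat
```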
You may want to place the files on
different disks to improve performance. One (UKEY) contains the
btree and the other (UDAT) contains data stored at nodes.
If you want to reinitialize global arrays, delete the old
files and run the interpreter. The interpreter will create new ones. Be certain to
delete both files - otherwise the system will improperly create
the new globals.
If your MumpsVM program terminates without properly
closing the global arrays, they are corrupt and must
be recreated. Dump and restore functions ($zcd, $zcl)
are provided that will dump and reload the contents of the global arrays
to/from a sequential ASCII file.
If your application frequently deletes global arrays, you may want
to periodically dump them, delete the old arrays, and restore
in order to remove holes from the file space.
While the interpreter attempts to reuse space, it
is not possible to reclaim all of it.
Introduction
The purpose of this section is to provide you with an introduction to the
MUMPS language in general and MumpsVM in particular.
The MUMPS language originated in the
mid-60's at the Massachusetts General Hospital. The acronym stands for
"Massachusetts General Hospital Utility Multi-Programming System". It is a
language which is similar in some respects to BASIC but it contains many
additional features not found in BASIC, or for that matter, in most other
languages.
In its full form,
MumpsVM is an interpretive language. In fact, parts of the language
specification require that it can never fully become a "compiled" language
such as FORTRAN, COBOL or PL/I.
See above for details.
Among the
features which make MumpsVM attractive for both bio-medical and general
scientific applications are:
Hierarchical data base facility.
MumpsVM data sets are not only organized along traditional sequential and
direct access methods, but also as hierarchical trees whose data nodes are
addressed as path descriptions in a manner which is easy for a programmer
to master in a relatively short time.
Flexible and powerful string manipulation facilities.
MumpsVM built-in string manipulation operators and functions provide
programmers with efficient means to effect complex string
manipulation and pattern matching operations.
Transportability to widely different systems
MUMPS presently runs under a large number of operating systems on many
machine architectures. These systems range in size from small home
micro-computers to the largest central time sharing systems. Through
efforts that have taken place by the MUMPS Development Committee over the
years, a well organized language definition has been written and formally
published. This standard provides for a far tighter specification for
system performance and linguistic definition than is normally the case. As
a result, programs written under a MUMPS system can be moved with
relatively little effort from one system to another.
Full numeric data handling facilities
MumpsVM provides, in addition to string handling facilities, a full range of
fixed and floating point computational facilities.
Basically, MumpsVM has only one data type: string, although it does allow
integer and floating point computations as well as logical expressions. A
string variable is restricted to 255 characters in length or less (20
characters or less if it is being used as a number). Note: this is a
restriction of this implementation of MumpsVM;
see above for a detailed list of such restrictions.
The values in a string may be any ASCII code from 1 to 128 (decimal)
inclusive with the exception that an ASCII 1 may not be used in a string
index to a global array. MumpsVM does not permit usage of the ASCII zero
character (NUL).
"THE SEAS DIVIDE AND MANY A TIDE"
When a string is being used as a number (e.g., in addition), the numeric
portion must be 20 characters or less in length. Numeric constants are
restricted to integer or decimal values (positive or negative). "E-type"
notation is not permitted. If a string begins with a number but ends with
non-numeric characters, only the numeric leading portion will participate
in operations requiring numeric operands (e.g., add, subtract, etc.); the
trailing non-numeric portion is lost. On the other hand, if a string
begins with non-numeric characters, its value will be interpreted as 0.
The following are examples:
1+2 will be evaluated as 3.
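The rule for giving a string a numeric interpretation can be modeled in a few lines of Python. This is a sketch of the rule as stated above, not the interpreter's code: the function name numeric_value is invented for the example, accepting a leading sign is an assumption, and the 20-character limit is not modeled.

```python
import re

# Optional sign, then digits with an optional decimal fraction.
# "E-type" notation is deliberately not recognized.
_LEADING_NUMBER = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")


def numeric_value(text):
    """Return the leading numeric portion of text, or 0 if there is none."""
    match = _LEADING_NUMBER.match(text)
    if match is None:
        return 0
    value = float(match.group(0))
    return int(value) if value.is_integer() else value


print(numeric_value("123 ELM STREET"))   # 123 (trailing text is lost)
print(numeric_value("ELM STREET 123"))   # 0   (begins with non-numeric text)
print(numeric_value("1E5"))              # 1   (no "E-type" notation)
print(numeric_value("1") + numeric_value("2"))   # 3
```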
Although "string" is the basic data type,
MumpsVM converts strings internally to floating point values
for calculations.
Consequently, numbers are of approximately 7 digit
precision. A number may range in magnitude from 10**-19 to 10**19.
Logical values in MumpsVM are special cases of the numerics.
A numeric value
of zero is interpreted as false while a non-zero value is interpreted as
true. Logical operators yield either zero or one and their results can be
treated like any other numeric. Similarly, the numeric result of any
numeric operator can be used as a logical operand. The results of string
operators are interpreted either as zero (leading characters non-numeric)
or some value (leading characters numeric). Strings and the results of
string operations can therefore participate as the operands of logical
operators.
Variables are named in MumpsVM in much the same manner they are named in
other languages.
A MumpsVM variable name must begin with a letter (A through
Z) or percent sign (%) and may be followed by either letters or numbers.
In general, variable names should be nine or fewer characters in length
(the maximum variable name length is 255 characters). Unlike most
languages,
MumpsVM variables are not automatically data typed by their first
letter.
MumpsVM, in effect, has only one data type, so any variable may
hold any value.
All MumpsVM variables are varying length strings (length may
range from 0 to 255 characters).
In the VM there are no data declaration statements.
Variables come into
existence through assignment statements (SET) or the "READ" command and
pass from existence through the "KILL" command.
In the VM, there are two kinds of arrays:
internal arrays and global arrays.
The following pertains to internal arrays: arrays are not dimensioned. A
name used as an array variable may also, at the same time, be used as a
scalar. Array values are created by assignment or appearance in a "READ"
statement. If you create an element of an array, let us say element 10, it
does not mean that MumpsVM
has created any other elements: that is, it does
not imply that there exist elements 1 through 9. You may specifically
create these or not. Array indices may be positive or negative numbers or
character strings. Arrays in MumpsVM may have multiple dimensions. The
following are some examples of arrays:
SET A(1,2,3)="ARRAY"
Global arrays are unique to MumpsVM. As a programmer, you will work with
them as though they were arrays. The system, however, interprets them as
tree path descriptions for the system's external data files.
A global array is distinguished by beginning with the circumflex character
(^). The remainder of the specification is the same as an internal array.
Global arrays are not dimensioned and they may appear anywhere an ordinary
variable may appear (except in certain forms of the "KILL" command).
A typical global array specification consists of the array name followed by
some number of indices (indices may be constants, variables [including
internal or global arrays] or expressions of string, numeric or mixed
type). For example:
SET ^A(1,43,5,99)="TEST"
The system files are viewed as trees. Each global array name ("A", "SHIP",
"CAPTAIN", and "HOME" in the above) is the root of a tree. The indices
are thought of as path descriptions to leaves. For example, out of the
root "A" there may be many branches, the above specifies to take the
branch labeled "1" (note: this does not mean the "first" branch out of the
node - it means the branch with label "1"). At the second level the
specification says to take the branch labeled "43" (note: this does not
imply that branches 1 through 42 necessarily exist). The path description
is followed (or, possibly, created if the global array specification
appears on the left hand side of an assignment statement or in a "READ"
statement) to a final node. The value at the node is either retrieved or a
new value stored depending upon the context in which the global array
specification was used. The indices of global arrays may be numeric or
character strings. The second sequence of examples above illustrates this
usage.
Both numeric and character indices may be mixed in the same path
description.
A value may be stored at any position in the tree. For example:
SET ^A(1,43,5)=22
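The tree view of global arrays can be sketched in Python as nodes that each hold an optional value and a set of labeled branches. This is a conceptual model only: MumpsVM keeps the tree in its key and data files, and the names Node, set_global and get_global are invented for the illustration.

```python
class Node:
    """One tree node: an optional value plus branches keyed by label."""

    def __init__(self):
        self.value = None
        self.branches = {}


def set_global(roots, name, indices, value):
    """Follow (creating as needed) the path given by indices and store value."""
    node = roots.setdefault(name, Node())
    for label in indices:
        node = node.branches.setdefault(str(label), Node())
    node.value = value


def get_global(roots, name, indices):
    """Return the value at the end of the path, or None if the path is absent."""
    node = roots.get(name)
    for label in indices:
        if node is None:
            return None
        node = node.branches.get(str(label))
    return None if node is None else node.value


roots = {}
set_global(roots, "A", (1, 43, 5, 99), "TEST")
set_global(roots, "A", (1, 43, 5), 22)          # interior nodes may hold values too
print(get_global(roots, "A", (1, 43, 5, 99)))   # TEST
print(get_global(roots, "A", (1, 43, 5)))       # 22
print(sorted(roots["A"].branches["1"].branches))  # ['43'] (branches 1..42 do not exist)
```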
The arithmetic unary operators are: + and -. The plus operator (+) has no
effect other than to force the expression to its right to be interpreted as
numeric. The minus operator forces numeric interpretation and negates the
result. For example:
SET I="123 ELM STREET"
The addition (+), subtraction (-), multiplication (*) and
exponentiation (**) operators perform
in the normal manner. Operands are given a numeric interpretation if
necessary. Operands may be either expressions, constants, variables or
array references. Results are computed in floating point if appropriate.
MumpsVM has two division operators: full division (/) and integer division
(\). Full division gives results which may have fractional parts. Integer
division truncates the answer to an integer.
The modulo operator (#) gives the left operand modulo the right operand.
The following are examples:
2+3 yields 5
The greater than (>) and less than (<) relational operators compare
numbers. If the operands are not numbers, they are given a numeric
interpretation. The result is either zero for FALSE or one for TRUE. For
example:
1 > 2 yields 0
Both operators may be negated producing not greater than ('>) and not less
than ('<) (note: the single quote mark is the negating operator). There are
no "less than or equals" or "greater than or equals" operators as such.
For example:
1' > 2 yields 1
The only binary string operator is concatenation (_) represented by an
underscore character. The following are examples:
"ABC"_"XYZ" yields "ABCXYZ"
The equals relational operator (=) tests for equality as in the following
example:
IF "ABC"="ABC" WRITE "EQUALS"
We would expect that "EQUALS" would be written to the terminal. The
not-equals operator is formed by the single quote mark and the equals sign.
The equals and not-equals operators may be used with strings or numbers.
The contains operator ([) determines if the right hand operand is contained
in the left hand argument. For example:
SET A="NOW IS THE TIME"
The word "YES" would be printed on the terminal.
The follows operator (]) is used to test if the left hand operand follows
the right hand operand in the collating sequence. For example:
SET A="ABC"
The word "YES" would be printed at the terminal.
The pattern matching operator (?) is used to determine if a string conforms
to a certain pattern. The patterns are:
A for the entire upper and lower case alphabet.
A pattern code is made up of one or more of the above, each preceded by a
count specifier. The count specifier indicates how many of the named item
must be present. Alternatively, an indefinite specifier - a decimal point
- may be used to indicate any count (including zero).
For example:
SET A="032-34-6304"
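One way to picture how a pattern is applied is to translate it into a regular expression, as in the Python sketch below. This is an illustration rather than the interpreter's algorithm. Only the code A appears in the list above; the code N (digits) is taken from Standard MUMPS and is an assumption here, as are the helper names and the sample pattern. Quoted literals are treated as items that carry their own count.

```python
import re

# Pattern codes handled by this sketch. "A" is from the text above;
# "N" (digits) is assumed from Standard MUMPS.
CLASSES = {"A": "[A-Za-z]", "N": "[0-9]"}

# One pattern atom: a count (digits) or "." followed by a quoted literal or a code.
_ATOM = re.compile(r'(\d+|\.)(?:"([^"]*)"|([A-Z]))')


def pattern_to_regex(pattern):
    """Translate a pattern such as 3N1"-"2N into a compiled regular expression."""
    position, pieces = 0, []
    while position < len(pattern):
        atom = _ATOM.match(pattern, position)
        if atom is None:
            raise ValueError("cannot parse pattern at position %d" % position)
        count, literal, code = atom.groups()
        unit = re.escape(literal) if literal is not None else CLASSES[code]
        repeat = "*" if count == "." else "{%s}" % count   # "." means any count
        pieces.append("(?:%s)%s" % (unit, repeat))
        position = atom.end()
    return re.compile("".join(pieces))


def matches(value, pattern):
    """True if the whole of value conforms to pattern."""
    return pattern_to_regex(pattern).fullmatch(value) is not None


print(matches("032-34-6304", '3N1"-"2N1"-"4N'))   # True
print(matches("032-34-630", '3N1"-"2N1"-"4N'))    # False (only three final digits)
print(matches("ABC123", ".A.N"))                  # True
```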
The logical operators AND (&), OR (!) and NOT (') may be applied in the
usual manner. The user should note, however, that since MumpsVM has strict
left-to-right precedence, the results can sometimes be odd:
1 & 1 yields 1
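The effect of strict left-to-right evaluation can be modeled with a short Python sketch that applies each binary operator as soon as its right operand is seen. The function name and the token-list input are assumptions made for the example; unary operators and parentheses are left out.

```python
def evaluate_left_to_right(tokens):
    """Evaluate [operand, op, operand, op, ...] with no operator precedence."""
    operations = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "&": lambda a, b: int(bool(a) and bool(b)),   # AND yields 0 or 1
        "!": lambda a, b: int(bool(a) or bool(b)),    # OR yields 0 or 1
        ">": lambda a, b: int(a > b),
        "<": lambda a, b: int(a < b),
    }
    result = tokens[0]
    for position in range(1, len(tokens), 2):
        operator, operand = tokens[position], tokens[position + 1]
        result = operations[operator](result, operand)
    return result


print(evaluate_left_to_right([1, "&", 1]))           # 1
print(evaluate_left_to_right([2, "+", 3, "*", 4]))   # 20, i.e. (2+3)*4, not 14
print(evaluate_left_to_right([1, "!", 0, "&", 0]))   # 0, i.e. (1!0)&0
```

In a language where AND binds more tightly than OR, the last expression would give 1. Parentheses are the usual way to force a different grouping.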
The "NOT" operator may be used in conjunction with other operators to form
compound operators. The resulting compound operators are:
'< not less than
Each statement in MumpsVM begins with a unique command word. Most of the
time, to save space in the VM, the command word is
abbreviated to a single character. The single character abbreviations are
unique for all commands except those which begin with the letter "Z". For
commands not beginning with the letter "Z", MumpsVM does not check the spelling
of the command word if more than one character of the spelling is given.
The first letter is used to determine the command. Thus "WRITE", "W", and
"WRIGHT" all have the same meaning.
When you run MumpsVM, you are initially in direct mode. That is, if you
type a command, the VM executes it immediately. You can tell that
you're in direct mode by the ">" character which the VM places at
the left-hand side of the screen.
In direct mode
you may enter a line which contains multiple commands.
The syntax of the command portion of a line of MumpsVM code consists (in the
general case - there are exceptions) of the command word or letter followed
(optionally) by a post-conditional, followed by exactly one blank followed
by the arguments to the command. Most commands can have multiple
arguments. Multiple arguments are delimited by commas. If a line is to
have more than one command, the first command is delimited by exactly one
blank and the next command word or letter follows immediately. Blanks are
very significant in MumpsVM.
As noted above, most commands may be "post-conditionalized". A
post-conditional is a logical expression which is used to determine if the
command (and all its arguments) should be executed. It is like a small
"IF" statement. In the VM,
some commands, such as "DO" and "GOTO", may not only be
post-conditionalized at the command level, but also at the argument level:
that is, a separate post-conditional may be specified for each argument. A
post-conditional appears as a colon followed by an expression. If the
expression evaluates to 0 (false), the command (or argument) is not
executed. If the expression evaluates non-zero, the command or argument is
executed.
The following are examples of the above:
an ordinary assignment statement:
SET I=10*5
same as above with command word abbreviation:
S I=10*5
an assignment statement with multiple arguments:
S I=10*5,J=5,K=I+J (K will equal 55)
an assignment statement post-conditionalized:
S:I=10 J=0 (set J to zero if I equals 10)
a multiple command line:
S I=10*5 S J=5 S K=I+J (same as above)
Table of Commands
Not used in MumpsVM. It is normally used to trigger a program halt.
The CLOSE command closes a unit number and makes it available for other
uses. It also frees the system buffers for other uses. The argument must
evaluate to a number which corresponds to an open unit. In MumpsVM this
must be in the range of 1 to 4 (other values will terminate your program).
An output file must be closed explicitly. Failure to do so may result in
loss of some or all of the file. The close command may be
post-conditionalized and it may have multiple arguments.
C 4
The DO command causes the VM to branch to the label specified and
continue execution beginning at that label. Execution proceeds until the
end of the program is reached or a QUIT command is encountered. When
either of these terminating conditions is achieved, the VM returns
to the original DO command and executes subsequent arguments or commands on
that line and following lines.
DO commands may have multiple arguments. They specify multiple routines
to be executed. The DO command may be post-conditionalized and each of its
arguments may be post-conditionalized.
In the interpreters, an argumentless DO may be used.
It causes the code on the immediately following line to be
executed. This group must be terminated by a QUIT.
An argumentless DO must be followed by two blanks (unless
it is the last command on a line). It may be
Post-Conditionalized.
This feature is normally used in connection with line
level indicators. For example:
Interpreter only:
The arguments to a DO command are normally program labels. They may,
however, be file names. If they are file names, MumpsVM loads the named
file, executes it, and returns to the invoking DO command. Invoked files
may invoke other files up to the internal storage limit of the VM.
After an invoked file has executed and returned to the calling routine, the
invoked routine is erased from the user's partition. This space is
available for additional routines. An invoked routine has access to the
entire symbol table. Any variables which it creates remain in the symbol
table unless explicitly removed with the KILL command.
A file name is indicated by the circumflex preceding the file name. The
file name may be either a literal (optionally enclosed in quotes) or
contained in a variable name. If the file name to be executed is contained
in a variable name, the variable name must be preceded by an at-sign (@).
The variable named must contain a circumflex immediately followed by the
file name. File names must conform to the naming conventions of the
machine on which you are operating. Normally, this will be a name
beginning with an alphabetic character, followed by alphabetic or numeric
characters up to a limit of six, followed by a period followed by the file
extension "MPS".
Note: MumpsVM does not assume any file extension by default.
Both file name arguments and internal label arguments may be used in the
same command. File name argument forms may be individually
postconditionalized.
For example:
DO ^PGM1.MPS
There are two more
forms of the DO: they permit you to specify both a file and a label within
the file to be executed and an offset from a label.
(MumpsVM does not permit offsets from labels
but these are permitted in Standard MUMPS). These constructions
may also be contained in a variable name: the variable name is preceded by
an at-sign (@) in the DO command. For example:
DO LAB3^"PGM1.MPS"
The ELSE command tests the value of the system wide built-in variable
$TEST. If $TEST (abbreviated as $T) is zero, the remainder of the line
on which the ELSE appears is executed. If $T is not zero, the remainder of
the line is not executed. $TEST is set, among other ways, by the IF
statement. Since ELSE does not take arguments, it must be followed by two
blanks.
For example:
ELSE  S I=10
In the VM,
the FOR command specifies repeated execution of the current line with a
selected value for an index variable. There are three formats: the first
is one in which a local variable (i.e., a global may not be used here) is
set successively to values in a list for each execution of the portion of
the current line following the FOR; the second is one in which a local
variable is set to an initial numeric value, incremented by a fixed amount
and the portion of the current line remaining after the FOR is executed
until the local variable exceeds an upper limit; the final form is an
infinite loop form with a local variable being incremented by a fixed
amount with no upper limit test being performed. A given FOR command may
have multiple arguments: each argument may be in any of the above formats.
The FOR command has scope only for the remaining portion of the current
line. A line containing a FOR command may invoke other lines of code,
however, by means of a DO or XECUTE command. In these cases the index (or
indices) are valid in the remote code executed.
Multiple FOR commands may appear nested on the same line. A QUIT command
may be used to prematurely terminate the execution of a FOR command. If
there are nested FOR commands on the line, a QUIT applies to the most
recent FOR. A QUIT used in the context of a FOR command will not cause a
return to a DO command.
For example:
F I=1,2,5,99
The remainder of the above line will be executed 4 times. The variable I
will, successively, have the values 1, 2, 5 and 99.
F J=2:4:20
In the above, the remaining portion of the line on which the FOR command
appears will be executed 5 times with the value of J being 2, 6, 10, 14,
and 18. Note that the first part (2) is the starting value; the second (4)
is the increment; and the third (20) is the termination condition.
F K=1:1
The above specifies an infinite loop. K will be incremented by one with no
upper limit. The user must QUIT, HALT or GOTO to exit the loop.
F L=10:-1:0,13,15:1
The above uses three arguments. The first specifies a loop with L ranging
from 10 to zero stepping by minus one (10, 9, 8, ... 0); then L has the
value 13; then L cycles with no upper limit from 15 upwards with an
increment of one.
F I="ABC","XYZ",1:1:20,"XXX"
"I" will have the values ABC and XYZ (strings); then it will cycle from
one to twenty by one; then, finally, it will have the value XXX (string).
F A(99)=1:1:20
The local array element A(99) will cycle from one to twenty
in steps of one.
F I=J+1:K*L:A(9,3,2,1)
In the above, expressions are used to specify the parameters to an iterative
form of the FOR statement. Any valid expression may be used, including
those involving global array references. If the iterative forms are used,
the expressions will be interpreted as numerics.
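The sequence of values that a FOR command gives its index variable can be modeled with Python generators. This is a sketch of the three argument forms described above, not the interpreter's implementation. The names literal, for_values and for_sequence are invented for the example, only integer constants and quoted strings are handled (no expressions), and quoted strings must not contain commas or colons.

```python
from itertools import count, islice


def literal(text):
    """Convert an integer constant to int; treat anything else as a string."""
    try:
        return int(text)
    except ValueError:
        return text.strip().strip('"')


def for_values(argument):
    """Yield the values produced by one FOR argument."""
    parts = argument.split(":")
    if len(parts) == 1:                 # list form: a single value
        yield literal(parts[0])
        return
    start, step = int(parts[0]), int(parts[1])
    if len(parts) == 2:                 # start:step with no upper limit
        yield from count(start, step)
        return
    limit = int(parts[2])               # start:step:limit
    value = start
    while (value <= limit) if step >= 0 else (value >= limit):
        yield value
        value += step


def for_sequence(arguments):
    """Yield the values for a comma-separated list of FOR arguments, in order."""
    for argument in arguments.split(","):
        yield from for_values(argument)


print(list(for_sequence("1,2,5,99")))   # [1, 2, 5, 99]
print(list(for_sequence("2:4:20")))     # [2, 6, 10, 14, 18]
# The last argument never terminates, so only the first 14 values are taken.
print(list(islice(for_sequence("10:-1:0,13,15:1"), 14)))
```

Because the open-ended form never finishes, any argument after it in the list is never reached, which matches the need to QUIT, HALT or GOTO out of such a loop.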
The GOTO command causes unconditional transfer of control.
For the VM, arguments are
specified in the same manner as the DO command listed above. You may
post-conditionalize both the command word and the individual arguments.
You may specify a label, a variable containing a label (variable name must
be preceded by an at-sign [@]), a file name (preceded by circumflex with
the file name optionally contained in a quoted field), a label and a file
name, or a variable name containing a file name or a label and file name
(preceded by an at-sign [@]).
If you specify one of the file name forms, the named file completely
replaces the transferring program. The symbol table is, however, left
intact. Any open devices remain open and may be used by the newly loaded
program. The only way to return to the original program is by another
transfer of control (either a DO or GOTO).
The MumpsVM VM does not permit numeric offsets from
labels (as in the second, third and seventh examples below) although
these are permitted in full Standard MUMPS.
For example:
GOTO LAB1
GOTO LAB1+10
G LAB1+I*K
G ^PGM1.MPS
G:I-J ^"PGM1.MPS"
G LAB1^"PGM1.MPS"
G LAB1+I^"PGM1.MPS"
S A="LAB1^PGM.MPS"
G @A
The HALT command terminates execution of the MumpsVM VM and
returns you to the operating system or web server.
It may be post-conditionalized. For example:
H:I=3
The HANG instruction suspends execution of your program for a specified
period of time (in seconds). It takes as an argument the number of seconds
to wait. It may be post-conditionalized. The HANG instruction differs
from the HALT instruction only in the argument: a HANG without an argument
is a HALT instruction. For example:
H:I=J 2*K
The IF command permits conditional execution.
It has two forms: the first
takes no arguments and the second takes one or more arguments. In the
first form, the value of $TEST is examined. If $TEST is 1 (true), the
remainder of the current line is executed. If $TEST is 0 (false), the
remainder of the current line is not executed. If the no-argument form of
the IF is specified, you must include two blanks following the letter I or the
word IF to signify the omitted arguments.
The other form of the IF command takes arguments. The arguments are
evaluated and their result is used to set $TEST. If an argument
expression evaluates as non-zero, $TEST is set to 1 (true). If an
argument expression evaluates to zero, $TEST is set to 0 (false). If
multiple argument expressions are present, they, in effect, are and'ed to
produce a final result. The final result in $TEST is used to determine
whether the remainder of the line should be executed. Note: there need not
be any other commands on the line. The IF statement may be used solely to
set $TEST. Note also that expressions are evaluated left to right. This
sometimes causes problems for people used to dealing with FORTRAN or BASIC.
For example, the expression:
I=0&J<0
is always false since it is parsed as:
(((I=0)&J)<0)
if I is zero, the first expression is true (value of 1); if J is less than
zero, then J is interpreted as true giving, as a result of the AND
operation (&), a value of 1 which is not less than zero - therefore false.
If I is not zero then, regardless of the value of J, the AND operation
results in false (value of zero) which is not less than zero - therefore
false. The expression should have been written as:
(I=0)&(J<0)
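The difference between the two forms can be checked with a small model of strict left to right evaluation. This Python sketch is an illustration only; mumps_eval, written and intended are hypothetical names, and only the three operators used in the example are modelled.

```python
def mumps_eval(tokens):
    """Evaluate a flat operand/operator list strictly left to right."""
    ops = {
        "=": lambda a, b: int(a == b),
        "<": lambda a, b: int(a < b),
        "&": lambda a, b: int(bool(a) and bool(b)),
    }
    result = tokens[0]
    # Apply each operator to the running result and the next operand.
    for op, operand in zip(tokens[1::2], tokens[2::2]):
        result = ops[op](result, operand)
    return result

def written(i, j):
    # I=0&J<0, which is parsed as (((I=0)&J)<0)
    return mumps_eval([i, "=", 0, "&", j, "<", 0])

def intended(i, j):
    # (I=0)&(J<0)
    return int(i == 0 and j < 0)

for i, j in [(0, -5), (0, 5), (3, -5)]:
    print(i, j, written(i, j), intended(i, j))
```

For every pair of values the unparenthesized form gives 0, while the parenthesized form gives 1 when I is zero and J is negative.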
Note that the IF command may have multiple arguments. These are
equivalent to AND'ed expressions. For example:
IF A=10,B=20 ...
The above is the same as saying:
IF (A=10)&(B=20) ...
Either form is acceptable. The OR operator may also be used:
IF (A=10)!(B=20) ...
The JOB command is not implemented.
The KILL command is used to prune the symbol table and to delete parts of
the global arrays.
There are three forms of the KILL command: the first
deletes all entries in the local (i.e., non-global) symbol table; the
second deletes specific elements from the local symbol table or specific
elements from the global arrays; and the third is used to delete all
elements from the local symbol table except for certain named symbols. All
forms may be post-conditionalized.
The first form - delete the entire local symbol table - is denoted by the
KILL command alone:
KILL
The second form appears as a list of references (note: indirection for the
names is permitted):
KILL A,B,^G(1,2,33)
The above would delete variables A, B and the global array node ^G(1,2,33).
Note: if the global array node ^G(1,2,33) has descendants, they are also
deleted. Also, if a local array node is deleted, any of its descendants
are also deleted.
The final form of the KILL may be used to delete all elements from the
local symbol table except for certain protected elements. It has the
following format:
KILL (A,B(1,1),K)
In the above, the local symbol table will be deleted except for variables
A, B(1,1) and K. All other variables will be lost. Global array nodes may
not be used in this form of the KILL statement. Indirection is permitted.
The LOCK command gains exclusive access to a portion of the data base
for an individual user. A LOCK with no arguments frees all prior
LOCKs.
The OPEN command opens sequential files and associates them with unit
numbers. Opened files may be read or written. This implementation
permits unit numbers 1, 2, 3, and 4 to be used by the programmer.
Unit 5 is reserved for the normal user console and unit 6 is reserved
for Unix full screen console I/O.
The OPEN command takes a device number (either a number or an expression
which evaluates to a number in the range of 1 to 4) and a file name. The
file name must either be a variable name or a quoted literal. The file
name consists of a valid file name followed by one of /NEW, /APPEND or /OLD.
If followed by /NEW, the VM assumes you are
opening the file for output: any previous file with this name is lost.
You may only write to this file. If you open the file with the /OLD
option, the VM assumes the file exists and opens it for input only
(reads).
If you specify /APPEND, the file is opened for output with all
new data written to the file appearing after the previously existing
data.
If an error takes place, $TEST is set to zero and the remainder
of the command is not inspected. If no error takes place,
$TEST will be one. The user should not attempt to reference
units for which the OPEN command returned a $TEST value of 0. For example:
For the VM:
You should be careful to CLOSE any file you have opened for output in order
not to lose any of the file's contents. You must CLOSE an open unit
number before re-using the unit number in another OPEN command.
In cases where the file name contains "/" characters, a comma
may replace the "/" prior to the NEW, OLD, or APPEND.
In the VM,
the QUIT command provides an exit point for FOR, DO or XECUTE commands.
It may be post-conditionalized. It takes no arguments (therefore, you need
two spaces after it if there are any commands following it on the same
line). In the FOR case, the QUIT terminates the nearest loop. In the DO
case, the QUIT returns to the most recently invoking command. In the
XECUTE case, execution of the XECUTE text is terminated and control is
returned to the original command line.
The READ command reads data into variables. It may be
post-conditionalized. Ordinarily the READ command reads from the user's
terminal (unit number 5). The READ command can be redirected to other
devices and files, however, by use of the USE command (see below). It is a
common source of error - sometimes quite destructive errors - for the user
to READ or WRITE to the wrong device. Many MumpsVM programmers explicitly
place the USE command immediately prior to the READ or WRITE commands.
Ordinarily, the READ command takes one or more arguments which may be local
scalar variables, local array elements, or global array elements. Each of
these is read successively from the input device. When more than one
argument is present, a carriage return / line feed is taken as the
delimiter between the successive input values. For example:
READ A,B,^A(1,3,99)
If the input is derived from the user's terminal, the user might type the
following sequence in response to the above:
22
The READ command may also write before it reads. This mode is only
permitted at the user's console. The options permitted for "write before
read" are: a literal constant in quotes; and a tab, new line, or new page
operation. A tab operation is specified by a question mark followed by an
expression which is interpreted as arithmetic. The effect is to cause the
cursor to move to the named column on the page or screen. The new line
operation is caused by typing an exclamation point (!). The VM
will generate a carriage return / line feed pair. A new page is induced by
the pound-sign (#) character. For example, if you wish to read name,
social security number and password with user input from column 20, the
following might be appropriate:
R "NAME",?20,N,"SSN",?20,S,"PASS",?20,P
After the above, the variable N will contain the name, S the social
security number and P the password.
The READ command also permits single character input. That is, a read
operation will be satisfied as soon as the user strikes any character on
the keyboard: no carriage return is required. The variable will contain a
number which is the equivalent of the ASCII character struck. This mode is
denoted by preceding the variable to be read by an asterisk. For example:
READ "ENTER A LETTER ",*A
The READ is satisfied when any character is struck.
There is also another form of the READ: that which contains "time-outs".
Time-outs permit the programmer to specify a maximum interval of time which
the VM should wait for the user to reply to the READ operation.
The time-out may be used with either the regular or character by character
mode. The time-out is specified by placing a colon after the variable
followed by an expression which will be interpreted as numeric. The value
of the expression is the amount of time in seconds to wait for a user
response to this operation. If the user fails to respond in the required
interval, the $TEST built-in variable is set false (0) and the variable
contains nothing. If the user does respond in time, $TEST is set true (1)
and the variable contains the user's reply. For non-character by character
mode input, the user must type the carriage return for the input to be
valid. If the time-out expires before the user types the carriage return,
all input is lost. For example:
AGAIN READ "ENTER NAME ",N:20 IF '$TEST GOTO AGAIN
The SET command is the assignment command for MumpsVM. The expression to the
right hand side of the equals sign is evaluated and placed in the storage
associated with the variable on the left hand side. Global variables may
be used on the left hand side. The SET command may be
post-conditionalized. The $PIECE function may be used on the left
hand side of a SET command. For example, if the variable a contains
the value "aaa.bbb.ccc", the command:
SET $P(a,".",2)="xxx"
will result in the value of a becoming "aaa.xxx.ccc".
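The effect of assigning to a piece can be modelled in a few lines of Python. This is a sketch, not the VM's implementation: set_piece is a hypothetical helper, and padding with empty pieces when the target piece does not yet exist is an assumption borrowed from standard MUMPS that this document does not confirm for MumpsVM.

```python
def set_piece(value, delimiter, index, replacement):
    """Return value with its index-th delimited piece replaced (1-based)."""
    pieces = value.split(delimiter)
    # Assumption: pad with empty pieces if the string has fewer than index pieces.
    while len(pieces) < index:
        pieces.append("")
    pieces[index - 1] = replacement
    return delimiter.join(pieces)

a = "aaa.bbb.ccc"
a = set_piece(a, ".", 2, "xxx")
print(a)  # aaa.xxx.ccc
```

Only the selected piece changes; the text on either side of it is carried over untouched.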
The USE command tells the VM which device (unit) number to use for
input output operations (READs and WRITEs). The unit designated as the
input/output device remains in effect until changed by the USE again or an
error occurs. If the VM detects an error it always resets the
current device number to 5 - the number associated with the user's console
terminal. The valid range of unit numbers in MumpsVM is 1 through 5. The
current unit number can be determined from the $IO (abbreviation: $I)
built-in variable. The command is specified as the letter U or the word
USE followed by an expression which is interpreted as numeric. For
example:
USE 1
The VIEW command is not used in this implementation.
The WRITE command transmits data to an output device. Normally output is
directed to the user's console terminal. By the USE command, however, data
may be directed to another unit number and its associated file. A unit
number must be opened prior to writing to it.
The WRITE command takes as arguments either literals in quotes, numeric
constants, variable names: both local and global, and control codes and
expressions for tab, new line and new page operations. The control
operations are the same as those discussed above for the READ operation.
Output is stream oriented: that is, each WRITE does not begin on a new
line: it begins where the last line left off. Wrap-around occurs at your
terminal depending upon the current terminal monitor level width setting.
You may specify single character output by an asterisk followed by a
decimal number. The system will send the ASCII character associated with
the number you specify (e.g., *7 will send the BELL character). For
example:
WRITE "NAME OF PATIENT",?25,NAME
The XECUTE command can take one or more arguments. The arguments must be
strings. The XECUTE executes each of the arguments in order: that is, it
causes the values in the string arguments to be interpreted as MumpsVM
command lines. As such, the strings may contain any valid MumpsVM code
including XECUTEs (be careful though). For example:
SET A="FOR I=1:1:20 W !,I"
The above XECUTE will cause the numbers 1, 2, 3, ... 20 to be printed down
the side of the terminal. You may construct strings in local or global
variables.
In ANSI MUMPS, command names beginning with Z are reserved for special
commands added by the implementor.
The ZE command may be used to terminate a VM.
It may be executed either from direct or program mode.
The ZE command causes immediate termination. It should be the only command
on a line. The Halt command may also be used to terminate execution.
When executing with a web server, it is important that your
program ultimately terminate a VM. Simply ending
a program (Quit) and returning to interactive mode will
cause your web server to crash.
The ZP command is used to pass the head of a logical predicate to
the logic VM.
If the logic analyzer evaluates the predicate as true, $TEST is set
to 1, 0 otherwise. Variables in the VM symbol table may also be
created or altered as a result of the logic VM's execution.
This feature is not presently enabled.
ZSeek address repositions the currently open file system to
the byte address given by address. This is equivalent to the
C language lseek() function.
$ASCII returns the numeric value of an ASCII character. The string is
specified in e1. If no i2 is specified, the first character of e1 is used.
If i2 is specified, the i2'th character of e1 is chosen. For example:
$ASCII("ABC") YIELDS 65
Like $Next except it gives the previous value of the last global
array index. (non-standard function).
$CHAR translates numeric arguments to ASCII character strings. Numeric
values greater than 128 will generate errors. For example:
$CHAR(65) yields "A"
$DATA returns an integer which indicates whether the variable vn is
defined. The value returned is 0 if vn is undefined; 1 if vn is defined
and has no associated array descendants; 10 if vn is defined but has no
associated value (but does have descendants); and 11 if vn is defined and
has descendants. The argument vn may be either a local or global variable.
For example:
$DATA(A)
$EXTRACT returns a substring of the first argument. The substring begins
at the position noted by the second operand. If the third operand is
omitted, the substring consists only of the i2'th character of e1. If the
third argument is present, the substring begins at position i2 and ends at
position i3. Note that this differs from the usual SUBSTR function in
PL/I.
If only "e1" is given, the function returns the first character of the
string "e1".
If i3 specifies a position beyond the end of e1, the substring ends
at the end of e1. For example:
$EXTRACT("ABC",2) YIELDS "B"
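The position rules above map directly onto string slicing. The following Python sketch models them; the function name extract and the handling of positions below 1 are assumptions made for illustration.

```python
def extract(e1, i2=1, i3=None):
    """Model of $EXTRACT: characters i2 through i3 of e1 (1-based, inclusive)."""
    if i3 is None:
        i3 = i2                  # two-argument form: a single character
    start = max(i2, 1)           # assumption: positions before 1 are clamped to 1
    if i3 < start:
        return ""
    # Slicing stops at the end of the string when i3 is too large.
    return e1[start - 1:i3]

print(extract("ABC", 2))         # B
print(extract("ABCDEF", 3, 5))   # CDE
print(extract("ABC", 2, 10))     # BC
print(extract("ABC"))            # A
```

Unlike a SUBSTR-style function, the third argument is an end position and not a length.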
$FIND searches the first argument for an occurrence of the second
argument. If one is found, the value returned is one greater than the end
position of the second argument in the first argument. If i3 is specified,
the search begins at position i3 in argument 1. If the second argument is
not found, the value returned is 0. For example:
$FIND("ABC","B") YIELDS 3
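The returned position is easy to get wrong by one, so a small model may help. This Python sketch follows the description above; find is a hypothetical name, and str.find is used only as a convenient search primitive.

```python
def find(e1, e2, i3=1):
    """Model of $FIND: position just past the first match of e2, or 0."""
    # Search from 1-based position i3; str.find returns a 0-based index or -1.
    pos = e1.find(e2, max(i3, 1) - 1)
    if pos < 0:
        return 0
    # Convert to 1-based and step past the end of the match.
    return pos + len(e2) + 1

print(find("ABC", "B"))          # 3
print(find("ABCABC", "A", 3))    # 5
print(find("ABC", "X"))          # 0
```

The result is the position at which a following search or $EXTRACT would naturally continue.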
$JUSTIFY right justifies the first argument in a string field whose length
is given by the second argument. In the two operand form, the first
argument is interpreted as a string. In the three argument form, the first
argument is right justified in a field whose length is given by the second
argument with i3 decimal places. The three argument form imposes a numeric
interpretation upon the first argument. For example:
$JUSTIFY(39,3) YIELDS " 39"
The $LEN function returns the string length of its argument. For example:
$LEN("ABC") YIELDS 3
The $NEXT function gives the next higher value for the last index in an
array (local or global). If there is no higher value, $NEXT returns -1.
You may find the first value for the last index by invoking the function
with -1 in the last position. Remember, MumpsVM arrays are sparse and not
all index values necessarily exist. For example:
$NEXT(A(1,1)) yields 3 if A(1,3) exists and A(1,2) does not.
$NEXT(A(1,1)) yields 99 if A(1,99) exists and no node exists between
A(1,1) and A(1,99).
$NEXT(A(-1)) yields 1 if A(1) is the first index value for the first index
level.
See details above under "Implementation Notes" concerning
the collating sequence. Numeric subscripts are presented in alphabetic,
not numeric, collating sequence.
Not presently implemented.
The $PIECE function returns a substring of the first argument delimited by
the instances of the second argument. The substring returned in the three
argument case is that substring of the first argument that lies between the
i3'th minus one and i3'th occurrence of the second argument. In the four
argument form, the string returned is that substring of the first argument
delimited by the i3'th minus one instance of the second argument and the
i4'th instance of the second argument.
If only two arguments are given, i3 is assumed to be 1. For example:
$PIECE("A.BX.Y",".",2) YIELDS "BX"
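The two, three and four argument forms can be modelled by splitting on the delimiter. This Python sketch is an illustration under the assumption that the delimiter is a non-empty string; piece is a hypothetical name.

```python
def piece(e1, e2, i3=1, i4=None):
    """Model of $PIECE: pieces i3 through i4 of e1, delimited by e2."""
    if i4 is None:
        i4 = i3                  # two and three argument forms: one piece
    parts = e1.split(e2)         # assumption: e2 is not the empty string
    start = max(i3, 1)
    # Rejoin the selected pieces with the delimiter between them.
    return e2.join(parts[start - 1:i4])

print(piece("A.BX.Y", ".", 2))      # BX
print(piece("A.BX.Y", "."))         # A
print(piece("A.BX.Y", ".", 2, 3))   # BX.Y
print(piece("A.BX.Y", ".", 5))      # empty string: there is no fifth piece
```

In the four argument form the delimiters between the selected pieces are kept in the result.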
In the VM,
$P can be used on the left hand side of a SET command or
as an argument in a READ command. In these cases, the first
argument must be a local or global variable. The contents of this
variable are altered to the value of the right hand side of the SET statement
or the value read by the READ statement.
The entire contents of the local or global variable are not altered,
only the part which would have been extracted by the $P function.
$RANDOM returns an integer in the range zero through i1-1. For example:
$RANDOM(100) yields a value between 0 and 99
The $SELECT function takes a variable number of arguments delimited by
commas. Each argument consists of two parts: a logical expression and a
result expression. The function evaluates in sequence each of the logical
expressions (shown above as t1, t2, ...tn - note: these can be any
expression in reality: a zero result is called false and a non-zero result
is called true). If a logical expression is true, the result expression
(e1, e2, ... en) is evaluated and becomes the value for the function.
$TEXT returns a line of source program text. If just a label is given,
the source program text at the label is returned. If a
label plus an offset expression (numeric result) is given, then the line of
source text returned is some number of lines forward of the line with the
noted label. If an arithmetic expression is given, then
the line of source text I1 lines from the beginning of the program is
returned. In this version of MumpsVM, the value of a label is its line number.
$VIEW is not supported.
$Z functions are extensions added by the implementor. The MumpsVM
VM has several $Z functions:
The $H built-in variable returns a string consisting of two numbers. The
first is the number of days since December 31, 1840 and the second is the
number of seconds since the most recent midnight. The variable may not
appear as the target of an assignment or READ command.
The MumpsVM gives these values relative to Greenwich Mean Time.
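A $H value can be converted to a calendar date and time by adding the two counts to the base date. The Python sketch below assumes the two numbers are separated by a comma, as in standard MUMPS; this document does not state the separator, and decode_horolog is a hypothetical name.

```python
from datetime import datetime, timedelta

def decode_horolog(h):
    """Convert a 'days,seconds' $H string to a datetime (taken as GMT)."""
    # Assumption: the two numbers are separated by a comma.
    days, seconds = (int(part) for part in h.split(","))
    # Day 0 is December 31, 1840, so day 1 is January 1, 1841.
    return datetime(1840, 12, 31) + timedelta(days=days, seconds=seconds)

print(decode_horolog("1,0"))         # 1841-01-01 00:00:00
print(decode_horolog("47117,3600"))  # 1970-01-01 01:00:00
```

The returned datetime carries no time zone information; per the text above it should be read as Greenwich Mean Time.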
$IO gives the current unit number. MumpsVM I/O is, at any given time,
directed to a given unit number. In MumpsVM, I/O, by default, is directed
to unit 5 - the user's console. This is the unit from which all READs and
WRITEs will take place. If the user opens another unit number for
further file operations, the USE command is used to redirect the READ and
WRITE commands to this unit. The $IO variable indicates the current I/O
unit number. It may not appear as the target of an assignment statement or
as an argument of a write command although it may appear in both contexts
as a source argument such as in computation of an index of a target array.
The $JOB variable returns the system job number.
This is the process PID.
The $STORAGE variable returns the amount of free space remaining in the
user's area. In this implementation, the user partition is
normally 30,000 characters in length. The symbol table is located at the
top of this area and the user program is located at the bottom. The symbol
table, as it grows due to the creation of variables and the increase in
string values of variables, grows downwards while the program space grows
upwards. The $S variable indicates
the amount of space remaining between the two areas. The user should note
that the area between the program and symbol table is used by the parser
for intermediate expression evaluation. Thus, even though storage may
exist between the two areas before and after a command, a command, due to
expression evaluation may cause the free space to be exhausted.
The $X variable gives the current horizontal position of the record in the
current unit number. For terminals, this is the horizontal cursor
position. For other files, it is the number of characters since the start
of the current record.
The $Y variable gives the vertical position of the current unit number. It
is pre-set to zero for each top of forms format control used.
Output
Normally in a MumpsVM program, you use the write statement to
generate program output that appears as generated on the console
running the MumpsVM program. When running under a web server, however,
all your output will be captured by the web server and sent to the
web browser. The web browser will format and place your output
on the browser's screen.
In order to control the placement, font size, color and other
factors concerning the display of your output, you must embed
in your output HTML codes. The browser will use
these in determining the manner in which to display your output.
Any output you write to the default output device (unit 5) will
be sent to the browser by the web server. You may use the
write statement to send both text to be displayed as well as
HTML codes. For example, given that the variable
ptid contains "1234":
to the browser and this will cause the text to be centered on the
browser's screen.
In order to speed the development process, this VM also
supports another form of output that allows easier mixing of
HTML and MumpsVM code. In a MumpsVM program, if a line
does not contain a TAB character (required as the start character
on a line with no label or the separator between the label and the
text of the line in a line with a label), the line will be written
to the default (unit=5) output. Before writing the line, the VM
will scan the line for:
2 Unmatched quotes
3 Global not found
4 Missing comma
5 Argument not permitted
6 Bad character after post-conditional
7 Invalid quote
8 Label not found
9 Too many/few fcn arguments
10 Invalid number
11 Missing operator
12 Unrecognized operator
13 Keyword
14 Argument list
15 Divide by zero
16 Invalid expression
17 Variable not found
18 Invalid reference
19 Logical table space overflow
23 Symbol table full
24 Function argument error
25 Global not permitted
26 File error
27 $N error
29
30 Function not found
31 Program space exceeded
32 Stack overflow
"123.45"
"BRIDGET O'SHAUNESSEY? YOU'RE NOT MAKING THAT UP?"
"""THE TIME HAS COME,"" THE WALRUS SAID"
"ABC"+2 will be evaluated as 2.
"1AB"+2 will be evaluated as 3.
"AA1"+2 will be evaluated as 2.
"1"+"2" will be evaluated as 3.
READ TEST(22)
WRITE TEST(22)
SET I=10 SET A(I)=10
SET A("TEST")=100
SET I="TESTING" SET A(I)=1001
SET A("MUMPS","USERS'","GROUP")="MUG"
SET ^SHIP("1ST FLEET","BOSTON","FLAG")="CONSTITUTION"
SET ^CAPTAIN(^SHIP("1ST FLEET","BOSTON","FLAG"))="JONES"
SET ^HOME(^CAPTAIN(^SHIP("1ST FLEET","BOSTON","FLAG")))=
... "PORTSMOUTH"
WRITE ^SHIP("1ST FLEET","BOSTON","FLAG")
... CONSTITUTION
WRITE ^CAPTAIN("CONSTITUTION")
... JONES
WRITE ^HOME("JONES")
... PORTSMOUTH
WRITE ^HOME(^CAPTAIN("CONSTITUTION"))
... PORTSMOUTH
SET ^A(1,43)="TEST MIDDLE LEVEL"
WRITE +I yields 123
WRITE -I yields -123
2.31+1 yields 3.31
3-5 yields -2
7/4 yields 1.75
7\4 yields 1
11#3 yields 2 (please see notes)
2 > 1 yields 1
1 < 2 yields 1
2 < 1 yields 0
2 '> 1 yields 0
1 '< 2 yields 0
2 '< 1 yields 1
"ABC"_123 yields "ABC123"
123_456 yields 123456
IF A["THE" WRITE "YES"
IF A]"AAA" WRITE "YES"
C for the 33 control characters.
E for any of the 128 ASCII characters.
L for the 26 lower case letters.
N for the numerics
P for the 33 punctuation characters.
U for the 26 upper case characters.
A literal string.
IF A?3N1"-"2N1"-"4N WRITE "OK"
SET A="JONES, J. L."
IF A?.A1",".A WRITE "OK"
2 & 1 yields 1
1 & 0 yields 0
1!1 yields 1
1!0 yields 1
0!0 yields 0
2!0 yields 1
1 & 0 < 1 yields 1 (parsed as (1 & 0) < 1)
1 & (0 < 1) yields 1
'> not greater than
'= not equal
'[ not contains
'] not follows
'? not pattern
C I
C:K=J 1,2
DO:I=J "PGM1.MPS"
DO "PGM1.MPS":I=K
DO "PGM1.MPS":I=J,"PGM2.MPS":K=L
DO LAB1:I=10;"PGM1.MPS"
SET A="PGM1.MPS"
DO @A
S A="LAB3^PGM.MPS"
DO @A
DO LAB1+2
DO LAB1+2^"PGM1.MPS"
DO LAB1+I*K
DO LAB1+I*K^"PGM1.MPS"
38
NOW IS THE TIME
WRITE !,"AGE",?20,AGE
XECUTE A
Built-in Variables
$ASCII("ABC",1) YIELDS 65
$ASCII("ABC",2) YIELDS 66
$ASCII("") YIELDS -1
$CHAR(65,66) yields "AB"
$DATA(A(1,1))
$EXTRACT("ABCDEF",3,5) YIELDS "CDE"
$FIND("ABCABC","A",3) YIELDS 5
$JUSTIFY("TEST",7) YIELDS "   TEST"
$JUSTIFY(39,4,1) YIELDS "39.0"
$LEN(22.5) YIELDS 4
If a second argument is given, the function returns the number
of non-overlapping occurrences of "e2" in "e1" plus 1.
$PIECE("A.BX.Y",".",1) YIELDS "A"
$PIECE("A.BX.Y",".",2,3) YIELDS "BX.Y"
Writing MumpsVM scripts for CGI execution is relatively simple.
The main difference between writing ordinary console based MumpsVM
programs and those to be executed in the CGI interface of a web server
concerns input/output.
In March, July, October and May, the Ides fall on the fifteenth day.