29 December 2015
Regina Rexx is a very complete and well-documented implementation of the Rexx programming language. Unfortunately its documentation is almost entirely reference documentation, which makes the rarer facilities harder to learn.
One such facility is the whole foreign language interface. Regina follows IBM’s SAA API, but this is an API that is not particularly popular outside of IBM’s notoriously inward-looking world. It is definitely not an API that is well-endowed with end-user documentation, especially on the Rexx integration side of things.
There are many examples of Rexx libraries which expose facilities to Rexx users that can be looked at, but unfortunately almost all of them rely on compatibility shims that make it very hard to piece together what’s actually going on when you’re learning. Further, the compatibility shims are a layer of unnecessary obfuscation if you, like me, only ever really intend to use Regina Rexx in your code.
Because of these deficiencies in accessible documentation I am embarking on a small series of blog entries to ease entry into the Rexx extension field. Today’s outing will introduce simply adding a single external function to Rexx’s capabilities.
For this baby step into extending Rexx we will be exposing a single, simple function: ExternalMultiply(). ExternalMultiply() will accept any number of integers (each within the range of what will fit into a C long) and multiply them together, returning the product (provided the product fits into a C long long). For example the following line will print the number '720':
say ExternalMultiply(2, 3, 4, 5, 6)
Complete source for the example, a test driver, and a build script are provided at the end of this blog entry.
I’m a Linux user and don’t really have a lot of interest in programming under Windows. The code I have written should work under Windows (famous last words) but there will definitely need to be some changes to the build script. There may also be minor tweaks needed here or there in both the C implementation and the Rexx driver. If you use Windows you’ll have to figure these out for yourself.
There are some concepts you’ll have to get used to first before extending Rexx.
Everything in Rexx is a string. Behind the scenes things may not be so simple-mindedly implemented, but at any public level of visibility Rexx values are all strings. This implies that all of your functions will have a set of strings as parameters.
Rexx is not based on the assumption that C is the lingua franca of computing. Many languages have assumptions that are thinly veiled C assumptions. Numbers are based on C numerical types (usually long for integers and double for floating point, for example). Strings are NUL-terminated arrays of byte-sized characters. Rexx is not based on these assumptions since it, as a language, predates the C-as-lingua-franca era. As a result you will need to understand Rexx’s exposed data types, and you will spend a lot of time converting back and forth between them and C’s.
There is a lot of boilerplate in making FFI for most languages. Rexx is no exception. Here are some of the things you’ll have to do in most function-exposing modules.
rexxsaa.h
All of the Regina API is specified in a single included header: rexxsaa.h. Unusually for such a system, it is not enough to merely include the header. You have to activate specific subsystems before including it. This is done by defining the relevant symbols with #define before inclusion:
#define INCL_RXFUNC
#include <rexxsaa.h>
Here we activate the external function subsystem. INCL_RXSHV would be used instead (or in addition) to include access to Rexx’s variable pool.
When you include rexxsaa.h with the relevant symbols defined you are given access to the prototypes, data types, and symbols for the subsystem interface you wish to use. In our case that is the external function interface API.
Of course you’ll have to declare the function that’s being exported. The plain version of this looks something like:
APIRET APIENTRY ExternalMultiply(PCSZ, ULONG, PRXSTRING, PCSZ, PRXSTRING);
That’s quite a mouthful, however, and would get tedious to type for each and every exported function. Thankfully the Regina API gives you a nice typedef for it:
RexxFunctionHandler ExternalMultiply;
If you must you can go ahead and use the repetitive, long-winded version, but I strongly recommend using RexxFunctionHandler instead.
Of course the types involved have to be known. APIENTRY is something you place there "just because". (It has to do with linkage types on OS/2 and Windows. Putting it in the signature makes sure that your code will work in OS/2 and Windows environments.) APIRET is an alias for ULONG. ULONG is an alias for unsigned long. PCSZ is a typedef for a pointer to a C-style (NUL-terminated) string. PRXSTRING is a pointer to an RXSTRING.
RXSTRING is the kicker. That’s the representation Regina exposes for its internal string values. It is defined as:
typedef struct {
    unsigned char *strptr;
    unsigned long strlength;
} RXSTRING;
strptr points to the string contents: an array of unsigned bytes.
strlength contains the length of the string contents.
strptr can contain any 8-bit values, including NUL. It is emphatically not a C-style string! strlength contains the length of the content pointed at by strptr, not the size of the buffer. This will be an issue later.
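To make the distinction concrete, here is a minimal sketch. It uses a hypothetical stand-in struct, demo_rxstring, laid out like the definition above so that it compiles without rexxsaa.h. The helper has_embedded_nul() and the sample data demo_bytes are my own inventions for illustration, not part of the Regina API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in mirroring the RXSTRING layout shown above,
   so that this sketch builds without rexxsaa.h. */
typedef struct {
    unsigned char *strptr;
    unsigned long strlength;
} demo_rxstring;

/* Returns 1 if the content holds a NUL byte anywhere within its
   counted length, 0 otherwise. */
static int has_embedded_nul(demo_rxstring s)
{
    return memchr(s.strptr, '\0', s.strlength) != NULL;
}

/* Five bytes of content: 'a', 'b', NUL, 'c', 'd'. The sixth byte is a
   terminator that is NOT part of the content. */
static unsigned char demo_bytes[6] = { 'a', 'b', '\0', 'c', 'd', '\0' };
```

Wrapping demo_bytes as `demo_rxstring s = { demo_bytes, 5 };` gives a value whose strlength is 5, while strlen() on its strptr stops at the first NUL and reports 2. Any code that trusts strlen() here silently loses the tail of the string.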
There are a number of helper macros provided to help work with RXSTRINGs. They are briefly glanced over here; consult the Regina documentation for full details.
In addition to the above, there are also two values which need defining (for readability):
#define RX_OK 0
#define RX_ERROR 1
The Regina APIs all want a return value of 0 for "worked fine" and a non-zero return value for "failed somehow". Note that this is not the value that the function returns to the script! It is the status the function returns to the interpreter to tell it whether the call succeeded. When you return non-zero, Rexx’s condition mechanism leaps into action, signalling or calling error handlers as appropriate.
Of course in any non-trivial function package you’ll have to declare helper functions and other such things. This is a trivial function package, but for show here are two helpers:
static long rexx_to_long(RXSTRING);
static void long_long_to_rexx(long long, PRXSTRING);
rexx_to_long() converts an RXSTRING into a long.
long_long_to_rexx() converts a long long into an RXSTRING.
These two helpers are probably overkill for a module as trivial as ours, but they show one thing: you will need a bunch of conversion functions whenever making a Rexx extension. (Indeed in a serious module you’ll probably want to build a library of them at need, specifically for bringing into other projects as you make them. Or you can just steal one of the ones from any of the existing foreign function extensions already provided.)
Now that we’ve declared everything of interest, we need to implement the functionality. Let’s start by looking at the helper functions.
static long rexx_to_long(RXSTRING rexxval)
{
    return strtol(RXSTRPTR(rexxval), NULL, 10);
}

static void long_long_to_rexx(long long val, PRXSTRING rexxval)
{
    sprintf(RXSTRPTR(*rexxval), "%lld", val);
    rexxval->strlength = strlen(RXSTRPTR(*rexxval));
}
FLAW in rexx_to_long()! Better code would explicitly point to the end of the string!
FLAW in long_long_to_rexx()! Better code would allocate a local buffer instead of using whatever was passed to it!
The first thing that the observant reader will spot is that this code is not very safe! This is because it is trying to illustrate the concepts of the API without burying them underneath a pile of security boilerplate. Serious code would make use of proper techniques including properly framing the strtol() call with an end pointer, checking for errors, and generally not being a one-liner. (This is why I recommended building up a conversion library earlier; there’s a lot of potential boilerplate overhead that you’re not going to want to type repeatedly.)
That being said, Regina in particular will pass, for numbers, C-style strings in the strptr member, so use of C-style string manipulation functions is fine for demonstration purposes.
The second flaw is more debatable. In the context of Regina it is not a flaw at all: Regina allocates a 256-character string for return values, and this is documented and fixed. The open question is whether all interpreters do this. If portability across interpreters is not your concern, then using the presupplied buffer is fine.
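For the first flaw, a stricter conversion might look something like the following sketch. The name parse_long_strict() and the 64-byte scratch buffer are my own assumptions, and the function takes a raw pointer and length so that it compiles without rexxsaa.h. In a real module you would presumably feed it the pointer and length of the incoming RXSTRING.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Parses exactly len bytes at ptr as a base-10 long.
   Returns 0 and stores the value in *out on success. Returns -1 if the
   content is empty, too long for the scratch buffer, out of range for a
   long, or followed by anything that is not part of the number.
   (Leading whitespace is accepted, because strtol() skips it.) */
static int parse_long_strict(const char *ptr, unsigned long len, long *out)
{
    char buf[64];   /* local copy, so we control the NUL terminator */
    char *end;
    long val;

    if (ptr == NULL || len == 0 || len >= sizeof buf)
        return -1;
    memcpy(buf, ptr, len);
    buf[len] = '\0';

    errno = 0;
    val = strtol(buf, &end, 10);
    if (errno == ERANGE)      /* value did not fit into a long */
        return -1;
    if (end == buf)           /* no digits were consumed */
        return -1;
    if (end != buf + len)     /* trailing junk or an embedded NUL */
        return -1;

    *out = val;
    return 0;
}
```

Copying into a local buffer costs a few bytes of stack but means the function never reads past the counted length, whatever the interpreter put after it.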
Of course it goes without saying (but I will say it anyway) that you’ll need to include the appropriate C library headers for any of this to work:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
APIRET APIENTRY ExternalMultiply(PCSZ name, ULONG argc, PRXSTRING argv,
                                 PCSZ queuename, PRXSTRING returnstring)
{
    long long product = 1;
    long i;

    for (i = 0; i < argc; i++)
    {
        product *= rexx_to_long(argv[i]);
    }
    long_long_to_rexx(product, returnstring);
    return RX_OK;
}
This is the function that Rexx scripts will call directly. It is quite straightforward, but has some unexpected issues. Let’s look at the parameters passed in one at a time:
PCSZ name
This is a C-style string containing the name of the function being called. Why is this needed? Because it’s possible to have a single registered function entry point that implements several related functions. (This is not something I’d personally recommend, but it’s something you can do!) We ignore this in our code.
ULONG argc
Your function will have argc RXSTRINGs as arguments.
PRXSTRING argv
An array of argc RXSTRINGs. Your arguments, in short.
PCSZ queuename
A C-style string containing the name of the current data queue. This is out of scope for this tutorial (and out of scope for most code!).
PRXSTRING returnstring
This points to a single RXSTRING which is used to return a value to the caller. The special variable RESULT will be set to this value on return. By default Regina supplies a ready-made 256-byte buffer in this pointer that you can use to set up your return value. This may not be portable.
The rest of the code is straightforward. product is initialized to 1. Each passed-in argument is converted into a long and multiplied in place with product. When the arguments have all been processed, product is converted into the RXSTRING pointed at by returnstring. RX_OK (0) is then returned to signal that everything is hunky dory.
On the Rexx side, the interpreter must first be informed of the existence of the exported function and its location. This is done in this chunk of code:
if RxFuncAdd('ExternalMultiply', 'external', 'ExternalMultiply') <> 0 then
do
  say RxFuncErrMsg()
  exit E_ERROR
end
The key code is the call to RxFuncAdd(). It maps the first argument (internal name of the function) to the third argument (external name of the function) in the library named by the second argument.
In this case we’re calling the function ExternalMultiply in Rexx (although it will be callable as externalMultiply or even ExTeRnAlMuLtIpLy, since Rexx is case insensitive).
The external name is ExternalMultiply, the name you exported the function by.
The library name is 'external' which, in a Linux environment, will have 'lib' prepended and '.so' appended, so in this case it will be looking for 'libexternal.so'. Again this will change by environment.
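Purely to illustrate the Linux naming convention just described, here is a small sketch. The function linux_library_filename() is hypothetical and says nothing about how Regina performs the lookup internally.

```c
#include <stdio.h>

/* Builds "lib<name>.so" into buf, which holds cap bytes.
   Returns 0 on success, -1 if the result would not fit.
   Illustrates the Linux naming convention only. */
static int linux_library_filename(const char *name, char *buf, size_t cap)
{
    int n = snprintf(buf, cap, "lib%s.so", name);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}
```

Given the name 'external' and a large enough buffer, this produces 'libexternal.so', which is the file name the build script at the end of this entry creates.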
Rexx itself is a very portable language, but it is quite natural that when interfacing with the outside world through an FFI there will be platform differences. It is strongly advised that you be familiar with all of your target platforms' development tools and their quirks if making code for multiple platforms.
After all of this, calling the function is an anticlimax. ExternalMultiply is now used just like any Rexx BIF:
say ExternalMultiply(2, 3, 4, 5, 6)
product = 1
do i = 100 to 1000 by 145
  product = ExternalMultiply(product, i)
end
say product
Of course there will be some issues relating to C type limitations. Rexx has arbitrary-precision arithmetic that doesn’t wrap. Most C implementations have a 64-bit long long, and overflowing a signed type in C is undefined behaviour that in practice usually shows up as a wrapped value. As a result this particular code will not be seamless.
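One way to at least notice the problem is to test each multiplication before performing it. The following is a sketch of the usual division-based check. The name checked_mul() is my own, and a real extension would still have to decide what to hand back to Rexx when the check fails.

```c
#include <limits.h>

/* Multiplies a by b without ever overflowing.
   Returns 0 and stores the product in *out if it fits into a long long,
   or returns -1 (leaving *out untouched) if it would overflow. */
static int checked_mul(long long a, long long b, long long *out)
{
    if (a > 0) {
        if (b > 0) {
            if (a > LLONG_MAX / b) return -1;   /* positive * positive */
        } else {
            if (b < LLONG_MIN / a) return -1;   /* positive * non-positive */
        }
    } else {
        if (b > 0) {
            if (a < LLONG_MIN / b) return -1;   /* non-positive * positive */
        } else {
            /* non-positive * non-positive; a == 0 can never overflow */
            if (a != 0 && b < LLONG_MAX / a) return -1;
        }
    }
    *out = a * b;
    return 0;
}
```

Every comparison divides rather than multiplies, so the check itself stays inside the range of long long.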
Of course this function is trivial, not particularly well-matched to Rexx, and not very safe. Using it will not give the programmer the feeling that they’re using something intended for Rexx. Here are some improvements that could be made.
Use secure code. The endptr argument to strtol() should be used instead of assuming that the number passed by Regina will be NUL-terminated. Allocate a local buffer for returnstring and use that instead of the Regina-provided one. (Don’t worry: you won’t leak. If you change the returnstring->strptr member, Regina will deallocate it for you when finished using it.)
Rexx numbers are arbitrary precision decimal representations. (Indeed they are the inspiration and much of the design behind the more recent IEEE 754 decimal format!) They are not like C’s float or double types and they are not like C’s integer forms, unsigned or otherwise. Using something like this decimal representation package instead of C’s native types is probably a smart idea.
(Actually ISO/IEC TS 18661-2 enhances ISO C with decimal floating point support. Catch up!)
The implementation of ExternalMultiply() doesn’t check any of the input for validity. Nor do its helper functions. There’s no check for 0 values, so there is no short-circuit return of 0 when one is seen. There’s no check that the answer will not overflow the returnstring buffer (although with a long long that is not a meaningful risk).
Proper code will check all of this. Do that when writing real code.
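As one example of such a check, here is a sketch of a formatting helper that refuses to write past the end of whatever buffer it is given. The name format_long_long() and its signature are assumptions of mine; it works on a plain character buffer so that it compiles without rexxsaa.h.

```c
#include <stdio.h>

/* Formats val as decimal text into buf, which holds cap bytes, and
   stores the text length (excluding the NUL) in *len.
   Returns 0 on success, -1 if the text would not fit. */
static int format_long_long(long long val, char *buf, size_t cap,
                            unsigned long *len)
{
    int n = snprintf(buf, cap, "%lld", val);

    if (n < 0 || (size_t)n >= cap)   /* encoding error or truncation */
        return -1;
    *len = (unsigned long)n;
    return 0;
}
```

With the 256-byte buffer Regina supplies the failure branch should never trigger for a long long, but the check costs almost nothing and keeps working if the buffer size ever differs.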
The following blocks contain the full source code for the external function implementation, the test driver, as well as a simple build script usable in a Linux environment. They should serve as a good basis for making a proper, useful Rexx extension library.
/* external.c */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define INCL_RXFUNC
#include <rexxsaa.h>
/* helper function declarations */
static long rexx_to_long(RXSTRING);
static void long_long_to_rexx(long long, PRXSTRING);
/* external API function declarations */
RexxFunctionHandler ExternalMultiply;
/* symbolic return values */
#define RX_OK 0
#define RX_ERROR 1
/* external API functions */
APIRET APIENTRY ExternalMultiply(PCSZ name, ULONG argc, PRXSTRING argv,
                                 PCSZ queuename, PRXSTRING returnstring)
{
    long long product = 1;
    long i;

    for (i = 0; i < argc; i++)
    {
        product *= rexx_to_long(argv[i]);
    }
    long_long_to_rexx(product, returnstring);
    return RX_OK;
}
/* helper functions */
static long rexx_to_long(RXSTRING rexxval)
{
    return strtol(RXSTRPTR(rexxval), NULL, 10);
}

static void long_long_to_rexx(long long val, PRXSTRING rexxval)
{
    sprintf(RXSTRPTR(*rexxval), "%lld", val);
    rexxval->strlength = strlen(RXSTRPTR(*rexxval));
}
/* test-external.rx */
E_OK = 0
E_SYNTAX = 1
E_ERROR = 2
E_FAILURE = 3
E_HALT = 4
E_NOTREADY = 5
E_NOVALUE = 6
E_LOSTDIGITS = 7
E_UNKNOWN = 255
signal on syntax name error
signal on error name error
signal on failure name error
signal on halt name error
signal on notready name error
signal on novalue name error
signal on lostdigits name error
if RxFuncAdd('ExternalMultiply', 'external', 'ExternalMultiply') <> 0 then
do
  say RxFuncErrMsg()
  exit E_ERROR
end
say 'ExternalMultiply(2, 3, 4, 5, 6) returned' ExternalMultiply(2, 3, 4, 5, 6)
exit E_OK
error:
  type = condition('C')
  if condition('I') = 'SIGNAL' then
    say 'Error' type || '(' || rc || ') signalled on line' sigl || '.'
  else
    say 'Error' type || '(' || rc || ') called on line' sigl || '.'
  say 'Description:' condition('D')
  select
    when type = 'SYNTAX' then
      code = E_SYNTAX
    when type = 'ERROR' then
      code = E_ERROR
    when type = 'FAILURE' then
      code = E_FAILURE
    when type = 'HALT' then
      code = E_HALT
    when type = 'NOTREADY' then
      code = E_NOTREADY
    when type = 'NOVALUE' then
      code = E_NOVALUE
    when type = 'LOSTDIGITS' then
      code = E_LOSTDIGITS
    otherwise
      code = E_UNKNOWN
  end
  exit code
/* build-external.rx */
'gcc -shared -fpic -o libexternal.so external.c'