Pliant->C Library Wrapping Tutorial |
Foreword |
I'd like to invite any reader to comment on this tutorial or ask questions about the topic. This is my first document of this kind, so I expect there to be holes in my coverage. I hope to fill them over time, based on my own experience and feedback from others. Note that this tutorial is mainly oriented toward Linux/*nix users; I'm not sure how much help it will be to a Windows user.
Introduction |
As a long-time non-C hacker, I've found that a language's ability to interface with C is crucial to its long-term usefulness, particularly on my preferred platform, Linux. I'm sure the same holds for just about any modern platform, C (or C++) being the standard systems-level programming language.
One of the many things that attracted me to Pliant is the ease with which you can create interfaces to external C libraries. It can hardly even be compared with a foreign language interface, as it's almost as integrated with C as C++/ObjC are. There are a few catches, but it's really the best I've seen in a non-C language.
Anyways... this little tutorial is meant to help you use this aspect of Pliant; it is not a general introduction to the language. To date I've only partially wrapped 3 C libraries: the GNU readline library, the ncurses-5 library and the slang library. I'll be using the first 2 as examples as I proceed, and you can download them from the resources section.
Basic Concepts |
Thus far I've encountered 3 basic types of C entities that need to be considered when wrapping a library: functions, global variables and structures. I'll first go over the basics of dealing with each of these.
Functions |
Wrapping a function is very simple. It's just a matter of using the external function attribute as described on the 'Defining a new function' page of the documentation. Here's the basic syntax:

    function c_function_call argument -> result
      arg Type argument
      arg Type result
      external "c_library.so" "c_function"

The argument and result types can each be either a Pliant type or a wrapped C type. The c_library.so library may be any library in the ld cache (on Linux). The c_function is a standard function exported by that library. Finally, you use the wrapped function in Pliant just like a standard function.
The only complication is the types. For strings and characters, special Pliant types must be used: for strings, which correspond to 'char *' arguments in C, use the CStr type; for characters, 'char' in C, use the CChr type. This is due to differences between the native Pliant types and their C counterparts. Casting is done auto-magically (see implicit casting below).
Global Variables |
Global variables are handled very similarly to C functions, using the external attribute. Again, here's the basic syntax:

    var Type var_name
      external "c_library.so" "c_global_variable"

The var Type var_name part should seem familiar if you've used Pliant at all. The extra external line is taken from the syntax of functions, and works in the same way. The only thing to watch for is that the types match up. The rules about character and string types mentioned above apply here as well.
Structures |
The trickiest of these 3 are C structures. These can either be pretty straightforward or a bit of a pain (comparatively, anyway). It depends on whether the C compiler has done any alignment optimizations on the structure or not. It's hard to give much in the way of generalities for wrapping C structs, but here's what they tend to look like:

    type c_struct
      packed
      field Type first
      field Type second

The packed keyword is the important difference between a wrapper and a normal type declaration. It keeps Pliant's compiler from aligning the memory, so you can make sure the type stays matched to the memory layout of the wrapped structure. See the ncurses wrapper below for a fuller example and additional details. Particularly important is the part dealing with memory alignment problems.
Readline Wrapper |
After briefly messing around with the interpreter, I decided that the first library I had to wrap was the GNU Readline library. I'm sure any other CLI lovers will empathize. So, I needed to replace the interpreter's current CLI with a readline-provided one. This turned out to be really simple and straightforward. It's one of the things that originally got me hooked. Here is everything that is required to wrap the basic prompt/command line provided by the readline lib.
    module "/pliant/language/unsafe.pli"

    function readline prompt -> line
      arg CStr prompt line
      external "libreadline.so" "readline"

The unsafe module is needed for the automatic CStr casting. Otherwise, this is what most external function wrappers will look like. To use the history features of the readline lib, the add_history function also needs to be wrapped.
    function add_history line
      arg CStr line
      external "libreadline.so" "add_history"

Using these 2 functions, most of readline's basic functionality can be utilized. Here is the final function, which uses both of the above to provide the new CLI functionality.
    function rl_get prompt -> line
      arg Str prompt line
      var CStr ret
      ret := readline:prompt
      if ret:characters = null
        line := "[0]"
      else
        line := ret
      if not line = "" and not line = "[0]"
        add_history:line

As you can see, the two wrapped functions work just as if they were Pliant functions. The '[0]' is returned in place of a null, as the interpreter expects a normal Pliant string, and Pliant Str types can't hold nulls except in their escaped form, i.e. '[0]'.
Ncurses-5 Wrapper |
The Ncurses wrapper is more complicated than the readline wrapper. Not only does it have many more functions to wrap, it also has global variables and a C structure to wrap.
Function Wrapping |
Wrapping a function for Ncurses is much the same as for the readline library; there are just a lot more of them. So there isn't much new to go into in this section. Ncurses does have one twist compared to readline: quite a few of its 'functions' are actually macros. In C, these are expanded into the calling program at compile time, so they don't exist as symbols in the library itself. That doesn't help us, since we're accessing the library directly.
To deal with macros, you need to port them over manually. This is really pretty simple, as macros are defined in the header file and are usually constructed from the other functions exported from the library or an extremely simple bit of code. Each is handled in basically the same way. You simply re-implement them in Pliant. The only difference is whether you use another function exported by the library or not.
Here's an example of a reimplemented macro from the Ncurses wrapper. First, the C:
    #define touchwin(win) wtouchln((win), 0, getmaxy(win), 1)

And the Pliant version:

    function touchwin w
      arg Address w
      wtouchln w 0 (w:_maxy + 1) 1

The touchwin function in ncurses marks a window so that the next refresh redraws it. It is implemented in C via a macro that calls the wtouchln function (which marks a range of 1 or more lines for redraw) on the contents of the target window. The Pliant version does the same. The only difference is that instead of calling getmaxy (which is yet another macro), it reads the window structure directly to get the number of lines in the window (i.e. _maxy + 1).
Global Variable Wrapping |
Global variables are easy to wrap and there's not much to complicate things. So let's get right to an example from Ncurses:
    var Address stdscr
      external curses "stdscr"
In Ncurses stdscr is the address of the default window created when you initialize the display. Pretty simple... eh.
C Structure Wrapping |
The most difficult part of writing my Ncurses wrapper was dealing with the Ncurses window structure. It probably wouldn't have been as hard for someone with more C experience than I have, but with help from the forum I figured it out.
There are several steps I went through to get a working wrapper that I was happy with, and this section will reflect my development path. To start we need to know what we're dealing with.
The C structure |
    struct _win_st
    {
        short _cury, _curx;     /* current cursor position */

        /* window location and size */
        short _maxy, _maxx;     /* maximums of x and y, NOT window size */
        short _begy, _begx;     /* screen coords of upper-left-hand corner */

        short _flags;           /* window state flags */

        /* attribute tracking */
        attr_t _attrs;          /* current attribute for non-space character */
        chtype _bkgd;           /* current background char/attribute pair */

        /* option values set by user */
        bool _notimeout;        /* no time out on function-key entry? */
        bool _clear;            /* consider all data in the window invalid? */
        bool _leaveok;          /* OK to not reset cursor on exit? */
        bool _scroll;           /* OK to scroll this window? */
        bool _idlok;            /* OK to use insert/delete line? */
        bool _idcok;            /* OK to use insert/delete char? */
        bool _immed;            /* window in immed mode? (not yet used) */
        bool _sync;             /* window in sync mode? */
        bool _use_keypad;       /* process function keys into KEY_ symbols? */
        int _delay;             /* 0 = nodelay, <0 = blocking, >0 = delay */

        struct ldat *_line;     /* the actual line data */

        /* global screen state */
        short _regtop;          /* top line of scrolling region */
        short _regbottom;       /* bottom line of scrolling region */

        /* these are used only if this is a sub-window */
        int _parx;              /* x coordinate of this window in parent */
        int _pary;              /* y coordinate of this window in parent */
        WINDOW *_parent;        /* pointer to parent if a sub-window */

        /* these are used only if this is a pad */
        struct pdat
        {
            short _pad_y, _pad_x;
            short _pad_top, _pad_left;
            short _pad_bottom, _pad_right;
        } _pad;

        short _yoffset;         /* real begy is _begy + _yoffset */
    };
I won't get into the gory details of the above, but there are a few items that need more explanation for what comes next to make sense. First, note that the bool type used above is an unsigned char and the attr_t type is an unsigned long. Second, there are two other structures used in this struct, namely ldat and pdat. The latter is already defined inline above, so here's the C for the former:
    struct ldat
    {
        chtype *text;       /* text of the line */
        short firstchar;    /* first changed character in the line */
        short lastchar;     /* last changed character in the line */
        short oldindex;     /* index of the line at last update */
    };
chtype is an unsigned long int.
As you will see below, I don't actually wrap this struct. Ncurses provides functions to deal with this struct via the window. So all that will be needed is to store the address in the window struct.
The Pliant version |
First, let's wrap that pdat struct:

    type PDat
      packed
      field Int16 _pad_y _pad_x
      field Int16 _pad_top _pad_left
      field Int16 _pad_bottom _pad_right

This creates the new type in Pliant, PDat. It looks pretty much just like the example, and there are no real surprises here. Notice here and below that the choice of which Pliant type to use in the wrapper is based primarily on the size of the type it's mapping. That's why you'll see the bool type, an unsigned char, in the C struct replaced with uInt8, an 8-bit integer (the same size as the wrapped char type).
    public
      type Window
        packed
        field Int16 _cury _curx     # current cursor position
        # window location and size
        field Int16 _maxy _maxx     # maximums of x and y, NOT window size
        field Int16 _begy _begx     # screen coords of upper-left-hand corner
        field Int16 _flags          # window state flags
        # padding to correct offset
        field (Array Byte 2) padding1
        # attribute tracking
        field uInt32 _attrs         # current attribute for non-space character
        field uInt32 _bkgd          # current background char/attribute pair
        # option values set by user
        field uInt8 _notimeout      # no time out on function-key entry?
        field uInt8 _clear          # consider all data in the window invalid?
        field uInt8 _leaveok        # OK to not reset cursor on exit?
        field uInt8 _scroll         # OK to scroll this window?
        field uInt8 _idlok          # OK to use insert/delete line?
        field uInt8 _idcok          # OK to use insert/delete char?
        field uInt8 _immed          # window in immed mode? (not yet used)
        field uInt8 _sync           # window in sync mode?
        field uInt8 _use_keypad     # process function keys into KEY_ symbols?
        # padding to correct offset
        field (Array Byte 3) padding2
        field Int _delay            # 0 = nodelay, <0 = blocking, >0 = delay
        field Address _line         # the actual line data
        # global screen state
        field Int16 _regtop         # top line of scrolling region
        field Int16 _regbottom      # bottom line of scrolling region
        # these are used only if this is a sub-window
        field Int _parx             # x coordinate of this window in parent
        field Int _pary             # y coordinate of this window in parent
        field Address _parent       # pointer to parent if a sub-window
        # these are used only if this is a pad
        field PDat _pad
        field Int16 _yoffset        # real begy is _begy + _yoffset
        # padding to correct offset - not really needed, as it's at the end
        field (Array Byte 2) padding3
Most of this is pretty straightforward, except for the padding fields (padding1, padding2 and padding3). These are present because the C compiler aligns the memory of the structure as an optimization trick. How you figure out whether you need these, and how much padding to add, is covered next.
Memory Alignment with C structures |
There are 2 related problems to solve if the struct you're wrapping gets aligned by the C compiler. The first is detecting that this has happened; the second is deciding what to do about it.
If you think you've got your wrapper right, yet it still won't map onto the C struct it's supposed to be wrapping (basically if all else fails), it's probably due to alignment issues. To be sure, compare the sizes of the C struct vs. the Pliant type.
In Pliant this is a snap, as all types have the built-in method size which, oddly enough, returns the memory size of the type. So, to check the size of the above Pliant Window type, in the file that contains its source I'd just add:
console "Window size: " (Window size) eol
In case you're curious, the correct size is 76 bytes.
To determine the size of the struct in C, you need to whip up a little C program. Here is the program I wrote to find out the size of the WINDOW struct Ncurses defines:
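    #include <curses.h>
    #include <stddef.h>
    #include <stdio.h>   /* for printf */

    int main(void)
    {
        WINDOW w;
        /* sizeof yields a size_t, so print it with %zu */
        printf("WINDOW size: %zu\n", sizeof(w));
        return 0;
    }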
After running this, you can see for sure whether the memory sizes of the Pliant type and the C struct are the same. If the Pliant type is larger than the C struct, check the field types you used in the Pliant type (e.g. a uInt where a uInt16 was needed). If the C struct is the larger of the two, then you probably need to add padding to the Pliant type. How much and where? To find out, you need the offset of each field in the struct. Here's the abridged version of the program I used for this:
    #include <curses.h>
    #include <stddef.h>  /* for offsetof */
    #include <stdio.h>   /* for printf */

    int main(void)
    {
        /* offsetof yields a size_t, so print it with %zu */
        printf("cury offset: %zu\n", offsetof(struct _win_st, _cury));
        printf("curx offset: %zu\n", offsetof(struct _win_st, _curx));
        /* [... cut-n-paste code snipped ...] */
        printf("pad offset: %zu\n", offsetof(struct _win_st, _pad));
        printf("yoffset offset: %zu\n", offsetof(struct _win_st, _yoffset));
        return 0;
    }
Once you have this information, walk through the fields in order: each field's offset should equal the previous field's offset plus the previous field's size. Wherever the offset jumps by more than that, you've found a place where padding needs to be added, and the size of the jump tells you how many bytes. I use arrays of bytes to explicitly allocate these chunks (Hubert suggested this).
The Magic of Implicit Casting |
Now that we have our wrapper, there's one last detail to make it easy to deal with...
In Ncurses, all the functions and variables deal with the window as an Address: they expect windows to be passed in as addresses, and they return addresses. But you also need the information contained in the struct (wrapped via the Pliant type) for things like the macro replacement above. It is possible to cast manually each time, but that is a pain. Instead, you can have the conversion happen automatically whenever necessary via implicit casting.
Setting up implicit casting for automatically converting Address types to Window types and vice versa requires 2 separate casting functions.
First, casting from a Window to an Address:
    function 'cast Address' w -> a
      arg Window w ; arg Address a
      implicit
      a := addressof:w

    export 'cast Address'
Now, the other way:
    function 'cast Window' a -> w
      arg Window w ; arg Address a
      implicit
      w := (a map Window)

    export 'cast Window'
For more examples of implicit casting, see cchar.pli and cstr.pli in the Pliant distribution.
Questions? |
Some parts of this unclear? Have an idea for an improvement?
Resources |
I want to do a little more work on the Ncurses wrapper before releasing it. But here's a link to the readline wrapper, such as it is.