nntrac

No-nonsense TRAC T-64 reimplementation under 1000 SLOC of ANSI C
git clone git://git.luxferre.top/nntrac.git

# nntrac: no-nonsense TRAC implementation in ANSI C

The nntrac language (all-lowercase) is a portable, lightweight (under 1000 SLOC of ANSI C) and embeddable derivative of the TRAC T-64 language originally designed by Calvin N. Mooers in the 1960s. Unlike the original, nntrac was written from scratch with modern systems in mind and adds several useful features for interacting with current operating environments.

## Building

Just invoke your C compiler like this (replacing `cc` with your compiler's command and adjusting the flags if necessary):

```
cc [-static] -std=c99 -Os -s nntrac.c -o nntrac [-DNNT_EMBED] [-DNNT_NO_SHELLEXT] [-DNNT_SHARP="#"] [-DNNT_SYMNAMELEN=32]
```

Compilation was tested with GCC, Clang, Zig cc, TCC (dynamic linking only), Cproc and chibicc.

Supported compiler flags (all optional):

- `-DNNT_EMBED`: build nntrac without the `main` function (entry point). See the "Embedding nntrac into your projects" section for more information on embedded usage.
- `-DNNT_NO_SHELLEXT`: disable the `os` primitive. See the "New primitives" section for more information.
- `-DNNT_SHARP`: change the `#` character (start of an active or neutral function call) to something else.
- `-DNNT_SYMNAMELEN`: maximum length of TRAC form or primitive function **names** (default 32).

## Usage

The nntrac binary can be run interactively (with `nntrac` or `echo 'script' | nntrac`) or in script invocation mode (`./nntrac script.trac param1 param2 ...`). In the first case, the default "idling program" `#(ps,#(rs))` is run, and the interpreter exits as soon as all input up to the first meta character (`'` by default) has been read and evaluated. In the second case, the script file is read and executed directly, with the command-line parameters made available through special forms (see below).

### Main differences from the T-64 standard

- the `ps` (print string) primitive can accept multiple arguments (concatenating them on the output);
- default call results can be neutral and don't differ from explicit call results in any way;
- up to 255 segment ordinals are supported (from 1 to 255);
- extended operation mode (`#(mo,E)`) is the default one;
- arithmetic primitives operate on 63-bit signed integers (if the target architecture allows; otherwise on 31-bit signed integers) and accept numbers in any base supported by C's `strtoll` function in base autodetection mode (i.e. decimal, octal and hexadecimal), but they don't support prepending string prefixes to the output and always return the result in base 10;
- in arithmetic primitives, the overflow argument is fully optional (if it is omitted, a null value is returned when an overflow occurs);
- all bitwise primitive results are also returned in base 10 and truncated to unsigned 32-bit values; `br` bit rotations are likewise done on a 32-bit width;
- form storage primitives (`fb`, `sb`, `eb`) directly accept a filename instead of a form name with the address (like in Nat Kuhn's Trac-in-Python);
- the trace mode (`tn`, `tf`) is fully non-interactive and just prints every function call to stderr; calls to the `tn`, `tf` and `hl` primitives themselves are not traced;
- other diagnostic functions (`ln`, `pf`) output to stdout;
- 9 new primitives have been added (see below).

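
To make the numeric rules above more concrete, here is a minimal C sketch of base-autodetecting input and 32-bit rotation. It is not taken from `nntrac.c`: the helper names `demo_parse_num` and `demo_rot32` are hypothetical, rejecting malformed input is an assumption, and so is the convention that a positive count rotates left.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse a numeric argument with strtoll in base
   autodetection mode (0x.. is hex, a leading 0 is octal, otherwise decimal).
   Returns 0 on success, -1 on unparsable input or overflow (assumption). */
int demo_parse_num(const char *s, long long *out) {
  char *end;
  errno = 0;
  long long v = strtoll(s, &end, 0); /* base 0 = autodetect */
  if (end == s || errno == ERANGE) return -1;
  *out = v;
  return 0;
}

/* Hypothetical helper: rotate on a 32-bit width. A positive count rotates
   left and a negative one rotates right (assumed convention). */
uint32_t demo_rot32(uint32_t v, int n) {
  unsigned k = (unsigned)(((n % 32) + 32) % 32); /* normalise to 0..31 */
  return k ? (uint32_t)((v << k) | (v >> (32 - k))) : v;
}
```

With these helpers, `0x1F`, `037` and `31` all parse to the same value, and rotating `0x80000001` left by one bit wraps the top bit around to give `3`.
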
### Original T-64 primitives implementation status

Name|Args|Implemented?|UTF-8 safe?|Meaning
----|----|------------|-----------|-------
`rs`|1   |yes         |yes        |Read string
`rc`|1   |yes         |no         |Read char
`cm`|2   |yes         |no         |Change meta
`ps`|var |yes         |yes        |Print string
`ds`|3   |yes         |yes        |Define string
`dd`|var |yes         |yes        |Delete definition
`da`|1   |yes         |yes        |Delete all
`ss`|var |yes         |yes        |Segment string
`cl`|var |yes         |yes        |Call string
`cs`|3   |yes         |yes        |Call segment
`cc`|3   |yes         |no         |Call character
`cn`|4   |yes         |no         |Call N characters
`cr`|2   |yes         |yes        |Call [pointer] restore
`in`|4   |yes         |yes        |Initial
`eq`|5   |yes         |yes        |String equality
`gr`|5   |yes         |yes        |Greater than
`ad`|3, 4|yes         |yes        |Add
`su`|3, 4|yes         |yes        |Subtract
`ml`|3, 4|yes         |yes        |Multiply
`dv`|3, 4|yes         |yes        |Divide
`bu`|3   |yes         |yes        |Bitwise union (OR)
`bi`|3   |yes         |yes        |Bitwise intersection (AND)
`bc`|2   |yes         |yes        |Bitwise complement (NOT)
`br`|3   |yes         |yes        |Bitwise rotation
`bs`|3   |yes         |yes        |Bitwise shift
`sb`|var |yes         |yes        |Store block (file): `#(sb,fname,f1,f2...)`
`fb`|2   |yes         |yes        |Fetch block (file): `#(fb,fname)`
`eb`|2   |yes         |yes        |Erase block (file): `#(eb,fname)`
`ln`|2   |yes         |yes        |List names
`pf`|2   |yes         |yes        |Print form
`tn`|1   |yes         |yes        |Trace on
`tf`|1   |yes         |yes        |Trace off
`hl`|1   |yes         |yes        |Halt
`mo`|2, 3|yes         |yes        |Mode (see below)

### New primitives

- `bx`: bitwise XOR. 3 arguments: `#(bx,A,B)`. Returns the operation result. Invocation is the same as for the `bi` or `bu` primitives.
- `ac`: ASCII code. 2 arguments: `#(ac,S)`. Returns the numeric unsigned value of the first character in `S`.
- `av`: ASCII value. 2 arguments: `#(av,N)`. Returns the single byte corresponding to the ASCII code `N` (unsigned).
- `fn`: format number. 3 arguments: `#(fn,fmt,N)`. Returns the sprintf-formatted string representation of the number `N` according to the format `fmt`.
- `sf`: store (raw) file. 3 arguments: `#(sf,fname,form)`. Stores the raw value from `form` into a named file. The value is always written in its entirety (the form pointer is ignored) and segment gap bytes, if there are any, are written "as is". Returns a null value. The form doesn't get deleted from the internal form storage. The file is fully overwritten if it already exists.
- `ff`: fetch (raw) file. 3 arguments: `#(ff,fname,form)`. Reads a raw string from the named file into `form`. Returns a null value (check the target form afterwards).
- `tm`: local/UTC/Epoch time. 2 or 3 arguments: `#(tm,fmt[,U])`. Returns the local (or UTC, if the third parameter `U` is specified) time formatted according to the strftime-compatible `fmt` string. If the format string is just `E` (Epoch), returns the number of seconds since 00:00:00 UTC, January 1, 1970.
- `rn`: random number. 3 arguments: `#(rn,n1,n2)`. Returns a (pseudo-)random integer number in the range `n1` (included) to `n2` (not included). Implemented using the double-pass xorshift64* algorithm.
- `os`: run an OS command. 2 arguments: `#(os,cmd)`. Runs a command in the external OS shell (the one determined by the `system()` C call) and returns the command exit code. The output is not captured.

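
The `tm` behaviour described above maps closely onto the standard C time functions. The following sketch shows one way it could be done; `demo_tm` is a hypothetical helper, not the actual nntrac code, and it takes the timestamp as a parameter so that the result is reproducible.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: format `t` with strftime, in UTC when `utc` is
   nonzero and in local time otherwise. If the format is exactly "E",
   write the raw Epoch seconds instead. Returns the number of bytes
   written (without the terminating NUL), or 0 on failure. */
size_t demo_tm(time_t t, const char *fmt, int utc, char *out, size_t outlen) {
  if (strcmp(fmt, "E") == 0) {
    int n = snprintf(out, outlen, "%lld", (long long)t);
    return (n > 0 && (size_t)n < outlen) ? (size_t)n : 0;
  }
  struct tm *parts = utc ? gmtime(&t) : localtime(&t);
  if (parts == NULL) return 0;
  return strftime(out, outlen, fmt, parts);
}
```

For example, `demo_tm(0, "%Y-%m-%d %H:%M:%S", 1, buf, sizeof buf)` fills `buf` with `1970-01-01 00:00:00`.
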
For self-contained nntrac environments that have no external shell (or where the shell is nntrac itself), you can disable the `os` primitive by building nntrac with the `-DNNT_NO_SHELLEXT` flag.

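
For reference, a single pass of the standard xorshift64* generator, combined with a simple half-open range mapping like the one documented for `rn`, could look as follows. This is only a sketch: the function names are hypothetical, the "double-pass" variant used by nntrac is not reproduced here, and the modulo reduction is an assumption (it has a slight bias for very large ranges).

```c
#include <stdint.h>

/* One step of the standard xorshift64* generator (shifts 12, 25, 27).
   The state must be seeded with a nonzero value. */
uint64_t demo_xorshift64s(uint64_t *state) {
  uint64_t x = *state;
  x ^= x >> 12;
  x ^= x << 25;
  x ^= x >> 27;
  *state = x;
  return x * UINT64_C(0x2545F4914F6CDD1D);
}

/* Hypothetical range mapping: returns a value in [lo, hi), i.e. lo is
   included and hi is not. Returns lo when the range is empty. */
int64_t demo_rand_range(uint64_t *state, int64_t lo, int64_t hi) {
  if (hi <= lo) return lo;
  uint64_t span = (uint64_t)hi - (uint64_t)lo; /* width of the range */
  return (int64_t)((uint64_t)lo + demo_xorshift64s(state) % span);
}
```

The generator keeps its whole state in one 64-bit word, which is part of why it suits a small interpreter: seeding is a single nonzero assignment.
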
### Modes

The nntrac processor can run in one of three modes, which can be switched with the `mo` primitive:

- `E` (extended): all primitives are available, including the custom ones (unlike in the original spec, this mode is the default);
- `L` (legacy): only the original 34 primitives from the T-64 standard are available, no (built-in) extensions are permitted;
- `S` (secure): all primitives are available except those that can interact with the filesystem and the outside operating environment (`sb`, `fb`, `eb`, `sf`, `ff`, `os`).

To lock the mode set with `#(mo,L)` or `#(mo,S)` until the end of the program, use the third `L` parameter (`#(mo,S,L)` or `#(mo,L,L)`). This way, no code inside will be able to extend nntrac's privileges back to an unsafe level.

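
The mode rules above can be summarised in a few lines of C. This is a sketch under stated assumptions rather than the interpreter's actual code: all `demo_` names are hypothetical, a locked interpreter is assumed to ignore further `mo` requests, and the lock flag is assumed to apply only to the `L` and `S` modes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical interpreter state: current mode letter and lock flag. */
typedef struct {
  char mode;  /* 'E', 'L' or 'S' */
  int locked; /* nonzero after #(mo,L,L) or #(mo,S,L) */
} demo_modestate;

/* Handle #(mo,mode[,L]). Returns 1 if the mode was changed, 0 if the
   request was ignored (unknown mode, or the mode is already locked). */
int demo_set_mode(demo_modestate *st, char mode, int lock) {
  if (st->locked) return 0;
  if (mode != 'E' && mode != 'L' && mode != 'S') return 0;
  st->mode = mode;
  if (lock && mode != 'E') st->locked = 1;
  return 1;
}

static int demo_in(const char *name, const char *const *set, size_t n) {
  for (size_t i = 0; i < n; i++)
    if (strcmp(name, set[i]) == 0) return 1;
  return 0;
}

/* Decide whether the named primitive may run in the current mode. */
int demo_allowed(const demo_modestate *st, const char *name) {
  static const char *const outside[] = {"sb", "fb", "eb", "sf", "ff", "os"};
  static const char *const added[] = {"bx", "ac", "av", "fn", "sf",
                                      "ff", "tm", "rn", "os"};
  if (st->mode == 'S')
    return !demo_in(name, outside, sizeof outside / sizeof *outside);
  if (st->mode == 'L')
    return !demo_in(name, added, sizeof added / sizeof *added);
  return 1; /* 'E': everything is available */
}
```

After `demo_set_mode(&st, 'S', 1)`, a later `demo_set_mode(&st, 'E', 0)` returns 0 and file or shell primitives stay unavailable for the rest of the session.
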
### Accessing command-line parameters from nntrac scripts

On normal script invocation (`nntrac script.trac [param1 param2 ...]`), nntrac automatically creates two forms: `nnt-argc` and `nnt-argv`. The `nnt-argc` form holds the number of command-line parameters (the script file name plus everything after it, akin to Python). The `nnt-argv` form contains the parameters themselves and **is already segmented**, so you can use the `cs` primitive for easier parameter access.

## Embedding nntrac into your projects

Besides being lightweight, nntrac is also fully embeddable. You can use it as a minimal scripting engine tailored to the specific needs of your own software.

### Invoking nntrac from other C code

Place the `nntrac.c` and `nntrac-embed.h` files inside your project. In your C source code, add `#include "nntrac-embed.h"` at the top. The following prototypes are then available to you:

- `void nnt_init()`: allocate the forms and primitives storage, register the basic primitives and prepare the engine for work. You must start every session with the `nnt_init();` call before being able to call other functions in this list.
- `void nnt_regprimitive(const char *name, void *handler)`: register your own primitive function with the nntrac interpreter. See below for details.
- `void nnt_proc(char *prog, unsigned int len)`: run a script contained in the string `prog` of length `len`. Any `#(hl)` call will exit this function.
- `void nnt_finish()`: free all forms and primitive function resources. Must be called when you no longer need the nntrac engine.

Then, you must build your project together with the `nntrac.c` file, passing the `-DNNT_EMBED` compiler flag. This flag disables the `main()` function in the nntrac source code itself.

### Extending nntrac with your own primitives

For your project's scripting needs, you might have to introduce your own primitive functions to nntrac. First, define your functions according to the following prototype: `char* handler(char *arglist, char *res, int *reslen);`. To be a valid primitive, your handler must do two things: return the `res` pointer and, if necessary, update `*reslen` (0 by default) with the actual length of the result string. Also, **to resize the `res` buffer, you must only use `realloc`!** Doing otherwise will eventually lead to segfaults or memory leaks.

E.g. the function `pr_custom` might look like this:
```
char* pr_custom(char *arglist, char *res, int *reslen) {
  /* ...some actions that update the result string... */
  *reslen = strlen(res);
  return res;
}
```
Then, in your main code, somewhere between the calls to `nnt_init` and `nnt_proc`, register the pointer to your primitive function under the name of your choice using the `nnt_regprimitive` call:
```
nnt_init();
/* ... */
nnt_regprimitive("my-custom", &pr_custom);
```
The `#(my-custom)` call then becomes available in your nntrac script code.

Now, how do we process the function arguments in our custom primitive definition? The `arglist` parameter is a string of function arguments (starting with the registered primitive name itself) delimited with a special `NNT_ADEL` character. For use with the C `strtok` function, it's more convenient to use a predefined null-terminated string with the same delimiter, called `NNT_ADEL_S`. Both `NNT_ADEL` and `NNT_ADEL_S` are defined in the `nntrac-embed.h` header, which also includes `stdlib.h` and `string.h` for your convenience.

Here's an example of how we would implement some RGB light API for nntrac, returning the status:
```
char *pr_rgbled(char *arglist, char *res, int *reslen) {
  /* the first token is the primitive name itself, so it is skipped */
  char *arg = strtok(arglist, NNT_ADEL_S), *r, *g, *b;
  r = strtok(NULL, NNT_ADEL_S); /* get the first parameter */
  g = strtok(NULL, NNT_ADEL_S); /* get the second parameter */
  b = strtok(NULL, NNT_ADEL_S); /* get the third parameter */
  if(r != NULL && g != NULL && b != NULL) { /* all read successfully */
    int val_r = atoi(r), val_g = atoi(g), val_b = atoi(b); /* convert to int */
    rgbled_set_lights(val_r, val_g, val_b); /* call your internal API */
    rgbled_get_lights(&val_r, &val_g, &val_b); /* read back the status */
    char *fmt = "R=%d, G=%d, B=%d\n"; /* set the formatting string */
    /* estimate the size and initialize the resulting buffer */
    res = realloc(res, (*reslen) = 1 + snprintf(NULL, 0, fmt, val_r, val_g, val_b));
    memset(res, 0, *reslen); /* zero it out */
    snprintf(res, *reslen, fmt, val_r, val_g, val_b); /* render */
    res = realloc(res, (*reslen) = strlen(res)); /* resize to the actual written size */
  }
  return res; /* return the result pointer */
}
```
Then you can register this primitive with `nnt_regprimitive("rgb", &pr_rgbled);` in your main C code. After that, calling `##(rgb,43,67,133)` in your script will return the string `R=43, G=67, B=133` (followed by a newline) if the API succeeds.

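
Because a handler is an ordinary C function, it can be exercised without the interpreter. The sketch below follows the contract described in this section (skip the primitive name, grow `res` only with `realloc`, report the length through `*reslen`). Both `pr_demo_sum` and the `DEMO_ADEL_S` delimiter are hypothetical: in real code you would use `NNT_ADEL_S` from `nntrac-embed.h`, whose actual value is not assumed here.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for NNT_ADEL_S so that this sketch compiles on its own.
   The "\x1f" value is an assumption made for the demo only. */
#define DEMO_ADEL_S "\x1f"

/* Hypothetical primitive: returns the decimal sum of its integer arguments. */
char *pr_demo_sum(char *arglist, char *res, int *reslen) {
  long sum = 0;
  char *tok = strtok(arglist, DEMO_ADEL_S); /* primitive name, skipped */
  if (tok != NULL)
    for (tok = strtok(NULL, DEMO_ADEL_S); tok != NULL;
         tok = strtok(NULL, DEMO_ADEL_S))
      sum += strtol(tok, NULL, 10);
  int need = snprintf(NULL, 0, "%ld", sum);     /* digits required */
  char *grown = realloc(res, (size_t)need + 1); /* resize via realloc only */
  if (grown == NULL) return res; /* keep the old buffer and old *reslen */
  snprintf(grown, (size_t)need + 1, "%ld", sum);
  *reslen = need; /* length without the terminating NUL */
  return grown;
}
```

Note that `strtok` collapses consecutive delimiters, so empty arguments are silently skipped by handlers written this way.
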
## FAQ

### Why reimplement TRAC and not another scripting language?

Because this is probably the only functional scripting language that can be implemented fully, and even with some useful extensions, in under 1000 SLOC of ANSI C in a truly portable manner. Besides, C implementations of other embeddable scripting languages are easy to find and pick up, but for TRAC, at the time nntrac was created, there existed nothing like that except a GPL-ed T-84 version that's hard to build with any modern C compiler.

### Why was the T-64 standard chosen as the basis, not T-84 or T2001?

While having more "batteries included", T-84/T2001 had diverged from the original elegant design by switching to name suffixes to decide what to do with the function return value. This is much less flexible and less convenient for large-scale programs. T-64, on the other hand, can be easily extended (when really necessary) to do all the same things that T-84 allowed out of the box, without sacrificing its core simplicity and flexibility.

### Is nntrac binary-safe?

In general, no. All nntrac programs are expected not to contain null bytes or bytes from 248 to 255. Since nntrac, like all other TRAC dialects, is fully homoiconic and any piece of data can be treated as code, your data must not contain these bytes either. Emitting bytes with these values using the `av` primitive can and most probably will result in undefined behavior.

To process arbitrary binary data in an nntrac script, it is mandatory to convert it into a readable format (like hex or dec) with external tools (like `od` or `xxd`) before feeding it to the script. The nntrac interpreter contains all the features required for your script to be able to process this kind of data.

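
If your host program feeds data to nntrac, a small pre-check and a hex encoder are easy to write in C. The sketch below is an illustration only (both function names are hypothetical): the first helper tests a buffer against the byte restrictions listed above, and the second produces lowercase hex, which never contains any of the forbidden bytes.

```c
#include <stddef.h>

/* Returns 1 if the buffer contains no null bytes and no bytes in the
   248..255 range, 0 otherwise. */
int demo_is_nnt_safe(const unsigned char *buf, size_t len) {
  for (size_t i = 0; i < len; i++)
    if (buf[i] == 0 || buf[i] >= 248) return 0;
  return 1;
}

/* Hex-encode `len` bytes from `buf` into `out`, which must have room
   for 2 * len + 1 bytes (two digits per byte plus the terminating NUL). */
void demo_hex_encode(const unsigned char *buf, size_t len, char *out) {
  static const char digits[] = "0123456789abcdef";
  for (size_t i = 0; i < len; i++) {
    out[2 * i] = digits[buf[i] >> 4];      /* high nibble */
    out[2 * i + 1] = digits[buf[i] & 15];  /* low nibble */
  }
  out[2 * len] = '\0';
}
```

For the three bytes `0x41 0x00 0xff`, the check fails, while the encoded form `4100ff` passes it.
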
### Is nntrac UTF-8-safe?

Mostly. All the internal meta characters are chosen so that they never occur in any valid UTF-8 sequence. All primitives, however, operate on individual bytes, so the primitives that let you input, output or manipulate a single byte or a numbered run of bytes are not UTF-8-safe. These are the `rc`, `cm`, `cc`, `cn`, `ac` and `av` primitives.

### Why do `rc` and `rs` primitives require pressing Return (Enter) even after the metacharacter (`'`) was entered?

They don't require it; your OS does. If you absolutely need per-character input, you need to switch your terminal into unbuffered input mode. On Unix-like systems, this can be done using a wrapper shell script with the `stty` command.

### Why doesn't the `os` primitive capture the shell command output?

Because there is no truly portable way to do this. To capture the output, it's recommended to redirect the command into a file (usually with the `>` or `>>` shell operator) and then read the file contents with the `ff` primitive.

## Credits

Implemented by Luxferre in 2023, released into the public domain with no warranty.

Based on the original specification according to ["Definition and Standard for TRAC T-64 Language"](http://web.archive.org/web/20040531054816/http://tracfoundation.org/trac64/docs/T64definition.pdf) by Calvin N. Mooers (1972).