From 56ccdf057a44b57d19e770ee7cbd9f71aa1a5f6c Mon Sep 17 00:00:00 2001
From: Daniel Silverstone
Date: Sun, 13 May 2012 18:43:24 +0100
Subject: Notes about compiling/executing rulesets

---
 doc/compiling | 193 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 doc/execution |  93 ++++++++++++++++++++++++++++
 2 files changed, 286 insertions(+)
 create mode 100644 doc/compiling
 create mode 100644 doc/execution

diff --git a/doc/compiling b/doc/compiling
new file mode 100644
index 0000000..ef80852
--- /dev/null
+++ b/doc/compiling
@@ -0,0 +1,193 @@
+The nitty-gritty of Lace's compilation process
+==============================================
+
+When you construct a Lace engine, you give it a compilation callback
+set. That set is used to call you back when Lace encounters something it
+needs help compiling. The structure of it is:
+
+{ [".lace"] = {
+   loader = function(compcontext, nametoload) ... end,
+   commands = {
+      ... = function(compcontext, words...) ... end,
+   },
+   controltype = {
+      ... = function(compcontext, type, words...) ... end,
+   }
+} }
+
+Anything outside of the ".lace" entry in the context is considered
+fair game structure-wise and can be used by the functions called back
+to acquire internal pointers etc. Note however that the compilation
+context will not be passed around during execution so if you need to
+remember some of it in order to function properly, then it's up to
+you. Also note that anything not defined above in the ".lace" entry
+is considered "private" to Lace and should not be touched by non-Lace
+code. Lace will usually be considerate and will not create arbitrary
+entries in that table which do not start with a dot. This allows for
+forward/backward compatibility to some extent.
+
+In addition, Lace will maintain a 'source' entry in the .lace table
+with the lexed source which is being compiled and, if we're compiling
+an included source, a parent entry with the compilation context of the
+parent.
+The toplevel field is the compilation context of the top
+level compilation. If parent is nil, then toplevel will equal
+compcontext. Lace also maintains a 'linenr' entry with the
+currently-being-compiled line number, so that commands and control
+types can use that in error reports if necessary.
+
+If loader is absent then the include statement refuses to include
+anything mandatory. If it is present but returns nil when called then
+the include statement fails any mandatory include for which it does so.
+
+Otherwise the loader is expected to return the 'real' name of the
+source and the content of it. This allows for symbolic lookups.
+
+If Lace encounters a command it does not recognise then it will call
+context.commands[cmdname] passing in the words representing the line
+in question. It's up to that function to compile the line or else to
+return an error. (More later)
+
+If Lace encounters a control type during a define command which it
+does not understand, then it calls the context.controltype[name]
+function passing in all the remaining arguments. The control type
+function is expected to return a compiled set for the define or else
+an error. (More later)
+
+To start a Lace engine compiling a ruleset, simply do (pseudocode):
+
+    rules, err = lace.compiler.compile(compcontext, sourcename[, sourcecontent])
+
+If sourcecontent is not given, Lace will use the loader in the
+compcontext to load the source.
+
+If rules is nil, err is a Lua error.
+If rules is false, err is a nice error from compilation.
+Otherwise, rules should be a table for the ruleset.
+
+Internally, once compiled, Lace rulesets are a list of tables. Each
+rule entry has a reference to its source and line number. It then has
+a function pointer for executing this rule, and a set of arguments to
+give the rule. Lace automatically passes the execution context as the
+first argument to the rule. Sub-included rulesets are simply one of
+the arguments to the function used to run the rule.
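The callback set described above can be sketched in plain Lua without
loading Lace at all. Everything named here (the sources table, the
go_fish command body, the equals control type, the last_message field)
is hypothetical and exists only for illustration. The sketch also
assumes that the first word passed to a command is the command name and
that control type words arrive as separate arguments; the notes do not
state either outright. Error tables follow the form given under
"Compiler internal errors".

```lua
-- Sketch only: plain Lua, Lace itself is never loaded here.
local unpack = table.unpack or unpack  -- works on Lua 5.1 and 5.2+

-- Hypothetical in-memory store standing in for files, URLs and so on.
local sources = {
   ["main-rules"] = 'go_fish "I have no bananas"',
}

-- Loader: returns (realname, content), or nil plus an "internal" error.
local function loader(compcontext, nametoload)
   local content = sources[nametoload]
   if content == nil then
      return nil, { msg = "No such source: " .. nametoload, words = {} }
   end
   return nametoload, content
end

-- Command compiler. Assumption: words[1] is the command name itself.
local function compile_go_fish(compcontext, ...)
   local words = { ... }
   if #words ~= 2 then
      -- An empty words list means the error covers the entire line.
      return false, { msg = "go_fish expects exactly one argument", words = {} }
   end
   -- Runs at execution time; returning true means "carry on".
   local function exec_go_fish(exec_context, message)
      exec_context.last_message = message  -- hypothetical side effect
      return true
   end
   return { fn = exec_go_fish, args = { words[2] } }
end

-- Control type compiler. Assumption: words arrive as separate arguments.
local function compile_equals(compcontext, ctype, key, value)
   if key == nil or value == nil then
      return false, { msg = "equals expects a key and a value", words = {} }
   end
   -- Returns true or false, never nil, so it never signals an error.
   local function test_equals(exec_context, k, v)
      return exec_context[k] == v
   end
   return { fn = test_equals, args = { key, value } }
end

local compcontext = {
   [".lace"] = {
      loader = loader,
      commands = { go_fish = compile_go_fish },
      controltype = { equals = compile_equals },
   },
}

-- Call the callbacks directly, roughly as the compiler is described as doing.
local realname, content = loader(compcontext, "main-rules")
local cmdtab = compile_go_fish(compcontext, "go_fish", "I have no bananas")
local ctrltab = compile_equals(compcontext, "equals", "user", "alice")
```

Keeping the compile-time function separate from the returned fn mirrors
the note above that the compilation context is not passed around during
execution: anything the rule needs later has to travel in args.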
+
+Loader
+======
+
+When Lace wishes to load an entry, it calls the loader function. This
+is to allow rulesets to be loadable from arbitrary locations such as
+files on disk, HTTP URLs, random bits of memory or even out of version
+control repositories directly.
+
+The loader is given the compilation context and the name of the source
+to load. Note that while it has the compilation context, the loader
+function must be sensitive to the case of the initial load. Under
+that circumstance, the source information in the compilation context
+will be unavailable. The loader function is required to fit the
+following pseudocode definition:
+
+    realname, content = loader(compcontext, nametoload)
+
+If realname is not a string then content is expected to be an
+"internal" error message (see below) which will be prefixed with the
+calling source position etc and assembled into an error to return to
+the caller of lace.compiler.compile.
+
+If realname is a string then it is taken to be the real name of the
+loaded content (at worst you should return nametoload here) and
+content is a string representing the contents of that file.
+
+Once the content is loaded, Lace will then compile that sub-ruleset
+before continuing with the current ruleset.
+
+Commands
+========
+
+When Lace wishes to compile a command for which it has no internal
+definition, it will call the command function provided in the
+compilation context. If no such command function is found, it will
+produce an error and stop the compilation.
+
+The command functions must fit the following pseudocode definition:
+
+    cmdtab, msg = command_func(compcontext, words...)
+
+If cmdtab is not a table, msg should be an "internal" error message
+(see below) which will be prefixed with the calling source position
+etc and assembled into an error to return to the caller of
+lace.compiler.compile.
+
+If cmdtab is a table, it is taken to be the compiled table
+representing the command to run at ruleset execution time.
+It should have the form:
+
+    { fn = exec_function, args = {...} }
+
+Lace will automatically augment that with the source information which
+led to the compiled rule for use later.
+
+The exec_function is expected to fit the following pseudocode
+definition:
+
+    result, msg = exec_function(exec_context, unpack(args))
+
+See doc/execution for notes on how these exec_function functions are
+meant to behave.
+
+Control Types
+=============
+
+When Lace is compiling a definition rule with a control type it does
+not implement internally, Lace will call the controltype function
+associated with it (or report an error if no such control type is found).
+
+The control type functions must fit the following pseudocode
+definition:
+
+    ctrltab, msg = controltype_func(compcontext, type, words...)
+
+If ctrltab is not a table, msg should be an "internal" error message
+(see below) which will be prefixed with the calling source position
+etc and assembled into an error to return to the caller of
+lace.compiler.compile.
+
+If ctrltab is a table, it is taken to be the compiled table
+representing the control type to run at ruleset execution time. It
+should have the form:
+
+    { fn = ct_function, args = {...} }
+
+The ct_function is expected to fit the following pseudocode
+definition:
+
+    result, msg = ct_function(exec_context, unpack(args))
+
+See doc/execution for notes on how these ct_function functions are
+meant to behave.
+
+Compiler internal errors
+========================
+
+Error messages during compilation are of the form:
+
+{
+   msg = "some string with no newlines",
+   words = { ... }
+}
+
+Where words is a list of the numeric indices of the words which caused
+the error. If words is empty (or nil) then the error is considered to
+be the entire line.
+
+Lace will use this information to construct meaningful long error
+messages which point at the words in question.
+Such as:
+
+myruleset:6: Unknown command name: 'go_fish'
+    go_fish "I have no bananas"
+    ^^^^^^^
+
+In the case of control type compilation, the words will automatically
+be offset by the appropriate number to account for the define words.
+This means you should always 1-index from your arguments.
+
+The same kind of situation occurs during execution.
diff --git a/doc/execution b/doc/execution
new file mode 100644
index 0000000..f17ca1e
--- /dev/null
+++ b/doc/execution
@@ -0,0 +1,93 @@
+Execution of Lace rulesets
+==========================
+
+Once compiled, a ruleset is essentially a sequence of functions to
+call on the execution context. The simplest execution context is an
+empty table. If Lace is going to store anything it will use a ".lace"
+prefix as with compilation contexts.
+
+A few important functions make up the execution engine. The top level
+function is simply:
+
+    result, msg = lace.engine.run(ruleset, exec_context)
+
+This will run the ruleset with the given execution context and return
+a simple result.
+
+If the result is nil, then msg is a long-form string error explaining
+what went wrong. It represents a Lua error being caught and as such
+you may not want to report it to your users.
+
+If the result is false, then msg is a long-form string error
+explaining that something returned an error during execution which it
+would be reasonable to report to users.
+
+If the result is "allow", then msg is an optional string saying why
+the ruleset resulted in an allow. Ditto for "deny". Essentially any
+string might be a reason. This is covered below in Commands.
+
+Commands
+========
+
+When a command is being run, it is called as:
+
+    result, msg = command_fn(exec_context, unpack(args))
+
+where args are the arguments it returned when being compiled.
+
+If the function throws an error, that will be caught and processed by
+the execution engine.
+
+If result is falsy (nil or false) then the command is considered to
+have failed for some reason and msg contains an "internal" error
+message to report to the user. This aborts the execution of the
+ruleset.
+
+If result is true, then the command successfully ran, and execution
+continues at the next rule.
+
+If result is a string, then the command returned a result. This
+ceases execution of the ruleset and the result and message (which must
+be a string explanation) are returned to the caller. Typically such
+results would be "allow" or "deny" although there's nothing forcing
+that to be the case.
+
+Control Types
+=============
+
+When a control type function is being run, it is called as:
+
+    result, msg = ct_fn(exec_context, unpack(args))
+
+where args are the arguments it returned when being compiled.
+
+If the function throws an error, it will be caught and processed by
+the execution engine.
+
+If result is nil then msg is an "internal" error; execution will be
+stopped and the issue reported to the caller.
+
+If result is false, the control call failed and returned falsehood.
+Anything else and the control call succeeded and returned truth.
+
+Control type functions are called at the point of test, not at the
+point of definition. Control type results are *NOT* cached. It is up
+to the called functions to perform any caching/memoising of results as
+needed to ensure suitably performant behaviour.
+
+Helper functions
+================
+
+Since sometimes you need to know if a given define rule passes, Lace
+provides a function to do this. It is bound up in the behaviour of
+Lace's internal 'define' command and as such, you should treat it as a
+black box.
+
+    result, msg = lace.engine.test(exec_context, name)
+
+This, via the magic of the execution context, calls through to the
+appropriate control type functions, returning their results directly.
+
+This means that it can throw an error in the case of a Lua error;
+otherwise it returns the two values as above.
+
-- 
cgit v1.2.1
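The result handling that doc/execution describes for commands can be
sketched as a small loop. This is a stand-in written for illustration,
not the real lace.engine.run: the name run_sketch is hypothetical, the
error text is not expanded into Lace's long form with source positions,
and the behaviour when no rule produces a result is an assumption,
since the notes do not cover that case.

```lua
-- Sketch only: a stand-in loop, not the real lace.engine.run.
local unpack = table.unpack or unpack  -- works on Lua 5.1 and 5.2+

local function run_sketch(ruleset, exec_context)
   for _, rule in ipairs(ruleset) do
      -- pcall turns a thrown Lua error into ordinary return values.
      local ok, result, msg = pcall(rule.fn, exec_context, unpack(rule.args))
      if not ok then
         -- Lua error caught: on failure pcall's second value is the error.
         return nil, tostring(result)
      elseif not result then
         -- Command failed; msg is expected to be an "internal" error table.
         return false, type(msg) == "table" and msg.msg or tostring(msg)
      elseif type(result) == "string" then
         -- A string result such as "allow" or "deny" ends execution.
         return result, msg
      end
      -- Anything else (normally true): continue at the next rule.
   end
   -- Assumption: treat running off the end of the ruleset as a failure.
   return false, "ruleset ended without a result"
end

-- Three hand-written rule entries in the { fn = ..., args = ... } form.
local rules = {
   { fn = function(ctx) ctx.seen = true; return true end, args = {} },
   { fn = function(ctx, why) return "allow", why end, args = { "because" } },
   { fn = function() error("never reached") end, args = {} },
}
local ctx = {}
local result, msg = run_sketch(rules, ctx)
```

Here the first rule returns true so the loop moves on, the second
returns a string so the loop stops, and the third rule is never called.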