Notes about compiling/executing rulesets

author: Daniel Silverstone <dsilvers@digital-scurf.org> 2012-05-13 18:43:24 +0100
committer: Daniel Silverstone <dsilvers@digital-scurf.org> 2012-05-13 18:43:24 +0100
commit: 56ccdf057a44b57d19e770ee7cbd9f71aa1a5f6c (patch)
tree: 3056bf769845d6b874ffa63b2dd4296c728ceb5f
parent: 74173cb8555e76b5739fdabbed3430dec53b0503 (diff)
download: lace-56ccdf057a44b57d19e770ee7cbd9f71aa1a5f6c.tar.gz
2 files changed, 286 insertions, 0 deletions
diff --git a/doc/compiling b/doc/compiling
new file mode 100644
index 0000000..ef80852
--- /dev/null
+++ b/doc/compiling
@@ -0,0 +1,193 @@
+The nitty-gritty of Lace's compilation process
+==============================================
+
+When you construct a Lace engine, you give it a compilation callback
+set.  That set is used to call you back when Lace encounters something it
+needs help compiling.  The structure of it is:
+
+{ [".lace"] = {
+   loader = function(compcontext, nametoload) ... end,
+   commands = {
+      ... = function(compcontext, words...) ... end,
+   },
+   controltype = {
+      ... = function(compcontext, type, words...) ... end,
+   }
+} }
+
+Anything outside of the ".lace" entry in the context is considered
+fair game structure-wise and can be used by the functions called back
+to acquire internal pointers etc.  Note however that the compilation
+context will not be passed around during execution so if you need to
+remember some of it in order to function properly, then it's up to
+you.  Also note that anything not defined above in the ".lace" entry
+is considered "private" to Lace and should not be touched by non-Lace
+code.  Lace will usually be considerate and will not create arbitrary
+entries in that table which do not start with a dot.  This allows for
+forward/backward compatibility to some extent.
+
+In addition, Lace will maintain a 'source' entry in the .lace table
+with the lexed source which is being compiled and, if we're compiling
+an included source, a parent entry with the compilation context of the
+parent.  The toplevel field is the compilation context of the top
+level compilation.  If parent is nil, then toplevel will equal
+compcontext.  Lace also maintains a 'linenr' entry with the
+currently-being-compiled line number, so that commands and control
+types can use that in error reports if necessary.
+
+If loader is absent then the include statement refuses to include
+anything mandatory.  If it is present but returns nil when called then
+the include statement fails any mandatory includes which do so.
+
+Otherwise the loader is expected to return the 'real' name of the
+source and the content of it.  This allows for symbolic lookups.
+
+If Lace encounters a command it does not recognise then it will call
+compcontext[".lace"].commands[cmdname], passing in the words
+representing the line in question.  It's up to that function to
+compile the line or else to return an error.  (More later)
+
+If Lace encounters a control type during a define command which it
+does not understand, then it calls the
+compcontext[".lace"].controltype[name] function, passing in all the
+remaining arguments.  The control type function is expected to return
+a compiled set for the define or else an error.  (More later)
+
+To start a Lace engine compiling a ruleset, simply do (pseudocode):
+
+    rules, err = lace.compiler.compile(compcontext, sourcename[, sourcecontent])
+
+If sourcecontent is not given, Lace will use the loader in the
+compcontext to load the source.
+
+If rules is nil, err is a Lua error.
+If rules is false, err is a nice error from compilation.
+Otherwise, rules should be a table for the ruleset.
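+
+As a rough illustration of that three-way contract, a caller might
+classify the return values like this.  The helper name
+classify_compile is hypothetical and not part of Lace; only the
+meaning of nil, false and a table is taken from the text above.
+
+    -- Hypothetical helper: interpret the (rules, err) pair returned
+    -- by lace.compiler.compile.
+    local function classify_compile(rules, err)
+       if rules == nil then
+          -- A Lua error was caught; probably not for end users.
+          return "lua-error", tostring(err)
+       elseif rules == false then
+          -- A "nice" compilation error, suitable for reporting.
+          return "compile-error", tostring(err)
+       end
+       -- Anything else should be the compiled ruleset table.
+       return "ok", rules
+    end
+
+It could then be called as
+classify_compile(lace.compiler.compile(compcontext, "main")), where
+"main" is only a placeholder source name.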
+
+Internally, once compiled, Lace rulesets are a list of tables.  Each
+rule entry has a reference to its source and line number.  It then has
+a function pointer for executing this rule, and a set of arguments to
+give the rule.  Lace automatically passes the execution context as the
+first argument to the rule.  Sub-included rulesets are simply one of
+the arguments to the function used to run the rule.
+
+Loader
+======
+
+When Lace wishes to load an entry, it calls the loader function.  This
+is to allow rulesets to be loadable from arbitrary locations such as
+files on disk, HTTP URLs, random bits of memory or even out of version
+control repositories directly.
+
+The loader is given the compilation context and the name of the source
+to load.  Note that while it has the compilation context, the loader
+function must be sensitive to the case of the initial load.  Under
+that circumstance, the source information in the compilation context
+will be unavailable.  The loader function is required to fit the
+following pseudocode definition:
+
+    realname, content = loader(compcontext, nametoload)
+
+If realname is not a string then content is expected to be an
+"internal" error message (see below) which will be prefixed with the
+calling source position etc and assembled into an error to return to
+the caller of lace.compiler.compile.
+
+If realname is a string then it is taken to be the real name of the
+loaded content (at worst you should return nametoload here) and
+content is a string representing the contents of that file.
+
+Once loaded with the loader, Lace will then compile that sub-ruleset
+before continuing with the current ruleset.
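+
+A minimal sketch of a loader which serves rulesets from a Lua table
+might look as follows.  The sources field and the names in it are
+assumptions made for this example; the field lives outside the
+".lace" entry so it does not collide with anything Lace owns.
+
+    -- Hypothetical loader backed by an in-memory table.
+    local function loader(compcontext, nametoload)
+       local content = (compcontext.sources or {})[nametoload]
+       if type(content) ~= "string" then
+          -- The first value is not a string, so the second is an
+          -- "internal" error; empty words means the whole line.
+          return false, {
+             msg = "Unable to find source: " .. tostring(nametoload),
+             words = {},
+          }
+       end
+       -- No symbolic lookup here, so reuse the requested name.
+       return nametoload, content
+    end
+
+    local compcontext = {
+       [".lace"] = { loader = loader },
+       sources = { main = "ruleset text goes here" },
+    }
+
+Note that this loader never looks at the source information in the
+compilation context, so it behaves the same for the initial load.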
+
+Commands
+========
+
+When Lace wishes to compile a command for which it has no internal
+definition, it will call the command function provided in the
+compilation context.  If no such command function is found, it will
+produce an error and stop the compilation.
+
+The command functions must fit the following pseudocode definition:
+
+    cmdtab, msg = command_func(compcontext, words...)
+
+If cmdtab is not a table, msg should be an "internal" error message
+(see below) which will be prefixed with the calling source position
+etc and assembled into an error to return to the caller of
+lace.compiler.compile.
+
+If cmdtab is a table, it is taken to be the compiled table
+representing the command to run at ruleset execution time.  It should
+have the form:
+
+    { fn = exec_function, args = {...} }
+
+Lace will automatically augment that with the source information which
+led to the compiled rule for use later.
+
+The exec_function is expected to fit the following pseudocode
+definition:
+
+    result, msg = exec_function(exec_context, unpack(args))
+
+See execution for notes on how these exec_function functions are meant
+to behave.
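+
+To make the shape of a command concrete, here is a sketch of a
+made-up "setflag" command.  The command itself, the flags field in
+the execution context and the word indices are all assumptions for
+illustration; in particular it assumes that the first word passed in
+is the command name itself.
+
+    -- Runs at execution time: record the flag and carry on.
+    local function exec_setflag(exec_context, flagname)
+       exec_context.flags = exec_context.flags or {}
+       exec_context.flags[flagname] = true
+       return true
+    end
+
+    -- Runs at compile time: validate the words, build the table.
+    local function compile_setflag(compcontext, cmdname, flagname, ...)
+       if type(flagname) ~= "string" then
+          -- Point at the command word, since nothing follows it.
+          return false, { msg = "Expected a flag name", words = {1} }
+       end
+       if select("#", ...) > 0 then
+          return false, { msg = "Unexpected extra words", words = {3} }
+       end
+       return { fn = exec_setflag, args = { flagname } }
+    end
+
+Such a function would be registered under commands.setflag in the
+".lace" entry of the compilation context.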
+
+Control Types
+=============
+
+When Lace is compiling a definition rule with a control type it has
+not got internally, Lace will call the controltype function associated
+with it (or report an error if no such control type is found).
+
+The control type functions must fit the following pseudocode
+definition:
+
+    ctrltab, msg = controltype_func(compcontext, type, words...)
+
+If ctrltab is not a table, msg should be an "internal" error message
+(see below) which will be prefixed with the calling source position
+etc and assembled into an error to return to the caller of
+lace.compiler.compile.
+
+If ctrltab is a table, it is taken to be the compiled table
+representing the control type to run at ruleset execution time.  It
+should have the form:
+
+    { fn = ct_function, args = {...} }
+
+The ct_function is expected to fit the following pseudocode
+definition:
+
+    result, msg = ct_function(exec_context, unpack(args))
+
+See execution for notes on how these ct_function functions are meant
+to behave.
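+
+As a sketch, a made-up "equals" control type which compares a field
+of the execution context against a fixed value could be written as
+below.  The control type name, its arguments and the use of plain
+fields in the execution context are assumptions for this example.
+
+    -- Runs at test time: nil signals an error, false a failed test.
+    local function ct_equals(exec_context, key, expected)
+       local actual = exec_context[key]
+       if actual == nil then
+          return nil, { msg = "No value for " .. key, words = {} }
+       end
+       return actual == expected
+    end
+
+    -- Runs at compile time for "define <name> equals <key> <value>".
+    local function compile_equals(compcontext, ctype, key, expected)
+       if key == nil or expected == nil then
+          -- Word 1 is taken here to be the control type word.
+          return false, { msg = "Expected a key and a value",
+                          words = {1} }
+       end
+       return { fn = ct_equals, args = { key, expected } }
+    end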
+
+Compiler internal errors
+========================
+
+Error messages during compilation are of the form:
+
+{
+   msg = "some string with no newlines",
+   words = { ... }
+}
+
+Where words is a list of the numeric indices of the words which
+caused the error.  If words is empty (or nil) then the error is
+considered to apply to the entire line.
+
+Lace will use this information to construct meaningful long error
+messages which point at the words in question, such as:
+
+myruleset:6: Unknown command name: 'go_fish'
+             go_fish "I have no bananas"
+             ^^^^^^^
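+
+A message of that shape could be assembled roughly as follows.  This
+is only a sketch and not Lace's own code: render_error is a
+hypothetical name, and spans (a list of {first, last} character
+positions, one per word) stands in for whatever position data the
+lexer actually records.
+
+    local function render_error(sourcename, linenr, line, spans, err)
+       local prefix = ("%s:%d: "):format(sourcename, linenr)
+       local indent = (" "):rep(#prefix)
+       local marks = {}
+       for i = 1, #line do marks[i] = " " end
+       if err.words == nil or #err.words == 0 then
+          -- No words given: the error covers the entire line.
+          for i = 1, #line do marks[i] = "^" end
+       else
+          for _, w in ipairs(err.words) do
+             local span = spans[w]
+             if span then
+                for i = span[1], span[2] do marks[i] = "^" end
+             end
+          end
+       end
+       -- Drop trailing spaces from the marker line.
+       local underline = table.concat(marks):gsub("%s+$", "")
+       return table.concat({
+          prefix .. err.msg,
+          indent .. line,
+          indent .. underline,
+       }, "\n")
+    end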
+
+In the case of control type compilation, the words will automatically
+be offset by the appropriate number to account for the define words.
+This means you should always number words from 1, counting from the
+first of the words passed to your function.
+
+The same kind of situation occurs during execution.
diff --git a/doc/execution b/doc/execution
new file mode 100644
index 0000000..f17ca1e
--- /dev/null
+++ b/doc/execution
@@ -0,0 +1,93 @@
+Execution of Lace rulesets
+==========================
+
+Once compiled, a ruleset is essentially a sequence of functions to
+call on the execution context.  The simplest execution context is an
+empty table.  If Lace is going to store anything it will use a ".lace"
+prefix as with compilation contexts.
+
+A few important functions make up the execution engine.  The top level
+function is simply:
+
+    result, msg = lace.engine.run(ruleset, exec_context)
+
+This will run the ruleset with the given execution context and return
+a simple result.
+
+If the result is nil, then msg is a long-form string error explaining
+what went wrong.  It represents a Lua error being caught and as such
+you may not want to report it to your users.
+
+If the result is false, then msg is a long-form string error
+explaining that something returned an error during execution which it
+would be reasonable to report to users.
+
+If the result is "allow", then msg is an optional string saying why
+the ruleset resulted in an allow.  Ditto for "deny".  Essentially any
+string might be a reason.  This is covered below in Commands.
+
+Commands
+========
+
+When a command is being run, it is called as:
+
+    result, msg = command_fn(exec_context, unpack(args))
+
+where args are the arguments it returned when being compiled.
+
+If the function throws an error, that will be caught and processed by
+the execution engine.
+
+If result is falsehood (nil, false) then the command is considered to
+have failed for some reason and msg contains an "internal" error
+message to report to the user.  This aborts the execution of the
+ruleset.
+
+If result is true, then the command successfully ran, and execution
+continues at the next rule.
+
+If result is a string, then the command returned a result.  This
+ceases execution of the ruleset and the result and message (which must
+be a string explanation) are returned to the caller.  Typically such
+results would be "allow" or "deny" although there's nothing forcing
+that to be the case.
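+
+The rules above can be modelled with a small loop.  This is a sketch
+of the described behaviour only, not Lace's engine: run_sketch is a
+hypothetical name, the error message is flattened to its msg field
+rather than rendered in long form, and what happens when a ruleset
+ends without producing a result is an assumption.
+
+    local unpack = table.unpack or unpack
+
+    local function run_sketch(ruleset, exec_context)
+       for _, rule in ipairs(ruleset) do
+          local ok, result, msg =
+             pcall(rule.fn, exec_context, unpack(rule.args))
+          if not ok then
+             -- A thrown Lua error; result holds the error value.
+             return nil, tostring(result)
+          elseif not result then
+             -- The command failed; abort the ruleset.
+             local text = type(msg) == "table" and msg.msg
+             return false, text or tostring(msg)
+          elseif type(result) == "string" then
+             -- A result such as "allow" or "deny" ends execution.
+             return result, msg
+          end
+          -- Otherwise the command succeeded; try the next rule.
+       end
+       -- Assumption: running off the end is reported as a failure.
+       return false, "Ruleset ended without a result"
+    end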
+
+Control Types
+=============
+
+When a control type function is being run, it is called as:
+
+    result, msg = ct_fn(exec_context, unpack(args))
+
+where args are the arguments it returned when being compiled.
+
+If the function throws an error, it will be caught and processed by
+the execution engine.
+
+If result is nil then msg is an "internal" error; execution will be
+stopped and the issue reported to the caller.
+
+If result is false, the control call failed and returned falsehood.
+Anything else and the control call succeeded and returns truth.
+
+Control type functions are called at the point of test, not at the
+point of definition.  Control type results are *NOT* cached.  It is up
+to the called functions to perform any caching/memoising of results as
+needed to ensure suitably performant behaviour.
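+
+One way a control type function might memoise its own results is to
+keep a small cache in the execution context.  The wrapper below is a
+sketch under that assumption; memoise_ct and the ct_cache field are
+hypothetical names, and errors (a nil result) are deliberately never
+cached.
+
+    local function memoise_ct(key, ct_fn)
+       return function(exec_context, ...)
+          local cache = exec_context.ct_cache or {}
+          exec_context.ct_cache = cache
+          local entry = cache[key]
+          if entry then
+             return entry.result
+          end
+          local result, msg = ct_fn(exec_context, ...)
+          if result == nil then
+             return nil, msg
+          end
+          -- Wrap the value so that a cached false is still a hit.
+          cache[key] = { result = result }
+          return result
+       end
+    end
+
+Since the cache lives in the execution context, each run of a
+ruleset with a fresh context starts with nothing cached.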
+
+Helper functions
+================
+
+Since sometimes you need to know if a given define rule passes, Lace
+provides a function to do this.  It is bound up in the behaviour of
+Lace's internal 'define' command and as such, you should treat it as a
+black box.
+
+    result, msg = lace.engine.test(exec_context, name)
+
+This, via the magic of the execution context, calls through to the
+appropriate control type functions, returning their results directly.
+
+This means that it can throw an error in the case of a Lua error,
+otherwise it returns the two values as above.
+