author Dodji Seketeli <dodji@seketeli.org> 2003-06-22 14:47:14 +0000
committer Dodji Seketeli <dodji@src.gnome.org> 2003-06-22 14:47:14 +0000
commit a9864bf9c61c34a7403df508257cca5c30bb50d2
tree fb14350cb5d1ec2dfe9809de9127d6a40c9e21cf
parent 676a4e811066dd4d98dd4ca83eac7f04b4f3b089
An on going parser architecture document.
2003-06-22 Dodji Seketeli <dodji@seketeli.org> An on going parser architecture document. Dodji.
-rw-r--r-- docs/design/parser-architecture.txt 70
1 files changed, 70 insertions, 0 deletions
diff --git a/docs/design/parser-architecture.txt b/docs/design/parser-architecture.txt
new file mode 100644
index 0000000..094c055
--- /dev/null
+++ b/docs/design/parser-architecture.txt
@@ -0,0 +1,70 @@
+Libcroco parser architecture
+-----------------------------
+
+Author: Dodji Seketeli <dodji@seketeli.org>
+
+$Id$
+
+I) Forethoughts.
+===================
+
+Libcroco's parser is a simple recursive descent parser.
+The major design focus has been simplicity, reliability and
+conformance.
+
+Simplicity
+-----------
+We want the code to be maintainable by anyone who knows the CSS spec
+and who knows how to code in C. Therefore, we avoid overusing
+C preprocessor magic and all the tricks that tend to turn C into
+a maintenance nightmare.
+
+We also try to adhere to the GNOME coding guidelines specified
+at http://developer.gnome.org/doc/guides/programming-guidelines.
+
+
+Reliability
+-----------
+No function of the libcroco library should ever crash,
+whatever arguments it is given.
+As a consequence, we tend to be paranoid about, for example, checking
+pointer values before dereferencing them.
+
+Conformance
+-----------
+We try to stick to the CSS spec. We know this is almost impossible to achieve
+given the resources we have, but we think it is a sane target to chase.
+
+II) Overall architecture
+=========================
+The parser is organized around three main classes:
+
+1/ CRInput
+2/ CRTknzr (Tokenizer or lexer)
+3/ CRParser
+
+II.1 The CRInput class
+-----------------------
+The CRInput class provides the abstraction of
+a UTF-8-encoded character stream.
+
+Ideally, it should abstract local data sources
+(local files and in-memory buffers)
+and remote data sources (sockets, URL-identified resources), but at the
+moment it abstracts local data sources only.
+
+Adding a new type of data source should be transparent to the
+classes that already use CRInput. After all, that is what abstraction is about :)
+
+
+II.2 The CRTknzr class
+-------------------
+The main job of the tokenizer (or lexer) is to
+provide a get_next_token () method.
+This method returns the next CSS token found in the input stream.
+(Note that the input stream here is an instance of CRInput.)
+
+This provides an extremely useful facility to the parser.
+
+II.3 The CRParser class
+-------------------------
\ No newline at end of file
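
The layering described in section II of the added document (an input abstraction feeding a tokenizer that exposes a get_next_token () operation, with every pointer checked before use) can be illustrated with a minimal sketch. All names below (sketch_input, sketch_tknzr_get_next_token, the status codes, the two token types) are hypothetical and chosen for this illustration only. They are not libcroco's actual CRInput or CRTknzr API, and the toy token rules are far simpler than the CSS grammar.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical status codes for this sketch; not libcroco's real enum. */
enum sketch_status { SKETCH_OK = 0, SKETCH_BAD_PARAM, SKETCH_END_OF_INPUT };

/* Hypothetical stand-in for CRInput: a stream over an in-memory buffer. */
struct sketch_input {
    const char *buf;
    size_t len;
    size_t pos;
};

enum sketch_token_type { SKETCH_TOK_IDENT, SKETCH_TOK_DELIM };

struct sketch_token {
    enum sketch_token_type type;
    const char *start; /* points into the input buffer, not copied */
    size_t len;
};

/* Every entry point validates its pointers and reports an error
 * instead of dereferencing NULL. */
enum sketch_status
sketch_input_init(struct sketch_input *in, const char *buf, size_t len)
{
    if (in == NULL || buf == NULL)
        return SKETCH_BAD_PARAM;
    in->buf = buf;
    in->len = len;
    in->pos = 0;
    return SKETCH_OK;
}

/* Look at the next byte without consuming it. */
enum sketch_status
sketch_input_peek(const struct sketch_input *in, char *out)
{
    if (in == NULL || out == NULL || in->buf == NULL)
        return SKETCH_BAD_PARAM;
    if (in->pos >= in->len)
        return SKETCH_END_OF_INPUT;
    *out = in->buf[in->pos];
    return SKETCH_OK;
}

/* Toy tokenizer: skips whitespace, then yields either an identifier
 * (a letter followed by letters, digits, '-' or '_') or a
 * one-character delimiter. */
enum sketch_status
sketch_tknzr_get_next_token(struct sketch_input *in, struct sketch_token *tok)
{
    enum sketch_status st;
    size_t start;
    char c;

    if (in == NULL || tok == NULL)
        return SKETCH_BAD_PARAM;

    while ((st = sketch_input_peek(in, &c)) == SKETCH_OK
           && isspace((unsigned char)c))
        in->pos++;
    if (st != SKETCH_OK)
        return st; /* end of input or bad parameter */

    start = in->pos;
    if (isalpha((unsigned char)c)) {
        while (sketch_input_peek(in, &c) == SKETCH_OK
               && (isalnum((unsigned char)c) || c == '-' || c == '_'))
            in->pos++;
        tok->type = SKETCH_TOK_IDENT;
    } else {
        in->pos++;
        tok->type = SKETCH_TOK_DELIM;
    }
    tok->start = in->buf + start;
    tok->len = in->pos - start;
    return SKETCH_OK;
}
```

A caller would initialize the input over a buffer and then call sketch_tknzr_get_next_token () in a loop until it stops returning SKETCH_OK. Because the tokenizer only reads through sketch_input_peek (), a different data source could be swapped in behind the input structure without touching the tokenizer, which is the transparency the document asks of CRInput.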