Chris Hertel, Samba Team
November 1997

This is a quick overview of the lexical analysis, syntax, and semantics
of the smb.conf file.

Lexical Analysis:

  The file is processed line by line.  The lexical analyzer (params.c)
  recognizes four types of lines:

  Blank lines           - Lines containing only whitespace.
  Comment lines         - Lines beginning with either a semi-colon or a
                          pound sign (';' or '#').
  Section header lines  - Lines beginning with an open square bracket
                          ('[').
  Parameter lines       - Lines beginning with any other character.
                          (The default line type.)

  The first two are handled exclusively by the lexical analyzer, which
  ignores them.  The latter two line types are scanned for

  - Section names
  - Parameter names
  - Parameter values

  These are the only tokens passed to the parameter loader
  (loadparm.c).  Parameter names and values are divided from one
  another by an equal sign: '='.
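
  The four line types above can be told apart by looking at the first
  non-whitespace character.  The following is only a minimal sketch of
  that idea: the enum and the function name classify_line() are
  hypothetical and are not taken from params.c.

```c
#include <ctype.h>

/* Hypothetical names, for illustration only. */
typedef enum { LINE_BLANK, LINE_COMMENT, LINE_SECTION, LINE_PARAM } line_type;

/* Classify one line by its first non-whitespace character. */
line_type classify_line(const char *line)
{
    /* Scan past leading whitespace; a newline marks the end of the line. */
    while (*line != '\0' && *line != '\n' && isspace((unsigned char)*line))
        line++;

    switch (*line) {
    case '\0':
    case '\n':
        return LINE_BLANK;      /* nothing but whitespace */
    case ';':
    case '#':
        return LINE_COMMENT;
    case '[':
        return LINE_SECTION;
    default:
        return LINE_PARAM;      /* the default line type */
    }
}
```

  For instance, classify_line("[global]\n") yields LINE_SECTION and
  classify_line("  workgroup = X\n") yields LINE_PARAM.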


  Handling of Whitespace:

  Whitespace is defined as all characters recognized by the isspace()
  function (see ctype(3C)) except for the newline character ('\n').
  The newline is excluded because it identifies the end of the line.

  - The lexical analyzer scans past white space at the beginning of a
    line.

  - Section and parameter names may contain internal white space.  All
    whitespace within a name is compressed to a single space character. 

  - Internal whitespace within a parameter value is kept verbatim with
    the exception of carriage return characters ('\r'), all of which
    are removed.

  - Leading and trailing whitespace is removed from names and values.
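
  As an illustration of the rules for names, here is a small sketch
  that trims a name and compresses internal whitespace.  The function
  normalize_name() is a hypothetical helper, not the routine used in
  params.c, and it assumes the name has already been isolated from the
  rest of the line.

```c
#include <ctype.h>
#include <stddef.h>

/* Copy a section or parameter name into dst (capacity dstlen), dropping
 * leading and trailing whitespace and compressing each internal run of
 * whitespace to a single space.  Returns the length of the result; output
 * that does not fit is silently truncated. */
size_t normalize_name(const char *src, char *dst, size_t dstlen)
{
    size_t n = 0;
    int pending_space = 0;

    if (dstlen == 0)
        return 0;

    for (; *src != '\0'; src++) {
        if (isspace((unsigned char)*src)) {
            /* Remember a gap only once some text has been copied. */
            pending_space = (n > 0);
        } else {
            size_t need = pending_space ? 2 : 1;

            if (n + need >= dstlen)
                break;                  /* out of room: truncate */
            if (pending_space)
                dst[n++] = ' ';
            dst[n++] = *src;
            pending_space = 0;
        }
    }
    dst[n] = '\0';
    return n;
}
```

  Because a pending space is written only when another character
  follows it, trailing whitespace never reaches the output.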


  Handling of Line Continuation:

  Long section header and parameter lines may be extended across
  multiple lines by use of the backslash character ('\\').  Line
  continuation is ignored for blank and comment lines.

  If the last (non-whitespace) character within a section header or on
  a parameter line is a backslash, then the next line will be
  (logically) concatenated with the current line by the lexical
  analyzer.  For example:

    param name = parameter value string \
    with line continuation.

  Would be read as

    param name = parameter value string     with line continuation.

  Note that there are five spaces following the word 'string',
  representing the one space between 'string' and '\\' in the top
  line, plus the four preceding the word 'with' in the second line.
  (Yes, I'm counting the indentation.)

  Line continuation characters are ignored on blank lines and at the end
  of comments.  They are *only* recognized within section and parameter
  lines.
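
  The joining rule can be sketched as follows.  This is an
  illustration under stated assumptions, not the code in params.c:
  the names continuation_index() and join_continuation() are
  hypothetical, the caller is assumed to have already decided that the
  line is a section header or parameter line, and the next physical
  line is assumed to be passed in without its trailing newline.

```c
#include <ctype.h>
#include <string.h>

/* Return the index of a trailing continuation backslash, or -1 if the
 * last non-whitespace character of the line is not a backslash. */
int continuation_index(const char *line)
{
    int i = (int)strlen(line) - 1;

    while (i >= 0 && isspace((unsigned char)line[i]))
        i--;
    return (i >= 0 && line[i] == '\\') ? i : -1;
}

/* Append the physical line `next` to the logical line held in buf
 * (capacity buflen), overwriting the continuation backslash.  Returns 1
 * if the lines were joined, 0 if buf does not end in a continuation or
 * the result would not fit. */
int join_continuation(char *buf, size_t buflen, const char *next)
{
    int pos = continuation_index(buf);

    if (pos < 0 || (size_t)pos + strlen(next) + 1 > buflen)
        return 0;
    strcpy(buf + pos, next);    /* whitespace on both sides is kept */
    return 1;
}
```

  Since nothing but the backslash is dropped, the space before it and
  the indentation of the following line both end up in the logical
  line, which is where the five spaces in the example come from.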


  Line Continuation Quirks:
  
  Note the following example:

    param name = parameter value string \
    \
    with line continuation.

  The middle line is *not* parsed as a blank line because it is first
  concatenated with the top line.  The result is

    param name = parameter value string         with line continuation.

  The same is true for comment lines.

    param name = parameter value string \
    ; comment \
    with a comment.

  This becomes:
  
    param name = parameter value string     ; comment     with a comment.

  On a section header line, the closing bracket (']') is considered a
  terminating character, and the rest of the line is ignored.  The lines
  
    [ section   name ] garbage \
    param  name  = value

  are read as

    [section name]
    param name = value



Syntax:

  The syntax of the smb.conf file is as follows:

  <file>            :==  { <section> } EOF

  <section>         :==  <section header> { <parameter line> }

  <section header>  :==  '[' NAME ']'

  <parameter line>  :==  NAME '=' VALUE NL


  Basically, this means that
  
    - A file is made up of zero or more sections, and is terminated by
      an EOF (we knew that).

    - A section is made up of a section header followed by zero or more
      parameter lines.

    - A section header is identified by an opening bracket and
      terminated by the closing bracket.  The enclosed NAME identifies
      the section.

    - A parameter line is divided into a NAME and a VALUE.  The *first*
      equal sign on the line separates the NAME from the VALUE.  The
      VALUE is terminated by a newline character (NL = '\n').
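
  The "first equal sign" rule for parameter lines can be sketched as
  below.  The helper split_param() is hypothetical and is not the
  params.c implementation; it splits the line in place and leaves the
  whitespace trimming described earlier to the caller.

```c
#include <string.h>

/* Split a parameter line in place at the *first* '='.  On success *name
 * and *value point into `line` and 1 is returned; a line with no '='
 * returns 0 and is left unmodified. */
int split_param(char *line, char **name, char **value)
{
    char *eq = strchr(line, '=');

    if (eq == NULL)
        return 0;
    *eq = '\0';                                 /* end of NAME */
    *name = line;
    *value = eq + 1;
    (*value)[strcspn(*value, "\n")] = '\0';     /* VALUE ends at NL */
    return 1;
}
```

  Any further equal signs stay in the VALUE, so "path=/tmp/a=b" gives
  the NAME "path" and the VALUE "/tmp/a=b".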


About params.c:

  The parsing of the config file is a bit unusual if you are used to
  lex, yacc, bison, etc.  Both lexical analysis (scanning) and parsing
  are performed by params.c.  Values are loaded via callbacks to
  loadparm.c.

--------------------------------------------------------------------------

                                  Samba DEBUG
                                       
Chris Hertel, Samba Team
July, 1998
   
   Here's the scoop on the update to the DEBUG() system.
   
   First, my goals are:
     * Backward compatibility (i.e., I don't want to break any Samba code
       that already works).
     * Debug output should be timestamped and easy to read (format-wise).
     * Debug output should be parsable by software.
     * There should be convenient tools for composing debug messages.
       
   NOTE: the Debug functionality has been moved from util.c to the new
   debug.c module.
   
New Output Syntax

   The syntax of a debugging log file is represented as:

  <debugfile> :== { <debugmsg> }

  <debugmsg>  :== <debughdr> '\n' <debugtext>

  <debughdr>  :== '[' TIME ',' LEVEL ']' FILE ':' [FUNCTION] '(' LINE ')'

  <debugtext> :== { <debugline> }

  <debugline> :== TEXT '\n'

   TEXT is a string of characters excluding the newline character.
   LEVEL is the DEBUG level of the message (an integer in the range
   0..10).
   TIME is a timestamp.
   FILE is the name of the file from which the debug message was
   generated.
   FUNCTION is the function from which the debug message was generated.
   LINE is the line number of the debug statement that generated the
   message.
   
   Basically, what that all means is:
     * A debugging log file is made up of debug messages.
     * Each debug message is made up of a header and text. The header is
       separated from the text by a newline.
     * The header begins with the timestamp and debug level of the
       message enclosed in brackets. The filename, function, and line
       number at which the message was generated follow. The filename is
       terminated by a colon, and the function name is terminated by the
       parentheses which contain the line number. Depending upon the
       compiler, the function name may be missing (it is generated by the
       __FUNCTION__ macro, which is not universally implemented, dangit).
     * The message text is made up of zero or more lines, each terminated
       by a newline.
       
   Here's some example output:

    [1998/08/03 12:55:25, 1] nmbd.c:(659)
      Netbios nameserver version 1.9.19-prealpha started.
      Copyright Andrew Tridgell 1994-1997
    [1998/08/03 12:55:25, 3] loadparm.c:(763)
      Initializing global parameters

   Note that in the above example the function names are not listed on
   the header line. That's because the example above was generated on an
   SGI Indy, and the SGI compiler doesn't support the __FUNCTION__ macro.
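
   Since one goal is that the output be parsable by software, here is a
   rough sketch of how a reader of the log might pick a header line
   apart.  The debug_hdr structure, its field sizes, and the function
   parse_debug_hdr() are assumptions made for this illustration; they
   are not part of Samba.

```c
#include <stdio.h>
#include <string.h>

typedef struct {
    char time[32];
    int  level;
    char file[64];
    char function[64];   /* empty if the compiler lacks __FUNCTION__ */
    int  line;
} debug_hdr;

/* Parse one header line of the form
 *   [TIME, LEVEL] FILE:[FUNCTION](LINE)
 * Returns 1 on success, 0 if the text is not a header. */
int parse_debug_hdr(const char *s, debug_hdr *h)
{
    int used = 0;
    int end = 0;
    const char *paren;
    size_t fnlen;

    /* %n is only reached (and `used` set) if the ':' was matched. */
    if (sscanf(s, " [%31[^,], %d] %63[^:]:%n",
               h->time, &h->level, h->file, &used) != 3 || used == 0)
        return 0;

    /* FUNCTION is optional, so it is everything up to the '('. */
    s += used;
    paren = strchr(s, '(');
    if (paren == NULL)
        return 0;
    fnlen = (size_t)(paren - s);
    if (fnlen >= sizeof h->function)
        return 0;
    memcpy(h->function, s, fnlen);
    h->function[fnlen] = '\0';

    /* `end` is only set if the closing ')' was matched. */
    return sscanf(paren, "(%d)%n", &h->line, &end) == 1 && end > 0;
}
```

   The FUNCTION field is handled by hand because a scanf() scanset
   cannot match an empty string, which is exactly the case in the SGI
   example above.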
   
The DEBUG() Macro

   Use of the DEBUG() macro is unchanged. DEBUG() takes two parameters.
   The first is the message level, the second is the body of a function
   call to the Debug1() function.
   
   That's confusing.
   
   Here's an example which may help a bit. If you would write

     printf( "This is a %s message.\n", "debug" );

   to send the output to stdout, then you would write

     DEBUG( 0, ( "This is a %s message.\n", "debug" ) );

   to send the output to the debug file.  All of the normal printf()
   formatting escapes work.
   
   Note that in the above example the DEBUG message level is set to 0.
   Messages at level 0 always print.  Basically, if the message level is
   less than or equal to the global value DEBUGLEVEL, then the DEBUG
   statement is processed.
   
   The output of the above example would be something like:

    [1998/07/30 16:00:51, 0] file.c:function(128)
      This is a debug message.
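
   The double parentheses and the level test may be easier to follow
   with a toy version of the macro.  Everything in this sketch is a
   stand-in invented for illustration (SKETCH_DEBUG, sketch_debug1(),
   sketch_debuglevel, sketch_out); it is not how debug.c defines
   DEBUG(), and it writes to a string instead of the debug file and
   produces no header.

```c
#include <stdarg.h>
#include <stdio.h>

int  sketch_debuglevel = 1;     /* stand-in for the global DEBUGLEVEL */
char sketch_out[256];           /* stand-in for the debug file */

/* Stand-in for Debug1(): printf-style formatting into sketch_out. */
int sketch_debug1(const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(sketch_out, sizeof sketch_out, fmt, ap);
    va_end(ap);
    return n;
}

/* `body` is a complete, parenthesized argument list, so placing it right
 * after the function name forms an ordinary call.  The call is only
 * evaluated when the message level does not exceed the global level. */
#define SKETCH_DEBUG(level, body) \
    ((void)(((level) <= sketch_debuglevel) && (sketch_debug1 body)))
```

   With this definition, SKETCH_DEBUG( 0, ( "This is a %s message.\n",
   "debug" ) ) expands to a call of sketch_debug1() with those two
   arguments, while a message at level 5 is skipped entirely as long as
   sketch_debuglevel is 1.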

   Each call to DEBUG() creates a new header *unless* the output produced
   by the previous call to DEBUG() did not end with a '\n'. Output to the
   debug file is passed through a formatting buffer which is flushed
   every time a newline is encountered. If the buffer is not empty when
   DEBUG() is called, the new input is simply appended.

   ...but that's really just a kludge. It was put in place because
   DEBUG() has been used to write partial lines. Here's a simple (dumb)
   example of the kind of thing I'm talking about:

    DEBUG( 0, ("The test returned " ) );
    if( test() )
      DEBUG(0, ("True") );
    else
      DEBUG(0, ("False") );
    DEBUG(0, (".\n") );

   Without the format buffer, the output (assuming test() returned true)
   would look like this:

    [1998/07/30 16:00:51, 0] file.c:function(256)
      The test returned
    [1998/07/30 16:00:51, 0] file.c:function(258)
      True
    [1998/07/30 16:00:51, 0] file.c:function(261)
      .

   Which isn't much use. The format buffer kludge fixes this problem.
   
The DEBUGADD() Macro

   In addition to the kludgey solution to the broken line problem
   described above, there is a clean solution. The DEBUGADD() macro never
   generates a header. It will append new text to the current debug
   message even if the format buffer is empty. The syntax of the
   DEBUGADD() macro is the same as that of the DEBUG() macro.

    DEBUG( 0, ("This is the first line.\n" ) );
    DEBUGADD( 0, ("This is the second line.\nThis is the third line.\n" ) );

   produces:

    [1998/07/30 16:00:51, 0] file.c:function(512)
      This is the first line.
      This is the second line.
      This is the third line.

The DEBUGLVL() Macro

   One of the problems with the DEBUG() macro was that DEBUG() lines
   tended to get a bit long. Consider this example from
   nmbd_sendannounce.c:

  DEBUG(3,("send_local_master_announcement: type %x for name %s on subnet %s for workgroup %s\n",
            type, global_myname, subrec->subnet_name, work->work_group));

   One solution to this is to break it down using DEBUG() and DEBUGADD(),
   as follows:

  DEBUG( 3, ( "send_local_master_announcement: " ) );
  DEBUGADD( 3, ( "type %x for name %s ", type, global_myname ) );
  DEBUGADD( 3, ( "on subnet %s ", subrec->subnet_name ) );
  DEBUGADD( 3, ( "for workgroup %s\n", work->work_group ) );

   A similar, but arguably nicer approach is to use the DEBUGLVL() macro.
   This macro returns True if the message level is less than or equal to
   the global DEBUGLEVEL value, so:

  if( DEBUGLVL( 3 ) )
    {
    dbgtext( "send_local_master_announcement: " );
    dbgtext( "type %x for name %s ", type, global_myname );
    dbgtext( "on subnet %s ", subrec->subnet_name );
    dbgtext( "for workgroup %s\n", work->work_group );
    }

   (The dbgtext() function is explained below.)
   
   There are a few advantages to this scheme:
     * The test is performed only once.
     * You can allocate variables off of the stack that will only be used
       within the DEBUGLVL() block.
     * Processing that is only relevant to debug output can be contained
       within the DEBUGLVL() block.
       
New Functions

   dbgtext()
          This function prints debug message text to the debug file (and
          possibly to syslog) via the format buffer. The function uses a
          variable argument list just like printf() or Debug1(). The
          input is printed into a buffer using the vslprintf() function,
          and then passed to format_debug_text().
          
          If you use DEBUGLVL() you will probably print the body of the
          message using dbgtext(). 
          
   dbghdr()
          This is the function that writes a debug message header.
          Headers are not processed via the format buffer. Also note that
          if the format buffer is not empty, a call to dbghdr() will not
          produce any output. See the comments in dbghdr() for more info.
          
          It is not likely that this function will be called directly. It
          is used by DEBUG() and DEBUGADD().
          
   format_debug_text()
          This is a static function in debug.c. It stores the output text
          for the body of the message in a buffer until it encounters a
          newline. When the newline character is found, the buffer is
          written to the debug file via the Debug1() function, and the
          buffer is reset. This allows us to add the indentation at the
          beginning of each line of the message body, and also ensures
          that the output is written a line at a time (which cleans up
          syslog output).
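
   The line buffering described for format_debug_text() can be sketched
   roughly as follows.  This is not the code in debug.c: the names
   sketch_format_text(), sketch_log and pending are hypothetical, the
   buffer size is arbitrary, and the "debug file" is a string so that
   the effect is easy to inspect.

```c
#include <string.h>

char sketch_log[1024];          /* stand-in for the debug file */

static char   pending[128];     /* the format buffer: one partial line */
static size_t pending_len = 0;

/* Append s to sketch_log, dropping whatever does not fit. */
static void sketch_emit(const char *s)
{
    strncat(sketch_log, s, sizeof sketch_log - strlen(sketch_log) - 1);
}

/* Collect text until a newline arrives (or the buffer fills), then write
 * the whole line at once, indented by two spaces. */
void sketch_format_text(const char *text)
{
    for (; *text != '\0'; text++) {
        if (*text == '\n' || pending_len == sizeof pending - 1) {
            pending[pending_len] = '\0';
            sketch_emit("  ");
            sketch_emit(pending);
            sketch_emit("\n");
            pending_len = 0;
        }
        if (*text != '\n')
            pending[pending_len++] = *text;
    }
}
```

   Feeding it "The test returned ", then "True", then ".\n" writes
   nothing until the newline arrives, at which point the single line
   "  The test returned True." is emitted.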