summaryrefslogtreecommitdiff
path: root/doc/genfile.texi
blob: 1c3de81ff33f5d13e36e5ef1e8643b222cffd3a9 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
@c This is part of the paxutils manual.
@c Copyright (C) 2005--2006, 2009, 2108, 2023 Free Software Foundation, Inc.
@c Written by Sergey Poznyakoff
@c This file is distributed under GFDL 1.1 or any later version
@c published by the Free Software Foundation.

@cindex genfile
    This appendix describes @command{genfile}, an auxiliary program
used in the GNU tar testsuite. If you are not interested in developing
GNU tar, skip this appendix.

    Initially, @command{genfile} was used to generate data files for
the testsuite, hence its name. However, new operation modes were being
implemented as the testsuite grew more sophisticated, and now
@command{genfile} is a multi-purpose instrument.

    There are four basic operation modes:

@table @asis
@item File Generation
    This is the default mode. In this mode, @command{genfile}
generates data files.

@item File Status
    In this mode @command{genfile} displays status of specified files.

@item Set File Time
    Set last access and modification times of files given in the
command line.

@item Synchronous Execution.
    In this mode @command{genfile} executes the given program with
@option{--checkpoint} option and executes a set of actions when
specified checkpoints are reached.
@end table

@menu
* Generate Mode::       File Generation Mode.
* Status Mode::         File Status Mode.
* Set File Time::       Set File Time Mode.
* Exec Mode::           Synchronous Execution Mode.
@end menu

@node Generate Mode
@appendixsec Generate Mode

@cindex Generate Mode, @command{genfile}
@cindex @command{genfile}, generate mode
@cindex @command{genfile}, create file
    In this mode @command{genfile} creates a data file for the test
suite. The size of the file is given with the @option{--length}
(@option{-l}) option. By default the file contents is written to the
standard output, this can be changed using @option{--file}
(@option{-f}) command line option. Thus, the following two commands
are equivalent:

@smallexample
@group
genfile --length 100 > outfile
genfile --length 100 --file outfile
@end group
@end smallexample

    If @option{--length} is not given, @command{genfile} will
generate an empty (zero-length) file.

@cindex @command{genfile}, seeking to a given offset
    The command line option @option{--seek=@var{N}} istructs @command{genfile}
to skip the given number of bytes (@var{N}) in the output file before
writing to it.  It is similar to the @option{seek=@var{N}} of the
@command{dd} utility.

@cindex @command{genfile}, reading a list of file names
    You can instruct @command{genfile} to create several files at one
go, by giving it @option{--files-from} (@option{-T}) option followed
by a name of file containing a list of file names. Using dash
(@samp{-}) instead of the file name causes @command{genfile} to read
file list from the standard input. For example:

@smallexample
@group
# Read file names from file @file{file.list}
genfile --files-from file.list
# Read file names from standard input
genfile --files-from -
@end group
@end smallexample

@cindex File lists separated by NUL characters
    The list file is supposed to contain one file name per line. To
use file lists separated by ASCII NUL character, use @option{--null}
(@option{-0}) command line option:

@smallexample
genfile --null --files-from file.list
@end smallexample

@cindex pattern, @command{genfile}
    The default data pattern for filling the generated file consists
of first 256 letters of ASCII code, repeated enough times to fill the
entire file. This behavior can be changed with @option{--pattern}
option. This option takes a mandatory argument, specifying pattern
name to use. Currently two patterns are implemented:

@table @option
@item --pattern=default
    The default pattern as described above.

@item --pattern=zero
    Fills the file with zeroes.
@end table

    If no file name was given, the program exits with the code
@code{0}.  Otherwise, it exits with @code{0} only if it was able to
create a file of the specified length.

@cindex Sparse files, creating using @command{genfile}
@cindex @command{genfile}, creating sparse files
    Special option @option{--sparse} (@option{-s}) instructs
@command{genfile} to create a sparse file. Sparse files consist of
@dfn{data fragments}, separated by @dfn{holes} or blocks of zeros. On
many operating systems, actual disk storage is not allocated for
holes, but they are counted in the length of the file. To create a
sparse file, @command{genfile} should know where to put data fragments,
and what data to use to fill them. So, when @option{--sparse} is given
the rest of the command line specifies a so-called @dfn{file map}.

    The file map consists of any number of @dfn{fragment
descriptors}. Each descriptor is composed of two values: a number,
specifying fragment offset from the end of the previous fragment or,
for the very first fragment, from the beginning of the file, and
@dfn{contents string}, that specifies the pattern to fill the fragment
with. File offset can be suffixed with the following quantifiers:

@table @samp
@item k
@itemx K
The number is expressed in kilobytes.
@item m
@itemx M
The number is expressed in megabytes.
@item g
@itemx G
The number is expressed in gigabytes.
@end table

    Contents string can be either a fragment size or a pattern.
Fragment size is a decimal number, prefixed with an equals sign.  It
can be suffixed with a quantifier, as discussed above.  If fragment
size is given, the fragment of that size will be filled with the
currently selected pattern (@pxref{Generate Mode, --pattern}) and
written to the file.  

    A pattern is a string of arbitrary ASCII characters.  For each
of them, @command{genfile} will generate a @dfn{block} of data,
filled with that character and will write it to the fragment. The size
of block is given by @option{--block-size} option. It defaults to 512.
Thus, if pattern consists of @var{n} characters, the resulting file
fragment will contain @code{@var{n}*@var{block-size}} bytes of data.

    The last fragment descriptor can have only file offset part. In this
case @command{genfile} will create a hole at the end of the file up to
the given offset.

    A dash appearing as a fragment descriptor instructs
@command{genfile} to read file map from the standard input.  Each line
of input should consist of fragment offset and contents string,
separated by any amount of whitespace.

    For example, consider the following invocation:

@smallexample
genfile --sparse --file sparsefile 0 ABCD 1M EFGHI 2000K
@end smallexample

@noindent
It will create 3101184-bytes long file of the following structure:

@multitable @columnfractions .35 .20 .45
@item Offset  @tab Length       @tab Contents
@item 0       @tab 4*512=2048   @tab Four 512-byte blocks, filled with
letters @samp{A}, @samp{B}, @samp{C} and @samp{D}.
@item 2048    @tab 1046528      @tab Zero bytes
@item 1050624 @tab 5*512=2560   @tab Five blocks, filled with letters
@samp{E}, @samp{F}, @samp{G}, @samp{H}, @samp{I}.
@item 1053184  @tab 2048000     @tab Zero bytes
@end multitable

@cindex --quite, option
    The exit code of @command{genfile --sparse} command is @code{0}
only if created file is actually sparse.  If it is not, the
appropriate error message is displayed and the command exists with
code @code{1}.  The @option{--quite} (@option{-q}) option suppresses
this behavior.  If @option{--quite} is given, @command{genfile
--sparse} exits with code @code{0} if it was able to create the file,
whether the resulting file is sparse or not.

@node Status Mode
@appendixsec Status Mode

    In status mode, @command{genfile} prints file system status for
each file specified in the command line. This mode is toggled by
@option{--stat} (@option{-S}) command line option. An optional argument to this
option specifies output @dfn{format}: a comma-separated list of
@code{struct stat} fields to be displayed. This list can contain
following identifiers:

@table @asis
@item name
    The file name.

@item dev
@itemx st_dev
    Device number in decimal.

@item ino
@itemx st_ino
    Inode number.

@item mode[.@var{number}]
@itemx st_mode[.@var{number}]

@FIXME{Should we also support @samp{%} notations as in stat(1)?}

    File mode in octal.  Optional @var{number} specifies octal mask to
be applied to the mode before outputting.  For example, @code{--stat
mode.777} will preserve lower nine bits of it.  Notice, that you can
use any punctuation character in place of @samp{.}.

@item nlink
@itemx st_nlink
    Number of hard links.

@item uid
@itemx st_uid
    User ID of owner.

@item gid
@itemx st_gid
    Group ID of owner.

@item size
@itemx st_size
    File size in decimal.

@item blksize
@itemx st_blksize
    The size in bytes of each file block.

@item blocks
@itemx st_blocks
    Number of blocks allocated.

@item atime
@itemx st_atime
    Time of last access.

@item mtime
@itemx st_mtime
    Time of last modification

@item ctime
@itemx st_ctime
    Time of last status change

@item sparse
    A boolean value indicating whether the file is @samp{sparse}.
@end table

    Modification times are displayed in @acronym{UTC} as
@acronym{UNIX} timestamps, unless suffixed with @samp{H} (for
``human-readable''), as in @samp{ctimeH}, in which case usual
@code{tar tv} output format is used.

    The default output format is: @samp{name,dev,ino,mode,
nlink,uid,gid,size,blksize,blocks,atime,mtime,ctime}.

    For example, the following command will display file names and
corresponding times of last access for each file in the current working
directory:

@smallexample
genfile --stat=name,atime *
@end smallexample

   By default, @command{genfile} follows symbolic links and returns
information about files pointed to by them.  To get information about
the symlink files themselves, use the @option{--no-dereference}
(@option{-h}) option.

@node Set File Time
@appendixsec Set File Time
@cindex Set File Time Mode, @command{genfile}
    This mode is requested by the @option{--set-time} (@option{-t})
command line option.  In this mode @command{genfile} operates
similarly to the @command{touch} command: for each file listed in the
command line, it sets its access and modification times to the current
timestamp or to the value given with the @option{--date} option.  The
@option{--date} option takes a date specification in
an almost arbitrary format as its argument (@pxref{Date input
formats}), e.g.:

@example
genfile --set-time --date='2 days ago' a b c
@end example

    By default, @command{genfile} follows symbolic links and sets
times of the file they point to.  This can be changed by supplying the
@option{--no-dereference} (@option{-h}) option: if it is given,
@command{genfile} will change access and modification times of the
symbolic link itself.  Notice, that not all operating systems allow
this.

@node Exec Mode
@appendixsec Exec Mode

@cindex Exec Mode, @command{genfile}
    This mode is designed for testing the behavior of @code{paxutils}
commands when some of the files change during archiving. It supposes
that the command being executed supports @option{--checkpoint} and
@option{--checkpoint-action} options (@pxref{checkpoints,
Checkpoints,,tar,GNU tar}).

    The @samp{Exec Mode} is enabled by @option{--run} command line
option (or its alias @option{-r}). The non-optional arguments
supply the command line to be executed. @command{Genfile} modifies
this command line by inserting the following options between the
command name and first argument:

@example
--checkpoint=@var{n}
--checkpoint-action "echo=genfile checkpoint %u"
--checkpoint-action "wait=SIGUSR1"
@end example

    Here, @var{n} stands for the checkpoint granularity (for GNU
@command{tar}, it is the number of archive records read or written
between each pair of checkpoints).  The default value is 1.  This
value can be changed using the optional argument to the @option{--run}
option.  For example, to run actions on each 10th checkpoint:

@example
genfile --run=10 ...
@end example

    If the command line contains options, it must be preceded by a
double-dash (@samp{--}), which will prevent these options from being
interpreted by @command{genfile} itself.  For example:

@example
genfile --run --checkpoint=2 --truncate foo -- tar -c -f a.tar .
@end example

    Notice also, that when running @command{tar}, its command line may
not contain traditional options (cluster of letters without dash).    

    A set of options is provided for defining checkpoint values and
actions to be executed upon reaching them. Checkpoint values are
introduced with the @option{--checkpoint} command line
option. Argument to this option is the number of checkpoint in decimal.

    Any number of @dfn{actions} may be specified after a
checkpoint. Available actions are

@table @option
@item --cut @var{file}
@itemx --truncate @var{file}
    Truncate @var{file} to the size specified by previous
@option{--length} option (or 0, if it is not given).

@item --append @var{file}
    Append data to @var{file}. The size of data and its pattern are
given by previous @option{--length} and @option{pattern} options.

@item --touch @var{file}
    Update the access and modification times of @var{file}. These
timestamps are changed to the current time, unless @option{--date}
option was given, in which case they are changed to the specified
time. Argument to @option{--date} option is a date specification in
an almost arbitrary format (@pxref{Date input formats}).

@item -h
@itemx --no-dereference
    Modifies the action of the @option{--touch} option.  If both
options are given and @var{file} argument to the @option{--touch}
names a symbolic link, @command{genfile} will modify access and
modification times of the symbolic link file itself, instead the
file the symlink points to.

@item --exec @var{command}
    Execute given shell command.

@item --delete @var{file}
@itemx --unlink @var{file}
    Delete the named file or directory.  If deleting the directory, it
must be empty.    
@end table

    Option @option{--verbose} instructs @command{genfile} to print on
standard output notifications about checkpoints being executed and to
verbosely describe exit status of the command.

    While the command is being executed its standard output remains
connected to descriptor 1. All messages it prints to file descriptor
2, except checkpoint notifications, are forwarded to standard
error.

    In exec mode, @command{genfile} exits with the exit status of the
executed command.