Rollback archiving internals
Copyright (C) 2001 Ximian Code, Inc.
Written by Bradford Hovinen <hovinen@ximian.com>

1. Directory format

Diagram:

	 + toplevel
	 |-+ Location 1
	 | |- <id>.xml
	 | |  .
	 | |  .
	 | |  .
	 | |- metadata.log:
	 | |  [<id> <date> <time> <backend>                         ] ^
	 | |  [ .                                                   ] | Time
	 | |  [ .                                                   ] |
	 | |  [ .                                                   ] |
	 | \- metadata.xml:
	 |    [...                                                  ]
	 |    [<inherits>location</inherits>                        ]
	 |    [<contains backend-id="backend" type="full|partial"/> ]
	 |    [...                                                  ]
	 |-+ Location 2
	 |   ...

There is one toplevel directory for each archive. This directory
contains one or more location directories. Each location directory
must contain two files: an XML file describing the location and a log
of changes made in that location. Each change corresponds to an XML
file containing a snapshot of the configuration as modified. There is
one XML file per backend. Each change has an id number that is
incremented atomically by the archiving script when it stores
configuration changes. The id number, as well as the date and time of
storage, form a filename that uniquely identifies each configuration
change. The archiving script must also store in the log file a line
with the id number, date and time of storage, and backend used
whenever it stores XML data. New entries are stored at the head of the 
file, so that during rollback, the file may be forward-scanned to
find the appropriate identifier for the configuration file. The
per-location XML configuration file contains information on what the
location's parent is and what configurations that location defines.
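The log handling above can be sketched in a few lines of Python. The
function names, the four-digit id width, and the exact date and time
formats are illustrative assumptions; the spec only fixes the field
order `<id> <date> <time> <backend>` and head-of-file insertion.

```python
import os
import time

def log_change(location_dir, change_id, backend):
    """Prepend an '<id> <date> <time> <backend>' line to metadata.log."""
    entry = "%04d %s %s %s\n" % (change_id,
                                 time.strftime("%Y%m%d"),
                                 time.strftime("%H%M%S"),
                                 backend)
    log_path = os.path.join(location_dir, "metadata.log")
    old = ""
    if os.path.exists(log_path):
        with open(log_path) as f:
            old = f.read()
    # New entries go at the head so rollback can forward-scan the file.
    with open(log_path, "w") as f:
        f.write(entry + old)

def find_latest(location_dir, backend):
    """Forward-scan the log for the newest entry for a given backend."""
    with open(os.path.join(location_dir, "metadata.log")) as f:
        for line in f:
            fields = line.split()
            if len(fields) == 4 and fields[3] == backend:
                return fields[0]   # id of the snapshot file <id>.xml
    return None
```

Because the newest entry is first, the first matching line found by the
forward scan is the one to roll back to.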

For now, the backend shall be referred to by its executable name. When 
the backends gain CORBA interfaces, I suggest that the OAF id be used
instead. This reduces the problem of setting a backend's configuration 
to a simple object activation and method invocation. The OAF id may
also be used to resolve the backend's human-readable name.

2. Meta-configuration details

In order that this system be complete, there must be a way to
ascertain the current location and to roll back changes in location. I 
propose that there be a special archive in the configuration hierarchy 
that contains location history in the same format as other
locations. The archiver can then be a single script that accepts
command-line arguments describing the requested action: `archive this
data', `roll back this backend's configuration', and `switch to this
location'. It then handles all the details of interfacing with the
archive and applying the changes in the correct order. Conceptually,
the archiver becomes a backend in and of itself, where the frontend is 
located in the GUI of HCM. It would therefore be advisable to use the
same standards for the archiver as for other backends and hence make
it a CORBA service, where the tool-specific interface is as described
above.
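The three actions above might map onto command-line options along
these lines. The option spellings and the handler wiring are
assumptions made for illustration; the spec names only the actions
themselves.

```python
def dispatch(argv, handlers):
    """Map the first command-line argument to an archiver action.

    handlers is a dict with keys "archive", "rollback" and "switch",
    each mapping to a callable that takes the remaining arguments.
    """
    actions = {
        "--archive":  "archive",    # `archive this data'
        "--rollback": "rollback",   # `roll back this backend's configuration'
        "--location": "switch",     # `switch to this location'
    }
    if not argv or argv[0] not in actions:
        raise SystemExit(
            "usage: archiver --archive|--rollback|--location ...")
    return handlers[actions[argv[0]]](argv[1:])
```

Keeping the dispatch table separate from the handlers leaves room to
move the same actions behind a CORBA interface later, as suggested
above.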

3. Fine-grained location management

A slight modification of the basic location management system allows
individual settings to be covered by a location as well as entire
backends. The contains tag in a location's metadata file contains the
attribute type, which may be either "full" or "partial". In the former
case, rollback proceeds as described above. If it is the latter, the
archiver, upon rolling back or setting configuration for the relevant
backend in that location, first retrieves the required configuration
from both the location and its parent using the same algorithm. It
then uses an XML merging algorithm to combine the two XML files into
one, allowing the child location's data to override its parent's
data. This can be accomplished using the same technique as Bonobo uses
to allow components to override toolbars and menus in the container.
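A minimal sketch of such an override merge, using Python's
xml.etree.ElementTree. Matching children by tag name alone is a
simplifying assumption made here for brevity; a real merge would
identify nodes by name and attributes, as the comparison rules below
describe.

```python
import xml.etree.ElementTree as ET

def merge(parent_node, child_node):
    """Return a copy of parent_node with child_node's data overriding it."""
    merged = ET.Element(parent_node.tag, parent_node.attrib)
    merged.attrib.update(child_node.attrib)
    merged.text = child_node.text if child_node.text else parent_node.text
    child_tags = set(c.tag for c in child_node)
    for elem in parent_node:
        if elem.tag not in child_tags:   # only the parent defines this
            merged.append(elem)
    for elem in child_node:              # the child's data wins on conflict
        merged.append(elem)
    return merged
```

Settings defined only by the parent survive the merge; any setting the
child also defines is taken from the child.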

When a child location partially defines the data for a particular
backend, it must store only those configuration settings that the user
explicitly changed when updating that backend's configuration under
that location. If the frontend simply dumped its entire XML snapshot
to the log, all of the configuration settings would be reflected in
that snapshot, and under the method indicated above, partial
containment would be equivalent to full containment. Therefore, when a
frontend stores its configuration under partial containment, the
archiver must run a node-for-node comparison between the XML data of
the parent location (retrieved using the method indicated above) and
that of the child location. Only those nodes that are different are
actually stored in the configuration log.

When comparing XML nodes, there must be a way to identify distinct
nodes for comparison. For example, in a network configuration backend,
there might be one node for each interface. If, under the parent
location, the nodes are ordered with interface "eth0" before interface
"eth1", while under the child location, they are in reverse order, but
the configuration is otherwise identical, it is not the intention of
the user that the child location should override any configuration data of
the parent location. Therefore, the best method for comparing XML data
is to compare each child of a given node in one source to all the
children of the relevant node in the other source. If any child in the
other source matches, then the XML node is a duplicate and may be
thrown out. If there is another node such that the name and attributes
are the same, but the children are different, then the algorithm
should be invoked recursively to determine the differences among the
children. If there is no such node, then the node should be included.
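The comparison and differencing algorithm described above can be
sketched as follows, again using xml.etree.ElementTree. The function
names are assumptions; nodes_equal implements the order-insensitive
match, and xml_diff keeps only the nodes the child location actually
changed.

```python
import xml.etree.ElementTree as ET

def nodes_equal(a, b):
    """True if two nodes match, regardless of the order of their children."""
    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    if (a.text or "").strip() != (b.text or "").strip():
        return False
    if len(a) != len(b):
        return False
    unmatched = list(b)
    for child in a:
        for other in unmatched:
            if nodes_equal(child, other):
                unmatched.remove(other)
                break
        else:
            return False
    return True

def xml_diff(parent_node, child_node):
    """Keep only the children of child_node that differ from parent_node."""
    diff = ET.Element(child_node.tag, child_node.attrib)
    for child in child_node:
        # A match anywhere in the other source means the node is a
        # duplicate and may be thrown out.
        if any(nodes_equal(child, other) for other in parent_node):
            continue
        # Same name and attributes but different children: recurse to
        # find the differences among the children.
        twin = next((o for o in parent_node
                     if o.tag == child.tag and o.attrib == child.attrib),
                    None)
        if twin is not None and len(child):
            diff.append(xml_diff(twin, child))
        else:
            diff.append(child)
    return diff
```

In the interface example above, reordering "eth0" and "eth1" without
changing their settings produces an empty diff, so nothing is stored.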

4. Future directions

The metafile log structure may run into scalability problems for
installations that have been in place for a long time. An alternative
structure that uses binary indexing might be in order. A command-line
utility (with a GUI interface) could be written to recover the file in
the case of corruption; such a utility could simply introspect each of
the XML files in a directory. Provided that each XML file contains
enough information to create a log entry, which is trivial, recovery
is assured.
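Such a recovery pass might look like the sketch below. Since this spec
does not define the snapshot file format, the sketch assumes, purely
for illustration, that each <id>.xml root element carries `date`,
`time` and `backend` attributes mirroring its log entry.

```python
import os
import xml.etree.ElementTree as ET

def recover_log(location_dir):
    """Rebuild metadata.log from the snapshot files in a location."""
    entries = []
    for name in os.listdir(location_dir):
        # Skip metadata.log and metadata.xml; keep only <id>.xml files.
        if not name.endswith(".xml") or name.startswith("metadata"):
            continue
        root = ET.parse(os.path.join(location_dir, name)).getroot()
        entries.append((name[:-4], root.get("date"),
                        root.get("time"), root.get("backend")))
    entries.sort(reverse=True)   # newest id first, i.e. head of file
    with open(os.path.join(location_dir, "metadata.log"), "w") as f:
        for entry in entries:
            f.write("%s %s %s %s\n" % entry)
```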