GNU Wget Metalink module

  Evaluation of the Metalink/XML and Metalink/HTTP implementations


1. Introduction
***************

This document and the results it contains focus on the evaluation of
the Metalink/XML and Metalink/HTTP implementations.

The "Directory Options" mentioned here are used on the command line in
conjunction with the option '--input-metalink=file' for Metalink/XML,
and '--metalink-over-http' for Metalink/HTTP.

$ wget --input-metalink=<file> [directory options]
$ wget --metalink-over-http [directory options] <url>

2. Notes
********

Tests for metalink:file names beginning with '/', '~/', './', or '../'
(e.g. "/path/file") shall be run manually due to security concerns.
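
As an illustration of which names fall under this note, the following
is a minimal sketch that only checks the four prefixes listed above.
The helper name needs_manual_test is hypothetical and is not part of
Wget or libmetalink; the real check in libmetalink may cover more
cases.

```shell
# Hypothetical helper (not part of Wget or libmetalink): exit status 0
# when a metalink:file name begins with '/', '~/', './', or '../'.
needs_manual_test() {
  case "$1" in
    # '~/' is quoted so the shell does not perform tilde expansion.
    /*|'~/'*|./*|../*) return 0 ;;
    *) return 1 ;;
  esac
}

# Classify a few sample names.
for name in path/file /path/file '~/file' ./file ../file; do
  if needs_manual_test "$name"; then
    echo "$name: run manually"
  else
    echo "$name: may be automated"
  fi
done
```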

3. Metalink files used as reference
***********************************

3.1 Test: metalink:file with "path/file" name format
====================================================

cat > test.meta4 << EOF
<?xml version="1.0" encoding="UTF-8"?>
<metalink xmlns="urn:ietf:params:xml:ns:metalink">
  <file name="path/file">
    <size>543</size>
    <hash type="sha256">d37d3965f8e1a7b16504b4273b09c392776b7e4dd17e601256c7b2fd9ce5f56e</hash>
    <hash type="md5">0f6ff5cdc15603f1b81227b5a296f001</hash>
    <url>http://wrongurl.really/gnu/wget/wget-1.18.tar.xz.sig</url>
    <url>http://ftpmirror.gnu.org/wget/wget-1.18.tar.xz.sig</url>
    <url>http://ftp.gnu.org/gnu/wget/wget-1.18.tar.xz.sig</url>
    <url>http://nl.mirror.babylon.network/gnu/wget/wget-1.18.tar.xz.sig</url>
  </file>
</metalink>
EOF

4. `wget --input-metalink=test.meta4`
*************************************

4.1 Implemented safety features
===============================

Any metalink:file name containing an absolute, relative, or home path
(see '2. Notes') parsed from Metalink/XML files is rejected.

This is a libmetalink design decision, implemented in the function
metalink_check_safe_path().  This feature shall not be modified.

All the above conform to the RFC5854 standard.

References:
 https://tools.ietf.org/html/rfc5854#section-4.1.2.1
 https://tools.ietf.org/html/rfc5854#section-4.2.8.3

4.2 File download behaviour
===========================

When a Metalink/XML file is parsed:
1. create the metalink:file "path/file" tree;
2. download the metalink:url file as "path/file";
3. verify the "path/file" size, if declared;
4. verify the "path/file" checksum.

All the above conform to the RFC5854 standard.

References:
 https://tools.ietf.org/html/rfc5854
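
Steps 3 and 4 of the list in this section can be reproduced by hand
after a download.  The following is a minimal sketch, assuming the
sha256sum utility from GNU coreutils is available; the helper name
verify_file is hypothetical and this is not how Wget implements the
checks internally.

```shell
# Hypothetical helper: verify_file FILE EXPECTED_SIZE EXPECTED_SHA256
# Pass an empty EXPECTED_SIZE when the Metalink file declares no size.
verify_file() {
  file=$1
  want_size=$2
  want_sha256=$3

  # Step 3: compare the size in bytes, only if one was declared.
  if [ -n "$want_size" ]; then
    got_size=$(wc -c < "$file" | tr -d ' ')
    if [ "$got_size" != "$want_size" ]; then
      echo "size mismatch: got $got_size, expected $want_size" >&2
      return 1
    fi
  fi

  # Step 4: compare the SHA-256 digest (lowercase hexadecimal).
  got_sha256=$(sha256sum < "$file" | cut -d' ' -f1)
  if [ "$got_sha256" != "$want_sha256" ]; then
    echo "checksum mismatch for $file" >&2
    return 1
  fi
}
```

With the values declared in test.meta4 (see '3.1'), the call would be:

  verify_file path/file 543 \
    d37d3965f8e1a7b16504b4273b09c392776b7e4dd17e601256c7b2fd9ce5f56e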

4.3 Questionable behaviours
===========================

If several metalink:file elements are identical, wget downloads them
all.

5. `wget --metalink-over-http`
******************************

5.1 Implemented safety features
===============================

The function url_file_name() is responsible for parsing the URL's file
name and applying the "Directory Options" given on the command line.

The use of libmetalink's metalink_check_safe_path() shouldn't be
necessary (see '4.1 Implemented safety features').

All the above conform to Wget's usual download behaviour.

References:
 wget(1)

5.2 File download behaviour
===========================

When a Metalink/HTTP header is parsed:
1. extract metalink metadata from the header;
2. download the file from the mirror with the highest priority;
3. verify the file's size, if declared;
4. verify the file's checksum.

All the above conform to Wget's usual download behaviour and to the
RFC6249 standard.

References:
 wget(1)
 https://tools.ietf.org/html/rfc6249
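
Steps 1 and 2 of the list in this section can be illustrated outside
Wget.  The following is a minimal sketch, assuming the response headers
are available as text, one header per line, and that mirrors are
advertised as "Link" headers with rel=duplicate and an optional "pri"
parameter where a lower value means a higher priority, as in RFC6249.
The helper name best_mirror and the example.com URLs are made up for
illustration, and treating a missing "pri" as 999999 is an assumption
of this sketch; none of this is Wget's actual parser.

```shell
# Hypothetical helper: read HTTP headers on stdin and print the URL of
# the rel=duplicate mirror with the lowest "pri" value.
best_mirror() {
  awk '
    # Consider only Link headers that advertise a mirror.
    tolower($0) ~ /^link:/ && /rel="?duplicate"?/ {
      # The URL is the text between "<" and ">".
      start = index($0, "<")
      end = index($0, ">")
      if (start == 0 || end <= start) next
      url = substr($0, start + 1, end - start - 1)

      # Assumption: a mirror without "pri" gets the lowest priority.
      pri = 999999
      if (match($0, /pri=[0-9]+/))
        pri = substr($0, RSTART + 4, RLENGTH - 4) + 0

      # Keep the first mirror seen with the lowest value.
      if (best == "" || pri < best_pri) { best = url; best_pri = pri }
    }
    END { if (best != "") print best }
  '
}

# Usage with made-up headers.
printf '%s\n' \
  'Link: <http://mirror2.example.com/file>; rel=duplicate; pri=2' \
  'Link: <http://mirror1.example.com/file>; rel=duplicate; pri=1' \
  'Link: <http://example.com/file.sig>; rel=describedby' |
  best_mirror
```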

6. Directory Options
********************

'-nd'
'--no-directories'

    Does not apply to Metalink/XML files (aka --input-metalink=<file>).

    Applies to Metalink/HTTP URLs as described in the Wget manual, see
    wget(1).  The target URL is the URL given on the command line.

'-x'
'--force-directories'

    Does not apply to Metalink/XML files (aka --input-metalink=<file>).

    Applies to Metalink/HTTP URLs as described in the Wget manual, see
    wget(1).  The target URL is the URL given on the command line.

'-nH'
'--no-host-directories'

    Does not apply to Metalink/XML files (aka --input-metalink=<file>).

    Applies to Metalink/HTTP URLs as described in the Wget manual, see
    wget(1).  The target URL is the URL given on the command line.

'--protocol-directories'

    Does not apply to Metalink/XML files (aka --input-metalink=<file>).

    Applies to Metalink/HTTP URLs as described in the Wget manual, see
    wget(1).  The target URL is the URL given on the command line.

'--cut-dirs=number'

    Does not apply to Metalink/XML files (aka --input-metalink=<file>).

    Applies to Metalink/HTTP URLs as described in the Wget manual, see
    wget(1).  The target URL is the URL given on the command line.

'-P prefix'
'--directory-prefix=prefix'

    Set the top of the retrieval tree to prefix for both Metalink/XML
    and Metalink/HTTP downloads, see wget(1).

    If combining the prefix with the file name results in an absolute,
    relative, or home path, the directory components are stripped and
    only the basename is used. See '4.1 Implemented safety features'.