author | Michal Domonkos <mdomonko@redhat.com> | 2019-08-23 22:14:00 +0200 |
---|---|---|
committer | Neal Gompa (ニール・ゴンパ) <ngompa13@gmail.com> | 2019-08-24 08:02:15 -0400 |
commit | a804161cb9c1bbef95359ddf01b3bc072f691130 (patch) | |
tree | 585191a2aef58e2efd2d49e87d983b3334beb898 | |
parent | 2bdf588a20ee9d9175fb27a819d479284b7e5079 (diff) | |
download | urlgrabber-a804161cb9c1bbef95359ddf01b3bc072f691130.tar.gz |
Support HTTP CONNECT with reget. BZ 1585596
Currently, we reset the file upon seeing "200" in the response
header. This is, however, easily fooled by the HTTP CONNECT method used
to access an SSL server through a proxy (a common setting would be
a company intranet behind a proxy server where an internal system is
consuming Red Hat CDN repos with yum). The reason is that, in this
protocol, two consecutive response headers are sent, the first of which is:
"HTTP/1.1 200 Connection established".
Therefore, we need to explicitly check for "200 OK".
More details:
https://tools.ietf.org/html/rfc7231#section-4.3.6
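The difference between the two checks can be illustrated with a minimal sketch. The helper names loose_check and strict_check and the sample status lines are hypothetical and not part of urlgrabber; the sketch also assumes the strict test matches on b' 200 OK' so that it fits a status line terminated by CRLF, which may differ in detail from the exact byte string used in the patch.

```python
def loose_check(first_header: bytes) -> bool:
    # Pre-patch style test: any first header line containing " 200 ".
    return b' 200 ' in first_header


def strict_check(first_header: bytes) -> bool:
    # Post-patch style test: additionally require the "OK" reason phrase.
    return b' 200 OK' in first_header


# Hypothetical first header lines as a header callback might receive them.
connect_reply = b'HTTP/1.1 200 Connection established\r\n'  # proxy's reply to CONNECT
full_reply = b'HTTP/1.1 200 OK\r\n'                         # server ignored the range request
partial_reply = b'HTTP/1.1 206 Partial Content\r\n'         # server honoured the range request

for line in (connect_reply, full_reply, partial_reply):
    print(line.strip().decode(), loose_check(line), strict_check(line))
```

With the loose test, the proxy's "Connection established" line looks like a full-body response and would trigger a reset of the partially downloaded file; the strict test only fires on the origin server's "200 OK".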
Kudos to Masahiro Matsuya for suggesting this patch!
Note: As an alternative solution, it seems that setting the
CURLOPT_SUPPRESS_CONNECT_HEADERS option on the curl handle would also do
the trick (but that would require more scrutiny to ensure that nothing
else breaks):
https://curl.haxx.se/libcurl/c/CURLOPT_SUPPRESS_CONNECT_HEADERS.html
-rw-r--r-- | urlgrabber/grabber.py | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/urlgrabber/grabber.py b/urlgrabber/grabber.py
index a450671..9576fdb 100644
--- a/urlgrabber/grabber.py
+++ b/urlgrabber/grabber.py
@@ -1406,7 +1406,7 @@ class PyCurlFileObject(object):
                 if buf.lower().find(b'content-length:') != -1:
                     length = buf.split(b':')[1]
                     self.size = int(length)
-                elif (self.append or self.opts.range) and not self._hdr_dump and b' 200 ' in buf:
+                elif (self.append or self.opts.range) and not self._hdr_dump and b' 200 OK ' in buf:
                     # reget was attempted but server sends it all
                     # undo what we did in _build_range()
                     self.append = False