author Michal Domonkos <mdomonko@redhat.com> 2019-08-23 22:14:00 +0200
committer Neal Gompa (ニール・ゴンパ) <ngompa13@gmail.com> 2019-08-24 08:02:15 -0400
commit a804161cb9c1bbef95359ddf01b3bc072f691130
tree 585191a2aef58e2efd2d49e87d983b3334beb898
parent 2bdf588a20ee9d9175fb27a819d479284b7e5079
download urlgrabber-a804161cb9c1bbef95359ddf01b3bc072f691130.tar.gz
Support HTTP CONNECT with reget. BZ 1585596
Currently, we reset the file upon seeing "200" in the response header. This is, however, easily fooled by the HTTP CONNECT method used to reach an SSL server through a proxy (a common setting would be a company intranet behind a proxy server where an internal system is consuming Red Hat CDN repos with yum).

The reason is that, in this protocol, two consecutive headers are sent, the first of which is: "HTTP/1.1 200 Connection established". Therefore, we need to explicitly check for "200 OK".

More details: https://tools.ietf.org/html/rfc7231#section-4.3.6

Kudos to Masahiro Matsuya for suggesting this patch!

Note: As an alternative solution, it seems that setting the CURLOPT_SUPPRESS_CONNECT_HEADERS option on the curl handle would also do the trick (but that would require more scrutiny to ensure that nothing else breaks): https://curl.haxx.se/libcurl/c/CURLOPT_SUPPRESS_CONNECT_HEADERS.html
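
To illustrate the problem described above, here is a minimal sketch that compares a loose substring check with a stricter status-line check. The sample header lines and the helper names `loose_match` and `strict_match` are hypothetical and are not part of urlgrabber; in particular, `strict_match` is only one possible way to tell the two replies apart and is not the exact expression used in the patch.

```python
# Hypothetical status lines as a client might see them when tunnelling
# HTTPS through a proxy with CONNECT, followed by the origin server's reply.
PROXY_REPLY = b"HTTP/1.1 200 Connection established\r\n"
ORIGIN_FULL = b"HTTP/1.1 200 OK\r\n"
ORIGIN_PARTIAL = b"HTTP/1.1 206 Partial Content\r\n"


def loose_match(buf):
    """Loose check: true for any line that contains ' 200 '."""
    return b" 200 " in buf


def strict_match(buf):
    """Illustrative stricter check: status code 200 with reason phrase 'OK'."""
    # Split into at most three fields: version, status code, reason phrase.
    parts = buf.strip().split(b" ", 2)
    return len(parts) == 3 and parts[1] == b"200" and parts[2] == b"OK"


if __name__ == "__main__":
    for line in (PROXY_REPLY, ORIGIN_FULL, ORIGIN_PARTIAL):
        print(line.strip(), loose_match(line), strict_match(line))
```

With these sample lines, the loose check also fires on the proxy's "Connection established" reply, which is the case that would wrongly reset a partially downloaded file. The stricter check fires only on the origin server's full-body reply.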
-rw-r--r-- urlgrabber/grabber.py 2
1 file changed, 1 insertion, 1 deletion
diff --git a/urlgrabber/grabber.py b/urlgrabber/grabber.py
index a450671..9576fdb 100644
--- a/urlgrabber/grabber.py
+++ b/urlgrabber/grabber.py
@@ -1406,7 +1406,7 @@ class PyCurlFileObject(object):
                 if buf.lower().find(b'content-length:') != -1:
                     length = buf.split(b':')[1]
                     self.size = int(length)
-                elif (self.append or self.opts.range) and not self._hdr_dump and b' 200 ' in buf:
+                elif (self.append or self.opts.range) and not self._hdr_dump and b' 200 OK ' in buf:
                     # reget was attempted but server sends it all
                     # undo what we did in _build_range()
                     self.append = False