User Guide
==========

.. currentmodule:: urllib3

Installing
----------

urllib3 can be installed with `pip <https://pip.pypa.io>`_:

.. code-block:: bash

    $ python -m pip install urllib3

Making Requests
---------------

First things first, import the urllib3 module:

.. code-block:: pycon

    >>> import urllib3

You'll need a :class:`~poolmanager.PoolManager` instance to make requests.
This object handles all of the details of connection pooling and thread
safety so that you don't have to:

.. code-block:: pycon

    >>> http = urllib3.PoolManager()

To make a request use :meth:`~poolmanager.PoolManager.request`:

.. code-block:: pycon

    >>> r = http.request('GET', 'http://httpbin.org/robots.txt')
    >>> r.data
    b'User-agent: *\nDisallow: /deny\n'

``request()`` returns a :class:`~response.HTTPResponse` object, and the
:ref:`response_content` section explains how to handle various responses.

You can use :meth:`~poolmanager.PoolManager.request` to make requests using
any HTTP verb:

.. code-block:: pycon

    >>> r = http.request(
    ...     'POST',
    ...     'http://httpbin.org/post',
    ...     fields={'hello': 'world'}
    ... )

The :ref:`request_data` section covers sending other kinds of request data,
including JSON, files, and binary data.

.. _response_content:

Response Content
----------------

The :class:`~response.HTTPResponse` object provides
:attr:`~response.HTTPResponse.status`, :attr:`~response.HTTPResponse.data`,
and :attr:`~response.HTTPResponse.headers` attributes:

.. code-block:: pycon

    >>> r = http.request('GET', 'http://httpbin.org/ip')
    >>> r.status
    200
    >>> r.data
    b'{\n  "origin": "104.232.115.37"\n}\n'
    >>> r.headers
    HTTPHeaderDict({'Content-Length': '33', ...})

JSON Content
~~~~~~~~~~~~

JSON content can be loaded by decoding and deserializing the
:attr:`~response.HTTPResponse.data` attribute of the response:
.. code-block:: pycon

    >>> import json
    >>> r = http.request('GET', 'http://httpbin.org/ip')
    >>> json.loads(r.data.decode('utf-8'))
    {'origin': '127.0.0.1'}

Binary Content
~~~~~~~~~~~~~~

The :attr:`~response.HTTPResponse.data` attribute of the response is always
set to a byte string representing the response content:

.. code-block:: pycon

    >>> r = http.request('GET', 'http://httpbin.org/bytes/8')
    >>> r.data
    b'\xaa\xa5H?\x95\xe9\x9b\x11'

.. note:: For larger responses, it's sometimes better to stream the
    response.

Using io Wrappers with Response Content
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Sometimes you want to use :class:`io.TextIOWrapper` or similar objects like
a CSV reader directly with :class:`~response.HTTPResponse` data. Making
these two interfaces play nice together requires setting the
:attr:`~response.HTTPResponse.auto_close` attribute to ``False``. By
default HTTP responses are closed after reading all bytes; this disables
that behavior:

.. code-block:: pycon

    >>> import io
    >>> r = http.request('GET', 'https://example.com', preload_content=False)
    >>> r.auto_close = False
    >>> for line in io.TextIOWrapper(r):
    ...     print(line)

.. _request_data:

Request Data
------------

Headers
~~~~~~~

You can specify headers as a dictionary in the ``headers`` argument in
:meth:`~poolmanager.PoolManager.request`:

.. code-block:: pycon

    >>> r = http.request(
    ...     'GET',
    ...     'http://httpbin.org/headers',
    ...     headers={
    ...         'X-Something': 'value'
    ...     }
    ... )
    >>> json.loads(r.data.decode('utf-8'))['headers']
    {'X-Something': 'value', ...}

Query Parameters
~~~~~~~~~~~~~~~~

For ``GET``, ``HEAD``, and ``DELETE`` requests, you can simply pass the
arguments as a dictionary in the ``fields`` argument to
:meth:`~poolmanager.PoolManager.request`:

.. code-block:: pycon

    >>> r = http.request(
    ...     'GET',
    ...     'http://httpbin.org/get',
    ...     fields={'arg': 'value'}
    ... )
    >>> json.loads(r.data.decode('utf-8'))['args']
    {'arg': 'value'}

For ``POST`` and ``PUT`` requests, you need to manually encode query
parameters in the URL:

.. code-block:: pycon

    >>> from urllib.parse import urlencode
    >>> encoded_args = urlencode({'arg': 'value'})
    >>> url = 'http://httpbin.org/post?' + encoded_args
    >>> r = http.request('POST', url)
    >>> json.loads(r.data.decode('utf-8'))['args']
    {'arg': 'value'}

.. _form_data:

Form Data
~~~~~~~~~

For ``PUT`` and ``POST`` requests, urllib3 will automatically form-encode
the dictionary in the ``fields`` argument provided to
:meth:`~poolmanager.PoolManager.request`:

.. code-block:: pycon

    >>> r = http.request(
    ...     'POST',
    ...     'http://httpbin.org/post',
    ...     fields={'field': 'value'}
    ... )
    >>> json.loads(r.data.decode('utf-8'))['form']
    {'field': 'value'}

JSON
~~~~

You can send a JSON request by specifying the encoded data as the ``body``
argument and setting the ``Content-Type`` header when calling
:meth:`~poolmanager.PoolManager.request`:

.. code-block:: pycon

    >>> import json
    >>> data = {'attribute': 'value'}
    >>> encoded_data = json.dumps(data).encode('utf-8')
    >>> r = http.request(
    ...     'POST',
    ...     'http://httpbin.org/post',
    ...     body=encoded_data,
    ...     headers={'Content-Type': 'application/json'}
    ... )
    >>> json.loads(r.data.decode('utf-8'))['json']
    {'attribute': 'value'}

Files & Binary Data
~~~~~~~~~~~~~~~~~~~

For uploading files using ``multipart/form-data`` encoding you can use the
same approach as :ref:`form_data` and specify the file field as a tuple of
``(file_name, file_data)``:

.. code-block:: pycon

    >>> with open('example.txt') as fp:
    ...     file_data = fp.read()
    >>> r = http.request(
    ...     'POST',
    ...     'http://httpbin.org/post',
    ...     fields={
    ...         'filefield': ('example.txt', file_data),
    ...     }
    ... )
    >>> json.loads(r.data.decode('utf-8'))['files']
    {'filefield': '...'}

While specifying the filename is not strictly required, it's recommended in
order to match browser behavior.
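If you're curious what a ``multipart/form-data`` request body actually looks
like on the wire, the sketch below builds a minimal one by hand using only
the standard library. The boundary string, field name, and filename are
arbitrary values chosen for illustration; urllib3 generates all of this for
you when you pass ``fields``:

```python
# Build a minimal multipart/form-data body by hand to illustrate the
# wire format urllib3 generates from the `fields` argument.
boundary = "boundary123"  # arbitrary; real clients use a random string

file_name = "example.txt"
file_data = "hello world"

body = (
    f"--{boundary}\r\n"
    f'Content-Disposition: form-data; name="filefield"; filename="{file_name}"\r\n'
    f"Content-Type: text/plain\r\n"
    f"\r\n"
    f"{file_data}\r\n"
    f"--{boundary}--\r\n"
).encode("utf-8")

# The matching Content-Type header tells the server how to split the body.
content_type = f"multipart/form-data; boundary={boundary}"
print(content_type)
```

Each field becomes one ``--boundary`` section, and the final
``--boundary--`` marker terminates the body.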
You can also pass a third item in the tuple to specify the file's MIME type
explicitly:

.. code-block:: pycon

    >>> r = http.request(
    ...     'POST',
    ...     'http://httpbin.org/post',
    ...     fields={
    ...         'filefield': ('example.txt', file_data, 'text/plain'),
    ...     }
    ... )

For sending raw binary data simply specify the ``body`` argument. It's also
recommended to set the ``Content-Type`` header:

.. code-block:: pycon

    >>> with open('example.jpg', 'rb') as fp:
    ...     binary_data = fp.read()
    >>> r = http.request(
    ...     'POST',
    ...     'http://httpbin.org/post',
    ...     body=binary_data,
    ...     headers={'Content-Type': 'image/jpeg'}
    ... )
    >>> json.loads(r.data.decode('utf-8'))['data']
    b'...'

.. _ssl:

Certificate Verification
------------------------

.. note::
    *New in version 1.25:* HTTPS connections are now verified by default
    (``cert_reqs = 'CERT_REQUIRED'``).

While you can disable certificate verification by setting ``cert_reqs =
'CERT_NONE'``, it is highly recommended to leave it on.

Unless otherwise specified urllib3 will try to load the default system
certificate stores. The most reliable cross-platform method is to use the
`certifi <https://certifiio.readthedocs.io>`_ package which provides
Mozilla's root certificate bundle:

.. code-block:: bash

    $ python -m pip install certifi

You can also install certifi along with urllib3 by using the ``secure``
extra:

.. code-block:: bash

    $ python -m pip install urllib3[secure]

Once you have certificates, you can create a
:class:`~poolmanager.PoolManager` that verifies certificates when making
requests:

.. code-block:: pycon

    >>> import certifi
    >>> import urllib3
    >>> http = urllib3.PoolManager(
    ...     cert_reqs='CERT_REQUIRED',
    ...     ca_certs=certifi.where()
    ... )

The :class:`~poolmanager.PoolManager` will automatically handle certificate
verification and will raise :class:`~exceptions.SSLError` if verification
fails:

.. code-block:: pycon

    >>> http.request('GET', 'https://google.com')
    (No exception)
    >>> http.request('GET', 'https://expired.badssl.com')
    urllib3.exceptions.SSLError ...
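If you want to know where Python itself looks for CA certificates when no
explicit bundle is supplied, the standard library ``ssl`` module can tell
you. A quick sketch (the exact paths vary by platform and OpenSSL build,
and some fields may be ``None``):

```python
import ssl

# Ask the ssl module where OpenSSL expects CA certificates by default.
paths = ssl.get_default_verify_paths()
print(paths.cafile)          # system bundle path, or None
print(paths.openssl_cafile)  # path compiled into OpenSSL
```

These are the locations consulted when certificate verification runs
without an explicit ``ca_certs`` argument.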
.. note:: You can use OS-provided certificates if desired. Just specify the
    full path to the certificate bundle as the ``ca_certs`` argument instead
    of ``certifi.where()``. For example, most Linux systems store the
    certificates at ``/etc/ssl/certs/ca-certificates.crt``. Other operating
    systems can be more difficult.

Using Timeouts
--------------

Timeouts allow you to control how long (in seconds) requests are allowed to
run before being aborted. In simple cases, you can specify a timeout as a
``float`` to :meth:`~poolmanager.PoolManager.request`:

.. code-block:: pycon

    >>> http.request(
    ...     'GET', 'http://httpbin.org/delay/3', timeout=4.0
    ... )
    >>> http.request(
    ...     'GET', 'http://httpbin.org/delay/3', timeout=2.5
    ... )
    MaxRetryError caused by ReadTimeoutError

For more granular control you can use a :class:`~util.timeout.Timeout`
instance which lets you specify separate connect and read timeouts:

.. code-block:: pycon

    >>> http.request(
    ...     'GET',
    ...     'http://httpbin.org/delay/3',
    ...     timeout=urllib3.Timeout(connect=1.0)
    ... )
    >>> http.request(
    ...     'GET',
    ...     'http://httpbin.org/delay/3',
    ...     timeout=urllib3.Timeout(connect=1.0, read=2.0)
    ... )
    MaxRetryError caused by ReadTimeoutError

If you want all requests to be subject to the same timeout, you can specify
the timeout at the :class:`~urllib3.poolmanager.PoolManager` level:

.. code-block:: pycon

    >>> http = urllib3.PoolManager(timeout=3.0)
    >>> http = urllib3.PoolManager(
    ...     timeout=urllib3.Timeout(connect=1.0, read=2.0)
    ... )

You can still override this pool-level timeout by specifying ``timeout`` to
:meth:`~poolmanager.PoolManager.request`.

Retrying Requests
-----------------

urllib3 can automatically retry idempotent requests. This same mechanism
also handles redirects. You can control the retries using the ``retries``
parameter to :meth:`~poolmanager.PoolManager.request`. By default, urllib3
will retry requests 3 times and follow up to 3 redirects.

To change the number of retries just specify an integer:
.. code-block:: pycon

    >>> http.request('GET', 'http://httpbin.org/ip', retries=10)

To disable all retry and redirect logic specify ``retries=False``:

.. code-block:: pycon

    >>> http.request(
    ...     'GET', 'http://nxdomain.example.com', retries=False
    ... )
    NewConnectionError
    >>> r = http.request(
    ...     'GET', 'http://httpbin.org/redirect/1', retries=False
    ... )
    >>> r.status
    302

To disable redirects but keep the retrying logic, specify
``redirect=False``:

.. code-block:: pycon

    >>> r = http.request(
    ...     'GET', 'http://httpbin.org/redirect/1', redirect=False
    ... )
    >>> r.status
    302

For more granular control you can use a :class:`~util.retry.Retry`
instance. This class allows far greater control of how requests are
retried. For example, to do a total of 3 retries, but limit to only 2
redirects:

.. code-block:: pycon

    >>> http.request(
    ...     'GET',
    ...     'http://httpbin.org/redirect/3',
    ...     retries=urllib3.Retry(3, redirect=2)
    ... )
    MaxRetryError

You can also disable exceptions for too many redirects and just return the
``302`` response:

.. code-block:: pycon

    >>> r = http.request(
    ...     'GET',
    ...     'http://httpbin.org/redirect/3',
    ...     retries=urllib3.Retry(
    ...         redirect=2, raise_on_redirect=False)
    ... )
    >>> r.status
    302

If you want all requests to be subject to the same retry policy, you can
specify the retry at the :class:`~urllib3.poolmanager.PoolManager` level:

.. code-block:: pycon

    >>> http = urllib3.PoolManager(retries=False)
    >>> http = urllib3.PoolManager(
    ...     retries=urllib3.Retry(5, redirect=2)
    ... )

You can still override this pool-level retry policy by specifying
``retries`` to :meth:`~poolmanager.PoolManager.request`.

Errors & Exceptions
-------------------

urllib3 wraps lower-level exceptions, for example:

.. code-block:: pycon

    >>> try:
    ...     http.request('GET', 'nx.example.com', retries=False)
    ... except urllib3.exceptions.NewConnectionError:
    ...     print('Connection failed.')

See :mod:`~urllib3.exceptions` for the full list of all exceptions.
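urllib3's exceptions share a common base class,
``urllib3.exceptions.HTTPError``, so a single ``except`` clause can act as
a catch-all for connection, timeout, and retry failures. A quick sketch
verifying the hierarchy (assumes urllib3 is installed; no network access is
needed):

```python
import urllib3

# Every urllib3-specific exception inherits from HTTPError, so catching
# HTTPError covers connection, timeout, SSL, and retry errors alike.
exc = urllib3.exceptions
for e in (exc.MaxRetryError, exc.NewConnectionError, exc.SSLError):
    print(e.__name__, issubclass(e, exc.HTTPError))
```

Catching the base class is convenient, but catching specific subclasses
(as in the example above) makes error handling more precise.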
Logging
-------

If you are using the standard library :mod:`logging` module urllib3 will
emit several logs. In some cases this can be undesirable. You can use the
standard logger interface to change the log level for urllib3's logger:

.. code-block:: pycon

    >>> import logging
    >>> logging.getLogger("urllib3").setLevel(logging.WARNING)
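Because of the standard library's logger hierarchy, setting the level on
the ``"urllib3"`` logger also applies to its child loggers (such as
``urllib3.connectionpool``). A small sketch demonstrating this, using only
the standard library:

```python
import logging

# urllib3 logs at DEBUG/INFO level; raise its threshold to WARNING.
logging.getLogger("urllib3").setLevel(logging.WARNING)

# Child loggers inherit the effective level from "urllib3",
# so this prints True.
child = logging.getLogger("urllib3.connectionpool")
print(child.getEffectiveLevel() == logging.WARNING)
```

This means one ``setLevel`` call is enough to quiet all of urllib3's
internal loggers at once.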