# Connection lifecycle

## Current implementation

`HTTPConnection` should be instantiated with the `host` and `port` of the
**first origin being connected to** on the way to the target origin. This is either
the target origin itself or, if a proxy is in use, the proxy origin.

```python
import urllib3.connection

# Initialize the HTTPSConnection ('https://...')
conn = urllib3.connection.HTTPSConnection(
    host="example.com",
    # Here you can configure other options like
    # 'ssl_minimum_version', 'ca_certs', etc.
)

# Set the connect timeout either in the
# constructor above or via the property.
conn.timeout = 3.0  # (connect timeout)
```

If using CONNECT tunneling with the proxy, call `HTTPConnection.set_tunnel()`
with the tunneled host, port, and headers. This should be called before calling
`HTTPConnection.connect()` or sending a request.

```python
conn = urllib3.connection.HTTPConnection(
    # Remember that the *first* origin we want to connect to should
    # be configured as 'host' and 'port', *not* the target origin.
    host="myproxy.net",
    port=8080,
    proxy="http://myproxy.net:8080"
)

conn.set_tunnel("example.com", scheme="http", headers={"Proxy-Header": "value"})
```

Connect to the first origin by calling the `HTTPConnection.connect()` method.
If an error occurs here, `HTTPConnection.has_connected_to_proxy` tells you where
it happened: when a proxy is in use and the value is false, the error occurred
while connecting to the proxy. If the value is true, the error didn't occur while
connecting to the proxy.

```python
# Explicitly connect to the origin. This isn't
# required as sending the first request will
# automatically connect if not done explicitly.
conn.connect()
```
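
The proxy check described above can be sketched as a small helper. This is only an illustration: `describe_connect_failure` and its `proxy_configured` argument are hypothetical names, not part of urllib3, and the connections are stand-in objects so the sketch runs without a network.

```python
from types import SimpleNamespace


def describe_connect_failure(conn, proxy_configured):
    """Say which hop a failed connect() most likely concerned.

    `conn` only needs a boolean `has_connected_to_proxy` attribute.
    """
    if not proxy_configured:
        # Without a proxy the only hop is the origin itself.
        return "origin"
    if conn.has_connected_to_proxy:
        # The proxy hop succeeded, so the failure came later.
        return "origin (via proxy)"
    return "proxy"


# Stand-ins for real connections (assumption: only the flag matters here).
failed_at_proxy = SimpleNamespace(has_connected_to_proxy=False)
failed_after_proxy = SimpleNamespace(has_connected_to_proxy=True)

print(describe_connect_failure(failed_at_proxy, proxy_configured=True))
print(describe_connect_failure(failed_after_proxy, proxy_configured=True))
```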

After connecting to the origin, check whether `is_verified` is true. If it isn't, the `HTTPConnectionPool` emits a warning. The warning only matters when verification is disabled, because otherwise an unverified TLS handshake raises an error.

```python
if not conn.is_verified:
    # There isn't a verified TLS connection to target origin.
    pass
if not conn.proxy_is_verified:
    # There isn't a verified TLS connection to proxy origin.
    pass
```

If the read timeout is different from the connect timeout then the
`HTTPConnection.timeout` property can be changed at this point.

```python
conn.timeout = 5.0  # (read timeout)
```

Then the HTTP request can be sent with `HTTPConnection.request()`. If a `BrokenPipeError` is raised while sending the request body, it can be swallowed, because a response can still be received from the origin even when the request isn't completely sent.

```python
try:
    conn.request("GET", "/")
except BrokenPipeError:
    # We can still try to get a response!
    pass

resp = conn.getresponse()
```

Then response headers (and other info) are read from the connection via `HTTPConnection.getresponse()` and returned as a `urllib3.HTTPResponse`. The `HTTPResponse` instance carries a reference to the `HTTPConnection` instance so the connection can be closed if the connection gets into an undefined protocol state.

```python
assert resp.connection is conn
```

If pooling is in use, the `HTTPConnectionPool` sets `_pool` on the `HTTPResponse` instance so that the connection is returned to the pool once the response is exhausted. If retries are in use, set `retries` on the `HTTPResponse` instance as well.

```python
# Set by the HTTPConnectionPool before returning to the caller.
resp = conn.getresponse()
resp._pool = pool

# This will call resp._pool._put_conn(resp.connection)
# Connection can get auto-released by exhausting.
resp.release_conn()
```

If any error is received from connecting to the origin, sending the request, or receiving the response, the caller will call `HTTPConnection.close()` and discard the connection. Connections can be re-used after being closed; in that case a new TCP connection to proxies and origins will be established.
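
The close-on-error rule can be sketched as follows. This is a minimal illustration under stated assumptions: `send_once` is a hypothetical helper, and `FlakyConnection` is a made-up stand-in that only mimics the `request()`, `getresponse()` and `close()` methods so the example runs without a network.

```python
def send_once(conn, method, url):
    """Send one request; close the connection if anything fails."""
    try:
        try:
            conn.request(method, url)
        except BrokenPipeError:
            # A response may still arrive, so keep going.
            pass
        return conn.getresponse()
    except Exception:
        # The protocol state is unknown now, so drop the socket.
        conn.close()
        raise


class FlakyConnection:
    """Hypothetical stand-in that fails on its first request only."""

    def __init__(self):
        self.attempts = 0
        self.closed = 0

    def request(self, method, url):
        self.attempts += 1
        if self.attempts == 1:
            raise ConnectionResetError("simulated reset")

    def getresponse(self):
        return "response"

    def close(self):
        self.closed += 1


conn = FlakyConnection()
try:
    send_once(conn, "GET", "/")
except ConnectionResetError:
    # Reuse the closed object; a real connection would reconnect here.
    result = send_once(conn, "GET", "/")
```

In this sketch the caller retries on the same object after it has been closed. A caller could equally discard the object and create a new one.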

With a forwarding proxy instead of a tunneling proxy, the `HTTPConnection` is configured similarly, except that instead of calling `set_tunnel()` we send absolute URLs to `HTTPConnection.request()`:

```python
import urllib3.connection

# Initialize the HTTPConnection.
conn = urllib3.connection.HTTPConnection(
    host="myproxy.net",
    port=8080,
    proxy="http://myproxy.net:8080"
)

# You can request HTTP or HTTPS resources over the proxy
# using the absolute URL.
conn.request("GET", "http://example.com")
resp = conn.getresponse()

conn.request("GET", "https://example.com")
resp = conn.getresponse()
```
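
The difference between the two modes comes down to what is passed as the request URL. The helper below is a rough sketch of that choice using only the standard library. `request_target` is a hypothetical name, not a urllib3 function.

```python
from urllib.parse import urlsplit


def request_target(url, forwarding):
    """Return the string to pass as the request URL.

    Forwarding proxies get the absolute URL. Direct and tunneled
    connections get only the path and query.
    """
    if forwarding:
        return url
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


print(request_target("http://example.com/a?b=1", forwarding=True))
print(request_target("http://example.com/a?b=1", forwarding=False))
```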

### HTTP/HTTPS/proxies

This is how `HTTPConnection` instances will be configured and used when a `PoolManager` or `ProxyManager` receives a given config:

- No proxy, HTTP origin -> `HTTPConnection`
- No proxy, HTTPS origin -> `HTTPSConnection`
- HTTP proxy, HTTP origin -> `HTTPConnection` in forwarding mode
- HTTP proxy, HTTPS origin -> `HTTPSConnection` in tunnel mode
- HTTPS proxy, HTTP origin -> `HTTPSConnection` in forwarding mode
- HTTPS proxy, HTTPS origin -> `HTTPSConnection` in tunnel mode
- HTTPS proxy, HTTPS origin, `ProxyConfig.use_forwarding_for_https=True` -> `HTTPSConnection` in forwarding mode
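
The list above can be restated as a small decision function. This is a sketch for illustration only: `pick_connection` is a hypothetical name, it returns class names as strings rather than real classes, and `"direct"` is a label invented here for the no-proxy case.

```python
def pick_connection(proxy_scheme, origin_scheme, use_forwarding_for_https=False):
    """Map a proxy/origin combination to (connection class name, mode).

    `proxy_scheme` is None, "http" or "https"; `origin_scheme` is
    "http" or "https".
    """
    # TLS on either hop means the HTTPS connection class is used.
    uses_tls = "https" in (proxy_scheme, origin_scheme)
    cls = "HTTPSConnection" if uses_tls else "HTTPConnection"

    if proxy_scheme is None:
        mode = "direct"
    elif origin_scheme == "http":
        mode = "forwarding"
    elif proxy_scheme == "https" and use_forwarding_for_https:
        mode = "forwarding"
    else:
        mode = "tunnel"
    return cls, mode


print(pick_connection("http", "https"))
print(pick_connection("https", "https", use_forwarding_for_https=True))
```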