# -*- Mode: text -*-

TO DO SMALLER TASKS
- Make Task more like Future; getting result() should re-raise exception.
- Add a decorator just for documenting a coroutine. It should set a
flag on the function. It should not interfere with methods,
staticmethod, classmethod and the like. The Task constructor should
check the flag. The decorator should notice if the wrapped function
is not a generator.
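A minimal sketch of such a documenting decorator (all names here are hypothetical, not an existing API): it only tags the function and rejects non-generators, and since it returns the function unchanged it can sit below staticmethod/classmethod without interfering.

```python
import inspect

def coroutine(func):
    """Mark func as a coroutine (sketch of the decorator described above).
    Sets a flag for the Task constructor to check, and complains if func
    is not a generator function. Apply it below @staticmethod/@classmethod
    so it sees the raw function."""
    if not inspect.isgeneratorfunction(func):
        raise TypeError("coroutine() requires a generator function")
    func._is_coroutine = True  # flag the Task constructor would check
    return func

@coroutine
def sleep_then(value):
    yield  # pretend to block in the scheduler
    return value
```

Whether a non-generator should raise or merely warn is an open design choice; raising is the stricter option sketched here.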

TO DO LARGER TASKS
- Need more examples.
- Benchmarkable but more realistic HTTP server?
- Example of using UDP.
- Write up a tutorial for the scheduling API.
- More systematic approach to logging. Logger objects? What about
heavy-duty logging, tracking essentially all task state changes?
- Restructure directory, move demos and benchmarks to subdirectories.

FROM BEN DARNELL (Tornado)
- The waker pipe in ThreadRunner should really be in EventLoop itself
- we need to be able to call call_soon (or some equivalent) from
threads that were not created by ThreadRunner. In Tornado I ended
up needing two functions, add_callback (usable from any thread) and
add_callback_from_signal (usable only from signal handlers).
- Timeouts should ideally be based on time.monotonic, although this
requires some extra complexity to deal with the cases where you
actually do want time.time. (in tornado, the clock used is
configurable on a per-ioloop basis, which is not ideal but is
workable)
- I'm sure you've heard this from the twisted guys by now, but to
properly support completion-based event loops like IOCP you need to
be able to swap out most of sockets.py (the layers below
BufferedReader) for an alternative implementation.
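The waker-pipe idea can be sketched like this (a toy, not Tornado's or this project's actual code; the class and method names are made up): a thread-safe call_soon appends a callback under a lock and writes one byte to a pipe so the loop's blocked select() wakes up.

```python
import collections
import os
import select
import threading

class WakeableLoop:
    """Toy event loop using the self-pipe trick so that any thread,
    not just ones the loop created, can schedule a callback."""

    def __init__(self):
        self._ready = collections.deque()
        self._lock = threading.Lock()
        self._waker_r, self._waker_w = os.pipe()

    def call_soon_threadsafe(self, callback, *args):
        with self._lock:
            self._ready.append((callback, args))
        os.write(self._waker_w, b"\0")  # wake the select() in run_once()

    def run_once(self):
        # Block until some thread wakes us, then drain the pipe and
        # run whatever callbacks have accumulated.
        select.select([self._waker_r], [], [])
        os.read(self._waker_r, 512)
        with self._lock:
            ready, self._ready = self._ready, collections.deque()
        for callback, args in ready:
            callback(*args)
```

A signal-handler variant (Tornado's add_callback_from_signal) would have to avoid the lock, since a signal handler can interrupt the loop thread while it holds it.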

TO DO LATER
- Wrap select(), epoll() etc. in try/except checking for EINTR.
- Move accept loop into Listener class? (Windows is said to work
better if you make many AcceptEx() calls in parallel.) OTOH we can
already accept many incoming connections without suspending.
- When multiple tasks are accessing the same socket, they should
either get interleaved I/O or an immediate exception; it should not
compromise the integrity of the scheduler or the app or leave a task
hanging.
- For epoll you probably want to check/(log?) EPOLLHUP and EPOLLERR errors.
- Add the simplest API possible to run a generator with a timeout.
- Do we need call_every()? (Easily emulated with a loop and sleep().)
- Ensure multiple tasks can do atomic writes to the same pipe (since
  UNIX guarantees that writes of up to PIPE_BUF bytes to a pipe are atomic).
- Ensure some easy way of distributing accepted connections across tasks.
- Be wary of thread-local storage. There should be a standard API to
get the current Context (which holds current task, event loop, and
maybe more) and a standard meta-API to change how that standard API
works (i.e. without monkey-patching).
- See how much of asyncore I've already replaced.
- Do we need _async suffixes to all async APIs?
- Do we need synchronous parallel APIs for all async APIs?
- Could BufferedReader reuse the standard io module's readers???
- Support ZeroMQ "sockets" which are user objects. Though possibly
this can be supported by getting the underlying fd? See
http://mail.python.org/pipermail/python-ideas/2012-October/017532.html
OTOH see
https://github.com/zeromq/pyzmq/blob/master/zmq/eventloop/ioloop.py
- Study goroutines (again).
- Benchmarks: http://nichol.as/benchmark-of-python-web-servers

FROM OLDER LIST
- Is it better to have separate add_{reader,writer} methods, vs. one
add_thingie method taking a fd and a r/w flag?
- Multiple readers/writers per socket? (At which level? pollster,
eventloop, or scheduler?)
- Could poll() usefully be an iterator?
- Do we need to support more epoll and/or kqueue modes/flags/options/etc.?
- Optimize register/unregister calls away if they cancel each other out?
- Should block() use a queue?
- Add explicit wait queue to wait for Task's completion, instead of
callbacks?
- Global functions vs. Task methods?
- Is the Task design good?
- Make Task more like Future? (Or less???)
- Implement various lock styles a la threading.py.
- Add write() calls that don't require yield from.
- Add simple non-async APIs, for simple apps?
- Look at pyftpdlib's ioloop.py:
  http://code.google.com/p/pyftpdlib/source/browse/trunk/pyftpdlib/lib/ioloop.py
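For the lock-styles item above, one possible shape (purely illustrative, using plain callbacks rather than yield from, and not any actual threading.py or scheduler API): acquire() either grants the lock immediately or queues the waiter, and release() hands the lock directly to the next waiter.

```python
import collections

class CallbackLock:
    """Sketch of a scheduler-friendly Lock. Waiters here are plain
    callbacks; in a real implementation they would be blocked tasks
    that release() unblocks."""

    def __init__(self):
        self._locked = False
        self._waiters = collections.deque()

    def acquire(self, callback):
        if not self._locked:
            self._locked = True
            callback()                      # lock granted immediately
        else:
            self._waiters.append(callback)  # runs later, at release()

    def release(self):
        if not self._locked:
            raise RuntimeError("release of unacquired lock")
        if self._waiters:
            self._waiters.popleft()()       # lock passes to the next waiter
        else:
            self._locked = False
```

Condition and Semaphore variants would follow the same pattern, differing only in when queued waiters get run.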

MISTAKES I MADE
- Forgetting yield from. (E.g.: scheduler.sleep(1); listener.accept().)
- Forgot to add bare yield at end of internal function, after block().
- Forgot to call add_done_callback().
- Forgot to pass an undoer to block(), bug only found when cancelled.
- Subtle accounting mistake in a callback.
- Used context.eventloop from a different thread, forgetting about TLS.
- Nasty race: eventloop.ready may contain both an I/O callback and a
cancel callback. How to avoid? Keep the DelayedCall in ready. Is
that enough?
- If a toplevel task raises an error it just stops and nothing is logged
unless you have debug logging on. This confused me. (Then again,
previously I logged whenever a task raised an error, and that was too
chatty...)
- Forgot to set the connection socket returned by accept() in
nonblocking mode.
- Nastiest so far (cost me about a day): A race condition in
call_in_thread() where the Future's done_callback (which was
task.unblock()) would run immediately at the time when
add_done_callback() was called, and this screwed over the task
state. Solution: wrap the callback in eventloop.call_later().
Ironically, I had a comment stating there might be a race condition.
- Another bug where I was calling unblock() for the current thread
immediately after calling block(), before yielding.
- readexactly() wasn't checking for EOF, so it could loop forever.
  (Worse, the first fix I attempted was wrong.)
- Spent a day trying to understand why a tentative patch trying to
move the recv() implementation into the eventloop (or the pollster)
resulted in problems cancelling a recv() call. Ultimately the
problem is that the cancellation mechanism is part of the coroutine
scheduler, which simply throws an exception into a task when it next
runs, and there isn't anything to be interrupted in the eventloop;
but the eventloop still has a reader registered (which will never
fire because I suspended the server -- that's my test case :-).
Then, the eventloop keeps running until the last file descriptor is
unregistered. What contributed to this disaster?
* I didn't build the whole infrastructure, just played with recv()
* I don't have unittests
  * I don't have good logging to see what is going on
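The first mistake in the list (forgetting yield from) is easy to reproduce with toy stand-ins; the bug is silent because calling a generator function merely creates a generator object, which is then discarded unstarted:

```python
ran = []

def sleep(seconds):
    """Toy stand-in for scheduler.sleep(); records that its body ran."""
    ran.append(seconds)
    yield

def handler_buggy():
    sleep(1)             # BUG: makes a generator object and discards it
    yield

def handler_fixed():
    yield from sleep(1)  # correct: delegates into the sub-generator
```

Driving handler_buggy() to completion raises no error, yet sleep()'s body never executes, which is exactly what makes this class of mistake hard to spot without warnings or logging.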