Notes from PyCon 2013 sprints
=============================

- Cancellation.  If a task creates several subtasks, and then the
  parent task fails, should the subtasks be cancelled?  (How do we
  even establish the parent/subtask relationship?)

- Adam Sah suggests that there might be a need for scheduling
  (especially when multiple frameworks share an event loop).  He
  points to lottery scheduling but also mentions that's just one of
  the options.  However, after posting on python-tulip, it appears
  none of the other frameworks have scheduling, and nobody seems to
  miss it.

- Feedback from Bram Cohen (BitTorrent creator) about UDP.  He doesn't
  think connected UDP is worth supporting; it doesn't do anything
  except tell the kernel about the default target address for
  sendto().  Basically he says all UDP end points are servers.  He
  sent me his own UDP event loop so I might glean some tricks from it.
  He says we should treat EINTR the same as EAGAIN and friends.  (We
  should use the exceptions dedicated to errno checking, BTW.)  He
  said to make sure we use SO_REUSEADDR (I think we already do).  He
  said to set the max datagram sizes pretty large (anything larger
  than the declared limit is dropped on the floor).  He reminds us of
  the importance of being able to pick a valid, unused port by binding
  to port 0 and then using getsockname().  He has an idea where he'd
  like to be able to kill all registered callbacks (i.e. Handles)
  belonging to a certain "context".  I think this can be done at the
  application level (you'd have to wrap everything that returns a
  Handle and collect these handles in some set or other data structure)
  but if someone thinks it's interesting we could imagine having some
  kind of notion of context as part of the event loop state,
  e.g. associated with a Task (see Cancellation point above).  He
  brought up uTP (Micro Transport Protocol), a reimplementation of TCP
  over UDP with more refined congestion control.

- Mumblings about UNIX domain sockets and IPv6 addresses being
  4-tuples.  The former can be handled by passing in a socket.  There
  seem to be no real use cases for the latter that can't be dealt with
  by passing in suitably esoteric strings for the hostname.
  getaddrinfo() will produce the appropriate 4-tuple and connect()
  will accept it.

- Mumblings on the list about add vs. set.
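
To make Bram's UDP points above concrete, here is a minimal sketch using
only the standard socket module.  The helper name bind_unused_udp_port()
is made up for this note, and the 65535-byte receive size is just one
reading of "pretty large".  It binds to port 0, asks the kernel which
port it picked via getsockname(), and uses plain sendto() without ever
calling connect().

```python
import socket


def bind_unused_udp_port(host="127.0.0.1"):
    """Bind a UDP socket to a kernel-chosen unused port; return (sock, port).

    Hypothetical helper for illustration only.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, 0))            # port 0: let the kernel pick a free port
    port = sock.getsockname()[1]    # find out which port we actually got
    return sock, port


sock, port = bind_unused_udp_port()
peer, peer_port = bind_unused_udp_port()
sock.settimeout(5)                  # avoid hanging forever if the datagram is lost

# No connect(): every datagram names its target explicitly.
peer.sendto(b"ping", ("127.0.0.1", port))

# Ask for a large maximum size; anything longer than this would be truncated.
data, addr = sock.recvfrom(65535)

sock.close()
peer.close()
```

Both end points here are bound the same way, which fits the "all UDP
end points are servers" view.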


Notes from the second Tulip/Twisted meet-up
===========================================

Rackspace, 12/11/2012
Glyph, Brian Warner, David Reid, Duncan McGreggor, others

Flow control
------------

- Pause/resume on transport manages data_received.

- There's also an API to tell the transport whom to pause when the
  write calls are overwhelming it: IConsumer.registerProducer().

- There's also something called pipes but it's built on top of the
  old interface.

- Twisted has variations on the basic flow control that I should
  ignore.

Half_close
----------

- This sends an EOF after writing some stuff.

- Can't write any more.

- Problem with TLS is known (the RFC sadly specifies this behavior).

- It must be dynamically discoverable whether the transport supports
  half_close, since the protocol may have to do something different to
  make up for its absence (e.g. use chunked encoding).  Twisted uses
  an interface check for this and also hasattr(trans, 'halfClose')
  but a flag (or flag method) is fine too.
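
A rough sketch of the "flag method" idea from the last point, assuming
a made-up SocketTransport class with can_half_close() and half_close()
methods (none of these names were agreed on).  The protocol side asks
the transport whether half-close is available and falls back to
HTTP-style chunked framing if it is not.

```python
import socket


class SocketTransport:
    """Hypothetical minimal transport wrapping a connected socket."""

    def __init__(self, sock, supports_half_close=True):
        self._sock = sock
        self._supports_half_close = supports_half_close

    def can_half_close(self):
        # Flag method: lets the protocol discover the capability at runtime.
        return self._supports_half_close

    def write(self, data):
        self._sock.sendall(data)

    def half_close(self):
        # Send EOF to the peer; our side can still read afterwards.
        self._sock.shutdown(socket.SHUT_WR)


def send_body(transport, body):
    """Send body, marking its end by EOF if possible, else by chunked framing."""
    if transport.can_half_close():
        transport.write(body)
        transport.half_close()
        return "eof"
    # Fallback: mark the end of the body in-band instead of with EOF.
    transport.write(b"%x\r\n" % len(body) + body + b"\r\n0\r\n\r\n")
    return "chunked"


def read_until_eof(sock):
    """Collect everything the peer sends until it signals EOF."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            return b"".join(chunks)
        chunks.append(chunk)


client, server = socket.socketpair()
mode = send_body(SocketTransport(client), b"hello")
request = read_until_eof(server)    # returns once the client's EOF arrives
server.sendall(b"world")            # the other direction is still open
response = client.recv(4096)
client.close()
server.close()
```

After half_close() the client can't write any more, but it still
receives the reply, which is the whole point of the feature.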

Constructing transport and protocol
-----------------------------------

- There are good reasons for passing a function to the transport
  construction helper that creates the protocol.  (You need these
  anyway for server-side protocols.)  The sequence of events is
  something like

  . open socket
  . create transport (pass it a socket?)
  . create protocol (pass it nothing)
  . proto.make_connection(transport); this does:
    . self.transport = transport
    . self.connection_made(transport)
  
  But it seems okay to skip make_connection and setting .transport.
  Note that make_connection() is a concrete method on the Protocol
  implementation base class, while connection_made() is an abstract
  method on IProtocol.
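
The sequence above could be sketched roughly as follows.  This is not
Tulip or Twisted code: ListTransport, Greeter and create_transport() are
invented for the example, and the "open socket" step is replaced by an
in-memory transport so the sketch runs anywhere.

```python
class Protocol:
    """Base class: make_connection() is concrete, connection_made() abstract."""

    def make_connection(self, transport):
        self.transport = transport
        self.connection_made(transport)

    def connection_made(self, transport):
        raise NotImplementedError


class Greeter(Protocol):
    """Example protocol that writes a greeting as soon as it is connected."""

    def connection_made(self, transport):
        transport.write(b"hello\n")


class ListTransport:
    """Hypothetical stand-in for a socket transport; records what is written."""

    def __init__(self):
        self.written = []

    def write(self, data):
        self.written.append(data)


def create_transport(protocol_factory):
    """Hypothetical construction helper following the sequence in the notes."""
    transport = ListTransport()           # open socket + create transport
    protocol = protocol_factory()         # create protocol (pass it nothing)
    protocol.make_connection(transport)   # sets .transport, calls connection_made()
    return transport, protocol


# The caller passes a factory (here the class itself), not an instance.
transport, protocol = create_transport(Greeter)
```

A server would call the same factory once per accepted connection,
which is why a function is passed rather than a ready-made protocol.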

Event Loop
----------

- We discussed the sequence of actions in the event loop.  I think in the
  end we're fine with what Tulip currently does.  There are two choices:

  Tulip:
  . run ready callbacks until there aren't any left
  . poll, adding more callbacks to the ready list
  . add now-ready delayed callbacks to the ready list
  . go to top

  Tornado:
  . run all currently ready callbacks (but not new ones added during this)
  . (the rest is the same)

  The difference is that in the Tulip version, CPU bound callbacks
  that keep adding more to the queue will starve I/O (and yielding to
  other tasks won't actually cause I/O to happen unless you do
  e.g. sleep(0.001)).  OTOH this may be good because it means there's
  less overhead if you frequently split operations in two.

- I think Twisted does it Tornado style (in a convoluted way :-), but
  it may not matter, and it's important to leave this vague so
  implementations can do what's best for their platform.  (E.g. if the
  event loop is built into the OS there are different trade-offs.)
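
To illustrate the difference between the two orderings, here is a toy
loop (not the real Tulip or Tornado code; ToyLoop and its
drain_new_callbacks flag are invented for this note).  The poll step is
reduced to a counter so we can see how often I/O would get a chance to
run while a callback keeps rescheduling itself.

```python
from collections import deque


class ToyLoop:
    """Toy event loop; drain_new_callbacks=True mimics the Tulip ordering."""

    def __init__(self, drain_new_callbacks):
        self.drain_new_callbacks = drain_new_callbacks
        self.ready = deque()
        self.polls = 0

    def call_soon(self, callback, *args):
        self.ready.append((callback, args))

    def _poll(self):
        # Stand-in for select()/epoll() plus moving due timers to the ready list.
        self.polls += 1

    def run_once(self):
        if self.drain_new_callbacks:
            # Tulip style: keep going until the ready list is empty,
            # including callbacks added while we are running.
            while self.ready:
                callback, args = self.ready.popleft()
                callback(*args)
        else:
            # Tornado style: run only what was ready when this pass started.
            for _ in range(len(self.ready)):
                callback, args = self.ready.popleft()
                callback(*args)
        self._poll()

    def run(self):
        while self.ready:
            self.run_once()


def busy(loop, remaining):
    """CPU-bound callback that reschedules itself 'remaining' more times."""
    if remaining:
        loop.call_soon(busy, loop, remaining - 1)


tulip_style = ToyLoop(drain_new_callbacks=True)
tulip_style.call_soon(busy, tulip_style, 5)
tulip_style.run()

tornado_style = ToyLoop(drain_new_callbacks=False)
tornado_style.call_soon(busy, tornado_style, 5)
tornado_style.run()
```

With six chained callbacks, the Tulip-style loop polls once at the end
while the Tornado-style loop polls after every callback: less overhead
in the first case, no I/O starvation in the second.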

System call cost
----------------

- System calls on MacOS are expensive; on Linux they are cheap.

- Optimal buffer size ~16K.

- Try joining small buffer pieces together, but expect to be tuning
  this later.
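
One possible shape for the "join small pieces" idea, as a sketch only.
WriteBuffer and its threshold parameter are hypothetical, and the 16K
default is simply the figure quoted above, expected to need tuning.

```python
class WriteBuffer:
    """Hypothetical buffer that coalesces small writes into larger sends."""

    def __init__(self, send, threshold=16 * 1024):
        self._send = send            # callable that performs the real write
        self._threshold = threshold  # flush once this many bytes are pending
        self._chunks = []
        self._size = 0

    def write(self, data):
        self._chunks.append(data)
        self._size += len(data)
        if self._size >= self._threshold:
            self.flush()

    def flush(self):
        if self._chunks:
            # One join and one send instead of one system call per piece.
            self._send(b"".join(self._chunks))
            self._chunks = []
            self._size = 0


sends = []                           # records each "system call" we would make
buf = WriteBuffer(sends.append)
for _ in range(100):
    buf.write(b"x" * 1024)           # 100 small 1K writes
buf.flush()                          # push out whatever is left
```

Here 100 small writes turn into 7 sends (six of 16K and one of 4K).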

Futures
-------

- Futures are the most robust API for async stuff: you can check
  errors etc.  So let's do this.

- Just don't implement wait().

- For the basics, however, (recv/send, mostly), don't use Futures but use
  basic callbacks, transport/protocol style.

- make_connection() (by any name) can return a Future; this makes it
  easier to check for errors.

- This means revisiting the Tulip proactor branch (IOCP).

- The semantics of add_done_callback() are fuzzy about in which thread
  the callback will be called.  (It may be the current thread or
  another one.)  We don't like that.  But always inserting a
  call_soon() indirection may be expensive?  Glyph suggested changing
  the add_done_callback() method name to something else to indicate
  the changed promise.

- Separately, I've been thinking about having two versions of
  call_soon() -- a more heavy-weight one to be called from other
  threads that also writes a byte to the self-pipe.
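
A sketch of what the two call_soon() variants might look like, using a
socketpair as the self-pipe.  SelfPipeLoop, call_soon_threadsafe() and
run_until() are names invented for this note, not a settled API, and the
sketch relies on deque.append() being atomic in CPython.

```python
import collections
import selectors
import socket
import threading


class SelfPipeLoop:
    """Toy loop with a cheap call_soon() and a heavier thread-safe variant."""

    def __init__(self):
        self._ready = collections.deque()
        self._selector = selectors.DefaultSelector()
        # The "self-pipe": the loop polls one end, other threads write the other.
        self._rsock, self._wsock = socket.socketpair()
        self._rsock.setblocking(False)
        self._selector.register(self._rsock, selectors.EVENT_READ)

    def call_soon(self, callback, *args):
        # Light-weight version: only safe from the loop's own thread.
        self._ready.append((callback, args))

    def call_soon_threadsafe(self, callback, *args):
        # Heavy-weight version: queue first, then wake the loop with one byte.
        self._ready.append((callback, args))
        self._wsock.send(b"\0")

    def run_until(self, done):
        while not done():
            # Block in select() only when there is nothing to run.
            timeout = 0 if self._ready else None
            for key, _ in self._selector.select(timeout):
                try:
                    key.fileobj.recv(4096)   # drain the wake-up bytes
                except BlockingIOError:
                    pass
            for _ in range(len(self._ready)):
                callback, args = self._ready.popleft()
                callback(*args)

    def close(self):
        self._selector.close()
        self._rsock.close()
        self._wsock.close()


loop = SelfPipeLoop()
results = []
worker = threading.Thread(
    target=loop.call_soon_threadsafe, args=(results.append, "from thread"))
worker.start()
loop.run_until(lambda: bool(results))   # callback runs in the loop's thread
worker.join()
loop.close()
```

The extra cost of the thread-safe version is one send() per call, which
is why it seems worth keeping the cheap version for the common case.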

Signals
-------

- There was a side conversation about signals.  A signal handler is
  similar to another thread, so probably should use (the heavy-weight
  version of) call_soon() to schedule the real callback and not do
  anything else.

- Glyph vaguely recalled some trickiness with the self-pipe.  We
  should be able to fix this afterwards if necessary; it shouldn't
  affect the API design.