PEP: XXX Title: Standard futures library Version: $Revision$ Last-Modified: $Date$ Author: Brian Quinlan Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 16-Oct-2009 Python-Version: 3.2 Post-History: ======== Abstract ======== This PEP proposes a design for a package that facilitates the evaluation of callables using threads and processes. ========== Motivation ========== Python currently has powerful primitives to construct multi-threaded and multi-process applications but parallelizing simple operations requires a lot of work i.e. explicitly launching processes/threads, constructing a work/results queue, and waiting for completion or some other termination condition (e.g. failure, timeout). It is also difficult to design an application with a global process/thread limit when each component invents its own execution strategy. ============= Specification ============= Check Prime Example ------------------- :: import futures import math PRIMES = [ 112272535095293, 112451234512351, 112582705942171, 112272535095293, 115280095190773, 115797848077099, 115912095245127, 117450548693743, 993960000099397] def is_prime(n): if n % 2 == 0: return False sqrt_n = int(math.floor(math.sqrt(n))) for i in range(3, sqrt_n + 1, 2): if n % i == 0: return False return True with futures.ProcessPoolExecutor() as executor: for number, prime in zip(PRIMES, executor.map(is_prime, PRIMES)): print('%d: %s' % (number, prime)) Web Crawl Example ----------------- :: import functools import urllib.request import futures URLS = ['http://www.foxnews.com/', 'http://www.cnn.com/', 'http://europe.wsj.com/', 'http://www.bbc.co.uk/', 'http://some-made-up-domain.com/'] def load_url(url, timeout): return urllib.request.urlopen(url, timeout=timeout).read() with futures.ThreadPoolExecutor(max_threads=5) as executor: future_list = executor.run_to_futures( [functools.partial(load_url, url, 30) for url in URLS]) for url, future in zip(URLS, future_list): if future.exception() is not None: print('%r generated an exception: %s' % (url, future.exception())) else: print('%r page is %d bytes' % (url, len(future.result()))) Interface --------- The proposed package provides three core class: `Executors`, `FutureLists` and `Futures`. Executor '''''''' `Executor` is an abstract class that provides methods to execute calls asynchronously. `run_to_futures(calls, timeout=None, return_when=ALL_COMPLETED)` Schedule the given calls for execution and return a `FutureList` containing a `Future` for each call. This method should always be called using keyword arguments, which are: *calls* must be a sequence of callables that take no arguments. *timeout* can be used to control the maximum number of seconds to wait before returning. If *timeout* is not specified or ``None`` then there is no limit to the wait time. *return_when* indicates when the method should return. It must be one of the following constants: ============================= ================================================== Constant Description ============================= ================================================== `FIRST_COMPLETED` The method will return when any call finishes. `FIRST_EXCEPTION` The method will return when any call raises an exception or when all calls finish. `ALL_COMPLETED` The method will return when all calls finish. `RETURN_IMMEDIATELY` The method will return immediately. ============================= ================================================== `run_to_results(calls, timeout=None)` Schedule the given calls for execution and return an iterator over their results. The returned iterator raises a `TimeoutError` if `__next__()` is called and the result isn't available after *timeout* seconds from the original call to `run_to_results()`. If *timeout* is not specified or ``None`` then there is no limit to the wait time. If a call raises an exception then that exception will be raised when its value is retrieved from the iterator. `map(func, *iterables, timeout=None)` Equivalent to map(*func*, *\*iterables*) but executed asynchronously and possibly out-of-order. The returned iterator raises a `TimeoutError` if `__next__()` is called and the result isn't available after *timeout* seconds from the original call to `run_to_results()`. If *timeout* is not specified or ``None`` then there is no limit to the wait time. If a call raises an exception then that exception will be raised when its value is retrieved from the iterator. `Executor.shutdown()` Signal the executor that it should free any resources that it is using when the currently pending futures are done executing. Calls to `Executor.run_to_futures`, `Executor.run_to_results` and `Executor.map` made after shutdown will raise `RuntimeError`. ProcessPoolExecutor ''''''''''''''''''' The `ProcessPoolExecutor` class is an `Executor` subclass that uses a pool of processes to execute calls asynchronously. `__init__(max_processes)` Executes calls asynchronously using a pool of a most *max_processes* processes. If *max_processes* is ``None`` or not given then as many worker processes will be created as the machine has processors. ThreadPoolExecutor '''''''''''''''''' The `ThreadPoolExecutor` class is an `Executor` subclass that uses a pool of threads to execute calls asynchronously. `__init__(max_threads)` Executes calls asynchronously using a pool of at most *max_threads* threads. FutureList Objects '''''''''''''''''' The `FutureList` class is an immutable container for `Future` instances and should only be instantiated by `Executor.run_to_futures`. `wait(timeout=None, return_when=ALL_COMPLETED)` Wait until the given conditions are met. This method should always be called using keyword arguments, which are: *timeout* can be used to control the maximum number of seconds to wait before returning. If *timeout* is not specified or ``None`` then there is no limit to the wait time. *return_when* indicates when the method should return. It must be one of the following constants: ============================= ================================================== Constant Description ============================= ================================================== `FIRST_COMPLETED` The method will return when any call finishes. `FIRST_EXCEPTION` The method will return when any call raises an exception or when all calls finish. `ALL_COMPLETED` The method will return when all calls finish. `RETURN_IMMEDIATELY` The method will return immediately. This option is only available for consistency with `Executor.run_to_results` and is not likely to be useful. ============================= ================================================== `cancel(timeout=None)` Cancel every `Future` in the list and wait up to *timeout* seconds for them to be cancelled or, if any are already running, to finish. Raises a `TimeoutError` if the running calls do not complete before the timeout. If *timeout* is not specified or ``None`` then there is no limit to the wait time. `has_running_futures()` Return `True` if any `Future` in the list is currently executing. `has_cancelled_futures()` Return `True` if any `Future` in the list was successfully cancelled. `has_done_futures()` Return `True` if any `Future` in the list has completed or was successfully cancelled. `has_successful_futures()` Return `True` if any `Future` in the list has completed without raising an exception. `has_exception_futures()` Return `True` if any `Future` in the list completed by raising an exception. `cancelled_futures()` Return an iterator over all `Future` instances that were successfully cancelled. `done_futures()` Return an iterator over all `Future` instances that completed are were cancelled. `successful_futures()` Return an iterator over all `Future` instances that completed without raising an exception. `exception_futures()` Return an iterator over all `Future` instances that completed by raising an exception. `running_futures()` Return an iterator over all `Future` instances that are currently executing. `__len__()` Return the number of futures in the `FutureList`. `__getitem__(i)` Return the ith `Future` in the list. The order of the futures in the `FutureList` matches the order of the class passed to `Executor.run_to_futures` `FutureList.__contains__(future)` Return `True` if *future* is in the `FutureList`. Future Objects '''''''''''''' The `Future` class encapsulates the asynchronous execution of a function or method call. `Future` instances are created by the `Executor.run_to_futures` and bundled into a `FutureList`. `cancel()` Attempt to cancel the call. If the call is currently being executed then it cannot be cancelled and the method will return `False`, otherwise the call will be cancelled and the method will return `True`. `Future.cancelled()` Return `True` if the call was successfully cancelled. `Future.done()` Return `True` if the call was successfully cancelled or finished running. `result(timeout=None)` Return the value returned by the call. If the call hasn't yet completed then this method will wait up to *timeout* seconds. If the call hasn't completed in *timeout* seconds then a `TimeoutError` will be raised. If *timeout* is not specified or ``None`` then there is no limit to the wait time. If the future is cancelled before completing then `CancelledError` will be raised. If the call raised then this method will raise the same exception. `exception(timeout=None)` Return the exception raised by the call. If the call hasn't yet completed then this method will wait up to *timeout* seconds. If the call hasn't completed in *timeout* seconds then a `TimeoutError` will be raised. If *timeout* is not specified or ``None`` then there is no limit to the wait time. If the future is cancelled before completing then `CancelledError` will be raised. If the call completed without raising then ``None`` is returned. `index` int indicating the index of the future in its `FutureList`. ========= Rationale ========= The proposed design of this module was heavily influenced by the the Java java.util.concurrent package [1]_. The conceptual basis of the module, as in Java, is the Future class, which represents the progress and results of an asynchronous computation. The Future class makes little commitment to the evaluation mode being used e.g. it can be be used to represent lazy or eager evaluation, for evaluation using threads or using processes. Futures are created by concrete implementations of the Executor class (called ExecutorService in Java). The reference implementation provides a classes that use either a process a thread pool to eagerly evaluate computations. Futures have already been seen in Python as part of a popular Python cookbook recipe [2]_ and have discussed on the Python-3000 mailing list [3]_. The proposed design is explicit i.e. it requires that clients be aware that they are consuming Futures. It would be possible to design a module that would return proxy objects (in the style of `weakref`) that could be used transparently. It is possible to build a proxy implementation on top of the proposed explicit mechanism. The proposed design does not introduce any changes to Python language syntax or semantics. Special syntax could be introduced [4]_ to mark function and method calls as asynchronous. A proxy result would be returned while the operation is eagerly evaluated asynchronously, and execution would only block if the proxy object were used before the operation completed. ======================== Reference Implementation ======================== The reference implementation [5]_ contains a complete implementation of the proposed design. ========== References ========== .. [1] `java.util.concurrent` package documentation `http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/package-summary.html` .. [2] Python Cookbook recipe 84317, "Easy threading with Futures" `http://code.activestate.com/recipes/84317/` .. [3] `Python-3000` thread, "mechanism for handling asynchronous concurrency" `http://mail.python.org/pipermail/python-3000/2006-April/000960.html` .. [4] `Python 3000` thread, "Futures in Python 3000 (was Re: mechanism for handling asynchronous concurrency)" `http://mail.python.org/pipermail/python-3000/2006-April/000970.html` .. [5] Reference `futures` implementation `http://code.google.com/p/pythonfutures` ========= Copyright ========= This document has been placed in the public domain. .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End: