Hathaway Weblog: Transparent Async I/O in Python
I dream of being able to write I/O code in Python in the synchronous, blocking style, then switch it to asynchronous, non-blocking operation with minimal changes. Synchronous I/O is easier to write, but asynchronous I/O avoids the need for threads and is often faster. Can it be done? The challenge is this: when the software is about to perform a blocking I/O operation, you have to convert it to a non-blocking operation (fairly easy), unwind the stack (probably not hard), transfer control to an event loop (easy), and restore the stack when the I/O finishes (hard).
Stackless Python provides a way to unwind and restore stacks, so it is the easiest way to find out whether the idea is a good one. However, Stackless is not part of core Python and may never be, so even if the idea turns out to be successful, not many people will use it.
The new generator-based coroutines in Python 2.5 might help if you code everything that does any I/O as a generator and simulate a stack using a stack of generators. Unfortunately, I think I'd have to write a ton of "yield" statements in that case. I wouldn't be allowed to call any function that does I/O directly; I'd have to ask the simulated stack to call the function for me.
The coroutine-based solution might work, though, if there were a decorator that could turn a function into an I/O-aware coroutine. This decorator would turn calls to blocking functions into yield statements that transfer control back to the event loop.
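To make the "stack of generators" idea concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: the names `Read`, `Result`, `run`, and `fake_io` are hypothetical, and the "event loop" simply pops canned data from a dictionary instead of waiting on real non-blocking sockets. It shows how every call to an I/O-performing function has to go through a `yield`.

```python
import types


class Read(object):
    """Hypothetical request: 'I would block reading from this source.'"""
    def __init__(self, source):
        self.source = source


class Result(object):
    """Hypothetical marker used to hand a return value to the caller."""
    def __init__(self, value):
        self.value = value


def read_line(source):
    # Instead of blocking, ask the loop for data and wait to be resumed.
    line = yield Read(source)
    yield Result(line.strip())


def read_pair(source):
    # "Calling" another I/O function means yielding its generator.
    first = yield read_line(source)
    second = yield read_line(source)
    yield Result((first, second))


def run(task, fake_io):
    """Drive `task` using an explicit stack of generators.

    `fake_io` maps a source name to a list of chunks. It stands in for
    a real event loop, which would wait until the source is readable.
    """
    stack = [task]
    value = None
    while stack:
        try:
            yielded = stack[-1].send(value)
        except StopIteration:
            stack.pop()              # finished without a Result
            value = None
            continue
        if isinstance(yielded, types.GeneratorType):
            stack.append(yielded)    # simulated function call
            value = None             # a fresh generator must receive None
        elif isinstance(yielded, Result):
            stack.pop().close()      # simulated return
            value = yielded.value
        elif isinstance(yielded, Read):
            value = fake_io[yielded.source].pop(0)   # pretend I/O completed
        else:
            raise TypeError("unexpected value yielded: %r" % (yielded,))
    return value


pair = run(read_pair("sock"), {"sock": ["hello\n", "world\n"]})
```

Even in this toy, `read_pair` cannot simply call `read_line`; it has to yield it, which is exactly the burden described above.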
I wonder if anyone is trying to do this already.
Comments
I think the biggest problem in Python is the (lack of) support for low-level asynchronous file access. (I think Py3k may have this, I saw a post on the dev mailing list.) The stackless stuff is a piece of cake. Get a coroutine to start reading a file, then block on a channel that you unblock once the data is available... Then of course, building an extra layer of syntactic sugar on top is trivial.
Shane,
There is some interesting work being done at the MIT Media Lab that might help:
http://viral.media.mit.edu/peers/doc/info.html
It would be used as an application "core", driven by a Python library, with an event loop running "underneath". It sounds like you could more or less forget about whether your process should be sync or async, and let the engine sort it out. The C++ engine is thread based, with continuations (as "micro-threads") at a higher level, which are used whenever those are sufficient to the task. I know this sounds much like a "Twisted" approach, with the event loop based core, etc., but perhaps it is lightweight enough to be "invisible".
If that solution doesn't quite get you where you want to go, maybe one of these message-driven approaches would be usable:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/365292
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/365640
HTH, Jerry S.
Here's an approach, using greenlets:
http://pyds.muensterland.org/wiki/continuationbasedserver.html
Maybe this is the real deal, but it requires Python 2.5:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/498141
