Python Language Summit

Note: while a few procedural decisions were made (such as the Discussions-To: header becoming more significant for PEPs), this was more an information sharing session than it was about making decisions on any particular topic.

State of PyPy

Armin Rigo and Maciej Fijalkowski

PyPy 2.0 not far away
PyPy/RPython split in progress. Separated directories in the main PyPy repo, will move to separate repos some time post-2.0 (requires a lot of fixes to related tools)
ARM support in progress, will chat to Trent regarding better access to ARM machines through Snakebite
making good progress on Py3k support

State of Jython

Philip Jenvey

Jython 2.7b1 released last month
Will look at 3.3 support after 2.7 is released
Java 7 “invoke dynamic” support doesn’t actually work yet (JRuby tried it), but once it is working and Jython has been updated to use it, then it should make for substantial performance improvements

State of IronPython

Jeff Hardy

Already supports 2.7
Jeff’s made some attempts at Python 3 support, hopes to take another look this year
Not many contributors, definitely welcome more
Now Apache licensed on GitHub (shared repo with IronRuby)

Packaging Eco-System

Me

Previous effort involved a lot of good work, but various factors have limited adoption in practice
Current efforts are focused on decoupling the build toolchains from the installation tools
Will be giving the “Discussions-To” header in PEPs more significance: the announcement of acceptance of a PEP with that set will happen on the named list and copies will not be sent to python-dev.
PEPs involving standard library changes will still have to happen on python-dev
Need better documentation for the overall packaging and distribution tools ecosystem. I’ve started such a thing, but at the moment it is just a forlorn issue sitting alone in a “python-meta-packaging” repo I created under the PSF’s BitBucket account.

XML and Security

Brett Cannon

Many security issues inherent in the XML spec.
Hard to decide how to update the standard library appropriately
“Secure-by-default” is highly desirable, but some things are inherently dangerous (e.g. pickle, XML)
May settle for readily available “safer XML parsing” config options that frameworks may choose to enable by default
Also some communication issues with being clear on what is currently blocking CPython point releases
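
As a rough illustration of what a "safer XML parsing" config option can look like, here is a minimal sketch using the SAX feature flags that already exist in xml.sax.handler. It is not what the summit settled on; the helper name parse_without_external_entities and the TagCollector handler are made up for this example.

```python
import io
import xml.sax
from xml.sax.handler import (
    ContentHandler,
    feature_external_ges,
    feature_external_pes,
)


class TagCollector(ContentHandler):
    """Hypothetical handler that records element names in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def startElement(self, name, attrs):
        self.tags.append(name)


def parse_without_external_entities(data):
    """Parse XML bytes with external entity processing switched off."""
    parser = xml.sax.make_parser()
    # Explicitly opt out of external general and parameter entities,
    # rather than relying on whatever the default happens to be.
    parser.setFeature(feature_external_ges, False)
    parser.setFeature(feature_external_pes, False)
    handler = TagCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(data))
    return handler.tags


tags = parse_without_external_entities(b"<root><item/><item/></root>")
```

A framework could wrap its parser construction in a helper like this and enable it by default, leaving the permissive behaviour as an explicit opt-in.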

Tulip and enhanced async programming support in the standard library

Guido van Rossum

PEP 380 almost made it into 2.7. Integration issues (such as the lack of unit tests and docs) and the language moratorium meant it ended up being delayed until 3.3
Non-blocking socket support and asyncore exist, but not a great foundation for robust async IO infrastructure
Twisted and Tornado show how event based async IO can be successful in Python
Guido still doesn’t like callback based programming :)
Aim to create a universal event loop API for Tornado/Twisted/et al to interoperate
Also aim to make it possible to write yield-from based async code
Read PEP 3156 and related discussions, we mostly just rehashed those for the benefit of those that hadn’t been following along through the many, many threads on python-ideas :)
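
To give a feel for the "yield-from based" style mentioned above, here is a minimal sketch of PEP 380 delegation driven by a toy round-robin scheduler. This is not the tulip API from PEP 3156; countdown, task and run_all are hypothetical names used only for illustration.

```python
from collections import deque


def countdown(label, n, log):
    """Sub-coroutine: records progress and yields control between steps."""
    while n > 0:
        log.append((label, n))
        yield  # hand control back to the scheduler
        n -= 1
    return label + " done"


def task(label, n, log):
    """Outer coroutine: delegates to countdown() and receives its result."""
    result = yield from countdown(label, n, log)
    log.append(result)


def run_all(tasks):
    """Toy scheduler: resume each task in turn until all have finished."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)
        except StopIteration:
            continue  # this task finished, so drop it
        ready.append(current)


log = []
run_all([task("a", 2, log), task("b", 1, log)])
```

The point of the style is that task() reads like straight-line code: the return value of countdown() arrives as the value of the yield from expression, with no callbacks involved.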

Parallelizing the Python Interpreter

Trent Nelson

Allow CPython internals to be executed from multiple threads
Minimise required changes
Initial attempt on hg.python.org/trent (px branch)
Only works on Vista+ (relies heavily on Windows features, Trent has ideas on how to adapt it to *nix)
Wants to get it working as a proof of concept first, then clean up and add *nix based solution
May end up as a Stackless style persistent fork/derived implementation for a long while
Example of a CPU-bound task based on tulip style async API, able to exploit all cores
GIL still present, no fine-grained locks, no STM
Intercept “thread-sensitive calls” - anything the GIL protects. (refcounts, object allocator, free lists, interpreter globals, etc)
“Normal” threads behave as they do now
Declared “parallel threads” do something different
Low overhead then becomes about detecting whether or not you’re in a parallel thread really fast.
Windows and POSIX both offer ways to detect thread identity based on a single memory read
Parallel threads only incref/decref when the parallel context is created/destroyed, so it is possible to cope with the fact that the main thread is effectively ignoring any synchronisation mechanisms
Main thread stops while the parallel threads are running, so it can’t steal things out from underneath the parallel threads
Wraps objects with async-protected equivalents
Ultimately, current version is highly experimental, and it’s not yet clear if it can be made sufficiently robust to be useful in general.

(There are some promising notions here that may fit with some vague ideas I’ve had regarding subinterpreters, but there are a lot of real problems with the current approach, especially relating to references to mutable containers that are modified after the parallel context starts. May still be worth pursuing for the benefit of platforms where multiple processes are a significant problem for performance, especially memory usage. I suggested Trent look into subinterpreters and the Rust memory model for ways this could be hardened against the many possible segfault inducing behaviours in the current implementation)

Snakebite

Trent Nelson

Set up to provide interesting architectures and OSes for open source projects to test against
Currently heavily reliant on Trent’s time, interested in exploring ways to make it more open to external contributions (donate to PSF?)
AIX and HP-UX are still red on the CPython buildbots, and some others are only green due to extensive environment setup to get CPython building properly
Trent is interested in finding ways to make this more useful to the community
Perhaps set up databases for easier database testing?
Ad hoc BuildBot farms for testing experimental forks?
Currently pre-built machines on bare metal (mostly more esoteric OSes and architectures)

Argument Clinic

Larry Hastings, Nick Coghlan

Introspection on builtin and extension functions is currently close to useless
Builtin and extension functions are already too hard to write, adding signature data as well isn’t a reasonable option
Solution: add an in-place DSL that generates in-place C to be checked in.
PEPs 436 (Larry) and 437 (Stefan Krah) are competing flavours of the DSL
Both PEPs agree on the general concept of adding a preprocessor step to reduce the complications involved in adding and updating builtin and extension module functions and methods
Both PEPs also agree on checking the preprocessed modules with both the input and generated output into source control, so the custom preprocessor isn’t needed to build Python from a source checkout
Stefan’s PEP pushes for a more Python-inspired syntax for the signature definition itself, whereas Larry’s PEP is more Javadoc inspired (with fewer @ symbols and more indentation)
Since the PEPs are in agreement on most points, Larry, Guido and I will get together at some point this week to try to thrash out something Guido likes in terms of the DSL syntax details
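
The introspection gap is easy to see with inspect.signature (added in 3.3 by PEP 362). The sketch below is only a demonstration of the problem, not part of either PEP; describe and pure_python are hypothetical names, and whether a given builtin reports a signature depends on the Python version.

```python
import inspect


def describe(func):
    """Return the signature as text, or None if it cannot be introspected."""
    try:
        return str(inspect.signature(func))
    except (ValueError, TypeError):
        return None


def pure_python(a, b=2, *, flag=False):
    return a


python_sig = describe(pure_python)
# For a builtin, the result may be None when no signature data is available.
builtin_sig = describe(max)
```

Functions written in Python get full signature details for free; a C function only gets them if something supplies the metadata, which is the job both DSL proposals aim to automate.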

CFFI

Alex Gaynor, Armin Rigo, Maciej Fijalkowski

cffi competes with both ctypes and SWIG (for C only, not C++)
unlike ctypes, transparent to the JIT on PyPy (and hence much faster)
generally slightly faster than ctypes on CPython (due to module generation step)
replaces ctypes for ABI access to shared libraries
provides an easy way to generate C extensions given a subset of the C API details (thus replacing some uses of SWIG and Cython)
Needs some work to clarify the API and more clearly separate the “create an extension module” step from the “load from cached extension module” step
Dependencies are pycparser and PLY for the higher level typesafe API, libffi for callback handling and the ABI layer of the API (which is just as unsafe and prone to segfaults as ctypes)
If cffi, and hence pycparser and PLY, are added to the stdlib, all 3 will be public. We may make use of the “provisional API” status.
Will reconsider proposal once some of the feedback has been addressed, but the idea of adding it certainly seems reasonable
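
For readers who haven't seen it, this is roughly what the ABI layer looks like in use. A minimal sketch, assuming the third-party cffi package is installed and a POSIX platform where dlopen(None) exposes the C library; c_strlen is a hypothetical helper, and it returns None when cffi is unavailable.

```python
try:
    from cffi import FFI  # third-party package, assumed to be installed
except ImportError:
    FFI = None


def c_strlen(text):
    """Call the C library's strlen() on a bytes object via cffi's ABI mode."""
    if FFI is None:
        return None  # cffi not available in this environment
    ffi = FFI()
    # Declare just the piece of the C API we intend to call.
    ffi.cdef("size_t strlen(const char *s);")
    # dlopen(None) loads the standard C library namespace on POSIX systems.
    libc = ffi.dlopen(None)
    return libc.strlen(text)


length = c_strlen(b"hello")
```

The declaration is plain C rather than a Python-level description of the types, which is the main usability difference from ctypes. As noted above, this layer is just as capable of segfaulting as ctypes if the declaration doesn't match the real function.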

Cross-compilation

Matthias Klose

(I confess I wasn’t really listening to this part, I was playing catch-up on Stefan Krah’s draft argument DSL PEP he sent me shortly before I left Australia for PyCon US)

CPython 3.3 and 2.7 both support cross-compilation (e.g. x86_64 to ARM)
still a few issues in various regards
looking to propose additional more invasive changes to the build process to potentially make this easier

Test Facilities

Robert Collins

stdlib test facilities are focused on in-process testing
cross-platform and cross-process and parallel testing becoming more important
easier to drop into a debugger (especially a remote debugger!)
Robert has a stack that can do this for ordinary unittest-based tests
Michael is interested in evolving unittest itself as needed, but need to figure out appropriate things to do

Enums in the standard library

Barry Warsaw

Feature set of flufl.enum is pretty good
Don’t want to implement the superset of all third party enum libraries
Precedent set by bool is for enums to seamlessly interoperate with integers
Highly desirable for any stdlib enum solution to be usable as a replacement for the constants in the socket and errno libraries without a backwards compatibility break
Guido doesn’t want to have 2 similar enum types in the stdlib, and he wants one that can be used in socket and errno (he called this behaviour bdfl.enum, to contrast with flufl.enum)
Guido is OK with different enum types comparing equal, and requiring explicit type checks to limit an API to accepting only particular enum types
As a proponent of labelled values over any form of enum, Guido’s stated preference for “enum as labelled int” (following the precedent set by bool) actually works for me
Barry is in favour of bitmask support for flufl.enum anyway, which is the other element (other than comparisons) needed for a solid proposal that is interoperable with integers
Guido also made the point that this is a case where “good enough” will likely be enough to kill off most third party enums over time
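
The "enum as labelled int" idea, following the bool precedent, can be sketched with a small int subclass. This is not flufl.enum or any proposed stdlib API; NamedInt is a hypothetical class and the values are illustrative only.

```python
class NamedInt(int):
    """An int that remembers a label, in the way bool labels 0 and 1."""

    def __new__(cls, name, value):
        self = super().__new__(cls, value)
        self._name = name
        return self

    def __repr__(self):
        return "<{}: {}>".format(self._name, int.__repr__(self))


# Illustrative constants in the style of the socket module.
AF_INET = NamedInt("AF_INET", 2)
FLAG_A = NamedInt("FLAG_A", 1)
FLAG_B = NamedInt("FLAG_B", 4)

same_as_int = AF_INET == 2      # compares equal to the plain integer
combined = FLAG_A | FLAG_B      # bitmask arithmetic yields a plain int
label = repr(AF_INET)
```

Existing code that passes or compares raw integers keeps working, which is what makes this shape usable in socket and errno without a compatibility break. A fuller design would also need bitmask results to keep their labels.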

Requests

Larry Hastings

With Kenneth Reitz declaring a stable API for requests with 1.0, he’s interested in offering it for stdlib inclusion in 3.4
chardet and urllib3 vendored dependencies are a concern for incorporation, particularly with tulip/PEP 3156 also coming in Python 3.4
a tulip-backed requests would be much easier to include (as well as a validation of tulip’s support for writing synchronous front ends to the async tulip backend)

Legacy Modules

Nick Coghlan

Better indicate deprecated libraries in the table of contents
Maybe by a separate section in the ToC, or just by appending “(deprecated)” to the section titles

Things we didn’t cover

These didn’t get covered because I forgot to put them on the agenda. I’ll probably be chatting to people about them during the week anyway:

PEP 422 (simple customisation of class creation)
PEP 432 (CPython interpreter initialization)
Unicode improvements (change stream encodings, better encoding specification in subprocess, restore type agnostic convenience access to the codecs module)
