mwh's blog for 2005


whoa!:

Long time no update. Oh well.

I seem to be accumulating some fairly major CPython changes in my tree of late. Here are the ones I remember:

I guess it would be more useful to try and check some of these in (or abandon the idea) before creating too many more...

:

I think I need to cut down on the amount of email I get.

That is all.

more climbing:

Been doing quite a bit of climbing recently (and even more since my last climbing post, nearly a year ago). Recent satisfying accomplishments include a nice fingery 6a+ at the wall (took a few visits to get it, though) and Daydream (VS 5a) in the Avon gorge.

Daydream probably counts as my first "real" VS, despite having climbed a VS (Fairy Steps), a HVS (Sunset Slab) and an E1 (Two-Sided Triangle) in the Peak District. They were all head climbs, though.

job!:

As of a few hours ago, I am now paid to work full time on PyPy. Yay!

PyPy 0.8.0:

We released PyPy 0.8.0 today. No huge changes over 0.7.0 which was the first release capable of building a free-standing Python implementation, but quite a bit more polish and speed (both in the translation process and in the resulting executable). Take it for a spin! Find bugs! Have fun!

This Week In PyPy backlog:

For a few weeks now, I've been writing a This Week in PyPy summary of activity in PyPy-land. I've only been posting them to pypy-dev, and they are hopefully suited to being a bit more widely read than that. This blog gets onto Planet Python which probably has an appropriate readership, so I'm going to post future summaries here. But first I'm going to spam Planet Python with the summaries I've already written :) (one a day sounds about right :).

This Week In PyPy for the week ending 2005-11-04:

As previously threatened...

This Week in PyPy 1

Introduction

This is the first of what will hopefully be many summaries of what's been going on in the world of PyPy in the last week. First, I'd like to make a request: help me write these things. As is mentioned in the page about This Week in PyPy:

http://codespeak.net/pypy/dist/pypy/doc/weekly/

as and when something worth summarizing happens, be it on IRC, on a mailing list or off in the blogosphere, add an entry to this file:

http://codespeak.net/pypy/dist/pypy/doc/weekly/log

(if you can) or email me about it (if you can't). This week no one at all has done anything like this, which I'll forgive because it's the first week :) Please, please do get into the habit of doing this though, at least if you think writing this summary isn't a complete waste of time.

Release of PyPy 0.8.0

The biggest thing that's happened in the past week was clearly the release of PyPy 0.8.0. You can read the release announcement at:

http://codespeak.net/pypy/dist/pypy/doc/release-0.8.0.html

This release went fairly smoothly compared to some releases, mainly because we weren't rushing to get some feature or other into the release.

Import Analysis

In an effort to understand what code is used where in PyPy, Michael Hudson wrote a tool to analyse the import structure of PyPy, culminating in a several megabyte HTML report which you can find at:

http://starship.python.net/crew/mwh/importfunhtml/pypy/

For example, this is a list of all the modules that reference pypy.objspace.flow.model.Constant (one of the more referenced names in PyPy):

http://starship.python.net/crew/mwh/importfunhtml/pypy/objspace/flow/model/Constant.html

Of course, this work ended up duplicating some of the things done by tools such as pylint and pyflakes and has the potential to be useful for projects other than PyPy, so I hope to clean it up and maybe make it a pylint plugin soon-ish.
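To give a flavour of what such a tool does, here is a minimal sketch of indexing "who imports what" using the standard library's ast module. It is not the actual tool: the function names and the two-module example "package" are made up for illustration, it is written for present-day Python 3, and it only looks at "from module import name" statements.

```python
import ast
from collections import defaultdict

def imported_names(source):
    """Yield (module, name) pairs for every 'from module import name' in source."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        # Relative imports such as 'from . import x' have module=None; skip them.
        if isinstance(node, ast.ImportFrom) and node.module:
            for alias in node.names:
                yield node.module, alias.name

def build_reference_index(sources):
    """Map 'module.name' to the sorted list of modules that import it.

    `sources` is a dict of {module_name: source_text}; a real tool would
    walk a package directory instead.
    """
    index = defaultdict(set)
    for modname, text in sources.items():
        for module, name in imported_names(text):
            index[f"{module}.{name}"].add(modname)
    return {key: sorted(users) for key, users in index.items()}

# Hypothetical miniature "package", purely for illustration.
example_sources = {
    "pkg.annotator": "from pkg.model import Constant, Variable\n",
    "pkg.rtyper": "from pkg.model import Constant\nimport os\n",
}
index = build_reference_index(example_sources)
print(index["pkg.model.Constant"])
```

A per-name page like the one for Constant above is then just a rendering of one entry of such an index.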

A RPythonC tool?

A fairly common topic of discussion on #pypy starts with people who want to write RPython code and then use PyPy to translate it to efficient C. This was again the case on Monday evening (look from about 19:30 onwards):

http://tismerysoft.de/pypy/irc-logs/pypy/%23pypy.log.20051031

While "officially speaking" supporting such things is not a goal of the PyPy project (RPython is essentially an implementation detail), how often the subject comes up suggests that there probably is some interest in an "rpythonc" type tool that would compile an RPython program. A fairly serious problem, though, is that when the target of compilation turns out not to be RPython, working out why can be very difficult. For these reasons, it seems unlikely that such a tool will be written all that soon (at least, I'm not going to do it :).

PyPy-sync

The main discussion at the weekly pypy-sync meeting was planning for the Göteborg sprint in December:

http://codespeak.net/pypy/extradoc/minute/pypy-sync-11-03-2005.txt

This Week In PyPy for the week ending 2005-11-11:

Number two...

This Week in PyPy 2

Introduction

This is the second of what will hopefully be many summaries of what's been going on in the world of PyPy in the last week. I'd still like to remind people, when something worth summarizing happens, to recommend it for "This Week in PyPy" as mentioned on:

http://codespeak.net/pypy/dist/pypy/doc/weekly/

where you can also find old summaries.

There were about 100 commits to the pypy section of codespeak's repository this week.

pypy-c py.py

Over the weekend (while I was being blown around Wales by the remnants of hurricane Wilma) Armin and a few others worked on trying to get a translated pypy-c to run the interpreted py.py. This resulted in fixing a welter of small differences between CPython and pypy-c, though at the end of it all "we are still left in the dark of incomprehensible geninterplevel crashes caused by subtle differences between the most internal types of CPython and pypy-c."

Multiple Spaces

In one of the reports we're currently writing for the end of phase 1 EU review:

http://codespeak.net/pypy/dist/pypy/doc/low-level-encapsulation.html

we made this claim:

The situation of multiple interpreters is thus handled automatically: if there is only one space instance, it is regarded as a pre-constructed constant and the space object pointer (though not its non-constant contents) disappears from the produced source, i.e. both from function arguments and local variables and from instance fields. If there are two or more such instances, a 'space' attribute will be automatically added to all application objects (or more precisely, it will not be removed by the translation process), the best of both worlds.

And then we tried to do it, and had to tune the claim down because it doesn't work. This is because the StdObjSpace class has a 'specialized method' -- a different version of the wrap() method is generated for each type it is seen to be called with. This causes problems when there are genuine StdObjSpace instances in the translated pypy because of limitations in our tools. We looked at these limitations and decided that it was time to rewrite the world again, leading in to the next section...
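For readers who haven't met the idea, here is a rough run-time analogy of a 'specialized method', written for present-day Python 3. It is only an analogy: in PyPy the per-type copies are made during translation, not by a decorator, and the decorator name, the `versions` attribute and the toy wrap() below are all invented for this sketch.

```python
def specialize_on_argtype(func):
    """Keep one copy of `func` per argument type it is called with.

    Only an analogy: in PyPy the copies are made at translation time,
    not at run time by a decorator.
    """
    versions = {}

    def dispatcher(arg):
        argtype = type(arg)
        if argtype not in versions:
            # Stand-in for "generate a new version for this type".
            def version(value, _name=argtype.__name__):
                return func(value, _name)
            version.__name__ = f"{func.__name__}__{argtype.__name__}"
            versions[argtype] = version
        return versions[argtype](arg)

    dispatcher.versions = versions
    return dispatcher

@specialize_on_argtype
def wrap(value, typename):
    # Hypothetical stand-in for a wrap() method: tag the value with its type.
    return (f"W_{typename}", value)

print(wrap(42))
print(wrap("hello"))
print(sorted(v.__name__ for v in wrap.versions.values()))
```

The point of the analogy is that one source-level function ends up as several distinct functions, which is exactly what makes "which function object is this?" an awkward question for the tools.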

SomePBC-refactoring

One of the more unusual annotations produced by PyPy's annotator is that of 'SomePBC', short for 'SomePrebuiltConstant'. This annotation means that a variable contains a reference to some object that existed before the annotation process began (key example: functions). Up until now, the annotation has actually explicitly included which prebuiltconstants a variable might refer to, which seems like the obvious thing to do. Unfortunately, not all things that we'd like to annotate as a prebuiltconstant actually exist as unique CPython objects -- in particular the point of specializing a function is that it becomes many functions in the translated result. Also for 'external', i.e. not written in RPython, functions we want to be able to supply annotations for the input and exit args even if there is no corresponding CPython function at all.

The chosen solution is to have the SomePBC annotation refer not to a CPython object but to a more abstracted 'Description' of this object. In some sense, this isn't a very large change but it affects most files in the annotation directory and a fair fraction of those under rpython/ and translator/. We're also cleaning up some other mess while we're there and breaking everything anyway.
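As a very loose sketch of the shape of this change (not the real annotator code), the annotation can be thought of as holding a set of descriptions rather than a set of live Python objects. The class and field names below (FunctionDesc, specialization, union) are assumptions made for illustration, and the example uses present-day Python 3 dataclasses.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FunctionDesc:
    """Hypothetical 'Description' of a function; no CPython object is required."""
    name: str
    specialization: str = ""   # e.g. the argument type a copy was made for

@dataclass(frozen=True)
class SomePBC:
    """Annotation meaning 'one of these prebuilt constants', held as descriptions."""
    descriptions: frozenset

    def union(self, other):
        # Merging two annotations: the variable may hold any constant from either.
        return SomePBC(self.descriptions | other.descriptions)

# Two specialized copies of one source function, plus an 'external' function
# that has no corresponding Python function object at all (names made up).
wrap_int = FunctionDesc("wrap", "int")
wrap_str = FunctionDesc("wrap", "str")
external = FunctionDesc("os_write")

a = SomePBC(frozenset([wrap_int]))
b = SomePBC(frozenset([wrap_str, external]))
merged = a.union(b)
print(sorted(d.name + ":" + d.specialization for d in merged.descriptions))
```

Because the descriptions are plain values, two specializations of the same function can be told apart, and an external function can be described without ever existing as a Python callable.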

Draft-Dynamic-...

It's not linked from anywhere on the website (yet...) but the report that will become "Deliverable 05.1":

http://codespeak.net/pypy/dist/pypy/doc/dynamic-language-translation.html

has been reviewed and re-reviewed in the last couple of weeks and is definitely required reading for anyone who has an interest in the more theoretical side of PyPy.

Gtbg Sprint in December

Hopefully very soon, we'll announce the next PyPy sprint... stay tuned!

This Week In PyPy for the week ending 2005-11-18:

Only two to go now...

This Week in PyPy 3

Introduction

This is the third of what will hopefully be many summaries of what's been going on in the world of PyPy in the last week. I'd still like to remind people, when something worth summarizing happens, to recommend it for "This Week in PyPy" as mentioned on:

http://codespeak.net/pypy/dist/pypy/doc/weekly/

where you can also find old summaries.

There were about 60 commits to the pypy section of codespeak's repository this week.

SomePBC-refactoring

Work on the branch continued, to the point that the annotator now works but the scary mess of the RTyper still remains. We're still pleased with the ideas behind the branch -- the new annotator code has a good deal fewer hacks than the old (though it still has quite a few, of course).

Backend progress

There was a fair bit of light refactoring on the LLVM backend this week, including a recommendation to upgrade to the newly released LLVM 1.6. This gives slightly better performance, meaning that a new pypy-llvm is the closest to CPython performance we've gotten yet (still about 8 times slower, mind). The main change is that the list of operations that can raise exceptions is now shared with the genc backend, reducing duplication and maintenance overhead. Basically this means that the LLVM backend is and should remain compatible with the default pypy-c build (no threads, only the Boehm GC and no stackless features).

There was also progress on the JavaScript backend, mainly focussed on adding some of the stackless features currently sported by the C backend.

This Week In PyPy for the week ending 2005-11-25:

Only one more, then I'll be caught up...

This Week in PyPy 4

Introduction

This is the fourth of what will hopefully be many summaries of what's been going on in the world of PyPy in the last week. I'd still like to remind people, when something worth summarizing happens, to recommend it for "This Week in PyPy" as mentioned on:

http://codespeak.net/pypy/dist/pypy/doc/weekly/

where you can also find old summaries.

There were about 50 commits to the pypy section of codespeak's repository since the last summary (not quite a week).

SomePBC-refactoring

We attacked the RTyper quite a lot, which meant staring at some of the most obscure code in the codebase, and made substantial but incomplete progress (currently about 60% of the rtyper tests pass). We're optimistic that the majority of work is done on the branch, but there may be many strange details to cope with before translate_pypy runs again.

Sprint Preparation

The next sprint is less than two weeks away -- it's definitely time to be buying flights and booking accommodation if you're going to be there :)

LLVM progress

Richard implemented threading in the LLVM backend, bringing another feature that was previously pypy-c only in. Stacklessness next?

PyPy spreads

Christian returned from America, where he'd been consulting for a company that was reimplementing in RPython some systems originally written in Java and, after some effort, beating the Java versions for performance. This company had found out about PyPy and RPython from reading our mailing lists -- a nice example of how open development processes can work (and even make you money!).

Resource consumption

As part of being EU-funded, we have to keep track of the resources we use and have a slightly unusual problem: we haven't spent enough time or money in the first half of the project, and have to find something to do about this...

PyPy at conferences

All three talks on PyPy that we submitted to PyCon were accepted, so there will be talks from

  • Michael and Christian on the current state of PyPy (whatever that may be at the time :),
  • Holger and Armin on the architecture and future of PyPy, and
  • Bea and Holger on the methodology of PyPy and the issues around being EU funded.

Further to that, two papers were accepted for the Chaos Communication Congress in Berlin over the new year:

http://events.ccc.de/congress/2005/fahrplan/events/585.en.html
http://events.ccc.de/congress/2005/fahrplan/events/586.en.html

Again, one talk is on the technology of PyPy and the other on methodology/business issues.

So if you're going to a Python or hacker conference any time soon, you're likely to hear about PyPy :)

This Week In PyPy for the week ending 2005-12-02:

And finally, my blog is up to date with this week in PyPy.

This Week in PyPy 5

Introduction

This is the fifth of what will hopefully be many summaries of what's been going on in the world of PyPy in the last week. I'd still like to remind people, when something worth summarizing happens, to recommend it for "This Week in PyPy" as mentioned on:

http://codespeak.net/pypy/dist/pypy/doc/weekly/

where you can also find old summaries. I note in passing that the idea of keeping track of IRC conversations in the weekly summary has pretty much fizzled. Oh well.

There were about 230 commits to the pypy section of codespeak's repository in the last week (a busy one, it seems :-).

SomePBC-refactoring

We merged the branch at last! Finishing the branch off and getting translate_pypy running again seemed to mostly involve fighting with memoized functions and methods, and the "strange details" hinted at in the last "This Week in PyPy" were not so bad -- indeed once we got to the point of rtyping finishing, the backend optimizations, source generation, compilation and resulting binary all worked first time (there must be something to be said for this Test Driven Development stuff :).

If you recall from the second This Week in PyPy the thing that motivated us to start the branch was wanting to support multiple independent object spaces in the translated binary. After three weeks of refactoring we hoped we'd made this possible... and so it proved, though a couple of small tweaks were needed to the PyPy source. The resulting binary is quite a lot (40%) bigger but only a little (10%) slower.

CCC papers

As mentioned last week, two PyPy talks have been accepted for the Chaos Communication Congress. The CCC asks that speakers provide papers to accompany their talks (they make a proceedings book) so that's what we've done, and the results are two quite nice pieces of propaganda for the project:

http://codespeak.net/pypy/extradoc/talk/22c3/agility.pdf
http://codespeak.net/pypy/extradoc/talk/22c3/techpaper.pdf

It's still possible to attend the conference in Berlin, from December 27th to the 30th:

http://events.ccc.de/congress/2005

A number of PyPy people will be around and innocently mixing with people from other communities and generally be available for discussing all things PyPy and the future.

Where did PyPy-sync go?

What's a pypy-sync meeting? Apparently:

It's an XP-style meeting that serves to synchronize
development work and let everybody know who is
working on what.  It also serves as a decision
board of the PyPy active developers.  If discussions
last too long and decisions cannot be reached
they are delegated to a sub-group or get postponed.

pypy-sync meetings usually happen on Thursdays at 1pm CET on the #pypy-sync IRC channel on freenode, with an agenda prepared beforehand and minutes posted to pypy-dev after the meeting. Except that the last couple haven't really happened this way -- there was no agenda, only a few people turned up, and those were mostly the people who are in #pypy all week anyway.

So after the Göteborg sprint next week we're going to try harder to prepare and get developers to attend pypy-sync meetings again. This is especially important as we head towards more varied and less intrinsically related challenges such as a JIT compiler, integration of logic programming, GC, higher level backends and much more.

This Week In PyPy for the week ending 2005-12-09:

The first summary that gets posted to my blog at the same time as it gets sent out as email:

This Week in PyPy 6

Introduction

This is the sixth of what will hopefully be many summaries of what's been going on in the world of PyPy in the last week. I'd still like to remind people, when something worth summarizing happens, to recommend it for "This Week in PyPy" as mentioned on:

http://codespeak.net/pypy/dist/pypy/doc/weekly/

where you can also find old summaries. This week features the first IRC summary from Pieter Holtzhausen, a feature that will hopefully continue.

There were about 150 commits to the pypy section of codespeak's repository in the last week (a relatively small number for a sprint week -- lots of thinking going on here).

The Sprint!

This is covered in more detail in the sprint report, but seems to be going well. There has been work on the JIT, supporting larger integers and sockets in RPython, making the stackless option more useful, performance, compiler flexibility, documentation and probably even more.

IRC Summary

Thanks again to Pieter for this. We need to talk about formatting :)

Friday http://tismerysoft.de/pypy/irc-logs/pypy/%23pypy.log.20051202:

[00:04] Arigo states it is time to merge the PBC branch. Merging henceforth
        commences.
[15:46] Pedronis and mwh discuss the simplification of the backend
        selection of the translator. Some translator planning documents
        checked in later.

Saturday http://tismerysoft.de/pypy/irc-logs/pypy/%23pypy.log.20051203:

[15:45] Stakkars mentions the idea he posted to pypy-dev, that involves
        the substitution of CPython modules piecewise with pypy generated
        modules. Pedronis replies that he has thought of a similar
        approach to integrate pypy and Jython, but that this effort needs
        to be balanced with the fact that the pypy JIT currently needs
        attention.

Sunday http://tismerysoft.de/pypy/irc-logs/pypy/%23pypy.log.20051204:

[14:03] Stakkars asks about the necessity of 3 stacks in the l3interpreter
        that Armin has been working on. One for floats, ints and
        addresses. After remarks about easier CPU support, Arigo replies
        that there is simply no sane way to write RPython with a single one.
[18:26] Gromit asks how ready pypy is for production usage. He is
        interested in pypy as a smalltalk-like environment, since he deems
        object spaces to be reminiscent of smalltalk vm images.
[18:31] Stakkars states that he believes the project should postpone
        advanced technologies, in favour of getting the groundwork to a
        level where the project really becomes a CPython alternative.

Monday http://tismerysoft.de/pypy/irc-logs/pypy/%23pypy.log.20051205:

[01:44] Pedronis running counting microbenchmarks, one 4.7 times slower
        than CPython, the other one 11.3 times. Function calling takes
        its toll in the latter.

Tuesday, Wednesday:

[xx:xx] Sprint background radiation. Braintone rings like a bell. Not
        much to report.

Thursday http://tismerysoft.de/pypy/irc-logs/pypy/%23pypy.log.20051208:

[17:55] Stakkars guesses that RPython may get basic coroutine support, and
        is excited about that.
[18:05] Stakkars votes for having stackless enabled all the time. The
        advantages:
           - real garbage collection
           - iterator implementation without clumsy state machines
[20:19] Rhamphoryncus wonders whether dynamic specialization (e.g. psyco)
        can possibly improve memory layout.
[20:46] Sabi is glad that long long is now supported (courtesy of mwh and
        Johahn). He yanks out his workaround.

ukc's climbing log:

This is quite funky, if a little heavy on the Javascript. I've been adding to my log but as I'm in Goteborg and don't have my guidebooks with me the dates are all a bit approximate. I should also be sprinting and not spending my time fooling on a climbing website...

This Week In PyPy for the week ending 2005-12-16:

Better late than never, I guess.

This Week in PyPy 7

Introduction

This is the seventh summary of what's been going on in the world of PyPy in the last week. I'd still like to remind people, when something worth summarizing happens, to recommend it for "This Week in PyPy" as mentioned on:

http://codespeak.net/pypy/dist/pypy/doc/weekly/

where you can also find old summaries.

There were about 110 commits to the pypy section of codespeak's repository in the last week.

The Sprint!

The last weekly summary was written towards the end of the sprint. The things we did in the couple of remaining days were written up in the second sprint report:

http://codespeak.net/pipermail/pypy-dev/2005q4/002660.html

Apart from continuing our work from the first half of the sprint, the main new work was implementing __del__ support in the translated PyPy.

IRC Summary

Thanks again to Pieter for this.

Monday http://tismerysoft.de/pypy/irc-logs/pypy/%23pypy.log.20051212:

[00:26] Stakkars says that it is great that pypy does not punish you for
        indirection. He is of the opinion that he writes better style in RPython
        than in Python, because the "it is slow" aspect is gone.

Tuesday http://tismerysoft.de/pypy/irc-logs/pypy/%23pypy.log.20051213:

[21:01] Heatsink says that he is doing some dynamic optimizations in CPython.
        This turns into a discussion about the nature of pypy, and Arigo takes
        us on a tour of how pypy and the JIT will interact in the future. A
        good read of general pypy ideas.

Thursday http://tismerysoft.de/pypy/irc-logs/pypy/%23pypy.log.20051215:

[10:24] Ericvrp discovers an optimization that makes pypy 6.8x slower than
        CPython on the richards test suite. All if-elses are converted to
        switches. Cfbolz replies that it is time to write a graph
        transformation to implement this optimization officially.

PyPy's Bytecode Dispatcher

Something that was suggested at the last sprint, but that we never got around to, was to modify the translation process so that the bytecode dispatch loop of the interpreter used a C switch rather than a table of function pointers.

The bytecode implementation code in PyPy builds a list of functions that contain the implementation of the respective bytecode. Up until a few days ago, the dispatch function retrieved the correct function by using the bytecode as an index into this list. This was turned by the translator and the C backend into an array of function pointers. This has the drawback that the bytecode-implementing functions can never be inlined (even though some of them are quite small) and there always is a read from memory for every bytecode executed.
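In miniature, table-based dispatch looks something like the following. This is a toy three-opcode machine invented for illustration, not PyPy's interpreter, and it is written for present-day Python 3.

```python
# Hypothetical three-opcode machine, for illustration only.
PUSH_ONE, ADD, DOUBLE = 0, 1, 2

def op_push_one(stack):
    stack.append(1)

def op_add(stack):
    right = stack.pop()
    left = stack.pop()
    stack.append(left + right)

def op_double(stack):
    stack.append(stack.pop() * 2)

# The bytecode value is used as an index into this list of functions.
DISPATCH_TABLE = [op_push_one, op_add, op_double]

def run(bytecodes):
    stack = []
    for opcode in bytecodes:
        # One table lookup and one indirect call per bytecode executed.
        DISPATCH_TABLE[opcode](stack)
    return stack

print(run([PUSH_ONE, PUSH_ONE, ADD, DOUBLE]))
```

Once translated to C, the list becomes an array of function pointers, and a call through a pointer is opaque to an inliner, however small the function on the other end.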

During the Gothenburg sprint we discussed a strategy to transform the dispatch code into something more efficient, and in the last week Eric, Arre and Carl Friedrich implemented it. Now the dispatching is done by a huge (automatically generated, of course) chain of if/elif/else that all test the value of the same variable. In addition there is a transformation that turns chains of such if/elif/else blocks into a single block that has an integer variable as an exitswitch and links with exitcases corresponding to the different values of that variable. The C backend converts such a block into a switch. This technique also makes it possible for our inliner to inline some of the bytecode-implementing functions. Using the new dispatcher pypy-c got 10% or so faster (though the first time we ran it it was much much faster! Benchmarking is hard).
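The chain-merging step can be sketched on a drastically simplified model of flow graph blocks. The Test and Switch classes and the merge_chain function below are invented for this illustration (present-day Python 3) and are not the data structures PyPy's translator actually uses.

```python
from dataclasses import dataclass, field

@dataclass
class Test:
    """A block that compares `var` to `value`; hypothetical, much simplified."""
    var: str
    value: int
    if_true: object    # target reached when var == value
    if_false: object   # next Test in the chain, or the final else target

@dataclass
class Switch:
    """A block with an integer exitswitch and one exit case per value."""
    var: str
    cases: dict = field(default_factory=dict)
    default: object = None

def merge_chain(block):
    """Collapse a chain of Tests on the same variable into a single Switch."""
    if not isinstance(block, Test):
        return block
    switch = Switch(var=block.var)
    current = block
    while isinstance(current, Test) and current.var == switch.var:
        # Keep the first target seen for a value, as the if/elif chain would.
        switch.cases.setdefault(current.value, current.if_true)
        current = current.if_false
    switch.default = current
    return switch

# if opcode == 0: push_one  elif opcode == 1: add  elif opcode == 2: double  else: bad_opcode
chain = Test("opcode", 0, "push_one",
             Test("opcode", 1, "add",
                  Test("opcode", 2, "double", "bad_opcode")))
switch = merge_chain(chain)
print(switch.cases)
print(switch.default)
```

A backend that sees a single block with one exit per integer value can emit a C switch for it directly, instead of a ladder of comparisons.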

Preparations for EU-review still ongoing

Many developers are still involved in preparations for the EU review on 20th January. Reports are being finalized and there are discussions about various issues that are only indirectly related to the development efforts (in so far as it provides the basis for the partial funding we receive). We probably will only know on the 20th if everything works out suitably.


Unless otherwise noted, all content licensed by Michael Hudson
under a Creative Commons License.