News from the JIT front
As usual, progress is going slower than predicted, but we're nevertheless working hard on it.
We recently managed to make our nice GCs cooperate with our JIT. This is one point from our detailed plan. As of now, we have a JIT with GCs and no optimizations. It already speeds some things up, while slowing others down. The main reason for this is that the JIT generates assembler which is kind of OK, but it does not perform the same level of optimizations that gcc would.
So the current status of the JIT is that it can produce assembler out of executed Python code (or any interpreter written in RPython, actually), but the results are not of high enough quality yet, since we're missing optimizations.
The current plan looks as follows:
- Improve the handling of GCs in the JIT by inlining the malloc fast path; this should speed things up by a constant, though not very big, factor (a sketch of such a fast path follows this list).
- Write a simplified Python interpreter, which will serve as a base for experiments and as a way to make sure that our JIT does the right thing with regard to optimizations. It would work as a mid-level integration test.
- Think about ways to inline loop-less Python functions into their parent's loop (see the second sketch below).
- Get rid of frame overhead (using virtualizables; see the third sketch below)
- Measure, write benchmarks, publish
- Profit
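To make the first point more concrete, here is a minimal Python sketch of what "inlining the malloc fast path" means, assuming a simple bump-pointer nursery allocator. All the names here (`nursery_free`, `nursery_top`, `collect_nursery`) are illustrative only, not PyPy's actual GC interface:

```python
# Illustrative sketch only: a bump-pointer nursery allocator.  The names
# and layout are hypothetical, not PyPy's real GC internals.

NURSERY_SIZE = 4 * 1024 * 1024
nursery_free = 0
nursery_top = NURSERY_SIZE

def collect_nursery():
    # stand-in for a real minor collection; a real GC would move the
    # surviving objects out of the nursery here
    pass

def malloc_slow(size):
    # slow path: collect the nursery, then retry the allocation.
    # (A real allocator would also handle objects too large for the
    # nursery separately.)
    global nursery_free
    collect_nursery()
    nursery_free = 0
    return malloc_fast(size)

def malloc_fast(size):
    # fast path: just a bounds check plus a pointer bump.  These few
    # instructions are what the JIT can emit inline in its generated
    # assembler, instead of calling into the GC on every allocation.
    global nursery_free
    result = nursery_free
    if result + size > nursery_top:
        return malloc_slow(size)
    nursery_free = result + size
    return result
```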
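The point about inlining loop-less functions can be pictured with a toy example (plain Python, purely illustrative):

```python
def square(x):
    # a loop-less helper: no loop inside, so a tracing JIT can pull its
    # body straight into the trace of whatever loop calls it
    return x * x

def sum_of_squares(n):
    total = 0
    i = 0
    while i < n:             # the loop the tracing JIT compiles
        total += square(i)   # after inlining, this call (and its frame)
        i += 1               # disappears from the compiled loop
    return total
```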
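Finally, a minimal sketch of the frame-overhead point, assuming a toy frame object (real virtualizables in PyPy are considerably more involved): the idea is that the compiled loop keeps the frame's fields in registers and only writes them back when the frame is actually needed as an object.

```python
class Frame(object):
    # toy stand-in for an interpreter's frame object
    def __init__(self, a, b):
        self.a = a
        self.b = b

def loop_unoptimized(frame):
    # without virtualizables: every access goes through the
    # heap-allocated frame object
    while frame.a > 0:
        frame.b += frame.a
        frame.a -= 1
    return frame.b

def loop_with_virtualizable_effect(frame):
    # roughly the effect the JIT achieves instead: the fields live in
    # locals (i.e. registers) for the duration of the loop...
    a = frame.a
    b = frame.b
    while a > 0:
        b += a
        a -= 1
    # ...and are written back only when the frame escapes or the loop
    # exits
    frame.a = a
    frame.b = b
    return b
```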
fijal
Comments
Nice to see the progress on the PyPy JIT!!
Do you expect to produce a JIT faster than Unladen Swallow's LLVM-based one?
Thanks for all the hard work, guys. Keep it up!
Ah, this JIT business is so exciting!
I am not really sure how this plan relates to the roadmap that was presented in April.
How this plan relates: it does not. Fijal's style is to give the current idea of the plans. Don't believe him too much :-) This plan and April's plan somehow need to be combined, or something :-)
Unladen Swallow's LLVM JIT is a very different beast: it compiles each Python function as a unit. You can only get a uniform bit of speedup this way (maybe 2-3x). By contrast, what we are doing gives a non-uniform speedup: like Psyco, we will probably obtain speedups between 2x and 100x depending on the use case.
(Of course the plan is to be faster than Psyco in the common case :-)
Armin: regarding Unladen Swallow, does this approach prevent coming up later with a tracing JIT? Or could it be done on top of it?
Sweet !! Good luck guys :)
No no no no, trust me :-) The thing is that I'm trying to present the "current plan" as live as it can be, which means we might change our minds completely. But otherwise, the whole blog would be mostly empty and boring...
Cheers,
fijal
Could you please elaborate on the second point, about a simplified Python interpreter?
Also, wouldn't it be better to restructure the plan as follows?
- Improve the handling of GCs in the JIT by inlining the malloc fast path; this should speed things up by a constant, though not very big, factor.
- Measure, write benchmarks
- Write a simplified Python interpreter, which will serve as a base for experiments and as a way to make sure that the JIT does the right thing with regard to optimizations. It would work as a mid-level integration test.
- Think about ways to inline loop-less Python functions into their parent's loop.
- Measure, publish benchmarks, RELEASE 1.2
- Get rid of frame overhead (using virtualizables)
- Measure, publish benchmarks
- Iterate...
Concerning current ideas vs. April's roadmap: I understand that plans change, and that's OK of course. But as April's roadmap isn't mentioned at all, I have no idea how the current ideas relate to it: do they replace the old roadmap or parts of it? Are they additional ideas, with the old roadmap postponed? Or are they a detailing of (parts of) April's roadmap? Maybe that's obvious to people with better PyPy knowledge than me. I understand from Armin's comment that they are additional ideas.
Keep up the good work!
Branko
What about threading? Will we have a GIL-less interpreter in the end (assuming the GCs support that)?