JIT progress
In the last few days I finally understood how to do virtualizables. Now the frame overhead is gone. This was done with the help of discussions with Samuele, porting ideas from PyPy's first JIT attempt.
This is of course work in progress, but it works in PyPy (modulo a few XXXs, but no bugs so far). The performance of the resulting code is quite good: even with Boehm (the GC that is easy to compile to but gives a slowish pypy-c), a long-running loop typically runs 50% faster than CPython. Moreover, that is only the "baseline" speed: we will get better speed-ups by applying optimizations to the generated code. That work is in progress, and it suddenly became easier because the optimization phase no longer has to consider virtualizables; they are now handled earlier.
Update: Virtualizables are basically a way to avoid frame overhead. The frame object is still allocated and has a pointer to it, but the JIT is free to unpack its fields (for example Python-level locals) and store them somewhere else (on the stack or in registers). Each external (outside the JIT) access to a frame managed by the JIT needs to go through special accessors that can ask the JIT where those variables are.
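To make the idea a bit more concrete, here is a toy model in plain Python. It is only a sketch of the concept, not how PyPy implements it: the names `Frame`, `JitState`, `enter_jit`, `leave_jit` and `get_local` are all made up for this illustration, and a dictionary stands in for the stack slots or registers.

```python
class JitState:
    """Hypothetical stand-in for the JIT's private storage (stack or registers)."""

    def __init__(self):
        self.slots = {}


class Frame:
    """Toy frame: always allocated, but its locals may live somewhere else."""

    def __init__(self, **local_vars):
        self._locals = dict(local_vars)  # the ordinary, heap-stored fields
        self._jit = None                 # set while the JIT manages this frame

    def enter_jit(self, jit):
        # The JIT unpacks the fields and takes ownership of them.
        jit.slots = self._locals
        self._locals = None
        self._jit = jit

    def leave_jit(self):
        # Write the values back so the frame is an ordinary object again.
        jit = self._jit
        self._locals = jit.slots
        jit.slots = {}
        self._jit = None

    def get_local(self, name):
        # Special accessor: outside code never touches _locals directly,
        # it asks the JIT where the variable currently lives.
        if self._jit is not None:
            return self._jit.slots[name]
        return self._locals[name]


frame = Frame(i=0, total=0)
jit = JitState()
frame.enter_jit(jit)

# The "compiled loop" works on the JIT's slots and never goes through the frame.
for _ in range(5):
    jit.slots["total"] += jit.slots["i"]
    jit.slots["i"] += 1

# An external access in the middle of JIT-managed execution still sees
# the current value, because it goes through the accessor.
seen_from_outside = frame.get_local("total")
frame.leave_jit()
```

The point of the design is that the fast path pays nothing for the frame, while the rare outside access pays the cost of asking the JIT.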
Comments
I have no clue what you're talking about, but it sounds great! Keep it up!!
What are virtualizables?
From what I understand, virtualizables are objects that you use to represent objects that are expensive to construct. For example, frame objects in Python are very expensive, so they are virtualizables: if a function is executed and doesn't try to access its frame object, the frame is never created.
Probably Armin can give a more precise answer.
What I want to know is: couldn't CPython have virtualizables for frame objects? I guess the answer is that it could, but it would involve a lot of C code.
OK, I updated the post with a quick explanation of what virtualizables actually are. Leonardo: you need a compiler in the first place for that :-) Psyco has some kind of virtualizables (but Psyco frames are read-only).
Cheers,
fijal
Could you use virtualizables to avoid constructing the frame at all, and then only allocate it if it is accessed?
@Leonardo:
I'm guessing that yes, CPython COULD have virtualizables. However, the people who built CPython a) didn't know about them, b) didn't know how to code that in "C", or c) didn't consider it a priority item.
Either way, these are the kinds of advantages I would imagine writing Python in Python would expose. Optimize what you need to, and then start to see the real ROI of PyPy!
@Ben: no. In the current incarnation, the JITs generated by PyPy optimize only hot loops, i.e. loops that have been executed more than N times. At that point, the frame object has already been allocated.
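A rough sketch of what "hot" means here, in plain Python. This is not PyPy's actual mechanism: `LoopProfiler`, `on_loop_iteration` and the threshold value are invented for the illustration, with the threshold standing in for N.

```python
HOT_THRESHOLD = 3  # stands in for "N"; the value is made up


class LoopProfiler:
    """Hypothetical counter that decides when a loop is worth compiling."""

    def __init__(self, threshold=HOT_THRESHOLD):
        self.threshold = threshold
        self.counts = {}
        self.compiled = set()

    def on_loop_iteration(self, loop_id):
        """Return True if this iteration runs in (pretend) compiled code."""
        if loop_id in self.compiled:
            return True
        # Still interpreted: the frame already exists as a normal object here.
        self.counts[loop_id] = self.counts.get(loop_id, 0) + 1
        if self.counts[loop_id] > self.threshold:
            self.compiled.add(loop_id)  # later iterations use the fast path
        return False


profiler = LoopProfiler()
modes = [profiler.on_loop_iteration("some_loop") for _ in range(6)]
```

The first iterations all run in the interpreter, which is why the frame is already there by the time the JIT kicks in.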
The real advantage of virtualizables is that they:
1) allow the JIT to produce very fast code, as if the frame weren't allocated at all (e.g. by storing local variables on the stack or in registers)
2) don't compromise compatibility with CPython; in particular, sys._getframe() & co. still work fine, because the JIT knows how and when to synchronize the virtualizable (i.e., the frame) with the values that are on the stack.
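For reference, this is the kind of outside access to a frame that has to keep working. The snippet only uses `sys._getframe()` and `f_locals` as they exist in CPython; the helper names `peek_caller_locals` and `hot_loop` are made up for the example.

```python
import sys


def peek_caller_locals():
    # sys._getframe(1) is the frame of whoever called this function.
    # Copy the locals into a plain dict so we get a snapshot.
    return dict(sys._getframe(1).f_locals)


def hot_loop(n):
    total = 0
    for i in range(n):
        total += i
    # An "external" look at this frame. Under a JIT, total and i might live
    # in registers, so they must be synchronized back before this is answered.
    seen = peek_caller_locals()
    return total, seen


total, seen = hot_loop(10)
```

Whatever the JIT did with `total` and `i` inside the loop, the snapshot has to agree with the values the function itself sees.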
@gregturn: I don't see how you can implement something similar to virtualizables without writing a compiler, and CPython is not such a thing :-)