PyPy is an absolutely amazing piece of technology, and the core developers are brilliant people - however, their main use case is not numeric Python.
Their framework is (theoretically) able to trace any Python operation, no matter how dynamic, and speed it up. This means that if you want to speed up pure Python code, PyPy is really the only game in town.
The downside is that when scientists use Python, it is typically as a beautiful API on top of highly optimized code written in Fortran or C. If you want to write numerically heavy code in Python, you are much, much better off using numba. numba is much less ambitious than PyPy - it handles a small subset of Python (basically, NumPy) - but it is very, very good at speeding up pure NumPy code. In my experience, 100x speedups (over pure NumPy code) are not that uncommon.
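To make that concrete, here is a toy sketch (my own example, not from any real benchmark) of the kind of loop-heavy kernel numba compiles well - the sort of code that is awkward or memory-hungry to express in vectorized NumPy:

    # Toy example: pairwise Euclidean distances, written as plain loops.
    # numba's @njit compiles this to native code; pure NumPy would need
    # broadcasting tricks that allocate large temporaries.
    import numpy as np
    from numba import njit

    @njit
    def pairwise_dist(X):
        n, d = X.shape
        out = np.empty((n, n))
        for i in range(n):
            for j in range(n):
                s = 0.0
                for k in range(d):
                    diff = X[i, k] - X[j, k]
                    s += diff * diff
                out[i, j] = s ** 0.5
        return out

    X = np.random.rand(500, 3)
    D = pairwise_dist(X)  # first call compiles; later calls run at native speed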
As you stated, targeting numerical code (array access and primitive operations) versus general code (with modern features such as virtual dispatch, varargs, generics, or even metaprogramming) is a completely different problem in terms of both difficulty and solutions.
It is indeed fairly easy to create a highly performant JIT for numerical operations on N-dimensional arrays (vectors, matrices, etc.); and depending on your field, that might represent 99% of the execution time.
For example, I wrote a really simple JIT compiler for Matlab which sometimes performs better than raw C (it's backed by LLVM's vectorizer to generate the correct SIMD assembly). Link to the master's thesis if you are interested: https://www.dropbox.com/s/caz7d4d08xhbwcu/thesis.pdf?dl=0
I'm not really going to argue with you - if you can make numba perform, great! We're aiming at providing a middle ground: situations where the code is heavy on numerics, but complex enough that numba either can't handle it at all or doesn't perform very well. That covers some use cases and, from what you're saying, very likely not yours, but that's ok. We don't have to cater to everyone, and there is a place both for tools like PyPy (also in numerics) and for tools like numba :-)
PyPy is really impressive - I love the idea of getting all these optimisations (SIMD, STM) for free... but as you say, numerical work means NumPy, SciPy, and pandas, which don't work with PyPy. Even if the NumPyPy project were able to fully match the NumPy API, you would still have a lot of large projects like pandas that depend on the C API, and it would be stupid to copy everything. Perhaps something can be worked out between PyPy, cffi, and Cython.
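For context on why cffi keeps coming up here: it binds to C through a declared ABI rather than through the CPython C API, so the same binding runs on both CPython and PyPy. A minimal sketch of standard cffi ABI-mode usage (nothing specific to this thread; library-name resolution in dlopen is platform-dependent):

    # Call libm's hypot() without touching the CPython C API.
    from cffi import FFI

    ffi = FFI()
    ffi.cdef("double hypot(double x, double y);")  # declare the C signature we need
    libm = ffi.dlopen("m")                         # load libm ("m" works on Linux)

    print(libm.hypot(3.0, 4.0))                    # -> 5.0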
In the short term, Numba is much more practical for numerics. In the longer term, Pyston looks promising - it's actually similar to Numba in that it also uses LLVM, and I imagine there could be synergy between the two...
NumPy is essentially a protocol: it dictates the memory layout of multidimensional arrays, combined with really fast Fortran code that knows that layout. What needs to be copied is that memory-layout protocol, so that we get n:m sharing instead of n^2 duplication.
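In fact, a form of that protocol already exists: NumPy's __array_interface__ describes an array's shape, dtype, strides, and data pointer, so any object exposing it can be wrapped without copying the memory. A minimal sketch with a hypothetical wrapper class (my example, not from the thread):

    import numpy as np

    class SharedBuffer:
        # Minimal object exposing the array-interface protocol over
        # someone else's memory (here, borrowed from `a` below).
        def __init__(self, arr):
            self._owner = arr  # keep the real owner of the memory alive
            self.__array_interface__ = arr.__array_interface__

    a = np.arange(12, dtype=np.float64).reshape(3, 4)
    view = np.asarray(SharedBuffer(a))  # wraps the same buffer, no copy
    view[0, 0] = 99.0
    assert a[0, 0] == 99.0              # the write is visible through `a`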
My point was that even if you copy the NumPy protocol, you still have huge projects that depend on C extensions that you wouldn't want to port.
Pandas is one; others include scikit-learn, scikit-image, Astropy, bioinformatics libraries, stats libraries, etc., all of which make heavy use of C/Cython and depend to varying degrees on the Python C API. Porting NumPy barely scratches the surface of scientific Python.
Biopython runs on PyPy. It also runs under Jython. While it has some C use, it is not "heavy" C use.
Going off on a tangent, and though I realize it's a lost battle, I wish people would stop saying that NumPy is the foundation of scientific programming in Python. As Biopython shows, it isn't required for at least some of bioinformatics.
My own research[1] deals with chemical graphs, and NumPy/SciPy/etc. are nearly irrelevant to that research.
[1] For example, given a set of 100 structures, what is the largest substructure (based on the number of bonds) which is in at least 90 of the structures?
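To illustrate why this kind of work is graph-shaped rather than array-shaped, here is a deliberately simplified sketch (hypothetical data model, not the author's code). It only counts which individual bonds occur in at least 90% of structures - the real problem of finding the largest common connected substructure needs a proper MCS search - but even this first step is all sets and dicts, with no NumPy in sight:

    from collections import Counter

    # Each structure: a set of bonds; each bond: a frozenset of two atom labels.
    structures = [
        {frozenset(("C1", "C2")), frozenset(("C2", "O1"))},
        {frozenset(("C1", "C2")), frozenset(("C2", "N1"))},
        # ... more structures
    ]

    bond_counts = Counter(bond for s in structures for bond in s)
    common = {bond for bond, n in bond_counts.items()
              if n >= 0.9 * len(structures)}
    print(common)  # bonds shared by at least 90% of the structures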
The great thing is that Python JIT research is happening on multiple fronts, and maybe someday, in the distant future, CPython won't matter any longer (except for legacy deployments).
The founder of Continuum (Travis Oliphant) wrote blog posts about his technical vision for a Python JIT: http://technicaldiscovery.blogspot.it/2012/08/numba-and-llvm... and http://technicaldiscovery.blogspot.it/2012/07/more-pypy-disc... Basically, the Continuum team made a big bet that a very efficient JIT targeting only numerical Python would be more useful than a generic JIT that can theoretically handle all of Python. For my use case (scientific coding), numba is far superior.