Numba: NumPy-aware optimizing compiler for Python (github.com/numba)
64 points by pash on Aug 25, 2012 | 23 comments


So maybe to clarify a few things. This is just sensationalism - numba is not a Python compiler. It's a compiler from a restricted subset of Python to LLVM. I don't think the subset is very well defined yet, but I would expect it to be at some point in the future. The point is that you can choose what to compile and what not to, which is very convenient for a lot of numerics.

The whole approach might be viable and it definitely has use cases, but the sensational headline makes it look so bad :/ Hacker News - you let me down.


You mean sort of like how PyPy is a JIT for only a restricted subset of Python?


No, PyPy JITs all of Python by design: it runs all of Python and JITs all of its constructs. It does not make every possible Python program magically faster, but that does not change my point. A JIT for only a restricted subset of Python would mean that it segfaults, crashes, or gives a wrong answer if you feed it something outside that subset. That is not the case with PyPy.


So basically you're saying that PyPy knows what code it can optimize and Numba requires the programmer to specify it. That probably has more to do with Numba's being version 0.2 than anything else. And it's hardly a reason to moan that it's not a "real" compiler.


I did not call it not a "real" compiler (with or without quotes); I said it is not a Python compiler. It has nothing to do with 0.2 - supporting all of Python is probably not even among the goals of the numba project. I'm not talking about "all the bytecodes", but about making sure all the corner cases work correctly, etc. Supporting all cases would also mean giving up a lot of the speed benefits you see right now, which exist precisely because you don't have to care about all the corner cases. Predicting the future is obviously hard and I can't speak on behalf of the numba authors, but I don't think there are plans to support all of Python.


Python code goes in, LLVM IR comes out. Sounds like a compiler to me.

As you know, the basic idea of the project is to make writing fast, vectorized code for numerical and scientific computations as easy as writing native Python. And to do it in a way that maintains compatibility with the extensive CPython universe.

I think the criticism you're fishing for is that it is not an implementation of Python. No, it's not that. But, yes, it is a compiler.


No, it's not. There is one more requirement: the LLVM IR has to behave exactly as the Python language intends (in this case, as CPython does). That is not the case here - the coverage is not complete, and what is implemented does not handle all the corner cases of the language, starting with overflow to long.
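
To make the overflow point concrete: CPython ints silently promote to arbitrary precision, so the same source code means something different once a variable is pinned to a native machine integer. A rough illustration of the semantics gap (not numba's actual output, just the corner case I mean):

    # CPython semantics: integers promote to arbitrary precision
    # ("overflow to long"), so this is exact:
    def grow():
        x = 1
        for _ in range(40):
            x *= 1000        # ends up far beyond 2**63
        return x

    print(grow())            # 1000**40, computed exactly by CPython

    # A compiler that pins x to a native 64-bit integer (illustrative,
    # not necessarily what numba emits) would wrap around or trap on
    # the same source code instead of returning 1000**40.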


Just asking... how would it compare to PyPy's RPython?


Different purpose, different restrictions. Numba is written to support fast NumPy ufuncs (without resorting to C) and the like. RPython is a proper subset of Python (so every corner case either works or is declared "not RPython"; you can't get a different answer), which supports more of the language but with less integration with the host interpreter.


That restricted subset, RPython, is what PyPy itself is implemented in; but PyPy can run arbitrary Python code.


Here is the post on Travis Oliphant's blog about numba:

http://technicaldiscovery.blogspot.com.es/2012/08/numba-and-...

As it says there, it's still early software and its "road-map is being defined right now by the people involved in the project". Sure, the subset is not well defined now and there are no docs, but hey - let's give the thing a few months.

Haven't tested thoroughly yet, but I think no NumPy calls can be made from inside a numba-compiled function - and this is probably the case for other Python modules as well.


Actually, some NumPy calls can be made inside a numba-compiled function, as there is nascent support for the NumPy C-API --- it translates the call into a C call.

In the future, we want to support all of Python, but not necessarily optimized --- just using the Python object C-API.
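
Roughly, the shape of it is something like the sketch below: decorate the function and call into NumPy from inside it. Coverage is still in flux, so treat the specific calls that get translated (np.zeros here) as an example rather than a guarantee; the tests show what actually works today.

    from numba import jit     # decorator name per the repo; exact
    import numpy as np        # arguments may differ at this version

    @jit
    def row_sums(a):
        # np.zeros is called from inside the compiled function; whether
        # this specific call is translated yet may vary by version
        out = np.zeros(a.shape[0])
        for i in range(a.shape[0]):
            for j in range(a.shape[1]):
                out[i] += a[i, j]
        return out

    print(row_sums(np.arange(12.0).reshape(3, 4)))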


Here is a talk from SciPy 2012 about Numba: http://www.youtube.com/watch?v=WYi1cymszqY


Here is a mailing list thread with the results of a simple benchmark: https://groups.google.com/a/continuum.io/forum/m/#!msg/numba...

Looks promising!


How well does this work with other Python modules? Can I use this to compile _every_ Python program to a native executable?


No.

In any case, doing that wouldn't be difficult: you just wrap a Python interpreter and the Python code in a single executable and you're done. What would make it worthwhile is an optimizing compiler, which is of course difficult for such a dynamic language.


The GitHub README (https://github.com/numba/numba) explains how to install Numba, but not how to use it.

Or is it used automatically for all Python byte code? If so, can it be disabled?


The main way to use it right now is by decorating the functions you want it to compile. See the examples directory [0] in the GitHub repository.

0. https://github.com/numba/numba/tree/master/examples
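
Roughly what the examples look like (the exact decorator spelling and arguments may differ at 0.2, so treat this as a sketch):

    from numba import jit
    import numpy as np

    @jit                      # only this function gets compiled;
    def sum2d(arr):           # the rest of the program stays on CPython
        total = 0.0
        for i in range(arr.shape[0]):
            for j in range(arr.shape[1]):
                total += arr[i, j]
        return total

    print(sum2d(np.ones((500, 500))))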


The tests suggest that it is only used where you tell it to be used: https://github.com/numba/numba/tree/master/tests


I'm a big user of SciPy's sparse module. Anyone know if this is compatible?


"...It uses the remarkable LLVM compiler infrastructure to compile Python byte-code to machine code especially for use in the NumPy run-time and SciPy modules..." - from the README on GitHub


And the PyPy authors told us that Python would be too dynamic for ahead-of-time compilation. This is awesome.


This is not a Python compiler; it's a compiler for a subset of Python (this exercise has been done quite a few times already).



