Installing the numpy module in PyPy

While working on the High Performance Python book (mailing list here for our occasional announcements) I’ve reinstalled PyPy a couple of times, and each time I forget how to install the numpy module. Note that PyPy’s numpy is different from, and much smaller than, CPython’s numpy. It does however work for smaller problems if you just need some of the core features (i.e. not the libs that numpy wraps). It used to be included in a branch; now it comes as a separate package.

I’m posting this as a reminder to myself and maybe as a bit of help to another intrepid soul. The numpy PyPy install instructions are in this Nov 2013 blog post. First download PyPy from here and just extract it somewhere. You then need to clone their numpy repo and install it as a regular module using the “setup.py” that’s provided (it takes just a couple of minutes and installs fine).

Once you have PyPy you’ll also need pip. Follow the get-pip instructions but use “bin/pypy get-pip.py” to install pip; you can then use “bin/pip install git+https://bitbucket.org/pypy/numpy.git” as per their instructions.

NOTE: if you get “ImportError: No module named _numpypy” after all of this, you may be using pypy3. As of June 2015 pypy3 doesn’t support numpy.
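If you hit that ImportError, a quick check of which interpreter you are actually running can save some head-scratching. The snippet below is only a minimal sketch using the standard library's platform and sys modules; the helper names describe_interpreter and check_numpy are made up for this post and are not part of PyPy or numpy.

```python
import platform
import sys


def describe_interpreter():
    """Return (implementation, version), e.g. ('PyPy', '2.7.3')."""
    impl = platform.python_implementation()
    version = "%d.%d.%d" % sys.version_info[:3]
    return impl, version


def check_numpy():
    """Try to import numpy; return (numpy_version, error_message).

    Exactly one of the two values is None. Hypothetical helper for
    illustration only.
    """
    try:
        import numpy
    except ImportError as exc:
        return None, str(exc)
    return numpy.__version__, None


if __name__ == "__main__":
    impl, version = describe_interpreter()
    print("Running %s on Python %s" % (impl, version))
    numpy_version, error = check_numpy()
    if error is None:
        print("numpy %s imported fine" % numpy_version)
    else:
        print("numpy import failed: %s" % error)
```

Save it as a script and run it with “bin/pypy” in front. If it reports Python 3 alongside the _numpypy error, you are most likely on pypy3.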

Having installed it I can import numpy and check its version:
$ ../bin/pypy 
Python 2.7.3 (87aa9de10f9c, Nov 24 2013, 18:48:13)
[PyPy 2.2.1 with GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
And now for something completely different: ``it seems to me that once you
settle on an execution / object model and / or bytecode format, you've already
decided what languages (where the 's' seems superfluous) support is going to be
first class for''
>>>> import numpy as np
>>>> np.__version__
'1.8.0.dev-74707b0'

From here I can use the random module and do various vectorized operations. It isn’t as fast as CPython’s numpy for the Pi example I’m working on, but it does work. Does anyone know which parts offer comparable speed to its bigger brother?
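For anyone wanting to try something similar, here is a rough sketch of a vectorized Monte Carlo estimate of Pi. It is not the code from the book, just an illustration of the kind of random-number and vectorized work described above. It assumes that np.random.seed, np.random.uniform and np.sum are available in your numpy build; they are in CPython's numpy, and I haven't checked every PyPy numpy version. The function name estimate_pi and its parameters are made up for this example.

```python
import numpy as np


def estimate_pi(n_samples, seed=0):
    """Estimate Pi by sampling points uniformly in the unit square.

    The fraction of points that land inside the quarter circle of
    radius 1 approximates Pi / 4.
    """
    np.random.seed(seed)  # fixed seed so repeated runs are comparable
    xs = np.random.uniform(0.0, 1.0, n_samples)
    ys = np.random.uniform(0.0, 1.0, n_samples)
    # Vectorized test: True where the point falls inside the quarter circle
    inside = (xs * xs + ys * ys) <= 1.0
    return 4.0 * np.sum(inside) / float(n_samples)


if __name__ == "__main__":
    print(estimate_pi(1000000))
```

Timing the same script under “bin/pypy” and under CPython gives a simple side-by-side comparison of the two numpy implementations.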


Ian applies Data Science as an AI/Data Scientist for companies through ModelInsight and his Mor Consulting; sign up for Data Science tutorials in London. He also founded the image and text annotation API Annotate.io, lives in London and is a consumer of fine coffees.

8 Comments

  • John M. Camara
    Right now they are more concerned about numpy compatibility than speed, so I'm not sure there are any features in the numpy implementation that are faster, outside of the parts of numpy that are written purely in Python. I would expect, at this time, that projects which only lightly use numpy features would see a speed-up using PyPy. For a better answer to this you may want to have a chat with the PyPy devs on the #pypy IRC channel. Talk to either fijal or rguillebert.
  • Hi John. I'm preparing some code for Fijal to diagnose the slower than expected running speed of my example, cheers.