High Performance Python Tutorial v0.1 (from my 4-hour tutorial at EuroPython 2011)

UPDATE – the v0.2 High Performance Python tutorial is now available.

I enjoyed running a 4-hour tutorial on High Performance Python at EuroPython last week (great event, guys!). The class was limited to 40 people, and I’d love for more people to benefit from the several weeks of work that went into it, so I’ve written it up as a 49-page PDF (license: Creative Commons By Attribution).

This is v0.1, so please take a look and give me feedback so that I can release an improved v0.2 within a few weeks. Is anything missing? Sure! A couple of sections just have source code (no write-up) and there’s a bunch of IAN_TODO markers for me to complete in the next revision. The 49 pages should still have something useful for you to chew on, though.

Download “High Performance Python v0.1 (pdf)” and send me your feedback! The source code for the examples is on this github page (including the Sphinx src for the pdf). Get the updated v0.2 High Performance Python tutorial now.

The EuroPython tutorial slides are on slideshare as the High Performance Python tutorial.

Topics covered:

  • Python profiling (cProfile, RunSnakeRun, line_profiler) – find bottlenecks
  • PyPy – an alternative Python implementation with a Just In Time compiler
  • Cython – annotate your code and compile to C
  • numpy integration with Cython – fast numerical Python library wrapped by Cython
  • ShedSkin – automatic code annotation and conversion to C++
  • numpy vectors – fast vector operations using numpy arrays
  • NumExpr on numpy vectors – compiles numpy expressions to run across multiple CPUs and vector units
  • multiprocessing – built-in module to use multiple CPUs
  • ParallelPython – run tasks on multiple computers
  • pyCUDA – run tasks on your Graphics Processing Unit
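To give a flavour of the first and sixth topics (profiling, then numpy vectors), here is a minimal sketch. It is not the code from the PDF: the function names, the `MAX_ITER` value and the sample points are my own illustrative assumptions, and it assumes Python 3 with numpy installed. It counts Mandelbrot-style escape iterations first with a plain Python loop and then with numpy array operations, and uses cProfile to show where the loop version spends its time.

```python
import cProfile
import io
import pstats

import numpy as np

MAX_ITER = 50  # illustrative value, not taken from the tutorial


def escape_counts_loop(points, max_iter=MAX_ITER):
    """Pure-Python loop: count updates of z = z*z + c while |z| <= 2."""
    counts = []
    for c in points:
        z = 0j
        n = 0
        while n < max_iter and abs(z) <= 2.0:
            z = z * z + c
            n += 1
        counts.append(n)
    return counts


def escape_counts_numpy(points, max_iter=MAX_ITER):
    """numpy version: update all still-bounded points with array operations."""
    c = np.asarray(points, dtype=np.complex128)
    z = np.zeros_like(c)
    counts = np.zeros(c.shape, dtype=np.int64)
    for _ in range(max_iter):
        active = np.abs(z) <= 2.0  # boolean mask of points that have not escaped
        z[active] = z[active] * z[active] + c[active]
        counts[active] += 1
    return counts


def profile_call(func, *args):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


if __name__ == "__main__":
    # A small grid of complex points (hypothetical test data).
    grid = [complex(x / 50.0 - 2.0, y / 50.0 - 1.0)
            for x in range(150) for y in range(100)]
    _, report = profile_call(escape_counts_loop, grid)
    print(report)
    _, report = profile_call(escape_counts_numpy, grid)
    print(report)
```

Running the script prints two cProfile reports, so you can compare the call counts and cumulative times of the two versions on your own machine. How much the numpy version helps depends on the grid size and your hardware, so measure it yourself.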

If you haven’t been to a EuroPython, I definitely recommend it. Next year’s will also be in Florence (a lovely city with lovely people). The science/HPC tracks were very interesting to me and I hope to see more of the same next year.

 


Ian is a Chief Interim Data Scientist via his company, Mor Consulting. He lives in London, is walked by his high-energy Springer Spaniel and is a consumer of fine coffees.

6 Comments

  • Technically PyPy is another Python implementation of Python, rather than a JIT. That implementation includes a JIT of course. /pedant
  • I wanted to download the video from the site for the talk but it's a 404
  • The training sessions weren't recorded so no video is available :-( That's a part of the reason I wrote up this PDF!
  • Spinor
    Did you see this Mandelbrot numpy example: http://www.scipy.org/Tentative_NumPy_Tutorial/Mandelbrot_Set_Example
  • I hadn't seen that before. Do you know if it runs faster than my example? The vector operations are a little different so maybe there is a speed difference?
  • Garrett has left a very nice write-up about the PDF on his blog: http://garrettbluma.com/2011/07/13/high-performance-python-tutorial-by-ian-ozsvald/ Thanks Garrett!