About


This is Ian Ozsvald's blog. I'm an entrepreneurial geek, a Data Science/ML/NLP/AI consultant, founder of the Annotate.io social media mining API, author of O'Reilly's High Performance Python book, co-organiser of PyDataLondon, co-founder of the SocialTies App, author of the A.I.Cookbook, author of The Screencasting Handbook, a Pythonista, co-founder of ShowMeDo and FivePoundApps, and also a Londoner. Here's a little more about me.


23 February 2014 - 11:53

High Performance Python at PyDataLondon 2014

Yesterday I spoke on The High Performance Python Landscape at PyDataLondon 2014 (our first PyData outside of the USA – see my write-up). I was blessed with a full room and interesting questions. With Micha I’m authoring a High Performance Python book with O’Reilly (email list for early access) and I took the topics from a few of our chapters.

“@ianozsvald providing eye-opening discussion of tools for high-performance #Python: #Cython, #ShedSkin, #Pythran, #PyPy, #numba… #pydata” – @davisjmcc

Overall I covered:

  • line_profiler for CPU profiling in a function
  • memory_profiler for RAM profiling in a function
  • memory_profiler’s %memit
  • memory_profiler’s mprof to graph memory use during a program’s runtime
  • thoughts on adding network and disk I/O tracking to mprof
  • Cython on lists
  • Cython on numpy by dereferencing elements (which would normally be horribly inefficient) plus OpenMP
  • ShedSkin’s annotated output and thoughts on using this as an input to Cython
  • PyPy and numpy in PyPy
  • Pythran with numpy and OpenMP support (you should check this out)
  • Numba
  • Concluding thoughts on why you should probably use JITs over Cython
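As a rough illustration of the profiling tools at the top of that list, here is a minimal sketch of a script that can be run unchanged under line_profiler or memory_profiler. The function `sum_of_squares` and its workload are hypothetical, invented for this example rather than taken from the talk. Both tools inject a `profile` decorator into builtins when they run a script, so the fallback below only exists to keep a plain `python` run working.

```python
"""Toy CPU-bound workload for trying out line_profiler and memory_profiler."""

try:
    # `kernprof -l` and `python -m memory_profiler` both inject `profile`
    # into builtins; referencing the name checks whether one of them is active.
    profile
except NameError:
    # Plain `python` run: fall back to a no-op decorator.
    def profile(func):
        return func


@profile
def sum_of_squares(n):
    """Build a list of squares and sum it (deliberately naive, hypothetical example)."""
    squares = [i * i for i in range(n)]  # builds an n-element list: shows up in RAM profiles
    total = 0
    for value in squares:  # the hot loop: shows up in per-line CPU timings
        total += value
    return total


if __name__ == "__main__":
    print(sum_of_squares(1_000_000))
```

Assuming the script is saved as `example.py` (a made-up filename), `kernprof -l -v example.py` should print per-line timings, `python -m memory_profiler example.py` per-line RAM use, and `mprof run example.py` followed by `mprof plot` a graph of memory over time. In IPython, `%load_ext memory_profiler` followed by `%memit sum_of_squares(1_000_000)` gives a quick peak-memory figure.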

Here’s my room full of happy Pythonistas :-)

[Photo: the audience at the High Performance Python talk, PyDataLondon 2014]

“Really useful and practical performance tips from @ianozsvald @pydata #pydata speeding up #Python code” – @iantaylorfb

Slides from the talk: [embedded slide deck]

UPDATE: Armin and Maciej came back today with some extra answers about the PyPy-numpy performance (here and here). The bottom line is that they plan to fix it (Maciej says it is now fixed, quick service!). Maciej also notes improvements planned using e.g. vectorisation in numpy.

VIDEO TO FOLLOW


Ian applies Data Science as an AI/Data Scientist for companies in Mor Consulting, founded the image and text annotation API Annotate.io, co-authored SocialTies, programs Python, authored The Screencasting Handbook, lives in London and is a consumer of fine coffees.
