Ian Ozsvald picture

This is Ian Ozsvald's blog (@IanOzsvald), I'm an entrepreneurial geek, a Data Science/ML/NLP/AI consultant, author of O'Reilly's High Performance Python book, co-organiser of PyDataLondon, a Pythonista, co-founder of ShowMeDo and also a Londoner. Here's a little more about me.

High Performance Python book with O'Reilly

View Ian Ozsvald's profile on LinkedIn

ModelInsight Data Science Consultancy London Protecting your bits. Open Rights Group


15 April 2013 - 14:03More Python 3.3 downloads than Python 2.7 for past 3 months

Since PyCon 2013 I’ve been in a set of conversations that start with “should I be using Python 3.3 for science work?”. Here’s a recent reddit thread on the subject. Last year I solidly recommended using Python 2.7 for scientific work (as many key libraries weren’t yet supported). I’m on the cusp of changing my recommendation.

Update there’s a nice thread on Reddit/r/python discussing what’s required and where the numbers are coming from.

I last looked at the rate of Python downloads via ShowMeDo during 2008 when Python 2.5 was the top dog. The Windows 2.5.1 installer was getting 500,000 downloads a month. In the last 3 months I’m pleasantly surprised to see that Python 3.3 for Windows is downloaded more each month than Python 2.7. We can see:

  • March 2013 Python 3.3 for Windows has 647k downloads vs Python 2.7 with 630k
  • February 2013 Python 3.3 for Windows has 553k downloads vs Python 2.7 with 498k
  • January 2013 Python 3.3 for Windows has 533k downloads vs Python 2.7 with 495k (Python 2.7 less popular since January 2013)
  • December 2012 Python 3.3 for Windows has 412k downloads vs Python 2.7 with 525k

These figures only tell a part of the story of course. For Windows you have to download Python. On Linux and Mac it comes pre-installed (so we can’t measure those numbers).

Python 2.7 has been the default on Ubuntu for a while, that’s changing with Ubuntu 13.04. There are two lists of Python-3 compatible packages, it seems that Django is on this list and at PyCon there was a how-to-port-to-py3 video (not Flask yet update Armin is tweeting for sprint help for Py3 support), SQLAlchemy is (but not MySQL-python), Fabric isn’t ready yet. For web-dev it seems to be a mixed bag but I’m guessing Python 3 support will be across the board this year.

For scientific use we already have Python-3 compatible numpy, scipy and matplotlib. scikit-learn is ‘nearly‘ ported, Pillow (the recent fork of PIL) is ready for Python 3. NLTK is also being ported.

For scientific use around natural language processing the switch to unicode-by-default looks most attractive (the mix of strings and unicode datatypes has burnt hours for me over the years in Python 2.x). Here’s a PyCon video on the use of Python 3 for text processing and this reviews why Python 3.3 is superior to Python 2.7.

It is slightly too early for me yet to want to switch but I’m starting to experiment. I’ve added some __future__ imports to new code so I know I’m writing Python 2.7 in a 3-like style. I’m also increasingly using Ned Batchelder’s coverage.py via nosetests to make sure I have good coverage. I currently run 2to3 to check that things convert cleanly to Python 3 but rarely run the result with Python 3 (I haven’t needed to do this yet). There’s a set of useful advice on python3porting including various __future__ imports (including division, print_function, unicode_literals, absolute_import).

Ian applies Data Science as an AI/Data Scientist for companies in ModelInsight, sign-up for Data Science tutorials in London. Historically Ian ran Mor Consulting. He also founded the image and text annotation API Annotate.io, co-authored SocialTies, programs Python, authored The Screencasting Handbook, lives in London and is a consumer of fine coffees.

28 Comments | Tags: Life, Python