
This is Ian Ozsvald's blog (@IanOzsvald), I'm an entrepreneurial geek, a Data Science/ML/NLP/AI consultant, author of O'Reilly's High Performance Python book, co-organiser of PyDataLondon, a Pythonista, co-founder of ShowMeDo and also a Londoner. Here's a little more about me.


18 February 2013 - 23:43

PyCon Tutorial Notes for Applied Parallel Computing

This post is for students of the Applied Parallel Computing tutorial that Minesh B. Amin and I will run during March 2013 at PyCon. This is a wiki-post; I’ll update it over the next month. If you are attending the tutorial you must check this post in the run-up to the tutorial. Important notes are below for you to read now. This is linked to from our PyCon Tutorial Support page.

If you come to this after the tutorial you’ll probably find this useful for setup. The following is for my students:

  • Check this post before you come to PyCon. You will be expected to have followed the instructions and installed the software and updates before the tutorial
  • You won’t have time to install/setup during the tutorial. You must arrive prepared: we have a lot to work through and we’ll start immediately
  • Although the PyCon wifi has been great in past years, you must assume that wifi will be broken – come prepared with a fully working environment
  • We strongly recommend that you use our VirtualBox (it has all the libs and the github repo pre-installed, it is open source, it’ll run on Win/Mac/Linux). If you install your own package set then we can’t help you if it doesn’t work as expected (it is also quite fiddly to set up yourself) – you can of course buddy-up with someone else during the tutorial if required

You will be able to get the VirtualBox (about 7GB) from this post in the next week. You’ll be better off using the torrent that we’ll provide (please seed if you can, if possible all the way until the tutorial runs, to help fellow students).

Download link for VirtualBox (required!) for the tutorial:

(v1.1 torrent deleted as it didn’t run cleanly on Macs)

PyCON-2013_AppliedParallelComputing1.2.zip torrent (very robust – resume if download breaks, 2.2GB zip decompresses to 6.9GB) or via direct download (more brittle – no resume if the download breaks).

md5sum: ce43b52a18ca913e62842ae72cc8df74
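If you don’t have an md5sum tool to hand, the checksum can also be checked from Python. This is a minimal sketch, not part of the tutorial materials: it assumes the md5sum above refers to the downloaded zip, that the zip sits in the current directory under the file name shown above, and that you run it with Python 3. The function names are my own.

```python
import hashlib
from pathlib import Path

# Checksum quoted in this post; assumed here to be for the v1.2 zip file.
EXPECTED_MD5 = "ce43b52a18ca913e62842ae72cc8df74"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading avoids loading a multi-gigabyte download into memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """True if the file's MD5 digest equals the expected hex string."""
    return md5_of_file(path) == expected.strip().lower()


if __name__ == "__main__":
    # Assumed location of the download; adjust to wherever you saved it.
    zip_path = Path("PyCON-2013_AppliedParallelComputing1.2.zip")
    if not zip_path.exists():
        print("File not found:", zip_path)
    elif matches_expected(zip_path):
        print("Checksum matches, the download looks complete.")
    else:
        print("Checksum differs, the download may be corrupt or a different version.")
```

A mismatch most likely means a broken download (more likely with the direct download than the torrent), so fetch the file again before trying to unzip it.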

NOTE – I had the v1.1 version linked in the torrent above for a few days – if you got that and you can’t start the VirtualBox, just right-click in VirtualBox and discard the saved state, then restart the image. If you have the v1.2 version (linked as of March 4th) then you’re fine.

Video – this YouTube Video Demo (7 minutes) shows you how to install the image.


  1. Unzip to a directory with 7GB of disk space (MAC USERS – the built-in unzip doesn’t seem to handle 64 bit files, use 7zip for success [maybe Windows users too?])
  2. Open VirtualBox (optional but useful – add the extension pack for host integration)
  3. Machine | Add and open the directory that contains the .vdi and .vbox files
  4. Start the machine, it’ll boot to the Linux desktop
  5. Open the web link on the Desktop if you want to see the latest version of this blog post
  6. Double click the “Download GITHUB Repo” script on the desktop and it’ll refresh the repository (in case we’ve added new code)
  7. Familiarise yourself with the environment (Linux Mint 14), GTK Vim and emacs are installed
  8. Open a terminal and run ./pycon2013_applied_parallel_computing/run_this_to_confirm_you_have_the_correct_libraries.py (from the home directory) to confirm that the necessary Python libraries are installed (I’ve already done this, you can repeat it for your own confirmation)

The VirtualBox is a fully configured Linux Mint 14 32 bit distribution (based on Ubuntu 12.10) with a GUI and gvim installed. Feel free to add anything else. You don’t need to bother installing further system updates, the OS was up to date when we released it. It is configured to provide 2 CPUs and 3GB RAM – you might need to reduce these figures to get it running on your machine.

It runs on my 64 bit laptop (Linux Mint 13 64 bit) and on 32 bit machines, and it should work equally well on Windows and Mac (we’ve tested it on both). You should install the Guest Additions: once the VM has booted, use the Devices menu at the top of the VirtualBox window and choose “install guest additions”. This installs integration features such as a shared clipboard (copy/paste) with your host OS.

Instructions if you can’t/won’t use our VirtualBox (but you’re on your own in this case):

You can get the github repo here – if you set this up yourself then we can’t offer help if it doesn’t work (go to the relevant forums and ask there). There is a test script in the root of the repo (run_this_to_confirm_you_have_the_correct_libraries.py) which will confirm if you have the right libraries installed (it only checks for the presence of Disco, it doesn’t confirm that it is configured correctly). The README will give you some guidance but we really recommend that you get our VirtualBox (to be released in the next week via this post).
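To illustrate what a presence-only check means, here is a rough sketch of the idea. It is not the script from the repo: the function name and the module list are my own assumptions for illustration (I’m assuming Disco is importable as `disco`), and it is written for Python 3 rather than the 2013 environment in the VirtualBox.

```python
import importlib.util


def check_libraries(module_names):
    """Map each module name to True if Python can locate it, else False.

    This only tests that the module can be found on the import path.
    It does not import it or confirm that it is configured correctly.
    """
    results = {}
    for name in module_names:
        try:
            results[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised for some malformed or partially installed packages.
            results[name] = False
    return results


if __name__ == "__main__":
    # Hypothetical list for illustration; see the repo's README for the
    # libraries the tutorial actually needs.
    wanted = ["multiprocessing", "disco"]
    for name, present in check_libraries(wanted).items():
        print(name, "found" if present else "MISSING")
```

A library reported as found here can still fail at runtime if its services aren’t set up, which is the same limitation the repo’s test script has for Disco.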

