Archives of Python

A first approach to automatic text data cleaning

In October I gave the opening keynote at PyConIreland on The Real Unsolved Problems in Data Science. One of the topics I covered was poor quality data, by some estimates data cleaning occupies 50-80% of a data scientist’s time. Personally I’ve just spent the better part of last year figuring out ways to convert poorly-represented […]

We’re running more Data Science Training in 2015 Q1 in London

A couple of weeks ago Bart and I ran two very successful training courses in London through my ModelInsight, one introduced data science using pandas and numpy to build a recommender engine, the second taught a two-day course on High Performance Python (and yes, that was somewhat based on my book with a lot of […]

My Keynote at PyConIreland 2014 – “The Real Unsolved Problems in Data Science”

I’ve just given the opening keynote here at PyConIreland 2014 – many thanks to the organisers for letting me get on stage. This is based on 15 years experience running my own consultancies in Data Science and Artificial Intelligence. (Small note  – with the pic below James mis-tweeted ‘sexist’ instead of ‘sexiest’ (from my opening slide) […]

Fourth PyDataLondon Meetup

We’ve just run our 4th PyDataLondon meetup (@PyDataLondon). Having over 500 members is superb for just 4 months growth, woot 🙂 Many thanks to @GoPivotalEMEA for hosting us. We had 3 speakers and 1 lightning talk. Here are my slides on “The High Performance Python Landscape”: I’m still collecting data for my two surveys (to […]

Slides for High Performance Python tutorial at EuroSciPy2014 + Book signing!

Yesterday I taught an excerpt of my 2 day High Performance Python tutorial as a 1.5 hour hands-on lesson at EuroSciPy 2014 in Cambridge with 70 students: We covered profiling (down to line-by-line CPU & memory usage), Cython (pure-py and OpenMP with numpy), Pythran, PyPy and Numba. This is an abridged set of slides from […]

High Performance Python Training at EuroSciPy this afternoon

I’m training on High Performance Python this afternoon at EuroSciPy, my github source is here (as a shortlink: http://bit.ly/euroscipy2014hpc). There are prerequisites for the course. This training is actually a tiny part of what I’ll teach on my 2 day High Performance Python course in London in October (along with a Data Science course). If […]

Why are technical companies not using data science?

Here’s a quick question. How come more technical companies aren’t making use of data science? By “technical” I mean any company with data and the smarts to spot that it has value, by “data science” I mean any technical means to exploit this data for financial gain (e.g. visualisation to guide decisions, machine learning, prediction). […]

Data Science Training Survey

I’ve put together a short survey to figure out what’s needed for Python-based Data Science training in the UK. If you want to be trained in strong data science, analysis and engineering skills please complete the survey, it doesn’t need any sign-up and will take just a couple of minutes. I’ll share the results at […]

PyDataLondon 3rd event

This week we had our 3rd PyDataLondon meetup (@PyDataLondon), this builds on our 2nd event. We’re really happy to see the group grow to over 400 members, co-org Emlyn made a plot (see below) of our linear growth. Our main speakers: Andrew Clegg (chief Data Scientist at Pearson Publishing in London) spoke on his Snake […]