Archive of month: January 2015

Annotate.io self-learning text cleaner demo online

A few weeks I posted some notes on a self-learning text cleaning system, to be used by data scientists who didn’t want to invest time cleaning their data by hand. I have a first demo online over at annotate.io (the demo code is here in github). The intuition behind this is that we currently divert […]

Data Science Jobs UK (ModelInsight) – Python Jobs Email List

I’ve had people asking me about how they can find data scientists in London and through our PyDataLondon meetup we’ve had members announcing jobs. There’s no central location for data science jobs so I’ve put together a new list (administered through my ModelInsight agency). Sign-up to the list here: Data Science Jobs UK (ModelInsight) Aimed […]

A first approach to automatic text data cleaning

In October I gave the opening keynote at PyConIreland on The Real Unsolved Problems in Data Science. One of the topics I covered was poor quality data, by some estimates data cleaning occupies 50-80% of a data scientist’s time. Personally I’ve just spent the better part of last year figuring out ways to convert poorly-represented […]