Archives of Data science

EuroSciPy 2015 and Data Cleaning on Text for ML (talk)

I’m at EuroSciPy 2015, we have 2 days of Pythonistic Science in Cambridge. Next year will be in Bavaria, you can sign-up for announces. I spoke in the morning on Data Cleaning on Text to Prepare for Data Analysis and Machine Learning (which is a terribly verbose title, sorry!). I’ve just covered 10 years of […]

PyConUK and the Science Track

PyConUK is in its 9th year and this year it’ll host its first Science Track aimed at scientists (not “data scientists” but real lab-coat-wearing scientists). I’m speaking in that track, yay (“Ship Data Science Products!“)! This track is part of the main conference, it all runs during September 19-21. Here’s a tiny reminder from the […]

PyDataLondon Conference 2015 Call for Proposals now OPEN (yay!) for June 19-21

PyDataLondon 2015 will take place June 19-21 at Bloomberg’s HQ in Central London, we’ll have 300 people, multiple tracks and a very solid set of speakers and teachers. You should come. You should probably speak and share your knowledge. In fact – you should submit a talk to our Call for Proposals, it opens this […]

A review of ModelInsight’s growth this last year

Early last year Chris and I founded ModelInsight, a boutique Python-focused Data Science agency in London. We’ve grown well, I figure some reflection is in order. In addition the Data Science scene has grown very well in London, I’ll put some notes on that down below too. Through consulting, training, workshops and coaching we’ve had […]

PyDataParis 2015 and “Cleaning Confused Collections of Characters”

I’m at PyDataParis, this is the first PyData in France and we have a 300-strong turn-out. In my talk I asked about the split of academic and industrial folk, we have 70% industrialists here (at least – in my talk of 70 folk). The bulk of the attendees are in the Intro track and maybe […]

Scikit-learn training in London this April 7-8th

We’re running a 2 day scikit-learn and statsmodels training course through my ModelInsight with Jeff Abrahamson (ex-Google) at the start of April (7-8th) in central London. You should join this course if you’d like to: confidently use scikit-learn to solve machine learning problems strengthen your statistical foundations so you know both what to use and why […]

Data-Science stuff I’m doing this year

2014 was an interesting year, 2015 looks to be even richer. Last year I got to publish my High Performance Python book, help co-organise the rather successful PyDataLondon2014 conference, teach High Performance in public (slides online) and in private, keynote on The Real Unsolved Problems in Data Science and start my ModelInsight AI agency. That […]

Starting Spark 1.2 and PySpark (and ElasticSearch and PyPy)

The latest PySpark (1.2) is feeling genuinely useful, late last year I had a crack at running Apache Spark 1.0 and PySpark and it felt a bit underwhelming (too much fanfare, too many bugs). The media around Spark continues to grow and e.g. today’s hackernews thread on the new DataFrame API has a lot of […]

Lightning talk at PyDataLondon for Annotate

At this week’s PyDataLondon I did a 5 minute lightning talk on the Annotate text-cleaning service for data scientists that I made live recently. It was good to have a couple of chats after with others who are similarly bored of cleaning their text data. The goal is to make it quick and easy to […]