Archives of pydata

Starting Spark 1.2 and PySpark (and ElasticSearch and PyPy)

The latest PySpark (1.2) is feeling genuinely useful, late last year I had a crack at running Apache Spark 1.0 and PySpark and it felt a bit underwhelming (too much fanfare, too many bugs). The media around Spark continues to grow and e.g. today’s hackernews thread on the new DataFrame API has a lot of […]

Lightning talk at PyDataLondon for Annotate

At this week’s PyDataLondon I did a 5 minute lightning talk on the Annotate text-cleaning service for data scientists that I made live recently. It was good to have a couple of chats after with others who are similarly bored of cleaning their text data. The goal is to make it quick and easy to […]

Data Science Jobs UK (ModelInsight) – Python Jobs Email List

I’ve had people asking me about how they can find data scientists in London and through our PyDataLondon meetup we’ve had members announcing jobs. There’s no central location for data science jobs so I’ve put together a new list (administered through my ModelInsight agency). Sign-up to the list here: Data Science Jobs UK (ModelInsight) Aimed […]

My Keynote at PyConIreland 2014 – “The Real Unsolved Problems in Data Science”

I’ve just given the opening keynote here at PyConIreland 2014 – many thanks to the organisers for letting me get on stage. This is based on 15 years experience running my own consultancies in Data Science and Artificial Intelligence. (Small note  – with the pic below James mis-tweeted ‘sexist’ instead of ‘sexiest’ (from my opening slide) […]

Fourth PyDataLondon Meetup

We’ve just run our 4th PyDataLondon meetup (@PyDataLondon). Having over 500 members is superb for just 4 months growth, woot 🙂 Many thanks to @GoPivotalEMEA for hosting us. We had 3 speakers and 1 lightning talk. Here are my slides on “The High Performance Python Landscape”: I’m still collecting data for my two surveys (to […]

Slides for High Performance Python tutorial at EuroSciPy2014 + Book signing!

Yesterday I taught an excerpt of my 2 day High Performance Python tutorial as a 1.5 hour hands-on lesson at EuroSciPy 2014 in Cambridge with 70 students: We covered profiling (down to line-by-line CPU & memory usage), Cython (pure-py and OpenMP with numpy), Pythran, PyPy and Numba. This is an abridged set of slides from […]

Why are technical companies not using data science?

Here’s a quick question. How come more technical companies aren’t making use of data science? By “technical” I mean any company with data and the smarts to spot that it has value, by “data science” I mean any technical means to exploit this data for financial gain (e.g. visualisation to guide decisions, machine learning, prediction). […]

Data Science Training Survey

I’ve put together a short survey to figure out what’s needed for Python-based Data Science training in the UK. If you want to be trained in strong data science, analysis and engineering skills please complete the survey, it doesn’t need any sign-up and will take just a couple of minutes. I’ll share the results at […]

PyDataLondon 3rd event

This week we had our 3rd PyDataLondon meetup (@PyDataLondon), this builds on our 2nd event. We’re really happy to see the group grow to over 400 members, co-org Emlyn made a plot (see below) of our linear growth. Our main speakers: Andrew Clegg (chief Data Scientist at Pearson Publishing in London) spoke on his Snake […]

Second PyDataLondon Meetup a Javascript/Analystic-tastic event

This week we ran our 2nd PyDataLondon meetup (@PyDataLondon), we had 70 in the room and a rather techy set of talks. As before we hosted by Pivotal (@gopivotal) via Ian – many thanks for the beer and pizza! I took everyone to the pub after for a beer on out  data science consultancy to […]