Archives of Data science

Lightning talk at PyDataLondon for Annotate

At this week’s PyDataLondon I gave a 5-minute lightning talk on Annotate, the text-cleaning service for data scientists that I recently made live. It was good to have a couple of chats afterwards with others who are similarly bored of cleaning their text data. The goal is to make it quick and easy to […]

Data Science Jobs UK (ModelInsight) – Python Jobs Email List

I’ve had people asking me how they can find data scientists in London, and through our PyDataLondon meetup we’ve had members announcing jobs. There’s no central location for data science jobs, so I’ve put together a new list (administered through my ModelInsight agency). Sign up to the list here: Data Science Jobs UK (ModelInsight). Aimed […]

A first approach to automatic text data cleaning

In October I gave the opening keynote at PyConIreland on The Real Unsolved Problems in Data Science. One of the topics I covered was poor-quality data; by some estimates, data cleaning occupies 50-80% of a data scientist’s time. Personally I’ve just spent the better part of last year figuring out ways to convert poorly-represented […]

We’re running more Data Science Training in 2015 Q1 in London

A couple of weeks ago Bart and I ran two very successful training courses in London through my ModelInsight agency. The first introduced data science, using pandas and numpy to build a recommender engine; the second was a two-day course on High Performance Python (and yes, that was somewhat based on my book with a lot of […]

Why are technical companies not using data science?

Here’s a quick question: how come more technical companies aren’t making use of data science? By “technical” I mean any company with data and the smarts to spot that it has value; by “data science” I mean any technical means to exploit this data for financial gain (e.g. visualisation to guide decisions, machine learning, prediction). […]

Data Science Training Survey

I’ve put together a short survey to figure out what’s needed for Python-based Data Science training in the UK. If you want to be trained in strong data science, analysis and engineering skills, please complete the survey. It doesn’t need any sign-up and will take just a couple of minutes. I’ll share the results at […]

Python Training courses: Data Science and High Performance Python coming in October

I’m pleased to say that via ModelInsight we’ll be running two Python-focused training courses in October. The goal is to give you strong new research & development skills; the courses are aimed at folks in companies but would suit folks in academia too. UPDATE: training courses ready to buy (1 Day Data Science, 2 Day High […]

PyDataLondon second meetup (July 1st)

Our second PyDataLondon meetup will be running on Tuesday July 1st at Pivotal in Shoreditch. The announcement went out to the meetup group and the event was at capacity within 7 hours – if you’d like to attend future meetups please join the group (and the wait-list is open for our next event). Our speakers: […]

High Performance Python manuscript submitted to O’Reilly

I’m super-happy to say that Micha and I have submitted the manuscript to O’Reilly for our High Performance Python book. Here’s the final chapter list: Understanding Performant Python; Profiling to find bottlenecks (%timeit, cProfile, line_profiler, memory_profiler, heapy and more); Lists and Tuples (how they work under the hood); Dictionaries and Sets (under the hood again) […]

Flask + mod_uwsgi + Apache + Continuum’s Anaconda

I’ve spent the morning figuring out how to use Flask through Anaconda with Apache and uWSGI on an Amazon EC2 machine, side-stepping the system’s default Python. I’ll log the main steps here; I found lots of hints on the web but nothing that tied it all together for someone like me who lacks Apache config […]