All posts by Ian

Second Successfully Delivering Data Science Projects just over

I ran the second iteration of my Successfully Delivering Data Science Projects course last Friday to this happy group. We had a lovely day, and good conversation has continued in the teaching Slack over the weekend. Topics covered included the design and derisking of data projects (not just machine learning), building a project plan, communicating […]

New public course on Successfully Delivering Data Science Projects for March 1st

On Friday February 1st I ran my first Successfully Delivering Data Science Projects course, part of my new plan to give more training this year. It went really well, and I got to both teach my students and learn a lot from them. We talked through best practice, project design, derisking strategies, communication plans […]

“discover feature relationships” – new EDA tool

I’ve built a new Exploratory Data Analysis tool. I used it in a few presentations last year with the code on GitHub, and have now (finally) published it to PyPI. The goal is to quickly check, using machine learning (sklearn’s Random Forests), whether any column in a DataFrame predicts any other column. I’m interested in […]

Looking back on 2018, looking to 2019

So last year was a damned hard year. Ignoring Brexit and other international foolishness, on a personal level (without going into details) I was emotionally wiped out by mid-year. A collection of health issues among family and friends kept rearing their ugly heads, and over time I ran very low on emotionally supportive […]

New public course on Successfully Delivering Data Science Projects for Feb 1st

During my Pythonic data science team coaching I see various problems coming up that I’ve helped solve before. Based on these observations and my prior IP design and delivery for clients over the years, I’ve put together a one-day public course aimed at data scientists (any level) who want to be more confident with […]

Talking on “High Performance Python” at Linuxing In London last week

Mario of PyLondonium (where I gave a keynote talk earlier this year) was kind enough to ask me along to speak at Linuxing in London. I gave an updated version of one of my older High Performance Python talks, based on material I’d covered in my book, to show the more engineering-focused audience how to go […]

“On the Diagramatic Diagnosis of Data” at BudapestBI 2018

A couple of days back I spoke on using diagrams (matplotlib, seaborn, pandas profiling) to diagnose data during the exploratory data analysis phase. I also introduced my new tool discover_feature_relationships which helps prioritise which features to investigate in a new dataset by identifying pairs of features that have some sort of ‘interesting’ relationship. We finished […]

On helping to open the inaugural PyDataPrague meetup

A couple of weeks back I had the wonderful opportunity to open the PyDataPrague meetup – this is the second meetup I’ve opened after our PyDataLondon started back in 2014. The core organisers Ondřej Kokeš, Jakub Urban and Jan Pipek asked me to give two short talks on: Introducing NumFOCUS (video for both of my […]