Archives of pydata

“Higher Performance Python” at PyDataCambridge 2019

I’ve had the pleasure of speaking at the first PyDataCambridge conference (2019). This is the second PyData conference in the UK after PyDataLondon (which colleagues and I co-founded 6 years back). I’m super proud to see PyData spread to 6 regional meetups and now 2 UK conferences. We had over 200 attendees and the conference […]

“On the Delivery of Data Science Projects” – talk at PyDataCambridge meetup

A few weeks ago I got to speak at PyDataCambridge (thanks for having me!). Slides are here for “On The Delivery of Data Science Projects”. This talk is based on my experiences coaching teams (whilst building IP for clients) to help them derisk, design and deliver working data science products. This talk is really in two […]

Thoughts on how to start a PyData or Python meetup

At PyConLT 2019 (Lithuania) we just had a 10-person meeting on “how to start a new PyData or Python meetup” with existing organisers and some potential new event organisers. The night before in the conference bar Radovan and I had spent an hour helping someone from Latvia figure out their plan to start a new […]

PyCon Lithuania 2019 and a keynote on “Citizen Science with Python”

I’ve had the great pleasure of attending PyConLT 2019 – my first trip to Lithuania. I had no idea what to expect (I’ve never been to this part of Europe) – Vilnius is a lovely city full of lovely Pythonistas. There’s a bunch of lovely art hanging underneath bridges, an amazing Soviet Palace of Arts […]

Second Successfully Delivering Data Science Projects just over

I ran the second iteration of my Successfully Delivering Data Science Projects course last Friday for this happy group. We had a lovely day and good conversation has continued in the teaching Slack over the weekend. Topics covered included the design and derisking of data projects (not just machine learning), building a project plan, communicating […]

“discover feature relationships” – new EDA tool

I’ve built a new Exploratory Data Analysis tool. I used it in a few presentations last year with the code on GitHub and have now (finally) published it to PyPI. The goal is to quickly check, using machine learning (sklearn’s Random Forests), whether any column in a DataFrame predicts any other column. I’m interested in […]

Talking on “High Performance Python” at Linuxing In London last week

Mario of PyLondonium (where I gave a keynote talk earlier this year) was kind enough to ask me along to speak at Linuxing in London. I gave an updated version of one of my older High Performance Python talks based on material I’d covered in my book, to show the more-engineering audience how to go […]

“On the Diagramatic Diagnosis of Data” at BudapestBI 2018

A couple of days back I spoke on using diagrams (matplotlib, seaborn, pandas profiling) to diagnose data during the exploratory data analysis phase. I also introduced my new tool discover_feature_relationships which helps prioritise which features to investigate in a new dataset by identifying pairs of features that have some sort of ‘interesting’ relationship. We finished […]