All posts by Ian

Week(ish) note

So – High Performance Python 2nd ed finally shipped (Amazon, Goodreads) – yay! In brief we’ve added notes on how you can be a “highly performant programmer”, added some more profiling, added Pandas onto NumPy, improved the Compiling to C chapter with more Numba and a new full section on GPUs (in the first edition […]

Week note

Well, mid-next-week note I guess. I gave another variant of my Higher Performance Python talk last night for PyDataUK to 250 live streamers. We had some good questions, cheers all. On Friday Micha & I heard that the 2nd edition of our High Performance Python book has gone to the printers – we’d said we’d […]

“Flying Pandas” and “Making Pandas Fly” – virtual talks this weekend on faster data processing with Pandas, Modin, Dask and Vaex

This Saturday and Monday I’ve had my first experience presenting at virtual conferences – on Saturday it was for Remote Pizza Python (brilliant line-up!) and on Monday (note – this post predates the talk, I’ll update it tomorrow after I’ve spoken) at BudapestBI. UPDATE: added a 2nd variant of Making Pandas Fly for a short-notice PyDataUK […]

Recent “week notes”

I’ve not done a public “week notes” before. I’ve been hacking on various things and I figure it is worth sharing some of it. Using public Companies House data I’ve started to plot the decline in new company formations in the UK. Here’s a first crack, which shows a decline at the end of March. […]

New Higher Performance Python class (June 1-3)

I’ve listed my next Higher Performance Python public class. It’ll run online for 3 mornings on June 1-3 during UK hours. We’ll use Zoom and Slack with pre-distributed Notebooks and modules and you’ll run it using an Anaconda environment. Here’s the write-up from my recent class. We’ll focus on Profiling to find what’s slow in […]

Notes on last week’s Higher Performance Python class

Last week I ran a two-morning Higher Performance Python class. We covered: profiling slow code (using a 2D particle infection model in an interactive Jupyter Notebook) with line_profiler & PySpy; vectorising code with NumPy vs running the original with PyPy; moving to Numba to make iterative and vectorised NumPy really fast (with up to a […]

Notes from Zoom call on “Problems & Solutions for Data Science Remote Work”

On Friday I held an open Zoom call to discuss the problems and solutions posed by remote work for data scientists. I put this together as I’ve observed from my teaching cohorts and from conversation with colleagues that for anyone “suddenly working remotely” the process has typically not been smooth. I invited folk to join […]

Another Successful Data Science Projects course completed

A week back I ran the 4th iteration of my 1 day Successful Data Science Projects course. We covered: how to write a Project Specification including a strong Definition of Done; how to derisk a new dataset quickly using Pandas Profiling, Seaborn and dabl; building interactive data tools using Altair to identify trends and outliers […]

Higher Performance Python (ODSC 2019)

Building on PyDataCambridge last week I had the additional pleasure of talking on Higher Performance Python at ODSC 2019 yesterday. I had a brilliant room of 300 Pythonic data scientists at all levels who asked an interesting array of questions. This talk expanded on last week’s version at PyDataCambridge as I had some more time. […]