Week note

Well, a mid-next-week note, I guess. Last night I gave another variant of my higher performance Python talk for PyDataUK to 250 people on the live stream. We had some good questions, cheers all.

On Friday Micha & I heard that the 2nd edition of our High Performance Python book has gone to the printers. Last summer we said we’d do 1 person-month each on it; 9 months later, with many person-months invested each, we’re finally there. Phew.

I now have an open PR on the dabl project to add ordinal-sorted y-axis box plot items in place of the default always-sort-by-median, which I think makes some of the exploratory process more intuitive. This also involved figuring out a new weird matplotlib rendering behaviour and writing my first unit test where I make up a dummy matplotlib figure in a test which is rendered but never displayed. There’s always a million new things to learn, right?
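For anyone curious what a "rendered but never displayed" test can look like, here is a minimal sketch. It is not the test from the dabl PR: the `make_boxplot` helper and the toy data are made up for illustration, and it assumes matplotlib's non-interactive Agg backend is acceptable for the test run.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend: renders in memory, never opens a window
import matplotlib.pyplot as plt


def make_boxplot(groups):
    """Hypothetical helper: draw one horizontal box per group, in the order given."""
    fig, ax = plt.subplots()
    labels = list(groups)
    # vert=False gives horizontal boxes, so the groups sit along the y-axis
    # (newer matplotlib releases also offer an `orientation` argument for this).
    ax.boxplot([groups[name] for name in labels], vert=False)
    ax.set_yticks(range(1, len(labels) + 1))
    ax.set_yticklabels(labels)
    return fig, ax


def test_boxplot_keeps_given_order():
    # Made-up data: the medians are deliberately NOT in the same order as the labels.
    groups = {"low": [1, 2, 3], "medium": [10, 11, 12], "high": [4, 5, 6]}
    fig, ax = make_boxplot(groups)
    fig.canvas.draw()  # force a full render so the tick labels are populated
    tick_labels = [tick.get_text() for tick in ax.get_yticklabels()]
    plt.close(fig)  # free the figure; nothing was ever shown on screen
    assert tick_labels == ["low", "medium", "high"]
```

The explicit `fig.canvas.draw()` is the part that matters: some matplotlib properties only settle once the figure has been rendered, so the test draws to the in-memory canvas and then inspects the result.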

I’ve also been digging into Companies House data to look at how the economy is responding to the pandemic. This lets me play with some higher performance Pandas operations (for my talks) and dig into pivot_table, pivot, groupby and crosstab, along with stack & unstack. I’ve always been confused about how many options I have available here. I’m less confused now, but I still don’t understand some of the performance differences I see for otherwise-equivalent operations. I also discovered the Pandas xs operation (take a cross-section of a dataframe) whilst reading a Wikipedia page on crosstabs. Learning. Always learning.
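As a rough illustration of how these options overlap, the sketch below builds the same small count table three ways and then takes a cross-section with xs. The dataframe is invented toy data (not Companies House data), and the column names `region`, `status` and `n` are assumptions made up for the example.

```python
import pandas as pd

# Made-up toy data: one row per company, n is a constant 1 so that sums act as counts.
df = pd.DataFrame(
    {
        "region": ["North", "North", "South", "South", "South"],
        "status": ["active", "dissolved", "active", "active", "dissolved"],
        "n": [1, 1, 1, 1, 1],
    }
)

# 1) groupby on both keys, then unstack the inner level into columns.
counts = df.groupby(["region", "status"])["n"].sum()
by_groupby = counts.unstack(fill_value=0)

# 2) pivot_table with an explicit aggregation.
by_pivot = df.pivot_table(
    index="region", columns="status", values="n", aggfunc="sum", fill_value=0
)

# 3) crosstab, which counts co-occurrences directly.
by_crosstab = pd.crosstab(df["region"], df["status"])

# All three give the same region-by-status table for this data.
assert (by_groupby.to_numpy() == by_pivot.to_numpy()).all()
assert (by_pivot.to_numpy() == by_crosstab.to_numpy()).all()

# xs takes a cross-section: here, every region's count for status == "active".
active_only = counts.xs("active", level="status")
print(by_crosstab)
print(active_only)
```

Timing each of the three variants on a larger frame (for example with `%timeit` in IPython) is one way to see the kind of performance differences mentioned above; this sketch makes no claim about which is fastest.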

My kneaded sourdough is improving; I’m up to 1kg now. I think I’m done with no-knead for a bit. That was fun, but the really-wet dough is hard to handle. The radishes are great and pretty big now, but annoyingly the snails have found my lettuce.


Ian is a Chief Interim Data Scientist via his Mor Consulting. Sign up for Data Science tutorials in London and to hear about his data science thoughts and jobs. He lives in London, is walked by his high-energy Springer Spaniel and is a consumer of fine coffees.