Blog

Entrepreneurial Geekiness

What I’ve been up to since 2022

This has been terribly quiet since July 2022, oops. It turns out that having an infant totally sucks your time! In the meantime I’ve continued to build up: Training courses – I’ve just listed my new Fast Pandas course plus the existing Successful Data Science Projects and Software Engineering for Data Scientists with runs of […]

My first commit to Pandas

I’ve used the Pandas data science toolkit for over a decade and I’ve filed a couple of issues, but I’ve never contributed to the source. At the weekend I got to balance the books a little by making my first commit. With this pull request I fixed the recent request to update the pct_change docs […]

Skinny Pandas Riding on a Rocket at PyDataGlobal 2020

On November 11th we saw the most ambitious ever PyData conference – PyData Global 2020 was a combination of world-wide PyData groups putting on a huge event to both build our international community and to leverage the on-line only conferences that we need to run during Covid 19. The conference brought together almost 2,000 attendees […]

“Making Pandas Fly” at EuroPython 2020

I’ve had a chance to return to talking about High Performance Python at EuroPython 2020 after my first tutorial on this topic back in 2011 in Florence. Today I spoke on Making Pandas Fly with a focus on making Pandas run faster. This covered: Categories and RAM-saving datatypes to make 100-500x speed-ups (well, some of […]

Weekish notes

I’ve recently switched back from Sourdough yeast to dried packet yeast mix, given a recipe by a colleague (thanks Nick!). I immediately set to work modifying his recipe (well, cutting out steps if we’re honest). The first loaf looked fine but was bland – I cut out too much salt. The next was really very […]

Weekish notes

I gave another iteration of my Making Pandas Fly talk sequence for PyDataAmsterdam recently and received some lovely postcards from attendees as a result. I’ve also had time to list new iterations of my training courses for Higher Performance Python (October) and Software Engineering for Data Scientists (September), both will run virtually via Zoom & […]

Weeknote (dtype-diet)

Over the weekend I hacked on dtype_diet – a tool for Pandas users that checks their DataFrame to see if smaller datatypes might be applicable. If so they’d offer no data loss and a reduction in RAM, for Categorical data there’s also the possibility of faster calculations. This tool makes no changes, it recommends the […]

Week(ish) note

So – High Performance Python 2nd ed finally shipped (Amazon, Goodreads) – yay! In brief we’ve added notes on how you can be a “highly performant programmer”, added some more profiling, added Pandas onto NumPy, improved the Compiling to C chapter with more Numba and a new full section on GPUs (in the first edition […]