“Making Pandas Fly” for PyDataAmsterdam 2020

I thank the PyDataAmsterdam 2020 organisers for another chance to speak on “Making Pandas Fly”. This variant of the talk focuses more on:

  • Understanding when categories beat strings and smaller floats beat larger ones
  • What’s happening with NumPy behind the scenes
  • How we can save 50% of our RAM (and so fit more data onto the same machine) by checking dtypes with my dtype_diet tool
  • Considering that float16 is simulated on modern hardware, so it is memory efficient but slow for calculations!
  • Tips to install bottleneck & numexpr to make Pandas faster
  • Digging into some Pandas internals when I filed a bug – and what I learned as a result (you can learn too by reading the bug report!)
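
As a rough illustration of the dtype points above, here is a minimal sketch that measures memory before and after switching a repetitive string column to a category and a float64 column to float32. The column names and the synthetic data are made up for this example (they are not from the talk), and it uses only plain Pandas rather than dtype_diet. The saving you see will depend entirely on your own data.

```python
import numpy as np
import pandas as pd

# Hypothetical example data: a low-cardinality string column and a float column.
rng = np.random.default_rng(0)
n = 100_000
df = pd.DataFrame({
    "city": rng.choice(["Amsterdam", "London", "Berlin"], size=n),
    "value": rng.random(n),  # float64 by default
})

# deep=True asks Pandas to count the string payloads, not just the pointers.
before = df.memory_usage(deep=True)

# Repeated strings become small integer codes plus one lookup table;
# float32 uses half the bytes of float64 at the cost of precision.
slim = df.assign(
    city=df["city"].astype("category"),
    value=df["value"].astype("float32"),
)
after = slim.memory_usage(deep=True)

print(before)
print(after)
saving = 1 - after.sum() / before.sum()
print(f"Fraction of RAM saved: {saving:.0%}")
```

Casting to float32 loses precision, so check that your results still hold before adopting it. Going further to float16 would shrink the column again, but, as noted in the list above, calculations on it are slow.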

In a few months I’ll run another of my Higher Performance Python virtual training classes, and you’re most welcome to join. You’ll find details on my very-lightly-used “training email list”; you should join this if you’d like to hear about my upcoming training courses.

I make notes on some of these topics in my irregular “weekish notes” here on the blog and in my every-2-weeks “thoughts & jobs” email list. You’re welcome to join the list (your email is always kept private) if getting it in your inbox is more convenient.

At the end of my talks I always ask for a postcard “if you learned something”. I’ve just received the first for last week’s talk, from the Netherlands. Thanks!

Ian is a Chief Interim Data Scientist via his Mor Consulting. Sign up for Data Science tutorials in London and to hear about his data science thoughts and jobs. He lives in London, is walked by his high-energy Springer Spaniel and is a consumer of fine coffees.