Blog

Entrepreneurial Geekiness

A tiny foray into Apache Spark & Python

I’ve spent an afternoon playing with Apache Spark (1.0.1) to start to form an opinion on where it might be useful. Here’s a couple of notes. We’re discussing this at PyDataLondon tonight. UPDATE I cover PySpark 1.2, ElasticSearch and PyPy in 2015. You can run Spark out of the box on Linux (I’m using 13.10) […]

IPython Memory Usage interactive tool

I’ve written a tool (ipython_memory_usage) to help my colleague and I understand how RAM is allocated for large matrix work, it’ll work for any large memory allocations (numpy or regular Python or whatever) and the allocs/deallocs are reported after every command. Here’s an example – we make a matrix of 10,000,000 elements costing 76MB and […]

PyDataLondon second meetup (July 1st)

Our second PyDataLondon meetup will be running on Tuesday July 1st at Pivotal in Shoreditch. The announce went out to the meetup group and the event was at capacity within 7 hours – if you’d like to attend future meetups please join the group (and the wait-list is open for our next event). Our speakers: […]

High Performance Python manuscript submitted to O’Reilly

I’m super-happy to say that Micha and I have submitted the manuscript to O’Reilly for our High Performance Python book. Here’s the final chapter list: Understanding Performant Python Profiling to find bottlenecks (%timeit, cProfile, line_profiler, memory_profiler, heapy and more) Lists and Tuples (how they work under the hood) Dictionaries and Sets (under the hood again) […]

Flask + mod_uwsgi + Apache + Continuum’s Anaconda

I’ve spent the morning figuring out how to use Flask through Anaconda with Apache and uWSGI on an Amazon EC2 machine, side-stepping the system’s default Python. I’ll log the main steps in, I found lots of hints on the web but nothing that tied it all together for someone like me who lacks Apache config […]

7 chapters of “High Performance Python” now live

O’Reilly have just released another update to our High Performance Python book, in total we’ve now released the following: Understanding Performance Python Profiling to find bottlenecks (%timeit, cProfile, line_profiler, memory_profiler, heapy) Lists and Tuples Dictionaries and Sets Iterators and Generators Matrix and Vector Computation (numpy and scipy) Compiling to C (Cython, Shed Skin, Pythran, Numba, […]

First PyDataLondon meetup done, preparing the second

Last night we ran our first PyDataLondon meetup (@PyDataLondon). We had 80 data-focused Pythonistas in the room, co-organiser Emlyn lead the talks followed by a great set of Lightning Talks. Pivotal provided a cool venue (thanks Ian Huston!) with lovely pizza and beer in central Shoreditch – we’re much obliged to you. This was a […]