Review: High Performance Python by Gorelick and Ozsvald
Summary: Fascinating book covering the whole breadth of high performance Python. It starts with detailed discussion of various profiling methods, continues with chapters on performance in standard Python, then focuses on high performance using arrays, compiling to C and various approaches to parallel programming. I learnt a lot from the book, and have already started improving the performance of the code I wrote for my PhD (rather than writing up my thesis, but oh well…).
I would consider myself to be a relatively good Python programmer, but I know that I don’t always write my code in a way that would allow it to run fast. This, as Gorelick and Ozsvald point out a number of times in the book, is actually a good thing: it’s far better to focus on programmer time than CPU time – at least in the early stages of a project. This has definitely been the case for the largest programming project that I’ve worked on recently: my PhD algorithm. It’s been difficult enough to get the algorithm to work properly as it is – and any focus on speed improvements during my PhD would definitely have been a premature optimization!
However, I’ve now almost finished my PhD, and one of the improvements listed in the ‘Further Work’ section at the end of my thesis is to improve the computational efficiency of my algorithm. I specifically requested a review copy of this book from O’Reilly as I hoped it would help me to do this: and it did!
I have a background in C and have taken a ‘High Performance Computing’ class at my university, so I already knew some of theory, but was keen to see how it applied to Python. I must admit that when I started the book I was disappointed that it didn’t jump straight into high performance programming with numpy, and parallel programming libraries – but I soon changed my mind when I learnt about the range of profiling tools (Chapter 2), and the significant performance improvements that can be done in pure Python code (Chapters 3-5). In fact, when I finished the book and started applying it to my PhD algorithm I was surprised just how much optimization could be done on my pure Python code, even though the algorithm is a heavy user of numpy.
When we got to numpy (Chapter 6) I realised there were a lot of things that I didn’t know – particularly the inefficiency of how numpy allocates memory for storing the results of computations. The whole book is very ‘data-driven’: they show you all of the code, and then the results for each version of the code. This chapter was a particularly good example of this, using the Linux perf tool to show how different Python code led to significantly different behaviour at a very low level. As a quick test I implemented numexpr for one of my more complicated numpy expressions and found that it halved the time taken for that function: impressive!
I found the methods for compiling to C (discussed in Chapter 7) to be a lot easier than expected, and I even managed to set up Cython on my Windows machine to play around with it (admittedly by around 1am…but still!). Chapter 8 focused on concurrency, mostly in terms of asynchronous events. This wasn’t particularly relevant to my scientific work, but I can see how it would be very useful for some of the other scripts I’ve written in the past: downloading things from the internet, processing data in a webapp etc.
Chapter 9 was definitely useful from the point of view of my research, and I found the discussion of a wide range of solutions for parallel programming (threads, processes, and then the various methods for sharing flags) very useful. I felt that Chapter 10 was a little limited, and focused more on the production side of a cluster (repeatedly emphasising how you need good system admin support) than how to actually program effectively for a cluster. A larger part of this section devoted to the IPython parallel functionality would have been nice here. Chapter 11 was interesting but also less applicable to me – although I was surprised that nothing was mentioned about using integers rather than floats in large amounts of data where possible (in satellite imaging values are often multiplied by 10,000 or 100,000 to make them integers rather than floats and therefore smaller to store and quicker to process). I found the second example in Chapter 12 (by Radim Rehurek) by far the most useful, and wished that the other examples were a little more practical rather than discussing the production and programming process.
Although I have made a few criticisms above, overall the book was very interesting, very useful and also fun to read (the latter is very important for a subject that could be relatively dry). There were a few niggles: some parts of the writing could have done with a bit more proof-reading, some things were repeated a bit too much both within and between chapters, and I really didn’t like the style of the graphs (that is me being really picky – although I’d still prefer those style graphs over no graphs at all!). If these few niggles were fixed in the 2nd edition then I’d have almost nothing to moan about! In fact, I really hope there is a second edition, as one of the great things about this area of Python is how quickly new tools are developed – this is wonderful, but it does mean that books can become out of date relatively quickly. I’d be fascinated to have an update in a couple of years, by which time I imagine many of the projects mentioned in the book will have moved on significantly.
Overall, I would strongly recommend this book for any Python programmer looking to improve the performance of their code. You will get a lot out of it whether you write in pure Python or use numpy a lot, whether you are an expert in C or a novice, and whether you have a single machine or a large cluster.