Programming link clearance 2015: Python edition
I have a Coding bookmarks folder which is stuffed full of loads of interesting articles that I’ve never shared with anyone because they don’t really fit into any of my posts. So, taking an idea from The Old New Thing, I’m going to run a few ‘Link Clearance’ posts. This is the Python-focused one (there will be more soon, including a general programming one).
(Yes, I know it is now the middle of February 2016, but things got delayed a bit! Most of these links are from 2015 – with a few more recent ones added too.)
- Elements of Python Style: This Python style guide goes beyond PEP8 to give useful advice on the subtler art of writing high-quality Python code.
- 30 Python Language Features and Tricks You May Not Know About: I think I knew most of these, but there are some real gems in there – slice objects and enumerate are great.
- Ten awesome features of Python you can’t use as you refuse to upgrade to Python 3: A great presentation of smaller features of Python 3 that you may not know about – the first few are particularly useful
- Supporting Python 3: An in-depth guide: Now you’ve decided you want to use Python 3, you need to make your code work with it. This is a good place to start.
- Python 2.7 Quick Reference: Very comprehensive (and not necessarily ‘quick’) reference for Python 2.7 (but also mostly applicable to Python 3). Great to have open and rapidly search with Ctrl-F.
- Python String Format Cookbook: I’m sure I’m not the only person who struggles to remember some of the more complex options for new-style string formatting in Python – this should help
- The ever useful and neat subprocess module: A very comprehensive guide to this powerful – but sometimes rather complex – module. Please use this instead of os.system – it may be slightly harder to get started, but it will help you in the long run.
- Advanced use of Python decorators and metaclasses: Delving in more depth into Python decorators and their links with metaclasses – along with ‘callability’ in Python.
- Hands-On Introduction to Python Programming: Very detailed slides and notes (use t to switch between them) for a course in Python programming. Rather than just showing you how to do things, this takes you inside the language showing how things actually work.
- Modules and Packages – Live and Let Die: Slides from a presentation taking an in-depth look at how Python modules and packages work (note: not package management via pip, but modules and packages themselves). There’s a lot in here that I never knew before!
- Python for Computational Science and Engineering: Book-length introduction to scientific Python programming, including basic Python, plus numpy, matplotlib, SymPy and more.
- Bayesian Methods for Hackers: An introduction to Bayesian methods from a programming-perspective – also book-length and definitely worth a read.
- Think Bayes: If you didn’t like the previous book relying on the PyMC module then you might prefer this one – it teaches similar concepts but with pure Python (with a bit of numpy later on). It gave me a far better understanding of probability in general – not just Bayesian thinking.
- Kalman and Bayesian filters in Python: Yup, yet another book – but I promise this is the last one. It covers some of what has been covered in the two previous books, but goes into a lot of depth about Kalman filters, in a very easy-to-understand way.
- 100 numpy exercises: This link is actually far more interesting than it sounds – it’s amazing what can be done in numpy in very few lines of code. I’d recommend starting at the top and seeing how many of the exercises you can complete…and then looking at the answers which will probably teach you a lot!
- PSA: Consider using NumPy if you need to parse a large binary data file with a fairly simple format: This was very useful to me once, and I had no idea about it before I read this article – again, numpy is great!
- Pandas and Python: Top 10: A great introduction to useful pandas features. I often use this as a reference for functions that confuse me slightly (like map, apply and applymap).
- Python GDAL/OGR Cookbook!: Some good ‘cookbook’-style examples of using the Python interface to GDAL/OGR (for reading/writing geographic data). Particularly useful as the main GDAL docs are focused on the C++ interface
- Fitting models using R-style formulas: Have you ever wished for R-style formulas for fitting models in Python? Well, look no further – it can be done easily using a combination of statsmodels and patsy
- Probability distributions in SciPy: A great brief summary of probability distributions included in scipy, and how to use the various methods available on them
- Overview of Python Visualization: Visualisation options for Python were a lot less confusing when the only option was matplotlib! This should help you navigate the range of options now available
- What is your Jupyter workflow like?: As with many Reddit discussions, there is some gold buried amongst the less-useful comments. I definitely learnt some new ways of working.
- Getting the Best Performance out of NumPy: A good guide to increasing the performance of your numpy code.
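A few of the features mentioned in the links above (slice objects, enumerate, new-style string formatting and subprocess) can be tried out in a handful of lines. This is only a minimal sketch using the standard library, not code taken from any of the linked articles: the variable names and sample values are made up for illustration, and it assumes Python 3.5 or later for `subprocess.run`.

```python
import subprocess
import sys

# A slice object gives a range of indices a name, so it can be reused.
FIRST_THREE = slice(0, 3)
letters = ["a", "b", "c", "d", "e"]
head = letters[FIRST_THREE]

# enumerate yields (index, item) pairs; start=1 gives human-friendly numbering.
numbered = ["{}: {}".format(i, letter) for i, letter in enumerate(head, start=1)]

# New-style formatting: right-align in 8 characters, with 2 decimal places.
formatted = "{:>8.2f}".format(3.14159)

# subprocess takes the command as a list of arguments, so no shell quoting
# is involved (unlike os.system). sys.executable is the current interpreter.
result = subprocess.run(
    [sys.executable, "-c", "print('hello from a child process')"],
    stdout=subprocess.PIPE,
    universal_newlines=True,
    check=True,  # raise CalledProcessError if the child exits with an error
)
child_output = result.stdout.strip()

print(head, numbered, repr(formatted), child_output)
```

Passing the command as a list and capturing the output explicitly is the main practical difference from os.system, which hands a single string to the shell and only returns an exit status.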
These are all packages that didn’t quite fit in to my Top 5 Python Packages of 2015 post, but are still great:
- pypath-magic: A handy command-line tool and IPython magic to allow you to easily change your PYTHONPATH – very useful!
- MoviePy: Lovely simple interface to make animations/videos in Python – using whatever libraries/functions you want to create the actual images
- SWAPY: A simple GUI to allow you to interactively generate pywinauto scripts to automate functions on Windows. Even better is that you can then edit the resulting Python code if you want – far nicer than switching to something like AutoHotKey
- Glue: A great Python-based GUI for exploring data relationships, principally based on ‘linked displays’. All functionality is available through the Python API too – and the documentation is great.
- Gloo: I really loved the ProjectTemplate library for R, but somehow never quite got as comfortable with this port of the library to Python. I really should try again – as the idea of a standardised structure for all analysis projects is very appealing.
- pudb: Interactive, curses-style debugger, even accessible remotely and through IPython. I must remember to use this more!
- pony: An interesting new Object-Relational Mapper, a potential competitor to SQLAlchemy. I like its Pythonic nature
- pyserial: Simple and easy-to-use library for serial communications in Python. I’ve used this for connecting to scientific instruments as well as for home automation.
- xmltodict: This makes working with XML feel like you are working with JSON, by parsing XML data to a dict. You wouldn’t want to use it on enormous XML files, but for quick scripts it’s great!
- uncertainties: A very easy-to-use package that lets you do calculations with uncertain numbers (e.g. 3 +/- 0.3) – even in numpy arrays
- pathlib: Do you hate os.path.join as much as I do? How does dir / output_folder / filename seem instead? A great Pythonic path-handling package, which has been part of the standard library since Python 3.4. This package gives you the same functionality in earlier versions.
- geocoder: Very easy-to-use geocoding module
- fuzzywuzzy: Simple but comprehensive fuzzy string matching library
- blessings: The easiest way to introduce colour, font styles and positioning to your terminal programs
- PrettyPandas: Handy API for making nicely-formatted Pandas tables
- pandas-profiling: I think this is slightly misleadingly named: it doesn’t do profiling in a ‘speed’ sense, but in a ‘summary’ sense. Basically it’ll produce a lovely HTML summary of your Pandas DataFrame, with a huge amount of detail
- PyDataset: Do you envy R programmers with their handy access to various nice test datasets as data(cars) and so on? Well, this does the same for Python – with an even larger range of data
- pyq: Allows you to search Python code using jQuery-like selectors, such as class:extends(#IntegerField) for all classes that extend the IntegerField class. Fascinating, and I can see all sorts of interesting uses for this…if only I had the time!
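To give a flavour of the pathlib entry above, here is a short sketch of the `/` operator for joining paths. The folder and file names are invented for the example, and it uses `PurePosixPath` so that the result looks the same on every operating system; in a real script you would more likely use `Path`, which joins in the same way but can also touch the filesystem.

```python
from pathlib import PurePosixPath

# Hypothetical names, purely for illustration.
base_dir = PurePosixPath("/data")
output_folder = "results"
filename = "summary.csv"

# The / operator joins path components, replacing nested os.path.join calls.
joined = base_dir / output_folder / filename

# Path objects also expose the pieces of the path as attributes.
print(joined)                      # the full joined path
print(joined.parent.name)          # name of the containing folder
print(joined.suffix)               # file extension, including the dot
print(joined.with_suffix(".txt"))  # same path with a different extension
```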
I use the Anaconda scientific Python distribution to get a standard, easily-configurable Python set up on all of my machines. I’m not going to give full details for each of these links, as they are fairly self-explanatory – but definitely very useful for those using Anaconda.
- Running scripts in temporary conda environments with conda execute
- Advanced Features of Conda Part 1
- Advanced Features of Conda Part 2
The most difficult part of programming is designing and structuring your code: the actual ‘getting the computer to do what you want’ bit is often relatively easy. This becomes particularly difficult with larger projects. The links below are all interesting discussions of software architecture with a Python focus. I find the 500 Lines or Less posts to be particularly interesting: they all implement challenging programs in relatively short pieces of code. They’ll all be released in book form eventually – and I’m definitely going to buy a copy!
- The Architecture of Open Source Applications (Volume 2): matplotlib
- The Architecture of Open Source Applications (Volume 2): SQLAlchemy
- 500 Lines or Less | A Template Engine
- 500 Lines or Less | A Python Interpreter Written in Python
- 500 Lines or Less | A web crawler with asyncio co-routines