Five new-ish Python things – Part 1
I keep gathering links to interesting Python things I’ve seen around the internet (new packages, good tutorials, and so on), so I thought I’d start a series where I share them every so often.
Not all of these are new new – some have been around for a while but are new to me – and so they might be new to you too!
Also, there is a distinct ‘PyData’ flavour to these things – they’re all things I’ve come across in my work in data science and geographic processing with Python.
So, on with the list:
I try really hard to follow the PEP8 style guide for my Python code – but I wasn’t so disciplined in the past, and so I’ve got a lot of old code sitting around which isn’t styled particularly well.
One of the things PEP8 recommends against is using from blah import *. In my code I used to do a lot of from matplotlib.pyplot import * and from Py6S import *, but it’s a pain to go through old code, work out what functions are actually used, and replace the import with something like from matplotlib.pyplot import plot, xlabel, title.
removestar is a tool that will do that for you! Just install it with pip install removestar, and it provides a command-line tool to fix your imports for you.
For example, running removestar ncaveo.py gives the following diff as output:
--- original/ncaveo.py
+++ fixed/ncaveo.py
@@ -1,7 +1,7 @@
 # Import Py6S
-from Py6S import *
+from Py6S import Geometry, GroundReflectance, SixS, SixSHelpers
 # Import the Matplotlib plotting environment
-from matplotlib.pyplot import *
+from matplotlib.pyplot import clf, legend, plot, savefig, xlabel, ylabel
 # Import the functions for copying objects
 import copy
To run it on all of the Python files in a module, and make the edits in place rather than just showing the diffs, you can run it as follows:
removestar -i module_folder/
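To give a feel for the kind of analysis involved, here is a minimal sketch using only the standard library. It is not removestar’s actual implementation: the function name names_used_from_star_import is made up for illustration, and the sketch ignores complications such as names that are redefined locally or provided by more than one star import.

```python
import ast
import importlib


def names_used_from_star_import(source: str) -> dict[str, list[str]]:
    """Map each star-imported module to the names from it that the source reads."""
    tree = ast.parse(source)

    # Modules pulled in with `from module import *` (relative imports are skipped)
    star_modules = [
        node.module
        for node in ast.walk(tree)
        if isinstance(node, ast.ImportFrom)
        and node.module is not None
        and any(alias.name == "*" for alias in node.names)
    ]

    # Every bare name that is read somewhere in the file
    used = {
        node.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load)
    }

    result = {}
    for module_name in star_modules:
        module = importlib.import_module(module_name)
        # A star import exposes __all__ if it is defined, else all public names
        exported = getattr(module, "__all__", None)
        if exported is None:
            exported = [name for name in dir(module) if not name.startswith("_")]
        result[module_name] = sorted(used & set(exported))
    return result


example = """
from math import *

area = pi * pow(3, 2)
print(floor(area))
"""

# Print the explicit import line that could replace each star import
for module_name, names in names_used_from_star_import(example).items():
    print(f"from {module_name} import {', '.join(names)}")
```

For the example source this prints from math import floor, pi, pow, which is the same kind of replacement shown in the diff above.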
If you use OS X then you’ll know about the very handy Quick Look feature, which shows you a preview of the selected file in Finder when you press the spacebar. You can add support for new filetypes using Quick Look plugins, and I’d already set up a number of useful plugins which show syntax-highlighted code, preview JSON, CSV and Markdown files nicely, and so on.
I only discovered ipynb-quicklook last week, and it does what you’d expect: it provides previews of Jupyter Notebook files from the Finder. Simply follow the instructions to place the ipynb-quicklook.qlgenerator file in your ~/Library/QuickLook folder, and it ‘Just Works’. It’s really quick to render the files too!
This is a great cheatsheet for the matplotlib plotting library from Nicolas Rougier. It’s a handy quick reference for all the various matplotlib settings and functions, and reminded me of a number of things matplotlib can do that I’d forgotten about.
A high-resolution image of the cheatsheet is available, along with a repository containing all the code used to create it. Nicolas is also writing a book called Scientific Visualization – Python & Matplotlib, which looks very promising, and it’ll be released open-access once it’s finished (you can donate to see it ‘in progress’).
If you’re not interested in geographic data processing using Python then this probably won’t interest you, but for those who are, this looks great. PyGEOS provides native Python bindings to the GEOS library, which many geospatial tools use for geometry manipulation (such as calculating distances, or finding out whether one geometry contains another). By using the underlying C library, PyGEOS bypasses the Python interpreter for a lot of the calculations, allowing them to be vectorised efficiently and making it very fast to apply these geometry functions: their preliminary performance tests show speedups ranging from 4x to 136x. The interface is very simple too, for example:
import pygeos
import numpy as np

points = [
    pygeos.Geometry("POINT (1 9)"),
    pygeos.Geometry("POINT (3 5)"),
    pygeos.Geometry("POINT (7 6)")
]
box = pygeos.box(2, 2, 7, 7)

pygeos.contains(box, points)
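As a rough illustration of what ‘vectorised’ means here, the same containment check can be sketched with plain NumPy arrays. This is not the PyGEOS API, just the same points and box written as coordinate arrays, and it assumes that a point lying exactly on the boundary does not count as contained, which is how GEOS defines ‘contains’.

```python
import numpy as np

# Same coordinates as the three points and the box above
xs = np.array([1.0, 3.0, 7.0])
ys = np.array([9.0, 5.0, 6.0])
xmin, ymin, xmax, ymax = 2.0, 2.0, 7.0, 7.0

# One comparison per whole array rather than one Python-level call per point.
# Strict inequalities: a point on the box boundary is not treated as contained.
inside = (xs > xmin) & (xs < xmax) & (ys > ymin) & (ys < ymax)

print(inside)
```

Only the second point lies strictly inside the box: the first is outside it, and the third sits on its right-hand edge.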
This project is still in the early days – but definitely one to watch as I think it will have a big impact on the efficiency of Python-based spatial analysis.
napari is a fast multi-dimensional image viewer for Python. I found out about it through an extremely comprehensive blog post written by Juan Nunez-Iglesias where he explains the background to the project and what problems it is designed to solve.
One of the key features of napari is that it has a full Python API, allowing you to easily visualise images from within Python: as easily as using imshow() from matplotlib, but with far more features. For example, to view three of the scikit-image sample images just run:
from skimage import data
import napari

with napari.gui_qt():
    viewer = napari.Viewer()
    viewer.add_image(data.astronaut(), name='astronaut')
    viewer.add_image(data.moon(), name='moon')
    viewer.add_image(data.camera(), name='camera')
You can then add some vector points over the image – for example, to use as starting points for a segmentation:
import numpy as np

# Three example point coordinates to overlay on the image
points = np.array([[100, 100], [200, 200], [300, 100]])
viewer.add_points(points, size=30)
That is very useful for me already, and it’s just a tiny taste of what napari has to offer. I’ve only played with it for a short time, but I can see it coming in handy next time I’m doing a computer vision project, and I’m already planning to discuss some potential new features to help with satellite imagery work. Definitely something to check out if you’re involved in image processing in any way.