Software choices in remote sensing
I recently read the article Don’t be a technical masochist on John D. Cook’s blog, and it struck a chord with me about the way that I see people choosing software and programming tools in my field.
John states: “Sometimes tech choices are that easy: if something is too hard, stop doing it. A great deal of pain comes from using a tool outside its intended use, and often that’s avoidable…a lot of technical pain is self-imposed. If you keep breaking your leg somewhere, stop going there.”
I like to think that I can do some GIS/remote-sensing analyses more efficiently than others – and I think this is because I have a broad range of skills in many tools. If the only GIS/Remote Sensing tool you know how to use is ArcGIS then you try and do everything in Arc – even if the task you’re doing is very difficult (or even impossible) to do within Arc. Similarly, if you only know Python and won’t touch R, then you can’t take advantage of some of the libraries that are only available in R, which might save you a huge amount of time. I wouldn’t say I’m an expert in all of these tools, and I prefer some to others, but I have a working knowledge of most of them – and am able to acquire a more in-depth knowledge when needed.
Don’t get me wrong, sometimes it is very important to be able to do things within a certain tool – and sometimes it’s worth pushing the boat out a bit to try and see whether it’s possible to get a tool to do something weird but useful. Often though, it’s better to use the best tool for the job. That’s why, if you watch me work, you’ll see me switching backwards and forwards between various tools and technologies to get my job done. For example:
- I’m currently working on a project which involves a lot of time series manipulation. When I started this project, the Python pandas library (which deals very nicely with time series) wasn’t very mature, and I wasn’t prepared to ‘bet the project’ on this very immature library. So, even though Python is my preferred programming language, I chose to use R, with the xts library to handle my time-series analysis.
- I don’t use ArcGIS for remote sensing analysis, and I don’t use ENVI to do GIS. Yes, both programs allow you to deal with raster and vector data, but it’s almost always easier to use ENVI to do the remote sensing work, and then transfer things into ArcGIS to overlay it with vector data or produce pretty output maps (ENVI’s map output options are pretty awful!). I’ve lost track of the number of students I’ve found who’ve been really struggling to do satellite data processing in ArcGIS that would have taken them two minutes in ENVI.
- If there’s a great library for a different programming language then I use it. For example, I recently needed to create a set of images where each pixel contained a random value, but there was spatial correlation between all of the values. I investigated various tools which purported to be able to do this (including some random Python code I found, ArcGIS, specialist spatial stats tools and R) and in the end the one I found easiest to get working was R – so that’s what I used. Yes, it meant I couldn’t drive it easily from Python (although I could have used RPy or, as a last resort, run R from the command line using the Python subprocess module), but that was far easier than trying to write some code to do this from scratch in Python.
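To give an idea of what that 'last resort' of running R from the command line might look like, here is a minimal sketch using the Python subprocess module. It assumes that R's Rscript executable is installed and on the PATH; the function names are my own invention for illustration, and this is not the code I actually used for the project.

```python
import shutil
import subprocess


def build_rscript_command(script_path, args=()):
    """Return the argument list for running an R script non-interactively."""
    # --vanilla stops R from loading or saving workspace and profile files,
    # so the script behaves the same way on every run.
    return ["Rscript", "--vanilla", str(script_path), *[str(a) for a in args]]


def run_r_script(script_path, args=()):
    """Run an R script and return whatever it printed to standard output."""
    # Assumption: Rscript is installed and available on the PATH.
    if shutil.which("Rscript") is None:
        raise RuntimeError("Rscript was not found on the PATH")
    result = subprocess.run(
        build_rscript_command(script_path, args),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if R exits with an error
    )
    return result.stdout
```

With something like this in place, a call such as run_r_script("simulate_field.R", [100, 100]) would run a (hypothetical) R script with two arguments and hand back its printed output, leaving the rest of the workflow in Python.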
Overall, the tools I use for my day-to-day work of GIS/Remote-sensing data processing include (in approximate order of frequency of use): ENVI, GDAL’s command-line tools, QGIS, ArcGIS, GRASS, eCognition, PostGIS, Google Earth, BEAM, and probably more that I can’t think of at the moment. On top of that, in terms of programming languages, I use Python the most, but also R and IDL fairly frequently – and I’ve even been known to write Matlab, Mathematica and C++ code when it seems to be the best option (for example, Mathematica for symbolic algebra work).
Having a basic knowledge of all of these tools (and, of course, having them installed and set up on my computers and servers) allows me to get my work done significantly faster, by using the best (that is, normally, easiest) tool for the job.