On a recent project I struggled to find a way to parse strings that contain a date range, for example:
- 27th-29th June 2010
- Tuesday 29 May -> Sat 2 June 2012
- From 27th to 29th March 1999
None of the Python modules I investigated (including parsedatetime) seemed to be able to cope with the range of strings that I had to deal with. I investigated patching parsedatetime to allow it to do what I wanted, but I found it very hard to get into the code. So, I thought, why not write my own…
So I did. I’ve released it under the LGPL, and you can install it right now by running:
pip install daterangeparser
The current version will parse a wide range of formats (see the examples in the documentation) and will deal with individual dates as well as date ranges. The API is very simple: just import the parse function and call it with the date range string as its argument. For example:
from daterangeparser import parse
print(parse("14th-19th Feb 2010"))
This will produce an output tuple with two datetime objects in it: the start and end date of the range you gave.
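Because the result is just a pair of datetime objects, ordinary datetime arithmetic works on it. Here is a minimal sketch: days_in_range is a hypothetical helper of mine, not part of daterangeparser, and the hard-coded pair stands in for what the example call above should return. Treating a missing end date as a one-day range is an assumption on my part about how a single date might come back.

```python
from datetime import datetime


def days_in_range(start, end):
    """Count the calendar days covered by a range, including both endpoints.

    Assumption: a single date is represented by end being None,
    in which case the range is treated as one day long.
    """
    if end is None:
        return 1
    return (end.date() - start.date()).days + 1


# Stand-in for: start, end = parse("14th-19th Feb 2010")
start, end = datetime(2010, 2, 14), datetime(2010, 2, 19)

print(days_in_range(start, end))  # 6
```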
The parser is built using PyParsing, a great Python parsing framework that I have found very easy to get to grips with. It is incredibly powerful, very easy to use, and really shows how limited regular expressions can be! Now that I’ve done this I have an urge to use PyParsing to write parsers for all of the horrible scientific data formats that I have to deal with in my PhD... watch this space!
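To give a flavour of what a PyParsing grammar looks like, here is a toy sketch that handles only the simplest format from the list at the top (two days, an abbreviated month and a year). It is not the grammar that daterangeparser actually uses; parse_simple_range and the MONTHS table are names I have made up purely for illustration.

```python
from datetime import datetime

from pyparsing import Optional, Suppress, Word, nums, oneOf

# Hypothetical lookup table: abbreviated month name -> month number.
MONTHS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}

# Building blocks. Each parse action converts the matched text to an int.
day = Word(nums).setParseAction(lambda t: int(t[0]))
suffix = Suppress(Optional(oneOf("st nd rd th")))  # "14th" -> 14
month = oneOf(" ".join(MONTHS)).setParseAction(lambda t: MONTHS[t[0]])
year = Word(nums).setParseAction(lambda t: int(t[0]))

# Grammar for strings shaped like "14th-19th Feb 2010".
date_range = (
    day("start_day") + suffix
    + Suppress("-")
    + day("end_day") + suffix
    + month("month")
    + year("year")
)


def parse_simple_range(text):
    """Return a (start, end) tuple of datetimes for this one simple format."""
    r = date_range.parseString(text, parseAll=True)
    start = datetime(r["year"], r["month"], r["start_day"])
    end = datetime(r["year"], r["month"], r["end_day"])
    return start, end


print(parse_simple_range("14th-19th Feb 2010"))
```

Each piece of the grammar has a name and can be reused or extended (full month names, weekday prefixes, "to" as a separator), which is much harder to keep readable in a single regular expression.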