Robin's Blog

Announcing DateRangeParser: Parse strings like “27th-29th June 2010”

In a project recently I was struggling to find a way to parse strings that contain a date range, for example:

  • 27th-29th June 2010
  • Tuesday 29 May -> Sat 2 June 2012
  • From 27th to 29th March 1999

None of the Python modules I investigated (including parsedatetime) seemed to be able to cope with the range of strings that I had to deal with. I investigated patching parsedatetime to allow it to do what I wanted, but I found it very hard to get into the code. So, I thought, why not write my own…

So I did. I’ve released it under the LGPL, and you can install it right now by running:

pip install daterangeparser

Or you can visit the DateRangeParser PyPI page to download it manually, read the documentation, or hack on the code.

The current version will parse a wide range of formats (see the examples in the documentation) and will deal with individual dates as well as date ranges. The API is very simple: just import the parse function and call it with the date range string as its argument. For example:

from daterangeparser import parse
print(parse("14th-19th Feb 2010"))

This will print a tuple containing two datetime objects: the start and end dates of the range you gave.
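Since the result is an ordinary tuple of datetime objects, you can unpack it and work with it like any other pair of dates. Here is a minimal sketch of that, using only the standard library. The describe_range helper is hypothetical (it is not part of DateRangeParser), and the two datetime values are hand-written stand-ins for what I would expect parse("14th-19th Feb 2010") to return, assuming the times default to midnight:

```python
from datetime import datetime


def describe_range(start, end):
    """Summarise a (start, end) pair of datetime objects.

    Hypothetical helper for illustration; not part of DateRangeParser.
    """
    # Count both the first and the last day of the range.
    days = (end.date() - start.date()).days + 1
    return "{} to {} ({} days)".format(
        start.date().isoformat(), end.date().isoformat(), days
    )


# Stand-in values (an assumption). With the library installed you would
# instead write: start, end = parse("14th-19th Feb 2010")
start, end = datetime(2010, 2, 14), datetime(2010, 2, 19)

print(describe_range(start, end))
```

With the real library you would swap the stand-in line for the parse call shown in the comment; the rest stays the same.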

The parser is built using PyParsing, a great Python parsing framework that I have found very easy to get to grips with. It is incredibly powerful, very easy to use, and really shows how limited regular expressions can be! Now that I’ve done this, I have an urge to use PyParsing to write parsers for all of the horrible scientific data formats that I have to deal with in my PhD. Watch this space!

This post originally appeared on Robin's Blog.

Categorised as: Linux, OSX, Programming, Python, Windows

One Comment

  1. Paul McGuire says:

    Glad to hear pyparsing was helpful! Good luck in your future studies, Paul
