Please specify a license for your software (and even your code samples!)
Another one of the things that came out of the Collaborations Workshop 2013 was the importance of licensing any software or code that you release. The whole point of releasing code is so that other people can use it – but no-one can use it properly unless you tell them what license you have released it under.
I’ve run into various problems with un-licensed code during the last few years, which makes it all the more embarrassing that I didn’t have proper licenses specified for all of my GitHub repositories – I didn’t even have proper licenses specified for some of the packages that I have released on the Python Package Index!
Anyway, I’ve been through and set licenses for all of my repositories, and I’ve explained the procedure I went through for each repository below, which consists of two simple steps:
- Choose a license – this is the thing that often scares people, and it can be fairly difficult. Unfortunately most of the guides to choosing open-source licenses online are fairly complicated, and unfortunately not much can be done about that, because some of the licenses themselves are fairly complicated. There are a few key things to keep in mind:
- Make sure that any license you choose is OSI-approved, or it won’t properly count as ‘open source’
- Think about whether you want to force everyone who uses your code to release anything they use it in as open-source (as in the GPL), any modifications they make to it as open-source (as in the LGPL), or whether people can do whatever they want with it.
- Think about whether you want to let anyone – including yourself – use the code in commercial software. If so, depending on your views on the previous question, you may want to release under a license like LGPL or BSD rather than releasing under the GPL which would mean the code could never be used commercially.
- Do you just really not care what happens to the code – anyone can do anything they want, in any way they want, and don’t have to bear you in mind in the slightest? If so, formally releasing the code into the public domain through something like CC0 may be a good way to go – although I wouldn’t recommend this for large bits of code/software as it is not specifically designed with software in mind – it is more appropriate for small code samples and examples.
- Tell people what the license is – in as many places as possible – now you’ve gone to the trouble of choosing a license you want people to be able to easily find out what license the code is released under. To achieve that, do as many of the items below as possible (ordered roughly in order of importance):
- Place a license file in your source directory with a filename of LICENSE or COPYING (depending on the license – generally GPL-style licenses ask you to use a file called COPYING, whereas many other licenses suggest a file called LICENSE). Put a full ASCII text copy of the license into this file.
- State in the README file (you do have a README file, don’t you?) what license the code is released under.
- State in the documentation (preferably somewhere fairly obvious, such as above the Table of Contents, or on the Introduction page) what license the code is released under.
- Add headers to the top of your source code files stating the license that the code is released under, and the author(s). Many licenses provide a suggested short version of the license to put here, but if not, just write a simple comment stating the license and telling readers to look in the LICENSE or COPYING file for more details.
- If your code has been uploaded to some sort of package repository, such as CRAN, CPAN, the Python Package Index or Rubygems then see if there is some metadata you can fill in to say what license you are using – as this will mean people can search for your software by the type of license it uses (for example, if they’re looking to use it commercially). For example, when writing the setup.py file ready to upload a Python module to the Python Package Index you can fill in various categories – known as troves – and a number of these are for licenses, such as
License :: OSI Approved :: GNU General Public License (GPL)
That’s it – it really is as simple as that. All you need to do now is to commit your changes to you version control system (you are using version control, aren’t you?) and anyone who finds your code will know what license it is under, and therefore what they are allowed to do with it.
However, there is one more thing to consider: what about example code that you post online (on your blog, on StackOverflow etc) – what should you do about that? I’m not a lawyer, but the consensus seems to be that if there isn’t an explicit license attached to the code then it is very difficult to decide what people are allowed to do with it – they’re almost certainly not allowed to use it commercially, and there may – or may not – be uses that count as ‘fair use’. To save everyone a lot of bother I’ve decided to release all of my small code samples into the public domain as CC0. The way I’ll do this is generally by a link in the text surrounding the sample (“As seen in the following example (released under CC0)”) or as a comment at the top of the code. I’ve also taken a number of old Github repositories that I wrote as part of various courses at university (for example, my parallel programming exercises) and released those as CC0, by simply specifying in the README files.
Hopefully this post will have been useful to you – let me know what you think in the comments below. Is the idea to use CC0 for small code samples a good idea? I think so personally, but I’d be interested to hear alternative points of view.
Categorised as: My Software, Programming
[…] file (you do have one, don’t you?), and your LICENCE file (youÂ must have one – see my explanation for why)Â and possibly also your INSTALL file and so […]
The other reason formal licensing is important is that some people release code under some CC or CC-like license (or a misleading vague reference to them) then later change the license information in their code to read that it is proprietary. They use this to go after innocent organizations who have no way to prove, later, that they downloaded CC code. Take-backsies at their worst.