Summary: Cheap, lots of features, can resell. Servers located in US (may be a problem) and expensive for dedicated servers.
Details: Register at Dreamhost. If you use the promotion code RTWILSONBLOG you will get $50 off your hosting package, and a free lifetime domain registration – yes, really! The details of their standard package are available here.
I’ve had my own domain (rtwilson.com) and associated website and blog for around a year and a half now, and I’m very glad I decided to set it up. I’d always wanted a website, but wasn’t quite sure what I’d put on it (well, I guess I’ve sorted that now by having useful academic stuff to put there), and I thought it would be expensive to set up.
In fact, I didn’t have to spend much money because I got a great deal through Dreamhost. I’d heard of the hosting company before, but wasn’t sure whether they were any good or not. After reading some good reviews at Lifehacker, I decided to bite the bullet and go with them – and I’m glad I did.
Their support for many different services (PHP, Ruby on Rails, Subversion, Jabber etc) was a definite selling point for me, as was the capability to resell hosting services (which I have started doing). Unlimited downloads and unlimited storage really do mean that – and they have a lovely 1-click installer which will set up common web applications (WordPress, MediaWiki and many others) for you in less than five minutes.
I’m always slightly wary of tech companies’ customer support, but Dreamhost have been very good to me. When I signed up I used the wrong promotion code, and they managed to sort that out for me within around half an hour (instant online chat to customer advisors is a useful thing), and they’ve also helped me with troubleshooting some Rails problems that I’ve had. Their online web hosting control panel is by far the best I’ve seen – far better than cPanel, which most people seem to use – and combines ease of use and power.
For those in the UK or Europe, you may wish to be aware that their servers are located exclusively in the USA and therefore sites can appear slow sometimes. However, this seems to have improved a lot recently, and I don’t notice any problems with my sites.
So, overall I am very happy with Dreamhost and happy to recommend them.
For a while I used Quicksilver, but it seemed to be rather unstable on my machine. Then I used Quick Search Box, but it got terribly slow. Then, somehow, I found out about Alfred - and it’s replaced them both.
Alfred, like the other applications mentioned above, is a launcher, but it does far more than just launch things. Alfred will let you search websites, find local files, define words, perform calculations, email files, find contacts, and is very extensible as it has good command-line integration.
The basic app is free, but you can buy the add-on PowerPack for £12 (that’s around $20 for those in the States) which gives you access to the advanced features such as terminal integration. So – have a try of the free version, and buy the extension if you like it or need the extra features. Overall I’d thoroughly recommend it. It’s completely replaced the other two for me – and is so much faster that I can’t see how I coped with the slowness of Quick Search Box.
Stay tuned for some more Alfred posts soon – including how I have set up some useful command-line tools to work with it.
When clearing out some of my old programming books the other day I realised how far I’d come with programming over the years, and the number of different technologies that I have used over time. I thought I’d do a little summary post going from first programming experience to now – and it’s amazing what’s changed.
My First Programming Experience
My first proper programming experience was with a BBC Micro computer. You can tell how old I am (fairly young, I know, but relatively old in computing terms) by the fact that when I was at primary school we had four computers – three BBC Micros and one Acorn machine. The machine I was using had a 5 1/4″ floppy drive (as pictured above), but the book that I used to teach myself how to program assumed that I’d be using an audio cassette recorder to store my programs on – how awesome is that? Old-school, yes, but practical – after all, most people had a tape recorder available (in those days at least). Anyway, I managed to find a book called BBC Basic for Beginners which taught me how to write simple programs – think guessing games and simple question-and-answer programs (“What is your name?”, “Hello <NAME>”). I distinctly remember trying to persuade a friend to stay in during break time and write a program with me. He wasn’t interested – but oh well.
Moving further – but still staying rather BASIC
In my final year of primary school, when I was aged 10, we got our first computer at home. It ran Windows 98 (fairly new at that time) and I think it had an amazing 450MHz processor! We also had access to the internet via dial-up, where it took only 20 minutes to download a 1MB file – which seemed incredibly fast at the time! Anyway, on this computer I discovered that there was a similar programming language to BBC Basic called QBasic. This had many of the same ways of writing things as BBC Basic, but was far more powerful. I had great fun writing a number of programs using QBasic – I distinctly remember writing a number of maths programs (printing multiplication tables, calculating prime numbers etc) and a few menu-based programs. I managed to find The QBasic Bible in the library one day, and couldn’t stop reading it for about three days. I remember running up to my Dad and telling him, very excitedly, about this new SELECT CASE statement that I’d come across that would remove the need for nested if statements.
Visual-ing it up
The natural progression from QBasic was to Visual Basic, and it so happened that a friend of my grandparents gave me a copy of Visual Basic 6 Learning Edition. He’d been using it for a while but had upgraded to the Professional Edition as some features he needed were only available there. With this I discovered the joy (at least I thought it was joy then!) of GUI programming. I started off without any textbooks (our library hadn’t got that up-to-date then…) and followed a lot of tutorials. It was then that I realised that following a tutorial and just typing in the code didn’t really teach me anything – I needed to play around with the code, change it and see what happened, and truly try to understand it. I remember writing a painting program (without really understanding it – but being amazed at the OnPaint method and how that worked) and GUI versions of my maths programs.
An internet sideline – plus some office tools
My father is an IT (sorry, ICT) teacher, and I often managed to borrow some of his textbooks/workbooks and learnt a number of useful skills: programming in Visual Basic for Applications in Excel and Access, and writing HTML. I wrote HTML pages with no server to host them on – just running them locally on my computer – and produced a project for school entirely in HTML – which amazed the teachers!
Finally – a useful use?
It seems hard to believe, but this must have taken me to the end of secondary school. I think I must have had a break from programming while doing my GCSEs, as I can’t really remember writing anything then. I think I probably dabbled in PHP a bit (a sin we’re all guilty of, I’m sure) and I remember playing with Pascal for a few days, but nothing else really. This changed when I got to Sixth Form College, and took A-Level Computing. For my A2 coursework I had to produce a piece of software for a real end-user – and I chose the person who managed the internal bookshop at my father’s school. I assessed requirements, implemented and tested code and wrote user guides. I remember feeling very impressed with myself when I worked out how to interface with the Amazon website so that I could look up book details from their ISBN numbers. (Yes I know the screenshot above is terrible – it’s the only one I could find, and is one sent by the user to show a bug – that is, that I allowed the application to be maximised and didn’t resize the controls on the screen once the user had done it). I think it must have been as part of the pseudo-assembly language that we learnt that I got interested in C and started playing around a bit. This, of course, led to a bit of C++ and a little bit more of an understanding of object-oriented programming.
A literary aside
It was around this time that I started getting very interested in reading about programming – both online and in paper format. I was a huge (understatement) fan of Joel on Software and Coding Horror and started to touch on a few of Paul Graham‘s posts too. I bought a number of programming books, ranging from the specific (ActiveX Data Objects in Visual Basic 6) to the general (Algorithms and Data Structures).
My coding goes nuclear
After Sixth Form college I took a gap year through the wonderful Year in Industry organisation, who placed me with British Energy – a company who run nuclear power stations. I spent my year writing two pieces of software to run at Sizewell B Nuclear Power Station as part of the control and safety systems. Obviously I can’t say a huge amount about what I did – but the screenshot above is allowed to be shown in public, and shows one of the sections of the software that I was most proud of. It is a live status screen showing the position and movement of a number of moveable detectors inside the reactor, and took a long time to program correctly.
It was during my Year in Industry that I learnt to program in the .NET framework – we were (for our sins) using Visual C++ .NET, which combined (again, for my sins – whatever hellish crimes they were) managed .NET C++ code with unmanaged standard C/C++ code. I got into electronics and drivers (for dealing with serial communications), writing my own graphical controls (particularly for the live status screen) and proper object-oriented design. I read Knuth (parts of) and the Gang of Four book (all of), and learnt a huge amount.
University – a scientific focus?
Everyone at British Energy assumed I’d go off to university to study Computer Science, but I shocked them by studying Geography at the University of Southampton. I left my professional programming behind to focus on rivers, glaciers, cities and poverty – but I found a way to get programming back in. During my second year I took a course in Remote Sensing – basically satellite imaging – and was amazed by what was possible. I found the software easy to use, and found that I could start extending it with my programming experience. In fact, I took my programming scientific in many ways – learning a number of new languages so that I could use various scientific tools. These included R (for statistics), Python (for use with NumPy) and IDL – a language that no-one seems to have heard of, but which is a very useful array-based scientific language, which is well integrated with a popular satellite imaging tool.
I spent the summers of my undergraduate degree doing research bursary placements (including, of course, some programming), and then used significant amounts of programming in my undergraduate dissertation. I then decided – on Christmas Eve of all days – to release a number of the pieces of code that I wrote to help me do the data processing for my dissertation as an open-source extension for a piece of satellite imaging software. I was incredibly excited to release my first bit of proper software ‘into the wild’ – and was amazed when I got to 10 downloads! The last few versions combined have had over 150 downloads now – and it is used in universities in continental Europe, environmental consultancies in Australia and by police forces in the US.
During my time at university I’ve also got more into a number of web languages and frameworks. I’ve (just about) learnt CSS, so I can produce a vaguely serviceable website, and I got very into Ruby – both as a general scripting language and as a web framework (with and without Ruby on Rails).
Piling on the workload
I’m now doing a PhD, and have returned to lower-level programming as part of my training during my first year. I took a course in parallel programming (very interesting; very difficult – think pointers but ten times harder), machine learning (it’s amazing what you can get computers to do with a bit of maths) and more (yes more) remote sensing. And, I’m really enjoying myself. My code is working (most of the time) and being used by other people, and it’s helping me write papers that – hopefully – will get published soon. I have a website, and a blog (which you’re reading).
So – how to summarise?
I’ve really enjoyed programming over the years, and have journeyed a long way from my first programming experiences. Yes, I learnt to program in BASIC, and some people say that means I’ll never be a good programmer, but I beg to differ. It gave me a great starting point which I have built on, and I’m very glad that programming is part of my life. In fact, writing this article – and finding the screenshots to illustrate it – has made me quite nostalgic for my early programming experiences – just seeing QBasic again or VB6 has triggered so many memories. Anyway, just to finish, I’ll attempt to give a list of the various languages and frameworks that I have used over the years – I’m sure I’m missing some, but here’s a flavour:
- BBC Basic
- QuickBasic (the full commercial version of QBasic – a little bit of ncurses-style GUI)
- Visual Basic 6
- ActiveX and OCX (through Visual Basic – mostly as extra GUI controls)
- Object-Oriented programming (originally through Visual Basic…yes I know…)
- C# .NET
- C++ .NET
- Python (particularly with NumPy)
- Ruby on Rails
So, programming has served me well – and I look forward to updating this list in another 15 years’ time!
Ahhh, PDFs… or, more formally, Portable Document Format files. I remember the days when I thought that PDFs were only for instruction manuals downloaded from the internet, or electronic copies of things that you don’t want people to be able to alter. Not so – I have recently discovered the joys of PDFs, particularly through my use of Mac OS X. I will explain more below:
1. PDFs are a vector filetype. That means that when there is text in a PDF document it is stored as text, with details of the font, size and location. When there is a line in a PDF document it is stored as a line from one location to another, with details of colour, width etc. This has a number of benefits – principally that PDFs maintain their quality no matter how much you zoom into them. You never get the horrible pixelated look that you can get with raster graphics files (such as JPEG and PNG). For scientific documents this is great – it means I can generate a graph, and then with one file produce an A3-sized copy for use in a poster and a 6″ x 4″ copy for inclusion in a paper. Not only that, but the 6″ x 4″ copy actually looks good – it looks professional, clean and high quality. Recently I had to include a PNG graph in a LaTeX document that I was writing – I hated it: all of the lines were blurry, I couldn’t resize it and it generally looked unprofessional. Of course, I’m not suggesting you should store your holiday snaps as PDFs – that’s not what they’re designed for – but for diagrams, graphs and other technical drawings they are perfect.
2. PDFs are cross platform. Nearly every system can read PDFs these days. The standard is now open (ISO 32000-1). There are readers for Windows, Linux, OS X, Android, iOS, Palm, BSD, BeOS – you name it, it’ll probably have a PDF reader. Google Chrome even has a built-in reader these days – and many websites have stopped saying “This is a PDF file. If you don’t have a program to view them please download Adobe Reader”.
3. PDFs can be included in LaTeX documents extremely easily. Yes, I know you can include PNG files just as easily, and possibly it’s even easier to use PostScript files (although who doesn’t use pdfTeX these days?). As mentioned above, raster files for graphs just look horrible, particularly when included in a LaTeX document in which (as is nearly always the case with LaTeX) all of the rest of the design and typography is near-perfect.
4. PDFs do not have to be A4 sized (or Letter, for those in the US). I know – I didn’t realise this until very recently, but you can crop PDFs to any size. In fact, there is a great Perl script called pdfcrop which will crop a PDF file to the minimum bounding rectangle of the contents – taking your A4-sized PDF with a 6″ x 4″ graph in it down to a 6″ x 4″ PDF – perfect for inclusion in a LaTeX document, for example.
5. PDFs can be annotated easily. For example, as the text in a PDF document is stored as text, it can be selected just like text in a word processor, and then highlighted just as easily. Of course you can also add extra text or vector illustrations (such as circles around important features in a diagram). This is great for making notes on, and highlighting papers, articles and e-books.
6. PDFs are a first class filetype in OS X. I never knew how much I’d value this until I started using OS X. By default so many things are PDF. For example, PDF export is built into the standard OS X print dialog box. On Windows you’d have to install something like CutePDF to do that – but OS X does it by default. In fact, if you use the Print Preview function in the print dialog box, OS X simply prints to a PDF and shows you the PDF in the aptly-named Preview application. In fact, this application – which can display almost any graphical filetype – is also a powerful PDF editor. Using Preview you can re-order PDF pages, merge PDFs, annotate PDFs and crop PDFs. PDFs are a first class filetype in other ways too – all Spotlight searching by default searches within text in PDF files, and there is a separate section in the Spotlight dropdown for PDF files. Overall, Apple just seem to ‘get’ PDF.
Summary: Perfect for the price. Great cross-platform compatibility. Couldn’t ask for more.
Details: Asonic External USB 2.0 8 Channel Sound Card. I think it comes in various incarnations with the same chips inside, but I got mine from eBuyer where it was sold as the Ebuyer Extra Value Asonic External USB 2.0 8 Channel Sound Card.
A while back I wanted to buy a cheap USB soundcard to use with my laptop so that I could use my new (secondhand) 5.1 surround sound speaker system properly. I found this sound card on Ebuyer for under £10 and decided that I couldn’t really go wrong. I was right – it Just Works(tm).
So far I have used it on Mac OS X, Windows Vista, Windows 7 and Linux and it has worked perfectly with all of them. I’m sure it doesn’t give as high quality output as ‘proper’ soundcards such as those by Creative, but it sounds perfectly acceptable to my ears. It comes with drivers for Windows, but is picked up automatically very well. It also works automatically in both OS X and Linux.
For those interested in using it with Linux, the output of lsusb is below:
Bus 004 Device 003: ID 0d8c:0102 C-Media Electronics, Inc. CM106 Like Sound Device
It is picked up automatically, shows up in dmesg, and will start giving output straight away. It plays nicely with ALSA and PulseAudio, and I have got 6 channels working fine (I’m sure I could get all 8 to work, but I don’t have the speakers to test it).
I remember the time, a few years ago, when buying a cheap ‘generic’ device like this would be a terribly bad plan – it probably wouldn’t be good quality, it almost certainly wouldn’t work under OS X or Linux and in the end you’d regret it. That’s changed now – this device at this price is perfect.
Only one minor (very minor) niggle: there is a red LED on the device which flashes constantly when it is plugged in. However, a simple piece of Blu-Tack over the LED has stopped that annoying me.
The title of this post is a quote from a programming course I’m taking at the moment that really made me think about things differently. In this course we’ve been doing fairly low-level programming in C, which is quite different from most of the programming I’ve been doing recently. One of my biggest lessons from the course was:
Get the compiler to catch the bugs before you even run the code – then you don’t have to
In my assignments for this course there have been so many bugs that I’ve caught by setting the compiler to be a bit stricter. In fact, it’s amazing what compilers these days can do. The latest versions of compilers such as gcc will do an awful lot of what tools like lint do – checking things that might be a problem and letting you know about them. So – Rule 1 is to make sure you’re using a fairly up-to-date version of your compiler, as the ability of compilers to find problems in your code has got a lot better recently.
Rule 2 is to set your compiler to be stupidly picky and it will, honestly, make your life easier. For us gcc people that means rather than running:
gcc source.c
you should run:
gcc -Wall -pedantic source.c
It’s only a few more characters typing (no extra characters if you put it in your makefile) and can make your life a lot easier.
Recently I took a while to try and simplify and consolidate my online presence. I thought it was an appropriate time to do this, as I had just bought a domain name (rtwilson.com), where I was hosting my academic website (www.rtwilson.com/academic) and my blog (which is what you’re reading now!). I thought it’d be useful to document the steps I took:
Stage 1: Create a new, clean identity
I will assume here that you have a good, sensible email address that you are likely to keep for a long time. If you don’t, then I’d recommend buying a domain name and getting a sensible email. This will be the email that you use for all web-based communications and accounts.
The tasks below basically involve setting up accounts on major cross-site platforms, and ensuring they have sensible details, images etc. You may want to find an appropriate photo of yourself, or some other photo that you don’t mind representing you on the internet.
1. Create a Gravatar account with that email, and give it a sensible photo. In case you don’t know, Gravatar is a system used by a number of websites, blogs etc. to provide an image to go with your account. It’s all linked through your email, and is very simple to set up – simply go to the Gravatar signup page and follow the instructions.
2. Create an OpenID account. You may already have one of these – check here to see what account you may already have that has an OpenID associated with it. If you don’t have one, either set one up through a dedicated provider like MyOpenID, or link one to your domain.
3. Clean up and configure your Google account. You probably want to update your Google Profile with a sensible image and some sensible information.
Stage 2: Remove old accounts/identities and link them to new ones
This is a little more difficult – but only because you have to remember where you have accounts. Some of these accounts you may wish to close (Bebo, MySpace etc), but some you’ll want to keep and associate with your new email address.
This may involve simply changing the email address of your account and updating details (photo, profile etc) or removing an account and creating a new one. You will find that a number of sites will now let you link your account through OpenID or to your Google Profile, and loads of sites will pick up your avatar from Gravatar. A list of sites you might want to check is below:
- StackExchange sites (StackOverflow, SuperUser etc)
- Blogging accounts
- Hacker News
Stage 3: Update as necessary
Last but not least – keep an eye on your profiles on these sites. Update them when they need updating – I found a site which still said I was at school – and make sure photos (if you are using them) are at least vaguely up-to-date.
(If you liked this, you might enjoy my other how-to’s and my book reviews)
Reference: Marsland, S., 2009, Machine learning: An Algorithmic Perspective, Chapman & Hall/CRC, Boca Raton, Florida, 390pp Amazon Link
Machine Learning can be a difficult topic – as I found out when taking a Masters-level machine learning course this year. It can become very mathematical – particularly when dealing with complicated areas such as Support Vector Machines – and it is very difficult to pitch a university lecture course at a level where all of the students can understand it. Unfortunately, in my course I was one of the students who didn’t really understand the lectures particularly well…but this book saved me! In fact, it was so well written that I was reading it in bed at night, and staying up late to finish the chapter!
I firmly believe that fields such as Machine Learning and other practical computing topics (such as Computer Vision, Statistics, Programming etc) should be taught using practical examples and practical teaching sessions wherever possible. Also – for all but the most complex topics – full algorithms should be provided, and implementations in appropriate programming languages shown. This book definitely fulfils this – showing detailed algorithm explanations for all of the algorithms considered (apart from Support Vector Machines, which the author decided – sensibly in my opinion – were too difficult to cover in full detail), and implementing them in Python. Parts of the Python code are shown in the book, and all of the code is available online.
The explanations and motivations for the techniques shown in the book are brilliant – both easy to read and comprehensive. Mathematical explanations are clear – with ‘big picture’ explanations given in textual form for those who want to skip the detailed maths – and diagrams are well-chosen and easy to understand. Furthermore, sensible examples are given for the uses of machine learning – ranging from standard datasets like iris, to more unusual examples like ozone layer depth prediction.
The book covers most of what you’d need for an introductory Machine Learning course, starting with perceptrons, before moving to multi-layer perceptrons, radial basis functions, support vector machines, decision trees, unsupervised techniques and genetic algorithms. The book also covers introductions to probability (including Bayesian inference), dimensionality reduction (including PCA and LDA) and optimisation techniques. The wide range of techniques covered makes this useful not just for machine learning students: genetic algorithms, dimensionality reduction and optimisation have many applications in other fields.
This is a shorter review than many of my reviews, as I really can’t find much to fault with the book. It’s great. My advice is, if you’re interested in this topic, need to learn it for a course, or think you’ll want to use Machine Learning techniques in your work then buy it – you won’t regret it!
Just a quick post this time, as I’m currently enjoying a nice holiday (well, holiday combined with work) in France. I had to post this because I’ve just realised that one of my biggest gripes with ArcGIS has been fixed in version 10! Hooray!
I suspect a lot of other people have been frustrated by this too: if you want to take an ArcGIS map document and use it on another computer it is (or at least, was) very difficult. The .mxd file only contains references to the actual data for the map, so you have to find where all of the data is stored and take that with you too – and what’s more, the references are often stored as relative path names, so even moving it to a different folder on the same computer is a pain. In fact, I seem to remember being taught in an undergraduate ArcGIS course never to move an ArcGIS .mxd file once I had created it!
Anyway, ArcGIS 10 appears to have a new function called Map Packages. This allows you to package a .mxd file with all of the data that it uses into one (quite big, probably) .mpk file, which is then completely portable. Sounds great!
I haven’t been able to test it yet (I haven’t got ArcGIS on the laptop with me in France), but it sounds like it’ll be very useful.
For more information, see this ArcGIS blog post.
I remember, fairly early on in my programming career, reading Joel Spolsky’s article about interviewing for programmers. At the time I thought I might want to get a job as a programmer (in fact, I’ve now got a job in academia – albeit in a field that involves a fair amount of programming), so I was interested to know what sort of things he thought interviewers should look for. One of the key things is being able to understand pointers, which he suggests is far harder than most of the rest of programming:
Joel Spolsky (from The Guerrilla Guide to Interviewing)
At the time, I’d never properly dealt with pointers, so I made it my goal to understand them, which I did – mainly through reading the great K&R book. In his quote above, Joel refers to the type of ‘doubly-indirected thinking’ that is required to understand pointers – realising that what you have isn’t the object itself, it’s just a reference to the object, and that you can manipulate the pointer without necessarily manipulating the object, and so on.
However, I’ve just discovered the next thing along the line from the doubly-indirected thinking that pointers require: it’s the n-indirected thinking that parallel programming requires. That’s what the title refers to – using the mathematical symbolism of >> being ‘significantly greater than’ (yes there may be a unicode character for this, no I didn’t bother to find it). So – why’s all this parallel programming even harder (conceptually, at least)? Well…with pointers you have one thing (the pointer) which points to another thing (a memory location storing something – an integer, or a double or something). In parallel programming, everything has multiple copies, all of which may (or may not) have different values at any one time.
I’ve been doing some programming using MPI – the Message Passing Interface – recently. The way this works is that each core (that is, individual processing unit – whether it is combined on a piece of silicon with other cores or not) is treated as entirely separate from every other core in terms of memory (this is in distinct difference to other methods like OpenMP). Therefore, if cores want to exchange data they have to send an explicit message to another core to get the data. This sounds very restrictive, but by carefully distributing data to begin with, you can minimise the amount of communication you need to do.
One other thing about MPI, and the most relevant for this post, is that each process runs exactly the same code – so the whole code runs on each processor (unless you do things like ‘if (process_id == 0)’). This means that, at any point in code, one variable (for example, num_rows, holding the number of rows of the array this processor is operating on) can actually hold different values in each processor. For example, in the code I was writing, this was the same for most processes (as I tried to split up the array evenly) but with a few processes having more or less than the others. So, one variable has multiple values – ok, doesn’t sound too complicated…
However, when you start sending values from one place to another you realise that you can get in a terrible mental muddle (I find scribbling lots of diagrams helps!) as you’re sending variables from one process to variables in another process that may have different values for all of the other variables. Of course, when you’re dealing with pointers as well you’ve got the confusion of pointers, then the n-indirectedness of dealing with all of the variables. Fun!
(Oh and add to this the realisation that each process can be doing different things at the same time, but that send and receive calls must still match up….and your brain starts to explode!)
Still, all of this parallel programming is worth it – I’ve got really great speedups for some of my code!