Always put comments in your code!

I have a paper which I wrote some years ago, which has not been finished, and which should be accompanied by an R package. So far nothing special, but at that time I was only at the beginning of my affair with R, and so I made several mistakes (OK – I also did some things right – I hope). One thing which I did not think about (or care about) was commenting my code. So now I am sitting in front of about 8 R files with strange names and no comments in them. Now:

What can I do with them?

One advantage: I have graphs, generated by R, in my draft paper – so I can trace my scripts back from the names of the graphs, identify the script which created each graph, then the data, and finally (hopefully) get an idea of how my script mess did what it was doing – and hopefully, I will be able to do this before retirement (which is still several years away).

Now – what could I have done better at that time? Well, there are several things:

  1. I could have used org-mode. Org-mode enables one to combine documentation and code in a single file. It is literate programming at its best (more will likely follow later). In addition, it can easily be exported to, among others, PDF and HTML, including code and text.
  2. But I used only ESS. Nevertheless, I could have added more comments in the code.

There is always the # in R!!!

I am not saying that org-mode would necessarily have saved me (even in org-mode you have to write the documentation and code yourself), but it would have pushed me towards documentation, as the body of the text is the documentation, and you put the code in source blocks. At first glance this sounds strange, but one usually starts with ideas about the code, a structure, notes for algorithms, charts, etc., and all of these go into the document. Only then does one start coding. By then, each code block should already have some text which explains what it should be doing – and voilà, here is the basic documentation.

To execute the code blocks, one can either evaluate them in the document and insert the results, or “tangle” the document, which means extracting the source code into files. As it is possible to define which code block is extracted into which file, one can create a complex system of resulting R files. These R files can then be sourced from R, running in ESS / emacs.
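To make this a bit more concrete, here is a minimal sketch of what such an org file could look like. Everything in it is made up for illustration: the headings, the file names (load_data.R, figure1.R, measurements.csv) and the column name value are assumptions, not taken from my actual scripts.

```org
#+TITLE: Analysis for the draft paper (hypothetical example)

* Load the data
The measurements are read from a CSV file. The file name is a placeholder.

#+BEGIN_SRC R :tangle load_data.R
  ## Read the raw measurements (placeholder file name)
  measurements <- read.csv("measurements.csv")
#+END_SRC

* Figure 1: histogram of the measurements
This block creates the histogram used in the draft.

#+BEGIN_SRC R :tangle figure1.R
  ## Expects `measurements` as created by load_data.R
  hist(measurements$value, main = "Distribution of measured values")
#+END_SRC
```

With point inside a block, C-c C-c evaluates it (provided R is enabled in org-babel-load-languages), and C-c C-v t tangles the file, which in this sketch should write load_data.R and figure1.R. Note that even here the text above each block already says what the block is for.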

The next possible step would then be to put your script files into a package, which would ask for even more documentation. Roxygen will help with that – but that is a story for another blog post.
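Just to give a first impression, a Roxygen-style documented function might look roughly like the sketch below. The function summarise_measurements() is a hypothetical helper invented for this example; the #' comments use the tags of the roxygen2 package, and the plain # comment inside the body explains the why of a step.

```r
#' Summarise a numeric vector of measurements
#'
#' Hypothetical helper for illustration only; it is not one of the
#' original scripts.
#'
#' @param x Numeric vector of measurements; NA values are dropped.
#' @return A named numeric vector with the elements mean, sd and n.
#' @export
summarise_measurements <- function(x) {
  # Drop missing values first, so that n counts only the values
  # which actually enter the mean and the standard deviation.
  x <- x[!is.na(x)]
  c(mean = mean(x), sd = sd(x), n = length(x))
}

# Usage: mean = 2, sd = 1, n = 3
summarise_measurements(c(1, 2, 3, NA))
```

Inside a package, running roxygen2::roxygenise() would turn the #' block into a help page, while the # comment stays in the source for whoever has to read the code later (most likely yourself).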

So there are many tools which make documenting your R code easier, but you don’t have to use them.

I want to close with a quote from Donald Knuth, “Literate Programming” (1984), in Literate Programming, CSLI, 1992, p. 99:

I believe that the time is ripe for significantly better documentation of programs, and that we can best achieve this by considering programs to be works of literature. Hence, my title: “Literate Programming.”

Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do.

The practitioner of literate programming can be regarded as an essayist, whose main concern is with exposition and excellence of style. Such an author, with thesaurus in hand, chooses the names of variables carefully and explains what each variable means. He or she strives for a program that is comprehensible because its concepts have been introduced in an order that is best for human understanding, using a mixture of formal and informal methods that reinforce each other.

 Cheers and enjoy life.


6 responses to “Always put comments in your code!”

  1. Guilty as charged. All too often I splurge out the code, get it working, send out the analyses, and move on to the next thing. But then I have to come back to it a month later, and rue my idiotic working practices.

    I’m just getting started with Sweave-ing everything, and I think this really helps, because I start writing code immediately, even if I have to make up the data, and just as with org-mode there’s lots of non-code fore and aft to explain what’s going on.

    I’m trying to get the nerve to start using Emacs at the moment; this post is yet another motivator for me to just take the plunge and go for it.

    • You definitely should take the plunge – emacs with ESS and org-mode is definitely worth it.

      It is a steep learning curve, but I would say definitely not as steep as when starting with R, and, as with R, it is absolutely worth it, even if you only use emacs for programming in R.

  2. Hi Rainer

    Very kind of you to share such experience with us… I hope that you’ll post soon about the R and GRASS interface…


  3. Of course commenting is useful, but an even more desirable goal is to write code that doesn’t need much commenting. Some things that will help you for this are using a style guide, giving explanatory variable names, breaking things down into short functions and being consistent about how you organise and name your files.

    • This is definitely true – but even if you follow a style guide, use explanatory variable names, use short functions and so on, it can still be quite difficult to understand the code at a later stage – you probably know how your code is doing something, but there is still the question:
      Why the heck did I write that code in such a complicated way, and why is it calling this function?
      So you are right: using coding guidelines helps to understand how your code is working, but not why – I think that could be another blog post.
