Myself, Coding, Ranting, and Madness

The Consciousness Stream Continues…

An Interlude of White-space

21 Jul 2011 8:00 Tags: None

After finally settling on the title, I was rather tempted to use the post as a space to perform John Cage's "0'00"" [1] in the style of his early work "4'33"". But I doubt such a post would get many readers.

What I am here to talk about is one of the many peeves which I was able to fully rationalise during my group project. I think it is fair to say that doing a large programming project in a group of 14 over 5 weeks is a reliable way to get slightly stressed, and form some pretty hard-set opinions. The particular one for now, as people might have now guessed from the title, is spaces, tabs, and blank lines.

Blank lines are the easiest to talk about. In many ways, coding layouts would be better off with a distinction between line breaks and paragraph breaks - especially if the latter's distance was configurable. I have always found that leaving a blank line in between logical sections of code makes it only slightly more understandable, as I sometimes find it difficult to distinguish between the actual division of code and the logical sections. And, of course, you can approximate the appropriate number of lines when you're working in a typewriter-like environment.

The other side is the mess of tabs and spaces. For those of you who don't know, a tab is not a series of spaces. Nor, strictly, does it have a width measurable as an integer number of characters [2]. What it should be, by default, is a movement of the carriage to the next tab stop; conventionally, the first tab stop was at half an inch, for indenting a paragraph. This key was originally included to make the tabulation of data easier, back in the days of typewriters, and the tab stops were set physically on the devices. And, although the tab character was included in ASCII, its meaning has been lost to computer programmers, in much the same way as carriage return and line feed.
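To make the "next tab stop" idea concrete, here is a minimal Python sketch. It assumes evenly spaced stops every `tab_width` columns, which is how most editors behave but is my simplification; a typewriter's stops were set wherever you liked. The names `next_tab_stop` and `expand` are made up for illustration.

```python
def next_tab_stop(column: int, tab_width: int = 8) -> int:
    """Return the column of the first tab stop strictly after `column`.

    Assumption: stops sit at every multiple of tab_width (0, 8, 16, ...).
    """
    return (column // tab_width + 1) * tab_width


def expand(line: str, tab_width: int = 8) -> str:
    """Render a single line by moving to the next stop on each tab."""
    out = []
    column = 0
    for ch in line:
        if ch == "\t":
            target = next_tab_stop(column, tab_width)
            # The tab covers however many columns remain before the stop,
            # so its rendered width depends on where the cursor already is.
            out.append(" " * (target - column))
            column = target
        else:
            out.append(ch)
            column += 1
    return "".join(out)


# A tab after one character and a tab after three both land on column 4.
print(repr(expand("a\tb", 4)))
print(repr(expand("abc\tb", 4)))
```

The point is that the same tab character covers three columns in the first line and one in the second, which is why "a tab is N spaces" is only ever an approximation.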

This, to me, is slightly surprising - more so for the latter two - as they still make a lot of sense in a computing environment, especially if you look at a terminal. Terminals are still called TTYs for a reason: it is possible to pipe ASCII into a slightly augmented typewriter and have your terminal on a ream of paper (but, seriously, why would you? It can't even refresh a section...), and there the operations of carriage return and line feed both make sense. And, in a graphical environment, tabs still make sense in a number of ways, especially for indenting nested lists. This brings me neatly back round to code which, at best, is a complex network of related and indented lines of text and, at worst, is just a plain nightmare.

When it became tradition to render a tab as 8 spaces, things became bad. 8-width tabs, on an 80x25 screen, quickly result in more space being used for opening white-space than for code - I could give any number of code examples with 5 or 6 indents, but I'm sure you can think up your own [3]. Now, I vary between wanting 2- or 4-width tabs depending on what I'm working on but, more importantly, I want to be able to choose. Thankfully, some IDEs now support this - NetBeans' tab/space handling is actually very flexible - and this means that I can use tab characters, and have the IDE present the text with nice-looking indentation. But I could achieve the same result with spaces.
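For anyone who wants the numbers rather than the hand-waving, a quick Python sketch of the arithmetic follows. It treats every indent level as exactly one full tab width and the screen as 80 columns wide; the function names are just mine.

```python
SCREEN_WIDTH = 80  # columns on the 80x25 screen mentioned above


def indent_columns(depth: int, tab_width: int) -> int:
    """Columns taken up by `depth` levels of indentation."""
    return depth * tab_width


def code_columns(depth: int, tab_width: int, screen_width: int = SCREEN_WIDTH) -> int:
    """Columns left over for actual code on one screen line."""
    return max(screen_width - indent_columns(depth, tab_width), 0)


for tab_width in (2, 4, 8):
    for depth in (5, 6):
        print(
            f"width {tab_width}, depth {depth}: "
            f"{indent_columns(depth, tab_width)} columns of indent, "
            f"{code_columns(depth, tab_width)} left for code"
        )
```

At 8-width, five indents is an even 40/40 split, and six indents leaves 32 columns for code against 48 of white-space. At 2-width, six indents still leaves 68 columns.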

Well, yes, I could. But then other people would be stuck with my choice - they'd have to do some search and replace to change to their preferred width. They'd also be stuck with a lot more content, as four- or eight-width indents made of spaces are four and eight times less space efficient respectively. To put this in context, the project I mentioned earlier appears to have about 12802 tabs in it - replacing each with four spaces would take a shade over 50 KB just to store the indentation. And, to put that in context, that's like randomly adding this fine image to the code [4].

Image of exactly 51208 bytes
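The sum behind that figure is small enough to check in a few lines of Python. This is a sketch that assumes one byte per character (plain ASCII); the tab count is the one quoted above, and the helper names are mine.

```python
TAB_COUNT = 12802  # tabs in the project, as quoted above


def bytes_as_spaces(tab_count: int, width: int) -> int:
    """Bytes occupied by the indentation once each tab becomes `width` spaces."""
    return tab_count * width


def net_growth(tab_count: int, width: int) -> int:
    """Bytes added to the files: each tab already occupied one byte."""
    return tab_count * (width - 1)


for width in (4, 8):
    print(
        f"{width}-width: {bytes_as_spaces(TAB_COUNT, width)} bytes of spaces, "
        f"{net_growth(TAB_COUNT, width)} bytes of net growth"
    )
```

With 4-width spaces the indentation comes to 51208 bytes, which is the size of the image. Strictly, the files only grow by 38406 bytes, since the tabs being replaced took up a byte each.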

Admittedly, there are a few drawbacks. I've tripped myself up a few times with patches in Git, where I've forgotten to add a leading space to mark a line as context [5], but, overall, I think using tabs is still worthwhile.
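To show what that leading space looks like next to a tab, here is a small sketch using Python's standard `difflib` to produce a unified diff. The file names and the three-line function are invented for the example, and this is `difflib`'s output rather than Git's, though the line prefixes follow the same convention.

```python
import difflib

# Two versions of a tab-indented snippet, differing in one line.
old = ["def f():\n", "\tx = 1\n", "\treturn x\n"]
new = ["def f():\n", "\tx = 2\n", "\treturn x\n"]

diff = list(difflib.unified_diff(old, new, fromfile="a/f.py", tofile="b/f.py"))

# Skip the two file headers and the @@ hunk header to get the hunk body.
body = diff[3:]

KINDS = {"+": "added", "-": "removed", " ": "context"}

for line in body:
    # repr() makes the leading space and the tab visible.
    print(f"{KINDS[line[0]]:8} {line!r}")
```

The last line of the hunk is `' \treturn x\n'`: one space for "context", then the tab that belongs to the code. When editing a patch by hand, that space in front of a tab is very easy to leave off.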

  1. For those of you who are only familiar with the more noted "4'33"", the score of "0'00"" is a sheet of paper with the words "In a situation provided with maximum amplification, perform a disciplined action."; some other time, I'll go into my interpretations of this.
  2. Usually.
  3. If in doubt, perform some operation on three-dimensional space :P
  4. Yes, I went and found an image that was exactly 51208 bytes. Copyright (c) 2003,2004, Brian P. Shore from http://www.shorey.net/Auto/Italian/Alfa%20Romeo/Milano/htmltree.html
  5. In a patch, all lines have a +, -, or ' ' prepended to say how the line should be handled - added, removed, or left alone.