Code highlighting is wonderful, but we're having too much of it

There's been some debate about syntax highlighting, about how and even whether it's useful at all. This article talks about how we're doing it wrong, and this much older article doubts its usefulness in general. I think the arguments against syntax highlighting are thought-provoking, and so I've since been forming an opinion of my own.

I admit that before I read any of the articles, I did not really question syntax highlighting. It's just been there for a long time, it certainly looks neat, and since it became some kind of bare-minimum standard feature, I even started to associate its absence with inferior or misconfigured editors (in Emacs, the absence of syntax highlighting in a buffer is a good sign that you may be missing a much-desired major mode for editing that kind of file).

By now though, I do agree that a lot of the common over-abundant syntax highlighting is questionable at best. As Linus Akesson's article demonstrates, reading a sentence where almost every word is colored according to some syntactical class neither makes reading easier, nor does it really add any valuable information. With respect to actual code, I then find myself thinking that "yes, I do know that if is a keyword of the language I'm using right now, thank you very much".

But I do also think that there's a lot of (not necessarily syntax) highlighting that really does add some tremendous value.

Let's take a look at some examples:

Making stuff that I am looking for pop out


When I am looking for pairs of matching things (parens, braces or even start/end tags in XML) it is certainly useful when the editor lights the corresponding element up without any effort on my part. And when I search for expressions, it's most helpful if the editor highlights all found occurrences while the search is active, so that I may visually scan through the file.

One of the most valuable features of modern editors, however, is the ability to highlight all occurrences of the same syntactical element, like a specific variable or function, just by placing the cursor on it. IntelliJ is one editor which does this, and it is often a huge help when reading or modifying code, even more so if it's someone else's code with which I am not fully familiar yet (which I tend to dig into a lot). While there is also the exhaustive "Find Usages" feature which gives you a clickable list of global usages, when working locally a clear visual standout is more useful than any list in some other panel.


This screenshot, however, shows well how IntelliJ's default of subtly underlining the selected variable is drowned out in a sea of overabundant formatting attributes. Granted, IntelliJ's default is also a little bit too subtle for my tastes, but looking for those underlined "i"s in that lit-up mess hurts me more than it helps. And since this is not my code (it's part of jansi, by the way), I would appreciate the help.

Let's turn the symbol usage highlighting up just a little bit and all the other stuff way down:


Much better. I can easily see where "i" gets used. The colorful stuff is now the stuff that matters.


Notice that in this screenshot, the variable "buff" is also still highlighted with a yellowy background. That's because it belongs to another class of stuff that I care about, which is:

Anomalies


What IntelliJ is trying to tell me is that the code could use a StringBuilder instead of a StringBuffer at that point for efficiency. This, like any other warning, error or anything with the string "TODO" in it, is information that I am always interested in. In general, anything that I consider an anomaly should stand out from the rest, because it tells me: "hey, look at this, there's something wrong or unusual here, you should probably do something about it".

Syntax Subtleties


Syntax highlighting itself becomes useful again when the syntax is subtle enough that I welcome further help about it. Take this example of string interpolation in Scala:


Here, highlighting helps remind me that .name in $person.name will be interpreted as part of the string (something I most likely did not intend), that I escaped the $ in 40$ correctly ("aha, so it's not \$"), and where the expression in ${...} really ends.

This is different from, say, plain old Java code where I really do know what word is a keyword.
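For reference, here is a minimal sketch of the kind of interpolated string discussed above. It is not the code from the screenshot: the Person case class, the value "Ann" and the wording of the strings are hypothetical and only serve to illustrate the three subtleties.

```scala
// Hypothetical type and values, chosen only for illustration.
final case class Person(name: String)

object InterpolationDemo {
  val person: Person = Person("Ann")

  // Only `person` is interpolated here; `.name` stays literal text.
  // `$$` is how the s-interpolator writes a literal dollar sign.
  val accidental: String = s"$person.name owes 40$$"

  // Braces mark exactly where the interpolated expression ends.
  val intended: String = s"${person.name} owes 40$$"

  def main(args: Array[String]): Unit = {
    println(accidental) // Person(Ann).name owes 40$
    println(intended)   // Ann owes 40$
  }
}
```

Without any coloring, the two string literals look almost identical, which is exactly why I welcome the editor's help here.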

Less is More


In theory, a lot of common syntax highlighting aims to convey useful information that is not always immediately available from the directly surrounding code. IntelliJ for example colors local variables, fields, constants etc. differently. In practice, though, I don't need that information constantly shoved into my face. I often already know what is what, because I am actively working on the code in question. But even if I don't, I'm entirely content to do some small additional action (like command-clicking on the symbol) to find out.

Only for some of the more interesting cases like constants or static variables do I keep some unobtrusive formatting (like italics) enabled. With a constant visual overload, on the other hand, I tend to ignore any formatting attributes altogether.


The conclusion is simple: I would not want to live without code highlighting in certain situations. But in order for it to become actually useful, I really need less highlighting than what a lot of the defaults offer.
