What’s your signal-to-noise ratio in your code?

You write code to deliver business value, hence your code deals with a business domain like e-trading in finance, or the navigation for an online shoe store. If you look at a random piece of your code, how much of what you see tells you about the domain concepts? How much of it is nothing but technical distraction, or « noise »?

Like the snow on tv

I remember TV used to be not very reliable long ago, and you’d see a lot of « snow » on top of the interesting movie. Like in the picture below, this snow is actually a noise that interferes with the interesting signal.

TV signal hidden behind snow-like noise

The amount of noise compared to the signal can be measured with the signal-to-noise ratio. Quoting the definition from Wikipedia:

Signal-to-noise_ratio (often abbreviated SNR or S/N) is a measure used in science and engineering that compares the level of a desired signal to the level of background noise. It is defined as the ratio of signal power to the noise power. A ratio higher than 1:1 indicates more signal than noise.

We can apply this concept of signal-to-noise ratio to the code, and we must try to maximize it, just like in electrical engineering.

Every identifier matters

Look at each identifier in your code: package names, classes and interfaces names, method names, field names, parameters names, even local variables names. Which of them are meaningful in the domain, and which of them are purely technicalities?

Some examples of class names and interface names from a recent project (a bit changed to protect the innocents) illustrate that. Identifiers like « CashFlow »or « CashFlowSequence » belong to the Ubiquitous Language of the domain, hence they are the signal in the code.

Examples of classnames as signals, or as noise

On the other hand, identifiers like « CashFlowBuilder » do not belong to the ubiquitous language and therefore are noise in the code. Just counting the number of « signal » identifiers over the number of « noise » identifiers can give you an estimate of your signal-to-noise ratio. To be honest I’ve never really counted to that level so far.

However for years I’ve been trying to maximize the signal-to-noise ratio in the code, and I can demonstrate that it is totally possible to write code with very high proportion of signal (domain words) and very little noise (technical necessities). As usual it is just a matter of personal discipline.

Logging to a logging framework, catching exceptions, a lookup from JNDI and even @Inject annotations are noise in my opinion. Sometimes you have to live with this noise, but everytime I can live without I definitely chose to.

For the domain model in particular

All these discussion mostly focuses on the domain model, where you’re supposed to manage everything related to your domain. This is where the idea of a signal-to-noise ratio makes most sense.

A metric?

It’s probably possible to create a metric for the signal-to-noise ratio, by parsing the code and comparing to the ubiquitous language « dictionary » declared in some form. However, and as usual, the primary interest of this idea is to keep it in mind while coding and refactoring, as a direction for action, just like test coverage.

I introduced the idea of signal-to-code ratio in my talk at DDDx 2012, you can watch the video here. Follow me (@cyriux) on Twitter!

Credits:

TV noise picture: Some rights reserved CC par massimob(ian)chi