Code Entropy

While preparing my talk at Smidig 2008 I kept thinking about the second law of thermodynamics.

Take two snippets of code, A and B, that have exactly the same external behaviour. If an expert programmer is more likely to change A into B than B into A, then snippet A has higher code entropy than snippet B.

Let us consider a very simple example:
{ int a=3; while (a<9) { ... ; a++; } } // snippet A
and
{ for (int a=3; a<9; a++) { ... } } // snippet B
The external behaviour of these two code snippets is exactly the same. However, most programmers would agree that snippet A is better rewritten as snippet B. So, in this example, during a refactoring session, it is likely that someone will change A into B, but unlikely that someone will change B into A. Hence snippet A has higher code entropy than snippet B.
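As a quick check of the equivalence claim, here is a minimal Java sketch. It fills the elided body with a hypothetical accumulation; the class name, the method names and the summing body are assumptions made for illustration, not part of the original snippets. With a body like this, both forms should visit the same values of a and produce the same result.

```java
// Hypothetical demo: the "..." body is replaced by a simple sum (an assumption).
class LoopForms {
    // Snippet A shape: counter declared before the loop, incremented at the end of the body.
    static int snippetA() {
        int sum = 0;
        { int a = 3; while (a < 9) { sum += a; a++; } }
        return sum;
    }

    // Snippet B shape: counter declared, tested and incremented in the for header.
    static int snippetB() {
        int sum = 0;
        { for (int a = 3; a < 9; a++) { sum += a; } }
        return sum;
    }

    public static void main(String[] args) {
        // Both forms add up 3 + 4 + 5 + 6 + 7 + 8.
        System.out.println(snippetA() == snippetB());
    }
}
```

The only structural difference left is where a is declared and where a++ sits, which is what makes B the form most programmers would reach for here.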

Now, extend this idea to larger functions, classes, modules, applications, software design and architecture. Can entropy be used to describe the state of a codebase?

This entry was posted on Friday, November 14th, 2008 at 12:53 am and is filed under Uncategorized.

4 Responses to Code Entropy

Johannes Brodwall says:

November 14, 2008 at 8:47 am

Neat idea. Code entropy seems to be very much like what others call “technical debt.” It would be interesting to see if the concepts could be combined/contrasted.

Kjetil V. says:

November 14, 2008 at 9:16 am

Entropy, as in ‘likelihood of being changed’. It seems to me that the concept is related to that of idiomatic code. The for loop in B has become an idiom because it is superior: the counter variable is more restricted in scope and cannot be confusingly re-used. And idioms are, by definition, exactly that: what most programmers would do.

Kevlin Henney says:

November 27, 2008 at 4:16 pm

In terms of a composite measure of ‘mess’ you might want to check out the idea of code toxicity described by Erik Doernenburg from ThoughtWorks:

http://erik.doernenburg.com/2008/11/how-toxic-is-your-code/

Balog Pal says:

April 25, 2009 at 10:28 pm

The snippets are really NOT identical.
If you have ‘continue’ anywhere in the … part they will behave differently.
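A minimal Java sketch of that difference, assuming a hypothetical body that does `continue` when a is 5. The class name, the method names and the step cap are illustrative assumptions; the cap exists only so the while form terminates in the demo.

```java
// Hypothetical demo of how 'continue' separates the two loop forms.
class ContinueDemo {
    static final int STEP_CAP = 100; // safety cap, added only for this demo

    // Snippet A shape: the increment sits at the end of the body.
    static int whileForm() {
        int steps = 0;
        int a = 3;
        while (a < 9) {
            if (++steps >= STEP_CAP) break; // without this guard the loop never ends
            if (a == 5) continue;           // jumps over a++ below, so a stays 5
            a++;
        }
        return steps;
    }

    // Snippet B shape: the increment sits in the for header.
    static int forForm() {
        int steps = 0;
        for (int a = 3; a < 9; a++) {
            steps++;
            if (a == 5) continue;           // a++ in the header still runs
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println("for:   " + forForm());
        System.out.println("while: " + whileForm());
    }
}
```

In the for form, `continue` transfers control to the update expression, so the loop finishes after six iterations. In the while form it jumps straight back to the condition, a is never incremented past 5, and only the demo's cap stops the loop.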

A “rewrite” should take that into account, and the “natural” formulation also depends on what is actually done, what the role of a is, how it is changed, etc.

The for() form fits when the only modification of a is the a++ and a acts as an iteration counter. However, a could also be some phase/state marker that keeps changing inside the body; then while() is likely better.
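To make the phase/state-marker case concrete, here is a small Java sketch. The transition rule (odd phases jump ahead by three, even phases step by one) is entirely made up for illustration, as are the class and method names.

```java
// Hypothetical demo: 'a' is a phase marker whose next value is decided by the body.
class PhaseDemo {
    static String run() {
        StringBuilder trace = new StringBuilder();
        int a = 3;
        while (a < 9) {
            trace.append(a).append(' ');
            // Made-up transition rule: odd phases jump ahead, even phases step by one.
            a = (a % 2 == 1) ? a + 3 : a + 1;
        }
        return trace.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Here a for header with a++ would misrepresent what the loop does, since there is no fixed step to put in the update slot; the while form keeps the state change next to the logic that decides it.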

As we write code for a particular purpose, not in general, overly stripped-down snippets are not really useful. ;-)

Geektalk
