Reading beside the lines: Using indentation to rank revisions by complexity

I'm always on the lookout for good papers that offer an empirical approach to programming languages. Thanks to Oscar Nierstrasz for suggesting this one, which shows that indentation is as effective as more sophisticated measures at classifying program complexity.
Abram Hindle, Michael W. Godfrey, Richard C. Holt, Reading beside the lines: Using indentation to rank revisions by complexity, Science of Computer Programming, Volume 74, Issue 7, 1 May 2009, Pages 414–429.

Maintainers often face the daunting task of wading through a collection of both new and old revisions, trying to ferret out those that warrant detailed inspection. Perhaps the most obvious way to rank revisions is by lines of code (LOC); this technique has the advantage of being both simple and fast. However, most revisions are quite small, and so we would like a way of distinguishing between simple and complex changes of equal size. Classical complexity metrics, such as Halstead’s and McCabe’s, could be used but they are hard to apply to code fragments of different programming languages. We propose a language-independent approach to ranking revisions based on the indentation of their code fragments. We use the statistical moments of indentation as a lightweight and revision/diff friendly metric to proxy classical complexity metrics. We found that ranking revisions by the variance and summation of indentation was very similar to ranking revisions by traditional complexity measures since these measures correlate with both Halstead and McCabe complexity; this was evaluated against the CVS histories of 278 active and popular SourceForge projects. Thus, we conclude that measuring indentation alone can serve as a cheap and accurate proxy for computing the code complexity of revisions.
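To make the idea in the abstract concrete, here is a minimal Python sketch of how the sum and variance of indentation might be computed for the lines of a revision. It is not the authors' implementation. The function names, the choice to skip blank lines, and the 8-column tab width are all assumptions made for illustration.

```python
from statistics import mean, pvariance

TAB_WIDTH = 8  # assumption: one tab counts as 8 columns


def indentation(line: str, tab_width: int = TAB_WIDTH) -> int:
    """Return the number of leading whitespace columns on one line."""
    expanded = line.expandtabs(tab_width)
    return len(expanded) - len(expanded.lstrip(" "))


def indentation_moments(lines):
    """Summarise the indentation of a code fragment (hypothetical helper).

    Blank lines are skipped (an assumption), since they carry no
    indentation signal.
    """
    depths = [indentation(line) for line in lines if line.strip()]
    if not depths:
        return {"sum": 0, "mean": 0.0, "variance": 0.0}
    return {
        "sum": sum(depths),
        "mean": mean(depths),
        "variance": pvariance(depths),  # population variance
    }


# Two fragments of similar size: one flat, one deeply nested.
flat = ["a = 1", "b = 2", "c = 3", "d = 4"]
nested = ["if x:", "    for y in z:", "        if y:", "            w()"]

# Rank the fragments by variance of indentation, highest first.
ranked = sorted(
    [("flat", flat), ("nested", nested)],
    key=lambda item: indentation_moments(item[1])["variance"],
    reverse=True,
)
```

Both fragments are four lines long, so LOC cannot tell them apart, but the nested one ranks first on both the sum and the variance of its indentation. Nothing here parses the code, which is why the same measure applies to fragments in any language.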


Computing at School

Hurrah! The government appears to finally understand that there is a difference between teaching IT and teaching computing, and that we need to move from the former to the latter in schools. Join the campaign at Computing at School. The Guardian is running a digital literacy campaign. Computing is the new Latin!
