estimations and theory building

Published: 22 Feb 2016

My team lead at Instructure, Dan Dorman, passed this fantastic article from 1985 by Peter Naur around the team a few days ago. It’s entitled Programming as Theory Building, and it discusses the idea that programming is closer to building theories of how the world works than it is to manufacturing. This has really interesting ramifications.

The reason it came up has more to do with the gradual decline that code bases go through as team members come and go. Naur explains that because programming is the creation of a theory of how the world functions, that theory can never be completely transferred from one programmer to another. Documentation helps, but it cannot carry the complete underlying theory from one group of programmers to the next.

All of which is fascinating! But this article is about how all of that plays into the difficulty of estimating projects.

I was talking the other day with a co-worker about how we really don’t know what the final implementation of a feature or API is going to look like until we have spent the time reasoning about the problem space. Usually, once we have reasoned about the problem enough to know how to solve it, the solution is relatively easy to implement. Most of my time working on a program is spent in the figuring-out-how-the-heck-this-thing-is-supposed-to-work phase.

Which is why Naur’s idea of programming as theory building appeals to me. Reasoning about the problem takes time. Building the theory takes time.

That is why estimations are almost always wrong. We can generate snap judgements about the order of magnitude of time that we think a problem’s theory might take to prove. But if that initial theory of the program fails to pan out, we are left needing to develop and test new theories, one after another, until we find one that matches.

This is problematic because we manage our projects as if we were manufacturing parts for an automobile or other physical product, where the theory building has already taken place before production begins. In that sort of production, the theory building is treated as a separate task from the physical creation.

Because of that, deriving estimates in manufacturing is a simple task. You can measure the amount of time that it takes to weld a frame or paint a car. Those measurements give you accurate estimates.

Likewise, in the creation of programs we can measure things like Lines of Code and know exactly how long it will take to type out a program. However, that leaves us with a problem: sometimes fewer lines are better lines, and we don’t even know how many lines of code we will need until we have completed the implementation of the theory. That means we should be phenomenal estimators of things we have already implemented, in the programs where we have already implemented them, and truly terrible estimators of things we have never implemented, in programs where we have never implemented them.

When it comes down to it, we can’t know if our theories work until we prove them, and we won’t know how long it takes to reason out new theories until we have done it.

That is why estimations are hard. We are asking Einstein to sit down and tell us how long it will take him to come up with the General Theory of Relativity before he even knows what it will encompass.* That is a recipe for disaster. But it does make me feel better about my shortcomings when I mis-estimate features and projects.

* It took Einstein roughly a decade after he published Special Relativity in 1905 to nail down the General Theory of Relativity in 1915.