Code Archeology

I work with a codebase that has revisions going back over five years. Like dog years, the age of code adds up quickly: a five-year-old source file is more like fifteen in code years. In that amount of time, boatloads of engineers have come and gone, checked in fixes and bugs alike, left comments, and removed methods. Code has been written, rewritten, refactored, and removed. Third-party vendors have emerged, merged, and disappeared.

The codebase is large, with thousands of class source files, millions of lines of code, and hundreds of third-party resources. No one person knows the system inside out, and its behavior is a mystery to some. To grasp it in your debugger or in your mind, you need the collective knowledge of the whole team. In such an environment, tests are that much more important. Tests become part of the collective knowledge, an opinionated specification, and essentially another team member of sorts.

Debugging such legacy source code becomes a kind of code archeology, but instead of finding ceremonial tombs encrusted with jewels, you find unceremonious hacks littered with bugs.

After reading thousands of source files, as in The Matrix, you begin to recognize patterns, fads, and trends: from XML configuration files to code generation, to dependency injection, to annotations, to convention over configuration, to new languages, and so on.

Along the way, you realize that if code is art, then having Picasso on your software development team would not necessarily help keep the defect count low.
