Cleaning a Complex Java Code Base
Standard checks and unit tests for every line of code might be impractical, but here's a strategy for expediency in delivery of large, complex code bases
by Matt Love
May 8, 2006
Ideally, coding standards checks and unit testing would be performed on every piece of code before it is added to a team's code base. However, doing so is not always practical. Many organizations do not provide developers the time and resources required for testing at this level. Moreover, most organizations do not develop applications from scratch by writing new code for all required functionality. Rather, they typically make incremental enhancements to a large amount of functioning legacy code or add their own code to extend third-party or open source packages. The resulting code bases could include legacy code written within the organization, code obtained through a merger or acquisition, code obtained from an outsourcer, or code that was developed by the open source community and downloaded from the Internet.
Consequently, most teams accumulate large and complex code bases with at least some code that has not been subject to coding standard analysis and unit testing. This accumulation involves several critical risks. When the application is used in a way that development and QA didn't anticipate (and didn't test), the code might throw unexpected run-time exceptions that cause the application to become unstable, produce unexpected results, or even crash. The code also might open the only door that an attacker needs to manipulate the system and/or access privileged information. Small coding mistakes could lead to significant performance or functionality problems. The code's functionality might be broken as the application evolves over the course of its life cycle.
If your team already has a large and complex code base (hundreds of thousands, or even millions, of lines), it's not too late to benefit from coding standard analysis and unit testing. As long as these practices are automated and applied properly, they can still be used to identify functionality, reliability, security, and performance problems before release and deployment—as well as to satisfy any contractual obligations for performing unit testing or complying with a designated set of standards.
Let's look at a simple two-step strategy that has been proven to deliver fast and significant improvements to large and complex Java code bases. The first step is using coding standard analysis to identify bugs and bug-prone code. The second is using unit-level regression testing to ensure that the functionality is intact and using unit-level reliability testing to ensure that all code base changes are reliable and secure. Both steps can be automated to promote a consistent implementation and allow your team to reap the potential benefits without disrupting your development efforts or adding overhead to your already hectic schedule.
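To make the second step concrete, here is a minimal sketch of a unit-level regression check written in plain Java. It assumes a hypothetical legacy method, formatAccountId, whose current behavior has been recorded as a baseline of input/output pairs; the class name, method names, and sample data are all illustrative and do not come from any particular tool. In practice a team would more likely use a unit testing framework and generate the baseline automatically.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Illustrative sketch only: all names and sample data here are hypothetical.
class RegressionSketch {

    // Stand-in for a legacy method whose current behavior should be preserved.
    static String formatAccountId(String raw) {
        return raw.trim().toUpperCase(Locale.ROOT);
    }

    // Baseline: outputs recorded from the code as it behaves today.
    static Map<String, String> baseline() {
        Map<String, String> expected = new LinkedHashMap<>();
        expected.put(" ab12 ", "AB12");
        expected.put("x", "X");
        expected.put("", "");
        return expected;
    }

    // Counts the inputs whose current output no longer matches the baseline.
    static int countRegressions() {
        int failures = 0;
        for (Map.Entry<String, String> entry : baseline().entrySet()) {
            String actual = formatAccountId(entry.getKey());
            if (!actual.equals(entry.getValue())) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("regressions: " + countRegressions());
    }
}
```

Rerunning a check like this after each change to the code base shows whether previously recorded behavior has shifted, without requiring anyone to write a specification for the legacy code first.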
Bugs and Bug-Prone Code
Why is it important to identify bugs and bug-prone code? Complying with coding standard rules is a proven way to achieve key benefits that we can put into four groups: 1) detect bugs or potential bugs that impact reliability, security, and performance; 2) enforce organizational design guidelines and specifications (application-specific, use-specific, or platform-specific) and error-prevention guidelines abstracted from known specific bugs; 3) improve code maintainability by improving class design and code organization; and 4) enhance code readability by applying common formatting, naming, and other stylistic conventions. Rules that provide the first benefit will be referred to as group 1 rules; rules that provide the second benefit will be referred to as group 2 rules, and so on.
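As an illustration of what a group 1 rule might catch, consider a rule that flags string comparison with the == operator. The rule choice, the class name, and the method names below are assumptions made for this sketch and are not taken from any specific rule set. The flagged code compiles and often appears to work, which is why such bugs tend to survive in code that has never been analyzed.

```java
// Illustrative sketch only: the class and method names are hypothetical.
class StringCompareRule {

    // Bug-prone: == compares object references, not string contents.
    static boolean isAdminBuggy(String role) {
        return role == "admin";
    }

    // Compliant: equals compares contents, and calling it on the literal
    // avoids a NullPointerException when role is null.
    static boolean isAdminFixed(String role) {
        return "admin".equals(role);
    }

    public static void main(String[] args) {
        // A string built at run time is a different object from the literal.
        String role = new String("admin");
        System.out.println(isAdminBuggy(role)); // prints false
        System.out.println(isAdminFixed(role)); // prints true
    }
}
```

The buggy version can return true when both operands happen to be the same interned literal, so it may pass casual testing and then fail once the value arrives from user input, a file, or a database.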