This is a guest posting by Jim Holmes
Understanding the health of the software’s codebase can help you better focus your testing efforts. Static analysis metrics can point testers to the areas of the codebase that deserve extra attention.
Some testers may feel intimidated by code metrics, but understanding how metrics work is vastly different from understanding how to write code that solves hard problems like threading, concurrent data access, scalable web services, etc. Software metrics can be used without deep knowledge of how highly technical pieces of software are written. Moreover, metrics are a vital tool in helping educate testers on what good software should look like.
First off, some clarity around terms as I use them below.
Software metrics are measurements of various aspects of a software codebase. The metrics can be at a high level across the entire codebase or can focus on small blocks of code at the class, method, or even lower level. Such metrics are generally gathered by tools during static or dynamic analysis phases.
Dynamic analysis happens while the system is in operation; static analysis runs without the system being live and focuses on the structure of the code itself, versus the operation of the system. This article focuses on metrics gathered during static analysis.
Static analysis tools examine the code or system and measure specific concerns. Over time the industry has settled on general guidelines for whether a particular measurement falls within generally accepted ranges. Most modern analysis tools provide the measurements along with some reference to commonly accepted values for each metric.
(No, there aren’t any “best practices” around such metrics, as there’s no such thing as a “best practice.” Metric results need to be interpreted in the context of the organization’s overall practices and maturity.)
There are a great many metrics available for measuring software. This article focuses on the five I’ve found most useful over a number of years in the industry. They’re listed below in reverse order, ending with my number one.
What It Is: Overuse of comments in code.
Comments in code can be a controversial topic. Old-school development practices encouraged a huge amount of comments, and some in academia mandated a comment for every single statement. Every. Single. Statement.
This led to practices such as:
int index = 0; //set index to 0
which does nothing other than litter code with meaningless information that is easily missed when someone updates the code, e.g.:
int index = 1; //set index to 0
Such disconnects cause huge difficulties when someone goes back to read the code as part of a bug fix or feature enhancement. Is the code right? Was index supposed to be set to a value of “1”? Or was the comment right and index should have been set to zero?
Modern software practices prefer to focus on good naming practices in the code itself. Comments should be reserved for explaining a section with very difficult logic or domain-specific rules. Blocks of confusing, critical behavior should be extracted into well-named methods that make the intent clear. Short, concise, applicable comments should be limited to explaining the *why*, not the how.
For example, here’s a completely contrived, non-functional example.
public boolean IsShippingTypeValid(Product product,
//Call lookup system to check shipping. Reject if
// too heavy or bulky for specific destination.
//See IShippingLookup for more details on rules.
How to Use It: Look at the metric reports for classes and methods with a large number of comments. Open up those classes in your favorite editor and look for disconnects between what the code does and what the comments say. If you find confusion or disconnects, go talk with a developer to see if you can find clarity. Check whether any automated tests make the intended behavior clearer. Build up and execute test scenarios/exploratory charters to exercise that specific area of functionality.
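If your tooling doesn't report comment counts directly, a rough figure is easy to approximate. The sketch below is a minimal illustration only: `CommentDensitySketch` and `commentRatio` are hypothetical names, and it counts only whole-line `//` comments, whereas real analysis tools also handle block comments and trailing comments.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: a rough comment-density estimate for a block of source lines.
class CommentDensitySketch {

    // Returns the share of non-blank lines that are whole-line "//" comments.
    static double commentRatio(List<String> sourceLines) {
        int comments = 0;
        int nonBlank = 0;
        for (String line : sourceLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // blank lines count toward neither total
            }
            nonBlank++;
            if (trimmed.startsWith("//")) {
                comments++;
            }
        }
        return nonBlank == 0 ? 0.0 : (double) comments / nonBlank;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "//set index to 0",
            "int index = 1;",
            "",
            "//increment index",
            "index++;");
        // Two of the four non-blank lines are comments.
        System.out.println(commentRatio(lines));
    }
}
```

A high ratio is only a prompt to go read the code. It says nothing about whether the comments are accurate.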
What it Is: Overly long classes, modules, methods, blocks.
Huge classes, blocks, or methods can indicate an area of code that has far too many responsibilities and behaviors. Long sections of code are generally very confusing, difficult to test, and very hard to maintain.
How to Use It: Look for long classes and methods. Are those blocks confusing, and do they mix responsibilities such as creating data connections while building threads and updating the UI? See what you can discern from the code, and pair up with a developer to discuss risky areas of those large blocks. Again, build up and execute test sessions based on what you discover.
What it Is: Coupling is a two-way measurement. Coupling is how many components use the one you’re looking at, and is also how many other components the one you’re looking at relies upon. It’s a measure of incoming and outgoing dependencies. (“Afferent” is incoming coupling and “Efferent” is outgoing.)
Components with lots of coupling are at risk in two different ways. If lots of other components depend on the one you’re examining, then any change to the current component risks breaking those others.
Conversely, if the current component relies on lots of external components, then the risk of the current component breaking explodes since so many other components can impact the current one.
How to Use It: Look for high coupling in either direction. To test outgoing dependencies, create breaking changes inside the external components and see how the current component reacts. To test incoming dependencies, do the opposite: change the current component and see how the components that depend on it react. Examine how well those areas are protected with good automated unit and integration testing. Shore that up as needed. Explore API testing of the various components to better understand quality issues.
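To make the two directions concrete, here is a minimal sketch that derives both counts from a map of "component relies on these components". The class name, method names, and the sample components (Orders, Billing, Shipping, Inventory) are all invented for illustration. Real tools build this map by analyzing the code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: counting incoming and outgoing dependencies.
class CouplingSketch {

    // Efferent (outgoing): how many components the given one relies upon.
    static int efferent(Map<String, List<String>> reliesOn, String component) {
        return reliesOn.getOrDefault(component, Collections.<String>emptyList()).size();
    }

    // Afferent (incoming): how many other components rely on the given one.
    static int afferent(Map<String, List<String>> reliesOn, String component) {
        int count = 0;
        for (Map.Entry<String, List<String>> entry : reliesOn.entrySet()) {
            boolean isOther = !entry.getKey().equals(component);
            if (isOther && entry.getValue().contains(component)) {
                count++;
            }
        }
        return count;
    }

    // Made-up dependency data for the example.
    static Map<String, List<String>> sampleDependencies() {
        Map<String, List<String>> reliesOn = new HashMap<>();
        reliesOn.put("Orders", Arrays.asList("Billing", "Shipping", "Inventory"));
        reliesOn.put("Billing", Arrays.asList("Inventory"));
        reliesOn.put("Shipping", Arrays.asList("Inventory"));
        reliesOn.put("Inventory", Collections.<String>emptyList());
        return reliesOn;
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = sampleDependencies();
        System.out.println("Inventory afferent: " + afferent(deps, "Inventory"));
        System.out.println("Orders efferent: " + efferent(deps, "Orders"));
    }
}
```

In this made-up data, Inventory has three incoming dependencies, so a change to it can break three other components. Orders has three outgoing dependencies, so three other components can break it.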
What it Is: Churn is the number of edits to a source code file. It’s usually measured by the number of commits to source control for a specific file.
Files with high churn mark specific areas of the system that undergo a lot of updates. While there are exceptions, frequent updates to a file normally indicate quality problems within that file. These quality issues could be straightforward: a difficult, brittle section of the system that requires lots of bug fixes. Quality problems could also be more subtle: a badly designed class or component that is very hard to change when trying to extend or modify other areas of the system.
How to Use It: Look for areas with high churn. Spelunk into these classes, modules, packages, whatever. As with high coupling, ensure automated regression tests are solid. Build them out on a risk/value approach where needed. Work high churn areas hard via fuzz testing and other similar high-impact approaches.
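As a rough illustration of measuring churn by commits, the sketch below tallies how often each file path appears in a list of lines. It assumes the lines came from something like `git log --name-only --pretty=format:`, which prints the files touched by each commit with blank lines between commits. The class name, method name, and sample paths are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: commits-per-file from a list of changed paths.
class ChurnSketch {

    // Each non-blank line is one file touched by one commit,
    // so the number of occurrences of a path is that file's churn.
    static Map<String, Integer> commitsPerFile(List<String> logLines) {
        Map<String, Integer> churn = new TreeMap<>();
        for (String line : logLines) {
            String path = line.trim();
            if (path.isEmpty()) {
                continue; // blank lines only separate commits
            }
            churn.merge(path, 1, Integer::sum);
        }
        return churn;
    }

    public static void main(String[] args) {
        // Made-up log output covering three commits.
        List<String> lines = Arrays.asList(
            "src/Orders.java",
            "src/Billing.java",
            "",
            "src/Orders.java",
            "",
            "src/Orders.java",
            "README.md");
        System.out.println(commitsPerFile(lines));
    }
}
```

Sorting the resulting map by count gives a quick list of the files to spelunk into first.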
What it Is: Cyclomatic complexity is a count of the number of linearly independent paths through a block of code. The more paths, the harder the code is to test and maintain. Nested IF/ELSE statements, multiple SWITCH/CASE blocks, and other similar constructs all quickly drive cyclomatic complexity into dangerous realms.
Cyclomatic complexity is my number one, go-to metric when trying to get a quick feel for the quality and health of a codebase. High, or even outrageous, complexity numbers indicate many things about the team’s maturity level, design thinking, and plain software engineering skills. None of those indicators are good, by the way.
High complexity crushes a team’s ability to clearly express a block’s intended behavior, and it makes re-reading that block a mind-numbing, error-prone death march. Moreover, studies have shown a positive correlation between high complexity and high defect rates.
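To get a feel for how the number grows, here is a deliberately crude estimator: one plus the number of decision points found in a snippet. This is a sketch under loud assumptions, not how real tools work. Analysis tools parse the code properly, while this version just pattern-matches keywords and operators, so it will miscount anything inside strings or comments. Tools also differ on whether `&&` and `||` count as decision points; this sketch counts them. `ComplexitySketch` and `estimate` are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: a keyword-counting approximation of cyclomatic complexity.
class ComplexitySketch {

    // Branching keywords plus the short-circuit and ternary operators.
    private static final Pattern DECISION_POINT =
        Pattern.compile("\\b(if|for|while|case|catch)\\b|&&|\\|\\||\\?");

    // Estimate = 1 (the single straight-through path) + one per decision point.
    static int estimate(String source) {
        Matcher matcher = DECISION_POINT.matcher(source);
        int decisions = 0;
        while (matcher.find()) {
            decisions++;
        }
        return decisions + 1;
    }

    public static void main(String[] args) {
        String snippet =
            "if (weight > 50) { return false; }\n"
            + "for (Rule r : rules) {\n"
            + "  if (r.blocks(dest) || r.expired()) { return false; }\n"
            + "}\n"
            + "return true;\n";
        // Decision points: if, for, if, || -> estimate of 5.
        System.out.println(estimate(snippet));
    }
}
```

Even this small contrived snippet has five paths to cover, which is why deeply nested methods quickly become impractical to test exhaustively.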
How to Use It: Start top down with the most egregious offenders: blocks with complexity metrics well above normal. Spend some time overlaying churn metrics on the worst complexity sections. See if you can determine bug history (open and closed) for those same blocks. With that data in hand, focus your testing efforts on the riskiest areas of that metaphorical Venn diagram.
Static code analysis metrics can provide testers a wealth of information on the codebase’s overall health. The metrics can help identify solid areas which don’t need much attention, and they certainly can help testers determine what areas to spend their valuable time on.
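One simple way to overlay the two metrics is to rank blocks by a combined score. The sketch below multiplies complexity by churn, which is an arbitrary choice made purely for illustration. The class names, block names, and numbers are all invented, and a real overlay would also bring in the bug history.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: ranking blocks by complexity and churn together.
class RiskOverlaySketch {

    static final class BlockMetrics {
        final String name;
        final int complexity;
        final int churn;

        BlockMetrics(String name, int complexity, int churn) {
            this.name = name;
            this.complexity = complexity;
            this.churn = churn;
        }

        // Assumed scoring rule: the product favors blocks that are high on both.
        int score() {
            return complexity * churn;
        }
    }

    // Returns block names ordered from highest to lowest combined score.
    static List<String> riskiestFirst(List<BlockMetrics> blocks) {
        List<BlockMetrics> sorted = new ArrayList<>(blocks);
        sorted.sort((a, b) -> Integer.compare(b.score(), a.score()));
        List<String> names = new ArrayList<>();
        for (BlockMetrics block : sorted) {
            names.add(block.name);
        }
        return names;
    }

    // Made-up metrics for the example.
    static List<BlockMetrics> sample() {
        return Arrays.asList(
            new BlockMetrics("OrderValidator.validate", 35, 40),
            new BlockMetrics("ReportFormatter.format", 42, 2),
            new BlockMetrics("StringUtils.pad", 3, 25),
            new BlockMetrics("ShippingRules.isAllowed", 18, 30));
    }

    public static void main(String[] args) {
        System.out.println(riskiestFirst(sample()));
    }
}
```

Here the most complex block (ReportFormatter.format) is not the top priority because it almost never changes. The block that is both complex and frequently edited rises to the top.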
Do you use static code analysis to inform your testing? Use the comments and let us know what you’re finding helpful in delivering better value!