Sunday, March 27, 2016

API Best Practices and Breaking Changes

This post is a collection of deceptive breaking changes and best practices to follow when designing an API. It's based on my research from reputable sources on the web; please refer to the references section at the end for the sources I collected information from.

There are three types of compatibility issues when dealing with API changes. A new version of an assembly can break existing client assemblies in the following ways:

  • Binary compatibility break: the client assembly cannot load the new version of the assembly; the client must recompile their code.
  • Source compatibility break: existing client binaries work with the new version of the assembly, but the client code no longer compiles against it.
  • Semantic compatibility break: client binary and source are both compatible with the new version of the assembly, but the runtime behavior of the application changes.

In this post I'll focus only on some fundamental object-oriented best practices and non-obvious (deceptive) breaking changes when developing a public interface in C#. This list of breaking changes is intentionally incomplete. I am not going to cover obvious breaking changes here (such as changing a method signature), only changes that arise from not applying sound design principles from the start, and how changing such code later can break client code in non-obvious ways.


Adding a return type to a public void method

Break type: Binary, Source
API before change (assemblyA)
public static class Foo
{
    public static void Bar(int i){ }
}
API after change
public static class Foo
{
    public static bool Bar(int i){ return true; }
}
Client code (assemblyB)
Foo.Bar(13);                  // binary break (will not load assemblyA)
Action<int> a = Foo.Bar;      // source break (will not recompile)
This kind of change is a violation of the Open/Closed Principle, because we are trying to add new functionality by modifying an existing class/function. The mistaken assumption here is that, since the return type is not part of the method signature in C# and Java, and since the void method was never assigned to anything previously, adding a return type should not break existing code. But it does. The return type is not used by C# to determine which method to call, but C# needs to know the return type to generate correct bytecode in the caller of the method. This is a binary break in Java as well.

NOTE: Even though in C# the return type is not part of the method signature, in the CLR the return type is part of the signature and hence the above change makes them two completely different methods at the CLR level (hence the binary break).


Changing a field to a Property 

Break type: Binary, Source
API before change (assemblyA)
public class Foo {
    public int Bar;
}
API after change
public class Foo {
    public int Bar { get; set; }
}
Client code (assemblyB)
SomeMethod(objFoo.Bar);         // binary break (will not load assemblyA)
AnotherMethod(ref objFoo.Bar);  // source break (will not recompile)
Putting aside the fact that we should not expose public fields in the first place, this causes a binary break because properties in C# are compiled into get/set methods, so the Bar field no longer exists in the new assemblyA. Passing by ref causes a source break because C# properties cannot be passed by ref; only fields can.
The reason it's important to know this is because many developers think that they can simply refactor a public field into a property when needed, without breaking client code, which is not true.

Public fields (and protected fields) violate two of the most important Object-Oriented design principles: encapsulation and low-coupling. MSDN:CA1051

Public fields also violate the Open/Closed Principle. When the member variables of a class are public, every external function that depends on those variables must change whenever they do. So none of those functions can be closed with respect to those variables.


Changing the value of a private const 

Break type: Semantic
API before change (assemblyA)
public class Foo {
    private const string strA = "The meaning of life";
    public void MyMethodA(string a, string b = strA){ }
}
API after change
public class Foo {
    private const string strA = "The meaning of life has changed";
    public void MyMethodA(string a, string b = strA){ }
}
Client code (assemblyB)
public class Bar {
    public void MyMethodB(){
        var objFoo = new Foo();
        objFoo.MyMethodA("hello");  // semantic break (needs recompilation to see the new value)
    }
}
Pretty deceptive, huh? :) We know that a public const in C# is emitted as a literal in client code, so the client needs recompilation to pick up a new value, but who would have thought that a private const can cause trouble as well? After all, client code has no access to the private const. The reason it breaks is that the C# compiler embeds the default values of optional parameters at every call site in the client. So, if a const used as a default parameter value is changed in assemblyA, then assemblyB needs recompilation to update all of its call sites. Otherwise assemblyB keeps using the old value.

The moral of the story is, keep consts for real constants, like the speed of light in a vacuum. Constants that need changing are not constants.


Property setter: changing private to public

Break type: Source
API before change (assemblyA)
public class Foo {
    public int P { get; private set; }
}
public class Bar {
    public int P { get; set; }
}
API after change
public class Foo {
    public int P { get; set; }
}
public class Bar {
    public int P { get; set; }
}
Client code (assemblyB)
static void ClientMethod(Action<Foo> f) {}
static void ClientMethod(Action<Bar> f) {}
static void Baz() {
    ClientMethod(x => {x.P = 42;});  // source break (overload resolution error)
}
At first thought it may seem like changing a property setter from private to public can never be a breaking change, because client code never used the private setter; we are only adding new functionality. However, it can break, as illustrated in the client code example.

Client code has a method Baz which does overload resolution on ClientMethod. When Foo.P has a private setter in assemblyA, overload resolution in the client chooses the second overload and compilation succeeds. But when Foo.P's setter is made public, the client gets an overload resolution error, because both overloads of ClientMethod are now equally valid and neither is better than the other.

This kind of change is a violation of the Open/Closed Principle, because we are trying to modify existing functionality in a class.



Adding a public method

Break type: Source, Semantic
API before change (assemblyA)
public class Foo {
}
API after change
public class Foo {
    public void Baz() { }
}
Client code (assemblyB)
public static class FooExtensions {
    public static void Baz(this Foo foo) { }
}
public class Bar {
    public void Baz() {}
}
public class Program {
    static void ClientMethod(Action<Foo> a){}
    static void ClientMethod(Action<Bar> a){}
    static void Main() {
        new Foo().Baz();            // semantic break
        ClientMethod(x => x.Baz()); // source break (overload resolution error)
    }
}
Semantic break: The C# overload resolution algorithm chooses the “closest” method match, and in this case it will be the instance method, not the extension method. Hence all client code that was using the extension method will silently start using the instance method.
Source break: The problem here is caused by lambda type inference in C# in the presence of overload resolution.

This kind of change is a violation of the Open/Closed Principle, because we are trying to add new functionality in an existing class. Depending on what the method does, it may be a violation of the Single Responsibility Principle as well.


Properties should not return arrays (MSDN:CA1819)

  • Arrays are not easy to make tamper-proof. The caller can replace the contents of an array with anything they want. To keep the internal data of your class safe, you would need to return a copy of the array every time, which is expensive. 
  • Array covariance is not type-safe in C#. Here's a blog by Eric Lippert describing the problems with Array Covariance. It means that every time you put an object into an array, a run-time check has to ensure that the type is correct (see the examples below). 
  • Arrays don't have an extensibility model and are not thread-safe. Better alternatives are to return a ReadOnlyCollection<T>, IEnumerable<T> or IList<T>.

Here's another blog from Eric Lippert on the harmful side effects of arrays: Arrays considered somewhat harmful. Eric Lippert was a principal engineer on the C# compiler team at Microsoft for many years, and is one of the most respected .NET contributors on Stack Overflow.

MSDN Framework Guidelines: DO prefer collections over arrays.

Break type: Semantic
Array covariance is not type-safe
object[] objects = new[] { "foo", "bar", "baz" };
objects[0] = 5;     // runtime exception
Arrays also cause runtime exceptions through IList covariance
List<Dog> dogsList = new List<Dog>();
Dog[] dogsArray = new Dog[5];
IList<Mammal> mammalsA = dogsList;    // compile error (IList<T> is invariant)
IList<Mammal> mammalsB = dogsArray;   // no compile error (array covariance)
mammalsB[0] = new Cat();              // runtime exception
object[] objArray = new string[5];
objArray[0] = 57;                     // runtime exception

It's better to avoid using arrays altogether (except perhaps in high-performance scenarios). They are not type-safe when it comes to covariance, and they have no extensibility model. List<T> is slightly better in this respect because it is not covariant and gives you a compile error instead. Most read-only interfaces are covariant, such as IEnumerable<T>, IQueryable<T>, IReadOnlyCollection<T> and IReadOnlyList<T>; they are easily identifiable in the docs by their <out T> type parameter.
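As a minimal sketch of why the <out T> interfaces are safe (the helper name here is mine): a covariant read-only view can only hand items out, so no wrong-typed write can ever occur.

```csharp
using System.Collections.Generic;
using System.Linq;

static class CovarianceDemo
{
    // Safe covariance: IEnumerable<out T> only produces values,
    // so a sequence of strings can be viewed as a sequence of objects.
    public static List<object> AsObjects(IEnumerable<string> strings)
    {
        IEnumerable<object> objects = strings;   // legal since C# 4
        return objects.ToList();
    }

    // Not allowed: IList<T> also consumes values, so it is invariant.
    // IList<object> list = new List<string>();  // compile error
}
```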


Do not expose generic lists (MSDN:CA1002)

Break type: Binary, Source
API before change (assemblyA)
public interface IFoo {
    List<IBar> GetList();
}
public class Foo: IFoo {
    public List<IBar> GetList() {
        return new List<IBar>();
    }
}
API after change
public interface IFoo {
    Collection<IBar> GetList();
}
public class Foo: IFoo {
    public Collection<IBar> GetList() {
        return new Collection<IBar>();
    }
}
Client code (assemblyB)
List<IBar> lst = new Foo().GetList();  // binary break (the method's return type changed); also a source break
lst.Add(item);

To avoid this type of breaking change, it is advisable to return an interface or a higher-level abstraction from a public interface. List<T> was designed for performance, not for extension, so if we want to extend the behaviour of the returned list, we can't do it with a List<T>. For example, we may want a null check in Add(), or events raised during Insert/Remove. Changing the List to a Collection lets us extend the behaviour of the returned collection by overriding Collection<T>'s Add/Remove/Insert methods. In the overridden methods we can add event hooks, null checks, etc.

Collection<T> has protected virtual methods that we can override to extend its behaviour for Add, Remove, Insert, etc. Other good alternatives are ICollection<T>, IList<T>, IReadOnlyCollection<T>, IEnumerable<T>, etc.

List<T> has no virtual methods. If you really want to return a list, expose it as an IList<T> instead.
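For illustration, here is a hedged sketch (the class and event names are mine, not from any framework) of how Collection<T>'s protected virtual methods let us extend a returned collection with a null check and an insert hook:

```csharp
using System;
using System.Collections.ObjectModel;

// Hypothetical guarded collection: rejects nulls and raises a hook on insert.
class GuardedCollection<T> : Collection<T> where T : class
{
    public event Action<T> ItemInserted;

    protected override void InsertItem(int index, T item)
    {
        if (item == null) throw new ArgumentNullException(nameof(item));
        base.InsertItem(index, item);     // Add() routes through here
        ItemInserted?.Invoke(item);
    }

    protected override void SetItem(int index, T item)
    {
        if (item == null) throw new ArgumentNullException(nameof(item));
        base.SetItem(index, item);
    }
}
```

None of this is possible with List<T>, whose Add/Insert members are non-virtual.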


Avoid using default parameters 

MSDN:CA1026: Default parameters should not be used

Break type: Binary, Source
API before change (assemblyA)
public void Foo(int a) { }
API after change
public void Foo(int a, string b = "bar") { }  // binary break
    OR
public void Foo(int a) { }
public void Foo(int a, string b = "bar") { }  // source break
Client code (assemblyB)
Foo(5);     // binary break; MissingMethodException
    OR
Foo(5);     // source break; ambiguous method call

For the first case, the client code needs to be recompiled. The recompilation will emit Foo(5, "bar") at all call sites at the bytecode level, because C# embeds the default parameter values directly into the call sites. Default parameters change the method signature at the CLR level; they are a C# language feature, and the CLR does not treat them specially. This also explains why default values must be compile-time constants in C#. Default parameters are not equivalent to adding method overloads; they are like adding new parameters to the method definition.

In the second case, you have to make sure no ambiguous calls are created by adding another overload with default parameters.
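The fix CA1026 recommends is to use plain overloads instead, so the default value stays compiled inside the API assembly rather than being baked into every client call site. A rough sketch (the methods return a string here only to make the effect visible; the real methods were void):

```csharp
public static class Api
{
    // Overload pair instead of: public static string Foo(int a, string b = "bar")
    // The default value "bar" lives here, in assemblyA; changing it later
    // requires recompiling only assemblyA, not its clients.
    public static string Foo(int a)
    {
        return Foo(a, "bar");
    }

    public static string Foo(int a, string b)
    {
        return a + ":" + b;
    }
}
```

Note that Foo(5) is not ambiguous here: the one-argument overload is an exact match, so existing client calls keep resolving as before.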


Avoid using virtual methods

MSDN Framework Guidelines: DO NOT make members virtual unless you have a good reason to do so and you are aware of all the costs related to designing, testing, and maintaining virtual members.

Break type: Semantic
API (assemblyA)
class Base {
   private int state = 0;
   public virtual void Foo(){
      Bar();
      if(state == 17){...}
   }
   public virtual void Bar(){
      state++;
   }
}
Client code (assemblyB)
class Derived : Base {
   public override void Bar(){}
}

Method Base.Bar() changes the base class's state, and method Base.Foo() depends on that state change. After Derived overrides Base.Bar() as written above, Base.Foo() is broken, because the overridden Bar() no longer changes the Base state. One fix is to call base.Bar() from Derived.Bar(), so that the state change still happens. But this kind of fix breaks the encapsulation of the base class's implementation.

The general guideline is to try to avoid inheritance and use composition/delegation instead, in combination with interfaces.

This Derived class violates the Liskov Substitution Principle. The Derived type is not substitutable in method call sites that currently accept the Base type as a parameter, because the behavior of Base.Foo has changed now.
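Here is a minimal sketch of the composition/delegation alternative (the ICounter and Widget names are mine): the state-changing behavior lives behind an interface, so a client can swap in another implementation without overriding anything, and the internal state dependency stays encapsulated.

```csharp
public interface ICounter
{
    void Increment();
    int Value { get; }
}

public class SimpleCounter : ICounter
{
    public int Value { get; private set; }
    public void Increment() => Value++;
}

// Instead of inheriting from Base, Widget owns a counter and delegates to it.
// The implementation can even be replaced at runtime via the constructor.
public class Widget
{
    private readonly ICounter counter;
    public Widget(ICounter counter) { this.counter = counter; }

    public void Foo()
    {
        counter.Increment();
        if (counter.Value == 17) { /* ... */ }
    }
}
```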


Fragile Base Class inheritance

Break type: Source, Semantic
API before change (assemblyA)
public class Base {
   private int counter = 0;
   public void Foo() {
     counter++;
   }
   public virtual void Bar() {
     counter++;
   }
}
API after change
public class Base {
   private int counter = 0;
   public void Foo() {
      Bar();        // this is a valid refactoring
   }
   public virtual void Bar() {
      counter++;
   }
}
Client code (assemblyB)
public class Derived: Base {
   public override void Bar() {
      Foo();   // semantic break; goes into infinite recursion
  }
}

Here's a list of problems related to class inheritance in general:

  • Class inheritance breaks encapsulation. It makes the derived classes tightly coupled to the implementation details of the base class. Any change in the base class may break derived classes and may violate the Liskov Substitution Principle. 
  • There can be Multiple Inheritance issues. These are easily solved using composition instead of class inheritance. Most game engines use the Entity-Component System pattern (ECS) to avoid this kind of hierarchical problem. 
  • Fragile Base Class problems (as exemplified above). 
  • The implementation cannot be changed at runtime. Composition with interfaces solves this problem. 
  • It is difficult to add new classes to the hierarchy without breaking existing clients. For example, to fix Multiple Inheritance issues, you may need to restructure the tree. 
  • It can get very difficult to understand the control flow (function calls) once the hierarchy is two or more levels deep.



Base in the Middle

Break type: Semantic
API before change (assemblyA)
public class Foo {
    public virtual void M() {
        // some default stuff
    }
}
public class Bar : Foo {
    public void SomeOtherMethod() { }
}
API after change
public class Foo {
    public virtual void M() {
        // some default stuff
   }
}
public class Bar : Foo {
    public void SomeOtherMethod() { }
    public override void M() {
        // some new behavior
        base.M();
    }
}
Client code (assemblyB)
public class Client : Bar {
    public override void M() {
        base.M();   // semantic break; will still call Foo.M()
    }
}
 
new Client().M();

This C# behavior is seriously counter-intuitive. Most developers expect base.M() to be resolved at runtime (polymorphism) and, with the new API, to call Bar.M() and then Foo.M(). However, the base.M() call is statically bound at compile time to Foo.M(), so simply swapping in the new API assembly doesn't change the runtime behavior of the client, contrary to expectation. The client needs a recompilation to get the new behavior. (Java, by contrast, binds the super call at runtime and behaves as expected.)


Unsealing a sealed class

Break type: Semantic
API before change (assemblyA)
public interface IBaz { }
public sealed class Foo { }
public class Bar { }
 
//public class Derived : Bar, IBaz { }  // not needed, just to show
API after change
public interface IBaz { }
public class Foo { }                    // unsealed
public class Bar { }
 
//public class Derived : Bar, IBaz { }  // not needed, just to show
Client code (assemblyB)
static void M(Func<Foo, IBaz> f) { }
static void M(Func<Bar, IBaz> f) { }
 
M(x => x as IBaz);       // source break; overload resolution error

Before the API change, in the client's "x as IBaz" expression, x cannot be Foo (or a subclass of it), because Foo doesn't implement IBaz and is sealed. So x is resolved to Bar at compile time (even though Bar doesn't implement IBaz either, some subclass potentially could, e.g. Derived; no such subclass is required to exist at compile time).

After the API change, x can be resolved to both Foo and Bar (whether or not anything actually derives from them), because both can now have subclasses that implement IBaz. So an overload resolution error occurs.




References:
Microsoft Framework Design Guidelines 
A definitive guide to API breaking changes 
Jon Skeet's blogs (Jon Skeet: senior engineer at Google, highest ranked stackoverflow contributor)
Agile Architecture (Allen Holub: consultant, educator, writer)
Fragile Base Class  (Allen Holub: consultant, educator, writer)
How To Design A Good API (Joshua Bloch: Java platform architect at Google)
Array Covariance in C# (Eric Lippert: former principal engineer of the C# compiler team at Microsoft)
Open/Closed Principle (Uncle Bob: consultant, Agile speaker, software architect)
Virtual Methods and Brittle Base Classes (Eric Lippert: former principal engineer of the C# compiler team at Microsoft)
James Gosling on Java (James Gosling: creator of Java)
Google's Go programming language design decisions

Cyclomatic Complexity

If you were asked why the goto statement is considered harmful, what would you say? Our gut feeling tells us it’s bad, it’s not easy to follow code that has goto statements in it, it creates spaghetti code, it just has a bad vibe all around. But why is it really bad? What’s the mathematical reason behind it?

This blog is not for QA or testers, this is for developers who want to write quality code that is readable, maintainable and testable with good coverage. The number of software defects and maintenance cost are directly related to the complexity of an application.

Today we will talk about quantitative ways to reduce complexity at the function level and try to understand why it works. Please note that this blog is not about how to write tests, but about how to write code to facilitate testing with adequate coverage.

What is Cyclomatic Complexity?


Cyclomatic Complexity (henceforth CC or CCN) is a software metric that indicates the level of logical complexity of a program (or function). It measures the number of linearly independent paths through the code. The concept originates in graph theory and is applied to the control flow graph of a program. There are three ways to measure CC; the simplest is to count the decision points (if, while, for, switch cases and catch clauses) in a function and add one. Check here and here for details.
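As a rough illustration of the counting method, here is a naive decision-point counter (a sketch only; real tools walk the parsed syntax tree rather than matching keywords in source text, and this version will happily miscount keywords inside strings or comments):

```csharp
using System.Text.RegularExpressions;

static class Ccn
{
    // Naive estimate: CC = number of decision points + 1.
    // Counts branching keywords plus the short-circuit operators && and ||.
    public static int Estimate(string source)
    {
        int decisions = Regex.Matches(source,
            @"\b(if|while|for|foreach|case|catch)\b|&&|\|\|").Count;
        return decisions + 1;
    }
}
```

For the three-if function shown below, Estimate returns 3 + 1 = 4, matching its basis set size.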


Why is it important? When we want to test a method we have a few options at our disposal when choosing test case coverage. Some of them are:
  • Line Coverage: every statement will be executed at least once by our choice of test cases.
  • Branch Coverage (DD-Paths): each branch condition will be executed for both true and false values.
  • Basis Path Coverage: covers all linearly independent paths through the function. Basis paths are a set of linearly independent paths.

Essentially what it means is, any other path in the control flow graph of the function can be obtained by a linear combination of paths from the basis set. Basis Path Testing identifies test cases needed to ensure:
  • Every statement will be executed at least once (line coverage).
  • Each branch condition will be executed for both true and false values (branch coverage).

The Cyclomatic Complexity number of a function tells us how big the basis path set should be. Reducing the Cyclomatic Complexity of functions minimizes program defects and testing effort, and maximizes readability and maintainability. Using the CCN to plan test coverage eliminates redundant test cases while still providing adequate coverage. Complete path coverage (covering every path combination, a count known as the NPath complexity) is extremely difficult and resource-intensive, even for a simple function. Given these challenges, the goal is to select a set of test cases, a basis set, that will most likely identify as many defects as possible (not all, but many).
public int ReturnInput(bool foo, bool bar, bool baz, int x) {
    if (foo) {
        x++;
    }
    if (bar) {
        x--;
    }
    if (baz) {
        x = x;
    }
    return x;
}
Here are some different Test Cases for this function (Assuming : T = if (true), F = if (false))
  • Line Coverage (100% lines):                    TTT
  • Branch Coverage (100% branches):        TTT, FFF
  • Randomly selected Path Coverage:         TTT, FFF, FFT, TTF
  • Basis Path Coverage:                               TTT, FTT, TFT, TTF
  • NPath Coverage (full coverage):              TTT, FTT, TFT, TTF, FFF, FFT, TFF, FTF
There’s a defect in the above function: if foo is true and bar is false, it fails to return the input. Notice that neither Line coverage nor Branch coverage finds the defect. Even a random selection of 4 test cases fails to find it. The reason the random selection of paths fails is that those 4 cases are not linearly independent (for example, FFF = FFT + TTF − TTT). However, Basis Path Coverage with 4 basis test cases does find it. NPath coverage would find the defect as well, but we would have to run far more test cases (8 instead of 4). Writing test cases to cover all paths (NPath complexity) is practically impossible in most situations. Instead, our focus is going to be on covering linearly independent paths (a basis set).
Therein lies the beauty of Cyclomatic Complexity. It tells you the minimum number of test cases needed to adequately cover your function. However, it doesn’t tell you what those linearly independent paths are.
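To see this concretely, the ReturnInput example can be exercised against its basis set (a sketch; the expected outputs assume, as the name suggests, that the function should always return its input unchanged):

```csharp
static class BasisDemo
{
    // The function under test, reproduced from the post (bug and all).
    public static int ReturnInput(bool foo, bool bar, bool baz, int x)
    {
        if (foo) { x++; }
        if (bar) { x--; }
        if (baz) { x = x; }   // deliberate no-op
        return x;
    }

    // Run the basis set for input 42: TTT, FTT, TFT, TTF.
    public static int[] RunBasis()
    {
        return new[]
        {
            ReturnInput(true,  true,  true,  42),  // TTT -> 42
            ReturnInput(false, true,  true,  42),  // FTT -> 41
            ReturnInput(true,  false, true,  42),  // TFT -> 43: the foo-true/bar-false defect
            ReturnInput(true,  true,  false, 42),  // TTF -> 42
        };
    }
}
```

The TFT case, which both line and branch coverage can skip, is exactly the one that exposes the defect.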

How to determine a Basis Set of linearly independent paths

  • Pick any path (eg, TTT)
  • Flip each condition to its opposite, one at a time, to create new paths (eg, FTT, TFT, TTF).
  • Stop when you have reached the cyclomatic complexity number (eg 4).

The relationship between the different coverages is as follows:
line coverage ≤ branch coverage ≤ cyclomatic complexity ≤ total number of paths (NPath complexity)


A More Elaborate Example


Now, let’s analyse a more elaborate example, with a defect in it. Imagine a function that tells you if the King of chess can move up, down, left or right (no capturing of pieces or diagonal movements for simplicity).
// Cyclomatic complexity: 5; NPath complexity: 16
public void SetSquareForPiece(Piece piece, Board board, int x, int y) {
    int k = 0;
    if (board.GetSquareValue(x + 1, y) == "EMPTY") {         // 1
        piece.CanMoveRight = true;
    } else {                                                 // 2
        piece.CanMoveRight = false;
    }
    if (board.GetSquareValue(x - 1, y) == "EMPTY") {         // 3
        piece.CanMoveLeft = true;
        x = x + 1;
    } else {                                                 // 4
        piece.CanMoveLeft = false;
        k = k + 1;
    }
    if (board.GetSquareValue(x, y + 1) == "EMPTY") {         // 5
        piece.CanMoveUp = true;
        x = x + k;
    } else {                                                 // 6
        piece.CanMoveUp = false;
    }
    if (board.GetSquareValue(x - 1, y - 1) == "EMPTY") {     // 7   (bug)
        piece.CanMoveDown = true;
    } else {                                                 // 8
        piece.CanMoveDown = false;
    }
}
Assuming: T = if (true), F = if (false)
Branch Coverage (100% branches): TTFF, FFTT
Basis Path Coverage (cyclomatic complexity: 5): TTTT, FTTT, TFTT, TTFT, TTTF
NPath Coverage (2^4 = 16 paths): TTTT, TFTT, TTFT, TTTF, TFFT, TTFF, TFFF, TFTF, FTTT, FFFF, FFTT, etc.
In order to obtain full coverage for all possible path combinations, 16 unit tests are required for this function (NPath). However, we can get adequate coverage with Basis Path Testing with just 5 unit tests (CCN). But this is only a 31% path coverage (5/16). It’s important to note that Basis Path Coverage (5 tests) in this particular case does not find the defect. It’s because the defect in the last if-else block is masked by state changes in the other blocks.
What can we do to improve our chances of finding the defect? Let’s refactor the above function into four smaller functions (keeping the bug as is), a technique known as Extract Method:
public void SetPossibleSquare(Piece piece, Board board, int x, int y) {
    SetSquareRight(piece, board, x, y);
    SetSquareLeft(piece, board, x, y);
    SetSquareUp(piece, board, x, y);
    SetSquareDown(piece, board, x, y);
}
// CCN: 2; NPath: 2
public void SetSquareRight(Piece piece, Board board, int x, int y) {
    if (board.GetSquareValue(x + 1, y) == "EMPTY") {                      // 1
        piece.CanMoveRight = true;                                  
    } else {                                                              // 2
       piece.CanMoveRight = false;                                  
    }
}
// CCN: 2; NPath: 2
public void SetSquareLeft(Piece piece, Board board, int x, int y) {
    int k = 0;
    if (board.GetSquareValue(x - 1, y) == "EMPTY") {                      // 3
        piece.CanMoveLeft = true;
        x = x + 1;
    } else {                                                              // 4
        piece.CanMoveLeft = false;
        k = k + 1;                               
    }
}
// CCN: 2; NPath: 2
public void SetSquareUp(Piece piece, Board board, int x, int y) {
    int k = 0;
    if (board.GetSquareValue(x, y + 1) == "EMPTY") {                      // 5
        piece.CanMoveUp = true;
        x = x + k;
    } else {                                                              // 6
        piece.CanMoveUp = false;
    }
}
// CCN: 2; NPath: 2
public void SetSquareDown(Piece piece, Board board, int x, int y) {
    if (board.GetSquareValue(x - 1, y - 1) == "EMPTY") {                  // 7  (bug)
        piece.CanMoveDown = true;
    } else {                                                              // 8
        piece.CanMoveDown = false;
    }
}
Cyclomatic complexity of each method: 2 (T and F test cases for each function)
NPath complexity of each method: 2 (T and F paths for each function)

Notice that the refactor increased the total cyclomatic complexity of the methods from 5 to 8, but reduced the total NPath complexity from 16 to 8 (2 × 4), which now gives us 100% path coverage (8/8).
More importantly, now our tests also detect the bug at line #7 (with test case F), because this logical block is now an independent function and is not corrupted by state changes in other blocks.

Martin Fowler’s book Refactoring and Steve McConnell’s book Code Complete are two of the best sources of information on how to reduce program complexity and increase test coverage. Fowler advocates aggressive use of the Extract Method technique to keep the complexity very low. McConnell provides a list of reasons in his book to keep functions small.


Summary

  • Basis Path Coverage based on Cyclomatic Complexity number provides better test coverage than either Line or Branch coverage.
  • If a function has a high NPath complexity (all path combinations), then even a low cyclomatic complexity number may miss some defects in the function. We want our Basis Set of test cases to be as close to NPath complexity as possible for each function. Extract Method refactoring allows us to achieve that.
  • Extract Method refactoring may increase the combined CCN of the extracted methods, but it reduces the NPath complexity (the full path count) of each method, so each method's CC number approaches its NPath complexity, and test coverage approaches 100% as the methods get smaller.
  • In our Chess board example, we needed 8 unit tests for full coverage after Method Extraction, rather than 16 unit tests for full coverage before Method Extraction.
  • Even though the recommendation is to keep the CC number below 10 for a method, I would go as far as keeping it below 5. As we saw in the last example, even with a CCN of 5 for a single function we couldn't detect the bug in our code.
  • Reducing the cyclomatic complexity of a function makes the code more readable, maintainable and less error prone to future refactoring. Small functions also adhere to the Single Responsibility Principle and are highly cohesive.


Essential complexity is the cyclomatic complexity of the reduced CFG (control flow graph) that remains after iteratively replacing (reducing) all structured programming control structures in a function. Every fully structured function has an essential complexity of 1. An unstructured function, such as one that contains goto statements, is not always reducible to an essential complexity of 1, which makes it very hard to test and reason about. That's why goto statements are harmful: they lead to irreducible functions with essential complexity greater than 1.
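For the curious, here is a hedged sketch of what such irreducibility looks like in practice. The goto version below lets control enter the loop at two different points, so its flow graph cannot be reduced to nested structured constructs, even though (in this contrived case) both functions happen to compute the same sum:

```csharp
static class Irreducible
{
    // Structured version: essential complexity 1.
    public static int SumBelow(int n)
    {
        int s = 0;
        for (int i = 0; i < n; i++) s += i;
        return s;
    }

    // Unstructured version: the loop body can be entered either at
    // 'top' (via the backward goto) or at 'middle' (via the forward goto),
    // which makes the loop irreducible and hard to reason about.
    public static int SumBelowGoto(int n)
    {
        int i = 0, s = 0;
        if (n % 2 == 0) goto middle;   // second entry point into the loop
    top:
        s += i;
    middle:
        i++;
        if (i < n) goto top;
        return s;
    }
}
```

The two functions agree on their results here only because skipping the s += 0 step is harmless; proving that requires tracing the jumps by hand, which is exactly the cost Dijkstra warned about.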




References

Structured Testing using Cyclomatic Complexity, McCabe
Basis Path Testing
Reasons to Create a Routine (Steve McConnell, Code Complete)
Refactoring: Improving the Design of Existing Code, Martin Fowler
Path Testing
McCabe Cyclomatic Complexity, expert opinions (7 min video)
Go To Statement Considered Harmful, Dijkstra, 1968