Adventures in Coding: April 2016

Tuesday, April 26, 2016

Design by Contract

Design by Contract: A discipline of software analysis, design, implementation, testing and documentation. It also goes by the name Code Contracts, primarily because the name Design by Contract is trademarked.

The concept of Design by Contract (DbC) was first formalized by Bertrand Meyer, a French academic and the creator of the Eiffel programming language. Eiffel, which influenced Java and C#, was the first language built with DbC in mind. Microsoft Research later had a language called Spec#, an extension of C# with built-in support for DbC concepts. Code Contracts is the successor of that project. The .Net framework’s Base Class Library (BCL) has been using Code Contracts extensively since v4.0. Code Contracts used to be available only to Premium and Ultimate customers of VS, but Microsoft recently made it open source. It comes with a static checker and a runtime rewriter. Check the user manual for details.

The static analysis engine uses abstract interpretation (the automatic, compile-time determination of run-time properties of programs), not software verification (conformance to a specification).

Code Contracts vs Traditional if-then-throw block and debug asserts:

Design by Contract is a programming paradigm. It’s not a tool, a library or an extension to a language. It’s an abstract philosophy of how to design robust applications with the help of specifications known as contracts.

We already use some form of contracts in our everyday Object-Oriented Programming, such as Interfaces that serve as contracts for classes, delegates that serve as contracts for methods, and so on. We also write contracts in various forms for our software, such as in comments, as test suites, as documentation, as Debug.Asserts, and so on. The idea of DbC is to formalize a systematic approach so that every phase of the software life-cycle can benefit from a uniform specification. It’s a guide throughout the entire software development process.

The concepts of preconditions, post-conditions and invariants exist at all levels: requirements, design, implementation, testing, documentation.
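To make the three concepts concrete at the implementation level, here is a minimal sketch using the System.Diagnostics.Contracts API. The Account class and its members are hypothetical names made up for illustration. Note that without the Code Contracts rewriter enabled (and the CONTRACTS_FULL symbol defined), the Requires, Ensures and Invariant calls are compiled out, so the class still behaves like ordinary C#.

```csharp
using System;
using System.Diagnostics.Contracts;

// Hypothetical example class, not taken from any real code base.
public class Account
{
    private decimal balance;

    public decimal Balance
    {
        get { return balance; }
    }

    // Invariant: a condition that holds whenever a public member returns.
    [ContractInvariantMethod]
    private void ObjectInvariant()
    {
        Contract.Invariant(balance >= 0);
    }

    public void Deposit(decimal amount)
    {
        // Precondition: an obligation of the caller.
        Contract.Requires(amount > 0);
        // Post-condition: a guarantee made by this method.
        Contract.Ensures(Balance == Contract.OldValue(Balance) + amount);

        balance += amount;
    }

    public decimal Withdraw(decimal amount)
    {
        Contract.Requires(amount > 0);
        Contract.Requires(amount <= Balance);
        Contract.Ensures(Contract.Result<decimal>() == Contract.OldValue(Balance) - amount);

        balance -= amount;
        return balance;
    }
}
```

The same three kinds of statements could equally be written down as requirements or documentation; the code form is just the one the tools can check.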

Software Analysis: Contracts can be used during the analysis of software components to specify requirements. (https://archive.eiffel.com/doc/manuals/technology/contract/)

Software Implementation: Contracts define precisely how to implement the components based on preconditions, post-conditions, invariants and exceptions.

Inheritance: Contracts are inherited and enforce the Liskov Substitution Principle for inheritance, which helps to build robust software components.
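As a rough illustration of contract inheritance, the sketch below attaches contracts to an interface through a contract class, using the ContractClass and ContractClassFor attributes. IShapeParser, ShapeParserContract and SquareParser are hypothetical names. The implementing class declares no contracts of its own; with the Code Contracts tools enabled it is held to the ones declared for the interface.

```csharp
using System;
using System.Diagnostics.Contracts;
using System.Globalization;

// Hypothetical interface; its contracts live in the contract class below.
[ContractClass(typeof(ShapeParserContract))]
public interface IShapeParser
{
    double ParseArea(string text);
}

// Contract class: only a carrier for the contracts, never used by callers.
[ContractClassFor(typeof(IShapeParser))]
internal abstract class ShapeParserContract : IShapeParser
{
    public double ParseArea(string text)
    {
        Contract.Requires(!string.IsNullOrEmpty(text));
        Contract.Ensures(Contract.Result<double>() >= 0);
        return default(double);
    }
}

// The implementation inherits the contracts and cannot add preconditions.
public class SquareParser : IShapeParser
{
    public double ParseArea(string text)
    {
        double side = double.Parse(text, CultureInfo.InvariantCulture);
        return side * side;
    }
}
```

Any other implementation of IShapeParser is bound by the same precondition and post-condition, which is what lets callers substitute one for another.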

Exception handling: Contracts provide clear exception handling mechanisms.

Documentation: Contracts provide documentation for each component, documentation that never goes out of sync with the code-base because they are embedded in the code.

Testing/Debugging: Contracts provide a platform for tools and testing frameworks to auto-generate tests (see Microsoft Pex). They also make implementation of test cases very easy.

Some snapshots of what you can do with Code Contracts:

Here's a checklist of some of the advantages and disadvantages of Code Contracts vs Defensive Programming:

Following is a list of discoveries from the investigation of Code Contracts while implementing it in one of my projects.

If you are trying to apply Code Contracts to an existing code base, it is best to disable it at the assembly level and then enable it incrementally per method, then per class, and so on. Otherwise you will be bombarded with hundreds of warnings and squiggly lines everywhere. You can enable/disable them with the following attributes: [assembly: ContractVerification(false)], [ContractVerification(true)]
Add a SQL server name in the Code Contracts project properties for caching. I usually use (localdb)\V11.0.
Refactor long functions into smaller private methods. This makes it easier to add contracts.
Think about using the baseline option if you want to track warnings only from a certain baseline code (as in keep track of warnings after a certain point, to avoid getting overwhelmed by warnings).
If your project contains contracts and is referenced by other projects, it is recommended that you select Build under the contract reference assembly section in the properties tab for CodeContracts. The contract reference assembly for an assembly named A will be called A.Contracts.dll and appears in the project output directory.
Invariants are checked on every public method/property call. They are not checked on private method/property calls.
All methods called within a contract must be pure, that is, marked with the [Pure] attribute.
The static checker recognizes a new contract helper AssumeInvariant that allows you to assume the invariant of an object at arbitrary points in your program. It is a work-around for some static checker limitations.
If a method has one warning, subsequent warnings in that method may be suppressed, so expect new warnings to appear as you fix or comment out the earlier ones. Not sure if it’s a limitation or a feature.
Most of the LINQ methods are not annotated with the Contracts API (except for All, Any, Sum, etc). Check the source code or tooltip in VS to find out which ones have contracts.
The user-supplied string in Requires, Ensures, etc will be displayed whenever the contract is violated at runtime. Currently, it must be a compile-time constant.
Multiple invariant conditions in the same expression seem to throw “condition unproven” warnings sometimes. Breaking them up into separate invariant cases works.
The static contract checker has limited support for ForAll and Exists quantifiers, provided the quantified expression is simple like x => x != null. I wasn’t able to make them work statically at all.
Static checking does not work for closures. (Need to verify with latest version)
No contracts allowed on delegates. (Need to verify with latest version)
When writing iterators using yield, the static contract checker will not be able to prove post-conditions. (Need to verify with latest version)
Post-conditions (Ensures) of async methods are not checked by the static checker.
Any method whose fully qualified name begins with "System.Diagnostics.Contracts.Contract", "System.String", "System.IO.Path", or "System.Type" is considered pure, so you can use it in your contracts.
In a Pure method either enable the "Infer ensures" in properties or add Ensures (post-condition) in the method.
Enable "Infer invariants for readonly" in properties to make readonly properties invariants.
Preconditions are checked before running the constructor initializer.
Object invariants on interface properties are not supported.
You cannot add a Requires contract in an overridden method (because strengthening a precondition in a derived type violates LSP).
External input (database, file system, user input, etc) should be handled in Data Access Layer or UI layer, and all validated data should flow into Business Logic Layer as contracts.
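A few of the points above can be seen together in one small sketch: a [Pure] method used inside contracts, compile-time constant messages, invariants split into separate Contract.Invariant calls, and per-method opt-in with [ContractVerification(true)]. The Inventory class is a hypothetical example, and the contract calls are compiled out unless the Code Contracts tools are enabled.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Contracts;

// Hypothetical class used only to illustrate the notes above.
public class Inventory
{
    private readonly List<string> items = new List<string>();
    private readonly int capacity;

    public Inventory(int capacity)
    {
        // The message must be a compile-time constant.
        Contract.Requires(capacity > 0, "capacity must be positive");
        this.capacity = capacity;
    }

    public int Capacity { get { return capacity; } }

    public int Count { get { return items.Count; } }

    // Marked [Pure] so that it may be called from inside a contract.
    [Pure]
    public bool Contains(string item)
    {
        return items.Contains(item);
    }

    [ContractInvariantMethod]
    private void ObjectInvariant()
    {
        // Separate calls instead of one combined && expression.
        Contract.Invariant(items != null);
        Contract.Invariant(items.Count <= capacity);
    }

    // Opt this one method in when verification is disabled assembly-wide.
    [ContractVerification(true)]
    public void Add(string item)
    {
        // string.IsNullOrEmpty is a System.String method, so it counts as pure.
        Contract.Requires(!string.IsNullOrEmpty(item), "item must be a non-empty string");
        Contract.Requires(Count < Capacity, "inventory is full");
        Contract.Ensures(Contains(item));

        items.Add(item);
    }
}
```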

Some guidelines to start using Code Contracts:

When adding Code Contracts to an existing code base it may seem overwhelming to receive hundreds of warnings for only a few contract cases. So, the suggestion is to disable Code Contracts in all relevant assemblies and to enable it one method at a time, fixing the contracts of that method before moving on. In order to fix them you may need to add some Contract.Assume calls or null checks and empty string checks in the calling methods. Anyway, this is how to start the process...

Project configurations:

Download and install Code Contract static checker and binary rewriter in your Visual Studio: https://visualstudiogallery.msdn.microsoft.com/1ec7db13-3363-46c9-851f-1ce455f66970
You may install the Code Contracts Editor Extensions to see contracts in tooltip and intellisense, but I have been getting reproducible VS crashes on some tooltips. You may want to skip it for now.
In Project Properties -> Code Contracts, enable Perform Static Contract Checking.
In Project Properties -> Code Contracts, make sure Check in background and Show squigglies are also enabled.
In Project Properties -> Code Contracts, add a SQL server for caching, e.g. (localdb)\V11.0
In Project Properties -> Code Contracts, slide the warning level to "hi".

Adding contracts in your code:

1. Disable contracts in all relevant assemblies using the following attribute in the AssemblyInfo.cs file: [assembly: ContractVerification(false)]
2. Start with a class that has little or no dependencies on other classes.
3. Enable contracts on one method in the class chosen in Step 2 using the following attribute: [ContractVerification(true)]
4. Add contracts such as Contract.Requires and Contract.Ensures in the target method (chosen in Step 3). I suggest starting with simple null checks and empty string checks.
5. In your calling methods you may have to add some if-else null checks, array bound checks, empty string checks, or whatever else is needed to satisfy the contracts of the target method (the callee).
6. If at any point inside a target method a contract cannot be verified statically, use Contract.Assume (e.g. for library calls that don't support contracts).
7. Add contracts to the other methods/properties in the selected class. (Remember to enable ContractVerification for each method.)
8. Repeat the above steps for other classes.
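The steps above can be sketched roughly as follows. NameFormatter and Caller are hypothetical classes made up for this example. The assembly-level attribute is shown only as a comment because it belongs in AssemblyInfo.cs rather than in this file.

```csharp
using System;
using System.Diagnostics.Contracts;

// Disabling verification assembly-wide is done in AssemblyInfo.cs:
//   [assembly: ContractVerification(false)]

// A class with no dependencies on other classes (hypothetical).
public static class NameFormatter
{
    // Verification is switched back on for this one method only.
    [ContractVerification(true)]
    public static string Initials(string firstName, string lastName)
    {
        // Simple null and empty string checks to start with.
        Contract.Requires(!string.IsNullOrEmpty(firstName));
        Contract.Requires(!string.IsNullOrEmpty(lastName));
        Contract.Ensures(Contract.Result<string>() != null);

        return firstName.Substring(0, 1) + lastName.Substring(0, 1);
    }
}

public static class Caller
{
    public static string InitialsOrEmpty(string fullName)
    {
        if (string.IsNullOrEmpty(fullName))
        {
            return string.Empty;
        }

        string[] parts = fullName.Split(' ');
        // A fact about a library call that the checker may not be able to prove.
        Contract.Assume(parts != null);

        // Checks in the caller that satisfy the callee's preconditions.
        if (parts.Length < 2 || string.IsNullOrEmpty(parts[0]) || string.IsNullOrEmpty(parts[1]))
        {
            return string.Empty;
        }

        return NameFormatter.Initials(parts[0], parts[1]);
    }
}
```

Once the warnings for this method are gone, the same attribute can be moved up to the class level and the process repeated.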

Design by Contract in other languages:

There are many 3rd party libraries that provide support for Design by Contract in languages such as C++, Java, JavaScript and others. Most of these libraries provide runtime contract validation. Static validation in popular languages/frameworks is rare. The .Net framework happens to be the first popular framework that has built-in support for DbC. In the Wikipedia page for Design by Contract you will find a list of 3rd party libraries for other languages: https://en.wikipedia.org/wiki/Design_by_contract.

There have also been talks about incorporating Contracts in the next version of C++ (2017). Here's a paper by Bjarne Stroustrup in which he says he would like to see Contracts in C++17: https://isocpp.org/files/papers/D4492.pdf

User Manual: http://research.microsoft.com/en-us/projects/contracts/userdoc.pdf
How the static checker works: http://research.microsoft.com/en-US/projects/contracts/cccheck.pdf

Unit Test Best Practices

This post provides a summary of good design techniques for writing testable code and best practices for unit tests. The information is based on my research from books, talks and blogs by industry experts.
Please refer to the references section to learn more from the experts.

Unit tests are development tests. They should test units of work in isolation. In our case we can consider a class as a unit. They should be used in the day-to-day development process. Hence, they must be fast, readable and maintainable. We will see below why each attribute is necessary and how to achieve them. Check the glossary at the bottom for acronyms.

The goal of this guideline is to help us create unit tests that will serve as internal documentation of the architecture of the application (besides testing of course), as well as creating tests that are easy to review. That's why we want to maintain a flat structure for each test. Hence, the naming conventions, readability and maintainability are of utmost importance. This will help us to follow the TDD methodology in the future.

Development guidelines for writing testable code

Before we can write unit tests, we need to make sure that our application can be isolated into units. The guidelines below are intended to help us achieve that.

Dependency Injection

It's best to design/refactor the application so that the classes are as decoupled as possible. It becomes really hard to write clean unit tests if the classes/modules are coupled. One way to achieve decoupling of classes is by Dependency Injection (DI). We can either use a DI framework for .Net or write basic DI functionalities manually (which becomes too much work when we have deep object trees). Popular DI frameworks for .Net are: Autofac, Castle, Ninject, Spring.Net, etc. For runtime dependencies (or lazy instantiation) use an injected factory/provider.

Dependency Injection is the primary and fundamental requirement for writing clean unit tests easily in an Object-Oriented language. Here's a blog from Bob Lee, a former team lead of the Android core library at Google and creator of Guice, talking about how most Java applications at Google use dependency injection, applications such as Gmail, YouTube, AdWords, Google Docs, etc.

Besides the secondary benefit of testability, dependency injection's primary benefits are making the code decoupled, modular, reusable and readable. It also makes you adhere to best practices of software engineering such as programming to interfaces rather than implementations, the Single Responsibility Principle and the Open/Closed Principle of SOLID (see below).
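Here is a minimal sketch of manual constructor injection, without any DI framework. IMessageSender, WelcomeService and RecordingSender are hypothetical names invented for the example. The point is only that WelcomeService depends on an interface handed to it from outside, so a test can pass in an in-memory fake.

```csharp
using System;

// The abstraction the service depends on (hypothetical).
public interface IMessageSender
{
    void Send(string recipient, string body);
}

// The class under test never creates its own dependency.
public class WelcomeService
{
    private readonly IMessageSender sender;

    public WelcomeService(IMessageSender sender)
    {
        if (sender == null)
        {
            throw new ArgumentNullException("sender");
        }
        this.sender = sender;
    }

    public void Welcome(string userName)
    {
        sender.Send(userName, "Welcome, " + userName + "!");
    }
}

// An in-memory fake that a test can inject instead of a real sender.
public class RecordingSender : IMessageSender
{
    public string LastRecipient { get; private set; }
    public string LastBody { get; private set; }

    public void Send(string recipient, string body)
    {
        LastRecipient = recipient;
        LastBody = body;
    }
}
```

In production the composition root (or a DI container such as the ones listed above) would pass a real implementation of IMessageSender to the same constructor.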

SOLID principles

The next thing to pay attention to is the set of SOLID principles of Object-Oriented software engineering. Not only are these best practices for good architecture, they also make code very testable. Here are the descriptions in brief. Please check them out online if you are not familiar with them already.

S: Single Responsibility Principle: Each class and method should have a single responsibility. It makes unit testing simpler (obviously) and makes code less prone to cross-feature bugs introduced during bug fixes or refactoring.

O: Open/Closed Principle: Modules should be open for extension, but closed for modification. It makes unit tests maintainable (because existing code base is not changed often, except for bug fixes or development time design fixes).

L: Liskov Substitution Principle: Base types should be substitutable by derived types without breaking expected behavior. Unit testing for LSP guarantees that polymorphic code doesn't break in existing clients when new derived types are implemented (see Unit Tests guidelines).

I: Interface Segregation Principle: Use multiple specific interfaces instead of one big interface. It makes unit testing easier by focusing on smaller units. Also makes it easy to create stubs or mocks for unit tests.

D: Dependency Inversion Principle: High level modules should not depend on low level modules. Both should depend on abstractions. It makes unit tests maintainable, because changing/refactoring details in one level doesn't affect the other levels, since they both are dependent on interfaces rather than implementations. (This is NOT the same as Dependency Injection btw, the two are completely different concepts.)

Here's a list of cases where violating the Single Responsibility Principle for methods will cause problems:

Long methods are not unit testable. A long method contains local state that is shared by multiple logical blocks in the same method, so it becomes difficult to follow and isolate test cases.
Any change in any logical block of a long method has the potential to break other logical blocks in the same method. This requires re-tests for all logical paths of the whole method for any fix, and makes it susceptible to regression bugs.
Long methods cannot be profiled effectively (if we need to in the future). A profiler will only tell you that a method is slow, not which logical part of the method is the cause.
It's very hard to add Code Contracts to long methods.

Global state (to avoid)

Avoid using global states, such as static variables or the Singleton Design Pattern. Singleton objects that are dependency injected are ok, but avoid the Singleton Design Pattern. Static methods are ok if they don't change or access any state, or if they are leaf nodes in the call graph. Note that static methods that access DateTime.Now or the Random class are accessing global state. Here's a short list of problems related to global states in a program:

Readability issues - Source code is easier to understand when the scope of objects is limited. Since global variables can be accessed from anywhere, it becomes difficult to remember or reason about every possible use.
Implicit coupling - Since different objects are accessing mutable global states, they are implicitly coupled via the global state.
Concurrency issues - Global mutable states are well-known for concurrency problems for obvious reasons.
Unit testing issues - Unit testing becomes difficult because you need to set up the global states for the tests, which may be hard because of global coupling with other code. Also, tests may become contaminated in-between runs. Pure static methods are unit testable, but the callers of static methods have dependencies on the classes/assemblies of those static methods and you cannot mock them out. So, you cannot test the caller in isolation. Pure static methods as leaf nodes may be ok.
Scalability issues - Global scope is per AppDomain, so scaling the application to multiple processes will pose a challenge. You will need to pass the globals as parameters or something, so they are not really used as globals anymore.
Refactoring issues - Since multiple parts of the application are coupled via globals, refactoring becomes risky as it can break anything. This is against the TDD methodology.
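As a small illustration of removing one piece of global state, the sketch below hides DateTime.Now behind an injected interface. IClock, SystemClock, FixedClock and Greeter are hypothetical names. With the clock injected, the greeting depends only on what the test passes in, not on when the test happens to run.

```csharp
using System;

// Abstraction over the global clock (hypothetical).
public interface IClock
{
    DateTime Now { get; }
}

// Production implementation: the only place that touches DateTime.Now.
public class SystemClock : IClock
{
    public DateTime Now
    {
        get { return DateTime.Now; }
    }
}

// Test implementation: always returns the same instant.
public class FixedClock : IClock
{
    private readonly DateTime now;

    public FixedClock(DateTime now)
    {
        this.now = now;
    }

    public DateTime Now
    {
        get { return now; }
    }
}

public class Greeter
{
    private readonly IClock clock;

    public Greeter(IClock clock)
    {
        this.clock = clock;
    }

    public string Greet()
    {
        return clock.Now.Hour < 12 ? "Good morning" : "Good afternoon";
    }
}
```

The same wrapping idea applies to the Random class or to any static variable that the code under test would otherwise reach for directly.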

Unit Test guidelines

Here are some guidelines to make unit tests fast, readable and maintainable:

Readability:

Test project naming convention: {ProjectName}.Tests
Test class naming convention: {ClassName}Tests
Test method naming convention: {MethodName}_{StateUnderTest}_{ExpectedBehavior}
Identify stubs and mocks clearly for variables in a test method, eg: stubEmployee, mockEmployee
Follow the Arrange, Act, Assert (AAA) pattern in each test.
There should be no logic in unit tests (no if-else, switch-case, loops, etc). We don't want to test the tests. We don't want to think "logically" when reading someone else's tests. We don't want to spend time "maintaining" the tests.
Avoid adding comments in your tests. If they need comments then it implies that the code is not easily readable.
Don't use magic strings or numbers in a test, such as "Test_Title", "CountryName", "Jane Doe", 15, 55, etc., to fill properties that are not directly under test. Use the following defaults instead:
- String: "aaa"
- Char: 'a'
- Number: 10, 100, 1000, etc
- DateTime: default(DateTime)
- Boolean: default(bool)
- Enum: default(enumType)
- Reference: null, or create a stub/mock, or pass in the real object if it has been proven to work (with tests)
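Put together, a test following these readability rules might look like the sketch below. Employee, EmployeeValidator and the tiny Check helper are all hypothetical. Check merely stands in for the Assert class of whatever test framework you use, and in a real project each test method would also carry that framework's test attribute. The three AAA phases are separated by blank lines rather than comments.

```csharp
using System;

// Hypothetical production code.
public class Employee
{
    public string Name { get; set; }
    public string Department { get; set; }
    public DateTime HireDate { get; set; }
    public int Age { get; set; }
}

public class EmployeeValidator
{
    public bool IsValid(Employee employee)
    {
        return employee != null
            && !string.IsNullOrEmpty(employee.Name)
            && employee.Age >= 18;
    }
}

// Stand-in for a test framework's Assert class (an assumption of this sketch).
public static class Check
{
    public static void AreEqual<T>(T expected, T actual)
    {
        if (!object.Equals(expected, actual))
        {
            throw new InvalidOperationException("Expected " + expected + " but was " + actual);
        }
    }
}

// Would live in a {ProjectName}.Tests project.
public class EmployeeValidatorTests
{
    public void IsValid_AgeBelowEighteen_ReturnsFalse()
    {
        var validator = new EmployeeValidator();
        var employee = new Employee
        {
            Name = "aaa",
            Department = "aaa",
            HireDate = default(DateTime),
            Age = 17
        };

        bool result = validator.IsValid(employee);

        Check.AreEqual(false, result);
    }
}
```

Only Age carries a meaningful value here because it is the state under test; every other property uses one of the neutral defaults.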

Isolation:

All unit tests should be in-memory tests. They should not leave main memory. If your test leaves main memory, delete the test, or put it among Integration Tests.
Use stubs for dependencies. If a dependency is hard-coded in the SUT, create an interface (and a wrapper if necessary) to be able to stub out the dependency during test. You may use the dependency directly if it is proven to be working (with passing tests).
Use at most one mock per test, mostly for testing interactions with 3rd party libraries. Too many fakes/mocks make tests tied to implementation details and unmaintainable (eg: a new version of the library may work differently).
Tests should not allow access to the Database. Use a hashtable or other in-memory data structure, or a database wrapper (eg: a generic IRepository)
Tests should not allow access to the File system or registry (not even config files). Use a FileWrapper interface to stub out File IO (same for Configs).
Tests should not allow access to the Network. Use a NetworkWrapper interface to stub out network access.
Tests must be deterministic, repeatable, able to run in parallel and in any order. Avoid using global states in SUT in order to achieve these conditions for the test. Avoid using global states in tests (eg DateTime.Now, Random, etc).
Try to avoid Setup and Teardown methods (or use them lightly). They decrease readability and can cause state corruption in between tests. Use factory methods instead.
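The isolation rules can be sketched with hand-written fakes as follows. SettingsLoader, IFileWrapper, IAuditLog and the fakes are hypothetical, and Check again stands in for a test framework's Assert class. The file system is replaced by a stub, a factory method takes the place of a Setup method, and the same fake log acts as a stub in one test and as the single mock in the other, depending on whether the test asserts against it.

```csharp
using System;
using System.Globalization;

public interface IFileWrapper { string ReadAllText(string path); }
public interface IAuditLog { void Write(string message); }

// Hypothetical SUT: reads a number from a file and logs that it did so.
public class SettingsLoader
{
    private readonly IFileWrapper files;
    private readonly IAuditLog log;

    public SettingsLoader(IFileWrapper files, IAuditLog log)
    {
        this.files = files;
        this.log = log;
    }

    public int LoadTimeout(string path)
    {
        string text = files.ReadAllText(path);
        int timeout = int.Parse(text.Trim(), CultureInfo.InvariantCulture);
        log.Write("timeout loaded");
        return timeout;
    }
}

// Stub: returns a canned response and never touches the disk.
public class StubFileWrapper : IFileWrapper
{
    public string Contents { get; set; }
    public string ReadAllText(string path) { return Contents; }
}

// Fake log: counts the calls it receives.
public class FakeAuditLog : IAuditLog
{
    public int WriteCallCount { get; private set; }
    public void Write(string message) { WriteCallCount++; }
}

// Stand-in for a test framework's Assert class (an assumption of this sketch).
public static class Check
{
    public static void AreEqual<T>(T expected, T actual)
    {
        if (!object.Equals(expected, actual))
        {
            throw new InvalidOperationException("Expected " + expected + " but was " + actual);
        }
    }
}

public class SettingsLoaderTests
{
    // Factory method instead of Setup; the fakes' values are set in each test.
    private static SettingsLoader CreateLoader(IFileWrapper files, IAuditLog log)
    {
        return new SettingsLoader(files, log);
    }

    public void LoadTimeout_FileContainsNumber_ReturnsParsedValue()
    {
        var stubFiles = new StubFileWrapper { Contents = "100" };
        var stubLog = new FakeAuditLog();
        SettingsLoader loader = CreateLoader(stubFiles, stubLog);

        int result = loader.LoadTimeout("aaa");

        Check.AreEqual(100, result);
    }

    public void LoadTimeout_FileContainsNumber_WritesOneAuditEntry()
    {
        var stubFiles = new StubFileWrapper { Contents = "100" };
        var mockLog = new FakeAuditLog();
        SettingsLoader loader = CreateLoader(stubFiles, mockLog);

        loader.LoadTimeout("aaa");

        Check.AreEqual(1, mockLog.WriteCallCount);
    }
}
```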

Asserts:

Test only one thing per test. Try to use only one assert. Multiple asserts on a single object are fine if the asserts are related, but don't do multiple asserts on different independent objects/properties in the same test.
Verify only a single call to a mock object.
Don't use variables in your assert. Use fixed values, numbers or strings. Variables can contain logical errors or may duplicate production logic. They also make tests hard to read.

General:

Test only the public methods. Testing private methods makes them tied to implementation details and makes it hard to refactor the application without breaking tests. There may be exceptional cases though.
It is recommended to create separate helper factory methods with different names instead of returning different objects conditionally from the same factory. This is to avoid logical bugs in the factory code and for ease of readability of the test method.
If a stub is created in a factory, don't hard-code the stub's properties in the factory, set them in the test method instead.
All base class unit tests should pass with derived classes. Inheritance tests should pass LSP (base class exceptions, pre-conditions, post-conditions, invariants, history, etc.). Please read up on LSP to understand the details.
Never [Ignore] a test method. Let it fail or fix it.
If a test that is written following these guidelines doesn't work as expected, keep the test and delete the code. Now run the test, it will fail (obviously), write the minimum code needed to make it pass, refactor and repeat. You are on your way to TDD.
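One way to make base class tests run against derived classes, sketched under my own assumptions, is to put the tests in an abstract test class and let each concrete test class supply the object through a factory method. Discount, HalfPriceDiscount and the Check helper are hypothetical. The base tests state what every Discount must guarantee, so a derived type that breaks LSP fails them.

```csharp
using System;

// Hypothetical base type: Apply returns an amount between 0 and price.
public class Discount
{
    public virtual decimal Apply(decimal price)
    {
        return price;
    }
}

public class HalfPriceDiscount : Discount
{
    public override decimal Apply(decimal price)
    {
        return price / 2;
    }
}

// Stand-in for a test framework's Assert class (an assumption of this sketch).
public static class Check
{
    public static void IsTrue(bool condition)
    {
        if (!condition)
        {
            throw new InvalidOperationException("Expected condition to be true");
        }
    }
}

// Tests written once against the base type's expected behavior.
public abstract class DiscountTestsBase
{
    protected abstract Discount CreateDiscount();

    public void Apply_PositivePrice_ReturnsAmountNotAbovePrice()
    {
        Discount discount = CreateDiscount();

        decimal result = discount.Apply(100m);

        Check.IsTrue(result <= 100m);
    }

    public void Apply_PositivePrice_ReturnsNonNegativeAmount()
    {
        Discount discount = CreateDiscount();

        decimal result = discount.Apply(100m);

        Check.IsTrue(result >= 0m);
    }
}

// Each type gets a concrete test class that only supplies the object.
public class DiscountTests : DiscountTestsBase
{
    protected override Discount CreateDiscount() { return new Discount(); }
}

public class HalfPriceDiscountTests : DiscountTestsBase
{
    protected override Discount CreateDiscount() { return new HalfPriceDiscount(); }
}
```

Tests for behavior specific to HalfPriceDiscount would go into HalfPriceDiscountTests alongside the inherited ones.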

What are the benefits of TDD

TDD stands for Test Driven Development. The general practice is to write minimal unit tests before writing any line of code. Sometimes the differences and benefits between traditional unit tests and TDD unit tests are not clear to everyone. Hence, here is a bullet point list of some concrete advantages that TDD provides over the traditional way of writing unit tests (after writing code):

TDD guarantees that tests are not passing by accident, because tests start by failing and pass only after implementation. Traditional unit tests require you to manually break working code to guarantee that the tests are working as expected. This is especially useful when testing async calls, because most async tests will pass automatically if not written properly.
TDD enforces the practice of decoupled code, because other implementations are not yet available for tests, so all tests need to use mocks/stubs by default. Makes better code design. Traditional unit tests end up becoming integration tests because it is tempting to use real implementations rather than mocks.
Writing tests after the code is like writing tests for code that already works, because the code was already confirmed by manual testing during implementation. So, traditional unit tests seem like extra tests and sometimes it is tempting to bypass hard to test cases (because they were manually tested during coding). TDD enforces the automated testing practice and reduces reliance on manual testing during implementation.
TDD makes you write less code during implementation, because tests are spec-oriented rather than implementation-oriented. Testing the spec gives more confidence in the application than testing the internal implementation. It leads to cleaner interfaces, guarantees that tests cover all relevant behaviour of the application, and streamlines the codebase. The traditional method can bloat the codebase with code that may be used in the future.
Traditionally, new functionalities are bolted on to the application in fear of breaking it. TDD gives confidence to truly integrate new functionalities into existing applications without fear of breaking them. The confidence comes from the fact that the tests are spec-oriented, not implementation-oriented, so they guarantee that the spec is not broken.
Constant refactoring makes sure the code never becomes outdated and unchangeable.

Glossary:

SUT: System Under Test. A class, a method or a module. Anything that's under test.
DI: Dependency Injection
TDD: Test Driven Development
AAA: Arrange, Act, Assert. A unit test writing pattern for separating a test method into three groups. Arrange the variables and dependencies needed, call the method to test (Act), Assert on the expected state/behavior.
LSP: Liskov Substitution Principle (named after the computer scientist Barbara Liskov)
Stub: A fake object created for state verification of the SUT. They are pre-programmed with hard-coded responses to calls. You don't assert against them. You assert against the SUT (or other collaborators).
Mock: A fake object created for behavior verification of the SUT. They are pre-programmed with "expectations" of the calls they are expected to receive from the SUT. You assert against them to verify that the right calls were made by the SUT.

References:

Robert Martin (Uncle Bob): Co-author of the Agile Manifesto. Check out his blogs and videos.
Martin Fowler: Co-author of the Agile Manifesto, speaker, writer, check out his blogs on Dependency Injection and IoC containers.
Misko Hevery: Agile coach at Google. Check out his clean code talks series on YouTube, and his blogs on how to write testable code.
Roy Osherove: Agile coach and writer. Check out his book "The Art of Unit Testing" or his YouTube videos on unit tests.
Jon Skeet: Senior Software Engineer at Google, author, all-time highest ranked StackOverflow contributor, check out this demo on IoC container from scratch.
Mark Seemann: Writer, developer. Check out his articles on Dependency Injection.
James Coplien: Writer, lecturer, researcher, C++ guru, Agile consultant. Check out his piece on Why Most Unit Testing is Waste.
Bob Lee: Former team lead of Android core library at Google, talks about Guice, the dependency injection framework that Google uses in all Java applications.