Thursday, October 20, 2016

Effective Code Review

The following checklist is meant as a general-purpose guideline for code review. Beyond hunting for bugs, code review is an opportunity to learn and to teach others about good coding practices and patterns. Take this opportunity to teach new members about coding conventions, design principles and corner cases in your code base.


Overview

  • Use a checklist
  • The code should fit in the overall architecture of the code base (a design discussion should precede the implementation, so this is a sanity check)
  • SOLID principles of software engineering are respected
  • Design paradigms followed by the team are respected (object-oriented, functional, ECS, data-oriented, etc)
  • Design patterns used are appropriate for the use case
  • Promote code reuse; DRY principle
  • Function arguments are validated for invalid input (both range and type)
  • Unit tests have been written (or test cases have been outlined) for passing, failing and exceptional test cases
  • If there is an older pattern and a newer pattern of coding, make sure the newer pattern is followed
  • Consider the Big-O efficiency of algorithms (while avoiding micro-optimization). Performance will vary among platforms (desktop, mobile, VR); the algorithm's runtime should take that into account.
  • Naming conventions and formatting are respected
  • Anything that can be easily modified later or by a tool should be given the least priority during review
  • Familiarize yourself with the OWASP Top Ten security vulnerabilities

Detailed Checks

[ARRAY]
  • Null or empty check
  • Minimum length of input array is checked
  • Integer overflow for array elements (int.MinValue, int.MaxValue, etc); eg, is there an operation (add/multiply) inside the function that can cause int overflow? (2 * 1.5GB = 2 * 1,610,612,736 bytes will return a negative number in C#/Java if the result is stored in an Int32)
  • Logical conditions are satisfied at both ends of the array (depending on how variables are initialized from a[0], or how a loop is exited, some logic may not execute for the endpoint elements of the array)
  • Buffer overflow, index out of bounds
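To make the integer-overflow bullet concrete, here's a small Java sketch (method names are mine) showing how adding two 1.5 GB byte counts silently wraps negative in a 32-bit int, plus two safer alternatives:

```java
public class OverflowDemo {
    static int sumInt(int a, int b) { return a + b; }          // silently wraps
    static long sumLong(int a, int b) { return (long) a + b; } // widen first: safe

    public static void main(String[] args) {
        int gb15 = 1_610_612_736; // 1.5 GB in bytes
        System.out.println(sumInt(gb15, gb15));  // -1073741824: wrapped negative
        System.out.println(sumLong(gb15, gb15)); // 3221225472
        try {
            Math.addExact(gb15, gb15);           // throws instead of wrapping
        } catch (ArithmeticException e) {
            System.out.println("overflow detected");
        }
    }
}
```

When the sum can legitimately exceed Int32.Max, widen to a 64-bit type before the arithmetic; when it can't, an explicit overflow check (Math.addExact, or `checked` in C#) turns silent corruption into a visible error.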
"String"
  • Null, empty or whitespace
  • Single char string
  • Unicode chars (surrogate pairs)
  • Single and double quotes inside string
  • Escape chars "\n", "\t" inside string
  • If the method contains a regular expression, check for regex DoS vulnerabilities, aka ReDoS attacks (eg: ^(\d+)+$); add a timeout in the Regex constructor as a safety measure
  • If using an XML parser (reachable from a service API), make sure XML DoS vulnerabilities are taken into account (aka XML Bomb, Billion Laughs): XML entity expansion can cause exponential memory usage
(Number, DateTime)
  • Make sure culture format is taken into account for numbers and datetimes
  • DateTime should normally be UTC
  • Some times are invalid on certain dates, even if the hh:mm:ss looks good. For eg: "April 2, 2016, 2:30 A.M." is an invalid time in certain time zones where daylight saving time begins that night, even though it 'looks' fine.
  • DateTime.Now values should not be used to calculate time differences for anything; they are prone to errors (DST, for example). Always use either UtcNow or a Stopwatch.
  • Always check if any calculation may cause an int to overflow the min/max values of the Int32 or Int64 range. (Lua numbers are double by default)
{Class}
  • Make sure every class overrides the GetHashCode and Equals methods. (The default hash implementations may be vulnerable to DoS attacks in server-side code. Default GetHashCode has other issues for both classes and structs, so it's better to avoid the default implementation altogether.)
  • Classes and functions should try to live up to the SOLID principles of OOD.
    At least live up to the Single Responsibility Principle, as in, a class should have a single responsibility, and a function should have a single responsibility. This improves modularity and testability.
  • Function names should tell us exactly what the function does, and a function should not do more work than its name advertises. We should not have to read the body to understand what the function does.
  • Methods in a class must be cohesive.
  • Program to interfaces rather than implementation (abstract classes in C++, interfaces in C#/Java). It helps in keeping classes decoupled and flexible.
  • Prefer composition over inheritance. We use the Entity-Component-System design pattern in our Lua code-base. Familiarize yourself with it and help others learn about it.
Function()
  • Add comments with the correct syntax for auto-generation of documentation from comments. At least public functions should be commented. Eg: ADoc for Lua can generate docs from comments.
  • Watch out for type safety in dynamic languages (Lua, JavaScript). Function arguments may need to be validated for the expected type.
  • Avoid coupling functions and classes with global mutable types. In other words, minimize usage of global lists, variables and objects.
  • Prefer not returning null from a function
  • Lua supports tail call optimization for recursive function calls; take advantage of it whenever you can, as it prevents stack overflow. Neither Java nor C# guarantees tail call optimization.
  • Make sure you understand what #table means in Lua. For a table with gaps (holes), # may return any "border" (an index n where t[n] is non-nil and t[n+1] is nil), so the result is not well defined. Eg: print(#{1, 2, nil, 3, nil}) may print 2 or 4 depending on the implementation.
#Framework/Library
  • Make sure frameworks and libraries you are using don't have any known vulnerabilities.
  • Watch out for framework or library related gotchas: Java's SimpleDateFormat.parse method is not thread-safe, which is unexpected. Java's Random method is not cryptographically secure (use SecureRandom instead).
  • Turn FxCop on with custom rules for .Net static check. Note: FxCop checks the compiled code, not the source code.

~ Multi-Threading ~

  • Prefer Isolation first, Immutability second, and Synchronization last for thread safety
  • Prefer newer concurrency models such as Actor model (message passing, Java Akka, Akka.Net, TPL DataFlow) and Software Transactional Memory (JVSTM, Akka STM, STMNET, Clojure)
  • .Net Concurrent collections are only thread-safe per method call (of the collection); they don't make YOUR code thread-safe across multiple method calls on the collection. Check-then-act sequences are not atomic.
  • Avoid locking on lock(this), lock(typeof(myClass)), lock(valueType), lock(public object), lock("string literal")
  • If you are desperate to test concurrency related code in .Net, check out Microsoft CHESS, a tool for finding and deterministically reproducing concurrency bugs (eg deadlocks). For Java use IBM's ConTest.

Security
  • Understand the flow of data in your application. Check with your team's security advocate to see if there's a Threat Model or Data Flow Diagram for the portion of code you are reviewing.
  • Data that cross privilege boundary should be checked for authorization and possibly encrypted
  • Familiarize yourself with the OWASP Top 10 vulnerabilities (such as XSS, CSRF, SQL injections, etc)
  • Install OWASP SafeNuGet and Dependency Check into your .Net and Java projects to automatically check for 3rd party package vulnerabilities
  • Any kind of user input must be sanitized to prevent injection attacks
  • Buffer overflow, index bounds, integer overflow, stack overflow
  • Improve code readability and testability. This will reduce defects in code, which are the major source of security vulnerabilities.
  • Make sure passwords, encryption keys, tokens, etc don't get checked into VCS (git). 
  • Use OWASP Regex Parser to check for Regex vulnerabilities in your code
  • Don't use String to store passwords or CC numbers. C# has SecureString, in Java use a char[].
  • Pay attention to deserialization of untrusted data, using Json.Net for eg. (possibility of arbitrary code execution)
  • Do not log session tickets, access tokens, usernames, PII, connection strings, keys, etc.



References:
Proven Practices for Peer Review - IBM
OWASP Code Review Guide
SafeNuGet - OWASP
The Unfortunate Reality of Insecure Libraries
Code Complete - Steve McConnell
Refactoring - Martin Fowler
Lua Sandboxing
Regex Vulnerabilities Cheatsheet
Black Hat - JSON deserialization attacks
Json.NET TypeNameHandling Vulnerability

Sunday, October 2, 2016

Semantic Knowledge Base with Natural Language Chatbot

The following video is an internal Tech Talk I gave based on a hackathon project I did at Autodesk in September 2016. The talk describes the project and its long-term goal.




Friday, September 30, 2016

Becoming Agile: Practice & Culture

This is a piece I wrote for the engineering newsletter while working as a developer at Autodesk. The intent was to understand why our team was struggling to adopt Agile principles and practices. It was published in February 2016.

Why write this piece?
If you were asked “Can a company cultivate an agile culture,” what would you say? Your response might fall somewhere between the processes the company actually follows, and what the company tells itself about those processes. And if we look at companies like Google, where no one says they follow Agile, is it possible they can be Agile, without necessarily following Agile? In the words of Jeff Sutherland, the father of Scrum, "they [Google] are Agile as far as I am concerned no matter what they are doing."
I spent a week having one-on-one discussions with colleagues from different teams within M&E about Autodesk’s Agile culture and adoption. I interviewed 16 people from 7 different teams. I ended up gathering way more information than I can summarize here. What I heard from everyone was that Autodesk wants to be collaborative and wants to improve processes; we just may not have the practices or culture to do it seamlessly quite yet.
The Practices
What came up over and over again is the perception that we are often too focused on process (such as Scrum), at the expense of engineering quality. Robert Martin, one of the signatories of the Agile Manifesto, explains in a powerful talk how companies adopting Scrum often fail to get the full benefit of Agile because they don't adopt the technical aspects of it. Martin Fowler coined the term Flaccid Scrum to explain this.
Our estimation techniques (read: grooming and planning meetings) may not be getting adopted by our teams. This lack of adoption alludes to what could be understood as too much of a focus on process, simply to satisfy a system, rather than as a useful exercise in planning. This has led to general dissatisfaction with many of our current Agile practices.
Several colleagues expressed that during retrospectives, we simply aren’t producing to capacity, and often fill in time simply to go through the exercise. It was also shared that we often have too much discussion around small tasks, and too little discussion about the important stories, lending again to a focus more on process than on agility.
Where we seem to be doing well is with our code review processes. We also seem to have achieved a level of consistency in certain technical aspects, which is also a step in the right direction.
The Culture
One of the points raised from discussions with colleagues was the lack of understanding around Agile methodology. While we say we are an Agile organization, we still iterate in a waterfall fashion, often in silos, with a heavy dependence on QA after the fact. What we need to do is encourage and educate everyone about the salient cultural and technical aspects of Agile.
In 2010, VersionOne conducted a survey which found that the primary barrier to furthering the adoption of Agile was the inability to change organizational culture. Allen Holub, an educator and popular Agile consultant, talks about the reasons why Agile usually fails in The Death of Agile. You might be noticing how the emphasis here is on the adoption of culture and principles, beyond simply following Agile processes.
What is great is that I’ve seen a few teams at Autodesk with cohesion on ideas about engineering practices. But in other teams, personal responsibility and enthusiasm to learn and implement proper techniques is lacking. For a lot of those teams, sprint commitments seem to drive the process, and in response, developers cut corners in order to ship features at the expense of quality. This has been asserted by several colleagues.
But how do you change culture? What we may need to look at are fundamental concepts regarding our current culture. For one thing, our rewards programs compel people to think at an individual level rather than at a team level. Implementing Agile with these types of conventional metrics may not get us the desired outcome. Mary Poppendieck illustrates in her article the issues that arise with a traditional reward mindset in an agile culture. It’s hard for developers to advertise accomplishments in improving quality and engineering design principles because it’s not a concept easily presentable in a visual manner. One of the things we do today that reify this is our hackathons; they are designed to focus only on end-products and not on techniques and engineering practices. These were also concerns raised by several colleagues.
What can we do today?
Some of the simple ideas put forward by my colleagues included 1) an objective 3rd party observer to streamline our scrum process, 2) a committee to standardize code quality, 3) more engineering training, 4) engineering hackathons, and 5) more direct involvement by leadership in motivating teams to improve code quality.
Karine Roy and I are collaborating with experienced colleagues on how to setup a culture of code quality and to encourage teams to adopt it.
Looking at what is happening at other companies, SAP has an Agile Software Engineering course that teaches their developers agile programming techniques. And Spotify was built around an engineering culture that has maintained their success, even with substantial growth among their engineering ranks.
Amar in his recent quarterly all-Hands articulated that Autodesk should be in the ranks of Google and Amazon in terms of quality products. It’s a great vision. But in order for that to become a reality, our engineering practices and culture need to be aligned with that goal. We need a clear approach to improving code quality and developer integration, and it is the organizational support to build such a culture that will be indispensable.

Additional resources:

Tuesday, April 26, 2016

Design by Contract

Design by Contract: A discipline of software analysis, design, implementation, testing and documentation. It also goes by the name Code Contracts, primarily because the name Design by Contract is trademarked.

The concept of Design by Contract (DbC) was first formalized by Bertrand Meyer, a French academic and the creator of the Eiffel programming language (an influence on Java and C#), which remains the only language built with DbC in mind. Microsoft Research had a language called Spec#, an extension of C# with built-in support for DbC; Code Contracts is the successor of that project. The .Net framework's Base Class Library (BCL) has used Code Contracts extensively since v4.0. Code Contracts used to be available only to Premium and Ultimate customers of Visual Studio, but it has recently been open-sourced. It comes with a static checker and a runtime rewriter; check the user manual for details.

The static analysis engine uses abstract interpretation (the automatic, compile-time determination of run-time properties of programs), not software verification (conformance to a specification).

Code Contracts vs Traditional if-then-throw block and debug asserts:


Design by Contract is a programming paradigm; it's not a tool, a library or an extension to a language. It's an abstract philosophy of how to design robust applications with the help of specifications known as contracts.

We already use some form of contracts in our everyday Object-Oriented Programming, such as Interfaces that serve as contracts for classes, delegates that serve as contracts for methods, and so on. We also write contracts in various forms for our software, such as in comments, as test suites, as documentation, as Debug.Asserts, and so on. The idea of DbC is to formalize a systematic approach so that every phase of the software life-cycle can benefit from a uniform specification. It’s a guide throughout the entire software development process.

The concepts of preconditions, post-conditions and invariants exist at all levels: requirements, design, implementation, testing, documentation. 

Software Analysis: Contracts can be used during the analysis of software components to specify requirements. (https://archive.eiffel.com/doc/manuals/technology/contract/)

Software Implementation: Contracts define precisely how to implement the components based on preconditions, post-conditions, invariants and exceptions. 

Inheritance: Contracts are inherited and enforce the Liskov Substitution Principle for inheritance, which helps to build robust software components. 

Exception handling: Contracts provide clear exception handling mechanisms. 

Documentation: Contracts provide documentation for each component, documentation that never goes out of sync with the code-base because they are embedded in the code. 

Testing/Debugging: Contracts provide a platform for tools and testing frameworks to auto-generate tests (see Microsoft Pex). It also makes implementation of test cases very easy. 


Some snapshots of what you can do with Code Contracts:

Here's a checklist of some of the advantages and disadvantages of Code Contracts vs Defensive Programming:


Following is a list of discoveries from the investigation of Code Contracts while implementing it in one of my projects.
  • If you are trying to apply Code Contracts to an existing code base, it is best to disable it at the assembly level and then incrementally enable it per method, then per class, and so on. Otherwise you will be bombarded with hundreds of warnings and squiggly lines everywhere. You can enable/disable them with the following attributes: [assembly: ContractVerification(false)], [ContractVerification(true)] 
  • Add a SQL server name in the Code Contracts project properties for caching, I usually use (localdb)\V11.0
  • Refactor long functions into smaller private methods. Makes it easy to add Contracts. 
  • Think about using the baseline option if you want to track warnings only from a certain baseline code (as in keep track of warnings after a certain point, to avoid getting overwhelmed by warnings). 
  • If your project contains contracts and is referenced by other projects, it is recommended that you select Build under the Contract Reference Assembly section in the Code Contracts properties tab. The contract reference assembly for an assembly named A will be called A.Contracts.dll and appears in the project output directory. 
  • Invariants are checked on every public method/property call. They are not checked on private method/property calls. 
  • All methods called within a contract must be pure. Methods marked with [Pure] attribute. 
  • The static checker recognizes a new contract helper AssumeInvariant that allows you to assume the invariant of an object at arbitrary points in your program. It is a work-around for some static checker limitations. 
  • If there is one warning in a method, subsequent warnings may be suppressed. So, fix or comment out the other warnings first. Not sure if it’s a limitation or a feature. 
  • Most of the LINQ methods are not annotated with the Contracts API (except for All, Any, Sum, etc). Check the source code or tooltip in VS to find out which ones have contracts. 
  • The user-supplied string in Requires, Ensures, etc will be displayed whenever the contract is violated at runtime. Currently, it must be a compile-time constant. 
  • Multiple invariant conditions in the same expression seem to throw “condition unproven” warnings sometimes. Breaking them up into separate invariant cases works. 
  • The static contract checker has limited support for the ForAll and Exists quantifiers, provided the quantified expression is simple, like x => x != null. I wasn't able to make them work statically at all. 
  • Static checking does not work for closures. (Need to verify with latest version) 
  • No contracts allowed on delegates. (Need to verify with latest version) 
  • When writing iterators using yield, the static contract checker will not be able to prove post-conditions. (Need to verify with latest version) 
  • Post-conditions (Ensures) of async methods are not checked by the static checker. 
  • Any method whose fully qualified name begins with "System.Diagnostics.Contracts.Contract", "System.String", "System.IO.Path", or "System.Type" are considered Pure, so you can use them in your contracts. 
  • In a Pure method either enable the "Infer ensures" in properties or add Ensures (post-condition) in the method. 
  • Enable "Infer invariants for readonly" in properties to make readonly properties invariants. 
  • Preconditions are checked before running the constructor initializer
  • Object invariants on interface properties are not supported. 
  • You cannot add a Requires contract in an overridden method (because it violates LSP). 
  • External input (database, file system, user input, etc) should be handled in Data Access Layer or UI layer, and all validated data should flow into Business Logic Layer as contracts.

Some guidelines to start using Code Contracts:


When adding Code Contracts to an existing code base, it may seem overwhelming to receive hundreds of warnings for only a few contract cases. So, the suggestion is to disable code contracts in all relevant assemblies, then enable them per method, one at a time, fixing the contracts of each method as you go. To fix them you may need to add some Contract.Assume calls, null checks or empty-string checks in the calling methods. Here is how to start the process... 

Project configurations:
  1. Download and install Code Contract static checker and binary rewriter in your Visual Studio: https://visualstudiogallery.msdn.microsoft.com/1ec7db13-3363-46c9-851f-1ce455f66970 
  2. You may install the Code Contracts Editor Extensions to see contracts in tooltip and intellisense, but I have been getting reproducible VS crashes on some tooltips. You may want to skip it for now. 
  3. In Project Properties -> Code Contracts, enable Perform Static Contract Checking. 
  4. In Project Properties -> Code Contracts, make sure Check in background and Show squigglies are also enabled. 
  5. In Project Properties -> Code Contracts, add an SQL server for caching: for eg, (localdb)\V11.0 
  6. In Project Properties -> Code Contracts, slide the warning level to "hi".
Adding contracts in your code:
  1. Disable contracts in all relevant assemblies using the following attribute in the AssemblyInfo.cs file: [assembly: ContractVerification (false )] 
  2. Start with a class that has little or no dependencies on other classes. 
  3. Enable contract on one method in the class chosen in Step 2 using the following attribute: [ContractVerification (true)] 
  4. Add contracts such as Contract.Requires and Contract.Ensures in the target method (chosen in Step 3). I suggest starting with simple null checks and empty string checks. 
  5. In your calling methods you may have to add some if-else null checks, array bound checks, empty string checks, or whatever else is needed to satisfy the contracts of the target method (the callee). 
  6. If at any point inside a target method a contract cannot be verified statically, use Contract.Assume. (for eg. library calls that don't support contracts) 
  7. Add contracts to the other methods/properties in the selected class. (Remember to enable ContractVerification for the method) 
  8. Repeat the above steps for other classes.


Design by Contract in other languages:


There are many 3rd party libraries that provide support for Design by Contract in languages such as C++, Java, JavaScript and others. Most of these libraries provide runtime contract validation; static validation in popular languages/frameworks is rare. The .Net framework happens to be the first popular framework with built-in support for DbC. The Wikipedia page for Design by Contract lists 3rd party libraries for other languages: https://en.wikipedia.org/wiki/Design_by_contract.


There has also been talk of incorporating contracts into the next version of C++ (2017). Here's a paper in which Bjarne Stroustrup argues for contracts in C++17: https://isocpp.org/files/papers/D4492.pdf 

User Manual: http://research.microsoft.com/en-us/projects/contracts/userdoc.pdf 
How the static checker works: http://research.microsoft.com/en-US/projects/contracts/cccheck.pdf

Unit Test Best Practices

This post provides a summary of good design techniques for writing testable code and best practices for unit tests. The information is based on my research from books, talks and blogs by industry experts.
Please refer to the references section to learn more from the experts.

Unit tests are development tests. They should test units of work in isolation. In our case we can consider a class as a unit. They should be used in the day-to-day development process. Hence, they must be fast, readable and maintainable. We will see below why each attribute is necessary and how to achieve them. Check the glossary at the bottom for acronyms.

The goal of this guideline is to help us create unit tests that will serve as internal documentation of the architecture of the application (besides testing of course), as well as creating tests that are easy to review. That's why we want to maintain a flat structure for each test. Hence, the naming conventions, readability and maintainability are of utmost importance. This will help us to follow the TDD methodology in the future.



Development guidelines for writing testable code


Before we can write unit tests, we need to make sure that our application can be isolated into units. The guidelines below are intended to help us achieve that.

Dependency Injection

It's best to design/refactor the application so that the classes are as decoupled as possible. It becomes really hard to write clean unit tests if the classes/modules are coupled. One way to achieve decoupling of classes is by Dependency Injection (DI). We can either use a DI framework for .Net or write basic DI functionalities manually (which becomes too much work when we have deep object trees). Popular DI frameworks for .Net are: Autofac, Castle, Ninject, Spring.Net, etc. For runtime dependencies (or lazy instantiation) use an injected factory/provider.

Dependency Injection is the primary and fundamental requirement for writing clean unit tests easily in an Object-Oriented language. Here's a blog post from Bob Lee, a former team lead of the Android core library at Google and the creator of Guice, on how most Java applications at Google, such as Gmail, YouTube, AdWords and Google Docs, use dependency injection.

Besides the secondary benefit of testability, dependency injection's primary benefits are making the code decoupled, modular, reusable and readable. It also pushes you toward best practices of software engineering, such as programming to interfaces rather than implementations, the Single Responsibility Principle and the Open/Closed Principle of SOLID (see below).

SOLID principles

The next thing to pay attention to are the SOLID principles of Object-Oriented software engineering. Not only are these best practices for good architecture, they also make code very testable. Here are the descriptions in brief, please check them out online if you are not familiar with them already.
  • S: Single Responsibility Principle: Each class and method should have a single responsibility. It makes unit testing simpler (obviously), makes code less error prone to cross-feature bugs introduced during bug fixes or refactoring.
  • O: Open/Closed Principle: Modules should be open for extension, but closed for modification. It makes unit tests maintainable (because existing code base is not changed often, except for bug fixes or development time design fixes).
  • L: Liskov Substitution Principle: Base types should be substitutable by derived types without breaking expected behavior. Unit testing for LSP guarantees that polymorphic code doesn't break in existing clients when new derived types are implemented (see Unit Tests guidelines).
  • I: Interface Segregation Principle: Use multiple specific interfaces instead of one big interface. It makes unit testing easier by focusing on smaller units. Also makes it easy to create stubs or mocks for unit tests.
  • D: Dependency Inversion Principle: High level modules should not depend on low level modules. Both should depend on abstractions. It makes unit tests maintainable, because changing/refactoring details in one level doesn't affect the other levels, since they both are dependent on interfaces rather than implementations. (This is NOT the same as Dependency Injection btw, the two are completely different concepts.)

Here's a list of cases where violating the Single Responsibility Principle for methods will cause problems:
  1. Long methods are not unit testable. A long method contains local state shared by multiple logical blocks in the same method, so it becomes difficult to follow and isolate test cases.
  2. Any change in any logical block of a long method has the potential to break other logical blocks in the same method. This requires re-tests for all logical paths of the whole method for any fix, and makes it susceptible to regression bugs.
  3. Long methods cannot be profiled effectively (if we need to in the future). A profiler will only tell you that a method is slow, not which logical part of the method is the cause.
  4. It's very hard to add Code Contracts to long methods. 

Global state (to avoid)

Avoid using global states, such as static variables or the Singleton Design Pattern. Singleton objects that are dependency injected are ok, but avoid the Singleton Design Pattern. Static methods are ok if they don't change or access any state, or if they are leaf nodes in the call graph. Note that static methods that access DateTime.Now or the Random class are accessing global state. Here's a short list of problems related to global states in a program:
  • Readability issues - Source code is easier to understand when the scope of objects is limited. Since global variables can be accessed from anywhere, it becomes difficult to remember or reason about every possible use.
  • Implicit coupling - Since different objects are accessing mutable global states, they are implicitly coupled via the global state.
  • Concurrency issues - Global mutable states are well-known for concurrency problems for obvious reasons.
  • Unit testing issues - Unit testing becomes difficult because you need to set up the global states for the tests, which may be hard because of global coupling with other code. Also, tests may become contaminated in-between runs. Pure static methods are unit testable, but the callers of static methods have dependencies on the classes/assemblies of those static methods and you cannot mock them out. So, you cannot test the caller in isolation. Pure static methods as leaf nodes may be ok.
  • Scalability issues - Global scope is per AppDomain, so scaling the application to multiple processes will pose a challenge. You will need to pass the globals as parameters or something, so they are not really used as globals anymore.
  • Refactoring issues - Since multiple parts of the application are coupled via globals, refactoring becomes risky as it can break anything. This is against the TDD methodology.


Unit Test Guidelines


Here are some guidelines to make unit tests fast, readable and maintainable:

Readability:
  • Test project naming convention: {ProjectName}.Tests 
  • Test class naming convention: {ClassName}Tests 
  • Test method naming convention: {MethodName}_{StateUnderTest}_{ExpectedBehavior} 
  • Identify stubs and mocks clearly for variables in a test method, eg: stubEmployee, mockEmployee
  • Follow the Arrange, Act, Assert (AAA) pattern in each test. 
  • There should be no logic in unit tests (no if-else, switch-case, loops, etc). We don't want to test the tests. We don't want to think "logically" when reading someone else's tests. We don't want to spend time "maintaining" the tests. 
  • Avoid adding comments in your tests. If they need comments then it implies that the code is not easily readable. 
  • Don't use magic strings or numbers in a test (such as "Test_Title", "CountryName", "Jane Doe", 15, 55, etc.) to fill properties that are not directly under test. Use the following defaults instead:
    • String: "aaa"
    • Char: 'a'
    • Number: 10, 100, 1000, etc
    • DateTime: default(DateTime)
    • Boolean: default(bool)
    • Enum: default(enumType)
    • Reference: null, or create a stub/mock, or pass in the real object if it has been proven to work (with tests)
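Putting the naming convention, the AAA pattern and the default values together, a minimal sketch (Calculator is a hypothetical SUT; a real test would use your framework's [Test] attribute and Assert class):

```csharp
public class Calculator
{
    public int Add(int a, int b) => a + b;
}

public class CalculatorTests
{
    // [Test]  // your test framework's attribute goes here
    public void Add_TwoPositiveNumbers_ReturnsSum()
    {
        // Arrange
        var sut = new Calculator();

        // Act
        var result = sut.Add(10, 100);   // round default numbers, no magic values

        // Assert
        System.Diagnostics.Trace.Assert(result == 110);
    }
}
```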
Isolation:
  • All unit tests should be in-memory tests. They should not leave main memory. If your test leaves main memory, delete the test, or put it among Integration Tests. 
  • Use stubs for dependencies. If a dependency is hard-coded in the SUT, create an interface (and a wrapper if necessary) to be able to stub out the dependency during test. You may use the dependency directly if it is proven to be working (with passing tests). 
  • Use at most one mock per test, mostly for testing interactions with 3rd party libraries. Too many fakes/mocks make tests tied to implementation details and unmaintainable (eg: a new version of the library may work differently). 
  • Tests should not allow access to the Database. Use a hashtable or other in-memory data structure, or a database wrapper (eg: a generic IRepository) 
  • Tests should not allow access to the File system or registry (not even config files). Use a FileWrapper interface to stub out File IO (same for Configs). 
  • Tests should not allow access to the Network. Use a NetworkWrapper interface to stub out network access. 
  • Tests must be deterministic, repeatable, able to run in parallel and in any order. Avoid using global states in SUT in order to achieve these conditions for the test. Avoid using global states in tests (eg DateTime.Now, Random, etc). 
  • Try to avoid Setup and Teardown methods (or use them lightly). They decrease readability and can cause state corruption in between tests. Use factory methods instead. 
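The FileWrapper idea above can be sketched like this (interface and class names hypothetical): the SUT depends on a small interface, the production implementation wraps File IO, and the test supplies an in-memory stub.

```csharp
using System.IO;

// Hypothetical wrapper so tests never touch the real file system.
public interface IFileReader
{
    string ReadAllText(string path);
}

public class RealFileReader : IFileReader
{
    public string ReadAllText(string path) => File.ReadAllText(path);
}

public class ConfigLoader
{
    private readonly IFileReader reader;
    public ConfigLoader(IFileReader reader) { this.reader = reader; }

    public bool IsFeatureEnabled() =>
        reader.ReadAllText("app.config").Contains("feature=on");
}

// In-memory stub for tests: a pre-programmed response, no disk access.
public class StubFileReader : IFileReader
{
    public string ReadAllText(string path) => "feature=on";
}
```

The test stays entirely in main memory, while production code injects RealFileReader.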
Asserts:
  • Test only one thing per test. Try to use only one assert. Multiple asserts on a single object are fine if the asserts are related, but don't do multiple asserts on different independent objects/properties in the same test. 
  • Verify only a single call to a mock object. 
  • Don't use variables in your asserts. Use fixed values, numbers or strings. Variables can contain logical errors or may duplicate production logic, and they make tests harder to read. 
General:
  • Test only the public methods. Testing private methods makes them tied to implementation details and makes it hard to refactor the application without breaking tests. There may be exceptional cases though. 
  • It is recommended to create separate helper factory methods with different names instead of returning different objects conditionally from the same factory. This is to avoid logical bugs in the factory code and for ease of readability of the test method. 
  • If a stub is created in a factory, don't hard-code the stub's properties in the factory, set them in the test method instead. 
  • All base class unit tests should pass with derived classes. Inheritance tests should pass LSP (base class exceptions, pre-conditions, post-conditions, invariants, history, etc.). Please read up on LSP to understand the details. 
  • Never [Ignore] a test method. Let it fail or fix it. 
  • If a test that is written following these guidelines doesn't work as expected, keep the test and delete the code. Now run the test, it will fail (obviously), write the minimum code needed to make it pass, refactor and repeat. You are on your way to TDD.



What are the benefits of TDD


TDD stands for Test Driven Development. The general practice is to write a minimal failing unit test before writing any line of production code. The differences between traditional unit tests and TDD unit tests, and the benefits of the latter, are not always clear. Here is a bullet-point list of some concrete advantages that TDD provides over the traditional way of writing unit tests (after writing the code):

  • TDD guarantees that tests are not passing by accident, because tests start by failing and pass only after implementation. Traditional unit tests require you to manually break working code to guarantee that the tests are working as expected. This is especially useful when testing async calls, because most async tests will pass automatically if not written properly. 
  • TDD enforces the practice of decoupled code, because other implementations are not yet available when the tests are written, so all tests need to use mocks/stubs by default. This makes for better code design. Traditional unit tests tend to become integration tests, because it is tempting to use real implementations rather than mocks. 
  • Writing tests after the code is like writing tests for code that already works, because the code was already confirmed by manual testing during implementation. So, traditional unit tests seem like extra tests and sometimes it is tempting to bypass hard to test cases (because they were manually tested during coding). TDD enforces the automated testing practice and reduces reliance on manual testing during implementation. 
  • TDD makes you write less code during implementation, because tests are spec-oriented rather than implementation-oriented. Testing the spec gives more confidence in the application than testing internal implementation, leads to cleaner interfaces, guarantees that tests cover all relevant behaviour of the application, and streamlines the codebase. The traditional method can bloat the codebase with code that may only be needed in the future. 
  • Traditionally, new functionality is bolted onto the application for fear of breaking it. TDD gives the confidence to truly integrate new functionality into an existing application without that fear. The confidence comes from the fact that the tests are spec-oriented, not implementation-oriented, so they guarantee that the spec is not broken. 
  • Constant refactoring makes sure the code never becomes outdated and unchangeable.



Glossary:

SUT: System Under Test. A class, a method or a module. Anything that's under test.
DI: Dependency Injection
TDD: Test Driven Development
AAA: Arrange, Act, Assert. A unit test writing pattern for separating a test method into three groups. Arrange the variables and dependencies needed, call the method to test (Act), Assert on the expected state/behavior.
LSP: Liskov Substitution Principle (named after the computer scientist Barbara Liskov)
Stub: A fake object created for state verification of the SUT. They are pre-programmed with hard-coded responses to calls. You don't assert against them. You assert against the SUT (or other collaborators).
Mock: A fake object created for behavior verification of the SUT. They are pre-programmed with "expectations" of the calls they are expected to receive from the SUT. You assert against them to verify that the right calls were made by the SUT.
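The difference can be sketched with hand-rolled fakes (all names hypothetical): the stub only feeds canned data into the SUT and is never asserted against; the mock records the calls it receives and is asserted against.

```csharp
public interface IMailer { void Send(string to, string body); }
public interface ICustomerDirectory { string GetEmail(string customer); }

// Stub: pre-programmed response, used only to drive the SUT.
public class StubDirectory : ICustomerDirectory
{
    public string GetEmail(string customer) => "aaa@example.com";
}

// Mock: records calls so the test can verify the SUT's behavior against it.
public class MockMailer : IMailer
{
    public int SendCount;
    public void Send(string to, string body) => SendCount++;
}

public class OrderService
{
    private readonly ICustomerDirectory directory;
    private readonly IMailer mailer;

    public OrderService(ICustomerDirectory directory, IMailer mailer)
    {
        this.directory = directory;
        this.mailer = mailer;
    }

    public void PlaceOrder(string customer) =>
        mailer.Send(directory.GetEmail(customer), "Order placed");
}
```

A test would assert mockMailer.SendCount == 1 after PlaceOrder (behavior verification against the mock), while the stub's canned email address is never asserted on.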



References:

Robert Martin (Uncle Bob): Co-author of the Agile Manifesto. Check out his blogs and videos.
Martin Fowler: Co-author of the Agile Manifesto, speaker, writer, check out his blogs on Dependency Injection and IoC containers.
Misko Hevery: Agile coach at Google. Check out his Clean Code Talks series on YouTube, and his blogs on how to write testable code.
Roy Osherove: Agile coach and writer. Check out his book "The Art of Unit Testing" or his YouTube videos here and here on unit tests.
Jon Skeet: Senior Software Engineer at Google, author, all-time highest ranked StackOverflow contributor, check out this demo on IoC container from scratch.
Mark Seemann: Writer, developer, check out his articles on Dependency Injection here, here and here.
James Coplien: Writer, lecturer, researcher, C++ guru, Agile consultant. Check out his piece on Why Most Unit Testing is Waste.
Bob Lee: Former team lead of Android core library at Google, talks about Guice, the dependency injection framework that Google uses in all Java applications.

Sunday, March 27, 2016

API Best Practices and Breaking Changes

This post is a collection of deceptive breaking changes and best practices to follow when designing an API. It's based on my research from reputable sources on the web; please refer to the references section for the sources this information was collected from.

There are three types of compatibility issues when dealing with API changes. A new version of an assembly can break existing client assemblies in the following ways:

Binary compatibility break: An existing client binary cannot load the new version of the assembly; the client needs to recompile their code.
Source compatibility break: Existing client binaries work with the new version of the assembly, but the client's source code no longer compiles against it.
Semantic compatibility break: Client binary and source are both compatible with the new version of the assembly, but the runtime behavior of the application changes.

In this post I'll focus only on some fundamental object-oriented best practices and non-obvious (deceptive) breaking changes when developing a public interface in C#. This list of breaking changes will remain incomplete on purpose. I am not going to talk about obvious breaking changes here (such as, changing a method signature), but only talk about changes that arise from not writing code with best design principles from the start, and how changing them later can cause non-obvious breaks in client code.


Adding a return type to a public void method

Break type: Binary, Source
API before change (assemblyA)
public static class Foo
{
    public static void Bar(int i){ }
}
API after change
public static class Foo
{
    public static bool Bar(int i){ return true; }
}
Client code (assemblyB)
Foo.Bar(13);                  // binary break (will not load assemblyA)
Action<int> a = Foo.Bar;      // source break (will not recompile)
This kind of change is a violation of the Open/Closed Principle, because we are trying to add new functionality by modifying an existing class/function. The mistaken assumption here is that, since the return type is not part of the method signature in C# and Java, and since the void method was never assigned to anything previously, adding a return type should not break existing code. But it does break. The return type is not used by C# in determining which method to call, but C# needs to know the return type to generate correct bytecode in the caller of the method. This is a binary break in Java as well.

NOTE: Even though in C# the return type is not part of the method signature, in the CLR the return type is part of the signature and hence the above change makes them two completely different methods at the CLR level (hence the binary break).


Changing a field to a Property 

Break type: Binary, Source
API before change (assemblyA)
public class Foo {
    public int Bar;
}
API after change
public class Foo {
    public int Bar { get; set; }
}
Client code (assemblyB)
SomeMethod(objFoo.Bar);         // binary break (will not load assemblyA)
AnotherMethod(ref objFoo.Bar);  // source break (will not recompile)
Putting aside the fact that we should not expose public fields in the first place, the reason this causes a binary break is that properties in C# are compiled into get/set methods, so the Bar field no longer exists in the new assemblyA. The pass by ref causes a source break because C# properties cannot be passed by ref; only fields can.
The reason it's important to know this is because many developers think that they can simply refactor a public field into a property when needed, without breaking client code, which is not true.

Public fields (and protected fields) violate two of the most important Object-Oriented design principles: encapsulation and low-coupling. MSDN:CA1051

Public fields also violate the Open/Closed Principle. When the member variables of a class are public and if they change, every external function that depends upon those variables must be changed. So, none of those functions can be closed with respect to those variables.


Changing the value of a private const 

Break type: Semantic
API before change (assemblyA)
public class Foo {
    private const string strA = "The meaning of life";
    public void MyMethodA(string a, string b = strA){ }
}
API after change
public class Foo {
    private const string strA = "The meaning of life has changed";
    public void MyMethodA(string a, string b = strA){ }
}
Client code (assemblyB)
public class Bar {
    public void MyMethodB() {
        var objA = new Foo();
        objA.MyMethodA("hello");  // semantic break (needs recompilation to see the new value)
    }
}
Pretty deceptive, huh? :) We know that a public const in C# is emitted as a literal in client code, so the client needs recompilation to see a new value, but who would have thought that a private const could cause trouble as well? After all, client code has no access to the private const. The reason it breaks is that the C# compiler embeds the default values of optional parameters into every call site in the client. So, if a const used as a default parameter value is changed in assemblyA, then assemblyB needs recompilation to update all its call sites; otherwise assemblyB keeps using the old value.

The moral of the story is, keep consts for real constants, like the speed of light in a vacuum. Constants that need changing are not constants.


Property setter: changing private to public

Break type: Source
API before change (assemblyA)
public class Foo {
    public int P { get; private set; }
}
public class Bar {
    public int P { get; set; }
}
API after change
public class Foo {
    public int P { get; set; }
}
public class Bar {
    public int P { get; set; }
}
Client code (assemblyB)
static void ClientMethod(Action<Foo> f) {}
static void ClientMethod(Action<Bar> f) {}
static void Baz() {
    ClientMethod(x => {x.P = 42;});  // source break (overload resolution error)
}
On first thought it may seem like changing a property setter from private to public can never be a breaking change because client code never used the private setter. We are only adding new functionality. However, it can break, as illustrated in the Client code example.

Client code has a method Baz which does overload resolution on ClientMethod. When Foo.P has a private setter in assemblyA, overload resolution in client chooses the second overload and compilation succeeds. But when Foo.P's setter is made public client gets an overload resolution error because both overloads of ClientMethod are equally valid now and neither is better than the other.

This kind of change is a violation of the Open/Closed Principle, because we are trying to modify existing functionality in a class.



Adding a public method

Break type: Source, Semantic
API before change (assemblyA)
public class Foo {
}
API after change
public class Foo {
    public void Baz() { }
}
Client code (assemblyB)
public static class FooExtensions {
    public static void Baz(this Foo foo) { }
}
public class Bar {
    public void Baz() {}
}
public class Program {
    static void ClientMethod(Action<Foo> a){}
    static void ClientMethod(Action<Bar> a){}
    static void Main() {
        new Foo().Baz();            // semantic break
        ClientMethod(x => x.Baz()); // source break (overload resolution error)
    }
}
Semantic break: The C# overload resolution algorithm chooses the “closest” method match, and in this case it will be the instance method, not the extension method. Hence all client code that was using the extension method will silently start using the instance method.
Source break: The problem here is caused by lambda type inference in C# in the presence of overload resolution.

This kind of change is a violation of the Open/Closed Principle, because we are trying to add new functionality in an existing class. Depending on what the method does, it may be a violation of the Single Responsibility Principle as well.


Properties should not return arrays (MSDN:CA1819)

  • Arrays are not easy to make tamper-proof. The caller can replace the contents of an array with anything they want. In order to keep the internal data of your class safe you need to pass a copy of the array every time, which is expensive. 
  • Array covariance is not type-safe in C#. Here's a blog by Eric Lippert describing the problems with Array Covariance. It means that every time you put an object into an array you have to do a run-time check to ensure that the type is correct (see eg below). 
  • Arrays don't have an extensibility model and are not thread-safe. Better alternatives are to return a ReadOnlyCollection<T>, IEnumerable<T> or IList<T>. 

Here's another blog from Eric Lippert on the harmful side effects of arrays: Arrays considered somewhat harmful. Eric Lippert was the principal engineer in the C# compiler team at Microsoft for many years, and is one of the most respected .Net contributors in StackOverflow.

MSDN Framework Guidelines: DO prefer collections over arrays.

Break type: Semantic
Array covariance is not type-safe
object[] objects = new[] { "foo", "bar", "baz" };
objects[0] = 5;     // runtime exception (ArrayTypeMismatchException)
Arrays also cause runtime exceptions via IList covariance
List<Dog> dogsList = new List<Dog>();
Dog[] dogsArray = new Dog[5];
IList<Mammal> mammalsList = dogsList;  // compile error (List<T> is invariant)
IList<Mammal> mammals = dogsArray;     // no compile error (array covariance)
mammals[0] = new Cat();                // runtime exception (Cat : Mammal assumed)
object[] objArray = new string[5];
objArray[0] = 57;                      // runtime exception

It's better to avoid arrays altogether (except perhaps in high-performance scenarios). They are not type-safe when it comes to covariance and have no extensibility model. List<T> is slightly better in this respect because it is invariant and gives you a compile error. Most read-only interfaces are covariant, such as IEnumerable<T>, IQueryable<T>, IReadOnlyCollection<T> and IReadOnlyList<T>; they are easily identifiable in the docs as <out T>.
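For contrast, a read-only covariant interface like IEnumerable&lt;out T&gt; is safe precisely because no writes are possible through it; a minimal sketch (Mammal/Dog are hypothetical types):

```csharp
using System.Collections.Generic;
using System.Linq;

public class Mammal { }
public class Dog : Mammal { }

public static class CovarianceDemo
{
    public static int CountMammals()
    {
        var dogs = new List<Dog> { new Dog(), new Dog() };

        // IEnumerable<out T> is covariant: a sequence of Dogs can be read
        // as a sequence of Mammals. Unlike the array case, no write path
        // exists, so no runtime type error can occur.
        IEnumerable<Mammal> mammals = dogs;
        return mammals.Count();
    }
}
```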


Do not expose generic lists (MSDN:CA1002)

Break type: Binary, Source
API before change (assemblyA)
public interface IFoo {
    List<IBar> GetList();
}
public class Foo: IFoo {
    public List<IBar> GetList() {
        return new List<IBar>();
    }
}
API after change
public interface IFoo {
    Collection<IBar> GetList();
}
public class Foo: IFoo {
    public Collection<IBar> GetList() {
        return new Collection<IBar>();
    }
}
Client code (assemblyB)
List<IBar> lst = new Foo().GetList();   // binary break (the return type is now Collection<IBar>)
lst.Add(item);

In order to avoid this type of breaking change, it is advisable to always return an interface or a higher-level abstraction from a public interface. List<T> was designed for performance, not for extension, so if we want to extend the behaviour of the returned list, we can't with a List<T>. For example, we may want a null check in Add(), or events raised during Insert/Remove. Changing the List<T> to a Collection<T> allows us to extend the behaviour of the returned list by overriding the Collection's Add/Remove/Insert methods. In the overridden methods we can add event hooks, null checks, etc.

Collection<T> has protected virtual methods that we can override to extend its behaviour for Add, Remove, Insert, etc. Other good alternatives are ICollection<T>, IList<T>, IReadOnlyCollection<T>, IEnumerable<T>, etc.

List<T> has no virtual methods. You can return an IList<T>, however, if you really want to back it with a List<T>.
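A sketch of the extension point (the null check is just one assumed example of extended behavior):

```csharp
using System;
using System.Collections.ObjectModel;

// Collection<T>.InsertItem is protected virtual, so the behavior of the
// returned collection can be extended -- something List<T> cannot offer.
public class GuardedCollection<T> : Collection<T> where T : class
{
    protected override void InsertItem(int index, T item)
    {
        if (item == null)
            throw new ArgumentNullException(nameof(item));
        // Event hooks could be raised here as well.
        base.InsertItem(index, item);
    }
}
```

Add(null) on a GuardedCollection now throws, which could not be intercepted on a returned List&lt;T&gt;.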


Avoid using default parameters 

MSDN:CA1026: Default parameters should not be used

Break type: Binary, Source
API before change (assemblyA)
public void Foo(int a) { }
API after change
public void Foo(int a, string b = "bar") { }  // binary break
    OR
public void Foo(int a) { }
public void Foo(int a, string b = "bar") { }  // source break
Client code (assemblyB)
Foo(5);     // binary break; MissingMethodException
    OR
Foo(5)      // source break; ambiguous method call

For the first case, the client code needs to be recompiled. The recompilation will emit Foo(5, "bar") at all call sites in the bytecode, because C# embeds the default values of optional parameters directly into the call sites. Adding a parameter also changes the method signature at the CLR level; default parameters are a C# language feature that the CLR does not treat specially. This also explains why default values have to be compile-time constants in C#. Default parameters are not equivalent to adding method overloads; they are like adding new parameters to the method definition.

In the second case, you have to make sure that no ambiguous calls are created by adding another method with default parameters. (Note that overload resolution prefers a candidate that needs no default arguments substituted, so an exact match like Foo(int) wins over the defaulted overload; ambiguity typically arises between multiple overloads that all require default substitution.)
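If optional behavior is really needed, a plain overload that forwards to the full method avoids both breaks, because the default value then lives in exactly one place inside the API assembly rather than being burned into every client call site (the methods here return a string only so the sketch is observable):

```csharp
public static class Api
{
    // Existing clients keep binding to this exact method.
    public static string Foo(int a) => Foo(a, "bar");  // default supplied here, once

    // New functionality added as an overload, not as a default parameter.
    public static string Foo(int a, string b) => a + ":" + b;
}
```

Changing "bar" later requires recompiling only the API assembly; clients pick up the new default without recompilation.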


Avoid using virtual methods

MSDN Framework Guidelines: DO NOT make members virtual unless you have a good reason to do so and you are aware of all the costs related to designing, testing, and maintaining virtual members.

Break type: Semantic
API (assemblyA)
class Base {
   private int state = 0;
   public virtual void Foo(){
      Bar();
      if(state == 17){...}
   }
   public virtual void Bar(){
      state++;
   }
}
Client code (assemblyB)
class Derived : Base {
   public override void Bar(){}
}

Method Base.Bar() changes the base class's state, and method Base.Foo() depends on that state change. But once Derived overrides Base.Bar() as written above, Base.Foo() is broken, because the overridden Bar() no longer changes the Base state. A fix is to call base.Bar() from Derived.Bar(), so that the state change still happens, but that kind of fix couples Derived to the base class's implementation details and breaks its encapsulation.

The general guideline is to try to avoid inheritance and use composition/delegation instead, in combination with interfaces.

This Derived class violates the Liskov Substitution Principle: Derived is not substitutable at call sites that accept the Base type as a parameter, because the behavior of Base.Foo has changed.
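The composition/delegation alternative can be sketched like this (names hypothetical): the variable behavior is injected through an interface, so no derived class can silently break the host's internal state machine.

```csharp
public interface IBarBehavior { void Bar(); }

public class DefaultBar : IBarBehavior
{
    public void Bar() { /* default behavior */ }
}

public class Host
{
    private readonly IBarBehavior barBehavior;
    private int state;

    public Host(IBarBehavior barBehavior) { this.barBehavior = barBehavior; }

    public void Foo()
    {
        barBehavior.Bar();  // delegation instead of a virtual call
        state++;            // the state transition stays inside Host
    }

    public int State => state;
}
```

Swapping in another IBarBehavior changes only the effect of Bar(); Host's own state handling cannot be overridden away.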


Fragile Base Class inheritance

Break type: Source, Semantic
API before change (assemblyA)
public class Base {
   private int counter = 0;
   public void Foo() {
     counter++;
   }
   public virtual void Bar() {
     counter++;
   }
}
API after change
public class Base {
   private int counter = 0;
   public void Foo() {
      Bar();        // this is a valid refactoring
   }
   public virtual void Bar() {
      counter++;
   }
}
Client code (assemblyB)
public class Derived: Base {
   public override void Bar() {
      Foo();   // semantic break; goes into infinite recursion
  }
}

Here's a list of problems related to class inheritance in general:

  • Class inheritance breaks encapsulation. It makes the derived classes tightly coupled with the implementation details of the base class. Any change in the base class may break derived classes and may violate Liskov Substitution Principle. 
  • There can be Multiple Inheritance issues. These are easily solved using composition instead of class inheritance. Most game engines use the Entity-Component System pattern (ECS) to avoid these kind of hierarchical problems. 
  • Fragile Base Class problems (as exemplified above). 
  • Implementation cannot be changed at runtime. Composition with Interfaces solve this problem. 
  • Difficult to add new classes in the hierarchy without breaking existing clients. For eg, to fix Multiple Inheritance issues, you may need to restructure the tree. 
  • Can get very difficult to understand the control flow (function calls) once the hierarchy is 2 or more levels deep.



Base in the Middle

Break type: Semantic
API before change (assemblyA)
public class Foo {
    public virtual void M() {
        // some default stuff
    }
}
public class Bar : Foo {
    public void SomeOtherMethod() { }
}
API after change
public class Foo {
    public virtual void M() {
        // some default stuff
   }
}
public class Bar : Foo {
    public void SomeOtherMethod() { }
    public override void M() {
        // some new behavior
        base.M();
    }
}
Client code (assemblyB)
public class Client : Bar {
    public override void M() {
        base.M();   // semantic break; will still call Foo.M()
    }
}
 
new Client().M();

This C# behavior is seriously counter-intuitive. Most developers expect base.M() to be resolved at runtime (polymorphism) and, with the new API, to call Bar.M() and then Foo.M(). However, the base.M() call is statically bound at compile time to Foo.M(), so simply replacing the API assembly doesn't change the client's runtime behavior, contrary to expectation. The client needs a recompilation to get the new behavior. (Java, by contrast, binds the super call at runtime and meets the expectation.)


Unsealing a sealed class

Break type: Semantic
API before change (assemblyA)
public interface IBaz { }
public sealed class Foo { }
public class Bar { }
 
//public class Derived : Bar, IBaz { }  // not needed, just to show
API after change
public interface IBaz { }
public class Foo { }                    // unsealed
public class Bar { }
 
//public class Derived : Bar, IBaz { }  // not needed, just to show
Client code (assemblyB)
static void M(Func<Foo, IBaz> f) { }
static void M(Func<Bar, IBaz> f) { }
 
M(x => x as IBaz);       // source break; overload resolution error

Before the API change, in the client's "x as IBaz" lambda, x cannot be Foo or a subclass of Foo, because Foo doesn't implement IBaz and is sealed. So x is resolved to Bar at compile time (even though Bar doesn't implement IBaz, some subclass potentially could, e.g. Derived; this isn't required at compile time).

After the API change, x can be resolved to both Foo and Bar (whether or not anything actually derives from them), because both can now have subclasses that implement IBaz. So an overload resolution error occurs.




References:
Microsoft Framework Design Guidelines 
A definitive guide to API breaking changes 
Jon Skeet's blogs (Jon Skeet: senior engineer at Google, highest ranked stackoverflow contributor)
Agile Architecture (Allen Holub: consultant, educator, writer)
Fragile Base Class  (Allen Holub: consultant, educator, writer)
How To Design A Good API (Joshua Bloch: Java platform architect at Google)
Array Covariance in C# (Eric Lippert: former principal engineer of the C# compiler team at Microsoft)
Open/Closed Principle (Uncle Bob: consultant, Agile speaker, software architect)
Virtual Methods and Brittle Base Classes (Eric Lippert: former principal engineer of the C# compiler team at Microsoft)
James Gosling on Java (James Gosling: creator of Java)
Google's Go programming language design decisions