Thursday, July 11, 2013

Liskov Substitution Principle vs. immutability

Today I had a very interesting discussion about Liskov Substitution Principle (LSP) - what it really means and how to avoid breaking it, of course using an overused example of squares and rectangles. Liskov's principle tells us to design our inheritance trees or interface implementations in such a way that whenever the code uses the base class (or interface), any of the derived classes (implementations) can be used without breaking existing behavior or invariants. In wider understanding, it also means that whenever we use an abstraction like base class or interface, we should not expect any particular implementation provided in the runtime nor should we know anything about that implementations. This also forbids any constructs like conditions based on implementation's concrete type - if we need it, we're working with leaky abstraction that need to be rethought and not hacked around.
void MethodThatGetsAbstraction(IAbstraction abstraction)
{
    if (abstraction is SomethingReal)
        YouReDoingItWrong();
}
Let me remind the classical example of shapes I've mentioned before. As we know, squares are rectangles, and we often represent "is a" relationships like this in code using an inheritance - Square derives from Rectangle (I don't want to discuss here whether it makes sense or not, but the fact is this IS the most often quoted example when talking about LSP). A rectangle, mathematically speaking, is defined by its side sizes. We can represent these as properties in our Rectangle class. We also have a method to calculate an area of our rectangle:
public class Rectangle
{
    public int Height { get; set; }
    public int Width { get; set; }

    public int CalculateArea()
    {
        return Height * Width;
    }
}
Now enter the Square. It is a special case of rectangle that has both sides of equal lengths. We implement it by setting the Height to equal Width whenever we set Width and opposite. Now consider the following test:
class When_calculating_rectangle_area
{
    [Test]
    public void It_should_calculate_area_properly()
    {
        // arrange
        var sut = new Rectangle();
        sut.Width = 2;
        sut.Height = 10;
        
        // act
        var result = sut.CalculateArea();

        // assert
        Assert.Equal(result, 20);
}
It works perfectly fine until, according to the ability given us by the Liskov's principle, we change Rectangle into its derived class, Square (and leave the rest unchanged, again according to the LSP). Note that in real-life scenarios we rarely create the Rectangle a line above, we got it created somewhere and we probably don't even know its specific runtime type.
class When_calculating_square_area
{
    [Test]
    public void It_should_calculate_area_properly()
    {
        // arrange
        var sut = new Square();
        sut.Width = 2; // also sets Height behind the scenes
        sut.Height = 10; // also sets Width behind the scenes, overriding the previous value
        
        // act
        var result = sut.CalculateArea();

        // assert
        Assert.Equal(result, 20); // naah, we have 100 here
}
Square inheriting from Rectangle breaks the LSP badly by strenghtening the base class' preconditions in a subtype - base class doesn't need an equal sides, it's the implementation-specific whim we don't even know about if we're only providing the base class to be derived in the wild. But what is the root cause of that failure? It's the fact that Square is not able to enforce its invariants (both sides equal) properly. It couldn't yell by throwing an exception whenever one tries to set the height differently than width, because one may set width earlier than height - and we definitely don't want to enforce particular property setting order, right?. Setting width and height atomically would solve that problem - consider the following code:
public class Rectangle
{
    private int _height;
    private int _width;

    public virtual int SetDimensions(int height, int weight)
    {
        _height = height;
        _width = width;
    }

    public int CalculateArea()
    {
        return _height * _width;
    }
}
public class Square : Rectangle { public override int SetDimensions(int height, int weight) { if (height != width) throw new InvalidOperationException("That's a weird square, sir!"); _height = _width = height; } } But now we've replaced one LSP abuse with another. How the poor users of Rectangle can be prepared for SetDimensions throwing an exception? Again, it's an implementation-specific weirdness, we don't want our abstraction to know about it, as it will become a pretty leaky one. But is width and height really a state? I'd rather say they are shape's identity - if they change, we are talking about another shape than before. So why expose setting possibility at all? This leads us to the simple but extremely powerful concept of immutability - a concept in which the object's data once set cannot be changed and are read-only. The only way to populate an immutable object with data is to do it when creating the object - in its constructor. Note that constructors are not inherited and when new-ing a Rectangle we are 100% sure it's not Square or any other unknown beast, and conversely, when constructing Square we can enforce our invariants without an influence on base class behaviours - as all properly constructed Squares will be valid Rectangles.
public class Rectangle
{
    private int _height;
    private int _width;

    public Rectangle(int height, int weight)
    {
        _height = height;
        _width = width;
    }

    public int CalculateArea()
    {
        return _height * _width;
    }
}

public class Square : Rectangle
{
    public Square(int size)
        : base(size, size)
    {
    }
}
LSP satisfied, code clean, nasty bugs and leaky abstractions removed - immutability is king!

Thursday, July 4, 2013

Code Smells: Rigidity

I was recently preparing a small talk about several code smells, as discussed by Robert C. Martin in his famous 'Clean Code' book. I've decided to share my notes - it's nothing revolutionary, a bit of copy-paste from well-known resources and a bit of my personal comments and understandings.

The one for today is rigidity. Rigidity literally means "the inability to change or be changed to fit changed circumstances". And that nicely fits what the smell is about: "The software is difficult to change. A small change causes a cascade of subsequent changes". We all know it - some tasks that sounded easy at the beginning became a few weeks of bug hunting or turned out to be too difficult to finish without destroying everything around.

When it happens?

  • When the code is written in procedural style - a.k.a. "everything in one place" - with a lot of very narrow and specific, possibly unrelated conditions and special cases handling included directly in the "main" method,
  • When there is a lack of abstractions - meaning that the code operates on the low level of technical details instead of focusing on more real-life concepts, like bit flags fiddling to check for object's characteristics instead of having it available as a method or a property,
  • When the code implements some general concepts, but defines it with details specific to particular use case - like a code to render an HTML table from the general matrix that also decides the table header has dark background,
  • When a single responsibility is spread between many classes or code layers - like checking user's permissions on every layer from the data access layer up to the UI layer,
  • When components need to know a lot of details about each other to communicate properly (leaky abstractions).

How to avoid it?

  • Thinking in general terms, wrapping the implementation details into abstractions that are understood on less technical levels,
  • Creating code with satisfying the requirements first in mind, not what API's we're using or the constraints we have around,
  • Using object-oriented techniques and SOLID - with its small classes, single responsibilities and open design,
  • Encapsulating the code in the logical pieces, defining a clear boundaries and interfaces between them - if you can't tell what is the border of responsibilities, it's probably done wrong,
  • Depending upon abstraction, not implementation.