contrarian notes on software engineering, Open Source hacking, cryptocurrencies etc.

What OOP gets wrong about interfaces and polymorphism

I often receive feedback on my general OOP critique from people somewhat sympathetic to my message, suggesting that since OOP is vague and not precisely defined, it would be more productive to talk about its core tenets/features separately and drop the “OOP” name altogether.

I've also received an email (hi Martin!) asking, among other things, for my opinion on the use of interfaces in OOP. So instead of writing a private response, I'm just going to dwell on it a bit more in a blog post.

BTW. I'm continuing to read #oop #books to gather more insight and arguments. Currently, I am going through Object-Oriented Software Construction by Bertrand Meyer. The book is huge and presents the case for OOP in-depth, which is perfectly fulfilling my needs. And on top of it – it's old, so it gives me a lot of insight on “what were they thinking?!” ;). Hopefully, I'll get to a post about it in the not-too-distant future, but I will be referring to it already in this post.

Anyway... about the polymorphism and stuff...

Let's clarify: when talking about polymorphism I will usually mean dynamic polymorphism, Java/C++ style, or similar methods to achieve the same thing in other languages and programming paradigms.

Anytime polymorphism is used, it is facilitated through an implicit or explicit interface. Since the whole point of polymorphism is for the caller not to have to know the implementation of the callee, each implementation is responsible for conforming to the same interface.

Polymorphism is not exclusive to OOP. One can do polymorphism in machine code (by calling a function indirectly through an address in a register), C (by calling a function through a function pointer), FP (functions are first-class), and non-OOP languages (dyn Trait in Rust, interface in Go). I'm not aware of any modern mainstream programming language that does not support polymorphism.
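As a rough illustration of the non-OOP options listed above, here is a minimal Rust sketch showing a plain function pointer, a first-class closure, and a dyn Trait object. All names (`apply`, `apply_closure`, `Greeter`, `English`, `Spanish`, `greet_all`) are made up for this example.

```rust
// 1. Plain function pointer: the same mechanism C offers.
fn apply(f: fn(i32) -> i32, x: i32) -> i32 {
    f(x)
}

fn double(x: i32) -> i32 {
    x * 2
}

// 2. First-class function (closure), the FP flavour.
fn apply_closure(f: &dyn Fn(i32) -> i32, x: i32) -> i32 {
    f(x)
}

// 3. Trait object: dynamic dispatch through a vtable.
trait Greeter {
    fn greet(&self) -> String;
}

struct English;
struct Spanish;

impl Greeter for English {
    fn greet(&self) -> String {
        "Hello".to_string()
    }
}

impl Greeter for Spanish {
    fn greet(&self) -> String {
        "Hola".to_string()
    }
}

// The caller only knows the `Greeter` interface, not the concrete types.
fn greet_all(greeters: &[Box<dyn Greeter>]) -> Vec<String> {
    greeters.iter().map(|g| g.greet()).collect()
}
```

In all three cases the caller picks the behaviour at runtime without knowing the implementation, e.g. `apply(double, 21)` or `greet_all(&[Box::new(English), Box::new(Spanish)])`. No classes or inheritance are involved.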

So to get it out of the way: polymorphism is great. Super useful. More than that – Polymorphism is absolutely necessary to write pragmatic software. And OOP embraces polymorphism, so what is my problem? Well... The problem with polymorphism in OOP is... OOPers smoke too much polymorphism. They don't know when to stop.

Oh, “You're ranting again, Dawid. You're not fair.” some of you will say. Really? I can open any book on OOP I've gone through so far, and without looking too long I can find signs of polymorphism addiction.

Let's take Java OOP Done Right:

interface Shape {
  void draw();
}

Simple. Easy. Done right? Nope.

The author contrasts it with the alternative:

public void draw() {
  // DO NOT DO THIS!!!

  switch (shapeType) {
    case "CIRCLE":
      System.out.println("Look how round I am");
      break;

    case "SQUARE":
      System.out.println("Four sides are the same for me");
      break;

    default:
      throw new IllegalArgumentException("Unknown Shape type " + shapeType);
  }
}

and writes:

Having no if or switch statement is a huge win. (...) You can see how code like that grows and grows as new shapes get added. The code itself is complex, as all conditionals are. They have extra possible execution paths through the code. Each one needs testing. Each one increases the chance of coding mistakes.

This switch statement prejudice is a constant of OOP, as far as I can tell. Anytime polymorphism is explained, it's contrasted with a “scary” switch statement.

Let's give the “bad” version a little makeover. If you're not stuck with a language from the 90s, which copies the control structures of languages from the 60s, and instead use anything modern, you could just write something like:

enum ShapeType {
    Circle,
    Square,
}

fn draw(shape_type: ShapeType) {
    match shape_type {
        ShapeType::Circle => println!("Look how round I am"),
        ShapeType::Square => println!("Four sides are the same for me"),
    }
}

Shorter, cleaner, and most importantly, it eliminates the string used as a tag, and thus the possibility of an invalid value. Compilation will fail if a new shape is added and not handled somewhere (as long as no match hides it behind a catch-all arm).

But the main thing I'd like to point out is the near-complete functional equivalence between polymorphic and switch-based versions. What the OOP-blessed version is doing in comparison is taking every case from the switch statement and adding a lot of boilerplate around it. Both versions “grow and grow as new shapes get added” – just one with much more boilerplate and ceremony (new class for each new shape) in addition to the necessary code. Both have the same number of “possible execution paths through the code”. In each of them, “each one (shape) needs testing”.

The polymorphism-based version is typically slower at runtime (each call is an indirect one that the compiler usually cannot inline) and requires way more boilerplate. Reusability is the same. None of the reasons given by the author to prefer the polymorphic one hold any water.

The real fundamental difference between them is that enumeration is a closed set, and the polymorphic approach supports an open set. And neither of these is universally better.
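To make the closed-set versus open-set distinction concrete, here is a small Rust sketch of both variants side by side. It reuses the shape messages from the example above; the names `describe`, `describe_dyn` and the `Shape` trait are assumptions made for illustration.

```rust
// Closed set: every variant is known right here. Adding a variant forces
// every `match` without a wildcard arm to be updated.
enum ShapeType {
    Circle,
    Square,
}

fn describe(shape: &ShapeType) -> &'static str {
    match shape {
        ShapeType::Circle => "Look how round I am",
        ShapeType::Square => "Four sides are the same for me",
    }
}

// Open set: anyone, including downstream code, can add an implementation
// without touching this file, at the cost of one impl block per shape
// and a dynamic call.
trait Shape {
    fn describe(&self) -> &'static str;
}

struct Circle;
struct Square;

impl Shape for Circle {
    fn describe(&self) -> &'static str {
        "Look how round I am"
    }
}

impl Shape for Square {
    fn describe(&self) -> &'static str {
        "Four sides are the same for me"
    }
}

fn describe_dyn(shape: &dyn Shape) -> &'static str {
    shape.describe()
}
```

Both variants produce the same text for the same shape. What differs is who is allowed to extend the set, and how much code each new shape costs.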

And if you look at this example code, you will notice that the author (probably for brevity) “draws” the shape as... text on the console. Now, let's think about how this code would have to work in a slightly more realistic scenario. Each shape would have to do some actual drawing, right? But how exactly? To where? Well... we don't want to hardcode that. So then what? Do we pass an interface Renderer in a constructor to each class implementing Shape? That's the infamous “You wanted a banana but what you got was a gorilla holding the banana and the entire jungle.” trap. So then maybe...

interface Draw {
  void draw(Renderer renderer);
}

kind of thing? And then what? What do you think makes more sense? That a Circle knows how to render itself on WebGLRender, SDL2Render and possibly CommandLineRenderer in some uniform way? Or that each renderer knows how to render each shape? The only meaningful thing that a Shape implementation could do is to call a roughly corresponding method on the Renderer.

The whole interface here and all this polymorphism is mostly an exercise in futility and overcomplicating things. Having a polymorphic Renderer interface does make sense, but usually the most pragmatic implementation of it will be a switch statement over all supported shapes. There's no way around it – having an open set of both possible rendering engines and all possible rendered shapes is not practical (possible?). Something has to give.
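One possible shape of that design, sketched in Rust: the set of renderers stays open (a trait), the set of shapes stays closed (an enum), and each renderer switches over the shapes. The `TextRenderer` backend, the shape fields and `render_all` are hypothetical and only here to keep the sketch self-contained.

```rust
// Closed set of shapes (the fields are illustrative).
enum Shape {
    Circle { radius: f64 },
    Square { side: f64 },
}

// Open set of rendering backends.
trait Renderer {
    fn render(&mut self, shape: &Shape);
}

// A hypothetical text backend that records what it would draw.
struct TextRenderer {
    lines: Vec<String>,
}

impl Renderer for TextRenderer {
    fn render(&mut self, shape: &Shape) {
        // The "scary" switch: each backend decides how to draw each shape.
        let line = match shape {
            Shape::Circle { radius } => format!("circle r={}", radius),
            Shape::Square { side } => format!("square a={}", side),
        };
        self.lines.push(line);
    }
}

// Client code depends only on the `Renderer` interface.
fn render_all(renderer: &mut dyn Renderer, shapes: &[Shape]) {
    for shape in shapes {
        renderer.render(shape);
    }
}
```

Adding a new backend means one new impl. Adding a new shape means the compiler points at every renderer that has to learn about it.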

Again – I'm not cherry-picking. Another example: If you look at what is happening in the article sent to me as feedback to my review of Growing Object-Oriented Software, the author is basically throwing out all the needless interfaces:

The third point to note here is that the code base above doesn’t have any interfaces.

WAT? :mindblown: . Does not compute. Lack of interfaces makes things simpler?

99 Bottles of OOP that I've just ranted about? The author can't help but introduce Number6 and similar inheriting from a BottleNumber class.

Another example: Most (all?) OOP languages have dynamic polymorphism-based collections, arranged in a deep inheritance hierarchy, even though it's completely unnecessary (just look at Rust with a minimal Iterator-based interface).
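For a feel of what the Iterator-based approach looks like in practice, here is a tiny Rust sketch. The function name `sum_of_squares` is invented; `IntoIterator`, `Vec`, `VecDeque` and `BTreeSet` are standard library items.

```rust
use std::collections::{BTreeSet, VecDeque};

// Accepts anything that can be turned into an iterator of u32 values.
// The collections do not share a base class, and the call is resolved
// at compile time (static dispatch), not through a vtable.
fn sum_of_squares<I>(items: I) -> u64
where
    I: IntoIterator<Item = u32>,
{
    items
        .into_iter()
        .map(|x| u64::from(x) * u64::from(x))
        .sum()
}
```

The same function works for `vec![1, 2, 3]`, `VecDeque::from(vec![1, 2, 3])`, `BTreeSet::from([1, 2, 3])` and even a plain range like `1..=3`, without any collection hierarchy tying those types together.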

It's everywhere! In OOP it's dogmatic to hunt down anything that looks like a switch statement and turn it into an interface.

The right mental model for interfaces and polymorphism

Let's look at another engineering discipline.

Think of a house. What are the examples of interfaces in a house? A power socket in the wall is an interface. Why do we have power socket/plug interfaces? Because the people that design and build a house can't possibly predict what kind of stuff you'll want to power when using it. They need to give house users a uniform and standardized way to swap and plug any appliances into the power grid. So they standardize on the voltage, connector shape and some details and do some extra work to let us plug whatever we want as needed.

Designing an interface, even a simple one, is often laborious and complicates things. Especially if it has to be a universal API for all the supported (even future) cases. Both sides have to be constrained to one uniform API. And interfaces are hard to change, even internal ones.

In contrast: why aren't the walls swappable? Because it's not very practical and people generally don't change the house layout often. It's simpler this way. The house frame needs to be solid, and everything else (e.g. plumbing) needs to be designed around it. In rare circumstances when people do want to change the layout of a house, it's a bigger engineering project, requiring new permits, design, and a lot of work.

Just think about how mind-boggling the complexity of a house design would have to be to support rearranging walls at will, including support for plumbing, electrical wiring, etc. That's a metaphor for OOP's ideal: everything reusable, everything composable, everything flexible. Everything impossibly complicated and impractical, ha.

This metaphor translates to pragmatic software design rules.

Designing and maintaining swappable (open set, polymorphic) interfaces between components is a cost. It requires unifying the view of all callers/callees behind the same API, oftentimes reversing the control flow in unnatural ways, and cascading into introducing even more interfaces. Being able to swap and combine components is a benefit, sure. But always make sure the benefits outweigh the costs.

Whenever you can accept the fact that a certain set will be closed, and changing it will require adding handling code in all the client code, take it. It's a small price for not having to deal with the additional abstraction, indirection, and complexity.

Usually, polymorphic interfaces are worth introducing to avoid permanently coupling with opaque, hardwired elements. E.g. having to talk to a real SMTP server just to unit-test a new user creation controller would be very impractical. Abstracting away the SMTP server via an interface, and using dependency injection to supply real (in production) or fake (in tests) implementations has great benefits, with a minimum amount of overhead (the interface is small and clearly defined anyway).
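A minimal sketch of that SMTP case in Rust, under the assumption of a one-method interface. Every name here (`Mailer`, `FakeMailer`, `create_user`) and the email check are hypothetical. A production implementation would wrap a real SMTP client behind the same trait.

```rust
// The small, clearly defined interface that hides the SMTP server.
trait Mailer {
    fn send(&mut self, to: &str, subject: &str) -> Result<(), String>;
}

// Test double: records messages instead of talking to a server.
#[derive(Default)]
struct FakeMailer {
    sent: Vec<(String, String)>,
}

impl Mailer for FakeMailer {
    fn send(&mut self, to: &str, subject: &str) -> Result<(), String> {
        self.sent.push((to.to_string(), subject.to_string()));
        Ok(())
    }
}

// The "controller" depends only on the interface, so production code can
// inject a real SMTP-backed implementation and tests can inject the fake.
fn create_user(mailer: &mut dyn Mailer, email: &str) -> Result<(), String> {
    if !email.contains('@') {
        return Err(format!("invalid email: {}", email));
    }
    mailer.send(email, "Welcome!")
}
```

A unit test can then call `create_user(&mut fake, "user@example.com")` and inspect `fake.sent`, with no network involved.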

Most logic that can be written in a side-effect-free way has no need for DI and polymorphism – it's usually perfectly practical to just drive all the inputs, and check all the results.
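By contrast, a side-effect-free function needs no interface at all. A small Rust sketch, where `normalize_username` and its rules are invented purely for illustration:

```rust
// Pure function: the result depends only on the argument, so a test can
// drive the inputs and check the outputs with no fakes and no injection.
fn normalize_username(raw: &str) -> Result<String, String> {
    let name = raw.trim().to_lowercase();
    if name.is_empty() {
        return Err("username is empty".to_string());
    }
    if !name.chars().all(|c| c.is_ascii_alphanumeric() || c == '_') {
        return Err(format!("invalid characters in {:?}", name));
    }
    Ok(name)
}
```

Testing it is just a matter of calling it, e.g. checking that `normalize_username("  Dawid_1 ")` yields `Ok("dawid_1")` and that a blank string is rejected.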

BTW. The basic frame of your software (data architecture) is like the walls of a house. It has to be concrete, robust, solid, and efficient. There is no point in abstracting it away and trying to make it swappable. The doors and windows are external interfaces, like database access, message queues, external service APIs – they are the external-world openings and you always want to make them an interface – that's largely what the hexagonal architecture is about.

OOP polymorphism enthusiasm

After reading the first four chapters of Object-Oriented Software Construction (written in the 90s), I can't help but notice the underlying uncritical enthusiasm and optimism that to this day permeates OOP literature. The context is set as if all software before OOP was ignoring modularity, reusability, extensibility, testability. I don't know whether there's a lot of substance to it and the state of affairs was indeed dire, or whether it is mostly an exaggeration.

Interfaces, late binding, dynamic dispatch, and polymorphism are elevated from merely useful tools to solutions to all the ills and enablers of the new glorious future. They are to fix compilation times, allow businesses to build company-wide reusable component libraries, with near-perfect modularity and reusability, change the economics of software engineering, and so on. And OOP is the only enlightened paradigm that can harness them and use them to their full potential.

In a lot of ways, it reminds me of 2017's blockchain-mania or the Gartner Hype Cycle in general: “the peak of inflated expectations”.

Having the benefit of hindsight, it's easy to see that a lot of the promised benefits never materialized, and some that did have nothing to do with polymorphism or OOP at all. E.g. contemporary reusability and modularity needs are handled with software packages – in particular registries and package managers like NPM (with its own set of problems, mind you). But there is nothing special about OOP that enabled it.

It would be nice to reach the “Plateau of Productivity” already: a point where modern software literature stops uncritically repeating the glorification of “new” ideas like polymorphism, accompanied by some switch statement strawman bashing, and where the naivety of the enthusiasm behind most OOP ideas is universally recognized.