First Principle - Don't Break Consumers

First Principle - Don't Break Consumers

Published on in first-principles & founding-engineer.

Don't Break Consumers

Hyrum's Law:

With a sufficient number of users of an API,
it does not matter what you promise in the contract:
all observable behaviors of your system
will be depended on by somebody.

In other words, when you have enough external users; you should be thoughtful about changes.

What are breaking changes?

Before I rant, let's agree on the 3 types of changes: bug fixes, accretion, and breakage.

Bug Fixes ...

... oft referred as patches, are just that: no new functionality and no changed functionality; only something that used to be broken is now fixed in some way.

For example, your "Hello World" API has an endpoint /hello used to erroneously return

{
    "hello": "Dlrow"
}

and now you've fixed it to return

{
    "hello": "world"
}

Accretion ...

... sometimes called new features, is additive change. This is requiring less from a user OR returning more to a user OR adding a new interface.

For example, your "Hello World" API /hello endpoint used to require a Query String Parameter of ?theWholeWorld=true and now it does not require it; importantly, I can still send it if I want to, but I don't have to send it anymore. This is requiring less.

OR maybe your "Hello World" API endpoint /hello now also returns the time:

{
    "hello": "world",
    "as-of": "2024-10-10T01:22:32Z"
}

This is returning more to a user. They didn't ask for it, but that's OK; they still get their "hello" field returning as it always did.

OR maybe it a new resource in your "Hello World" API, /goodbye:

{
    "shutting": "down"
}

They may not use it, but that's OK; they still have /hello for all their current use cases.

Breakage ...

... sometimes called major changes, is regressive change. This is requiring more from a user OR returning less to a user OR changing the semantics of an existing interface.

For example, your "Hello World" API /hello endpoint now requires a Query String Parameter of ?noDryRun=true; importantly, if I don't send it I get different results than I used to.

OR maybe your "Hello World" API endpoint /hello now only returns the time when it used to return more:

{
    "as-of": "2024-10-10T01:22:32Z"
}

This is returning less to a user. They probably needed that data.

OR maybe it a changing the semantics of an existing resource in your "Hello World" API, /hello:

{
    "hello": "world",
    "as-of": "1728537752"
}

The as-of changes from an ISO datetime with timezone to an epoch. Are these the same thing? Well ... they convert to the same thing if you assume certain timezone restrictions and don't use a shitty date conversion library (which is pretty much all of them). But it's NOT the same thing, because, for example, they don't lexicographically sort together the same way.

Lost Hours

<soapbox>

Semantic Versioning.

tl;dr

There are so many examples of disastrous breaking changes to consumers. In the tech world alone, I think of Python 2 -> Python 3, Angular 1 -> Angular 2, and every Google service that's disappeared over the years.

I've heard the argument so many times: it's onerous to maintain the existing "bad stuff", we must remove it, and we can make a new major version.

Here's an idea: just deprecate and leave the old thing and make a new thing. When you deprecate and leave the old thing, guess what: the community will take over ownership if the service is still valuable. When you make a new thing, you never break existing users. This is why versioning APIs in the URL works: you are making new resources and leaving the existing ones for existing consumers. If you are interested in companies who do this amazingly well, check out stripe (and a good overview in this podcast) and Werner Vogel's rules of API design.