When you all think about API versioning, what makes the most sense to you — a semver approach (major.minor.patch) a la NPM, or a date-based approach (2020-01-07) a la AWS? Or is some combination of the two desirable?
At Standard Library [0] we let people publish their own APIs, and we also publish API proxies on behalf of partners (Stripe, Slack + others), all using a semver approach. It’s not perfect, but it is theoretically enforceable (schema parameter additions can be forced to require a minor update, schema parameter removals can be forced to require a major update). We’ve just stuck to this semver approach based on intuition and haven’t had negative feedback about it, but I do like the idea of time-based versioning.
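A rough sketch of what "enforceable" could mean here, assuming a schema is just a mapping of parameter name to type. The names `Bump`, `required_bump` and `next_version` are made up for illustration and are not part of any Standard Library tooling. Treating a type change as a major bump is an extra assumption, and the sketch ignores the case where an added parameter is required.

```python
from enum import IntEnum


class Bump(IntEnum):
    PATCH = 0
    MINOR = 1
    MAJOR = 2


def required_bump(old_params: dict, new_params: dict) -> Bump:
    """Return the smallest semver bump a schema change is allowed to ship under."""
    removed = old_params.keys() - new_params.keys()
    added = new_params.keys() - old_params.keys()
    # Assumption: changing a parameter's type is as breaking as removing it.
    retyped = {
        name
        for name in old_params.keys() & new_params.keys()
        if old_params[name] != new_params[name]
    }
    if removed or retyped:
        return Bump.MAJOR
    if added:
        return Bump.MINOR
    return Bump.PATCH


def next_version(version: str, bump: Bump) -> str:
    """Apply a bump to a plain major.minor.patch string."""
    major, minor, patch = (int(part) for part in version.split("."))
    if bump is Bump.MAJOR:
        return f"{major + 1}.0.0"
    if bump is Bump.MINOR:
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"
```

A publish step could refuse any release whose declared version is lower than `next_version(current, required_bump(old, new))`.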
Would love thoughts! If you want to play around you can build your own APIs using https://code.stdlib.com/, which uses the FunctionScript specification [1] to enforce HTTP request schemas.
- Treat APIs as immutable.
- Any mutation results in a wholly new API version, not a patch or minor update.
- The developer will learn what changed, and how much, by reading the patch notes, not by looking at which semver numbers changed. This is probably a healthy practice to encourage.
- Don't change APIs so much. The interface should be very carefully designed and tested and shipped like an NES cartridge: consider it impossible to fix once it's shipped.
Therefore just start with `1` and increment each time.
The reason I don't like semver is that it's a developer convenience that leads to sloppy practices. You built your product against a specific API. If the API changes, you need to re-run your entire API evaluation, testing, blessing workflow. If the delta is tiny (what would have been a patch change) then yay, your task is likely going to be very simple. But you shouldn't see a patch version bump and decide you can cut corners.
I do think that with web APIs specifically, the surface area is a lot smaller — the HTTP interface is literally all you touch — so semver, in its purest form, is actually completely enforceable as long as you understand the API schema.
Our team has talked a lot about either hard-enforcing or automatically applying semver where applicable. My question to you is — does this sound reasonable, and if you knew a semver contract was actually bound to implementation (i.e. guaranteed and not implied), would you trust it more?
I feel like code bloat would be hard to maintain. Obviously I could take two endpoint handlers that have the same functionality and move their behavior into a shared function, but there are also tests to consider. As time went on, I'd have a pile of tests running against seemingly random versions.
Eg, `test_blog_create()` might test v1 and v2 of the API as that specific endpoint didn't change, but suddenly in v3 the API _does_ change, so now I need to write a slightly different test to handle that new functionality. `test_blog_create_v3()` or w/e.
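One way to keep that pile manageable, sketched under the assumption that each version's expectations can be written as data: a single check driven by a table, rather than one test function per version. The handlers and the `slug` field below are hypothetical stand-ins for the v1/v2/v3 example.

```python
# Hypothetical handlers: v1 and v2 share behaviour, v3 adds a "slug" field.
def create_blog_v1(title: str) -> dict:
    return {"title": title}


def create_blog_v3(title: str) -> dict:
    return {"title": title, "slug": title.lower().replace(" ", "-")}


HANDLERS = {"v1": create_blog_v1, "v2": create_blog_v1, "v3": create_blog_v3}

# One expectation table instead of test_blog_create(), test_blog_create_v3(), ...
EXPECTED_FIELDS = {
    "v1": {"title"},
    "v2": {"title"},
    "v3": {"title", "slug"},
}


def check_blog_create(version: str) -> None:
    """Run the same assertions against whichever handler serves this version."""
    response = HANDLERS[version]("Hello World")
    assert set(response) == EXPECTED_FIELDS[version], version
    assert response["title"] == "Hello World"


def run_all() -> list:
    """Check every registered version and return the versions that were covered."""
    for version in HANDLERS:
        check_blog_create(version)
    return list(HANDLERS)
```

With a test runner that supports parametrization, `check_blog_create` would be the parametrized test and `EXPECTED_FIELDS` the parameter list, so adding v4 is a new table row, not a new function.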
I'm not arguing against it, merely noting what I've previously thought about as problematic for implementing versioned APIs.
Thoughts on how best to manage the code base?
Note that the phrase non-overlapping is where all the complexity is hidden; it’s actually tricky to guarantee that an addition hasn’t changed any existing queries. For example, adding an enum value will mess up clients who query with max(enum_value). Technically, they’re not sending the same bytes, so the change is non-overlapping, but the client might disagree :)
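A small illustration of the enum case, using made-up `PlanV1` and `PlanV2` enums: the client code is identical in both versions, yet its result changes once the server adds a value.

```python
from enum import IntEnum


class PlanV1(IntEnum):
    FREE = 1
    PRO = 2


class PlanV2(IntEnum):
    # "Purely additive" change: one new value, nothing removed or renumbered.
    FREE = 1
    PRO = 2
    ENTERPRISE = 3


def top_tier(plan_enum) -> str:
    """Client logic that assumes the largest enum value is the best plan it knows."""
    return max(plan_enum).name
```

`top_tier(PlanV1)` gives `"PRO"` while `top_tier(PlanV2)` gives `"ENTERPRISE"`, a value the client may have no handling for.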
Semver is important for knowing whether or not to evaluate for breakages. You can theoretically combine both of these by making the date the patch version or supplying it as version metadata.
It can be very useful for continuous improvement and handling forward compatibility. Here's how Stripe does it: https://stripe.com/en-fr/blog/api-versioning. We found that very convenient.
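A sketch of the "date as version metadata" idea, assuming the date rides along as semver build metadata after a `+` (for example `1.4.2+20200107`). The `parse_version` helper and the `YYYYMMDD` layout are illustrative choices, not an established convention.

```python
from datetime import date
from typing import Optional, Tuple


def parse_version(version: str) -> Tuple[Tuple[int, int, int], Optional[date]]:
    """Split 'major.minor.patch+YYYYMMDD' into a comparable core and a release date."""
    core, _, build = version.partition("+")
    major, minor, patch = (int(part) for part in core.split("."))
    released = None
    if build:
        # Assumption: build metadata is exactly an 8-digit calendar date.
        released = date(int(build[0:4]), int(build[4:6]), int(build[6:8]))
    return (major, minor, patch), released
```

The core tuple still answers "do I need to check for breakage?", while the date answers "how old is this?". Semver ignores build metadata when ordering versions, so the date never affects precedence.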
The trade off is that AWS has some godawful APIs (DynamoDB has the least intuitive API I have ever worked with). But they’re stable.
If you go with a date-based approach and create a process + contract whereby you guarantee API stability and / or deprecation date, the developer always knows exactly how much time they have before an upgrade.
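To make the "exactly how much time" point concrete, here is a minimal sketch assuming versions are named by ISO release date and every version is supported for a fixed window. The 365-day `SUPPORT_WINDOW` is an invented policy, not any vendor's actual contract.

```python
from datetime import date, timedelta

SUPPORT_WINDOW = timedelta(days=365)  # assumed policy, purely illustrative


def sunset_date(version: str) -> date:
    """A version named '2020-01-07' is retired one support window after that date."""
    return date.fromisoformat(version) + SUPPORT_WINDOW


def days_remaining(version: str, today: date) -> int:
    """Days left before the pinned version stops being served (never negative)."""
    return max((sunset_date(version) - today).days, 0)
```

A client library could log `days_remaining(...)` on startup so the upgrade deadline shows up well before it becomes an outage.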
If you publish API version 1.0.0, and then internally you fix something and publish 1.0.1 (remember: a patch doesn't change the interface at all), should you continue to serve both the 1.0.0 and 1.0.1 APIs? How are they different from the consumer's perspective? What if your reason to release 1.0.1 is to fix a security issue - in what world is it ethical to continue to serve 1.0.0? If you can discontinue serving 1.0.0 at any time (because of a security patch released with 1.0.1), then you can't offer any long-term durability guarantees for early patch versions. Offering older patch versions at all is therefore more likely to break your consumers, once you're forced to discontinue them for security reasons, than refusing to serve patch-level variants in the first place.
Because you, as upstream, have no control over whether the pure addition of fields will break downstream (since downstream may or may not be strict about what they accept), you should only offer semantically versioned minor releases if you're willing to guarantee durability for earlier minor versions, and can maintain multiple minor versions of your API in parallel. If not, then be explicit about the potential of your changes to break downstream - use timestamps and document sunset dates.
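The strict-versus-lenient downstream distinction can be sketched like this. Both parsers and the field names are hypothetical; the point is only that the same additive change is harmless to one consumer and fatal to the other.

```python
def parse_strict(payload: dict, known_fields: set) -> dict:
    """A strict consumer: any field it was not built against is an error."""
    unknown = payload.keys() - known_fields
    if unknown:
        raise ValueError(f"unexpected fields: {sorted(unknown)}")
    return dict(payload)


def parse_tolerant(payload: dict, known_fields: set) -> dict:
    """A tolerant consumer: silently ignores fields it does not know."""
    return {key: value for key, value in payload.items() if key in known_fields}


KNOWN = {"id", "amount"}
OLD_RESPONSE = {"id": 1, "amount": 500}
NEW_RESPONSE = {"id": 1, "amount": 500, "currency": "usd"}  # pure addition upstream
```

Both parsers accept `OLD_RESPONSE`. Only `parse_tolerant` survives `NEW_RESPONSE`, and upstream has no way of knowing which kind of consumer it has.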
We're talking about HTTP APIs, right? Then neither. These are solutions with bad trade-offs for the problem at hand.
Version the link relations. This follows the principles that make the Web successful (a.k.a. REST). If that seems weird to you¹, consider the following:
You have a personal homepage type Web site. When you change or add or remove a document, do your users need to upgrade their user agent to keep using the site? Why not?
----
¹ Numerous developers are so enamoured with putting a version number into document URIs that they cannot fathom not doing it. This is another mutilation of the mind à la Dijkstra.
https://events.pagerduty.com/generic/2010-04-15/create_event...
Customers there would sometimes ask if that API was still relevant given its date is 9+ years old.
In Shopify’s case, where there are quarterly version bumps and deprecation over time, date based makes better sense.
My high-level feeling is that it’s just way more difficult to ship a SaaS API than a Ruby Gem, so adding semver to that is just another layer of API management everybody has to agree on. Do you agree with this assessment?
But with semver you can say: ok, I want every new version up to a new minor version (all fixes). While with a date based version you don't know how 'breaking' the changes will be.
This is a good point. Good changelogs, like any documentation, take a lot of effort and so are unfortunately quite rare.
As an API consumer, to me the ideal upgrade-related documentation includes:
* Detailed release notes for each released version
* For non-GA releases like alphas, betas and RCs, describe the changes since the last small release -- typically consumers of a beta are following dev quite closely
* For GA releases, pretend the alpha/beta/etc don't exist at all. Describe the delta from the last GA release. Consumers don't care about the fix for a bug in a beta they never even knew existed.
* When you do breaking changes, provide upgrade guidance, like "If you were doing x before, now you should do y + z instead"
* If appropriate, consider also keeping separate upgrade guidance documentation for major breaking changes, for those laggard consumers that are going from a much older version like 2.x to 4.x. This allows someone to follow more of a checklist to get up-to-date without having to read 900 pages of individual release notes
You can do this with date-based versions, but it's much harder as a consumer to figure out unless there's very good documentation. If I'm upgrading from "2.0.4" to "4.1.1", I know that as an absolute minimum starting point I will be looking at release notes for "3.0" and "4.0", and from that I should get a pretty good sense of the overall effort involved. If I'm upgrading from "2016-11-05" to "2019-12-19", how do I do the equivalent evaluation?
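The semver half of that comparison is mechanical enough to write down. A minimal sketch, with `majors_crossed` as a made-up helper name:

```python
def majors_crossed(current: str, target: str) -> list:
    """List the major releases whose notes are the minimum reading for an upgrade."""
    current_major = int(current.split(".")[0])
    target_major = int(target.split(".")[0])
    return [f"{major}.0" for major in range(current_major + 1, target_major + 1)]
```

`majors_crossed("2.0.4", "4.1.1")` returns `["3.0", "4.0"]`. There is no equivalent function for two dates unless the provider also publishes which dated releases were breaking.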
Do you feel like you’d trust a company’s API more if you could peruse the API changelog more easily?
If I have a bug, it's a breaking change. If I fix this bug, it's also a breaking change.
Google's proposed a "stability" semantic as a third option[0]. TL;DR no breaking changes in the Stable channel but you can add backwards-compatible[1] features in-place.
A permanent Beta channel that's a superset of Stable lets users choose how change-tolerant they are. This lets API producers launch features earlier, knowing they will only impact risk tolerant users if breaking changes are needed. Theoretically this reduces the need for breaking changes in Stable, which require a new Major version.
For example, 5.x.x is currently stable, so you release 6.0.0-rc1 (2, 3, ...)?
V2
V3-dev/v3-test/v3-rc - all related to how many changes you expect
V3
As far as I understand, the Stripe api would continue to work indefinitely so long as you lock your api version, whereas Shopify would eventually break the app as they essentially backport breaking changes to older api versions.
Initially, I thought Stripe's method was superior, and would provide the best API experience, but realized that Stripe and Shopify have different incentives w/r/t their api.
For Stripe, breaking a functioning site harms revenue, and generally the developer is the Stripe customer.
For Shopify, their customer is the store owner, and for the most part, the store will continue to function because that is mostly controlled by shopify. The developer api is for added functionality, and it is in the interest of Shopify and the merchant that those apps continue to be updated and utilizing the latest features.
So, two different ways of managing breaking changes, but both are ultimately centered around providing the best customer experience.
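The two policies can be contrasted in a few lines. This is only a sketch of the behaviour as described in this comment: the version lists are invented, and "fall forward to the oldest supported version" is an assumption about how a sunset policy might be implemented.

```python
# Hypothetical version lists; real platforms publish their own.
ALL_VERSIONS = ["2019-04", "2019-07", "2019-10", "2020-01"]
SUPPORTED_VERSIONS = ["2019-07", "2019-10", "2020-01"]  # "2019-04" has been sunset


def resolve_pinned(requested: str) -> str:
    """Pin-forever policy: any version ever released keeps being served as-is."""
    if requested not in ALL_VERSIONS:
        raise ValueError(f"unknown version: {requested}")
    return requested


def resolve_with_sunset(requested: str) -> str:
    """Sunset policy: a retired version is served by the oldest supported one."""
    if requested not in ALL_VERSIONS:
        raise ValueError(f"unknown version: {requested}")
    if requested in SUPPORTED_VERSIONS:
        return requested
    # The caller asked for a retired version and silently gets newer behaviour.
    return SUPPORTED_VERSIONS[0]
```

Under the first policy an untouched integration keeps its original behaviour. Under the second it eventually receives responses from a version it was never tested against.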
While this is orthogonal to the Shopify API and to your post, since you mentioned the need for updates, I just wanted to use the opportunity to vent my frustration with the constant push to update everything all the time, and with judging any piece of software by "when was the last update" as a metric.
The problem I see is that not all apps or libs need (frequent) updates. Many (maybe most) do need them, but some don't. They provide some functionality, they do that well and you could call them "complete". Maybe some security fix could be needed from time to time, but with mature code that has been in use for many years, even those are not frequent.
For example, consider something like the ping utility. It has done what it does for decades. There was a need to add IPv6 support, but that was almost two decades ago. Why would anyone need to update it? I do not want any additional functionality, I don't want it to send emails or have a social media share button. I want it to send ICMP echo requests and receive ICMP echo replies and nothing more. Aside from some security fixes no updates should be needed for 10+ years. This utility is done. It should not be frowned upon just because there were no updates for many years.
While of course neither ecommerce nor the Shopify platform is "done", and both get many updates now and will get more in the future, that does not mean some functionality could not have reached the "done" stage.
For a "complete and done" addon, there could be a need for a security fix from time to time. There could be a need for some adjustments if a major browser introduces a new deviation from JS/CSS/HTML standards and forces everyone to update their code. But those events happen from time to time, possibly not that frequently. This means that some addon/plugin would not require any updates during the periods between those events and those periods could be many months/years long. But hey: "this addon did not receive any updates for 13 months, it must be really bad and should be avoided". This leads to a situation where a competing solution with tons of bugs will look better just because it receives two updates a week.
Edit: I deleted a couple sentences and realize now this might not convey exactly what I meant. Utilities are easy to call "done," applications are not. Applications interact with external forces who do change constantly (other software, business processes, law and regulation, etc). I think in general, updating applications is a necessary thing, bordering on good, regardless of circumstances.
The updates in these cases are not necessary to get the most recent features, but to stay connected to an interface that does change with time.
Someone needs to update the interface between the two while things evolve. Like it or not, that needs to be done. If the tool is used and liked by only a few, then why not push that requirement toward the few who use it and like it, instead of supporting everything, even what's no longer used?
Sporadic updates tell those who depend on the code that stability and security updates are still being handled, even if the code is "complete". I'm not sure I'd have a lot of faith in code that looks abandoned.
How about being able to ping TCP ports? I need this for debugging all the time. Yeah, I use tcping instead... but why is this a separate tool? Dumb.
I'm pretty sure that for every tool you think is "done", there's someone out there screaming at its inadequacy.
I LOATHE breaking changes. They are almost never actually necessary.
... Until just recently in mid Nov they changed some behavior that caused us to double/triple/quadruple charge customers unintentionally in some not-so-uncommon edge cases... I’m still trying to square this one with their support, so details are a bit thin. But definitely a big surprise for me to see this happening when previously I never imagined something like this can happen given the API versioning stability.
We used the payment failed hook to send an email to our customers and let them know when we'd be retrying the charge, which was no longer possible because the next_payment_attempt field was null. Before the change, a next_payment_attempt=null meant the charge would not be retried.
I reported the issue and the webhook changes were rolled back a month later. Really threw a wrench into our flow.
I will say that generally speaking, Stripe is the canonical backwards compatibility API in my mind. With a few edge cases.
It was the change on limitations for variants. We woke up one day to product additions failing. Took about a week for the tech team to figure out that we were being throttled because we were exceeding 50k variants. Totally understandable in the bigger picture of things, but I didn't feel we received adequate warning. Developer support had told us that sometimes they grant temporary exemptions but in our case they refused. They advised we upgrade to shopify plus (they quoted us over $2000) to remove the throttling/limitation. Financially it was out of scope for us, so we had to throttle our own customers, which led to a massive disadvantage.
I ended up writing our own cart software, which was always part of the plan, but in the meantime our business suffered.
I don't hold any grudge about it since we were getting incredible value out of the previous arrangement, and I do think Shopify is amazing software. I've been a paying customer in multiple capacities since at least 2008 and recommend it to people all the time. What happened was just unfortunate timing for us.
Thanks for taking a minute to listen, though. Much appreciated.
What was the change?
Edit: Just to be clear, it was always on our roadmap to migrate away from Shopify's platform, but they accelerated our timeline and we had to limit the amount of merch our customers could add which obviously led to upset customers.
We're still operating, albeit close to insolvent, and have since launched our new platform. But our reputation has been irreversibly damaged.
What was the business?
- Moving JSON fields in and out of nestings didn't seem to be counted as a breaking change.
- Changes were rarely announced, and there was never a changelog of what had changed (they appear to have started one in 2018 [1]).
- When we contacted support about a break, they would often be surprised.
- Often the only sign there would be a change would be that new fields would start to show up before a larger change.
All this would happen every few months. Reading this article I can start to see the reasons why this was happening.
This has led me to taking the drastic step of making _every_ property nullable. It's gross and feels bad to use, but at least it prevents JSON parse operations from crashing applications when a value is unexpectedly null.
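For what it's worth, the "everything nullable" approach looks roughly like this in Python. The `Product` shape is invented for the example, and `Optional` everywhere is exactly the gross-but-safe trade-off described above.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Product:
    # Every property is Optional, so a surprise null or a missing key cannot crash parsing.
    id: Optional[int] = None
    title: Optional[str] = None
    price: Optional[str] = None


def parse_product(raw: str) -> Product:
    """Build a Product from JSON, tolerating null values and absent fields."""
    data = json.loads(raw)
    return Product(
        id=data.get("id"),
        title=data.get("title"),
        price=data.get("price"),
    )
```

The cost moves downstream: every caller now has to handle `None` for fields that are, in practice, almost always present.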
1. You can just keep adding new methods and fields as needed, but since each client asks only for the fields specific to what they want, you don't get big bloated response objects.
2. Lots of times your breaking changes only differ slightly from previous versions, and the way GraphQL resolvers are written makes it really easy to refactor things into one base method that both the old and new versions can share.
3. Proper use of the @deprecated schema directive means your doc is 'clean' by always showing the latest version that new users should adopt, but the doc is still there for users on older versions.
4. It's really easy to add logging and tracing in your resolvers to see how often fields are being accessed and who is using them. At some point you may decide to break backwards compatibility by deleting old fields, but you'll know exactly who you are breaking.
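Points 2 and 4 can be sketched without committing to a particular GraphQL library. Everything below (the `tracked` decorator, the `name` and `fullName` fields, passing `client_id` as an argument) is an assumption for illustration. A real server would take the client identity from request context and mark the old field with `@deprecated` in the schema.

```python
from collections import Counter
from functools import wraps

# (field name, client id) -> number of times that client resolved the field
field_usage = Counter()


def tracked(field_name: str, deprecated: bool = False):
    """Wrap a resolver so every access is counted per field and per client."""
    def decorator(resolver):
        @wraps(resolver)
        def wrapper(obj, client_id):
            field_usage[(field_name, client_id)] += 1
            return resolver(obj)
        wrapper.deprecated = deprecated
        return wrapper
    return decorator


def _full_name(user: dict) -> str:
    """Shared base logic that both the old and the new field resolve through."""
    return f"{user['first']} {user['last']}"


@tracked("name", deprecated=True)
def resolve_name(user: dict) -> str:
    return _full_name(user)


@tracked("fullName")
def resolve_full_name(user: dict) -> str:
    return _full_name(user)


def clients_still_using(field_name: str) -> list:
    """Who would break if this field were deleted today?"""
    return sorted({client for (field, client) in field_usage if field == field_name})
```

Before deleting `name`, `clients_still_using("name")` gives the list of integrations to contact first.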
Shopify says the main product is the store owner. But the developers pick up all of the slack of Shopify.
Recurring payments. App. Store backups. App. Theme backups. App. Order editing. Came late 2019. Checkout. So locked down. Where’s the API?! Slate tooling. Abandoned. Starter themes. Abandoned. Storefront SDK. Terrible documentation. More than 1 variant image? App. Metafields. App. Wholesale. App. Mailchimp. Removed.
People talk about google abandoning products. Shopify abandons nearly all developer tooling and is so locked down that it’s a constant “app for that” for the basics.
The interesting thing is that more than a few store owners I know who use Shopify have asked how to move off of it.
I guess fulfilment is more important though. Right?!
I'm actually working on an open-source fulfillment and operations app for Shopify:
https://github.com/openshiporg/openship
I use it to build small apps that interact with the API directly instead of paying and relying on any apps.
I don’t think Shopify could withstand a real entrance by Adobe with Magento, or a product in the same space from someone like Microsoft, as these companies know developer tooling. And in the end, Shopify needs developers. Developers don’t need Shopify.
Sure, it's Shopify's choice. But considering how long their own changes take (for example, multi-language is still in some sort of beta and it's been up and coming for like 5 years?), the API cycles are just brutal. And the saddest thing is that Shopify is still the best managed e-commerce platform for most use cases.