Interested to hear about current setups, and how it works for you.
1. https://github.com/gocd/gocd - 6.1k stars
2. https://github.com/Shopify/shipit-engine - 1.2k stars
3. https://github.com/guardian/riff-raff - 252 stars
4. https://github.com/ankyra/escape - 201 stars
5. https://github.com/kiwicom/crane - 92 stars
6. https://github.com/tim-group/orc - 34 stars
7. https://github.com/wballard/starphleet - 19 stars (dead?)
To be honest we tried to avoid the monorepo but it was hellish. Maybe it would have worked if each microservice were larger and our team were larger, but then would they still be microservices?
Giving up and dumping everything into a monorepo isn't going to help at all. At that point you're probably better off abandoning any hope of carefully split, individually managed services.
Same with our consistent logging system.
Libraries are better than unique code everywhere for the same task - allows you to fix a bug once and to do consistency checking.
Wouldn't this urgent need mean that they put the code into the microservice that needs the urgent update, as opposed to going through the effort of making it available for everyone to use?
Anyway, it sounds like you have a distributed monolith. If you cannot maintain and deploy a microservice independently, it should not be a microservice.
We can maintain and deploy them independently, but it was annoying to try to track which version was deployed where and having to check it out independently, etc.
The overhead was incredibly high. So we plopped them all into a single monorepo as sub-projects. We can still update each one individually, but we know that what is live on the website is what is at the head of that branch.
As someone whose last website was a monolith (Clara.io), I feel we are getting the benefits of microservices with little of their downsides now. It is like night and day.
It may be that we have a lot of microservices for the size of our team - 20+ microservices and a team size of around 12.
One microservice per team, so you cut down on intra-team friction, and the team can manage their own releases.
(In other words, you're 100% spot on!)
This is comparable to CloudFormation or Terraform in terms of determining whether something is up-to-date, but more general purpose.
Short version: have DB1 hold the transactional data (data generated while running the system). Have DB2a hold the release-bound data (data about and connected to the code itself: settings, prices, whatever).
Have DB2a have views onto DB1 tables. Version a of the code only "knows about" DB2a, but any transactional CRUD ops hit the tables on DB1.
Now version b of the code just needs to ship/create a DB2b and both a and b can run in parallel.
If you need to change the shape of DB1 tables, those changes need to be backward compatible (can only add nullable columns, no use of "select *", etc).
There are a few details about how to make it fully practical, but that's the gist, and we ran that for about 12 years on a moderately heavily trafficked e-commerce site.
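A minimal sketch of the two-schema idea, using in-memory SQLite with name prefixes standing in for the separate databases (all table and column names are made up):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# DB1: transactional data, shared by every running code version.
cur.execute("CREATE TABLE db1_orders (id INTEGER PRIMARY KEY, sku TEXT, qty INTEGER)")

# DB2a: release-bound data plus views onto DB1, shipped with code version a.
cur.execute("CREATE TABLE db2a_prices (sku TEXT PRIMARY KEY, cents INTEGER)")
cur.execute("CREATE VIEW db2a_orders AS SELECT id, sku, qty FROM db1_orders")

# Version a only "knows about" db2a_* objects...
cur.execute("INSERT INTO db2a_prices VALUES ('widget', 499)")
# ...but transactional writes still land in DB1's tables.
cur.execute("INSERT INTO db1_orders (sku, qty) VALUES ('widget', 2)")

# Version b just ships its own db2b_* layer; a and b run in parallel
# against the same DB1.
cur.execute("CREATE VIEW db2b_orders AS SELECT id, sku, qty FROM db1_orders")

print(cur.execute("SELECT sku, qty FROM db2a_orders").fetchall())
print(cur.execute("SELECT sku, qty FROM db2b_orders").fetchall())
```

Both views see the same order row, which is the point: the transactional tables are shared, while each release owns its release-bound layer.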
This requires a little discipline, but if you follow a few simple rules it's not really that arduous:
- when adding a new column, it must have a default value set, or be nullable
- don't drop any columns
- don't rename any columns
Now, for those last 2, what I really mean is "don't do it in a single release" - if you want to make destructive changes, do it over the course of 2 releases:
- release 1: remove dependencies on the column from the app/API/service
- release 2: perform the database migration with destructive changes
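The "new columns need a default or must be nullable" rule can be seen in a quick SQLite session (table and column names are hypothetical):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("INSERT INTO customers (name) VALUES ('alice')")

# Safe: the new column has a default, so old code (which never mentions
# it) keeps working, and pre-existing rows get a sensible value.
cur.execute("ALTER TABLE customers ADD COLUMN tier TEXT NOT NULL DEFAULT 'free'")

print(cur.execute("SELECT name, tier FROM customers").fetchall())  # [('alice', 'free')]
```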
It probably sounds more difficult than it actually is :) In reality, I don't make destructive changes that often though.

So ideally you have some kind of monitoring that reports/shows how many services are alive (and where they live in a cluster), how many errors they generate, etc. Then based on some thresholds you can take them out of circulation and let them cool down. If certain kinds of errors occur, or occur at a certain frequency, the system can notify a site reliability engineer (or equivalent) to check it out. Then they can decide if it should be permanently removed, and log an internal support ticket and so forth for the developers or product teams.
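A threshold-based "take it out of circulation" check like the one described can be sketched in a few lines of Python (the window size and error-rate threshold are made-up numbers):

```python
from collections import deque

class InstanceHealth:
    """Track recent request outcomes for one service instance and pull
    it from rotation once its error rate crosses a threshold."""

    def __init__(self, window: int = 100, max_error_rate: float = 0.5):
        self.results = deque(maxlen=window)  # sliding window of outcomes
        self.max_error_rate = max_error_rate

    def record(self, ok: bool) -> None:
        self.results.append(ok)

    @property
    def error_rate(self) -> float:
        if not self.results:
            return 0.0
        return self.results.count(False) / len(self.results)

    def in_rotation(self) -> bool:
        return self.error_rate < self.max_error_rate

h = InstanceHealth(window=10, max_error_rate=0.5)
for _ in range(4):
    h.record(True)
for _ in range(6):
    h.record(False)
print(h.in_rotation())  # error rate 0.6 >= 0.5 -> False
```

A real system would layer alerting (page the SRE) and cool-down/re-admission on top, but the core decision is just a windowed error rate against a threshold.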
Production issues are a part of life. You need to have some visibility on issues and their severity. Every company and tech stack is different, also depending on their SLA's and uptime promises.
Ads not rendering in an app might be less severe than a pump failure at a fuel station, so they have different kinds of monitoring and reaction times to faults. Obviously things like hospitals, banks, airlines/aircraft manufacturers have way different requirements and infrastructure from, say, a system that manages all school libraries for a state/province.
There are too many products and approaches to mention here if you were looking for a list of those. I have one or two favorite approaches and a handful of tools for this kind of stuff, half of which is homemade, so not something you can google. But you can search the general topic and see a few different approaches. "microservices monitoring java" or "microservices monitoring best practice" or something along those lines will get you on a path. Try to find 5 different approaches and reflect on what each one is missing or how they may help you, and then ponder what you would like to see from a reporting system with hundreds/thousands of services.
And then obviously the best lessons will come from production itself.
Good luck!
Luckily, a good CI/CD pipeline makes reversions just as easy as deployments. So even when you have errors, it's easier to correct than if you suddenly discovered "our deployment bash script / ansible playbook isn't as reversible as we thought it was"
The idea of deploying every commit all the way to prod is very questionable.
That said, to know what changes would actually break things you'd ideally have a suite of tests.
If only you could tell my bosses/architects that. They won't listen to me.
Edit: why downvote?
Just because you should be able to release without orchestration doesn't mean you shouldn't be able to watch and track things.
You shouldn't have frequent breaking changes but you should still have the tools to manage when you do.
Then leave and go somewhere where they will. I wasted too much of my life trying to "change things from within", but I finally learned the lesson. If you have no authority but are held accountable, then GTFO.
My theory is your presentation is not compelling. Was your cost-benefit analysis clear? What risk/reward metrics did you highlight?
True, but you absolutely should still be versioning/tagging your releases for each service. It's not to provide sophisticated orchestration; but just to know each of your releases and be able to roll back to them.
Also I'll point out that some loose coupling between services is unavoidable even in the best case scenario. Sometimes breaking changes happen, or new features need to be taken advantage of. This necessitates some level of (perhaps ad-hoc) "orchestration." If you add a new feature to a microservice and rely on it elsewhere, there's an implicit dependency to that version (or later) of the microservice now.
I've been looking at Sentry for this, recently. They have a specific feature for tracking releases (and even relating them to errors vs. commits) which looks very interesting. Haven't tried it yet though.
As it stands, with what I've seen and heard about microservices, I'd say the best way to deal with them is to use a monolith 90% of the time, and for the rest of the time make sure your microservice could stand as its own SaaS if given enough love.
Not a direct solution to your problem but might be an indirect one.
When a network call is involved, never.
The somewhat more modern way with Kubernetes deployments is the Helm "chart of charts" pattern, where your system level deployment is a single Helm chart that does nothing but pull in other charts, specifying the required semantic version of each sub-chart in the values.yaml file.
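As a hypothetical illustration of that layout (chart names, versions, and the repository URL are all made up; note that in current Helm the version pins live under `dependencies` in the umbrella chart's Chart.yaml):

```yaml
# Chart.yaml for the umbrella ("chart of charts") deployment
apiVersion: v2
name: my-system            # hypothetical system-level chart
version: 1.4.0
dependencies:
  - name: orders-service   # hypothetical sub-chart
    version: "~2.3.0"      # semver range: any 2.3.x
    repository: "https://charts.example.com"
  - name: payments-service
    version: "1.7.2"       # exact pin
    repository: "https://charts.example.com"
```

The umbrella chart itself ships no templates; it exists only to pin a known-compatible set of sub-chart versions that deploy together.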
The older, but also much more flexible way I've seen it done is through something a local system architect developed a while back that he called a "metamodule." This was back when Apache Ant was still a popular build tool, Apache Ivy was a preferred means of dependency management, and microservices were being deployed as Java OSGi components. Ivy defines a coordinate to uniquely identify a software dependency by organization, module, and revision. So a metamodule was just a module, but like the chart of charts, it doesn't define an actual software component, but rather a top-level grouping of other modules. Apache Ivy is significantly more flexible than Helm, however, allowing you to define version ranges, custom conflict managers, and even multiple dependencies that globally conflict but can be locally reconciled as long as the respective downstreams don't actually interact with each other.
Be aware both of these systems were for defense and intelligence applications. Personally, I would just recommend trunk based development and fail fast in production for most consumer applications, but for things that are safety or mission critical, you can't do that and may have very stringent pre-release testing and demonstration requirements and formal customer acceptance before you can release anything at all into ops, in which case you need the more complicated dependency management schemes to be able to use microservices.
Arguably, in this case, the simplest thing to do from the developer's perspective is don't use microservices and do everything as a monorepo instead, but government and other enterprise applications usually don't want to operate this way because of being burned so much in the past by single-vendor solutions. It's not totally impossible to have a monorepo with multiple vendors, but it's certainly a lot harder when they tend to all want to keep secrets from each other and have locally incompatible standards and practices and no direct authority over each other.
All of our microservices have deployment charts, with frozen image versioning. That way, we can roll out a whole release knowing the services are all compatible with each other, and can easily roll back just by reverting in git.
CI/CD updates image versions in affected YAMLs on every backend release and Flux keeps staging in sync. When we are happy, we sync to production branch, Flux syncs and it's done.
If we spot an issue that we didn't see in staging, we either release a hotfix or rollback.
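A toy version of the "CI bumps the image tag in the affected YAMLs" step might look like this in Python (registry path and tags are made up; real setups often use kustomize or Flux's image automation instead):

```python
import re

def bump_image_tag(manifest: str, image: str, new_tag: str) -> str:
    """Rewrite 'image: <image>:<old-tag>' lines in a manifest to point
    at new_tag, leaving every other line untouched."""
    pattern = re.compile(rf"(image:\s*{re.escape(image)}):\S+")
    return pattern.sub(rf"\1:{new_tag}", manifest)

manifest = """\
containers:
  - name: api
    image: registry.example.com/api:v1.4.2
"""
print(bump_image_tag(manifest, "registry.example.com/api", "v1.5.0"))
```

The CI job would commit the rewritten file to the gitops repo, and Flux picks up the change from there; rolling back is just reverting that commit.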
I've seen both advocated for, interested in what the consensus is.
Backend is a monorepo. I can easily check the commit history in gitops repo to see what was the state of backend when the release was made.
Nothing should be lost, we keep history of everything this way.
To elaborate:
- I do think there is value in "utility microservices". For example: a microservice to send email, a microservice to filter spam, etc. These are the next level libraries (because they do need to run as services 24/7). Management usually don't like these kind of microservices because these "domains" usually don't belong to any particular team, so managers cannot "own" their success.
- I don't think there's much value in building microservices for the core of your business (e.g., a checkout microservice, a payments microservice, etc.). The usual argument management gives is: "we'll make teams more independent and they will be able to deliver stuff faster than with a monolith!". While this is sometimes true, "faster software delivery" is not on my top list of priorities when it comes to building software.
* build code
* run tests (unit + integration using database)
* build docker image
* push to gitlab registry
* deploy to staging k8s environment by using a custom image that just templates a .yml and does `kubectl apply` against the staging cluster
* optional extra "deploy to production" that works in the same way but is triggered with a manual button click in the pipeline.
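Strung together, those stages might look roughly like this in a `.gitlab-ci.yml` (script names and details are hypothetical):

```yaml
stages: [build, test, deploy]

build:
  stage: build
  script:
    - make build
    - docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA   # gitlab registry

test:
  stage: test
  script: ["make test"]          # unit + integration (with database)

deploy_staging:
  stage: deploy
  script: ["./template-and-apply.sh staging"]   # templates .yml, kubectl apply

deploy_production:
  stage: deploy
  when: manual                   # the manual button click in the pipeline
  script: ["./template-and-apply.sh production"]
```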
I don't do canary deploys or anything. Just deploy to staging, and if it works, promote to production.

For some projects I have "staging test scripts" which I can run from my devmachine or CI that check some common scenarios. The test scripts are mostly blackbox using an HTTP client to perform a series of requests and assert responses. (signup flow scenario for example)
I would like to move to a monorepo, but I have not yet figured out an easy way to have a separate pipeline for each service that is only triggered when that service has changed.
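For what it's worth, GitLab CI's `rules: changes:` can scope a job to a subdirectory, which is one way to get per-service pipelines in a monorepo (paths and script are hypothetical):

```yaml
deploy_orders:
  script: ["./deploy.sh services/orders"]
  rules:
    - changes:
        - services/orders/**/*
```

With one such job per service directory, a commit only triggers the pipelines for the services it actually touched.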
edit: formatting
Feel free to ask question or reach out :)
A team should own a microservice; you release as soon as the team is able to.
You version your apis, so you don't break any services which rely on yours.
I agree, but in practice it seems more companies break it rather than follow it.
It's a totally impractical standard for modern software development, but the developers themselves have no choice in the matter until the customers change.
In the past, instead of canary, we used a staging environment with manual promotion. That was costing us a cool half a million in AWS overpriced machines (but we were committed to spend a certain amount of money per year in exchange for discounts, so it's hard to price things) and it was doubling the testing process (promote to staging, test, promote to prod, test). We have been bitten by issues happening in production and not in staging. With the canary, prod only approach we have higher risks of messing up with real data but we have safeguards in place and the canary approach means that a small portion of the users will see problems. We also have the option to deploy to a canary for devs only.
I'm not happy about using / running / maintaining jenkins (terrible UI, upgrade path, API to add plugins, etc) but it does the job and it improved a fair bit over the last 5 years. Jenkinsfile are especially nice, even though not being able to easily run them locally is a bit annoying.
For always-on systems we have a simple dashboard that each service interacts with.
We don’t have a fancy CI/CD pipeline or anything like that, just a set of rules that you have to follow.
Database-wise a service has to register itself with one of our data-gatekeepers, which involves asking for permission for the exact data used with a reason. But beyond that services are rather free to make “add” changes, often in the forms of new tables that are linked with views. It’s not efficient, and we have a couple of cleanup scripts that check if anyone subscribed to all the data, but we’re not exactly Netflix, so the inefficiency is less expensive than doing something about it.
Founder of OpsLevel here (https://www.opslevel.com).
A lot of companies build their own internal microservice tracking tools. Not just for release/deployments, but also for tracking service owners and production readiness.
e.g., Shopify has ServicesDB ([1]) and Spotify has System-Z [2], which they recently open sourced as Backstage [3].
If you're down to build / maintain your own service catalog, those are good places to start.
We started OpsLevel a few years back because we saw a pretty clear need for a product in this space. OpsLevel tracks your services and their owners, production readiness of your services, and brings together lots of event/metadata about your services (including deploys).
There's been a lot of traction in this space over the last few years with a lot of new companies popping up. I'm glad to see some of our newer friends in the space chiming in this thread.
[1] - https://shopify.engineering/e-commerce-at-scale-inside-shopi...
[2] - https://dzone.com/articles/modeling-microservices-at-spotify...
[3] - https://backstage.io/
It helps that I have One Deployment Script To Rule Them All (or really, a couple DSTRTA's). When every service has its own special build & deploy script you have to ask nicely and hope people keep up with it. A lot of CI/CD systems force you into that corner because of an implicit assumption that each build & deploy is its own special one-off.
Anyhow, text files rule, at least as an ad-hoc solution.
Our production deployment jobs are in Jenkins and isolated. It's easy to check what was deployed when. We also have a script written that can run an environment report to see what versions and which microservices have been deployed. Along with their CPU/memory allocations, number of pods etc.
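An environment report like that can be built by parsing `kubectl get deployments -o json`; here's a sketch of the parsing half, run against a made-up sample (field paths follow the standard Kubernetes Deployment schema):

```python
import json

def environment_report(deployments_json: str):
    """Summarize deployed services/versions from kubectl's JSON output."""
    items = json.loads(deployments_json)["items"]
    report = []
    for d in items:
        # Assumes one container per pod for brevity.
        container = d["spec"]["template"]["spec"]["containers"][0]
        report.append({
            "service": d["metadata"]["name"],
            "image": container["image"],          # carries the version tag
            "replicas": d["spec"]["replicas"],    # number of pods
            "cpu": container.get("resources", {}).get("limits", {}).get("cpu"),
        })
    return report

# Made-up sample standing in for `kubectl get deployments -o json`.
sample = json.dumps({"items": [{
    "metadata": {"name": "orders"},
    "spec": {"replicas": 3, "template": {"spec": {"containers": [
        {"image": "registry.example.com/orders:v2.1.0",
         "resources": {"limits": {"cpu": "500m"}}}]}}},
}]})
print(environment_report(sample))
```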
Release management tracks which JIRA stories are in which release, they do it mainly by looking at master merges between prod deployments.
Parent comment doesn't mention whether identification of versions is done manually or whether they just grab master. If the latter, it's probably reasonable. At $myclient, every release to stage and prod requires teams to manually identify each version of each microservice as well as the stories (JIRA tickets) that are being deployed. This is extremely painful, time-consuming, and error-prone. Avoid at all cost; as the number of services grows, the pain/time/error cost appears to increase geometrically.
Sorry, I'm just kidding, but that's the only thing I could think of when I heard the number 200!
We have standardized pipeline models that we reuse everywhere. Service owners are responsible for updating their pipelines to pick up changes. As we mature, we're moving a lot of it into ci templates and key changes will be picked up automatically. There are a few pipelines that occasionally require manual steps but those are uncommon. As we add more continuous testing, we'll be deploying more frequently. Once we've gotten good at that, then we'll be working on a/b testing and/or feature flags.
https://www.altoros.com/blog/airbnb-deploys-125000-times-per...
Our community Discord Server (questions on DevOps and DataOps, not limited to Reliza Hub) - https://discord.gg/UTxjBf9juQ
The only way to accomplish what you're asking for would be extremely thorough mock testing.
This is a serious comment.