A small team starting a new project should not waste a single second considering microservices unless there's something that is so completely obviously decoupled in a way that not splitting it into a microservice will lead to extra work. It's also way easier to split into microservices after the fact than when you're developing a new app and you don't have a clue how it will look like or what the overall structure of the app will be in a year (most common case for startups).
A thousand times yes. Distributed systems are hard.
> Debugging is more difficult since you now can no longer step through your program in a debugger but rather have an opaque network request that you can't step into.
Yes. Folks underestimate how difficult this can be.
In theory it should be possible to have tooling to fix this, but I've not seen it in practice.
> You can no longer use editor/IDE features like go to definition.
Not a problem with a good editor.
> Version control becomes harder if the different services are in different repositories.
No organisation should have more than one regular-use repo (special-use repos, of course, are special). Multiple repos are a smell.
I would modify this slightly. Larger organizations with independent teams may want to run on per-team repos. Conway's law is an observation about code structure but it sometimes also makes good practice for code organization. And of course, sometimes the smell is "this company is organized pathologically".
Another problem is that large monolithic repositories can be difficult to manage with currently available software. Git is no panacea and Perforce isn't either.
Flat out wrong for any organization with multiple products. Which, let's be honest, is most of them.
Mind elaborating on this?
What editor are you thinking of that can jump from HTTP client API calls to the corresponding handler on the server?
Totally agree with everything else, but gotta completely disagree on this last point. Monorepos are a huge smell. If there's multiple parts of a repo that are deployed independently, they should be isolated from each other.
Why? Because you're fighting human nature, otherwise. It's totally reasonable to think that once you excise some code from a repo that it's no longer there, but when you have multiple projects all in one repo, different services will be on different versions of that repo, and your change may have changed semantics enough that interaction bugs across systems may occur.
You may think that you caught all of the services using the code you refactored in that shared library, but perhaps an intermediate dependency switched from using that shared library to not using it, and the service using that intermediate library hasn't been upgraded, yet?
When separately-deployable components are in separate repositories, and libraries are actual versioned libraries in separate repositories these relationships are explicit instead of implicit. Explicit can be `grep`ed, implicit cannot, so with the multi-repo approach you can write tools to verify that all services currently in production are no longer using an older, insecure shared library, or find out exactly which services are talking to which services by the IDLs they list as dependencies.
While with the monorepo approach you can get "fun" things like service A inspecting the source code of service B to determine if cache should be rebuilt (because who would forget to deploy service A and service B at the same time, anyways...), as an example I have personally experienced.
My personal belief is that the monorepo approach was a solution back when DVCSs were all terrible and most people were still on centralized VCSs like Subversion that couldn't deal with branches and cross-repo dependencies well, and that's just what you had to do, while Git and Mercurial, along with the nice language-level package managers, make this a non-issue.
Finally, there's an institutional bias to not rock the boat (which I totally agree with) and change things that are already working fine, along with a "nobody got fired buying IBM" kind of thing with Google and Facebook being two prominent companies using monorepos (which they can get away with by having over a thousand engineers each to manage the infrastructure and build/rebuild their own VCSs to deal with the problems inherent to monorepos that most companies don't have the resources and/or skills to replicate).
EDIT: Oh, I forgot, I'm not advocating a service-oriented architecture as the only way to do things, I'm just advocating that whatever your architecture, you should isolate the deployables from each other and make all dependencies between them explicit, so you can more easily write tooling to automatically catch bad deploy states, and more easily train new hires on what talks to/uses what, since it's explicitly (and required to be) documented.
If that still means a monorepo for your company's single service and a couple of tiny repos for small libraries you open source, that's fine. If it means 1000 repos for each microservice you deploy multiple times a day, that's also fine (good luck!).
Most likely it means something like 3-10 repos for most companies, which seems like the right range for Miller's Law) ( https://en.wikipedia.org/wiki/The_Magical_Number_Seven,_Plus... ) and therefore good for organizing code for human consumption.
If you expect to need to step into a function call when debugging, then it's too tightly coupled to spin out. You should be able to look at the arguments to the call and the response and determine if it's correct (and if not, now you have isolated a test case to take to the other service and continue debugging there).
If the interface will change so often that you expect it will be a problem that it's in a separate repository, if you expect that you will always need to deploy in tandem, then it's too tightly coupled to spin out.
The advantage of micro services is the separation in fact of things that are separate in logic. The complexity of systems grows super-linearly, so it's easier to reason about and test several smaller systems with clear (narrow) interfaces between them than one big. It's easier to isolate faults. It's harder to accidentally introduce bugs in a different part of the system when the system doesn't have a different part. If done right, scaling can be made easier. But these are hard architectural questions, there's no clear-cut rule for when you should spin off a new service and when you should keep things together.
Someone else mentioned separating the shopping app from the payment system for an ecommerce business, which even has security benefits. I think that's an excellent example.
Edit: Another clear benefit is that you can choose different languages, libraries, frameworks and paradigms for different parts of the code. You can write your boring CRUD backend admin app in Ruby on Rails, your high-performance calculation engine in Rust and your user-facing app in Node.js (so the front- and backend an share Javascript validation code).
As for advantages, microservices tend to keep code relatively simple and free from complex inheritance schemes. There's rarely a massive tangled-up engine full of special cases in the mix, as there often is in monolithic apps. This substantially decreases technical debt and learning curve, and can make it simple to understand the function an isolated microservice performs.
There is the obvious advantage that if you have disparate applications executing nearly-identical logic to read or write data to the same location, and the application platforms can't execute the same library code, you can centralize that logic into an HTTP API, which reduces maintenance burden and prevents potentially major bugs.
My opinion is that adopting microservices as a paradigm leads to a slow, difficult-to-debug application, primarily because people take the "micro" in microservices too seriously. One shouldn't be afraid to split functionality out into an ordinary service after it's been shown to be reasonable to do so.
With microservices, the production version of their service would conceivably be stable. It moves the contract from the repo to the state of production services.
With a monolithic repo done right, the other teams broke their build of their branch, and it's up to them to resolve it. You, meanwhile, are perfectly happy working on your branch. When their changes are mergeable into trunk, then they may merge them, not before — and likewise for you.
With multiple repos, they break your build, but don't know it. You don't know it either, until you update your copies of their repos — and now you have to figure out what they did, and why, and how to update your logic to handle their new control flow, and then you update again and get to do it again, until finally you ragequit and go live in a log cabin with neither electricity nor running water.
You don't debug distributed systems by tracing into remote calls and jumping into remote code. You debug it by comparing requests and responses (you use discrete operations, right) with the specified requests and responses, and then opening the code that has a problem¹.
It calls for completely different tooling, not for a "better debugger".
1 - Or the specs, because yes, now that your system is distributed you also have to debug the specs. Why somebody would decide on doing that for no reason at all? Yet lots of people do.
Multiple platforms is not a problem and generally a good thing as long as it's not excessive. You don't want to be in a case where you have the same number of different platforms as developers or anything like that. I'm guessing there is a rule of thumb here, but I'm not sure what it would be. Max 1 different platform per 5 developers? Something like that.
I do wish people would stop conflating "running in a different service" and "loose coupling". They are completely orthogonal.
I've worked on some horrendously tightly coupled microservices.
Unless you can coax dOSGi into working (which is tons of fun), then you can have services tightly coupled to other services running on entirely different machines causing frequent (and hilarious) cascades of bundle failures whenever the network hiccups.
OSGi is a trigger word for me now. I've worked on two large OSGi projects (previous job and current job) and it's always the same. Sh*t is always broken (and my lead still insists that OSGi is the one true way to modular bliss). And the OSGi fanboys always say "Your team is using it wrong!" Which very well might be true, but I no longer care. Apparently it's just too damn hard to get a team of code monkeys to respect service boundaries when OSGi makes it so damn easy to ignore them.
If I'm ever in a position of getting to design a new software architecture (hasn't happened in 10 years, but hey I can dream), I'll punch anyone who suggests "OSGi" to me right in the face.
That's a good point. I think this thought extrapolates to other parts of software engineering as well. Sometimes writing very modular and decoupled software from the beginning is very hard for a small team, and we can't see well if this is the best approach since it's also hard to grasp the big picture.
I'm currently facing this issue. I'm trying to write very modular and reusable applications, but now I'm paralyzed trying to picture the best patterns to use, where should I use a facade, a decorator, etc. I think I'll adopt this strategy for myself--only focus on modularizing from the beginning if it'd lead to extra work otherwise.
Microservices also make it much harder to refactor the code which you often need to do in the early stage of a project.