> If you're running on a simple system architecture,
His point was that even a feature flag system in a complex environment with substantial functional and system requirements is worth building vs buying. If your needs are even simpler, then this statement is even more true!
I'm having a hard time making sense out of the rest of your comment, but in larger businesses the kinds of things you're dealing with are:
- low latency / staleness: You flip a flag, and you'll want to see the results "immediately", across all of the services in all of your datacenters. Think on the order of one second vs., say, 60s.
- scalability: Every service in your entire business will want to check many feature flags on every single request. For a naive architecture this would trivially turn into ungodly QPS. Even if you took a simple caching approach (say, cache and flush on the staleness window), you could be talking hundreds of thousands of QPS across all of your services. You'll probably want some combination of pull and push. You'll also need each service to be able to opt into the specific sets of flags that it cares about. Some services will need to be more promiscuous and won't know in advance exactly which flags they need.
- high availability: You want to use these flags everywhere, including your highest availability services. The best architecture for this is one with no hard dependency on a live service, so that flag checks keep working even if the flag service is down.
- supports complex rules: Many flags will have fairly complicated rules requiring local context from the currently executing service call. Something like: "If this customer's preferred language code is ja-JP, and they're using one of the following devices (Samsung Android blah, iPhone blargh), and they're running versions 1.1-1.4 of our app, then disable this feature". You don't want to duplicate this logic in every individual service, and you don't want to make an outgoing service call (remember, H/A), so you'll be shipping these rules down to the microservices, and you'll need a rules engine that they can execute locally.
- supports per-customer overrides: You'll often want to manually flip flags for specific customers regardless of the rules you have in place. These exclusion lists can get "large" when your customer base is very large, e.g. thousands of manual overrides for every single flag.
- access controls: You'll want to dictate who can modify these flags. For example, some eng teams will want to allow their PMs to flip certain flags, while others will want certain flags hands off.
- auditing: When something goes wrong, you'll want to know who changed which flags and why.
- tracking/reporting: You'll want to see which feature flags are being actively used so you can help teams track down "dead" feature flags.
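
To make the rules and per-customer override bullets a bit more concrete, here's a rough sketch in Python of what local evaluation might look like. Everything in it is made up for illustration (`Flag`, `Rule`, `version_between`, the flag name, the device names, the customer IDs). In a real system the rules would arrive as serialized data shipped down from the flag service rather than as lambdas, and this isn't meant to describe how LaunchDarkly or any other product does it.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# Local context of the currently executing service call (hypothetical shape).
Context = Dict[str, Any]
Condition = Callable[[Context], bool]


def version_between(lo: str, hi: str) -> Condition:
    """Build a condition matching app versions in [lo, hi], compared numerically."""
    def parse(version: str) -> tuple:
        return tuple(int(part) for part in version.split("."))

    return lambda ctx: parse(lo) <= parse(ctx.get("app_version", "0")) <= parse(hi)


@dataclass
class Rule:
    conditions: List[Condition]  # all conditions must hold for the rule to match
    value: bool                  # flag value to return when the rule matches

    def matches(self, ctx: Context) -> bool:
        return all(cond(ctx) for cond in self.conditions)


@dataclass
class Flag:
    name: str
    default: bool
    rules: List[Rule] = field(default_factory=list)
    overrides: Dict[str, bool] = field(default_factory=dict)  # customer_id -> value

    def evaluate(self, ctx: Context) -> bool:
        # Manual per-customer overrides win regardless of the rules.
        customer_id = ctx.get("customer_id")
        if customer_id in self.overrides:
            return self.overrides[customer_id]
        # First matching rule decides; no network call is involved.
        for rule in self.rules:
            if rule.matches(ctx):
                return rule.value
        return self.default


# Hypothetical flag mirroring the ja-JP / device / version example above.
new_checkout = Flag(
    name="new_checkout",
    default=True,
    rules=[
        Rule(
            conditions=[
                lambda ctx: ctx.get("language") == "ja-JP",
                lambda ctx: ctx.get("device") in {"device-a", "device-b"},
                version_between("1.1", "1.4"),
            ],
            value=False,
        )
    ],
    overrides={"customer-42": True},
)

request_ctx = {
    "customer_id": "customer-7",
    "language": "ja-JP",
    "device": "device-a",
    "app_version": "1.2",
}
print(new_checkout.evaluate(request_ctx))  # False: the rule disables the feature
```

The override lookup is a plain dict hit, so thousands of overrides per flag stay cheap to check, and nothing in `evaluate` depends on the flag service being reachable at request time.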
This list isn't exhaustive (just what I could remember off the top of my head), but you can start to see why a feature flag system is an endeavor in and of itself and why products like LaunchDarkly exist.