Check out Topaz [0], which uses OPA as its decision engine, but adds a data plane that is based on the ReBAC ideas explored in the Google Zanzibar [1] paper.
Disclaimer: I work on the team [2] that builds and maintains the Topaz project.
[1] https://research.google/pubs/zanzibar-googles-consistent-glo...
https://www.openpolicyagent.org/docs/latest/management-bundl...
To the main point, what you described reflects the current trends in authorization: define a data model, define data that adheres to that model, write declarative rules that consume that model, and make a decision based on those rules.
Where things really start to differ is the kind of data the rules bind against and how you write those rules. E.g. OPA is often used for either ABAC (attributes) or RBAC (roles), while OpenFGA focuses on ReBAC (relationships). Each has its own complexity tradeoffs, depending on the system being implemented. How easy or difficult a system makes these kinds of checks has a significant impact on how you write policies.
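To make the difference in "what the rules bind against" concrete, here is a minimal Go sketch of the three styles of check. All types, names, and data are hypothetical and chosen for illustration; none of this is the API of OPA, OpenFGA, or Topaz.

```go
package main

import "fmt"

// User carries the data that RBAC and ABAC rules bind against.
// (Hypothetical type, for illustration only.)
type User struct {
	ID         string
	Roles      []string          // RBAC: roles assigned to the user
	Attributes map[string]string // ABAC: arbitrary user attributes
}

// Tuple is a ReBAC-style relationship: Subject has Relation to Object.
type Tuple struct {
	Subject, Relation, Object string
}

// rbacAllow grants access if the user holds the required role.
func rbacAllow(u User, role string) bool {
	for _, r := range u.Roles {
		if r == role {
			return true
		}
	}
	return false
}

// abacAllow grants access if the user's attribute has the wanted value.
func abacAllow(u User, key, want string) bool {
	got, ok := u.Attributes[key]
	return ok && got == want
}

// rebacAllow grants access if a matching relationship tuple is stored.
// Real ReBAC systems also follow indirect relationships; this only
// checks direct tuples.
func rebacAllow(tuples []Tuple, subject, relation, object string) bool {
	for _, t := range tuples {
		if t.Subject == subject && t.Relation == relation && t.Object == object {
			return true
		}
	}
	return false
}

func main() {
	alice := User{
		ID:         "alice",
		Roles:      []string{"editor"},
		Attributes: map[string]string{"department": "finance"},
	}
	tuples := []Tuple{{"alice", "owner", "doc:budget"}}

	fmt.Println("RBAC: ", rbacAllow(alice, "editor"))
	fmt.Println("ABAC: ", abacAllow(alice, "department", "finance"))
	fmt.Println("ReBAC:", rebacAllow(tuples, "alice", "owner", "doc:budget"))
}
```

The rule logic is trivial in all three cases; what changes is the shape of the data you have to keep available to the authorization layer.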
Hope this helps!
The advantage is that it's a single container image (or Go binary, if that's how you want to run it), and it supports a combination of RBAC, ABAC, and ReBAC. ABAC is accomplished via the Rego language, which is as "standard" as it comes in the cloud-native world.
OPA replaces a complex, hard-coded, and largely inscrutable UAM model with a (still complex) but flexibly defined, independently testable, and easily inspectable single-responsibility model.
I like that OPA has built-in support for testing rulesets. The partial evaluation feature is amazing, and makes it easy to apply UAM filters to endpoints that return large sets of data (we have consistent query APIs across the app, so we could do this with a relatively simple OPA-aware proxy).
It's not all sunshine and roses, and the result might seem overly complex for a lot of use cases, but in our case I think OPA has provided a nice clean abstraction and enabled us to disentangle our UAM from the rest of our code and move more quickly overall.
1: Yeah, we can use OPA to get rid of all this legacy spaghetti code!
2: Wow this PoC really proves out the idea!
3: Whoa we have three use cases now running in production!
4: Wait, these remaining 20 use cases are way more complex. To our surprise, all this legacy spaghetti code _exists for a reason_.
5: We now have 5 use cases in production but the Rego is now quite convoluted and our application logic has actually increased in complexity.
6: Red button: okay this is going horribly wrong. Back out this whole thing.
7: Recognition: the reason this has gone horribly wrong is because the spaghetti code combines pure logic and side effects in a way that did not map well with OPA.
8: Regroup: first step is to refactor all the legacy code and separate policy logic from side effects in a meaningful way.
9: Refactor: implement the above redesign. The policy classes all now map naturally to Rego for all 23 use cases! Let's do it!
10: Reality: we don't want to. Our codebase is well-structured now and we like it. Adding OPA now feels like an unnecessary layer, an additional potential for network timeouts etc to creep in, an extra thing to maintain, an extra special case to handle in our safe deployment pipeline, an extra language to train developers on. Now _maybe_ if we ever wanted other teams to write up and maintain their own Rego policies, then _maybe_ we'd consider going with it in the future, but for now the reality is our team would end up doing that work for them anyway, and it doesn't seem worth the tradeoff.
Anyway, lesson learned: don't expect it to magically clean up all the garbage in your existing code. You'll do it wrong and things will be worse than when you started. Clean that up first, and _then_ decide whether and how you want to adopt OPA for your remaining needs.
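As a rough illustration of what "separate policy logic from side effects" (step 8) can look like, here is a minimal Go sketch. The refund scenario, type names, and thresholds are all made up for the example; the point is only that the decision is a pure function and the side effect happens elsewhere.

```go
package main

import (
	"errors"
	"fmt"
)

// RefundRequest is a hypothetical input to a policy decision.
type RefundRequest struct {
	Role        string
	AmountCents int
	AccountOpen bool
}

// Decision is the policy result plus the reason behind it.
type Decision struct {
	Allow  bool
	Reason string
}

// decideRefund is pure policy logic: no I/O, no mutation, and the same
// input always yields the same output. Logic in this shape is what maps
// naturally onto a policy engine (or can simply stay in application code).
func decideRefund(r RefundRequest) Decision {
	switch {
	case !r.AccountOpen:
		return Decision{false, "account is closed"}
	case r.Role == "manager":
		return Decision{true, "managers may refund any amount"}
	case r.Role == "support" && r.AmountCents <= 5000:
		return Decision{true, "support may refund up to 5000 cents"}
	default:
		return Decision{false, "no rule matched"}
	}
}

// issueRefund owns the side effect. It asks for a decision first and
// only then calls pay, which stands in for the real payment call.
func issueRefund(r RefundRequest, pay func(amountCents int) error) error {
	d := decideRefund(r)
	if !d.Allow {
		return errors.New("refund denied: " + d.Reason)
	}
	return pay(r.AmountCents)
}

func main() {
	req := RefundRequest{Role: "support", AmountCents: 7500, AccountOpen: true}
	err := issueRefund(req, func(int) error { return nil })
	fmt.Println("result:", err)
}
```

Once the code looks like this, moving decideRefund into Rego is mechanical, which is also why it may no longer feel necessary.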
The thinking is we'll have some basic built-in policies (like admins can do X, editors can do Y, etc.) but also allow users to configure their own policies if they want, by writing Rego and loading their policy rules at startup time (via config). We'd document the inputs that we pass to the evaluation call, such as request headers, IP, role, etc.
I'm curious if anyone has ever tried something like this or similar?
[0] https://github.com/flipt-io/flipt
[1] https://www.openpolicyagent.org/docs/latest/integration/#int...
You can find our talk here https://www.styra.com/resources/videos/snap-inc--snaps-journ...
I found it relatively easy to use and at a good level of abstraction to make the policies relatively reusable.
The biggest advantage to OPA was the flexibility. This enabled not just an authorization decision, but the why behind it. No more questions of why did this person/system gain (or was denied) access, combing through dozens of rules to find the matching statements. Just pull up the log and read the results… This is incredibly useful during audits.
Cedar could not provide that level of detail (or so I was told by AWS representatives selling their hosted version).
I understand why restricting the possibilities with an external DSL might be a good idea, but I consider Rego to be too restricted. I mean, in the end a policy is just a function saying basically "yes" or "no" (I know, it's not that simple with OPA, but it boils down to access yes/no, anyway).
package play
import rego.v1
default allow := false
allow if {
    user := input.id
    user in data.groups.A
    user in data.groups.B
    not user in data.groups.C
}
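For comparison with the "a policy is just a function" view, the same rule can be sketched as a plain Go function. The shape of the group data (a map from group name to a list of member IDs) is an assumption, since the playground's data document is not shown here.

```go
package main

import "fmt"

// contains reports whether id is present in members.
func contains(members []string, id string) bool {
	for _, m := range members {
		if m == id {
			return true
		}
	}
	return false
}

// allow mirrors the Rego rule: the user must be in groups A and B and
// must not be in group C. Anything else is denied, matching
// `default allow := false`.
func allow(userID string, groups map[string][]string) bool {
	return contains(groups["A"], userID) &&
		contains(groups["B"], userID) &&
		!contains(groups["C"], userID)
}

func main() {
	// Hypothetical data standing in for data.groups.
	groups := map[string][]string{
		"A": {"alice", "bob"},
		"B": {"alice", "bob"},
		"C": {"bob"},
	}
	fmt.Println("alice:", allow("alice", groups))
	fmt.Println("bob:  ", allow("bob", groups))
}
```

The logic is the same size either way; what the DSL buys is that the rule lives outside the service and can be loaded, tested, and audited separately.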
[0] https://play.openpolicyagent.org/p/adMo9TE9bS
One area that is a constrained and narrow use case is the actual application-level permissions, e.g. what a user can do inside your service. Having hand-rolled this at various companies - and gone through the inevitable rebuilds as requirements change, such as adding a new, product packaging updates etc - you do end up with a complex web of logic, either in your codebase or as Rego.
For these application-level permissions - where the requirements really come from the product/business rather than engineering - I always felt there could be a simpler way of defining these rules. Policies need to be in a format a business user can understand, and enforcing them needs to be extremely responsive, as checks are in the blocking path of every request. This needs to work at large scale, all whilst making every decision auditable to tick all the regulatory and compliance boxes around access controls.
To this effect we began working on Cerbos [0] a few years ago, which initially targets that one specific use case. It models policy in simple YAML [1] (love it or hate it!) and takes a stateless approach, meaning it is infinitely scalable with none of the headache of synchronizing information about your users or resources to the authZ layer. Critically, it also generates that single audit log of decisions.
Disclaimer: I work on the team that builds and maintains Cerbos[2].
[0] https://github.com/cerbos/cerbos
[1] https://play.cerbos.dev/p/XhkOi82fFKk3YW60e2c806Yvm0trKEje
There's some other interesting work with SPIFFE/SPIRE that I've been investigating for $WORK, which could be useful to some on this path: https://spiffe.io/docs/latest/microservices/envoy-opa/readme...
Disclaimer: I work on this but it’s free, & open source!
1. Define policies using the declarative language Rego.
2. Deploy OPA alongside your service as a sidecar in Kubernetes.
3. Have your service query OPA when it needs to make a policy decision, passing the current state/context as input.
4. OPA evaluates the policies written in Rego against the input and returns a decision (allow or deny) back to your service.
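Steps 3 and 4 can be sketched in Go against OPA's REST Data API, which takes a JSON body of the form {"input": ...} and replies with {"result": ...}. The sidecar address, the policy path (httpapi/authz/allow), and the fields of Input are assumptions for illustration; adapt them to your own policy package.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"time"
)

// Input is a hypothetical request context passed to the policy.
type Input struct {
	User   string `json:"user"`
	Action string `json:"action"`
	Path   string `json:"path"`
}

// buildQuery wraps the input in the {"input": ...} envelope.
func buildQuery(in Input) ([]byte, error) {
	return json.Marshal(map[string]Input{"input": in})
}

// parseDecision reads a boolean {"result": ...} response. A missing
// result (the rule was undefined) is treated as deny.
func parseDecision(body []byte) (bool, error) {
	var resp struct {
		Result *bool `json:"result"`
	}
	if err := json.Unmarshal(body, &resp); err != nil {
		return false, err
	}
	return resp.Result != nil && *resp.Result, nil
}

// queryOPA posts the input to the given Data API URL and returns the decision.
func queryOPA(client *http.Client, url string, in Input) (bool, error) {
	payload, err := buildQuery(in)
	if err != nil {
		return false, err
	}
	resp, err := client.Post(url, "application/json", bytes.NewReader(payload))
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return false, fmt.Errorf("opa returned status %d", resp.StatusCode)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return false, err
	}
	return parseDecision(body)
}

func main() {
	// Assumed sidecar address and policy path; both are placeholders.
	url := "http://localhost:8181/v1/data/httpapi/authz/allow"
	client := &http.Client{Timeout: 2 * time.Second}

	allowed, err := queryOPA(client, url, Input{User: "alice", Action: "GET", Path: "/reports"})
	if err != nil {
		// Fail closed: an unreachable sidecar means deny.
		fmt.Println("deny (error):", err)
		return
	}
	fmt.Println("allowed:", allowed)
}
```

Note the explicit timeout and fail-closed handling: the sidecar is now in the blocking path of every request, so both choices need to be made deliberately.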
I found it hard to convince everyone around me to use OPA/Rego and wrap it in a managed service. The main objection: wrapping another DSL (domain-specific language) is hard.
However, it was relatively simple to convince my team to use Ladon, a feature-complete Go library: https://github.com/ory/ladon
Ladon is inspired by AWS IAM Policies.
{
  "description": "One policy to rule them all.",
  "subjects": ["users:<peter|ken>", "users:maria", "groups:admins"],
  "actions" : ["delete", "<create|update>"],
  "effect": "allow",
  "resources": [
    "resources:articles:<.*>",
    "resources:printer"
  ],
  "conditions": {
    "remoteIP": {
      "type": "CIDRCondition",
      "options": {
        "cidr": "192.168.0.1/16"
      }
    }
  }
}
All policies are loaded on app start, stored in memory (not in a DB), and checked by a small middleware that calls the following function.
func (l *Ladon) DoPoliciesAllow(r *Request, policies []Policy) (err error)
https://github.com/ory/ladon/blob/972387f17e29c529ad3ff42a84...
The performance hit is negligible. The code is very simple, hackable, and open to further optimisation.
Ladon is very fast. It's possible to run all user groups against all CRUD routes and get a basic permission matrix, or to build some simple UI forms to test conditions for better control.
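The permission-matrix idea can be sketched in plain Go. This is not Ladon's API or implementation: Policy, allowed, and matrix are simplified, hypothetical stand-ins that use anchored regular expressions, ignore conditions, and only model "allow" policies. With Ladon itself you would call its own check function for each cell instead.

```go
package main

import (
	"fmt"
	"regexp"
)

// Policy is a simplified stand-in for a Ladon-style allow policy.
type Policy struct {
	Subjects, Actions, Resources []*regexp.Regexp
}

// Route is one cell column in the matrix: an action on a resource.
type Route struct{ Action, Resource string }

// pat compiles each expression so that it must match the whole string.
func pat(exprs ...string) []*regexp.Regexp {
	out := make([]*regexp.Regexp, 0, len(exprs))
	for _, e := range exprs {
		out = append(out, regexp.MustCompile("^(?:"+e+")$"))
	}
	return out
}

// matchAny reports whether s matches at least one pattern.
func matchAny(patterns []*regexp.Regexp, s string) bool {
	for _, p := range patterns {
		if p.MatchString(s) {
			return true
		}
	}
	return false
}

// allowed is true if any policy matches subject, action, and resource.
func allowed(policies []Policy, subject, action, resource string) bool {
	for _, p := range policies {
		if matchAny(p.Subjects, subject) && matchAny(p.Actions, action) && matchAny(p.Resources, resource) {
			return true
		}
	}
	return false
}

// matrix evaluates every group against every route.
func matrix(policies []Policy, groups []string, routes []Route) map[string]map[Route]bool {
	m := make(map[string]map[Route]bool, len(groups))
	for _, g := range groups {
		m[g] = make(map[Route]bool, len(routes))
		for _, r := range routes {
			m[g][r] = allowed(policies, g, r.Action, r.Resource)
		}
	}
	return m
}

func main() {
	// Hypothetical policies, groups, and routes.
	policies := []Policy{
		{pat("groups:admins"), pat("create|read|update|delete"), pat("resources:articles:.*")},
		{pat("groups:editors"), pat("create|read|update"), pat("resources:articles:.*")},
		{pat("groups:.*"), pat("read"), pat("resources:articles:.*")},
	}
	groups := []string{"groups:admins", "groups:editors", "groups:viewers"}
	routes := []Route{
		{"create", "resources:articles:1"},
		{"read", "resources:articles:1"},
		{"delete", "resources:articles:1"},
	}

	m := matrix(policies, groups, routes)
	for _, g := range groups {
		for _, r := range routes {
			fmt.Printf("%-16s %-7s %-22s %v\n", g, r.Action, r.Resource, m[g][r])
		}
	}
}
```

Because everything is in memory, generating the full matrix at startup (or in a test) is cheap, and diffing it between releases is a simple way to catch unintended permission changes.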
P.s. Feel free to ping me in private @reactima (github, telegram) if you want to discuss the edge cases for the above.
One significant complication that all centralized authorization solutions share is that you end up needing to reproduce application data in the authorization system. We've been doing a lot of work in this area to simplify data management and have some beta functionality available. I'll include some links to the docs for those.
Sync and reconcile data: https://www.osohq.com/docs/guides/data/sync-data#initial-syn...
Filter lists with decentralized data (about halfway down): https://www.osohq.com/docs/guides/enforce/filter-lists