Now when people argue “because decoupling,” I hear, “You don’t get as much notification that you just broke a downstream system.”
https://www.datadoghq.com/knowledge-center/distributed-traci...
Unless you have a single monolith, you’re going to face issues with versioning whether it’s event based or API based. In each case you can usually add new properties to a message, but you can’t remove properties or change their types. If you need that, create a new version.
The author does a lot of videos on the event sourcing topic. Event driven I get. It works well in several applications I've helped to build over the last 15 years. But event sourcing? I truly don't get it. Yeah, I get that it's nice in terms of auditing to see every change to an entity and who made it, or to replay up to change x on date y, but that really is a niche requirement.
I'm not sure what point is being made here. It's good that you can do that - but are you implying that that's not possible in an API-driven system?
It's not just about auditing, it's also about transactionality and atomicity.
If you want to withdraw $5 from your account, the traditional approach of locking, updating everything, and unlocking (in other words, wrapping everything in a transaction) doesn't scale as well as simply recording the transaction as an event. Implementation-wise this withdrawal can involve updating two accounts and updating the audit/account transaction logs. We also want this to scale, since our bank has millions of customers all operating more or less concurrently. A distributed log (like Kafka) is easy to scale and easy to reason about: you just insert the transaction record.
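As a rough sketch of that idea (plain Python, hypothetical event names, not any particular framework): withdrawals and deposits are only ever appended to a log, and the balance is derived by folding over it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Deposited:
    account: str
    amount: int  # cents

@dataclass(frozen=True)
class Withdrawn:
    account: str
    amount: int  # cents

log = []  # stand-in for a distributed log like Kafka

def record(event):
    log.append(event)  # the only write is an append; no locks on derived state

def balance(account: str) -> int:
    # Derived state: fold over the immutable event history
    total = 0
    for e in log:
        if e.account == account:
            total += e.amount if isinstance(e, Deposited) else -e.amount
    return total

record(Deposited("alice", 10_00))
record(Withdrawn("alice", 5_00))
```

The audit trail comes for free here: the log is the source of truth, and any account balance is just one view derived from it.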
Another driver/flavour for something like event sourcing is what some might call state-based or state-oriented programming. That is, instead of modifying state directly you synchronize state via events. This lets you, for example, build state machines around those events, which again leads to code that is easier to reason about (and test).
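A minimal sketch of that style (hypothetical order states, my example rather than the comment author's): state changes only through a table of allowed event transitions, which makes the machine trivially testable.

```python
# Allowed transitions: (current state, event) -> next state
TRANSITIONS = {
    ("new", "paid"): "awaiting_shipment",
    ("awaiting_shipment", "shipped"): "in_transit",
    ("in_transit", "delivered"): "done",
}

def apply_event(state: str, event: str) -> str:
    next_state = TRANSITIONS.get((state, event))
    if next_state is None:
        raise ValueError(f"event {event!r} is not valid in state {state!r}")
    return next_state

state = "new"
for event in ["paid", "shipped", "delivered"]:
    state = apply_event(state, event)
```

Invalid sequences fail loudly instead of silently corrupting state, which is much of the "easier to reason about" payoff.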
There are of course other ways to do auditability.
Event Sourcing + Projections provide a nice way to build multiple models/views from the same dataset. This can provide a lot of simplification for client code.
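For instance (a toy sketch, not tied to any specific framework): the same event stream can feed an order-centric view and a warehouse-centric view, and each consumer only ever sees the shape it needs.

```python
events = [
    {"type": "item_added", "order": 1, "sku": "A", "qty": 2},
    {"type": "item_added", "order": 1, "sku": "B", "qty": 1},
    {"type": "item_added", "order": 2, "sku": "A", "qty": 5},
]

# Projection 1: line items per order (what an order page wants)
per_order = {}
for e in events:
    per_order.setdefault(e["order"], []).append((e["sku"], e["qty"]))

# Projection 2: total demand per SKU (what a warehouse view wants)
per_sku = {}
for e in events:
    per_sku[e["sku"]] = per_sku.get(e["sku"], 0) + e["qty"]
```

Adding a third view later means writing another fold over the same events; no migration of existing models is needed.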
There are also other companies, which do the typical snapshot and roll up to the current time, when they start the services, that need the data without having access to the database.
That's not exactly an obscure feature exclusive to Datadog. Off the top of my head, both AWS and Azure support distributed tracing with dedicated visualization in their X-Ray and Application Insights services.
When you've never grown out of a single node domain but you do event driven "because scaling" or whatever, you've shot yourself in the foot amazingly hard.
But people often forget there are trade-offs to everything and if you don't have these hard problems, you're giving yourself only headaches.
My pet-peeve is "decoupling" - it's treated as holy with only benefits and no downsides. But it's actually again a level of complexity - unless you need it, tightly coupled code will be easier to write, read, debug etc.
As an event producer as long as you follow reasonable backwards-compatibility best practices then you should be pretty safe from breaking things downstream. As a consumer, follow defensive programming and allow for idempotency in case you need to reprocess an event. Pretty straightforward once you get the hang of things.
That can protect you from "downstream can't even read the message anymore" but it doesn't help you with the much more common "downstream isn't doing the right thing with the message anymore" problem. Schema evolution is kinda like schema'd RPC calls vs plain JSON: it will protect you from "oops, we sent eventId instead of event_id" type of errors, but won't prevent you from making logical errors. In a larger org, this can turn into delayed-discovery nightmares.
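To make the distinction concrete, here is a tolerant-reader consumer sketch (hypothetical field names). It survives wire-level evolution such as added optional fields, but nothing in it can catch a producer quietly changing what a field *means*.

```python
def handle(payload: dict):
    # Tolerant reader: require what you truly need, default the rest,
    # and silently ignore unknown fields.
    event_id = payload["event_id"]             # required: fail loudly if missing
    currency = payload.get("currency", "USD")  # optional field added later
    return event_id, currency

# An old producer (no currency) and a new one (extra fields) both parse fine...
old = handle({"event_id": "e1"})
new = handle({"event_id": "e2", "currency": "EUR", "brand_new_field": 1})
# ...but if the producer starts sending net amounts where it used to send
# gross, this code keeps "working" and nobody is alerted.
```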
A synchronous API call can give you back an error response and alert you immediately that something is wrong. The system notifies you directly.
A downstream event consumer may fail in ways entirely off of your team's radar. The downstream team starts getting alerts. Whether or not those alerts make it immediately obvious to them that it's your fault... that depends on a bunch of factors.
I don't know how this could be true. Events are things - nouns which can be backed-up, replicated, stored, queried, rendered, indexed and searched over.
I generally like event-driven architecture, but I need to admit that debuggability is sacrificed where it matters most.
And as a consumer, many independent tasks can be triggered by the same event.
I'm working on a system right now and because of events, it's very easy for me to write a handler for when a certain type of record is created in the database. My feature depends on knowing that new record was made so we can send some emails and do other things.
The people that wrote the code that creates the record, didn't have to do anything to support the feature.
But I agree that it's not the right solution for every problem. But there are certain problems it solves really well.
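A stripped-down in-process version of that wiring (hypothetical names, not the commenter's actual system): the code that emits `user_created` never learns that an email handler was added later.

```python
handlers = {}

def on(event_type):
    """Decorator: register a handler for an event type."""
    def register(fn):
        handlers.setdefault(event_type, []).append(fn)
        return fn
    return register

def emit(event_type, payload):
    for fn in handlers.get(event_type, []):
        fn(payload)

sent = []

@on("user_created")  # added by a different team; the producer is untouched
def send_welcome_email(user):
    sent.append(user["email"])

# The record-creating code only ever does this:
emit("user_created", {"email": "a@example.com"})
```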
Right up until you need to change something about the event because the business logic it represents has changed. Then you suddenly need to track down all the systems that have been relying on it, including that one that nobody knows anything about and always forgets exists because some guy decided to implement the service in erlang and nobody who ever touched it even works at the company anymore.
Don't take it into consideration and you're fucked.
Source: previous "seniors" didn't take it into consideration, they left
Same issue as microservices: there are people who want to use the paradigm but not do the investment in monitoring/tooling.
Event driven architecture, to me is itself an antipattern.
It seems like a replacement for batch processing. Replayable messages are AWESOME. Until you encounter the complexity of getting a system to actually replay them consistently.
As for the author's video: while there was some truth in there, it was a little thin compared to the complexity of these architectures. I believe that even though Kafka acts the part of "dumb pipe", it doesn't stay dumb for long, and the n distributions of Kafka logs in your organization could be 1000x more expensive to maintain than a monolithic DB and a monolithic API.
Yes, it appears auditable, but is it? The big argument for replayability is that, unlike an API that falls over, there's no data loss. If you work with Kafka long enough you'll realize that data loss will become a problem you didn't think you had. You'll have to hire people to "look into" data loss problems constantly with Kafka. It's just too much infrastructure to even care about.
There's also something ergonomically wrong with event driven architecture. People don't like it. And it also turns people into robots who are "not responsible" for their product. There's so much infrastructure to maintain that people just punt everything back to the "enterprise Kafka team".
The whole point of microservices was to enable flexibility, smart services and dumb pipes, and effective CI/CD and devops.
We are nearing the end of microservices adoption, whether event or request driven. In mature organizations it seems to me that request driven is winning by a large margin over event driven.
It may be counterintuitive, but the time to market and cost to maintain of request driven architecture are way, way lower.
In my experience programmers are very happy to do everything in the application (something database people often complain about). What kind of problems do you see?
> If you work with Kafka long enough you’ll realize that data loss will become a problem you didnt think you had. You’ll have to hire people to “look into” data loss problems constantly with Kafka.
Not my experience at all, and I've used Kafka at a wide range of companies, from household-name scale to startups. Kafka is the boring just-works technology that everyone claims they're looking for.
I'm no fan of microservices, but Kafka is absolutely the right datastore most of the time.
Not to mention certain observability vendors bleeding you for all those logs you now need to keep an eye on it.
Absolutely agreed on every point
Also, people need to understand that "event driven" has nothing to do with "event sourcing". Just don't keep all the events until eternity, because you can (and because some people think you should because "kafka").
But when I've done that testing, Kafka hasn't been the problem.
The problem I've run into most is that ordering is a giant fucking pain in the ass if you actually want consistent replayability and don't have trivial partitioning needs. Some consumers want things in order by customer ID, other consumers want things in order by sold product ID, others by invoice ID? Uh oh. If you're thinking you could easily replay to debug, the size and scope of the data you have to process for some of those cases just exploded. Or you wrote N times, once for each of those, and then hopefully your multi-write transaction implementation was perfect!
[0] in fairness, a lot of applications also don't guarantee that they never drop requests at all, obviously. 500 and retry and hope that you don't run out of retries very often; if you do, it's just dropped on the ground and it's considered acceptable loss to have some of that for most companies/applications.
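The ordering tension described above can be sketched in a few lines (toy code; a Kafka-style system where ordering is only guaranteed within a partition, and the partition is chosen from a single key):

```python
from zlib import crc32

def partition(key: str, n: int = 3) -> int:
    # Kafka-style: same key -> same partition, so ordering holds per key only
    return crc32(key.encode()) % n

events = [
    {"customer": "c1", "product": "p9", "seq": 1},
    {"customer": "c2", "product": "p9", "seq": 2},
    {"customer": "c1", "product": "p9", "seq": 3},
]

# Partitioned by customer, c1's events stay ordered relative to each other...
by_customer = [partition(e["customer"]) for e in events]

# ...but both c1 and c2 touched product p9, possibly on different partitions,
# so a consumer needing per-product order has no guarantee without either a
# second product-keyed write or a global re-sort across all partitions.
p9_partitions = {partition(e["customer"]) for e in events if e["product"] == "p9"}
```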
In pretty much all projects I worked with in recent years, people chop up the functionality into small separate services and have the events be serialised, sent over the network and deserialised on the other side.
This typically causes an enormous waste of efficiency and consequently makes applications much more complex than they need to be.
I have many times worked with apps which occupied huge server farms when in reality the business logic would be fine to run on a single node if just structured correctly.
Add to that the amount of technology developers need to learn when they join the project or the amount of complexity they have to grasp to be able to be productive. Or the overhead of introducing a change to a complex project.
And the funniest of all, people spending significant portion of the project resources trying to improve the performance of a collection of slow nanoservices without ever realising that the main culprit is that the event processing spends 99.9% of the time being serialised, deserialised, in various buffers or somewhere in transit which could be easily avoided if the communication was a simple function call.
Now, I am not saying microservices is a useless pattern. But it is so abused that it might just as well be. I think most projects would be happier if the people simply never heard about the concept of microservices and instead spent some time trying to figure how to build a correctly modularised monolithic application first, before they needed to find something more complex.
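The serialization point a couple of paragraphs up is easy to demonstrate (toy sketch): splitting a plain function across a "service" boundary means every hop pays an encode/decode round trip for the exact same answer.

```python
import json

def business_logic(order: dict) -> dict:
    # The actual work: trivial
    return {"order_id": order["id"], "total": sum(order["items"])}

def as_nanoservice(order: dict) -> dict:
    # The same work behind a service boundary: encode -> (network) -> decode,
    # in both directions, on every single call
    wire_in = json.dumps(order)
    result = business_logic(json.loads(wire_in))
    return json.loads(json.dumps(result))

order = {"id": 7, "items": [3, 4]}
```

Same result, but the in-process call does none of the copying, buffering, or parsing; at scale, that overhead is often where the "slow nanoservices" time goes.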
Microservices make sense when there are very strong organizational boundaries between the parts (you'd have to reinterview to move from one team to the other), or if there are technical reasons why two parts of the code cannot share the same runtime environment (such as being written in different languages), and a few other less common reasons.
The MAIN reason for microservices was that you could have multiple teams work on their services independently from each other. Because coordinating work of multiple teams on a single huge monolithic application is a very complex problem and has a lot of overhead.
But, in many companies the development of microservices/agile teams is actually synchronised between multiple teams. They would typically have common release schedule, want to deliver larger features across multitude of services all at the same time, etc.
Effectively making the task way more complex than it would be with a monolithic application
I think it really matters what sort of application you are building. I do exactly this with my search engine.
If it was a monolith it would take about 10 minutes to cold-start, and it would consume far too much RAM to run a hot stand-by. This makes deploying changes pretty rough.
So the index is partitioned into partitions, each with about a minute start time. Thus, to be able to upgrade the application without long outages, I upgrade one index partition at a time. With 9 partitions, that's a rolling 10%-ish service outage.
The rest of the system is another couple of services that can also restart independently, these have a memory footprint less than 100MB, and have hot standbys.
This wouldn't make much sense for a CRUD app, but in my case I'm loading a ~100GB state into RAM.
Because deploying the whole monolith takes a long time. There are ways to mitigate this, but in $currentjob we have a LARGE part of the monolith that is implemented as a library; so whenever we make changes to it, we have to deploy the entire thing.
If it were a service (which we are moving to), it would be able to be deployed independently, and much, much quicker.
There are other solutions to the problem, but "µs are bad, herr derr" is just a trope at this point. Like anything, they're a tool, and can be used well or badly.
- on the service provider, the implementation provides the actual functionality,
- on the client, the implementation of the interface is just a stub connecting to the actual service provider.
Thus you can sort of provide separation of services as an implementation detail.
However in practice very few projects elect to do this.
Proj:
|-proj-api
|-proj-client
|-proj-service
Both proj-client and proj-service consume/depend-on proj-api so they are in sync of what is going on.
Now, you can switch the implementation of the service to gRPC if you wanted with full source compatibility. Or move it locally.
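In Python terms, the same layout might look like this sketch (hypothetical names): both sides depend only on the shared interface, so the transport is an implementation detail.

```python
from abc import ABC, abstractmethod

# proj-api: the shared interface both sides depend on
class Greeter(ABC):
    @abstractmethod
    def greet(self, name: str) -> str: ...

# proj-service: the real implementation
class GreeterService(Greeter):
    def greet(self, name: str) -> str:
        return f"hello, {name}"

# proj-client: a stub that forwards to whatever transport it is given
class GreeterStub(Greeter):
    def __init__(self, transport):
        self.transport = transport  # a local call here; could be gRPC or HTTP
    def greet(self, name: str) -> str:
        return self.transport(name)

service = GreeterService()
client = GreeterStub(service.greet)  # swap transports without touching callers
```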
The core orchestration of the system was done via events on the bus, and nobody had any idea what was happening when a bug occurred. People would pass bugs around, “my code did the right thing given the event it got”, “well, my code did the right thing too”, and nobody understood the full picture because everyone was stuck in their own silo. Event driven architectures encourage this: events decouple systems such that you don’t know or care what happens when you emit a message, until one day it’s emitted with slightly different timing or ordering or different semantics, and things are broken and nobody knows why.
The worst part is that the software is basically "take user input, do process A on it, then do process B on that, then do process C on that." It could have so easily been a simple imperative function that called C(B(A(input))), but instead we made events for "inputWasEmitted", "Aoutput", "Boutput", etc.
What happens when system C needs one more piece of metadata about the user input? 3 PR’s into 3 repos to plumb the information around. Coordinating the release of 3 libraries. All around just awful to work with.
Oh and this is a very high profile piece of software with a user base in the 9 figure range.
(Wild tangent: holy shit is hard to get iOS to accept “do process” in a sentence. I edited that paragraph at least 30 times, no joke, trying every trick I could to stop it correcting it to “due process”. I almost gave up. I used to defend autocorrect but holy shit that was a nightmare.)
can you not just pick the original spelling in the autocomplete menu above the keyboard?
If you have some logic A and B running on user input, I wouldn't be splitting that across different services.
I can attest to this case study being 100% true. Our platform has been using EventStore as our primary database for 9 years going strong, and I'm still very happy with it. The key thing is that it needs to be done right from the very beginning; you can't do major architecture reworks later on, and you need an architect who really knows what they're doing. Also, you can't half-ass it: event sourcing, CQRS, etc. all had to be embraced the entire time, no shortcuts.
I will say though, the biggest downside is that scaling is difficult since you can't always rely on snapshots of data; sometimes you need to event source the entire model, and that can get data heavy. If you're standing up a new projector, you could be going through tens of millions of events before it is caught up, which requires planning. It is incredible though being able to have every single state change ever made on the platform available; the data guys love it, and it makes troubleshooting way easier since there are no secrets about what happened. The biggest con is that most people don't really understand it intuitively, since it's a very different way of doing things, which is why so many companies end up fucking it up.
Like I get the "message bus" architecture when you have a bunch of services emitting events and consumers for differing purposes but I don't think I would feel comfortable using it for state tracking. Especially when it seems really hard to enforce a schema / do migrations. CQRS also makes sense for this but only when it functions as a WAL and isn't meant to be stored forever but persisted by everyone who's interested in it and then eventually discarded.
I also tried doing it in a property setting, where profit margins were tight. The effort needed wasn’t worth the cost, and clients didn’t really care about the value proposition anyway. We pretty much replaced the whole layer with a more traditional crud system.
In web or business systems it works well for some(!) parts. You just shouldn't do everything that way - but often people get too excited about a solution, and then they tend to overdo it and apply it everywhere, even when it's not appropriate.
Always choose the golden middle path and apply patterns where they fit well.
Event driven and CQRS "entities" made logic and processing much easier to create/test/debug.
Primary issues:
1. Making sure you focus on the "Nouns" (entities), not the "Verbs".
2. Kafka requiring polling for consumers sucks if you want to "scale to zero".
3. Sharding of event consumers can be complicated.
4. People have trouble understanding the concepts and keep wanting to write "ProcessX" type functions instead of state machines and event handlers.
5. Retry/replay is complicated; better to reverse/replay. Dealing with side effects in replay is also complicated (does a replay generate the output events which trigger state changes in other entities?)
Been running now for 6 years, minimal downtime except for maintenance/upgrades.
In the process of introducing major new entity and associated changes, most of the system unaffected due to the decoupling.
(No stake in this one way or another, just curious.)
These systems are working fine, but maybe a common ground:
* very few services
* the main throughput is "fact" events (so something that did happen)
* what you get as "Event carried state transfer" is basically the configuration. One service owns it, with a classical DB and a UI, but then exposes the configuration to the whole system with this kind of event (and all the consumers consume these read-only)
* usually you have to deal with eventual consistency a lot in this kind of setup (so it scales well, but there is a tradeoff)
The WAL is an event log, and when you squint at its internal architecture, you’ll see plenty of overlap with distributed event sourcing.
Our users are small-businesses with organisation numbers, and we mostly think of them as unique. But they strictly aren't, so we 'overwrote' some companies with other companies.
Once we detected and fixed the bug, we just replayed the events with the fixed code, and we hadn't lost any data.
Every use I've seen sent events after database transactions, with the event not part of the transaction. This means you can get both dropped events, and out of order events.
My current company has analytics driven by a system like that. I'm sure there's some corrupted data as a result.
The main issue being people just don't know how to build and test distributed systems.
It sounded kind of impossible, I said as much, and then proposed a different approach. The interviewer persisted and claimed that it could be done with 'the outbox pattern'.
I disagreed and ended the interview there. Later when I was chatting about it with a former colleague, he said "Oh, they solved the two generals problem?"
> Every use I've seen sent events after database transactions, with the event not part of the transaction.
Maybe this is what they were doing.
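For reference, a minimal outbox sketch (sqlite, hypothetical schema). Note what it does and doesn't buy: the event row commits atomically with the business row, and a relay publishes it later, giving at-least-once delivery (consumers must dedupe) rather than the exactly-once that would require solving the two generals problem.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, total INTEGER);
CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT,
                     payload TEXT NOT NULL, published INTEGER DEFAULT 0);
""")

def place_order(order_id: int, total: int) -> None:
    with db:  # one transaction: both rows commit, or neither does
        db.execute("INSERT INTO orders VALUES (?, ?)", (order_id, total))
        db.execute("INSERT INTO outbox (payload) VALUES (?)",
                   (f"order_placed:{order_id}",))

published = []  # stand-in for the message broker

def relay() -> None:
    # A separate poller ships unsent rows. If it crashes after publishing
    # but before the UPDATE, the event is sent again: at-least-once.
    for row_id, payload in db.execute(
            "SELECT id, payload FROM outbox WHERE published = 0").fetchall():
        published.append(payload)
        db.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    db.commit()

place_order(1, 500)
relay()
```

This removes "event emitted but transaction rolled back" and "transaction committed but event dropped"; it does not remove duplicates or reordering between the outbox and downstream consumers.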
However I've seen some frameworks where you can do collision detection imperatively. For example:
if (sprite.collide(tilemap)) { doSomething(); }
These are generally smaller, less taxing frameworks (in this case I'm referring to HaxeFlixel), but they do exist!
So we ended up using protobufs over a local MQTT broker and adopted a macro-service architecture. This suited the project very well because it had a handful of obvious distinct parts and we took full advantage of Conway's law by making each devs work the part where their strengths and skills were maximized.
We made a few mistakes along the way but learned from them. Most of them relating to inter-service asynchronous programming. This article put words on concepts we learned through trial and errors, especially queries disguised as events.
I think it works well when it's the only thing that can work.
- Producer and consumer are decoupled. That's a good thing, right? Good luck finding the consumer when you need to modify the producer (the payload). People usually don't document these things
- Let’s use SNS/SQS because why not. Good luck reproducing producers and consumers locally in your machine. Third party infra in local env is usually an afterthought
- Observability. Or rather, the lack of it. It's never out of the box, so usually nobody cares about it until an incident happens
It sounds like your alternative is a producer that updates consumers using HTTP calls. That pushes a lot of complexity to the producer and the team that has to sync up with all of the other teams involved.
> Let’s use SNS/SQS because why not. Good luck reproducing producers and consumers locally in your machine
At work we pull localstack from a shared repo and run it in the background. I almost forget that it's there until I need to "git pull" because another team has added a new queue that my service is interested in. Just like using curl to call your HTTP endpoints, you can simply send a message to localstack with the standard aws cli
https://github.com/localstack/localstack
> Observability. Of rather the lack of it. It’s never out of the box, and so usually nobody cares about it until an incident happens
I think it depends on what type of framework you use. At work we use a trace-id field in the header when making HTTP calls or sending a message (sqs) which is propagated automatically downstream. This enables us to easily search logs and see the flow between systems. This was just configured once and is added automatically for all HTTP requests and messages that the service produces. We have a shared dependency that all services use that handles logging, monitoring and other "plumbing". Most of it comes out of the box from Spring, and the dependency just needs to configure it. The code imports a generic sns/http/jdbc producer and don't have to think about it
The amount of times I've come across someone who's inserted SQS into the mix to "speed things up"...
I just grep for the event's class name.
JavaScript
When I say increased, I mean we want the best answer but there are some answers the bank can’t know. If someone has transferred money into your account from another bank but we don’t know that yet, optimising for absolute correctness is pointless because the vast majority of wrong answers are baked in to the process. We can send you a message and you might read it a day later. Unless we delete the message from your phone, we can’t guarantee the message you read is fully consistent with our internal state.
Frankly our system is much better than the batch driven junk that is out of sync a second after it has executed. “Hey you have a reward.” “No I used it 2 hours ago you clowns.”
Note this isn’t cope. In some cases we started fully sync but relaxed it where there are tradeoffs that gave us better outcomes and we weren’t giving anything material up.
I've ended up in a lot of arguments about this while we were building larger distributed systems, because I come from more request/response oriented message-passing architectures, i.e. more synchronous ones. What I've found is that the event driven architecture did tend to lead to fewer abstractions and more leaked internal details. This isn't fundamental (you can treat events like an API) but was related to some details in our implementation (something along the lines of CDC).
Another problem with distributed systems that pass events through persistent queues is that if the consumer falls behind you start developing a lag. Yet another consideration is that the infrastructure to support this tends to carry some performance penalties (e.g. pushing an event through Kafka ends up being a lot more expensive than an RPC call). Overall it IMO makes for a lot of additional complexity, which you may need in some cases - but if you don't, you shouldn't pay the cost.
What I've come to realize is that in many ways those systems are equivalent. You can simulate one over the other. If you have an event based system you can send requests as events and then wait for the response event. If you have a request/response system you can simulate events over that.
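The first direction of that equivalence can be sketched with correlation IDs (toy in-memory "bus", hypothetical names): a request is just an event, and the caller waits for the matching response event.

```python
import uuid

bus = []  # stand-in for a topic

def send_request(payload: str) -> str:
    corr = str(uuid.uuid4())
    bus.append({"kind": "request", "corr": corr, "payload": payload})
    return corr  # the caller keeps this to find its answer later

def responder() -> None:
    # Some other service consumes requests and emits response events
    for req in [e for e in bus if e["kind"] == "request"]:
        bus.append({"kind": "response", "corr": req["corr"],
                    "payload": req["payload"].upper()})

def await_response(corr: str):
    for e in bus:
        if e["kind"] == "response" and e["corr"] == corr:
            return e["payload"]
    return None  # a real system would block or poll with a timeout

corr = send_request("ping")
responder()
```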
If we look at things like consensus protocols or distributed/persistent queues then obviously we would need some underlying resources (e.g. you might need a database behind your request/response model). So... Semantics. Don't know if others have a similar experience but when one system is mandated people will invent workarounds that end up looking like the other paradigm, which makes things worse.
There are things that conceptually fit well with an event driven architecture and then there are things that fit well with a request/response model. I'm guessing most large scale complex distributed apps would be best supporting both models.
I can recall software where I tried to wrestle a bunch of asynchronous things into looking more synchronous and then software where I really enjoyed working with a pure asynchronous model (Boost.Asio FTW). Usually the software where I want things to be synchronous is where for the most part I want to execute a linear sequence of things that depend on each other without really being able to use that time for doing other things vs. software where I want all things to happen at the same time all the time (e.g. being able to take in new connections over the network, serve existing connections etc.) and spinning threads for doing that is not a good fit (performance or abstraction-wise).
The locality of the synchronous model makes it easier to grok as long as you're ok with not being able to do something else while the asynchronous thing is going on. OTOH state machines, or statecharts to go further, which are an inherently asynchronous view, have many advantages (But are not Turing Complete).
I'd put it the other way: event driven architecture makes it safer to expose more internal details for longer, and lets you push back the point where you really need to fully decouple your API. I see that as an advantage; an abstract API is a means not an end.
> Another problem with distributed systems with persistent queues passing events is that if the consumer falls behind you start developing a lag.
Isn't that what you want? Whatever your architecture, fundamentally when you can't keep up either you queue or you start dropping some inputs.
> If you have a request/response system you can simulate events over that.
How? I mean you can implement your own eventing layer on top of a request/response system, but that's going to give you all the problems of both.
> If we look at things like consensus protocols or distributed/persistent queues then obviously we would need some underlying resources (e.g. you might need a database behind your request/response model).
Huh?
> Don't know if others have a similar experience but when one system is mandated people will invent workarounds that end up looking like the other paradigm, which makes things worse.
I agree that building a request/response system on top of an event sourcing system gives you something worse than using a native request/response system. But that's not a good reason to abandon the mandate, because building a true event-sourcing system has real advantages, and most of those advantages disappear once you start mixing the two. What you do need is full buyin and support at every level rather than a mandate imposed on people who don't want to follow it, but that's true for every development choice.
re: Huh. Sorry I was not clear there. What I meant is you can not create persistent queue semantics out of a request/response model without being able to make certain kinds of requests that access resources. Maybe that's an obvious statement.
re: mandate. I think I'm saying these sorts of mandates inevitably result in poor design. Even the purest of pure event sourcing systems actually use request/response, simply because that is the fundamental building block of systems. E.g. a Kafka client sends a produce request and waits for a response in order to inject something into a queue. The communication between Kafka nodes is based on messages. The basic building block of any distributed computer system is a packet (request) being sent from one machine to another and a response being sent back (e.g. TCP control messages). A mandate that says thou shalt build everything on top of event sourcing is sort of silly in this context, since it should be obvious that the building blocks of event sourced systems use request/response. Even without this nit-picking, restricting application developers to building only on top of this abstraction inevitably leads to ugliness. IMO anyway, having seen this mandate at work in large organizations. Use the right tool for your job is more or less what I'm saying - or, as the famous version goes, when all you have is a hammer, everything looks like a nail.
re: isn't that what you want. Well, if it is what you want then it is what you want, but many systems are OK with things just getting lost and not persisted. E.g. an HTTP GET request from a browser, in the absence of a network connection, is just lost; it's not persisted to be replayed later, so there is no way to build a lagging queue of HTTP GET requests that are yet to be processed. Again, maybe an obvious statement.
I’ve used it with a good degree of success in some data pipeline and spark stuff to have stuff automatically kick off, without heinous conditional orchestration logic. I also use evented stuff over channels in a lot of my rust code with great success.
However, echoing the sentiments of some other comments: most articles about event driven stuff seem to be either marketing blogspam or “we tried it and it was awful”. To be honest I look at a lot of those blog posts and about half the time my thoughts are “no wonder that didn’t work out, that’s an insane design” but is that just “you’re-doing-it-wrong-cope”?
Are there success stories out there that just aren't being written? Are there just no success stories? Is the architecture less forgiving of poor design, and this "higher bar of entry" torpedoes a number of projects? Is it more susceptible to "architecture astronauts", which dooms it? Is it actually decent, but requires a somewhat larger mindset change than most people bring to it, leading to half-baked implementations?
I can’t help but feel the underlying design has some kernels of some really good ideas, but the volume of available evidence sort of suggests otherwise.
Generally, I found that when using event systems you have to be really careful not to over-use them, even in small/single-player games. It's super hard to debug when everything is an event - if you go this route, you essentially end up in a situation where everything is "global" and can be reached from anywhere (might as well just go full singleton mode at that point). Additionally, I found it difficult to deal with event handlers which raise other events, or worse, async events, as it becomes really hard to ensure the correct order of invocations.
If you plan to use an event system, my advice would be (in Unity):
- Reference and raise events only on root Game Object scripts (e.g., have a root "Actor" script which subscribes/publishes events and communicates with its children via properties/C# events)
- Never subscribe or publish events in regular "child" components
- Use DI/service locator to fetch systems/global things and call them directly when possible from your "Actors"
Edit- I should say I never saw one in the wild, quick search found some academic projects https://scholar.google.com/scholar?q=event-driven+control+sy...
There is no avoiding it when dealing with, erm, events.
Events are things that happen that you cannot predict exactly when, where, and what.
The user clicked the mouse
The wind changed direction
Using Events to signal state change from one part of a system to another is a bad idea. Use a function call.
A rule of thumb: if the producer and the consumer are in the same system, then "Event Driven Architecture" is the antipattern.
I feel like a lot of teams out there can probably benefit from this simpler approach - it's probably what a lot of people are doing unwittingly.
> Commands only have a single consumer. There must be a single consumer. That’s it. They do not use the publish-subscribe pattern.
...oops.
Now the question is how much (more) time I want to spend on a(nother) rewrite.