I quickly realized what a pain it is to use Elasticsearch for a simple app like mine.
Pain points:
1) You have to set up and recreate part of your database in Elasticsearch. So you essentially end up with two databases, which you now have to keep in sync.
2) I was getting unpredictable query results from Elasticsearch, which, after a few days and much head scratching, turned out to be because I was running out of memory.
3) When a user added a new event, it was not added to the Elasticsearch index automatically, and I could not figure out how to do this reliably. It only worked reliably after a sync of the entire Elasticsearch index, but since I only wanted to sync the index once a day, that made it next to useless for the Upcoming Events List and confused users as to why their event was not showing up. I gave up and just ended up implementing the Upcoming Events List directly from my database in Python.
4) Elasticsearch came with some security settings not set by default, and after a few months it was hacked. I had to download a new version and wasted more time.
I still use Elasticsearch, but only for search, not the upcoming events list. And I don't think it was worth the complexity it added to my project.
It’s similar to bringing an F1 car to a go-kart race and then being surprised you aren’t able to finish the race because you don’t have a pit crew able to maintain your vehicle.
I’ve built and owned large Elasticsearch clusters at Fortune 50 companies, providing log search as well as document search. Like anything, administering an ES cluster requires planning, engineering, and process/documentation.
I wouldn’t consider using ES without a dedicated ops team to help with its administration, except perhaps with a managed service like the one AWS provides.
It’s a very powerful tool; it was a mistake to think you can just casually throw it in your stack without fully understanding its complexity.
Dynamic mappings can mess things up really easily, so it's best to disable them in favor of a pre-defined static mapping for the type of documents you will be ingesting. What I've encountered in the past that usually causes things to break is when 90% of your documents contain a field called "Date" holding an ANSI date, but the other 10% contain "Null" (a string instead of an ANSI date). Since those documents don't match the dynamically generated mapping, they fail to be indexed.
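A minimal sketch of that advice, building the request body for an explicit (static) mapping. The index and field names here are made up for illustration; `"dynamic": "strict"` rejects documents with unmapped fields instead of guessing types, and `"ignore_malformed"` drops a bad "Date" value (e.g. the literal string "Null") rather than failing the whole document.

```python
import json

# Explicit mapping for a hypothetical "events" index.
mapping = {
    "mappings": {
        "dynamic": "strict",  # reject unmapped fields instead of inferring types
        "properties": {
            "Date": {
                "type": "date",
                "format": "strict_date_optional_time",
                "ignore_malformed": True,  # tolerate e.g. the string "Null"
            },
            "message": {"type": "text"},
        },
    }
}

body = json.dumps(mapping)  # body for: PUT /events
```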
Shard management is also critical, and this largely depends on the type of data you are indexing. If the data in ES is unique (not just a copy of a database you already have), you will want some sort of cross-region/DC replication strategy as well as a backup strategy.
Fortunately, both of these are pretty easy. ES has a tagging mechanism that lets you define things like regions and data centers (really, whatever you want), and shards can be routed based on rules defined over these tags.
A setup I've used in the past is five nodes in an LAX DC and five nodes in a LAS DC; any data ingested into LAX is replicated to shards in LAS, and vice versa.
Backup to S3 is rather trivial now, thanks to the built-in export options in newer versions of ES.
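For the S3 backup, the usual route is the snapshot API with an S3 repository (this needs the repository-s3 plugin installed). A sketch with hypothetical bucket and repository names:

```python
import json

# Register an S3 snapshot repository. Bucket/path names are illustrative.
repo = {
    "type": "s3",
    "settings": {"bucket": "my-es-backups", "base_path": "prod-cluster"},
}

body = json.dumps(repo)
# PUT /_snapshot/s3_backups                      registers the repository;
# PUT /_snapshot/s3_backups/snap-1?wait_for_completion=true
#                                                then snapshots all indices.
```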
With a little bit of planning, ES can be a great addition to your stack; just be sure you do the initial engineering so you can avoid a big headache in the future.
* DataDome is a security company that receives clients' web traffic in near real-time; a lot of traffic in some cases, with very specific numbers given, like daily peak loads.
* DataDome only retains records for 30 days, and the most attention is given to the most recent traffic, in order to detect attacks.
* an Elasticsearch deployment records all of the traffic downstream from Apache Flink; a new feature added to ES this year improves the management of ES indexing, and that solved problems DataDome was having. Things are better, so they wrote an engineering blog post!
* re-indexing is done nightly, implemented in a cloud environment that can handle the heavy work of rebuilding the index.
These numbers are impressive. Earlier criticisms of ES are being addressed, and ES is stable and a cornerstone of the architecture. A company called DataDome is providing real services in near real-time. Congratulations to the team and an interesting read.
> Storing 50 million of events per second
> A few numbers: our cluster stores [...], 15 trillion of events
> We provide up to 30 days of data retention to our customers. The first seven days of data were stored in the hot layer, and the rest in the warm layer.
15e12 / 50 MHz is 3.5 days.
I guess 50 MHz is the peak ingest rate.
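Spelling out the arithmetic behind that 3.5-day figure: at the headline ingest rate, the quoted total of stored events is only a few days' worth of data.

```python
events_stored = 15e12   # "15 trillion of events"
ingest_rate = 50e6      # "50 million of events per second"

days = events_stored / ingest_rate / 86400
# days comes out to about 3.47, well short of the stated 30-day retention,
# which is why the 50M/s number only makes sense as a peak rate.
```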
I'm not saying you shouldn't have written this post, but rather suggest you be fair to your readers (and yourself). Otherwise you could just make up random titles like "Writing 1 trillion log lines per second" (by storing 1,000,000 1-byte, newline-separated log lines per document).
> We have set “replica 0” in our indexes settings
> Now let’s assume that node 3 goes down:
> As expected, all shards from node 3 are moved to node 1 and node 2
No: there are no shards that can be moved, since the number of replicas was set to zero and one node went down. I'm not sure what they are trying to explain here.
> In order to resolve this issue, we introduced a job which runs each day in order to update the mapping template and create the index for the day of tomorrow, with the right number of shards according to the number of hits our customer received the previous day.
This is a very common use case (e.g. logging), but it's surprising that Elastic has nothing to automate this.
You can set an index template to be used on new indices that match a pattern, which is a very common thing to do. It sounds like what they did was modify the template daily, which is less common IME. It's not clear why they had to manually create the index, though. That should happen automatically.
It is, but how can you tell your template that you want to keep shard sizes under 50GB? You can't.
The best thing you can do (as they did) is update the template based on historical data, so that the new index will have shards that (hopefully) stay under 50GB.
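A rough sketch of that sizing logic. The 50 GB target is the rule of thumb discussed above; the per-document size estimate and function name are assumptions for illustration, not DataDome's actual numbers.

```python
import math

TARGET_SHARD_BYTES = 50 * 1024**3  # the ~50 GB-per-shard rule of thumb

def primary_shards_for_tomorrow(hits_yesterday: int, avg_doc_bytes: int) -> int:
    """Estimate tomorrow's index size from yesterday's traffic and pick a
    primary shard count so each shard (hopefully) stays under the target."""
    expected_bytes = hits_yesterday * avg_doc_bytes
    return max(1, math.ceil(expected_bytes / TARGET_SHARD_BYTES))

# A daily job would compute this per customer, then update that customer's
# index template before creating the next day's index.
```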
Ok, 3 nodes, each with one primary shard. No replicas. 1 node goes down, one shard is no longer found in the cluster, because it was in the missing node. That particular index, and in fact the whole cluster, are now RED.
Unless you discard that shard (force reroute, with accept_data_loss), nothing is going to be recovered and the missing shards will not be allocated anywhere.
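For reference, the "force reroute, with accept_data_loss" escape hatch is a cluster reroute command that re-creates the lost primary as an empty shard. Index and node names below are made up.

```python
import json

# allocate_empty_primary brings the index back to green by allocating a
# brand-new, empty primary on a surviving node; the old shard's data is gone.
reroute = {
    "commands": [
        {
            "allocate_empty_primary": {
                "index": "events-000001",  # hypothetical index name
                "shard": 0,
                "node": "node-1",          # hypothetical surviving node
                "accept_data_loss": True,
            }
        }
    ]
}

body = json.dumps(reroute)  # body for: POST /_cluster/reroute
```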
> introduction of a job which runs each day

This job has many purposes: 1) Because of rollover, our indexes are now suffixed by -000001, then -000002, etc. Regardless of the rollover, our applications post and get docs through the alias in front of these suffixed indexes. So in a daily index design, if you don't create the next day's index ahead of time, at 00:00:01 your application will create a new index with the alias name, and it won't be in rollover "mode". 2) Because we are using the ILM feature, we need to define the ILM rollover alias in the template, and it changes each day because our index names are "index_$date". 3) Performance: we have a lot of traffic, and if we do not create the next day's index in advance, we will see a lot of unassigned shards, a yellow cluster, etc.
In fact, yes, it's a common use case (daily-based indexes), but maybe not with rollover + ILM.
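The "create tomorrow's index ahead of time" step above amounts to bootstrapping the first rollover index (suffix -000001) with the write alias pointing at it, so that writes arriving at 00:00:01 land in a rollover-ready index. A sketch; the alias name and `index_$date` naming are illustrative, not DataDome's exact scheme.

```python
import json
from datetime import date, timedelta

# Build tomorrow's bootstrap index name in the index_$date-000001 style.
tomorrow = (date(2020, 1, 1) + timedelta(days=1)).strftime("%Y.%m.%d")
index_name = f"index_{tomorrow}-000001"

# is_write_index makes this index the rollover write target for the alias.
alias_body = {"aliases": {"events-write": {"is_write_index": True}}}

body = json.dumps(alias_body)  # body for: PUT /index_2020.01.02-000001
```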
Ah, got it. So maybe it would be best said as "for the following example, ignore any replicas".
> in fact, yes, it's a common use-case (daily based index)
But it is not automated by the Elastic folks. Do you have any intention of open-sourcing a portion of this job?
What is this 50M in the title?
Curiouser and curiouser.