Datomic Best Practices

(docs.datomic.com)

120 points · Spendar89 · 10y ago · 31 comments


mattjaynes · 10y ago · 13 in thread

I have a client that is exploring Datomic, so I wonder if some of you can chime in on why this is popular at the moment and what your experiences are with it?

I'm a big Rich Hickey fan. If you don't know who he is, he's the guy behind Clojure and Datomic. I don't use those tools, but his views on simplicity are wonderful.

Here's a great quote of his on the subject:

"Simplicity is hard work. But, there's a huge payoff. The person who has a genuinely simpler system - a system made out of genuinely simple parts, is going to be able to affect the greatest change with the least work. He's going to kick your ass. He's gonna spend more time simplifying things up front and in the long haul he's gonna wipe the plate with you because he'll have that ability to change things when you're struggling to push elephants around."

Here's his classic talk on simplicity if you haven't seen it yet: http://www.infoq.com/presentations/Simple-Made-Easy

arohner · 10y ago

I love Datomic. It's a relational, ACID, transactional, non-SQL database.

The upsides:

SQL is a horrible language, yet all the other NoSQL DBs also throw away the relational, transactional and ACID features that are great in Postgres. Postgres with datalog syntax would basically be a win by itself. Datomic queries are data, not strings. Queries can be composed without string munging, and with a clear understanding of what that will do to the query planner.

The schema has built-in support for has-one, has-many relationships, so there's no need for join tables.

I've never met a SQL query planner that didn't get in the way at some point. If needed, you can bypass the query planner, and get raw access to the data, and write your own query.

You can run an instance of it in-memory, which is fantastic for unit tests, so you avoid the mismatch of running Postgres in production but SQLite when testing.

The downsides:

It's closed source.

Operationally, it's unique. Because it uses immutable data everywhere, its indexing strategy is different. I don't have the experience of what it will do under high load.

The schema is 'weaker' than, say, Postgres. While you can specify "this column is type Int", you don't have the full power of Postgres constraints, so you can't declare "column foo is required on all entities of this type", or "if foo is present, bar must not be present", etc. It should be possible to add that using a transactor library, but I don't think anyone has done serious work in that direction yet.

Compound indexing support isn't in the main DB yet. I had to write my own library: https://github.com/arohner/datomic-compound-index
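To illustrate the "queries are data, not strings" point above, here is a minimal sketch in Python rather than Clojure. It is not Datomic's API: the dict layout and the helper names (`base_query`, `with_clause`, `with_find`) are hypothetical, chosen only to show that composing a query becomes ordinary data manipulation instead of string concatenation.

```python
# Hypothetical illustration: a query held as plain data (dicts, lists, tuples)
# instead of a string. None of these names come from Datomic itself.

def base_query():
    """Find every entity id that has a :user/email attribute."""
    return {
        "find": ["?e"],
        "where": [("?e", ":user/email", "?email")],
    }


def with_clause(query, clause):
    """Return a new query with one more where-clause; the input is not mutated."""
    return {**query, "where": query["where"] + [clause]}


def with_find(query, var):
    """Return a new query that also returns `var`; the input is not mutated."""
    return {**query, "find": query["find"] + [var]}


q = base_query()
active = with_clause(q, ("?e", ":user/active", True))
active_with_email = with_find(active, "?email")
```

Because each step returns a new value, `q` can be reused as a building block elsewhere without worrying that an earlier composition changed it.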

sgrove · 10y ago

Definitely agree re: datalog/pull syntax for SQL backends. Quite surprised it hasn't happened yet.

dasmoth · 10y ago

Datomic doesn't seem to have had a huge amount of marketing: it's been spreading largely by word of mouth, so a slow build-up makes sense.

It does bring an exceptionally elegant design (well worth reading Nikita Prokopov's "Unofficial guide" if you're curious). Also, the time and transaction-annotation features are unmatched AFAICT -- if you're working with complex data where provenance matters, Datomic can save a HUGE amount of work building tracking systems.
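A rough sketch of why the time and transaction-annotation features help with provenance, written in Python as a toy model rather than real Datomic code. It assumes facts are stored as immutable (entity, attribute, value, transaction, added?) tuples in transaction order, that each attribute holds a single value, and that a transaction is itself an entity that can carry attributes. The `as_of` helper and the `:tx/author` attribute are hypothetical names for this illustration.

```python
from collections import namedtuple

# Toy model of an append-only fact log; not Datomic's storage format or API.
Datom = namedtuple("Datom", "e a v tx added")


def as_of(log, tx):
    """Return a {(entity, attribute): value} view as of transaction `tx`.

    Assumes `log` is ordered by transaction and attributes are single-valued.
    """
    view = {}
    for d in log:
        if d.tx > tx:
            continue  # fact was recorded later than the requested point in time
        if d.added:
            view[(d.e, d.a)] = d.v
        elif view.get((d.e, d.a)) == d.v:
            del view[(d.e, d.a)]  # retraction of the currently held value
    return view


log = [
    Datom(1, ":user/email", "old@example.com", 100, True),
    Datom(100, ":tx/author", "alice", 100, True),  # annotation on transaction 100
    Datom(1, ":user/email", "old@example.com", 101, False),
    Datom(1, ":user/email", "new@example.com", 101, True),
    Datom(101, ":tx/author", "bob", 101, True),    # annotation on transaction 101
]

before = as_of(log, 100)
after = as_of(log, 101)
```

Nothing is overwritten, so the old email stays queryable, and "who changed it" is just another fact attached to the transaction.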

blintzing · 10y ago

I was very interested, but pretty disappointed that Datomic is completely closed source. Maybe this is a little mean, but what could be more "simple" than being able to read, understand, and modify the database you rely on?

Neo4j, though marketed differently, is a similar approach (but the Community version is GPLv3 and Enterprise is AGPLv3). The Cypher query language is declarative in a similar way to Datomic - the biggest missing feature is transactions.

joshdick · 10y ago

Rich Hickey has been criticized for that repeatedly. When asked, he's been transparent that Datomic is closed source so that he can put his kids through college. He also points out that he already gave us the whole Clojure language open source.

It's hard for me not to sympathize with him on this.

brianwawok · 10y ago

For sure, I would have played around with it, if it was open source and free to some small number of clients. But with so many FOSS databases, why use Datomic?

jtmarmon · 10y ago

We're using Datomic in production. It's had its ups and downs. For one, having raw data available at in-memory speeds really changes the level of expressiveness you have in your code; you are no longer constrained to packing every question about your data into a giant query and sending it off. You can instead pull data naturally and as needed. Many of our operations issue multiple queries and are still high performance.

The licensing is a huge pain in the ass. If I accidentally launch an extra peer over our license limit, our production environment will stop working until the extra peer comes down. This really butts heads with the growing popularity of abstracting physical servers as clusters, so I think the strategy is kind of a mistake on Cognitect's behalf.

cliftonk · 10y ago

Part of me wonders why they don't open source Datomic and crank up the marketing effort on the consultancy and Datomic/Clojure/etc. support portion of the business. It seems like a much more effective model for DB companies. For direct revenue streams, they can always have tuned/monitored clusters packaged as appliances.

ljosa · 10y ago

Datomic is probably getting more attention on HN in the wake of David Nolen's EuroClojure talk about Om Next (https://news.ycombinator.com/item?id=9848602).

talles · 10y ago

I just can't get enough Hickey talks. The guy puts into clear words things I've always felt.

taeric · 10y ago

I can't help but feel the quote ultimately embodies a false belief. Simplicity doesn't build you a rocket that can get to the outer solar system. Understanding and experimentation do.

Sure, this was probably built up using simple experiments and designs. But consider the Mars landing [1]. Simplicity would be to have a single mechanism for landing the Curiosity. Not 3. With one of them being a crane drop from a hovering rocket!?

I do feel there is an argument for up-front simplicity. However, as systems grow, expect that simplicity will get harder and harder to maintain while still meeting requirements such as performance. To the point that it becomes a genuine tradeoff that calls for your standard cost/benefit analysis.

In the end, this falls into the trap of examples. If you are allowed to strip all the assumptions of real use down to only a simple problem, you can get a simple solution. Add back in the realities of the problem, and the solution can get complex again. It is a shame that, in studies, so few real programs are actually looked at.

[1] https://www.youtube.com/watch?v=Sbqc6MPUpOA

jacobolus · 10y ago

You should watch the talk(s), as your analysis here is entirely missing the context. What you’re talking about is what Rich Hickey and Stu Halloway call “complicated”, which is different from what they call “complex”.

1 more reply

Skinney · 10y ago

> Simplicity would be to have a single mechanism for landing the Curiosity. Not 3. With one of them being a crane drop from a hovering rocket!?

Why? Simple, in the way Rich Hickey advocates, means the opposite of complex, which means that things are woven together. You can have many landing strategies without them being tightly coupled together. A huge system isn't necessarily complex.

1 more reply

trengrj · 10y ago · 4 in thread

Anyone willing to share their views on using Datomic in a production environment?

valarauca1 · 10y ago

Query optimization is difficult because of the abstract structure and limited indexes. So you may query an index that holds EVERYTHING, when doing the query backwards would be faster... This will depend purely on what you've inserted into the DB up to this point, or more so on how you insert things into the DB.

Don't run an SQL server as your KV store; you'll likely screw up the config and performance will suffer. If you want performance competitive with other DBs, you will likely end up running memcache between your KV store and query engine(s).

Don't store data over 1KB. Yes, the database can technically handle it, but in real-world applications and at expected speeds it can't.

B-tree syncs can be slower than you think in a surprising number of cases.

dasmoth · 10y ago

Could you say a bit more about the query optimisation problems you've run into?

The "put your most restrictive clauses first" rule (which is reiterated on the new Best Practices page) usually seems to do the trick in our hands.
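As a toy illustration of why the "most restrictive clauses first" rule matters, the Python sketch below evaluates where-clauses strictly in the order given using a naive nested-loop join, and records how many intermediate bindings survive each clause. It is not Datomic's query engine; the evaluator, the data, and the attribute names are assumptions made for the example.

```python
def is_var(term):
    """Variables are strings starting with '?', as in datalog."""
    return isinstance(term, str) and term.startswith("?")


def match(clause, datom, binding):
    """Extend `binding` so that `clause` matches `datom`, or return None."""
    new = dict(binding)
    for term, value in zip(clause, datom):
        if is_var(term):
            if term in new and new[term] != value:
                return None
            new[term] = value
        elif term != value:
            return None
    return new


def run(clauses, datoms):
    """Evaluate clauses left to right; return (bindings, sizes per clause)."""
    bindings, sizes = [{}], []
    for clause in clauses:
        bindings = [
            m
            for b in bindings
            for d in datoms
            if (m := match(clause, d, b)) is not None
        ]
        sizes.append(len(bindings))
    return bindings, sizes


# 1000 named users, exactly one of which has the email we are looking for.
datoms = [(i, ":user/name", f"user-{i}") for i in range(1000)]
datoms.append((42, ":user/email", "a@example.com"))

name_clause = ("?e", ":user/name", "?n")
email_clause = ("?e", ":user/email", "a@example.com")

broad_result, broad_sizes = run([name_clause, email_clause], datoms)
narrow_result, narrow_sizes = run([email_clause, name_clause], datoms)
```

Both orderings return the same answer, but the broad-first ordering carries 1000 intermediate bindings into the second clause while the narrow-first ordering carries one.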

2 more replies

jeletonskelly · 10y ago

I've been using it for years in production now. Hell, there was a Strange Loop talk about our use of it. We're using Dynamo for storage.

I found working with Datomic really nice. I like how I can express queries using Clojure. It also has some great performance for reads, and the data cache cuts down on those reads from Dynamo, which keeps the AWS bill down.

I did not like the way schema worked in the beginning. You had to make sure you had it right from a very early stage of development and that can be difficult if requirements change, but they've addressed that in later versions. It's also probably the most complicated part of our infrastructure, which is 100% run on AWS. You've got 2 transactors (high availability) and X peers, you need to deploy a new version of Datomic on them. Coordinating that while minimizing downtime is no simple task.

talles · 10y ago

This. Maybe the marketing isn't great, but Rich Hickey's talks plus some successful use cases would be enough to drive more people into it (IMO).

nickpsecurity · 10y ago

It's an interesting database. Regarding the closed-source complaint, it's part of a false dilemma that keeps repeating unnecessarily: that one must choose open + free or closed + paid. Nonsense! You can have open source and proprietary licensing simultaneously. You can even let paying users extend it, as Burroughs did for the MCP OS in the '60s. I go into more detail here [1] on various models of source sharing and their security review implications (my focus).

So, he could patent any key technology, publish the implementation with copyright protection, give source/binaries to customers on condition they keep paying, let users extend it for internal use, and even let users submit such improvements for others to use. His company continues to make money on the licensing in each of these cases. All of this has been done before. If anything, the real risk is on the users: the source license might change, as happened with QNX. That's why I advocate perpetual licenses for a given release at a given rate, which are re-issued each year a client pays.

[1] https://www.schneier.com/blog/archives/2014/05/friday_squid_...
