[0] https://gist.github.com/grafikchaos/1b305a4e0b86c0a356de
And, yes, you want your tools and application to abstract away some of the messiness involved in something as simple as “load/save this object from the database,” otherwise you’ll end up with a much larger amount of code for such a simple operation. I’ve seen it done in a Django app, and it’s not bad to work with once those abstractions are in place.
EAV performance at large scale is really dreadful. So many queries.
I used EAV in an application and it was fine (I used Datascript on the client-side, with server-side data being stored in a JSON document database, RethinkDB). I did run into some annoying issues with the lack of nil value handling. I'd say writing queries was difficult, but not significantly more difficult than in any other language, if you care about performance and want to know what the query actually does.
Overall, I felt there was a good mapping between my domain model and EAV.
I eventually dropped this solution in favor of Clojure data structures: if you have an in-memory (in-browser) database anyway, why keep the data in Datascript, if you can simply keep it as ClojureScript data structures?
In the context of Magento, it is a real _nightmare_ and is one of the major contributors to the slowness of the Magento platform (at least for Magento < 2.0).
The reason for this slowness is that in a relational database, the EAV model makes it so that e-v-e-r-y s-i-n-g-l-e SQL query is one gigantic query made of tons of JOINs.
To give you an example, querying a product in Magento may need to join no fewer than 11 tables!
catalog_product_entity, catalog_product_entity_datetime, catalog_product_entity_decimal, catalog_product_entity_int, catalog_product_entity_gallery, catalog_product_entity_group_price, catalog_product_entity_media_gallery, catalog_product_entity_text, etc, etc.
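To make the JOIN problem concrete, here is a minimal sketch using Python's built-in `sqlite3`. The schema is a heavily simplified assumption, not Magento's real one: the `product_entity*` table names are hypothetical stand-ins, and attributes are keyed by a plain text name for readability. The point is only that each attribute read costs one more JOIN against the value table for its type.

```python
import sqlite3

# Hypothetical, simplified schema: one entity table plus one value table
# per data type. This is an illustration, not Magento's actual schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_entity (entity_id INTEGER PRIMARY KEY, sku TEXT);
CREATE TABLE product_entity_varchar (entity_id INTEGER, attribute TEXT, value TEXT);
CREATE TABLE product_entity_decimal (entity_id INTEGER, attribute TEXT, value REAL);
CREATE TABLE product_entity_int (entity_id INTEGER, attribute TEXT, value INTEGER);
INSERT INTO product_entity VALUES (1, 'HOODIE-01');
INSERT INTO product_entity_varchar VALUES (1, 'name', 'Hoodie');
INSERT INTO product_entity_decimal VALUES (1, 'price', 59.0);
INSERT INTO product_entity_int VALUES (1, 'status', 1);
""")

# Every attribute we want back adds another JOIN to the query.
EAV_QUERY = """
SELECT e.sku, n.value AS name, p.value AS price, s.value AS status
FROM product_entity e
LEFT JOIN product_entity_varchar n
       ON n.entity_id = e.entity_id AND n.attribute = 'name'
LEFT JOIN product_entity_decimal p
       ON p.entity_id = e.entity_id AND p.attribute = 'price'
LEFT JOIN product_entity_int s
       ON s.entity_id = e.entity_id AND s.attribute = 'status'
WHERE e.entity_id = ?
"""


def load_product(entity_id):
    """Return (sku, name, price, status) for one product, or None."""
    return conn.execute(EAV_QUERY, (entity_id,)).fetchone()


print(load_product(1))
```

Three attributes already need three JOINs; a product with dozens of attributes spread over many value tables scales the same way.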
In order to fix this issue, the Magento team created what they call "flat tables": tables that are created by querying the database with an EAV query (i.e. the query with a million joins) and putting the results in a table with as many columns as there are attributes returned by the original query.
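The flat-table idea can be sketched roughly as follows. This is an illustration under assumptions, not Magento's indexer: the `product_eav` and `product_flat` tables and the `rebuild_flat_table` helper are hypothetical names, and a single generic value table stands in for the per-type tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_eav (entity_id INTEGER, attribute TEXT, value TEXT);
INSERT INTO product_eav VALUES
  (1, 'name', 'Hoodie'), (1, 'price', '59.00'),
  (2, 'name', 'Pants'),  (2, 'price', '98.00'), (2, 'color', 'black');
""")


def rebuild_flat_table(conn):
    """Pivot EAV rows into one row per entity, one column per attribute."""
    attributes = [r[0] for r in conn.execute(
        "SELECT DISTINCT attribute FROM product_eav ORDER BY attribute")]
    # Attribute names come from our own sample data here; real code would
    # have to validate or quote them before building SQL from them.
    columns = ", ".join(
        f"MAX(CASE WHEN attribute = '{a}' THEN value END) AS \"{a}\""
        for a in attributes)
    conn.execute("DROP TABLE IF EXISTS product_flat")
    conn.execute(
        "CREATE TABLE product_flat AS "
        f"SELECT entity_id, {columns} FROM product_eav GROUP BY entity_id")
    return attributes


attributes = rebuild_flat_table(conn)
rows = conn.execute("SELECT * FROM product_flat ORDER BY entity_id").fetchall()
print(attributes)
print(rows)
```

Reads against `product_flat` need no joins at all, but the table is a derived copy: it has to be rebuilt whenever the underlying EAV rows change, and it gains a column for every new attribute.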
In theory choosing to use EAV was an amazing idea. In practice, this idea did not scale for large Magento stores and it has made Magento hugely complex, slow and hard to use.
We use Magento at betabrand.com and I can confidently say 90% of the slowness of our website is due to Magento's EAV tables and we have spent a humongous number of engineering hours optimizing this.
While there are many excellent ideas embedded in Datomic and these projects, for me just being able to persist the same data structures you're using at a REPL and query for them with data is a huge win vs. having to start translating types and concepts and query strings to and from SQL.
Now if only they’d make it into open source projects in languages other than Clojure. I want my Datomic-for-Elixir, darn it! (And, if I wasn’t too busy, I’d be the first person to volunteer to build it!)
It's worth noting that time now has two levels of meaning in this context - transaction time and valid time.
Disclosure: working on Crux
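For readers unfamiliar with the two time axes, here is a small sketch of the distinction. It is a toy model and not Crux's API: the `Fact` record, the `as_of` function, and the integer timestamps are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Fact:
    entity: str
    attribute: str
    value: Any
    valid_time: int  # when the fact became true in the real world
    tx_time: int     # when the database recorded it


def as_of(facts, entity, attribute, valid_time, tx_time) -> Optional[Any]:
    """Value believed at tx_time to hold at valid_time, or None if unknown."""
    visible = [f for f in facts
               if f.entity == entity and f.attribute == attribute
               and f.valid_time <= valid_time and f.tx_time <= tx_time]
    if not visible:
        return None
    # Latest valid_time wins; among equal valid times, the latest write wins.
    return max(visible, key=lambda f: (f.valid_time, f.tx_time)).value


facts = [
    Fact("order-1", "status", "placed", valid_time=5, tx_time=5),
    Fact("order-1", "status", "shipped", valid_time=10, tx_time=12),
    # A correction recorded later (tx 20): the order really shipped at time 8.
    Fact("order-1", "status", "shipped", valid_time=8, tx_time=20),
]

# What did we believe at tx 15 about the state at time 9?
print(as_of(facts, "order-1", "status", valid_time=9, tx_time=15))
# What do we believe at tx 25 about the state at time 9?
print(as_of(facts, "order-1", "status", valid_time=9, tx_time=25))
```

The same question about time 9 gets a different answer depending on when you ask it, because the correction only enters the record at transaction time 20.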
Datomic seems to be one of the best ones in terms of execution of the ideas behind it and usage in real world projects, to say the least.
* Generic/"bootstrappy" looking site, mostly marketing speak.
* Not even one code example of what it looks like to solve a problem with this DB.
* Gigantic "Get a Trial" button that links to a lengthy form that I'm sure has 99% abandonment rate.
There are a lot of other software products that are marketed like this, so I'm genuinely curious how this works.
1: https://www.actian.com/data-management/nosql-object-database...
congrats on speaking about a product you haven't even tried.
Versant was building systems that were updated on the fly without shutdown, which was quite an achievement back then (and probably still is: imagine updating a running instance of PostgreSQL).
The Object-Oriented Database (OODBMS) space is basically just Versant. Actian, however, seems not to be developing it actively anymore, and it's generally not advertised much (they rebranded it as "Actian NoSQL", apparently).
I wonder if anybody else still uses it (besides us).
I'm confused -- does this mean that Workiva themselves are not using Eva? Or are they still using it, but not officially developing it any more? If they were really invested in it, why would they only allow employees to work on it in their 10% time?
Source: I work there, although I have literally nothing to do with this project.
What are the failures of EAV except when applying it to a database optimized for something different?
The transaction log design is a fundamental design tradeoff that Eva/Datomic/Crux share, which means system throughput is limited by the throughput of a single process. The argument in favour of such a design is that most businesses & business applications don't actually experience transactional data volumes over 10K tx/sec.
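The tradeoff can be sketched like this. It is a toy model, not how Eva, Datomic, or Crux are implemented: the `SingleWriterLog` class and its methods are hypothetical. Many clients may submit transactions concurrently, but exactly one writer thread appends to the log, which yields a total order and also caps write throughput at what that one writer can handle.

```python
import queue
import threading


class SingleWriterLog:
    """All transactions are serialized through one writer thread."""

    def __init__(self):
        self.log = []                 # totally ordered transaction log
        self._inbox = queue.Queue()
        self._writer = threading.Thread(target=self._run, daemon=True)
        self._writer.start()

    def _run(self):
        while True:
            item = self._inbox.get()
            if item is None:          # shutdown sentinel
                break
            facts, done = item
            tx_id = len(self.log)     # safe: only this thread appends
            self.log.append((tx_id, facts))
            done["tx_id"] = tx_id
            done["event"].set()

    def transact(self, facts):
        """Submit a transaction and block until the writer has applied it."""
        done = {"event": threading.Event()}
        self._inbox.put((facts, done))
        done["event"].wait()
        return done["tx_id"]

    def close(self):
        self._inbox.put(None)
        self._writer.join()


db = SingleWriterLog()
clients = [threading.Thread(target=db.transact, args=([(f"e{i}", "n", i)],))
           for i in range(8)]
for t in clients:
    t.start()
for t in clients:
    t.join()
db.close()

tx_ids = [tx_id for tx_id, _ in db.log]
print(tx_ids)
```

Because only the writer thread touches the log, transaction IDs are gap-free and strictly ordered without any locking around the append, which is the upside of accepting the single-process throughput ceiling.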
I created an almost identical implementation of this as an embeddable library in the year 2000, along with a proprietary SQL language. The reason was that my client had an inventory of products with a crazy number of attributes; each product could have its own set, and the client kept changing, creating, and deleting them. It was in-memory but with persistence and atomic transactions. No history though. It was blindingly fast on complex queries. And the schema of the database was kept as a set of entities with some predefined names, values and range of IDs.
For a while I was contemplating releasing it as a standalone product but as I had enough tasks on my plate decided not to do it. Kinda feel sorry now ;(
So all in all, very close in idea.
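As a rough illustration of the kind of store described above (each product with its own attribute set, fast in-memory queries), here is a minimal sketch. It is not the original library: the `EavStore` class and its methods are hypothetical, it omits persistence and transactions, and it assumes attribute values are hashable.

```python
from collections import defaultdict


class EavStore:
    """In-memory entity-attribute-value store with an attribute/value index."""

    def __init__(self):
        # entity -> {attribute: value}
        self._by_entity = defaultdict(dict)
        # attribute -> value -> {entities}; values must be hashable
        self._by_attr = defaultdict(lambda: defaultdict(set))

    def set(self, entity, attribute, value):
        self.remove(entity, attribute)  # drop any stale index entry first
        self._by_entity[entity][attribute] = value
        self._by_attr[attribute][value].add(entity)

    def remove(self, entity, attribute):
        attrs = self._by_entity.get(entity)
        if attrs is None or attribute not in attrs:
            return
        old = attrs.pop(attribute)
        self._by_attr[attribute][old].discard(entity)

    def get(self, entity):
        return dict(self._by_entity.get(entity, {}))

    def find(self, **criteria):
        """Entities matching every attribute=value pair, via set intersection."""
        sets = [self._by_attr[a][v] for a, v in criteria.items()]
        return set.intersection(*sets) if sets else set(self._by_entity)


store = EavStore()
store.set("p1", "color", "red")
store.set("p1", "size", "M")
store.set("p2", "color", "red")
store.set("p2", "voltage", 220)   # p2 has an attribute p1 lacks

print(store.find(color="red"))
print(store.find(color="red", size="M"))
```

Adding or deleting an attribute is just a dictionary update, so no schema migration is needed when the set of attributes keeps changing.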