Don't want to put anyone's work down, but at first sight this looks much more like MyISAM than InnoDB - and, just like this engine, MyISAM is a helluva lot faster than InnoDB for single-writer workloads once you throw ACID out of the window.
The only difference is that MyISAM has been battle-tested over the course of several years. Again, interested to see what comes out of it, but if history is a lesson, it's usually easier to go from correct to fast (PostgreSQL) than from fast to correct (MySQL et al.).
What concerned me a bit was the absence of an interesting and promising general idea behind the implementation, i.e. the reason to implement a new engine. Maybe it is there, but the blog post does not say anything about it.
postgres was at that point already a mature, stable, and pretty much correct RDBMS, but it wasn't as fast for certain workloads (massively read-heavy ones with low levels of concurrency, IIRC) or as easy to administer badly (I wouldn't say it was harder to administer well, but mysql was more forgiving of mistakes by people who just wanted to plug it in and not care).
upscaledb does use transactions. For example, if you insert a row into a table with a primary and a secondary index, upscaledb inserts two key/value pairs in two databases. All these operations are wrapped in transactions. They are just not yet supported at the SQL level.
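To make the "two key/value pairs in two databases" point concrete, here is a minimal toy sketch in Python. It is not upscaledb's API: `ToyTxn`, `insert_row`, and the `id`/`email` columns are hypothetical names, and plain dicts stand in for the two databases. It only illustrates that both inserts become visible together or not at all.

```python
class ToyTxn:
    """Hypothetical transaction: buffers writes, applies all of them at commit or none."""

    def __init__(self):
        self._pending = []  # list of (database dict, key, value)

    def insert(self, db, key, value):
        self._pending.append((db, key, value))

    def commit(self):
        # Validate every write before touching any database, so a failure
        # leaves both databases unchanged.
        seen = set()
        for db, key, _ in self._pending:
            if key in db or (id(db), key) in seen:
                raise KeyError(f"duplicate key: {key!r}")
            seen.add((id(db), key))
        for db, key, value in self._pending:
            db[key] = value
        self._pending.clear()

    def abort(self):
        self._pending.clear()


def insert_row(primary_db, secondary_db, row):
    """Insert one row as two key/value pairs under a single transaction."""
    txn = ToyTxn()
    # Primary index: primary key -> full row.
    txn.insert(primary_db, row["id"], row)
    # Secondary index: (indexed column, primary key) -> primary key.
    txn.insert(secondary_db, (row["email"], row["id"]), row["id"])
    try:
        txn.commit()
    except KeyError:
        txn.abort()
        raise


primary, by_email = {}, {}
insert_row(primary, by_email, {"id": 1, "email": "a@example.com"})
```

A second `insert_row` with the same `id` raises `KeyError` and leaves the secondary index untouched, which is the property the transaction wrapper is there to provide.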
Batching a group of statements together in a small transaction is usually better because it reduces log flushing. ACID only needs to be guaranteed on commit. Similarly, applying transactions in parallel is faster because of group commit.
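A rough way to see the log-flushing argument is to count flushes in a toy write-ahead log. This is only a sketch under simplifying assumptions: `ToyLog` and `run` are made-up names, `flush()` stands in for an fsync, and one flush per commit is assumed (no group commit).

```python
class ToyLog:
    """Hypothetical write-ahead log that only counts how often it is flushed."""

    def __init__(self):
        self.records = []
        self.flushes = 0

    def append(self, record):
        self.records.append(record)

    def flush(self):
        # Stands in for an fsync of the log file.
        self.flushes += 1


def run(statements, batch_size):
    """Commit (and therefore flush) once per batch_size statements."""
    log = ToyLog()
    for i, stmt in enumerate(statements, start=1):
        log.append(stmt)
        if i % batch_size == 0:
            log.flush()  # commit point: durability is only needed here
    if len(statements) % batch_size:
        log.flush()  # commit the final partial batch
    return log.flushes


statements = [f"INSERT {n}" for n in range(1000)]
autocommit_flushes = run(statements, batch_size=1)   # one commit per statement
batched_flushes = run(statements, batch_size=100)    # one commit per 100 statements
```

With these inputs the autocommit run flushes 1000 times and the batched run 10 times; the statements written are identical, only the number of commit points differs.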
Ironically, back in the day, the core MySQL team argued that was a feature, not a bug.
Less forgivable is training a couple generations of developers that all the other mistakes they made are the Right Way...
InnoDB is designed for concurrency (using MVCC, granular locking, etc) so I'd expect it to be slower at single-threaded workloads than another engine that skips all that.
Only using single-threaded benchmarking is a bit misleading, imo. This is mentioned in the article but only in a small bullet point towards the bottom.
The reason is that most of the time is spent in MySQL and not in the key/value store, so it does not make a big difference whether the key/value store is concurrent or not.
In my experience the assumption of "concurrent = fast" is a misconception. Right now upscaledb moves certain operations (e.g. flushing dirty buffers) to the background. It is better to have fast single-threaded code than multi-threaded code with a huge locking overhead. A compromise would be to move the lock to the database level (instead of the Environment, which is basically the container for multiple databases), and make sure that there's no shared state between the databases. But that does not have much priority for me because I do not expect to gain that much performance.
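The compromise described above (one lock per database instead of one per Environment) can be sketched in a few lines of Python. This is purely illustrative and says nothing about upscaledb's actual internals; `Environment`, `Database`, and `can_lock_b_while_a_is_locked` are hypothetical names.

```python
import threading


class Database:
    """Hypothetical database: its own data, guarded by whatever lock it is given."""

    def __init__(self, lock):
        self.lock = lock
        self.data = {}  # no state shared with other databases

    def put(self, key, value):
        with self.lock:
            self.data[key] = value


class Environment:
    """Container for several databases, with either one shared lock or one lock each."""

    def __init__(self, names, per_database_locks):
        shared = threading.Lock()
        self.databases = {
            name: Database(threading.Lock() if per_database_locks else shared)
            for name in names
        }


def can_lock_b_while_a_is_locked(env):
    """Return True if database 'b' can be locked while 'a' is held."""
    a, b = env.databases["a"], env.databases["b"]
    with a.lock:
        acquired = b.lock.acquire(blocking=False)
        if acquired:
            b.lock.release()
        return acquired


coarse = Environment(["a", "b"], per_database_locks=False)
fine = Environment(["a", "b"], per_database_locks=True)
```

With the Environment-level lock, a writer on `a` blocks every other database; with per-database locks it does not. Whether that helps in practice depends on how much time is actually spent inside the key/value store.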
I am not arguing that "concurrent = fast". My point is that real systems have a higher level of concurrency as a baseline.
InnoDB supports granular concurrent access because real workloads need this. Systems that have poor stories around concurrency -- MyISAM, Redis, pre-WiredTiger MongoDB -- definitely hit real scalability issues under high-volume workloads.
MVCC is so popular because it makes ACID compliance much easier to implement, but there is some additional read latency (and management overhead) because storage layout is disconnected from the natural layout.
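For readers who have not seen MVCC up close, here is a toy version-chain store in Python. It is a sketch of the general idea only, not how InnoDB or any other real engine lays out its data; `MVCCStore` and its methods are made-up names, and every write is assumed to commit immediately.

```python
class MVCCStore:
    """Hypothetical multi-version store: each key keeps a chain of committed versions."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), oldest first
        self._clock = 0

    def write(self, key, value):
        # Writers append a new version instead of overwriting in place.
        self._clock += 1
        self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock

    def snapshot(self):
        # A reader's snapshot is just the timestamp at which it started.
        return self._clock

    def read(self, key, snapshot_ts):
        # Walk the chain from newest to oldest until a visible version is found.
        # This walk is the extra read cost; old versions also need garbage collection.
        for commit_ts, value in reversed(self._versions.get(key, [])):
            if commit_ts <= snapshot_ts:
                return value
        return None


store = MVCCStore()
store.write("k", "v1")
snap = store.snapshot()
store.write("k", "v2")
```

A reader holding `snap` still sees `"v1"` after the second write, without taking any lock; a reader that starts afterwards sees `"v2"`.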
A storage engine which sticks to the basics could be very fast, with predictable low latency.
With regard to threading: I know locks are a dirty word, but you do need a single version of truth somewhere. Either this is done through locking or an allocator. Going single-threaded is a valid way to remove the overhead of locking (plus no race conditions!).
Hopefully a single-writer, multiple-reader mode is available soon.
I don't know anyone who chooses InnoDB and says "gee I wish it were faster". While it isn't the fastest show in town, it is a known quantity, and how it breaks is well understood.
So if I were going to use a different storage engine, I would want something more than "it's a bit faster, in some cases".
Having said that, what a wonderful amount of work, and don't stop hacking on it!
Never got to the bottom of it either.
What does "100% compatible to InnoDB" mean here? It's not functionally compatible and it's not disk-file-format compatible.
This would also put the "compatible with InnoDB" thing in perspective for people and potentially clear up some other confusion.