Inventory management in MongoDB: A design philosophy I find baffling

(ayende.com)

127 points · douche · 9y ago · 86 comments

44 comments · 12 top-level

weddpros · 9y ago · 12 in thread

Regarding "SQL is better suited to this use case because it has transactions" comments:

Before we had 3-tier architectures, people would have designed a shopping cart use-case as a single SQL transaction that would last maybe 10 minutes. The DB would make sure everything stays consistent until the final commit. The GUI would keep an open connection to the DB the whole time.

In the web age, you want stateless services and HA. It means a transaction can't last more than a single web page. It becomes more challenging to design a shopping cart, because the DB can't handle a long-running transaction anymore.

Writing a correct system that reserves the items you put in a shopping cart, doesn't leak items, and doesn't sell the same item twice is not easy. A transaction rollback will not do the cleanup for you, because there's no long-running transaction anymore.

So SQL transactions can't help as much as you think.

MongoDB doesn't have transactions, but updates are atomic, which allows CAS and optimistic locking use cases. I agree it's less than ideal when you need to provide ACID behavior, but don't believe it's easy with SQL transactions. It's not.

The author regrets the book's suggestion of putting each object in stock in its own document, and I agree it's probably a recipe for disaster. Atomic updates make this design absurd.

You could easily `db.products.update({_id: productId}, {$inc: {inStock: -5}, $addToSet: {pendingCarts: {cartId: cartId, quantity: 5, timestamp: new Date()}}})`. This has the exact same atomic behavior as a SQL transaction to remove 5 from the stock and add a new "shopping cart entry" in another table.

(you still need to expire cancelled shopping carts, and you may need a transactional way of completing the order: it's also manageable if designed as an idempotent operation)
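
A minimal sketch of the single-document reservation described above, written as plain JavaScript against an in-memory stand-in rather than a real MongoDB driver. The `reserve` and `release` helpers, the `inStock >= quantity` guard, and the `rake` data are assumptions added for illustration. With MongoDB, the guard would presumably go into the update's query filter so that the check and the decrement happen in one atomic document update.

```javascript
// In-memory stand-in for one "products" document. Node.js runs this code on a
// single thread, so each function call behaves like one atomic document update.
const products = new Map([
  ["rake", { inStock: 10, pendingCarts: [] }],
]);

// Reserve `quantity` units for a cart. The stock check and the decrement happen
// in the same step, which is what prevents selling the same unit twice.
function reserve(productId, cartId, quantity, now = new Date()) {
  const doc = products.get(productId);
  if (!doc || doc.inStock < quantity) return false; // guard failed, nothing changed
  if (doc.pendingCarts.some((c) => c.cartId === cartId)) return false; // already reserved
  doc.inStock -= quantity;
  doc.pendingCarts.push({ cartId, quantity, timestamp: now });
  return true;
}

// Return a cancelled or expired cart's units to stock. Calling it twice for the
// same cart is harmless because the second call finds no pending entry.
function release(productId, cartId) {
  const doc = products.get(productId);
  const index = doc ? doc.pendingCarts.findIndex((c) => c.cartId === cartId) : -1;
  if (index === -1) return false;
  const [entry] = doc.pendingCarts.splice(index, 1);
  doc.inStock += entry.quantity;
  return true;
}
```

Making `release` a no-op on the second call is one way to get the idempotent cleanup mentioned above: an expiry job can retry it without corrupting the count.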

Anyway don't over-simplify this use case and believe "a single big SQL ACID transaction would handle the problem". That's just not true.

elcritch · 9y ago

Well put! Having worked on corporate cash accounting and inventory systems, I find it pretty clear that accounting as handled by banking systems does _not_ rely on SQL transactions. It's handled by auditing and transaction-based systems (e.g. event systems) encoded in double-entry accounting. ATMs, for example, will generally dispense money up to a given limit (say $500), as the cost of lost consistency is outweighed under those limits by providing high availability. If the user double draws and goes over their limit, they are held accountable at a later point (usually this is bounded by hard cash transfer limits).

In a warehouse inventory setting, when you _do_ have inconsistencies (e.g. lost items, misplaced orders, etc.), a system which strictly enforces "inventory limits" will just as often prevent employees from doing their job and shipping an item which could be sitting right in front of them but is not accounted for in the system. Auditing combined with optimistic locking resolves this and allows accountability, tracking, and flexibility.

Those are two real-world examples which underlie the idea that ACID guarantees and locking/transactions are two separate intents. CouchDB and Couchbase both provide ACID guarantees per document, making it straightforward to implement multi-service applications using event-based systems. It's equivalent to MongoDB's CAS operations. Really, all you need is to ensure that your changes are atomic, and ACID compliance at a key/document level generally enables you to do this readily.

Personally, I find that SQL-style transactions just cause lots of issues with performance and locking contention while enabling developers to skimp on thinking deeply about how to appropriately design their data flow. Sometimes that's the right call for a team, but sometimes it's not.

codedokode · 9y ago

> I find that SQL-style transactions just cause lots of issues with performance and locking contention while enabling developers to skimp on thinking deeply about how to appropriately design their data flow

"Designing data flow" would mean having to spend more time and money for the same task?

calafrax · 9y ago

Transactions are never needed with a fully normalized model so if transactions are needed it is probably because your model sucks.

Or it is because you denormalized your model because your db engine's performance sucks in which case the transactions will probably just make it worse.

Good schema design and lock-free/wait-free (transaction-free) algorithms are not "reimplementing transactions in the client."

OP's example is garbage but his proposed transaction solution is garbage too.

MVorlm · 9y ago

Would you happen to have any articles/papers/blogs/talks that demonstrate how to perform the traditional bank transaction/shopping cart examples using an event-sourced system? Curious to learn more.

electricEmu · 9y ago

This is very close to how a team I was on solved this issue at Amazon. We took money, held inventory, had a shopping cart, and it worked out fine.

A service bus was necessary, but the actual atomic transactions in MongoDB didn't fail us. We didn't lose data. While the nay-sayers discounted Mongo, we were raking in cash on top of it.

fweespeech · 9y ago

> Anyway don't over-simplify this use case and believe "a single big SQL ACID transaction would handle the problem". That's just not true.

> In the web age, you want stateless services and HA. It means a transaction can't last more than a single web page. It becomes more challenging to design a shopping cart, because the DB can't handle a long-running transaction anymore.

Or you could cheat and not update the inventory until the purchase is made. ;)

The problem with the "adjust inventory on cart" in low inventory situations is you'll have 80% of your carts holding items that won't convert until a cart expiration. You only need the actual purchase to be atomic. Then, once the queued credit card transaction completes you adjust the order to refund the inventory [declined] or ship the order [completed].

That pattern absolves you of needing complex logic and allows you to distribute the activity relatively trivially as a set of two independent idempotent operations. And if the analog portion of the process fails, the picker hits a button and the order gets queued for a refund. Once the order is cancelled, another service contacts the customer.
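
A rough sketch of that two-step pattern, using in-memory maps in place of a database and a payment queue. The names `placeOrder` and `settlePayment`, the status strings, and the sample stock are all assumptions for illustration, not anything from the article or the book.

```javascript
// Hypothetical in-memory state; a real system would keep this in a database.
const stock = new Map([["rake", 1]]);
const orders = new Map();

// Step 1: the purchase. Stock is decremented only here, never on add-to-cart.
// Replaying the same orderId returns the existing order instead of selling twice.
function placeOrder(orderId, productId, quantity) {
  if (orders.has(orderId)) return orders.get(orderId);
  const available = stock.get(productId) ?? 0;
  if (available < quantity) return null; // sold out: no order is created
  stock.set(productId, available - quantity);
  const order = { orderId, productId, quantity, status: "pending" };
  orders.set(orderId, order);
  return order;
}

// Step 2: the queued payment result arrives. Only a pending order changes state,
// so a redelivered message cannot refund the stock a second time.
function settlePayment(orderId, approved) {
  const order = orders.get(orderId);
  if (!order || order.status !== "pending") return order ?? null;
  if (approved) {
    order.status = "completed"; // hand over to shipping
  } else {
    order.status = "declined";
    stock.set(order.productId, stock.get(order.productId) + order.quantity);
  }
  return order;
}
```

Both functions can be retried safely, which is what lets them run as independent steps behind a queue.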

Cart expiration, etc. makes the system unnaturally brittle by adding non-critical steps to the process.

pjc50 · 9y ago

You don't leave the transaction open while the user browses; you check for stock at the start, possibly moving items from "in stock" to "in cart" state. And then do the actual transaction for stock->sold at the time before you send off to the payment processor. If it's rejected, you return it to stock.

mbell · 9y ago

> You could easily `db.products.update({_id: productId}, {$inc: {inStock: -5}, $addToSet: {pendingCarts: {cartId: cartId, quantity: 5, timestamp: new Date()}}})`.

That is unlikely to work well at much scale. At least last I knew, Mongo docs are limited to 16MB and the entire doc is read then written in cases like this, very slow on large docs. Given the amount of data that may be attached to a `product`, it's not hard to hit these limits.

weddpros · 9y ago

Please do the math... Before this document reaches 16MB, you're bigger than Amazon. If this solution scales up to Amazon scale, that's good enough for me.
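
For anyone who wants to do that math, here is a back-of-envelope version. The 100 bytes per `pendingCarts` entry is an assumed round figure (a cart id, a quantity, a timestamp, and some overhead), not a measurement, and `maxPendingCarts` is a hypothetical helper. The result is the number of carts that could be pending on one product at the same time, given how much of the document the rest of the product data already takes up.

```javascript
// Back-of-envelope estimate; the per-entry size is an assumed round figure.
const MAX_DOCUMENT_BYTES = 16 * 1024 * 1024; // the 16MB document limit mentioned above
const ASSUMED_BYTES_PER_PENDING_CART = 100; // cartId + quantity + timestamp + overhead

// How many pending cart entries fit next to `otherProductBytes` of product data.
function maxPendingCarts(otherProductBytes = 0) {
  const available = MAX_DOCUMENT_BYTES - otherProductBytes;
  return Math.max(0, Math.floor(available / ASSUMED_BYTES_PER_PENDING_CART));
}

console.log(maxPendingCarts()); // 167772
console.log(maxPendingCarts(1024 * 1024)); // 157286, with 1MB of other product data
```

Under these assumptions the size limit is only reached with well over a hundred thousand simultaneously open carts for a single product. The estimate says nothing about the read and write cost of a large document, which is a separate concern.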

codedokode · 9y ago

The example in the book is very simple: it can be implemented in a SQL database with 2 tables (products and carts). When you have more entities and relations, it becomes too complex to keep all of them consistent in a denormalized schema in MongoDB. You will have to write cron jobs that clean up broken references, and you will still get errors.

So what I wanted to say is that denormalization and the lack of foreign keys in MongoDB are much worse than the lack of transactions.

gaius · 9y ago

> people would have designed a shopping cart use-case as a single SQL transaction that would last maybe 10 minutes

This problem was solved in 1965 by CICS for the use case of "you're on the phone to a travel agent and they're finding you a ticket on their terminal". No "10 minute single transactions" anywhere...

> In the web age, you want stateless services and HA

Those who forget history are doomed to repeat it.

weddpros · 9y ago

I thought about CICS too, but I guess few even know what it is. SQL was the contender here...

bsg75 · 9y ago · 6 in thread

Did I read that right? A document per item in inventory? This seems horribly inefficient.

BjoernKW · 9y ago

It depends. If you use the event sourcing design pattern and CQRS, this can be very efficient, especially if you have a huge number of purchase and sale events.

lilbobbytables · 9y ago

As someone not versed in Mongo or anything similar, what is the alternative? A large array of docs that has to be loaded into memory to work with?

pjungwir · 9y ago

From the article:

    > The example is that if you have 10 rakes in the stores,
    > you can only sell 10 rakes. The approach that is taken
    > is quite nice, by simulating the notion of having a
    > document per each of the rakes in the store and allowing
    > users to place them in their cart.

In other words there are 10 documents in Mongo, not 1 document with a `"quantity": 10` attribute.

wolco · 9y ago

A products table with a document per product. A user table. An order table with line items in an array of embedded objects.

One would subtract the purchased quantity from the amount field in the products table to set the new inventory level. A new order document gets created with all the information needed to describe an order. Fields like total, subtotal, and date would sit at the root level, and line items with product descriptions and prices would be embedded as an array of objects. Then a user object with user_id, name, and address would be embedded.

Dealing with documents is different, but being able to contain the entire dataset with some relations is nice.
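
As an illustration of that layout, here is a sketch of how such an order document might be assembled. The `buildOrder` helper, the `*Cents` field names, and the choice to store money as integer cents are assumptions, not part of the comment above.

```javascript
// Hypothetical order document: totals and date at the root, the user and the
// line items embedded as copies so the order stays readable on its own.
function buildOrder(user, lineItems, taxCents, date = new Date()) {
  const subtotalCents = lineItems.reduce(
    (sum, item) => sum + item.unitPriceCents * item.quantity,
    0
  );
  return {
    date,
    subtotalCents,
    totalCents: subtotalCents + taxCents,
    // Only the fields the order needs are copied from the user record.
    user: { user_id: user.user_id, name: user.name, address: user.address },
    // Descriptions and prices are copied so later catalog edits don't alter old orders.
    lineItems: lineItems.map((item) => ({ ...item })),
  };
}
```

Copying the product description and price into each line item is the denormalization trade-off discussed elsewhere in this thread: the order is self-contained, but nothing keeps it in sync with the products collection.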

mrlinx · 9y ago

What else would you suggest?

bildung · 9y ago

{"product":"rake", "in_stock":10}

russdpale · 9y ago · 5 in thread

MongoDB has purpose, but inventory management is not one of those purposes.

camus2 · 9y ago

> MongoDB has purpose, but inventory management is not one of those purposes.

What purpose does it have, frankly? The only one I see might be GridFS, for what it is worth, though I don't believe for one second that the performance is that great. When it comes to document-oriented DBs, Postgres can store both JSON and XML, query them, and also do partial atomic changes. Scaling? Easier, maybe... Now, competition is good, and I'm sure the success of NoSQL DBs kind of forced traditional players to innovate. But I see no reason to use MongoDB in 2017.

weddpros · 9y ago

"No reason to use MongoDB in 2017"

I would reply: replica sets and sharding and multi-threaded architecture and the absence of impedance mismatch, all in a single product that was designed for these features.

paulddraper · 9y ago

PostgreSQL has never had a great story for sharding large datasets.

codedokode · 9y ago

Maybe it can be used as an advanced cache with persistence and secondary indexes?

s_kilk · 9y ago

Indeed.

The truth is that MongoDB is well suited to a fairly narrow set of use-cases, but ends up being used for all sorts of stuff in practice. Hence the weird contortions which the author observes in the book they're reading.

binocarlos · 9y ago · 3 in thread

I remember writing documentation for a Perl system in 2001 - it was using MySQL MyISAM tables and the main developer had a few hundred thousand lines of Perl that acted as the same kind of client attempt at transactions. It was a mess and huge amounts of money were spent on trying to get the thing to work. A few months later InnoDB came along and made it apparent that trying to write transaction logic in the client was a very bad idea, which seems to be the point of this article.

tacostakohashi · 9y ago

That was a terrible time in history - lots of Perl web apps built using MySQL or mSQL, both of which lacked transactions, right at a time when e-commerce was taking off.

Although Oracle, Sybase, SQL Server and friends all had transactions at that time, somehow the mindset was that they were a complicated enterprise marketing gimmick, that MySQL/mSQL were faster and simpler, and that we could work around it on the client side. Seems like not much has changed.

gaius · 9y ago

> That was a terrible time in history - lots of Perl web apps built using MySQL or mSQL

It was made worse by the MySQL team actively advocating against features they didn't have: "you don't need transactions, do it in your application", "you don't need foreign keys, do it in your application", blah blah.

20+ years later they're still struggling to shoehorn it in.

z3t4 · 9y ago

lock tables

ojosilva · 9y ago · 2 in thread

Just to make it clear, the issue the OP is pointing out is with whoever wrote this inexcusable piece of code in the book, not MongoDB itself. Even if the authors later clarify that there's a possibility of error, just publishing such a misleading and grotesque solution for creating transactions in a non-ACID database is very poor judgement.

MongoDB should not be used for an online ordering system, period. But if the programmer had no better alternative than Mongo, then please use Mongo's atomic operations [1] and nested documents to make sure nasty Bad Things don't happen.

[1] https://docs.mongodb.com/manual/tutorial/model-data-for-atom...

tyingq · 9y ago

In the case of an ecommerce site, where multiple products can go in a cart, I don't see any real right way to do things with MongoDB. Atomic for one row just doesn't help much, and you can't nest arbitrary cart product mixes.

As you say, bad idea in the first place.

electricEmu · 9y ago

I was part of a team that successfully did inventory in mongo. The cart makes zero difference if Mongo is only storing the inventory. It's fine to atomically change a single inventory row.

Operations that span multiple rows can be safely performed with a bus in front of it.

You can't judge the idea. I don't think you've quite grasped it. I agree the book example isn't great, but that's not the technology's fault.

nwatson · 9y ago · 1 in thread

My prior startup tried to do a security product with backend in Mongo. It really needed transactions and to avoid N+1 issues.

DB team insisted on writing "DAOs" that ended up pulling 1GB+ of data back from Mongo to merge in EACH of 100+ data points from a scanned machine. Similar issues in UI presentation. With multiple threads doing each of these things simultaneously there were many out of memory dumps. I analyzed these multiple times and told the DK VP Engineering what the problems were, and they didn't follow up for 6 months. He was gone soon after.

united893 · 9y ago

> DB team insisted on writing "DAOs" that ended up pulling 1GB+ of data

That DB team shouldn't be allowed near any database. Why on earth would they go for such a moronic abstraction?

united893 · 9y ago · 1 in thread

Should have a disclaimer: the author is the founder of RavenDB, and it's clear he's cherry-picking things and blaming them on the database vendor instead of whoever wrote that example.

icebraining · 8y ago

Where did he blame it on the database vendor?

mindcrash · 9y ago · 1 in thread

Particularly fitting comment:

"Mongodb, the ultimate Maybe monad. With a built in fromMaybe mempty call for your convenience."

Per Hmemcpy and Michael Snoyman on Twitter.

PeCaN · 9y ago

Makes it great for building Snapchat clones though!

codedokode · 9y ago · 1 in thread

They should not try to emulate SQL databases; there are other ways to manage inventory without transactions.

One way is to add a field to an item that shows its status: whether it is in a warehouse, in someone's cart, ordered or sold. Then adding an item to a cart means updating those fields. There probably is a way to do several similar updates atomically.

Another way is to use an append-only collection that keeps a list of events, like "Item X added to cart Y", "Item X sent to delivery".

But I guess when there are more entities and relations this would become too complex to manage, whereas SQL databases have no problems with hundreds of tables and thousands of columns.

elmigranto · 8y ago

> Then adding an item to a cart means updating those fields. There probably is a way to do several similar updates atomically.

Atomicity is document level, not collection level, so you can't update multiple documents atomically. Or do you plan on having `{status: 'in-cart', cartOwner: 'customer-id' | null}` and a single document for every stocked item (so 1k copies of the same book would be 1k db documents, and you would also keep all those sold before)?

> Another way is to use an append-only collection

How does it help with overselling? To decide if it's okay to append, you have to know if current number of items is greater than 0 (don't forget to lock other clients out of appending this whole process, so they wait for you to finish).
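
One way to picture the coordination this question is pointing at is a conditional append: the write only succeeds if the log has not grown since the writer read it, and the writer retries otherwise. The sketch below is an in-memory illustration under that assumption. `appendIfUnchanged`, `addToCart`, and the event types are hypothetical names, and a real store would have to provide the conditional append itself.

```javascript
// Hypothetical append-only event log for one product. An append succeeds only
// if nobody else has appended since the caller read the log (optimistic check).
const events = [{ type: "received", quantity: 2 }];

// Replay the log: "received" events add stock, every other event removes it.
function currentStock(log) {
  return log.reduce(
    (n, e) => n + (e.type === "received" ? e.quantity : -e.quantity),
    0
  );
}

function appendIfUnchanged(expectedLength, event) {
  if (events.length !== expectedLength) return false; // someone else appended first
  events.push(event);
  return true;
}

function addToCart(cartId, quantity) {
  for (let attempt = 0; attempt < 3; attempt++) {
    const seen = events.length; // the log version this decision is based on
    if (currentStock(events) < quantity) return "out-of-stock";
    if (appendIfUnchanged(seen, { type: "added-to-cart", cartId, quantity })) {
      return "ok";
    }
  }
  return "conflict";
}
```

This does not remove the need to serialize appends per product. It only replaces "lock everyone out while I decide" with "detect that someone got in first and decide again".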

moxious · 9y ago

New families of DB technologies generally traded certain things off (like ACID guarantees and transactions) in exchange for other things like scalability or flexibility. When someone comes back, and in user space reimplements the things that the DB intentionally traded off, you get the worst of all worlds. It's quite a bit like flattening one end of a screwdriver so that you can make it work to drive nails. Yes, you can make that work, and in some rare circumstances where you're trapped on a desert island that might be your only option.

The rest of us will just use a hammer.

valarauca1 · 9y ago

TLDR:

Using a database that doesn't offer ACID, in a manner that requires ACID has non-trivial associated costs. This may also leave you open to a number of strange situations, where inventory quantities are unknown or incorrect.

twothamendment · 8y ago

Thanks for bringing up nightmares. Inventory on the web is one thing; think about how you'd do this for inventory in person at the store, when a customer has product A in one hand and cash in the other. The inventory system says there aren't any in stock - should you sell them one? Of course.

Are you tracking individual lots of inventory and the costs you paid for them? Against which lot did you sell this one? You don't have any - so how do you calculate the margin for this item you sold but don't know how much it cost or where it came from? If it is returned, do you restock that inventory?

Mongo, SQL - they have their differences, but doing inventory management is tricky no matter what technology you use.
