The DynamoDB Book: Data Modeling with NoSQL and DynamoDB

(dynamodbbook.com)

245 points · abd12 · 6y ago · 110 comments

81 comments · 18 top-level

arpinum · 6y ago · 23 in thread

I bought the book, I read the book, I've used DynamoDB for a while. It didn't change my mind. DynamoDB makes tradeoffs in order to run at massive scale, but scale isn't a problem many people need solving when 2TB of RAM fits in a single box. Meanwhile I need to handle eventual consistency, an analytics pipeline, another database for fuzzy search, another geo lookup database, Lambda functions to do aggregations, and a pile of custom code. All while giving up tooling so readily available for the RDBMS world.

In a world where Opex is much higher than Capex DynamoDB might make sense, but for me server costs are 5% of dev costs. And even if it works from a cost perspective, how many AWS services have the console experience ruined by DynamoDB? The UI tricks you into thinking it's a data table with sortable columns, but no! DynamoDB limitations strike again and you are off on a journey of endless paging. The cost savings come at the expense of the user.

DynamoDB also isn't fast. 20ms for a query isn't fast, 30ms for an insert isn't fast. Yes, it's amazingly consistent and faster than other systems holding 500TB, but that isn't a use case for many users.

philipkglass · 6y ago

Others are comparing DynamoDB to Redis and Cassandra. It has additional limitations. These are fairly clearly spelled out but maybe weren't highlighted as prominently a few years back. (I say that because I inherited an application that made heavy use of DynamoDB but turned out not to be a great fit for DDB.)

- It provides rich types with some odd limitations: strings, sets, lists, and binaries do not allow empty values.

- You can store a maximum of 400 KB of data in one row.

- You can get a maximum of 1 MB of data returned in a single query.

So it's mostly good for high-data-throughput applications, and then only if your high data throughput consists of large numbers of small records, processed a few at a time. This surely describes an important class of workloads. You may suffer if your workload isn't in this class.
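
The 1 MB cap means a query that matches more data comes back in pages, and the caller has to keep asking for the next one. A minimal sketch of that loop, assuming a hypothetical `query_page` callable in place of a real client call; `LastEvaluatedKey` and `ExclusiveStartKey` are the cursor fields DynamoDB uses, while `make_fake_table` is an in-memory stand-in invented purely for illustration:

```python
from typing import Callable, Iterator, Optional


def iterate_all_items(query_page: Callable[[Optional[dict]], dict]) -> Iterator[dict]:
    """Yield every item from a paged query.

    `query_page` is a hypothetical stand-in for a client call: it receives the
    ExclusiveStartKey (None for the first page) and returns a response dict
    with "Items" and, when more pages remain, "LastEvaluatedKey".
    """
    start_key = None
    while True:
        response = query_page(start_key)
        yield from response.get("Items", [])
        start_key = response.get("LastEvaluatedKey")
        if start_key is None:
            break  # no cursor returned, so this was the last page


def make_fake_table(rows: list, page_size: int) -> Callable[[Optional[dict]], dict]:
    """In-memory fake that pages through `rows` (illustration only)."""
    def query_page(start_key: Optional[dict]) -> dict:
        offset = 0 if start_key is None else start_key["offset"]
        response = {"Items": rows[offset:offset + page_size]}
        if offset + page_size < len(rows):
            response["LastEvaluatedKey"] = {"offset": offset + page_size}
        return response
    return query_page
```

With a real client, `query_page` would wrap the actual query call and pass the cursor through unchanged.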

Another annoyance is that (in my experience) one of the most common errors you will encounter is ProvisionedThroughputExceededException, when your workload changes faster than the auto-scaling. Until last year you couldn't test this scenario offline with the DynamoDB Local service because DynamoDB Local didn't implement capacity limits.

outworlder · 6y ago

> - It provides rich types with some odd limitations: strings, sets, lists, and binaries do not allow empty values.

That is _infuriating_

It's documented, but it is so surprising when you first hit it. Sometimes, empty values have semantics attached to them, I don't want to scrub them out.

lbruck · 6y ago

Regarding the empty string and binary values: https://aws.amazon.com/about-aws/whats-new/2020/05/amazon-dy...

(Disclosure: I work for AWS on DynamoDB and on this)

dropofwill · 6y ago

Note that Cassandra has similar limitations with data/throughput, but they aren't enforced or documented (because they depend on your particular setup) and your queries just fail or, worse, make all queries to the same node in the cluster fail (fun times with large wide rows).

The rich data types in Dynamo are quite strange: since they're basically useless for querying, I'm not sure why you would use them. Maybe I'm missing something...

DVassallo · 6y ago

If you treat DynamoDB as a DBMS, you’re going to be disappointed (for the reasons you mention). But if you think of it as a highly-durable immediately-consistent btree in the cloud, it’s amazing. DynamoDB is closer to Redis than MySQL. Amazon does it a disservice by putting it in the databases category.

arpinum · 6y ago

The indexes are not immediately consistent.

It's not just that it is put in the database category, but that its champions at AWS make statements like "if you are utilising RDBMS you are living in the past", or that "there are very few use cases to choose Postgres over DynamoDB".

Btw, loved your AWS book!

avip · 6y ago

DynamoDB is like Redis without the fun data structures, the fantastic CLI and discoverability, the useful configurable tradeoff between fast and consistent, and really much-needed features such as listing your keys.

abd12 (OP) · 6y ago

Daniel, I'm a big fan of yours but disagree with this take :).

It's definitely a database. The modeling principles are different, and you won't get some of the niceties you get with a RDBMS, but it still allows for flexible querying and more.

S3 is not a database, but DynamoDB is :).

faheel · 6y ago

I agree. DynamoDB is like a serverless child of Redis and MongoDB.

balfirevic · 6y ago

> If you treat DynamoDB as a DBMS, you’re going to be disappointed

Did you mean to say "as a RDBMS"? Because I don't see how it's not a DBMS.

abd12 (OP) · 6y ago

Fair enough! I think that's a reasonable position.

IMO, there are two times you should absolutely default to DynamoDB:

- Very high scale workloads, due to its scaling characteristics

- Workloads w/ serverless compute (aka Lambda) due to how well it fits with the connection model, provisioning model, etc.

You can use DynamoDB for almost all OLTP workloads, but outside of those two categories, I won't fault you for choosing an RDBMS.

Agree that DynamoDB isn't _blazing_ fast. It's more that it's extremely consistent. You're going to get ~10 millisecond response times when you have 1GB of data or when you have 10 TB of data, and that's pretty attractive.

scarface74 · 6y ago

> Workloads w/ serverless compute (aka Lambda) due to how well it fits with the connection model, provisioning model, etc.

If you can use Aurora Serverless, the Data API makes sense for Lambda.

https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide...

sudhirj · 6y ago

Not to mention the same also applies for load. You get about 10ms at 10, 1000 or 1000000 requests per second, again irrespective of how much data you have.

bufferoverflow · 6y ago

There's a third use: if you want a free ride, AWS free tier for DynamoDB is quite nice, enough to run a decent dynamic website.

arpinum · 6y ago

> Workloads w/ serverless compute (aka Lambda) due to how well it fits with the connection model, provisioning model, etc.

This is only true for AWS. Azure Functions share resources and don't have this issue.

The speed is actually quite sad. It's 5-10x slower than my other databases at p95, and I can't throw money at the problem on the write side. Reads I can use DAX, but then there goes consistency.

dropofwill · 6y ago

RDS maxes out RAM at 768GiB, if we're comparing managed to managed.

If you're approaching that point, you already are going to need an analytics pipeline, a search DB, etc, because maintaining ever growing indices will kill your latency. You probably can get away with aggregations for a bit longer, but if the number of rows you aggregate is growing too, eventually you will need to come up with something and the way you do that with Dynamo off a stream isn't a bad way to go about it with MySQL either.

Looking at the tables I have access to, they all come under 5ms for both read/write. This is the same ballpark as our MySQL apps for similar style queries (i.e. not aggregations).

Sadly my favorite reason to use Dynamo is political, not technical. Since it somehow is not classified as a database at my company, the DBAs don't 'own' it. So I don't have to wait 2-3 months for them to manually configure something.

Conway's law strikes again.

arpinum · 6y ago

> RDS maxes out RAM at 768GiB

RDS goes up to 4TB on the X1e instance type. But the point is RDBMS systems handle a large amount of data and workload types before needing to reach for specialist systems.

I don't know how you are doing write transactions in 5ms on DynamoDB. Single puts at p50 maybe, but I've never seen p90 put operations below 10ms.

abd12 (OP) · 6y ago

Haha, I love that story at the end. I promise not to tell your company that it is a database.

Ambol · 6y ago

> but scale isn't a problem many people need solving when 2TB of RAM fits in a single box.

What's the price of that on the cloud? I know I can run crazy big tables on DynamoDB for a couple of dollars. I don't know what 1 month of a relational database with 2TB of RAM costs on the cloud, but I am pretty sure I can't afford it.

StreamBright · 6y ago

We used to run Riak for Dynamo-like workloads very efficiently, 30ms p50 insertion time.

ryanmarsh · 6y ago

Very disappointed to find the top comment is about DynamoDB and not Alex and his wonderful book. I suppose this is par for the course with HN. I hope nothing I create ever ends up posted here.

arpinum · 6y ago

You are disappointed the comment is about the subject of the book from someone who read it and didn’t find it a compelling read? I didn’t like the book because it continued a trend in mis-selling DynamoDB and my comment reflects that frustration. Sorry but not every review is going to be glowing, and I’m certainly not going to make personal comments about the author.

qaq · 6y ago

6TB fits in a single box

abd12 (OP) · 6y ago · 13 in thread

*waves* Author here. Happy to answer any questions folks have about the book, about DynamoDB, or about self-publishing.

NoSQL modeling is waaay different than relational modeling. I think a lot of NoSQL advice out there is pretty bad, which results in people dismissing the technology altogether. I've been working with DynamoDB for a few years now, and there's no way I'll go back.

The book has been available for about a month now, and I've been pretty happy with the reception. Strong support from Rick Houlihan (AWS DynamoDB wizard) and a lot of other folks at AWS.

You can get a free preview by signing up at the landing page. If you buy and don't like it, there's a full money-back guarantee with no questions asked. Also, if you're having income problems due to COVID, hit me up and we'll make something work :)

Anyhow, hit me up with questions!

EDIT: Added a coupon code for folks hearing about the book here. Use the code "HACKERNEWS" to save $20 on Basic, $30 on Plus, or $50 on Premium. :)

eloff · 6y ago

The biggest problem I'm aware of with DynamoDB is the hot key / partition issue[1]. Throughput is distributed evenly across nodes, you can't control how many nodes you have, so you always have a node that's hot either temporarily or permanently and so you end up having to over provision all your nodes to be able to handle that hot case, which ends up costing far more than alternatives. What's your take on this? This is the chief reason I avoid DynamoDB, which in theory would be a good fit for some of my problems.

[1] https://syslog.ravelin.com/you-probably-shouldnt-use-dynamod...

luhn · 6y ago

As of a couple years ago, DynamoDB will redistribute throughput between shards based on usage [1], so in theory this should eliminate the hot shard problem. I haven't had a chance to test this in practice; if anybody has hands-on experience, I'd love to hear it.

You also finally have a way of identifying hot keys with the terribly named CloudWatch Contributor Insights for DynamoDB. [2]

For exceptional use cases, you also have the option of On-Demand Capacity to pay for what you use and not worry about capacity at all. [3]

[1] https://docs.aws.amazon.com/amazondynamodb/latest/developerg...

[2] https://docs.aws.amazon.com/amazondynamodb/latest/developerg...

[3] https://docs.aws.amazon.com/amazondynamodb/latest/developerg...

Judson · 6y ago

With instant adaptive capacity, I think quite a few hot key issues are mitigated.

https://aws.amazon.com/blogs/database/how-amazon-dynamodb-ad...

abd12 (OP) · 6y ago

luhn responded to this one pretty well :)

Basically, most of these issues are gone. As long as you don't have extreme skew in your partition keys, you don't need to worry about throughput limits.

bwarren2 · 6y ago

Bought the book, thank you!

What was your approach to self-publishing here? What tools did you use? If I wanted to publish a book but knew nothing about it, what resources should I read and what approach would you recommend?

abd12 (OP) · 6y ago

Thank you for your support!

The biggest advice I can give you is not about any specific tool, it's about an approach. You need to think about how you will market the book if you're self-publishing.

Engage with the community that will be interested in the book. Write articles, help out on Twitter, write code libraries, etc.

For me, I wrote DynamoDBGuide.com two and a half years ago over Christmas break. I wanted to just make an easier introduction to DynamoDB after I watched Rick Houlihan's talk at re:Invent (which is awesome).

That led to other opportunities and to me being seen as an 'expert' (even when I wasn't!). I got more questions and spent more time on DynamoDB to the point where I started to know more. I gave a few talks, etc.

I finally decided to do a book and set up a landing page and mailing list. I basically followed the playbook that Adam Wathan described for his first book launch.[0] Write in public, release sample chapters, engage with people, etc.

In terms of tooling, I used AsciiDoc to generate the book and Gumroad to sell. On a 1-10 scale, I'd give AsciiDoc a 5 and Gumroad an 8. But the tooling barely matters -- think about how to find the people that are interested :)

Happy to answer any other questions, either in public or via email.

[0] - https://adamwathan.me/the-book-launch-that-let-me-quit-my-jo...

bilalq · 6y ago

Just bought the book. I've been working at AWS and using DynamoDB for years now, but I'm sure there are things I could be doing better. I love that you've dedicated attention to analytics and operations too.

abd12 (OP) · 6y ago

Thank you! I really appreciate it :) Hit me up if you have any questions!

balfirevic · 6y ago

Honest question: would you say "NoSQL modeling is way more restrictive, labor intensive and painful, but in turn gives you consistent performance as you scale" is a fair characterization?

AlisdairO · 6y ago

I'd been sort-of considering buying this for a while and the coupon made me pull the trigger. Thanks!

dkobia · 6y ago

Any chance for a Kindle friendly mobi?

abd12 (OP) · 6y ago

Yep! It comes with PDF, MOBI, and EPUB formats :)

Scarbutt · 6y ago

> and there's no way I'll go back.

err.. back to what?

danenania · 6y ago · 6 in thread

DynamoDB is very compelling for performance, scalability, and low ops overhead, but I recommend thinking very carefully about the limited transaction support before going with it, as it’s likely to be a dealbreaker for many use cases, whether or not you realize that up front. I think most apps will need a transaction involving more than 25 rows at some point, and with dynamo your only option is to fire them off in groups of 25 and hope none fail (plenty will at scale).
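
To make the "groups of 25" workaround concrete, here is a minimal sketch. It assumes a hypothetical `submit_group` callable that performs one transactional write and raises on failure; the limit of 25 is the figure quoted above. Each group is atomic on its own, but the set as a whole is not, which is exactly the weakness being described:

```python
from typing import Callable, Iterable, Iterator, List

MAX_ITEMS_PER_TRANSACTION = 25  # per-transaction limit quoted in the comment above


def chunked(items: Iterable[dict], size: int = MAX_ITEMS_PER_TRANSACTION) -> Iterator[List[dict]]:
    """Split `items` into consecutive lists of at most `size` elements."""
    batch: List[dict] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def write_in_groups(items: Iterable[dict],
                    submit_group: Callable[[List[dict]], None]) -> List[List[dict]]:
    """Submit each group separately and return the groups that failed.

    `submit_group` is a hypothetical stand-in for one transactional write.
    Groups that already succeeded are NOT rolled back when a later one fails,
    so the caller has to retry or compensate for whatever comes back here.
    """
    failed: List[List[dict]] = []
    for group in chunked(items):
        try:
            submit_group(group)
        except Exception:
            failed.append(group)
    return failed
```

The returned list is the part an SQL transaction would have handled for you: with 60 rows and a failure in the second group, 35 rows are committed and 25 are not.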

You can get many of the benefits of Dynamo (sans auto-sharding) by applying its elegant indexing strategy to an SQL database. It will be as fast or faster, your transactions can be as big as you need them to be, and you retain the ability to occasionally fire off un-indexed ad hoc queries for development or convenience. Running and scaling an SQL db is also fairly painless these days with options like Aurora.

cactus2093 · 6y ago

Interesting, idk that I've ever needed a transaction with more than 25 rows.

But I agree in general about the limitations. Having used RDBMSes like Postgres a lot, as well as Cassandra and DynamoDB in production, I would almost certainly not create a new app with DynamoDB as the primary DB. Even if you have an app where you expect to need to scale writes heavily, it's not going to be on all tables equally. For instance, your users table, and related resources that are relatively small and grow linearly with your users, will probably fit fine in a Postgres DB for a very long time. And being able to have normalized models and powerful indexing and querying patterns available is a big benefit.

DynamoDB can work well for a specific sub-system that needs very high scalability. For instance, if you needed to store pairwise info between every user and product combination for some reason. Or if every user can upload a huge number of resources of some type (though the access patterns need to fit DynamoDB's constraints; if these are documents or files of some type then another system like S3 or Elasticsearch would probably make more sense). Or if you're tracking advertising views by an advertising identifier or something. Or scraping and importing a bunch of data from other places. In some specific use-cases like this, the downsides vs an RDBMS can be very minimal, and the built-in scalability can save you a ton of time vs having to constantly tune and potentially shard your RDBMS system.

But even in these cases, you might have better options depending on your access patterns. For instance if you don't ever need to refer to this data by reading it in an OLTP context, you might want to just write it to a log like Kafka to be ingested into Redshift or HDFS for offline processing or querying.

abd12 (OP) · 6y ago

I can understand the sentiment and don't fault you for it.

That said, I think you can definitely handle complex, relational patterns in DynamoDB pretty easily. It will take some work to learn new modeling patterns, but it's absolutely doable.

parsnips · 6y ago

Agreed. This is a limitation we ran into trying to implement a critical accounting ledger on top of DynamoDB. The transaction model we came up with is formally verified w/ TLA+. We're turning our work into a product: txlayer.com

cbdumas · 6y ago

> This is a limitation we ran into trying to implement a critical accounting ledger on top of DynamoDB.

Sounds like a perfect use case for a traditional RDBMS. Why Dynamo?

zemo · 6y ago

> I think most apps will need a transaction involving more than 25 rows at some point

I ... can't think of a single time I've ever needed this.

danenania · 6y ago

A common one is cascading deletes when you delete a user or a ‘project’ or something else that has a lot of stuff associated with it. Those will exceed 25 rows very quickly. Also any kind of bulk update or data import... hell even just initializing a new account can easily require writing more than 25 rows in a moderately complex app.

Niccizero · 6y ago · 6 in thread

$79 for the basic package? A bit pricey if you ask me.

abd12 (OP) · 6y ago

Fair enough! IMO, it's worth it :). You could spend a bunch of time cobbling together free resources, and you'd still only get about 30% of what's in the book. How much is your time worth as a software engineer?

That said, a few notes:

1. I added a coupon code ('HACKERNEWS') to knock $20 off Basic, $30 off Plus, and $50 off Premium.

2. If you're from a country where PPP makes this pretty expensive, hit me up. I'm happy to help.

3. If you're facing income challenges due to COVID-19, hit me up, I'm happy to help.

4. If this is unaffordable for any reason, hit me up, I'm happy to help. :)

MatthewPhillips · 6y ago

Your book does an excellent job explaining the single-table design pattern of DynamoDB. This pattern literally saves you money. So at a certain point you will earn back the $79 from a lower AWS bill (plus your applications will be much faster!)

cityzen · 6y ago

If your rate is low enough that you can learn everything that is in this book for $80 worth of your time, then sure. Price is relative, it's not like he's selling prescription drugs for $1000 per pill.

I bought it and have found it to be completely worth the money. I don't look at prices for these things in relation to how much other books cost but how much time it will save me.

znpy · 6y ago

Yeah I think that the medical comparison is a good idea.

We tend to criticize people in our industry for asking for a decent amount of money, whereas people in other industries shamelessly ask for ludicrous amounts of money for pretty much anything (think medical or legal).

mellavora · 6y ago

Alex was super-helpful to me. I had an edge-case problem using batch writes; the issue was assembling the batch in R to pass through the paws api, basically a bunch of really tricky nested lists, and I had one element out of place.

Alex answered my questions in such a way that I myself saw where the bug was in my code.

He saved me easily several hours of time.

At my hourly rate, this means that the book had a negative cost in my case.

I was able to repay the favor, I suggested an improvement to one code example in the book which Alex eagerly accepted.

abd12 (OP) · 6y ago

Thanks for your support! I'm grateful for the fix you suggested as well :)

seibelj · 6y ago · 4 in thread

DynamoDB is monster scale but... tricky to use, with a difficult pricing model. The paying for writers / readers thing is strange to me and makes it difficult to scale up for bursts. I recommend not using this tech for most things. You need to know exactly why you want to use it and have a good reason.

abd12 (OP) · 6y ago

I'd much rather pay for reads & writes directly rather than guessing at how my CPU and RAM will translate to the reads and writes that I need.

RDBMS capacity planning basically goes:

1. How much traffic will I get?

2. How much RAM & CPU will I need to handle the traffic from (1)?

With DynamoDB, you can skip the second question.
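
As a rough illustration of what "paying for reads and writes directly" looks like, here is a sketch of the capacity-unit arithmetic for provisioned mode. It assumes the standard unit definitions (one read unit covers one strongly consistent read per second of an item up to 4 KB, with eventually consistent reads costing half; one write unit covers one write per second of an item up to 1 KB); the function names and the traffic figures in the usage note are made up:

```python
import math


def read_capacity_units(reads_per_second: int, item_size_kb: float,
                        strongly_consistent: bool = True) -> int:
    """Read units needed for a steady read rate.

    Item size is rounded up to the next 4 KB step; eventually consistent
    reads are assumed to cost half as much as strongly consistent ones.
    """
    units_per_read = math.ceil(item_size_kb / 4)
    total = reads_per_second * units_per_read
    return total if strongly_consistent else math.ceil(total / 2)


def write_capacity_units(writes_per_second: int, item_size_kb: float) -> int:
    """Write units needed for a steady write rate (size rounded up to 1 KB steps)."""
    return writes_per_second * math.ceil(item_size_kb)
```

For example, 100 strongly consistent reads per second of 6 KB items works out to 200 read units, and 50 writes per second of 2.5 KB items to 150 write units. You still have to answer question (1); the translation from traffic to capacity is just arithmetic instead of benchmarking.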

maerF0x0 · 6y ago

> makes it difficult to scale up for bursts

Can you tell me why the On Demand mode doesn't work for you?

arpinum · 6y ago

7x the cost. I find it interesting that the DynamoDB cheer squad points out most databases only run at 10-15% utilisation and are burning money every hour. In the next breath they suggest running on demand "till it hurts" and paying AWS as if they were running at 15% utilisation.
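
The arithmetic behind that comparison can be sketched as follows. This takes the 7x figure above at face value and assumes a deliberately simplified model: provisioned capacity is billed whether used or not, and on-demand is billed per request at a fixed multiple of the fully utilised provisioned price. The function names are made up:

```python
def break_even_utilisation(on_demand_price_multiple: float) -> float:
    """Average utilisation below which on-demand is the cheaper mode.

    Provisioned cost is proportional to capacity; on-demand cost is
    proportional to utilisation * capacity * multiple. The two are equal
    when utilisation == 1 / multiple.
    """
    if on_demand_price_multiple <= 0:
        raise ValueError("price multiple must be positive")
    return 1.0 / on_demand_price_multiple


def cheaper_mode(average_utilisation: float, on_demand_price_multiple: float) -> str:
    """Return which billing mode costs less under the simplified model above."""
    if average_utilisation < break_even_utilisation(on_demand_price_multiple):
        return "on-demand"
    return "provisioned"
```

With a multiple of 7 the break-even sits at roughly 14% utilisation, which lines up with the 10-15% range mentioned above: on-demand only wins if you would otherwise be running provisioned capacity mostly idle.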

whalesalad · 6y ago

You need to build exponential-backoff logic into your system to handle waiting for Dynamo to warm up. It doesn't happen instantly.
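
A minimal sketch of that kind of retry loop, using exponential backoff with full jitter. `ThroughputExceeded` is a hypothetical stand-in for whatever throttling error your client raises, and the delay values are arbitrary; the AWS SDKs also ship their own retry handling, so treat this as an illustration of the idea rather than something you necessarily need to write yourself:

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ThroughputExceeded(Exception):
    """Hypothetical stand-in for a throttling error from the database client."""


def call_with_backoff(operation: Callable[[], T],
                      max_attempts: int = 5,
                      base_delay: float = 0.05,
                      max_delay: float = 2.0,
                      sleep: Callable[[float], None] = time.sleep,
                      rng: Callable[[], float] = random.random) -> T:
    """Run `operation`, retrying on throttling with exponential backoff.

    The wait before retry k is a random fraction (full jitter) of
    min(max_delay, base_delay * 2**k). The last failure is re-raised.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThroughputExceeded:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(rng() * cap)
    raise ValueError("max_attempts must be at least 1")
```

`sleep` and `rng` are injectable so the loop can be tested without real waiting.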

abarrettwilsdon · 6y ago · 2 in thread

I bought this a few weeks ago and am about 130 pages in.

It is just stunning how much better it is learning Dynamo/NoSQL in general from this than effectively any other source. Anyone who's had to rely on AWS docs knows how face-meltingly dense they can be.

I went back and refactored all my previous Dynamo work last night, and the difference was night and day. I'm planning to migrate some relational structures later this week, as well.

Is good book.

TheSpiciestDev · 6y ago

What has this book taught you that could be applied outside DynamoDB? I'm close to buying but the price is kinda steep... if however I can take away some general NoSQL insight then I'm sold.

Edit: never mind, I see another review elsewhere and the author replying. Though, your opinion would still be appreciated! :)

abd12 (OP) · 6y ago

Thank you for the kind words! :) Glad you're liking it.

Nican · 6y ago · 2 in thread

> While your relational database queries slow down as your data grows, DynamoDB keeps on going. It is designed to handle large, complex workloads without melting down.

I mean, hand a person a gun, and they might shoot themselves in the foot. While you can make bad queries/workloads for a relational database, you can just as easily make bad workloads for DynamoDB.

abd12 (OP) · 6y ago

My contention is that it's much easier to have an access pattern that won't scale in a relational database than in DynamoDB. DynamoDB basically removes all the things that can prevent you from scaling (JOINs, large aggregations, unbounded queries, fuzzy-search).

This is underrated, but it's really helpful. So many times w/ a relational database, I've had to tweak queries or access patterns over time as response times degrade. DynamoDB basically doesn't have that unless you really screw something up.

jeremyjh · 6y ago

So what is the cost of doing a bit of query tuning and de-norming every now and then compared to the development costs imposed by DynamoDB?

raynguyen · 6y ago · 2 in thread

This looks like a great resource. One thing I'm struggling with is the ability to sort and filter and was wondering if the book goes into detail about this topic.

Say I have a person entity with its attributes listed out in a table. How would you go about sorting by first name, last name, created at, etc.? I was thinking of streaming everything over to Elasticsearch, but that would add extra complexity to maintain.

abd12 (OP) · 6y ago

Yep! There are entire chapters on sorting & filtering. Note: it's different than in a relational database, but it's doable :)
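
For a flavour of how that can look, one common approach (not necessarily the one the book takes) is to copy the attribute you want to sort by into the sort key of a secondary index, since items that share a partition key come back ordered by sort key. A minimal in-memory sketch; the attribute names (`GSI1PK`, `GSI1SK`, and so on) and the `query_index` helper are made up for illustration and only simulate what an index query would return:

```python
from typing import List


def person_item(person_id: str, first_name: str, last_name: str, created_at: str) -> dict:
    """Build a person item carrying one extra key pair per desired sort order.

    `created_at` is expected as an ISO 8601 string, which sorts correctly as text.
    Putting every person under one index partition ("PERSONS") keeps the sketch
    short but would become a hot partition at scale.
    """
    return {
        "PK": f"PERSON#{person_id}",
        "SK": f"PERSON#{person_id}",
        "FirstName": first_name,
        "LastName": last_name,
        "CreatedAt": created_at,
        # Index 1: sorted by last name, then first name.
        "GSI1PK": "PERSONS",
        "GSI1SK": f"{last_name.lower()}#{first_name.lower()}#{person_id}",
        # Index 2: sorted by creation time.
        "GSI2PK": "PERSONS",
        "GSI2SK": f"{created_at}#{person_id}",
    }


def query_index(items: List[dict], pk_attr: str, sk_attr: str,
                pk_value: str, ascending: bool = True) -> List[dict]:
    """Simulate an index query: filter on partition key, order by sort key."""
    matching = [item for item in items if item.get(pk_attr) == pk_value]
    return sorted(matching, key=lambda item: item[sk_attr], reverse=not ascending)
```

Each new sort order costs another index and another pair of attributes to maintain on write, which is the trade you make compared with an ad hoc `ORDER BY`.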

raynguyen · 6y ago

Awesome! Glad to hear that there's a section on that. Quick question. I'm thinking of leveraging Elasticsearch for the full-text search capabilities. Is the work to get sorting on various different attributes heavy from a dev perspective, and are there any advantages of doing it through Dynamo rather than querying with Elasticsearch?

siscia · 6y ago · 1 in thread

I work a little outside the standard startup hyper-scale, fast growing business, so forgive my question.

But how widely used is DynamoDB? And for what use cases?

And what are the problems with it?

abd12 (OP) · 6y ago

In a nutshell:

- It was designed for super high scale use cases (think Amazon.com retail on Cyber Monday). It has decent adoption there. Competes mostly with Cassandra or other similar tools.

- With the introduction of AWS Lambda, it got more adoption in the 'serverless' ecosystem because of how well its connection model, provisioning model, and billing model work with Lambda. RDBMS doesn't work as well here.

A lot of people find 'problems' with it because they try to use it like a relational database, which it most certainly isn't. You have to model differently and think about it differently. The book helps here :).

agustif · 6y ago · 1 in thread

Can some knowledge be transferred to other NoSQL flavours like MongoDB, or is the book heavily specific to DynamoDB?

abd12 (OP) · 6y ago

All the examples are specific to DynamoDB and use DynamoDB features.

That said, the principles apply pretty well to other popular NoSQL databases, especially MongoDB and Cassandra. There will be some slight differences -- MongoDB allows better nesting and querying on nested objects -- but it's broadly the same. If you want to model NoSQL for scale, you need to use these general patterns.

If you want to check it out but find out it doesn't work for you, just let me know. I've got a 100% money-back guarantee with no questions asked if you don't like it.

timebomb0 · 6y ago · 1 in thread

The book looks great, but being a startup, the price is hard to swallow for 20+ engineers.

abd12 (OP) · 6y ago

Email me, and I'm happy to discuss :). alex@alexdebrie.com

haolez · 6y ago · 1 in thread

Does anyone have book recommendations on NoSQL modeling in general?

databrecht · 6y ago

Tbh I don't think that makes sense, since it depends on what your definition of NoSQL is. Some people say 'no relations', others say 'no SQL', others say 'eventual consistency'. Some people call FaunaDB NoSQL because it's distributed and scales, yet it offers strong consistency and relations, and hence normalized data and joins are an option.

In others, you might have relations but lose consistency; in others you might have relations but only keep consistency under specific conditions (sharding keys etc.).

NoSQL modeling typically depends on the specific characteristics of the database. Essentially it's about looking at these, seeing what the database doesn't offer, comparing that with what you need, and finding workarounds.

djstein · 6y ago · 1 in thread

Thanks for this. I just started creating my first DynamoDB database yesterday.

abd12 (OP) · 6y ago

Awesome! Hit me up if you have any questions :)

tinkertamper · 6y ago

After using Dynamo for 2 years now, the biggest problem I’ve seen thus far is the pretty extreme expectations it puts on your application code to manage things that have traditionally been considered the responsibility of the data store. We found it was a bit onerous to ensure all facets of modeling/validation/indexing were taken into consideration when writing that layer of the application. To address the constant bootstrapping you either end up with a crap ton of utilities that form indexes or create updateExpression strings, etc, or you end up constantly reinventing the wheel.
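
As an example of the kind of utility meant here, this is a minimal sketch of a helper that turns a dict of changes into an update request. `UpdateExpression`, `ExpressionAttributeNames` and `ExpressionAttributeValues` are the real request parameter names; the `build_update` function and its placeholder scheme are made up, and values are left as plain Python values because how they must be encoded depends on the client you use:

```python
from typing import Any, Dict


def build_update(changes: Dict[str, Any]) -> Dict[str, Any]:
    """Build SET-style update parameters from {attribute: new_value}.

    Every attribute name goes through a "#n" placeholder so that names which
    collide with reserved words are still accepted, and every value through
    a ":v" placeholder. Attributes are processed in sorted order so the
    output is deterministic.
    """
    if not changes:
        raise ValueError("at least one attribute change is required")
    names: Dict[str, str] = {}
    values: Dict[str, Any] = {}
    clauses = []
    for index, (attribute, value) in enumerate(sorted(changes.items())):
        name_token = f"#n{index}"
        value_token = f":v{index}"
        names[name_token] = attribute
        values[value_token] = value
        clauses.append(f"{name_token} = {value_token}")
    return {
        "UpdateExpression": "SET " + ", ".join(clauses),
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
    }
```

Calling `build_update({"status": "shipped", "count": 3})` gives the expression `SET #n0 = :v0, #n1 = :v1` with `#n0` mapped to `count` and `#n1` to `status`. Multiply this by removes, list appends and conditional writes and you can see where the pile of helpers comes from.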

The JS landscape for Dynamo is a bit bare; the notable options all largely ignore the indexing principles that are the real draw of Dynamo. This heartburn caused me to sit down and write a library myself (https://github.com/tywalch/electrodb) that allows you to focus on the models and relationships while taking care of all the little pitfalls and “hacky” tricks inherent in single table design.

Alex’s book covers all these things and I honestly wish I had had it sooner before having to learn via foot shooting. It’s pricey but if you have a need for Dynamo on your project it really pays off knowing you’re swimming with the current, and Alex definitely gets you there.

pier25 · 6y ago

For my current serverless project I'm using Fauna which I think is a better option than Dynamo. You get relations, complex queries, etc. You also get authentication and authorization baked-in.

I haven't done any serious tests but I'd say on average my reads to Fauna from Cloudflare workers are 30ms. Seems a lot compared to querying a local instance of Postgres but since Fauna is distributed you end up getting much better latency on average for your worldwide users compared to a single DB in us-east-1.

Writes take longer (probably around 200-300ms on average) but considering these are replicated to all Fauna servers with ACID I'm ok with that.

I wrote a little intro to Fauna's query language which is very powerful if anyone is interested:

https://github.com/PierBover/getting-started-fauna-db-fql

the_arun · 6y ago

What I like in DDB is TTL. It is a fantastic feature. I read someone comparing it with Redis. Redis is faster because of TCP connectivity, whereas DDB is over HTTP.
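
For anyone who hasn't used it: TTL works off a numeric attribute holding a Unix epoch timestamp in seconds, and expired items are removed by a background process some time after that moment rather than at it. A minimal sketch of stamping items and filtering out not-yet-deleted ones on read; the attribute name `expires_at` and both helper functions are made up for illustration:

```python
import time
from typing import Optional


def with_ttl(item: dict, ttl_seconds: int, now: Optional[float] = None,
             ttl_attribute: str = "expires_at") -> dict:
    """Return a copy of `item` with an expiry timestamp (epoch seconds) added."""
    current = time.time() if now is None else now
    stamped = dict(item)
    stamped[ttl_attribute] = int(current) + ttl_seconds
    return stamped


def is_expired(item: dict, now: Optional[float] = None,
               ttl_attribute: str = "expires_at") -> bool:
    """True if the item carries an expiry timestamp that has already passed.

    Useful on the read path, because an expired item can still be returned
    until the background deletion catches up with it.
    """
    current = time.time() if now is None else now
    return ttl_attribute in item and item[ttl_attribute] <= int(current)
```

Which attribute the table treats as the TTL attribute is a table-level setting, so the name used here has to match whatever you configure.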

albatross13 · 6y ago

Well I'm a sucker for this kind of stuff- how do the videos work in the premium package? Do I get to download them for offline viewing?

bangbig · 6y ago

Wonder if anyone agrees that Uber's order processing can be handled by DynamoDB very well.

j / k navigate · click thread line to collapse

110 comments

81 comments · 18 top-level

arpinum6y ago· 23 in thread

philipkglass6y ago

- It provides rich types with some odd limitations: strings, sets, lists, and binaries do not allow empty values.

- You can store a maximum 400 KB data in one row.

- You can get a maximum of 1 MB data returned in a single query.

outworlder6y ago

> - It provides rich types with some odd limitations: strings, sets, lists, and binaries do not allow empty values.

That is _infuriating_

It's documented, but it is so surprising when you first hit it. Sometimes, empty values have semantics attached to them, I don't want to scrub them out.

1 more reply

lbruck6y ago

Regarding the empty string and binary values: https://aws.amazon.com/about-aws/whats-new/2020/05/amazon-dy...

(Disclosure: I work for AWS on DynamoDB and on this)

dropofwill6y ago

The rich data types in Dynamo are quite strange, since they're basically useless for querying I'm not sure why you would use them. Maybe I'm missing something...

2 more replies

DVassallo6y ago

arpinum6y ago

The indexes are not immediately consistent.

Btw, loved your AWS book!.

avip6y ago

1 more reply

abd12OP6y ago

Daniel, I'm a big fan of yours but disagree with this take :).

It's definitely a database. The modeling principles are different, and you won't get some of the niceties you get with a RDBMS, but it still allows for flexible querying and more.

S3 is not a database, but DynamoDB is :).

1 more reply

faheel6y ago

I agree. DynamoDB is like a serverless child of Redis and MongoDB.

balfirevic6y ago

> If you treat DynamoDB as a DBMS, you’re going to be disappointed

Did you mean to say "as an RDBMS"? Because I don't see how it's not a DBMS.

abd12OP6y ago

Fair enough! I think that's a reasonable position.

IMO, there are two times you should absolutely default to DynamoDB:

- Very high scale workloads, due to its scaling characteristics

- Workloads w/ serverless compute (aka Lambda) due to how well it fits with the connection model, provisioning model, etc.

You can use DynamoDB for almost all OLTP workloads, but outside of those two categories, I won't fault you for choosing an RDBMS.

scarface746y ago

> Workloads w/ serverless compute (aka Lambda) due to how well it fits with the connection model, provisioning model, etc.

If you can use Aurora Serverless, the Data API makes sense for Lambda.

https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide...

sudhirj6y ago

Not to mention the same also applies for load. You get about 10ms at 10, 1000 or 1000000 requests per second, again irrespective of how much data you have.

bufferoverflow6y ago

There's a third use: if you want a free ride, AWS free tier for DynamoDB is quite nice, enough to run a decent dynamic website.

arpinum6y ago

> Workloads w/ serverless compute (aka Lambda) due to how well it fits with the connection model, provisioning model, etc.

This is only true for AWS. Azure Functions share resources and don't have this issue.

The speed is actually quite sad. It's 5-10x slower than my other databases at p95, and I can't throw money at the problem on the write side. For reads I can use DAX, but then there goes consistency.

dropofwill6y ago

RDS maxes out RAM at 768GiB, if we're comparing managed to managed.

Looking at the tables I have access to, they all come in under 5ms for both reads and writes. This is the same ballpark as our MySQL apps for similar style queries (i.e. not aggregations).

Conway's law strikes again.

arpinum6y ago

> RDS maxes out RAM at 768GiB

RDS goes up to 4TB on the X1e instance type. But the point is that RDBMS systems handle a large amount of data and many workload types before you need to reach for specialist systems.

I don't know how you are doing write transactions in 5ms on DynamoDB. Single puts at p50 maybe, but I've never seen p90 put operations below 10ms.

abd12OP6y ago

Haha, I love that story at the end. I promise not to tell your company that it is a database.

Ambol6y ago

> but scale isn't a problem many people need solving when 2TB of RAM fits in a single box.

StreamBright6y ago

We used to run Riak for Dynamo-like workloads very efficiently, with 30ms p50 insertion time.

ryanmarsh6y ago

Very disappointed to find the top comment is about DynamoDB and not Alex and his wonderful book. I suppose this is par for the course with HN. I hope nothing I create ever ends up posted here.

arpinum6y ago

qaq6y ago

6TB fits in a single box

abd12OP6y ago· 13 in thread

*waves* Author here. Happy to answer any questions folks have about the book, about DynamoDB, or about self-publishing.

The book has been available for about a month now, and I've been pretty happy with the reception. Strong support from Rick Houlihan (AWS DynamoDB wizard) and a lot of other folks at AWS.

Anyhow, hit me up with questions!

EDIT: Added a coupon code for folks hearing about the book here. Use the code "HACKERNEWS" to save $20 on Basic, $30 on Plus, or $50 on Premium. :)

eloff6y ago

[1] https://syslog.ravelin.com/you-probably-shouldnt-use-dynamod...

luhn6y ago

You also finally have a way of identifying hot keys with the terribly named CloudWatch Contributor Insights for DynamoDB. [2]

For exceptional use cases, you also have the option of On-Demand Capacity to pay for what you use and not worry about capacity at all. [3]

[1] https://docs.aws.amazon.com/amazondynamodb/latest/developerg...

[2] https://docs.aws.amazon.com/amazondynamodb/latest/developerg...

[3] https://docs.aws.amazon.com/amazondynamodb/latest/developerg...

Judson6y ago

With instant adaptive capacity, I think quite a few hot key issues are mitigated.

https://aws.amazon.com/blogs/database/how-amazon-dynamodb-ad...

abd12OP6y ago

luhn responded to this one pretty well :)

Basically, most of these issues are gone. As long as you don't have extreme skew in your partition keys, you don't need to worry about throughput limits.

bwarren26y ago

Bought the book, thank you!

What was your approach to self-publishing here? What tools did you use? If I wanted to publish a book but knew nothing about it, what resources should I read and what approach would you recommend?

abd12OP6y ago

Thank you for your support!

The biggest advice I can give you is not about any specific tool, it's about an approach. You need to think about how you will market the book if you're self-publishing.

Engage with the community that will be interested in the book. Write articles, help out on Twitter, write code libraries, etc.

Happy to answer any other questions, either in public or via email.

[0] - https://adamwathan.me/the-book-launch-that-let-me-quit-my-jo...

bilalq6y ago

abd12OP6y ago

Thank you! I really appreciate it :) Hit me up if you have any questions!

balfirevic6y ago

Honest question: would you say "NoSQL modeling is way more restrictive, labor intensive and painful, but in turn gives you consistent performance as you scale" is a fair characterization?

AlisdairO6y ago

I'd been sort-of considering buying this for a while and the coupon made me pull the trigger. Thanks!

dkobia6y ago

Any chance for a Kindle friendly mobi?

abd12OP6y ago

Yep! It comes with PDF, MOBI, and EPUB formats :)

Scarbutt6y ago

> and there's no way I'll go back.

err.. back to what?

danenania6y ago· 6 in thread

cactus20936y ago

Interesting, idk that I've ever needed a transaction with more than 25 rows.

abd12OP6y ago

I can understand the sentiment and don't fault you for it.

That said, I think you can definitely handle complex, relational patterns in DynamoDB pretty easily. It will take some work to learn new modeling patterns, but it's absolutely doable.

parsnips6y ago

cbdumas6y ago

> This is a limitation we ran into trying to implement a critical accounting ledger on top of DynamoDB.

Sounds like a perfect use case for a traditional RDBMS. Why Dynamo?

zemo6y ago

> I think most apps will need a transaction involving more than 25 rows at some point

I ... can't think of a single time I've ever needed this.

danenania6y ago

Niccizero6y ago· 6 in thread

$79 for the basic package? A bit pricey if you ask me.

abd12OP6y ago

That said, a few notes:

1. I added a coupon code ('HACKERNEWS') to knock $20 off Basic, $30 off Plus, and $50 off Premium.

2. If you're from a country where PPP makes this pretty expensive, hit me up. I'm happy to help.

3. If you're facing income challenges due to COVID-19, hit me up, I'm happy to help.

4. If this is unaffordable for any reason, hit me up, I'm happy to help. :)

MatthewPhillips6y ago

cityzen6y ago

I bought it and have found it to be completely worth the money. I don't look at prices for these things in relation to how much other books cost but how much time it will save me.

znpy6y ago

Yeah I think that the medical comparison is a good idea.

mellavora6y ago

Alex answered my questions in such a way that I myself saw where the bug was in my code.

He saved me easily several hours of time.

At my hourly rate, this means that the book had a negative cost in my case.

I was able to repay the favor: I suggested an improvement to one code example in the book, which Alex eagerly accepted.

abd12OP6y ago

Thanks for your support! I'm grateful for the fix you suggested as well :)

seibelj6y ago· 4 in thread

abd12OP6y ago

I'd much rather pay for reads & writes directly rather than guessing at how my CPU and RAM will translate to the reads and writes that I need.

RDBMS capacity planning basically goes:

1. How much traffic will I get?

2. How much RAM & CPU will I need to handle the traffic from (1)?

With DynamoDB, you can skip the second question.
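
To make "paying for reads and writes directly" concrete, here is a back-of-the-envelope sketch for provisioned mode. It relies on the published unit sizes: one read capacity unit covers one strongly consistent read per second of an item up to 4 KB (an eventually consistent read costs half), and one write capacity unit covers one write per second of an item up to 1 KB. The function names and the sample traffic figures are hypothetical.

```python
import math


def read_capacity_units(reads_per_sec, item_kb, strongly_consistent=False):
    """RCUs needed: each read is billed in 4 KB steps, rounded up."""
    units_per_read = math.ceil(item_kb / 4)
    total = reads_per_sec * units_per_read
    # Eventually consistent reads cost half as much as strongly consistent ones.
    return total if strongly_consistent else math.ceil(total / 2)


def write_capacity_units(writes_per_sec, item_kb):
    """WCUs needed: each write is billed in 1 KB steps, rounded up."""
    return writes_per_sec * math.ceil(item_kb)


# Hypothetical workload: 100 reads/s of 6 KB items, 50 writes/s of 2.5 KB items.
rcu = read_capacity_units(100, 6)
wcu = write_capacity_units(50, 2.5)
```

The only inputs are request rate and item size, which is the point being made above: no translation from traffic into CPU and RAM is involved.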

maerF0x06y ago

> makes it difficult to scale up for bursts

Can you tell me why the On Demand mode doesn't work for you?

arpinum6y ago

whalesalad6y ago

You need to build exponential-backoff logic into your system to handle waiting for Dynamo to warm up. It doesn't happen instantly.
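
A minimal sketch of that kind of retry logic, using exponential backoff with full jitter, could look like the following. `ThrottledError` and `call_with_backoff` are hypothetical names; a real client would catch whatever throttling exception its SDK raises, and the delay values here are placeholders rather than recommendations.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a throughput-exceeded error from the client."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.05,
                      max_delay=2.0, sleep=time.sleep):
    """Retry operation on throttling, doubling the delay cap each attempt."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error to the caller.
            cap = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: sleep a random amount up to the current cap.
            sleep(random.uniform(0, cap))
```

The jitter spreads out retries from many concurrent callers so they do not all hit the table again at the same instant. Passing `sleep` as a parameter is only there to make the function easy to test.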

abarrettwilsdon6y ago· 2 in thread

I bought this a few weeks ago and am about 130 pages in.

I went back and refactored all my previous Dynamo work last night, and the difference was night and day. I'm planning to migrate some relational structures later this week, as well.

Is good book.

TheSpiciestDev6y ago

What has this book taught you that could be applied outside DynamoDB? I'm close to buying but the price is kinda steep... if however I can take away some general NoSQL insight then I'm sold.

Edit: nevermind, I see another review elsewhere and the author replying. Though, your opinion would still be appreciated! :)

abd12OP6y ago

Thank you for the kind words! :) Glad you're liking it.

Nican6y ago· 2 in thread

> While your relational database queries slow down as your data grows, DynamoDB keeps on going. It is designed to handle large, complex workloads without melting down.

I mean, hand a person a gun, and they might shoot themselves in the foot. While you can make bad queries/workloads for a relational database, you can just as easily make bad workloads for DynamoDB.

abd12OP6y ago

jeremyjh6y ago

So what is the cost of doing a bit of query tuning and de-norming every now and then compared to the development costs imposed by DynamoDB?

raynguyen6y ago· 2 in thread

This looks like a great resource. One thing I'm struggling with is the ability to sort and filter and was wondering if the book goes into detail about this topic.

abd12OP6y ago

Yep! There are entire chapters on sorting & filtering. Note: it's different than in a relational database, but it's doable :)

raynguyen6y ago

siscia6y ago· 1 in thread

I work a little outside the standard startup world of hyper-scale, fast-growing businesses, so forgive my question.

But how widely used is DynamoDB? And for what use cases?

And what are the problems with it?

abd12OP6y ago

In a nutshell:

- It was designed for super high scale use cases (think Amazon.com retail on Cyber Monday). It has decent adoption there. Competes mostly with Cassandra or other similar tools.

agustif6y ago· 1 in thread

Can some knowledge be transferred to other NoSQL flavours like mongo or is the book heavily specific about DynamoDB?

abd12OP6y ago

All the examples are specific to DynamoDB and use DynamoDB features.

If you want to check it out but find out it doesn't work for you, just let me know. I've got a 100% money-back guarantee with no questions asked if you don't like it.

timebomb06y ago· 1 in thread

The book looks great, but being a startup, the price is hard to swallow for 20+ engineers.

abd12OP6y ago

Email me, and I'm happy to discuss :). alex@alexdebrie.com

haolez6y ago· 1 in thread

Does anyone have book recommendations on NoSQL modeling in general?

databrecht6y ago

In others, you might have relations but lose consistency, in others you might have relations but only keep consistency under specific conditions (sharding keys etc)

djstein6y ago· 1 in thread

thanks for this. I just started creating my first DynamoDB database yesterday

abd12OP6y ago

Awesome! Hit me up if you have any questions :)

tinkertamper6y ago

pier256y ago

For my current serverless project I'm using Fauna, which I think is a better option than Dynamo. You get relations, complex queries, etc. You also get authentication and authorization baked in.

Writes take longer (probably around 200-300ms on average), but considering these are replicated to all Fauna servers with ACID guarantees, I'm OK with that.

I wrote a little intro to Fauna's query language which is very powerful if anyone is interested:

https://github.com/PierBover/getting-started-fauna-db-fql
