D1: Our SQL database

(blog.cloudflare.com)

592 points · elithrar · 4y ago · 228 comments

150 comments · 43 top-level

kurinikku4y ago· 20 in thread

wow SQLite getting a lot of love these days

https://tailscale.com/blog/database-for-2022

https://fly.io/blog/all-in-on-sqlite-litestream

https://blog.cloudflare.com/introducing-d1

peterhunt4y ago

SQLite is great but it's way overhyped and abused on HN. People are very eager to turn SQLite into a durable, distributed database and it's really not meant for that, and by going down that road instead of using something like MySQL or Postgres you're missing out on lots of important functionality and tooling.

I only say this because I have made this mistake at my previous startup. We built these really cool distributed databases on top of a similar storage engine (RocksDB) plus Kafka, but it ended up being more trouble than it was worth. We should have just used a battle-tested relational database instead.

Using SQLite for these applications is really fun, and it seems like a good idea on paper. But in practice I just don't think it's worth it. YMMV though.

manigandham4y ago

So you didn't use SQLite then? Because RocksDB + Kafka is not similar at all.

Also databases all use the same fundamental primitives and it's up to you to choose the level of abstraction you need. For example, FoundationDB is a durable distributed database that uses SQLite underneath as the storage layer but exposes an ordered key/value API, but then allows you to build your own relational DB on top.

If you just needed distributed SQL because a single instance wasn't enough then there are already plenty of choices like CockroachDB/Yugabyte/TiDB/Memsql/etc that can serve the purpose instead of building your own.

samatman4y ago

I accept that you learned a lot about the limits of combining RocksDB with Kafka, especially in the exact way you combined them.

This might have limited utility if the goal were to combine RocksDB with something else. And even less for SQLite and something else.

The big push of interest in SQLite serverside isn't driven by people who have never set up pgbouncer, but rather by developers who have both read the SQLite docs very carefully and have used the library extensively, and know what it's good for.

ripley124y ago

I'm not sure why you concluded that SQLite is the problem when you built a "really cool distributed database" with Kafka. Distributed databases are complicated, Kafka's complicated.

If you're saying that a replicated Postgres setup would be simpler than what you've built, I agree; but SQLite+Litestream probably would be too.

MuffinFlavored4y ago

Is this any good? https://github.com/rqlite/rqlite

I've been looking for a turn key solution that is better than me running a single node Postgres instance "bare metal" or in a container.

postgres-operator seems cool but... k8s, pretty heavy I guess.

jen204y ago

It’s the default storage engine for FoundationDB - not sure many would agree that isn’t a “durable, distributed database”.

jgrahamc4y ago

SQLite has been cool forever. It was the underlying data store for my machine learning email filter POPFile 20 years ago!

https://en.wikipedia.org/wiki/POPFile https://getpopfile.org/browser/trunk/engine/POPFile/Database...

runlevel14y ago

It's high-quality software too. It's well-commented and exceptionally well tested.[1][2]

> As of version 3.33.0 (2020-08-14), the SQLite library consists of approximately 143.4 KSLOC of C code. ... By comparison, the project has 640 times as much test code and test scripts - 91911.0 KSLOC.

I don't usually place much stock in those sorts of counts, but 640x is notable.

It makes sense considering the wide variety of use-cases, from embedded devices to edge computing and everything in between.

[1]: https://www.sqlite.org/testing.html [2]: https://sqlite.org/src/dir?ci=trunk

sigzero4y ago

I used POPFile!! It was awesome.

alberth4y ago

SQLite was originally great for desktop applications.

Problem is, there's still a huge market for these apps but everything has moved to the web (no one is making desktop apps anymore). So having a full-blown RDBMS is overkill for this kind of app, and now SQLite is starting to fill these web app needs.

@sqlite - if you are reading this, any word on merging WAL2 and BEGIN CONCURRENT into main? There clearly is a new class of needs to do so in this world that has completely moved over to web app development (which introduces concurrency problems never experienced on desktop). Any thoughts of focusing more on these web related needs for SQLite (or maybe even fork your own code base to have a more enhanced SQLite version targeted at web needs)?

jchw4y ago

I think it’s long overdue. While SQLite certainly has its limitations, it’s a winner in many categories. Even for sites with mild traffic using ordinary SQLite in PHP like a decade ago, it was always nice to use for its simplicity and the performance was totally acceptable. In comparison, the memory usage of typical relational database servers was high enough to make it hard to fit on a single lowend VPS with the same data and traffic. (I found myself tuning MySQL, but I never needed to tune SQLite.)

Cthulhu_4y ago

The main thing for tuning SQLite will be how to open it, e.g. in write-ahead mode, to turn on foreign keys (this needs to be enabled manually), and whether it should wait to get a database lock on slower hardware before giving up. There's also some gotchas like if you mark an ID column as primary key, it'll use the rowid as key - which can be reused if a row is removed. So you need to explicitly set primary key AND autoincrement, else you're going to have a bad time. (https://www.sqlite.org/autoinc.html)
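A minimal sketch of the settings and the rowid gotcha described above, using Python's built-in `sqlite3` module. The function names (`open_tuned`, `next_id_after_deleting_last`) and the 5-second timeout are made up for illustration, not recommendations from the comment.

```python
import sqlite3


def open_tuned(path: str) -> sqlite3.Connection:
    """Open a database with the three settings mentioned above."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")   # write-ahead logging
    conn.execute("PRAGMA foreign_keys=ON")    # off by default, set per connection
    conn.execute("PRAGMA busy_timeout=5000")  # wait up to 5 s for a lock (assumed value)
    return conn


def next_id_after_deleting_last(autoincrement: bool) -> int:
    """Insert three rows, delete the newest, insert again, return the new id."""
    conn = sqlite3.connect(":memory:")
    pk = "INTEGER PRIMARY KEY" + (" AUTOINCREMENT" if autoincrement else "")
    conn.execute(f"CREATE TABLE t (id {pk}, name TEXT)")
    conn.executemany("INSERT INTO t (name) VALUES (?)", [("a",), ("b",), ("c",)])
    conn.execute("DELETE FROM t WHERE id = 3")
    cur = conn.execute("INSERT INTO t (name) VALUES ('d')")
    return cur.lastrowid


print(next_id_after_deleting_last(autoincrement=False))  # 3: the deleted id is reused
print(next_id_after_deleting_last(autoincrement=True))   # 4: ids are never reused
```

Reuse only happens here because the deleted row held the largest rowid; with AUTOINCREMENT, SQLite tracks the highest id ever issued in its `sqlite_sequence` table.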

RaoulP4y ago

With the mileage (and attention) those new products are getting out of using SQLite, I think Richard Hipp deserves a lot more acknowledgement for creating such an amazing piece of software.

sophacles4y ago

New products getting a lot of mileage out of sqlite is old-hat at this point. It's one of those rare evergreen techs - pay attention for a while and this latest round of attention will die down for 6-12 months then someone else will start another round of "look how cool sqlite is".

At least that's been my observation since I started coming around here.

tootie4y ago

I'm wondering if we'll see some similar energy around non-sql embedded databases like leveldb or rocksdb

sanderjd4y ago

Right! SQLite is great, but those two are great as well. It seems like the energy should be around "hey, you should consider a local, maybe even in-memory, database for some things!" more so than specifically "SQLite is great" (though it is).

Thaxll4y ago

Well I don't think it's a good fit for regular service, exactly how do you handle 2 replicas of the same service talking to the same DB?

The fact that it's just a file on disk limits the usage.

nibab4y ago

Projects such as litestream and rqlite have this figured out.

manigandham4y ago

Transactions, locks, queues, etc. No different than multiple app instances changing the same row in other databases.

Any state mutation is ultimately ordered in time and how that ordering is accomplished depends on the abstractions you're using: in your app, network layer, database, etc.

mbreese4y ago

I think one way to think about this is to have one database being tied to one replica (replicas could handle more than one database). Where (importantly) the idea would be one database for each user. You horizontally scale for the number of users, but each user is only using one end node.

It’s interesting because you have to consider how to scale your database as well as your application. The fact that you don’t have one central database opens up more possibilities. But it doesn’t work for all instances (such as a shared read-write data source for all users). For example, this approach wouldn’t work for something like Twitter (at least the original architecture).
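A rough sketch of the "one database per user" idea, assuming plain SQLite files on local disk and Python's `sqlite3` module. `PerUserStore`, the `notes` table, and the file naming are all hypothetical; this says nothing about how D1 or any hosted service lays out its data.

```python
import sqlite3
import tempfile
from pathlib import Path


class PerUserStore:
    """Hypothetical router: one SQLite file per user ID."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def connect(self, user_id: int) -> sqlite3.Connection:
        # int() keeps arbitrary strings out of the file name.
        conn = sqlite3.connect(str(self.root / f"user_{int(user_id)}.db"))
        conn.execute(
            "CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT)"
        )
        return conn

    def add_note(self, user_id: int, body: str) -> None:
        conn = self.connect(user_id)
        try:
            conn.execute("INSERT INTO notes (body) VALUES (?)", (body,))
            conn.commit()
        finally:
            conn.close()

    def count_notes(self, user_id: int) -> int:
        conn = self.connect(user_id)
        try:
            return conn.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
        finally:
            conn.close()


def demo() -> tuple:
    """User 1 writes a note; user 2's database never sees it."""
    with tempfile.TemporaryDirectory() as tmp:
        store = PerUserStore(Path(tmp))
        store.add_note(1, "hello")
        return store.count_notes(1), store.count_notes(2)
```

Each user's writes only ever contend with that user's own file, which is the scaling property described above. Anything that needs to read across users (the Twitter-style shared feed) has no single database to query.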

the_duke4y ago· 19 in thread

All this recent hype around sqlite...

sqlite is a great embedded database and thanks to use by browsers and on mobile the most used database in the world by orders of magnitude.

But it also comes with lots of limitations.

* there is no type safety, unless you run with the new strict mode, which comes with some significant drawbacks (eg limited to the handful of primitive types)

* very narrow set of column types and overall functionality in general

* the big one for me: limited migration support, requiring quite a lot of ceremony for common tasks (eg rewriting a whole table and swapping it out)

These approaches (like fly.io's) with read replication also (apparently?) seem to throw away read after write consistency. Which might be fine for certain use cases and even desirable for resilience, but can impact application design quite a lot.

With sqlite you have to do a lot more in your own code because the database gives you fewer tools. Which is usually fine because most usage is "single writer, single or a few local readers". Moving that to a distributed setting with multiple deployed versions of code is not without difficulty.

This seems to be mitigated/solved here though by the ability to run worker code "next to the database".

I'm somewhat surprised they went this route. It probably makes sense given the constraints of Cloudflare's architecture and the complexity of running a more advanced globally distributed database.

On the upside: hopefully this usage in domains that are somewhat unusual can lead to funding for more upstream sqlite features.
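To make the first limitation in the list above concrete, here is a small sketch with Python's `sqlite3` module showing what an ordinary table does with a mistyped value versus a STRICT table. `stored_type` is a made-up helper, and STRICT tables need SQLite 3.37 or newer, so the strict call is guarded.

```python
import sqlite3


def stored_type(strict: bool) -> str:
    """Insert a non-numeric string into an INTEGER column and report the outcome."""
    conn = sqlite3.connect(":memory:")
    suffix = " STRICT" if strict else ""  # STRICT requires SQLite >= 3.37
    conn.execute(f"CREATE TABLE t (n INTEGER){suffix}")
    try:
        conn.execute("INSERT INTO t (n) VALUES ('not a number')")
    except sqlite3.Error as exc:
        return f"rejected: {type(exc).__name__}"
    return conn.execute("SELECT typeof(n) FROM t").fetchone()[0]


print(stored_type(strict=False))  # text: the string is stored as-is in an INTEGER column
if sqlite3.sqlite_version_info >= (3, 37, 0):
    print(stored_type(strict=True))  # the insert is refused
```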

prirun4y ago

> * the big one for me: very limited migration support, requiring quite a lot of ceremony for common tasks (eg rewriting a whole table and swapping it out)

I don't know where this idea of having to swap a whole table in SQLite came from, but it simply isn't true. Over the last 13 years I have upgraded production HashBackup databases at customer sites a total of 35 times without rewriting and swapping out tables by using the ALTER statement, just like other databases:

https://www.sqlite.org/lang_altertable.html

For the most recent upgrade, I upgraded to strict tables, which I could also have done without a rebuild/swap. I chose to do a rebuild/swap this one time because I wanted to reorder some columns. Why? Because columns stored with default or null values don't have row space allocated if the column is at the end of the row.

the_duke4y ago

For a long time sqlite did not have DROP COLUMN and RENAME COLUMN support, which are both pretty essential.

I'm embarrassed to admit that I didn't realize RENAME COLUMN was actually added in 3.25, almost four years ago.

DROP COLUMN was only just added last year in 3.35.

I'm surprised a database schema lasted 9/12 years without ever renaming or dropping a column.

This changes things! But even now, ALTER TABLE is not transactional. So especially with many concurrent readers there can definitely be situations where you'd still want to rewrite.

cryptonector4y ago

It would really help if SQLite3 had a `MERGE`, or, failing that, `FULL OUTER JOIN`. In fact, I want it to have `FULL OUTER JOIN` even if it gains a `MERGE`.

`FULL OUTER JOIN` is the secret to diff'ing table sources. `MERGE` is just a diff operation + insert/update/delete statements to make the target table more like the source one (or even completely like the source one).

`FULL OUTER JOIN` is essential to implementing `MERGE`. Granted, one could implement `MERGE` without implementing `FULL OUTER JOIN` as a public feature, but that seems silly.

Sadly, the SQLite3 dev team specifically says they will not implement `FULL OUTER JOIN`[0].

Implementing `MERGE`-like updates without `FULL OUTER JOIN` is possible (using two `LEFT OUTER JOIN`s), but it's an O(N log N) operation instead of O(N).

The lack of `FULL OUTER JOIN` is a serious flaw in SQLite3. IMO.

  [0] https://www.sqlite.org/omitted.html
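For reference, a sketch of the two-`LEFT OUTER JOIN` workaround mentioned above, for a SQLite build that lacks `FULL OUTER JOIN`. It runs through Python's `sqlite3` module; the `source`/`target` tables and their `k`/`v` columns are invented for the example.

```python
import sqlite3

# Rows from source (matched or not), plus rows that exist only in target.
DIFF_SQL = """
SELECT s.k AS k, s.v AS source_v, t.v AS target_v
FROM source AS s LEFT JOIN target AS t ON t.k = s.k
UNION ALL
SELECT t.k, NULL, t.v
FROM target AS t LEFT JOIN source AS s ON s.k = t.k
WHERE s.k IS NULL
ORDER BY 1
"""


def diff_tables() -> list:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE source (k INTEGER PRIMARY KEY, v TEXT)")
    conn.execute("CREATE TABLE target (k INTEGER PRIMARY KEY, v TEXT)")
    conn.executemany("INSERT INTO source VALUES (?, ?)", [(1, "a"), (2, "b")])
    conn.executemany("INSERT INTO target VALUES (?, ?)", [(2, "B"), (3, "c")])
    return conn.execute(DIFF_SQL).fetchall()


for row in diff_tables():
    print(row)
# (1, 'a', None)   only in source -> would be an INSERT into target
# (2, 'b', 'B')    in both, values differ -> would be an UPDATE
# (3, None, 'c')   only in target -> would be a DELETE
```

The comments on the output rows map each case to the `MERGE`-style action described above; the query itself only produces the diff.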

sorenbs4y ago

Migrations have gotten better recently, but there are still cases where you need to follow the 12 steps very carefully: https://www.sqlite.org/lang_altertable.html#otheralter

Prisma Migrate can automatically generate these steps, removing most of the pain. I'm sure other migration tools can do this as well.
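A condensed sketch of the rebuild-and-swap procedure the linked page describes, using Python's `sqlite3` module. It changes a hypothetical `users.age` column from TEXT to INTEGER, and it leaves out the steps for recreating indexes, triggers, and views, so treat it as an outline rather than a complete migration.

```python
import sqlite3


def rebuild_users_table(conn: sqlite3.Connection) -> None:
    """Rebuild `users` with a new column type inside one transaction."""
    conn.execute("PRAGMA foreign_keys=OFF")  # must be changed outside a transaction
    conn.execute("BEGIN")
    try:
        conn.execute("CREATE TABLE users_new (id INTEGER PRIMARY KEY, age INTEGER)")
        conn.execute(
            "INSERT INTO users_new (id, age) "
            "SELECT id, CAST(age AS INTEGER) FROM users"
        )
        conn.execute("DROP TABLE users")
        conn.execute("ALTER TABLE users_new RENAME TO users")
        if conn.execute("PRAGMA foreign_key_check").fetchall():
            raise RuntimeError("foreign key violations after rebuild")
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")  # the old table is left untouched
        raise
    finally:
        conn.execute("PRAGMA foreign_keys=ON")


def demo() -> list:
    # isolation_level=None puts the connection in autocommit mode, so the
    # explicit BEGIN/COMMIT above control the transaction, DDL included.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, age TEXT)")
    conn.execute("INSERT INTO users (age) VALUES ('41'), ('7')")
    rebuild_users_table(conn)
    return conn.execute("SELECT id, age, typeof(age) FROM users ORDER BY id").fetchall()
```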

llimllib4y ago

simonw's sqlite-utils can help here too: https://sqlite-utils.datasette.io/en/stable/cli.html#transfo...

vlovich1234y ago

D1 does not throw away consistency. It’s built on top of Durable Objects which is globally strongly consistent.

smarx0074y ago

"D1 will create read-only clones of your data, close to where your users are, and constantly keep them up-to-date with changes."

Sounds like there will be no synchronous replication and instead there will be a background process to "constantly keep [read-only clones] up-to-date". This means that a stale read from an older read replica can occur even after a write transaction has successfully committed on the "primary" used for writes.

So, while the consistency is not "thrown away", it's no longer a strong consistency? Anyway, Kyle from Jepsen will figure it out soon, I guess :)

greg-m4y ago

Just clarifying - D1 without read replicas is strongly consistent. If you add read replicas, those can have replication lag and will not be strongly consistent.

Disclaimer: I work at Cloudflare :)
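A toy model of the replication lag being discussed, with two in-memory SQLite databases standing in for a primary and one read replica. The `Primary` and `Replica` classes and the explicit `catch_up` step are invented for illustration and are not a description of how D1 replicates.

```python
import sqlite3


class Primary:
    def __init__(self) -> None:
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
        self.log = []  # committed writes, in commit order

    def put(self, k: str, v: str) -> None:
        self.db.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)", (k, v))
        self.db.commit()          # the write is acknowledged here...
        self.log.append((k, v))   # ...and shipped to replicas later


class Replica:
    def __init__(self) -> None:
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
        self.applied = 0  # how many log entries have been applied

    def catch_up(self, primary: Primary) -> None:
        for k, v in primary.log[self.applied:]:
            self.db.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)", (k, v))
        self.db.commit()
        self.applied = len(primary.log)

    def get(self, k: str):
        row = self.db.execute("SELECT v FROM kv WHERE k = ?", (k,)).fetchone()
        return row[0] if row else None


def demo() -> tuple:
    primary, replica = Primary(), Replica()
    primary.put("greeting", "hello")
    before = replica.get("greeting")  # stale: the write has not arrived yet
    replica.catch_up(primary)
    after = replica.get("greeting")
    return before, after
```

Reading from the primary alone would always return the committed value, which matches the "strongly consistent without read replicas" clarification above.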

mwcampbell4y ago

Interesting that D1 is built on top of Durable Objects. Does this mean that it would be practical for a single worker to access multiple D1 databases, so it could use, for example, a separate database for each tenant in a B2B SaaS application? Edit: And could each database be in a different primary region?

hn_ei_ser_234y ago

That is interesting. I wish CF would give us some more information as I've assumed that there must be a lack of strong consistency which would be a major drawback.

Edit: But that would mean that durable objects can't be replicated asynchronously? That would mean a big latency hit. Then what's the difference to a central DB in one datacenter?

kwizzt4y ago

I’m not familiar with Durable Objects. When D1 does replication to read replicas, if it’s not doing synchronous replication, then it’s not strongly consistent, is that correct?

the_duke4y ago

I wish the post had provided some more technical details.

It's more of a "quickstart" than a peek under the hood.

jpcapdevila4y ago

Are you guys using litestream or a similar approach? E.g storing WAL frames in a durable object.

jambutters4y ago

What types are missing from strict that you need?

vaughan4y ago

Has anyone tried to write a new modern SQLite?

sophacles4y ago

Why? Yes, sqlite doesn't have all the features postgres has. Postgres doesn't have all the features that sqlite has either. What's wrong with having different tools with different sets of tradeoffs? It's a different shape of Lego and that's fine - some things call for a 1/3-height 2x2 and others call for a full-height 1x8.

jpcapdevila4y ago

I think the most successful attempt would be Realm.

https://realm.io/

chrisshroba4y ago

DuckDB comes to mind, but I can't speak to its differences from SQLite.

https://duckdb.org/

steaminghams4y ago

why do you consider sqlite to not be modern?

all the hip service providers seem to be all over it which would indicate pretty good modernity to me at least.

mwcampbell4y ago· 13 in thread

Any current or planned support for existing ORMs, such as Prisma or TypeOrm?

Also, I wonder how hard it will be to migrate existing PostgreSQL databases and SQL statements. Of course, I understand if Cloudflare is focused on greenfield applications.

sorenbs4y ago

Prisma won't work with D1 out of the box. The primary limitations are:

- SQLite is traditionally embedded in an application, so Prisma interacts with it by mounting a file. Workers does not have a local filesystem, and D1 is exposed over the network through an API accessible from a Worker. Prisma will have to create a specific connector for D1.

- Workers have a script size limit which is currently 1MB. My understanding is that Cloudflare will be increasing this in the future. We also have specific work to decrease the size of Prisma. Both of those will have to happen before Prisma could be used with D1.

Note that Prisma already supports querying Postgres, MySQL, SQL Server and MongoDB from Cloudflare Workers through the Prisma Data Proxy, which will see a GA release next month.

We are also very excited about D1 as a way to bring a subset of data closer to users in order to deliver faster experiences. We hope this will be a way to bring the benefit of edge computing to larger organisations who cannot simply rearchitect everything to run on Workers.

geelen4y ago

> We are also very excited about D1 as a way to bring a subset of data closer to users in order to deliver faster experiences. We hope this will be a way to bring the benefit of edge computing to larger organisations who cannot simply rearchitect everything to run on Workers.

I am also excited about this :)

Cthulhu_4y ago

Before you consider using an ORM, try using regular SQL and some tooling first; your future self will thank you. Just write the code, it's only volume and it's not so bad.

joshstrange4y ago

I took this advice on my last project and ended up re-writing the whole thing to use Prisma later. I launched and had a successful event with raw sql but it quickly became unwieldy. Prisma gives me type safety throughout my app (written in Typescript) and would have prevented a number of bugs/pain points as my app grew. And I'm only 1 developer, this gets worse if you have multiple people working on it. I still write raw sql for reporting/aggregation (Prisma's features here only work for basic examples in my experience) and I'm not "scared of raw sql" but I can move much faster when I have the guardrails of types.

pier254y ago

Totally agree.

Source: someone who avoided learning SQL for 20 years.

gigatexal4y ago

+1 to this as well.

jgrahamc4y ago

We are definitely interested in ORMs. Want to make it easy to use. I hope someone creates the next Rails using Workers. And having other models on top of our SQL offerings will be important. Get in contact and let us know what you'd like.

joshstrange4y ago

> I hope someone creates the next Rails using Workers

I too am eagerly waiting for a good serverless nodejs framework that is "batteries included". I've deployed on Lambda using the "Serverless Framework" but once your app grows to a certain size everything starts to fall apart and you lose some of the magic. Unfortunately, most of the things that advertise themselves as serverless/lambda/worker nodejs frameworks are monoliths and/or an existing monolith framework that "supports" lambda (with a billion asterisks after that). There is absolutely nothing wrong with monolith frameworks, I love them, but just not for lambda, I want to deploy a single endpoint as a single function (or as a cron, or queue listener, etc), not all of my code for every function (you hit size limits quick with this method).

I want express/nestjs/etc-type routes that I define with code or annotations that result in /only/ that function (endpoint) being bundled up and deployed. I ended up rolling my own "framework" on top of Serverless Framework (uses serverless.ts config file that scans my directories for a special file that defines the routes defined in that directory) but Serverless Framework is pretty shaky ground. Their documentation is a mess, Serverless Components appears dead, and they seem to be busy with their own "cloud" so I don't know how much longer I can keep building on top of them.

When it works it's like magic but there are a ton of walls you run headfirst into: CloudFormation entity limits, package size limits, typescript/bundling support, clear disregard for medium/large projects ("Just use multiple services", this leads to a terrible dev experience), and long deploy times.

I wish CF Workers had been out when I first started building my current project, I might have gone in that direction instead, I still might.

gen2204y ago

You might want to consider adding Deno [1] to the language examples: https://developers.cloudflare.com/workers/platform/languages...

Deno can compile to wasm, so it can plug in through that vertical. But it's just TS on the frontend.

I'm mainly a python programmer, but Deno's been the most alluring development in the JS ecosystem since typescript for me. Might be helpful to you all to capture some steam from source.

[1]: https://deno.land/

jpcapdevila4y ago

I'm building an open source firebase alternative using sqlite. I'll be reaching out, I was thinking to build the distribution & durability part myself, but I would rather use D1!

I guess it would count as a client focused ORM :)

I'll be reaching out from jp@javascriptdb.com

Great addition, congrats!

eatonphil4y ago

Won't any existing ORM that supports SQLite also support D1? I looked in the post for details on how it extends SQLite (is the query language different or extended, semantics very different, etc.) but didn't notice anything.

irq-14y ago

This should have a virtual file system. CF should write it so each user doesn't have to load a JS abstraction and it has better performance.

Cthulhu_4y ago

Before you consider using an ORM, try using regular SQL and some tooling first; your future self will thank you. Just write the code, it's only volume and it's not so bad. What is bad is learning a 3rd language on top of SQL and JS/TS that you somehow have to manually map to SQL.

slashdev4y ago· 9 in thread

For a Cloudflare article, this one is surprisingly light on technical details. And for the product where it most matters.

I'm guessing this is a single master database with multiple read replicas. That means it's not consistent anymore (the C in ACID). Obviously reads after a write will see stale data until the write propagates.

I'm a bit curious how that replication works. Ship the whole db? Binary diffs of the master? Ship the SQL statements that did the write and reapply them? Lots of performance and other tradeoffs here.

What's the latency like? This likely doesn't run in every edge location. Does the database ship out on the first request? Does it get cached with an expiry? Does the request itself move to the database instead of running at the edge - like maybe this runs on a select subset of locations?

So many questions, but no details yet.

otoolep4y ago

I agree -- this blog post is light on details. To me the value Cloudflare believes they are offering is mostly ease-of-use, particularly setup. With minimal work you can have a stateful, relational store available to your code. But in terms of actual database functionality, they are not offering anything particularly novel. Of course, I might be missing something.

In fact, I don't see anything D1 is doing that is not already offered by something like rqlite[1], which is also a super-easy-to-use distributed database built on SQLite. Of course Cloudflare will run the database for you, which is a great help -- they take care of the uptime, monitoring, backups, etc. And that's important obviously, because in the real-world databases must be operated.

Disclaimer: I am the creator of rqlite.

[1] https://github.com/rqlite/rqlite

rad_gruchalski4y ago

I’ve been looking at rqlite for some time and it’s really great to track the product on github.

I believe that the power of what Cloudflare offers here isn’t in the actual database. It’s the packaging and how it sits in their serverless world. Even with rqlite, I still need ip addresses to run a resilient system. As someone who sometimes needs a table here and there, I really, really don’t want a server. I want a table to store a thousand records in and that’s it. This is where I would very much enjoy using something like D1.

A combo of D1, R2 and Workers is a serious contender for over-the-top serverless distributed apps. This is great.

sebk4y ago

Small nitpick, but that's still consistent as in ACID. I think what you mean is it wouldn't be consistent in the CAP sense (it wouldn't be linearizable).

TFA does say that read-replicas will be present at every edge location, which makes sense for a product like Workers. But it doesn't mention writes at all.

eloff4y ago

Yes, that's true.

dragonwriter4y ago

> I'm guessing this is a single master database with multiple read replicas. That means it's not consistent

Single master with read replicas is fully consistent if commits don't return until propagated to and acknowledged by replicas (the expense here being commit latency.)

otoolep4y ago

You've basically described rqlite [1], which uses Raft to coordinate the changes to the Leader, and then across some number of Followers. The write won't be acked until a quorum has persisted the change, and committed to the underlying SQLite database.

Disclaimer: I am the creator of rqlite.

[1] https://github.com/rqlite/rqlite

eloff4y ago

I would say the expense is both latency and availability because if one node doesn't ack within the timeframe then you have to drop it from the cluster. Requests that go there would need to be routed elsewhere to avoid being unavailable. If there's a network partition preventing that, then you have partial downtime. If enough nodes fail then you have full downtime across the whole cluster.

ithrow4y ago

Yeah, nothing about WAL mode which is what most users will want for web apps.

sqlite is accessed via a socket? defeats the whole purpose of using sqlite.

Many here are mentioning using one sqlite file per customer but that sounds like a nightmare for migrations and analytics.

SQLite is great and all these new services and articles are nice but intentionally shadowing lots of complexity.

detroitcoder4y ago

Going to be very interesting to see how they glue together R2, edge workers and SQLite. They can manage replication using R2 and make the SQLite process aware of this for eventual consistency. Having edge compute with edge data on a globally consistent data model is the dream.

endisneigh4y ago· 8 in thread

Have any of the problems that led people to use Postgres instead of SQLite actually been solved? Are we doomed to repeat the same mistakes?

Also, any plans to support PATCH x-update-range so SQLite can be used entirely in the browser via SQLite.js?

Can someone enlighten me with the types of use cases this would be better for vs say Postgres?

ignoramous4y ago

It isn't so much that folks who need Postgres features are moving to SQLite just because it is cool; it's that folks who don't want those Postgres features are moving to SQLite, because it has just enough features for what they actually need.

endisneigh4y ago

SQLite made sense as an embedded database on, say, a desktop or phone because there’s only a single person generally writing to it. The perfect use case.

I don’t understand how it will be usable at all in a website with multiple users. Is the idea to make your site so every user gets their own database? How do you stop SQL injection?

Once you solve all of these problems aren’t you better off just using Postgres?
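On the SQL injection question specifically: SQLite supports bound parameters like other databases do, so the usual defense applies. A minimal sketch with Python's `sqlite3` module; the `users` table and both helper functions are made up for the example.

```python
import sqlite3


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # Vulnerable: user input is spliced into the SQL text.
    sql = f"SELECT name FROM users WHERE name = '{name}' ORDER BY name"
    return conn.execute(sql).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    # Safe: the value is bound as a parameter and never parsed as SQL.
    sql = "SELECT name FROM users WHERE name = ? ORDER BY name"
    return conn.execute(sql, (name,)).fetchall()


def demo() -> tuple:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])
    payload = "x' OR '1'='1"
    return find_user_unsafe(conn, payload), find_user_safe(conn, payload)


leaked, matched = demo()
print(leaked)   # [('alice',), ('bob',)]: the payload rewrote the WHERE clause
print(matched)  # []: no user is literally named "x' OR '1'='1"
```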

nindalf4y ago

Which problems were you thinking of?

Cloudflare and fly.io both promise hassle free read replicas and backup. They will both offer only a single node capable of writes, because that’s how SQLite rolls.

This is a pretty good fit for a read heavy load that requires SQL and very low latency.

endisneigh4y ago

I guess I’m not understanding what the benefit is vs hosted Postgres. Also low latency and setup can be equally trivial - see supabase for example.

hn_ei_ser_234y ago

The important drawback is async replication and therefore the lack of full consistency. On the other hand, this is the big advantage of hosted Postgres and the like.

Those offerings are great for use-cases that don't need that kind of consistency, which are many.

hn_ei_ser_234y ago

No and no. I think this is great for Edge computing, where there is currently no solution. So, it's better than nothing.

It all depends on the use-case, of course. A traditional hosted Postgres or MySQL database or cluster is certainly the go-to solution for all who need advanced features or full consistency, which only synchronous replication could provide.

jve4y ago

What problems? Both are for different use cases albeit overlapping.

endisneigh4y ago

Concurrent writes, for one.

ranguna4y ago· 6 in thread

This looks amazing!

I see cloudflare people are on this post, any chance to compare D1 vs postgres in terms of DB features?

Insert ... Returning

Stored procedures and triggers

Etc etc

Would be really helpful to get a comparison like cockroachDB did here https://www.cockroachlabs.com/docs/stable/postgresql-compati...

Or even better, a general sql compatibility matrix like this https://www.cockroachlabs.com/docs/stable/sql-feature-suppor...

Kudos to the cloudflare team!

the_duke4y ago

Well, it's sqlite... so presumably you will get most of the capabilities sqlite has.

RETURNING is covered.

Stored procedures are indirectly there by running your own code "next to the database", as mentioned in the post. Which is arguably much nicer than having to use some database specific language, given that you can run WASM on workers.

tyingq4y ago

There is a layer on top of Sqlite here, so I imagine it's something less than all the capabilities sqlite has, at least initially. Plus the upsides and downsides from their approach to have a master and read replicas.

ranguna4y ago

> Stored procedures are indirectly there by running your own code "next to the database",

"indirectly" is a keyword here, because running code when data is modified potentially won't replace triggers since they'll probably execute outside the running transaction.

pier254y ago

Listen/notify

Cthulhu_4y ago

The announcement - if you read it before posting - says it's sqlite, so that's something you can punch into google.

Long story short, don't expect anything fancy. Support for alter table is limited, and concurrency can be an issue.

ranguna4y ago

It is indeed sqlite but it could possibly have modifications or additions. Please be considerate and think a little more before commenting.

jgrahamc4y ago· 4 in thread

BTW R2 is open beta now: https://blog.cloudflare.com/r2-open-beta/

mariushn4y ago

R2 is 3x more expensive than B2 (storage) https://www.backblaze.com/b2/cloud-storage-pricing.html

Am I missing something? Is there no bandwidth cost at all?

messe4y ago

Yep, you're not charged for egress.

alberth4y ago

Does R2 provide synching between regions? Maybe that's why it's so much more expensive? You're getting regional failover?

rubenv4y ago

Latency

greenie_beans4y ago· 4 in thread

dang i was hoping for postgres so i can use postgis

edit: maybe one day! this looks cool regardless

edvinbesic4y ago

I'm right there with you. I wonder if this is an SQLite compatible API on top of their own solution, or if it's using actual SQLite under the hood with custom replication.

If the latter, and anyone from CloudFlare is here, is there any chance to have SpatiaLite enabled?

https://www.gaia-gis.it/fossil/libspatialite/index

durkie4y ago

Seconding a vote for Spatialite support! I came here just to make that same request.

yawaramin4y ago

No need to call dang!

greenie_beans4y ago

lol i'm from a place where "dang" is a natural part of our vocab

lucasyvas4y ago· 3 in thread

The API for this is currently the only thing I wish I could grok a bit better. It seems like it would be hard to make it work with existing libraries that can access SQLite, which is kind of a shame.

I'm thinking of sqlx in Rust (or any other language binding / ORM for that matter), which has compile time schema safety. This is a nice capability, and because this interface seems non-standard (possibly for good reason), I guess we are being asked to give some of those things up.

I am getting a bit ahead of myself on the Rust part (presumably that will eventually be supported as part of workers-rs), but I think the feelings still stand if you consider the JS ecosystem.

Edit: I may actually be wrong, but presumably the entire surface isn't covered because there's no file opening, etc.

mritchie7124y ago

There might be a `env.DB.url` (e.g. the jdbc URL) which you could pass into an existing library.

yencabulator4y ago

I'm kinda willing to make a bet that this rides on top of what looks like HTTP to the Javascript engine. That's how their worker-to-worker and worker-to-durable-object protocols are.

(It's not really HTTP as in it might never cross a TCP socket, just get shuffled from one V8 isolate to another, but it looks like a `fetch` call to the Javascript.)

It's also worth remembering that SQLite itself has no wire protocol, it's a library. And there is no such thing as a "SQL wire protocol". It sure isn't gonna be Postgres wire protocol either.

From the article:

> D1’s API includes batching: anywhere you can send a single SQL statement you can also provide an array of them, meaning you only need a single HTTP round-trip to perform multiple operations. This is perfect for transactions that need to execute and commit atomically:
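As a rough local analogue of the batching idea quoted above, here is a minimal sketch using plain SQLite through Python's `sqlite3` module. It is not D1's API: `run_batch` is a hypothetical helper, and the `accounts` table is invented for illustration. It only shows the "array of statements that commit or roll back together" behaviour.

```python
import sqlite3

def run_batch(conn, statements):
    """Run (sql, params) pairs in one transaction: all commit or none do.

    Hypothetical helper for illustration; not part of D1 or sqlite3.
    """
    with conn:  # commits on success, rolls back if any statement raises
        for sql, params in statements:
            conn.execute(sql, params)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 0)")
conn.commit()

# A batch that succeeds: both updates are applied together.
run_batch(conn, [
    ("UPDATE accounts SET balance = balance - ? WHERE id = ?", (40, 1)),
    ("UPDATE accounts SET balance = balance + ? WHERE id = ?", (40, 2)),
])

# A batch whose second statement violates the CHECK constraint:
# the first update is rolled back as well.
try:
    run_batch(conn, [
        ("UPDATE accounts SET balance = balance + ? WHERE id = ?", (500, 2)),
        ("UPDATE accounts SET balance = balance - ? WHERE id = ?", (500, 1)),
    ])
except sqlite3.IntegrityError:
    pass

balances = conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()
```

After both calls, only the first batch should be visible in `balances`; the failed batch leaves no partial update behind.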

lucasyvas4y ago

Interesting thought! Would love to see more details.

fzaninotto4y ago· 2 in thread

Love the Northwind Traders reference! However, for a demo, I suggest a slightly larger and more complex data set, [data-generator-retail](https://www.npmjs.com/package/data-generator-retail).

The demo is also a bit buggy: orders are duplicated as many times as there are products, but clicking on the various lines of the same order leads to the same record, where the user can only see the first product...

I also think the demo would have more impact if it wasn't read-only (although I understand that this could lead to broken pages if visitors mess up with the data).

Anyway, kudos to the CloudFlare team!

naiv4y ago

I was thinking the same. The dataset is way too small.

celso4y ago

Fixed the orders table. Good catch.

samwillis4y ago· 2 in thread

This is really interesting, it's (basing it on SQLite) exactly what I was expecting CloudFlare to do for their first DB.

It's perfect for content type sites that want search and querying.

Anyone from CF here, is it using Litestream (https://litestream.io) for its replication or have you built your own replication system?

I assume this first version is somewhat limited on write performance, having a single "main" instance and SQLite lacking concurrent writes? It seems to me that using SQLite sessions[0] would be a good way to build an eventually consistent replication system for SQLite; it would be perfect for an edge-first SQL database, maybe D2?

0: https://www.sqlite.org/sessionintro.html

jgrahamc4y ago

1. No, it's not built on Litestream. Operating a massive network and shuttling data around is kind of our thing.

2. We are going all in on databases and D2 sounds like a cool name for something...

xafke4y ago

R2, D2. I see what you did there!

lucasyvas4y ago· 2 in thread

To the person from Cloudflare I complained to in last year's thread about putting your money where your mouth is on serverless databases:

You weren't lying, and this is super cool - the SQLite hype train also seems to be in full force.

throwaway8943454y ago

It's interesting to see a relatively old technology get hyped.

jgrahamc4y ago

:-)

rmbyrro4y ago· 2 in thread

I'm buying Cloudflare stocks right now.

In 2-3 years from now, these services will be so mature and strong they will be crushing the cloud market.

They're turning dreams into reality, one after another.

endisneigh4y ago

Cloud business is driven by enterprise generally. Would enterprise be using SQLite?

Quarrelsome4y ago

they should be using SQLite more often than they are.

1 more reply

jpcapdevila4y ago· 2 in thread

If SQLite gets you excited, I'm building a firebase alternative based on sqlite. I'm betting hard on sqlite so this gets me super excited!!

https://javascriptdb.com

CF people around, I would love to chat, if anyone is interested please reach out at: jp@javascriptdb.com

I'll be applying to this beta for sure!

js4ever4y ago

Super interesting! I really like the idea. I'll join the beta, email sent :)

jpcapdevila4y ago

Any feedback on what you find interesting would be awesome :) thanks!!

frogger84y ago· 2 in thread

Not an expert on DOM or JavaScript so be kind ;)

One thing I hope to see in the future is a better product filtering experience. When I worked on a jquery product filter I realized the DOM bloat was the main problem.

I wonder if D1 can help devs build instant product filtering pages that don’t require the reload like microcenter or Newegg does.

IE https://www.newegg.com/p/pl?d=hdmi+cable&N=-1&SortType=8

mbreese4y ago

At any sufficient scale, it is difficult to do filtering on the client. Yes, it can be done, but with 10,000+ potential records, you don’t want to ship that to the client for each query. (Note: I’m thinking Newegg scale for “hdmi cable” here. There are certainly situations where you can ship the entire database to the client for filtering.)

It’s not DOM bloat… it’s too many records. If you’re building a DOM node for each record, that’s bloat, but you still have the problem even if the results are stored in a JSON object and dynamically queried on the client side.

So, for each new filter or new query you need to hit the server anyway. If that’s an asynchronous query that returns a json blob or a full refresh, IMHO, it doesn’t really matter that much. Either way, you’re rebuilding a large portion of the DOM with the new results. The only thing that skews things in favor of an async call is if the rest of the page is so heavyweight that reloading the page takes a significant amount of time. This is probably what you’re talking about.

Having a SQLite db close to your worker node really isn’t going to affect this problem all that much.

Cthulhu_4y ago

It's probably better - especially for more advanced search engines - to have an elasticsearch instance or whichever is the more recent example handle product search and filtering like that.

_kyran4y ago· 2 in thread

So can we assume that D2 will be postgres/mysql ?

eatonphil4y ago

It sounds like you're making a simile but I don't understand it. The article did literally state D1 is based on sqlite.

_kyran4y ago

The opening paragraph reads "Today, we're excited to announce D1, our first SQL database." read: first

and well R2 and D2 would make for a great naming scheme.

1 more reply

hn_ei_ser_234y ago· 1 in thread

First, I'm very excited. Sure, SQLite has some limitations compared to Postgres, esp. regarding the type system and concurrency. But we get ACID compliance and SQL.

But it is really hard getting some useful information from this article. I can't even tell if it is not there or just buried in all this marketing hot air.

So, what is it really? Is there one Write-Master that is asynchronously replicated to all other locations? Will writes be forwarded to this master and then replicated back?

I'm very curious about how it performs in real life. Especially considering the locking behavior (SQLite always has the isolation level 'serializable' iirc). The more you put in a transaction or the longer you have to wait for another process to finish their writes, the more likely you have to deal with stale data.

But overall I'm very excited. Also by the fly.io announcement, of course. Lots of innovation and competition. Good times for customers.
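The locking behaviour raised in this comment can be observed locally. The following is a minimal sketch with stock SQLite in its default journal mode, using two connections to one file; the `votes` table is invented for illustration and none of this reflects how D1 schedules writes. With `timeout=0` the second writer fails immediately instead of waiting for the lock.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

# isolation_level=None: BEGIN/COMMIT are issued explicitly below.
# timeout=0: fail at once instead of waiting for a lock to clear.
writer_a = sqlite3.connect(path, isolation_level=None, timeout=0)
writer_b = sqlite3.connect(path, isolation_level=None, timeout=0)

writer_a.execute("CREATE TABLE votes (id INTEGER PRIMARY KEY, n INTEGER)")

writer_a.execute("BEGIN IMMEDIATE")            # first writer takes the write lock
writer_a.execute("INSERT INTO votes (n) VALUES (1)")

try:
    writer_b.execute("BEGIN IMMEDIATE")        # second writer is refused
    second_writer_blocked = False
except sqlite3.OperationalError as exc:
    second_writer_blocked = "locked" in str(exc)

writer_a.execute("COMMIT")                     # write lock released

writer_b.execute("BEGIN IMMEDIATE")            # now the second writer can proceed
writer_b.execute("INSERT INTO votes (n) VALUES (2)")
writer_b.execute("COMMIT")

rows = writer_b.execute("SELECT COUNT(*) FROM votes").fetchone()[0]
```

In practice a non-zero `timeout` lets the second writer wait and retry, which is why long write transactions translate directly into latency for everyone queued behind them.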

tyingq4y ago

>So, what is it really? Is there one Write-Master that is asynchronously replicated to all other locations? Will writes be forwarded to this master and then replicated back?

Not a lot of detail, but that is mentioned:

"But we're going further. With D1, it will be possible to define a chunk of your Worker code that runs directly next to the database, giving you total control and maximum performance—each request first hits your Worker near your users, but depending on the operation, can hand off to another Worker deployed alongside a replica or your primary D1 instance to complete its work."

infogulch4y ago· 1 in thread

Very cool! Glad to see all the love for SQLite recently.

One thing I've noticed that many commenters miss about read-replicated SQLite is assuming that the only valid model is having one, giant, centralized database with all the data. Let's be honest with ourselves, the vast majority of applications hold personal or B2B data and don't need centralized transactions, and at scale will use multi-tenant primary keys or manual sharding anyways. For private data, a single SQLite database per user / business will easily satisfy the write load of all but the most gigantic corporations. With this model you have unbounded compute scaling for new users because they very likely don't need online transactions across multiple databases at once.

Some questions:

Will D1 be able to deliver this design of having many thousands of separate databases for a single application? Will this be problematic from a cost perspective?

> since we're building on the redundant storage of Durable Objects, your database can physically move locations as needed

Will D1 be able to easily migrate the "primary" at will? CockroachDB described this as "follow the sun" primary.
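The database-per-tenant model described in this comment can be sketched with plain SQLite files. This is a minimal local illustration, assuming one file per tenant in a temporary directory; `open_tenant_db`, `DATA_DIR`, and the `notes` table are hypothetical names, and whether D1 supports thousands of databases this way is exactly the open question above.

```python
import os
import re
import sqlite3
import tempfile

DATA_DIR = tempfile.mkdtemp()  # hypothetical storage location

def open_tenant_db(tenant_id):
    """Open (and lazily create) the SQLite file that belongs to one tenant."""
    if not re.fullmatch(r"[a-z0-9_-]+", tenant_id):
        raise ValueError("unexpected tenant id")  # keep ids safe as file names
    conn = sqlite3.connect(os.path.join(DATA_DIR, tenant_id + ".db"))
    conn.execute(
        "CREATE TABLE IF NOT EXISTS notes ("
        "id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
    )
    return conn

acme = open_tenant_db("acme")
globex = open_tenant_db("globex")

# A write for one tenant touches only that tenant's file and lock.
with acme:
    acme.execute("INSERT INTO notes (body) VALUES (?)", ("acme-only note",))

acme_count = acme.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
globex_count = globex.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
```

The trade-off is that schema changes and cross-tenant analytics have to be run once per file rather than once per database.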

unraveller4y ago

I guess the first answer is: similar to Durable Object limits (unlimited databases / 50 GB total) since they alluded to those abilities more so than a simple file stored on R2 (only for backups).

ryanto4y ago· 1 in thread

This is so cool!

From the blog post it says read-only replicas are created close to users and kept up to date with the latest data.

- How should I think about this in terms of CAP? If there's a write and I query a replica what happens?

- How are writes handled? Do they go to a single location or are they handled by various locations?

I'm excited to try this. It's so cool to see databases being distributed "on CDNs" for lack of a better term.

leonidasv4y ago

I think they're replicated asynchronously, so reading directly from the replica may return old data. That's why they've added the ability to deploy special workers that "live" closer to the primary:

> Embedded compute

> But we're going further. With D1, it will be possible to define a chunk of your Worker code that runs directly next to the database, giving you total control and maximum performance — each request first hits your Worker near your users, but depending on the operation, can hand off to another Worker deployed alongside a replica or your primary D1 instance to complete its work.

SheinhardtWigCo4y ago· 1 in thread

Big fan of Cloudflare but I wish they would stick to descriptive product names.

Good: Workers, KV, Durable Objects, Cron Triggers

Bad: Spectrum, Zaraz, R2, D1

alberth4y ago

Naming is hard.

> Zaraz

That's the name of the company they acquired. Though, I do agree that more descriptive naming is nice.

E.g.

Zaraz = SafeXXS

D1 = LDS (light database system)

R2 = ObjectStore

Spectrum = Reverse Proxy

didip4y ago· 1 in thread

All this hype around SQLite recently and I am still confused.

* How do you replicate it consistently?

* Who has the master privilege (or masters if sharded)? What's the failover story?

I am guessing a blob store is involved, but I have gaps in my understanding here.

discodave4y ago

SQLite has a write ahead log (journal) mode. If you write that log to some store that is already replicated (S3, CloudFlare Durable Objects, Kafka?) then the concept of a 'master' is less important.
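For readers who have not used WAL mode, here is a minimal sketch of what it looks like on disk with stock SQLite. The file name and `events` table are invented for illustration; shipping the log to a replicated store (the part tools like Litestream handle) is out of scope here.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "app.db")
conn = sqlite3.connect(path)

# The pragma returns the journal mode now in effect ("wal" on success).
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]

conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
with conn:
    conn.execute("INSERT INTO events (payload) VALUES ('hello')")

# Committed changes are appended to a sidecar file next to the database
# until a checkpoint copies them back into the main file.
wal_path = path + "-wal"
wal_exists = os.path.exists(wal_path)
```

A replication tool built on this idea would read new frames from the `-wal` file and copy them elsewhere before they are checkpointed.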

estensen4y ago· 1 in thread

Too bad you probably can't use this to store data about EU citizens. Phone numbers like they show in the demo are considered PII, right?

methyl4y ago

why?

whitepaint4y ago· 1 in thread

Will they seriously challenge Azure, AWS and GCP eventually? Cloudflare is very innovative and what they are doing is really exciting.

015a4y ago

The unique thing about Cloudflare's product offerings is how global-first they are; traditional cloud providers (AWS to DigitalOcean) have a very region-oriented domain model, with select christened services allowed or architected to be global (ex: AWS Cloudfront, IAM, Route53, that's about it there). That's their disaster/failure model; but all it really does is force cross-regional architecture onto the customer. Most customers don't bother.

In comparison, everything at CF is global. And it's not just "global" from an AWS perspective of "we've got 14 regions and your stuff runs in all of them"; it's global from 300+ points-of-presence, within 50ms of like 98% of all humans. CDN for compute, databases, etc.

CF has a way to go in DevEx on many of their products. For example; Workers, being based on V8 Isolates, is a pain to use even compared to e.g. Lambda. It's a battle of figuring out what's possible and what isn't within the runtime. But I'm sure it'll be improved!

losvedir4y ago

Wow, this looks potentially very interesting. Since this is sort of fresh in my mind from the recent Fly post about it:

* How exactly is the read replication implemented? Is it using litestream behind the scenes to stream the WAL somewhere? How do the readers keep up? Last I saw you just had to poll it, but that could be computationally expensive depending on the size of the data (since I thought you had to download the whole DB), and could potentially introduce a bit of latency in propagation. Any idea what the metrics are for latency in propagation?

* How are writes handled? Does it do the Fly thing about sending all requests to one worker?

I don't quite know what a "worker" is but I'm assuming it's kind of like a Lambda? If you have it replicated around the world, is that one worker all running the same code, and Cloudflare somehow manages the SQL replicating and write forwarding? Or would those all be separate workers?

ngrilly4y ago

Not clear from reading the post if the SQLite C library is embedded and linked in the Worker runtime (which would mean no network roundtrip) or if each query or batch of queries is converted to a network request to a server embedding the SQLite C library.

That's important to understand because that's one of the key advantages of SQLite compared to the usual client-server architecture of databases like PostgreSQL or MySQL: https://www.sqlite.org/np1queryprob.html

tyingq4y ago

"With D1, it will be possible to define a chunk of your Worker code that runs directly next to the database...each request first hits your Worker near your users, but depending on the operation, can hand off to another Worker deployed alongside a replica or your primary D1 instance to complete its work."

That's interesting to me. It opens the door for Cloudflare to offer something more like a "normal" serverless offering. One that can run containers, or at least natively run Python/Golang/Java/etc, like AWS Lambda does. And with this ecosystem described above that can conditionally route between the lighter edge Workers and the heavier central serverless functions. To me, that's the tipping point where they start to threaten larger portions of AWS.

irq-14y ago

Best Effort Writes[1] are an opportunity here. Non-transactional, write to the local replica (ensure foreign keys, constraints, valid data, etc...) and then try to write to the main write-enabled DB. Caching should work without changes since the local replica is updated. This could be cheaper (send binary diffs) and more resilient to brief network issues.

The key is to let the user decide what really needs ACID and what doesn't. If someone wants to make the next Facebook or Reddit they'll need huge write throughput and if some votes or updates are lost, that may be a good trade-off.

[1] You could add a BEW file (like WAL file) to sqlite for Best Effort Writes.

aeyes4y ago

What write throughput and latency can we expect from this database?

Are there any limitations, for example on the number of tables or size of the database?

xwdv4y ago

With this we can probably switch our infrastructure off AWS and entirely onto Cloudflare.

pier254y ago

So where are the databases running? In the same regions as workers?

Is the data replicated to all regions?

dinkleberg4y ago

This is convenient, I’ve been building an app which is using SQLite but am wanting to deploy it to Cloudflare pages. I expected I was going to have to switch to a hosted Postgres instance somewhere, but this could be perfect.

jcuenod4y ago

So I assume we'll see a nice big donation to the sqlite coffers, then?

ralusek4y ago

Unless I missed it by skimming, where are the deets? Is this strongly or eventually consistent? What are max table sizes, and do they become partitioned? Are there cross partition joins?

robertlagrant4y ago

This looks awesome. I was thinking about creating a custom version of this to live behind a CF Worker. Much better to have an official version!

philholden4y ago

Glad to hear it; I was considering moving to Deno Deploy + Supabase because KV was not good for relationships.

jzer0cool4y ago

How does this work when developing locally? Is it SQLite for local development?

benjiweber4y ago

I was expecting this to be using https://en.wikipedia.org/wiki/D_(data_language_specification... given the name.

polskibus4y ago

Is this going to be open sourced? Seems to be building on the shoulder of a particular giant that could use a bit wider ecosystem.

deanc4y ago

Any word on pricing =)?

oxff4y ago

It's a bold strategy, Cotton, sounding a bit like they want to compete with AWS.

onphonenow4y ago

Our first database … I like it. I wonder what’s next

alberth4y ago

First, super excited by having Cloudflare offer an RDBMS (can SQLite be called that?)

This enables entirely new classes of applications where everything can now be hosted by Cloudflare.

Questions:

a. To help with concurrent writes, will Cloudflare be using WAL2 and BEGIN CONCURRENT branches of SQLite?

b. How is Cloudflare replicating the data cross region? Will it be Litestream.io behind the scenes?

c. Will our Worker code need to be written differently to ensure only a single-writer is writing to SQLite database?

d. How does data persistency and database file size get factored in? I have to imagine there is a limit to how much storage can be used, whether or not that storage is local to the Worker machine, and if it's persistent.

rvz4y ago

Now is this a Cloudflare ($NET) buy signal? I think you know the answer.

Maybe they will announce a Hashicorp competitor in their next reveal. Who knows.

kurinikku4y ago· 20 in thread

wow SQLite getting a lot of love these days

https://tailscale.com/blog/database-for-2022

https://fly.io/blog/all-in-on-sqlite-litestream

https://blog.cloudflare.com/introducing-d1

peterhunt4y ago

Using SQLite for these applications is really fun, and it seems like a good idea on paper. But in practice I just don't think it's worth it. YMMV though.

manigandham4y ago

So you didn't use SQLite then? Because RocksDB + Kafka is not similar at all.

2 more replies

samatman4y ago

I accept that you learned a lot about the limits of combining RocksDB with Kafka, especially in the exact way you combined them.

This might have limited utility if the goal were to combine RocksDB with something else. And even less for SQLite and something else.

ripley124y ago

I'm not sure why you concluded that SQLite is the problem when you built a "really cool distributed database" with Kafka. Distributed databases are complicated, Kafka's complicated.

If you're saying that a replicated Postgres setup would be simpler than what you've built, I agree; but SQLite+Litestream probably would be too.

1 more reply

MuffinFlavored4y ago

Is this any good? https://github.com/rqlite/rqlite

I've been looking for a turn key solution that is better than me running a single node Postgres instance "bare metal" or in a container.

postgres-operator seems cool but... k8s, pretty heavy I guess.

1 more reply

jen204y ago

It’s the default storage engine for FoundationDB - not sure many would agree that isn’t a “durable, distributed database”.

1 more reply

jgrahamc4y ago

SQLite has been cool forever. It was the underlying data store for my machine learning email filter POPFile 20 years ago!

https://en.wikipedia.org/wiki/POPFile https://getpopfile.org/browser/trunk/engine/POPFile/Database...

runlevel14y ago

It's high-quality software too. It's well-commented and exceptionally well tested.[1][2]

I don't usually place much stock in those sort of counts, but 640x is notable.

It makes sense considering the wide variety of use-cases, from embedded devices to edge computing and everything in between.

[1]: https://www.sqlite.org/testing.html

[2]: https://sqlite.org/src/dir?ci=trunk

sigzero4y ago

I used POPFile!! It was awesome.

alberth4y ago

SQLite was originally great for desktop applications.

jchw4y ago

Cthulhu_4y ago

1 more reply

RaoulP4y ago

With the mileage (and attention) those new products are getting out of using SQLite, I think Richard Hipp deserves a lot more acknowledgement for creating such an amazing piece of software.

sophacles4y ago

At least that's been my observation since I started coming around here.

tootie4y ago

I'm wondering if we'll see some similar energy around non-sql embedded databases like leveldb or rocksdb

sanderjd4y ago

Thaxll4y ago

Well I don't think it's a good fit for regular service, exactly how do you handle 2 replicas of the same service talking to the same DB?

The fact that it's just a file on disk limits the usage.

nibab4y ago

Projects such as litestream and rqlite have this figured out.

2 more replies

manigandham4y ago

Transactions, locks, queues, etc. No different than multiple app instances changing the same row in other databases.

Any state mutation is ultimately ordered in time, and how that ordering is accomplished depends on the abstractions you're using: in your app, network layer, database, etc.

1 more reply

mbreese4y ago

the_duke4y ago· 19 in thread

All this recent hype around sqlite...

sqlite is a great embedded database and thanks to use by browsers and on mobile the most used database in the world by orders of magnitude.

But it also comes with lots of limitations.

* there is no type safety, unless you run with the new strict mode, which comes with some significant drawbacks (eg limited to the handful of primitive types)

* very narrow set of column types and overall functionality in general

* the big one for me: limited migration support, requiring quite a lot of ceremony for common tasks (eg rewriting a whole table and swapping it out)

This seems to be mitigated/solved here though by the ability to run worker code "next to the database".

I'm somewhat surprised they went this route. It probably makes sense given the constraints of Cloudflare's architecture and the complexity of running a more advanced globally distributed database.

On the upside: hopefully this usage in domains that are somewhat unusual can lead to funding for more upstream sqlite features.
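The type-safety point in the list above is easy to see in a few lines. This is a minimal sketch with stock SQLite via Python's `sqlite3`; the `items` tables are invented for illustration. The STRICT part runs only when the linked SQLite library is 3.37 or newer, which is the release that introduced STRICT tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (qty INTEGER)")

# Column types are affinities, not constraints, in ordinary tables.
conn.execute("INSERT INTO items VALUES (?)", ("12",))       # numeric-looking text is converted
conn.execute("INSERT INTO items VALUES (?)", ("a dozen",))  # other text is stored as text
types = [row[0] for row in conn.execute("SELECT typeof(qty) FROM items ORDER BY rowid")]

# STRICT tables reject the mismatched value instead of storing it.
strict_rejected = None  # stays None when the local SQLite predates 3.37
if sqlite3.sqlite_version_info >= (3, 37, 0):
    conn.execute("CREATE TABLE strict_items (qty INTEGER) STRICT")
    try:
        conn.execute("INSERT INTO strict_items VALUES (?)", ("a dozen",))
        strict_rejected = False
    except sqlite3.Error:
        strict_rejected = True
```

The drawback mentioned above still applies: a STRICT table only accepts a small fixed set of column type names.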

prirun4y ago

* the big one for me: very limited migration support, requiring quite a lot of ceremony for common tasks (eg rewriting a whole table and swapping it out)

https://www.sqlite.org/lang_altertable.html

the_duke4y ago

For a long time sqlite did not have DROP COLUMN and RENAME COLUMN support, which are both pretty essential.

I'm embarrassed to admit that I didn't realize RENAME COLUMN was actually added in 3.25, almost four years ago.

DROP COLUMN was only just added last year in 3.35.

I'm surprised a database schema lasted 9/12 years without ever renaming or dropping a column.

This changes things! But even now, ALTER TABLE is not transactional. So especially with many concurrent readers there can definitely be situations where you'd still want to rewrite.

2 more replies

cryptonector4y ago

It would really help if SQLite3 had a `MERGE`, or, failing that, `FULL OUTER JOIN`. In fact, I want it to have `FULL OUTER JOIN` even if it gains a `MERGE`.

`FULL OUTER JOIN` is essential to implementing `MERGE`. Granted, one could implement `MERGE` without implementing `FULL OUTER JOIN` as a public feature, but that seems silly.

Sadly, the SQLite3 dev team specifically says they will not implement `FULL OUTER JOIN`[0].

Implementing `MERGE`-like updates without `FULL OUTER JOIN` is possible (using two `LEFT OUTER JOIN`s), but it's an O(N log N) operation instead of O(N).

The lack of `FULL OUTER JOIN` is a serious flaw in SQLite3. IMO.

  [0] https://www.sqlite.org/omitted.html
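The two-`LEFT OUTER JOIN` workaround mentioned in this comment can be sketched as follows. The `target` and `source` tables are invented for illustration, and this is one common way to emulate the join rather than anything the SQLite team prescribes. As far as I know, SQLite releases from 3.39 onward do accept `FULL OUTER JOIN` directly, so the emulation mainly matters for older builds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE target (k TEXT PRIMARY KEY, v INTEGER);
CREATE TABLE source (k TEXT PRIMARY KEY, v INTEGER);
INSERT INTO target VALUES ('a', 1), ('b', 2);
INSERT INTO source VALUES ('b', 20), ('c', 30);
""")

# First branch: every target row, with its source match if any.
# Second branch: source rows that had no target match at all.
rows = conn.execute("""
SELECT COALESCE(t.k, s.k) AS k, t.v AS old_v, s.v AS new_v
FROM target AS t LEFT JOIN source AS s ON s.k = t.k
UNION ALL
SELECT s.k, t.v, s.v
FROM source AS s LEFT JOIN target AS t ON t.k = s.k
WHERE t.k IS NULL
ORDER BY 1
""").fetchall()
```

Read as a `MERGE` plan, a row with no `new_v` is a delete candidate, a row with no `old_v` is an insert, and a row with both is an update.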

1 more reply

sorenbs4y ago

Migrations have gotten better recently, but there are still cases where you need to follow the 12 steps very carefully: https://www.sqlite.org/lang_altertable.html#otheralter

Prisma Migrate can automatically generate these steps, removing most of the pain. I'm sure other migration tools can do this as well.
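For a sense of what those generated steps look like, here is a heavily simplified sketch of the rebuild-and-swap pattern. It omits parts of the documented procedure (foreign key handling, recreating indexes, triggers and views), and the `users` schema is invented for illustration. SQLite runs DDL inside transactions, so the swap is wrapped in one.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit transactions
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, legacy_flag INTEGER)")
conn.execute("INSERT INTO users VALUES (1, 'ada', 0), (2, 'grace', 1)")

conn.execute("BEGIN")
try:
    # 1. Create the table in its desired final shape.
    conn.execute("CREATE TABLE users_new (id INTEGER PRIMARY KEY, full_name TEXT NOT NULL)")
    # 2. Copy the data across, mapping old columns to new ones.
    conn.execute("INSERT INTO users_new (id, full_name) SELECT id, name FROM users")
    # 3. Drop the old table and move the new one into its place.
    conn.execute("DROP TABLE users")
    conn.execute("ALTER TABLE users_new RENAME TO users")
    conn.execute("COMMIT")
except sqlite3.Error:
    conn.execute("ROLLBACK")
    raise

columns = [row[1] for row in conn.execute("PRAGMA table_info(users)")]
rows = conn.execute("SELECT id, full_name FROM users ORDER BY id").fetchall()
```

On a large table the copy in step 2 is the expensive part, which is why in-place `RENAME COLUMN` and `DROP COLUMN` are preferable when the change allows it.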

llimllib4y ago

simonw's sqlite-utils can help here too: https://sqlite-utils.datasette.io/en/stable/cli.html#transfo...

vlovich1234y ago

D1 does not throw away consistency. It’s built on top of Durable Objects which is globally strongly consistent.

smarx0074y ago

"D1 will create read-only clones of your data, close to where your users are, and constantly keep them up-to-date with changes."

So, while the consistency is not "thrown away", it's no longer a strong consistency? Anyway, Kyle from Jepsen will figure it out soon, I guess :)

1 more reply

greg-m4y ago

Just clarifying - D1 without read replicas is strongly consistent. If you add read replicas, those can have replication lag and will not be strongly consistent.

Disclaimer: I work at Cloudflare :)

1 more reply

mwcampbell4y ago

1 more reply

hn_ei_ser_234y ago

That is interesting. I wish CF would give us some more information as I've assumed that there must be a lack of strong consistency which would be a major drawback.

Edit: But that would mean that durable objects can't be replicated asynchronously? That would mean a big latency hit. Then what's the difference to a central DB in one datacenter?

kwizzt4y ago

I’m not familiar with Durable Objects. When D1 does replication to read replicas, if it’s not doing synchronous replication, then it’s not strongly consistent, is that correct?

the_duke4y ago

I wish the post had provided some more technical details.

It's more of a "quickstart" than a peek under the hood.

1 more reply

jpcapdevila4y ago

Are you guys using litestream or a similar approach? E.g storing WAL frames in a durable object.

jambutters4y ago

What types are missing from strict that you need?

1 more reply

vaughan4y ago

Has anyone tried to write a new modern SQLite?

sophacles4y ago

jpcapdevila4y ago

I think the most successful attempt would be Realm.

https://realm.io/

chrisshroba4y ago

DuckDB comes to mind, but I can't speak to its differences from SQLite.

https://duckdb.org/

2 more replies

steaminghams4y ago

why do you consider sqlite to not be modern?

all the hip service providers seem to be all over it which would indicate pretty good modernity to me at least.

mwcampbell4y ago· 13 in thread

Any current or planned support for existing ORMs, such as Prisma or TypeOrm?

Also, I wonder how hard it will be to migrate existing PostgreSQL databases and SQL statements. Of course, I understand if Cloudflare is focused on greenfield applications.

sorenbs4y ago

Prisma won't work with D1 out of the box. The primary limitations are:

Note that Prisma already supports querying Postgres, MySQL, SQL Server and MongoDB from Cloudflare Workers through the Prisma Data Proxy, which will see a GA release next month.

geelen4y ago

I am also excited about this :)

Cthulhu_4y ago

Before you consider using an ORM, try using regular SQL and some tooling first; your future self will thank you. Just write the code, it's only volume and it's not so bad.

joshstrange4y ago

pier254y ago

Totally agree.

Source: someone who avoided learning SQL for 20 years.

gigatexal4y ago

+1 to this as well.

jgrahamc4y ago

joshstrange4y ago

> I hope someone creates the next Rails using Workers

I wish CF Workers had been out when I first started building my current project, I might have gone in that direction instead, I still might.

2 more replies

gen2204y ago

You might want to consider adding Deno [1] to the language examples: https://developers.cloudflare.com/workers/platform/languages...

Deno can compile to wasm, so it can plug in through that vertical. But it's just TS on the frontend.

I'm mainly a python programmer, but Deno's been the most alluring development in the JS ecosystem since typescript for me. Might be helpful to you all to capture some steam from source.

[1]: https://deno.land/

jpcapdevila4y ago

I'm building an open source firebase alternative using sqlite. I'll be reaching out, I was thinking to build the distribution & durability part myself, but I would rather use D1!

I guess it would count as a client focused ORM :)

I'll be reaching out from jp@javascriptdb.com

Great addition, congrats!

eatonphil4y ago

2 more replies

irq-14y ago

This should have a virtual file system. CF should write it so each user doesn't have to load a JS abstraction and it has better performance.

Cthulhu_4y ago

slashdev4y ago· 9 in thread

For a Cloudflare article, this one is surprisingly light on technical details. And for the product where it most matters.

I'm a bit curious how that replication works. Ship the whole db? Binary diffs of the master? Ship the SQL statements that did the write and reapply them? Lots of performance and other tradeoffs here.

So many questions, but no details yet.

otoolep4y ago

Disclaimer: I am the creator of rqlite.

[1] https://github.com/rqlite/rqlite

rad_gruchalski4y ago

I’ve been looking at rqlite for some time and it’s really great to track the product on github.

A combo of D1, R2 and Workers is a serious contender for over-the-top serverless distributed apps. This is great.

1 more reply

sebk4y ago

Small nitpick, but that's still consistent as in ACID. I think what you mean is it wouldn't be consistent in the CAP sense (it wouldn't be linearizable).

TFA does say that read-replicas will be present at every edge location, which makes sense for a product like Workers. But it doesn't mention writes at all.

eloff4y ago

Yes, that's true.

dragonwriter4y ago

> I'm guessing this is a single master database with multiple read replicas. That means it's not consistent

Single master with read replicas is fully consistent if commits don't return until propagated to and acknowledged by replicas (the expense here being commit latency.)

otoolep4y ago

Disclaimer: I am the creator of rqlite.

[1] https://github.com/rqlite/rqlite

1 more reply

eloff4y ago

ithrow4y ago

Yeah, nothing about WAL mode which is what most users will want for web apps.

sqlite is accessed via a socket? defeats the whole purpose of using sqlite.

Many here are mentioning using one sqlite file per customer but that sounds like a nightmare for migrations and analytics.

SQLite is great and all these new services and articles are nice but intentionally shadowing lots of complexity.

detroitcoder4y ago

endisneigh4y ago· 8 in thread

Have any of the problems that led people to use Postgres instead of SQLite actually been solved? Are we doomed to repeat the same mistakes?

Also, any plans to support PATCH x-update-range so SQLite can be used entirely in the browser via SQLite.js?

Can someone enlighten me with the types of use cases this would be better for vs say Postgres?

ignoramous4y ago

endisneigh4y ago

SQLite made sense as an embedded database on, say, a desktop or phone because there’s generally only a single person writing to it. The perfect use case.

I don’t understand how it will be usable at all in a website with multiple users. Is the idea to build your site so every user gets their own database? How do you stop SQL injection?

Once you solve all of these problems aren’t you better off just using Postgres?
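On the SQL injection question specifically, the usual answer is the same for SQLite as for Postgres: bind user input as parameters instead of formatting it into the query string. A small sketch with Python's standard `sqlite3` module; the `users` table and the hostile input are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'alice@example.com')")
conn.execute("INSERT INTO users VALUES ('bob', 'bob@example.com')")

hostile = "alice' OR '1'='1"

# Unsafe: string formatting lets the input rewrite the query,
# so the WHERE clause becomes true for every row.
unsafe_rows = conn.execute(
    f"SELECT email FROM users WHERE name = '{hostile}'"
).fetchall()

# Safe: the ? placeholder binds the input as a value, never as SQL.
safe_rows = conn.execute(
    "SELECT email FROM users WHERE name = ?", (hostile,)
).fetchall()
```

Here `unsafe_rows` contains both users' emails, while `safe_rows` is empty because no user is literally named `alice' OR '1'='1`.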

nindalf4y ago

Which problems were you thinking of?

Cloudflare and fly.io both promise hassle free read replicas and backup. They will both offer only a single node capable of writes, because that’s how SQLite rolls.

This is a pretty good fit for a read heavy load that requires SQL and very low latency.

endisneigh4y ago

I guess I’m not understanding what the benefit is vs hosted Postgres. Also low latency and setup can be equally trivial - see supabase for example.

hn_ei_ser_234y ago

The important drawback is async replication and therefore the lack of full consistency. On the other hand, this is the big advantage of hosted Postgres and the like.

Those offerings are great for use-cases that don't need that kind of consistency, which are many.

hn_ei_ser_234y ago

No and no. I think this is great for Edge computing, where there is currently no solution. So, it's better than nothing.

jve4y ago

What problems? Both are for different use cases albeit overlapping.

endisneigh4y ago

Concurrent writes, for one.

ranguna4y ago· 6 in thread

This looks amazing!

I see cloudflare people are on this post, any chance to compare D1 vs postgres in terms of DB features?

Insert ... Returning

Stored procedures and triggers

Etc etc

Would be really helpful to get a comparison like cockroachDB did here https://www.cockroachlabs.com/docs/stable/postgresql-compati...

Or even better, a general sql compatibility matrix like this https://www.cockroachlabs.com/docs/stable/sql-feature-suppor...

Kudos to the cloudflare team!

the_duke4y ago

Well, it's sqlite... so presumably you will get most of the capabilities sqlite has.

RETURNING is covered.
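A quick sketch of what that looks like in stock SQLite, which added `RETURNING` in version 3.35.0. This uses Python's standard `sqlite3` module against a local in-memory database, not D1's API; the `orders` table is made up, and the fallback branch is only there for older SQLite builds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")

if sqlite3.sqlite_version_info >= (3, 35, 0):
    # RETURNING hands back the generated key in the same statement.
    new_id = conn.execute(
        "INSERT INTO orders (item) VALUES (?) RETURNING id", ("widget",)
    ).fetchone()[0]
else:
    # Older SQLite builds: fall back to the cursor's lastrowid.
    new_id = conn.execute(
        "INSERT INTO orders (item) VALUES (?)", ("widget",)
    ).lastrowid
conn.commit()
```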

tyingq4y ago

ranguna4y ago

> Stored procedures are indirectly there by running your own code "next to the database",

"indirectly" is a keyword here, because running code when data is modified potentially won't replace triggers since they'll probably execute outside the running transaction.

pier254y ago

Listen/notify

Cthulhu_4y ago

The announcement - if you read it before posting - says it's sqlite, so that's something you can punch into google.

Long story short, don't expect anything fancy. Support for alter table is limited, and concurrency can be an issue.

ranguna4y ago

It is indeed sqlite, but it could have modifications or additions. Please be considerate and think a little more before commenting.

jgrahamc4y ago· 4 in thread

BTW R2 is open beta now: https://blog.cloudflare.com/r2-open-beta/

mariushn4y ago

R2 is 3x more expensive than B2 (storage) https://www.backblaze.com/b2/cloud-storage-pricing.html

Am I missing something? Is there no bandwidth cost at all?

messe4y ago

Yep, you're not charged for egress.

alberth4y ago

Does R2 provide synching between regions? Maybe that's why it's so much more expensive? You're getting regional failover?

rubenv4y ago

Latency

greenie_beans4y ago· 4 in thread

dang i was hoping for postgres so i can use postgis

edit: maybe one day! this looks cool regardless

edvinbesic4y ago

I'm right there with you. I wonder if this is an SQLite compatible API on top of their own solution, or if it's using actual SQLite under the hood with custom replication.

If the latter, and anyone from CloudFlare is here, is there any chance to have SpatiaLite enabled?

https://www.gaia-gis.it/fossil/libspatialite/index

durkie4y ago

Seconding a vote for Spatialite support! I came here just to make that same request.

yawaramin4y ago

No need to call dang!

greenie_beans4y ago

lol i'm from a place where "dang" is a natural part of our vocab

lucasyvas4y ago· 3 in thread

The API for this is currently the only thing I wish I could grok a bit better. It seems like it would be hard to make it work with existing libraries that can access SQLite, which is kind of a shame.

I am getting a bit ahead of myself on the Rust part (presumably that will eventually be supported as part of workers-rs), but I think the feelings still stand if you consider the JS ecosystem.

Edit: I may actually be wrong, but presumably the entire surface isn't covered because there's no file opening, etc.

mritchie7124y ago

There might be a `env.DB.url` (e.g. the jdbc URL) which you could pass into an existing library.

yencabulator4y ago

I'm kinda willing to make a bet that this rides on top of what looks like HTTP to the Javascript engine. That's how their worker-to-worker and worker-to-durable-object protocols are.

(It's not really HTTP as in it might never cross a TCP socket, just get shuffled from one V8 isolate to another, but it looks like a `fetch` call to the Javascript.)

It's also worth remembering that SQLite itself has no wire protocol, it's a library. And there is no such thing as a "SQL wire protocol". It sure isn't gonna be Postgres wire protocol either.

From the article:

lucasyvas4y ago

Interesting thought! Would love to see more details.

fzaninotto4y ago· 2 in thread

Love the Northwind Traders reference! However, for a demo, I suggest a slightly larger and more complex data set, [data-generator-retail](https://www.npmjs.com/package/data-generator-retail).

I also think the demo would have more impact if it wasn't read-only (although I understand that this could lead to broken pages if visitors mess with the data).

Anyway, kudos to the CloudFlare team!

naiv4y ago

I was thinking the same. The dataset is way too small.

celso4y ago

Fixed the orders table. Good catch.

samwillis4y ago· 2 in thread

This is really interesting, it's (basing it on SQLite) exactly what I was expecting CloudFlare to do for their first DB.

It's perfect for content-type sites that want search and querying.

Anyone from CF here, is it using Litestream (https://litestream.io) for its replication or have you built your own replication system?

0: https://www.sqlite.org/sessionintro.html

jgrahamc4y ago

1. No, it's not built on Litestream. Operating a massive network and shuttling data around is kind of our thing.

2. We are going all in on databases and D2 sounds like a cool name for something...

xafke4y ago

R2, D2. I see what you did there!

lucasyvas4y ago· 2 in thread

To the person from Cloudflare I complained to in last year's thread about putting your money where your mouth is on serverless databases:

You weren't lying, and this is super cool - the SQLite hype train also seems to be in full force.

throwaway8943454y ago

It's interesting to see a relatively old technology get hyped.

jgrahamc4y ago

:-)

rmbyrro4y ago· 2 in thread

I'm buying Cloudflare stocks right now.

In 2-3 years from now, these services will be so mature and strong they will be crushing the cloud market.

They're turning dreams into reality, one after another.

endisneigh4y ago

Cloud business is driven by enterprise generally. Would enterprise be using SQLite?

Quarrelsome4y ago

they should be using SQLite more often than they are.

jpcapdevila4y ago· 2 in thread

If SQLite gets you excited, I'm building a firebase alternative based on sqlite. I'm betting hard on sqlite so this gets me super excited!!

https://javascriptdb.com

CF people around, I would love to chat, if anyone is interested please reach out at: jp@javascriptdb.com

I'll be applying to this beta for sure!

js4ever4y ago

Super interesting! I really like the idea. I'll join the beta, email sent :)

jpcapdevila4y ago

Any feedback on what you find interesting would be awesome :) thanks!!

frogger84y ago· 2 in thread

Not an expert on DOM or JavaScript so be kind ;)

One thing I hope to see in the future is a better product filtering experience. When I worked on a jquery product filter I realized the DOM bloat was the main problem.

I wonder if D1 can help devs build instant product filtering pages that don’t require the reload like microcenter or Newegg does.

IE https://www.newegg.com/p/pl?d=hdmi+cable&N=-1&SortType=8

mbreese4y ago

Having a SQLite db close to your worker node really isn’t going to affect this problem all that much.

Cthulhu_4y ago

It's probably better - especially for more advanced search engines - to have an elasticsearch instance or whichever is the more recent example handle product search and filtering like that.

_kyran4y ago· 2 in thread

So can we assume that D2 will be postgres/mysql ?

eatonphil4y ago

It sounds like you're making a simile but I don't understand it. The article did literally state D1 is based on sqlite.

_kyran4y ago

The opening paragraph reads "Today, we're excited to announce D1, our first SQL database." read: first

and well R2 and D2 would make for a great naming scheme.

hn_ei_ser_234y ago· 1 in thread

First, I'm very excited. Sure, SQLite has some limitations compared to Postgres, esp. regarding the type system and concurrency. But we get ACID compliance and SQL.

But it is really hard getting some useful information from this article. I can't even tell if it is not there or just buried in all this marketing hot air.

So, what is it really? Is there one Write-Master that is asynchronously replicated to all other locations? Will writes be forwarded to this master and then replicated back?

But overall I'm very excited. Also by the fly.io announcement, of course. Lots of innovation and competition. Good times for customers.

tyingq4y ago

>So, what is it really? Is there one Write-Master that is asynchronously replicated to all other locations? Will writes be forwarded to this master and then replicated back?

Not a lot of detail, but that is mentioned:

infogulch4y ago· 1 in thread

Very cool! Glad to see all the love for SQLite recently.

Some questions:

Will D1 be able to deliver this design of having many thousands of separate databases for a single application? Will this be problematic from a cost perspective?

> since we're building on the redundant storage of Durable Objects, your database can physically move locations as needed

Will D1 be able to easily migrate the "primary" at will? CockroachDB described this as "follow the sun" primary.

unraveller4y ago

I guess the first answer is: similar to Durable Object limits (unlimited databases / 50 GB total) since they alluded to those abilities more so than a simple file stored on R2 (only for backups).

ryanto4y ago· 1 in thread

This is so cool!

From the blog post it says read-only replicas are created close to users and kept up to date with the latest data.

- How should I think about this in terms of CAP? If there's a write and I query a replica what happens?

- How are writes handled? Do they go to a single location or are they handled by various locations?

I'm excited to try this. It's so cool to see databases being distributed "on CDNs" for lack of a better term.

leonidasv4y ago

> Embedded compute

SheinhardtWigCo4y ago· 1 in thread

Big fan of Cloudflare but I wish they would stick to descriptive product names.

Good: Workers, KV, Durable Objects, Cron Triggers

Bad: Spectrum, Zaraz, R2, D1

alberth4y ago

Naming is hard.

> Zaraz

That's the name of the company they acquired. Though, I do agree that more descriptive naming is nice.

E.g.

Zaraz = SafeXXS

D1 = LDS (light database system)

R2 = ObjectStore

Spectrum = Reverse Proxy

didip4y ago· 1 in thread

All this hype around SQLite recently and I am still confused.

* How do you replicate it consistently?

* Who has the master privilege (or masters if sharded)? What's the failover story?

I am guessing a blob store is involved, but I have gaps in my understanding here.

discodave4y ago

SQLite has a write ahead log (journal) mode. If you write that log to some store that is already replicated (S3, CloudFlare Durable Objects, Kafka?) then the concept of a 'master' is less important.

estensen4y ago· 1 in thread

Too bad you probably can't use this to store data about EU citizens. Phone numbers like they show in the demo are considered PII, right?

methyl4y ago

why?

whitepaint4y ago· 1 in thread

Will they seriously challenge Azure, AWS and GCP eventually? Cloudflare is very innovative and what they are doing is really exciting.

015a4y ago

losvedir4y ago

Wow, this looks potentially very interesting. Since this is sort of fresh in my mind from the recent Fly post about it:

* How are writes handled? Does it do the Fly thing about sending all requests to one worker?

ngrilly4y ago

tyingq4y ago

irq-14y ago

[1] You could add a BEW file (like WAL file) to sqlite for Best Effort Writes.

aeyes4y ago

What write throughput and latency can we expect from this database?

Are there any limitations, for example on the number of tables or size of the database?

xwdv4y ago

With this we can probably switch our infrastructure off AWS and entirely onto Cloudflare.

pier254y ago

So where are the databases running? In the same regions as workers?

Is the data replicated to all regions?

dinkleberg4y ago

jcuenod4y ago

So I assume we'll see a nice big donation to the sqlite coffers, then?

ralusek4y ago

Unless I missed it by skimming, where are the deets? Is this strongly or eventually consistent? What are max table sizes, and do they become partitioned? Are there cross partition joins?

robertlagrant4y ago

This looks awesome. I was thinking about creating a custom version of this to live behind a CF Worker. Much better to have an official version!

philholden4y ago

Glad to hear it. I was considering moving to Deno Deploy + Supabase because KV was not good for relationships.

jzer0cool4y ago

How does this work when developing locally? Is it SQLite for local development?

benjiweber4y ago

I was expecting this to be using https://en.wikipedia.org/wiki/D_(data_language_specification... given the name.

polskibus4y ago

Is this going to be open sourced? Seems to be building on the shoulder of a particular giant that could use a bit wider ecosystem.

deanc4y ago

Any word on pricing =)?

oxff4y ago

It's a bold strategy, Cotton, sounding a bit like they want to compete with AWS.

onphonenow4y ago

Our first database … I like it. I wonder what’s next

alberth4y ago

First, super excited by having Cloudflare offer an RDBMS (can SQLite be called that?)

This enables entirely new classes of applications where everything can now be hosted by Cloudflare.

Questions:

a. To help with concurrent writes, will Cloudflare be using WAL2 and BEGIN CONCURRENT branches of SQLite?

b. How is Cloudflare replicating the data cross region? Will it be Litestream.io behind the scenes?

c. Will our Worker code need to be written differently to ensure only a single-writer is writing to SQLite database?

rvz4y ago

Now is this a Cloudflare ($NET) buy signal? I think you know the answer.

Maybe they will announce a Hashicorp competitor in their next reveal. Who knows.
