CouchDB 2.1.0

(docs.couchdb.org)

101 points · tropshop · 8y ago · 23 comments

16 comments · 4 top-level

tarr11 · 8y ago · 4 in thread

I looked really hard at using CouchDB 2.0 for an offline-first progressive web app. In my opinion, CouchDB's biggest asset is actually PouchDB [1], which is really great and nicely designed.

However, the showstopper I hit was that CouchDB still does not support permissions on a per-user basis (even though PouchDB explicitly encourages this [2]).

What this means is that if you have a typical system with users and logins, where users can only access their own data, and you need some sort of aggregated view over all users' data behind some sort of permissions (admin user, etc.), you are forced to create a new db for each user in Couch and then replicate each of those dbs to a separate master.

Think, for example, of storing images in CouchDB and then offering some sort of global feed (like Instagram).

So, systems with many users (tens of thousands -> millions) would be stuck replicating an equal number of Couch databases constantly. Each replication in 2.0 was an Erlang process, and each database is a separate file. Apparently some of the performance issues were fixed in 2.1, but it still requires a physical file per db and continuous replication.
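To make the database-per-user setup described above concrete, here is a minimal sketch of the replication documents it implies, one per user, each pointing a per-user db at an aggregate db. The `userdb-<hex>` naming convention, the `aggregate` db name, and the server URL are assumptions for illustration; `source`, `target`, and `continuous` are the standard fields of a document in the `_replicator` database.

```python
from urllib.parse import quote


def user_db_name(username: str) -> str:
    """Assumed convention: one database per user, named by hex-encoded username."""
    return "userdb-" + username.encode("utf-8").hex()


def replication_doc(server: str, username: str, aggregate_db: str = "aggregate") -> dict:
    """Build a document for the _replicator database (per-user db -> aggregate db)."""
    source = user_db_name(username)
    return {
        "_id": f"{source}-to-{aggregate_db}",
        "source": f"{server}/{quote(source)}",
        "target": f"{server}/{quote(aggregate_db)}",
        "continuous": True,  # one persistent replication per user
    }


# One replication document per user: this is the 1-to-1 cost discussed above.
docs = [replication_doc("http://localhost:5984", u) for u in ("alice", "bob")]
```

With N users this yields N continuous replications, which is exactly the scaling concern raised in this comment.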

Some promising solutions, like envoy [3] from Cloudant, which is based on Mango [4], have been worked on, but they don't seem to have much support in the core Couch community and seem to have been abandoned.

Another problem is that you can't really throttle or manage PouchDB <-> CouchDB replication very well. E.g., if you have a large-ish database that's out of sync, it's just going to try to download itself as soon as you sync, which isn't very mobile-friendly or "PWA"-friendly, IMO.

Net result for me was that I just stuck with my Postgres DB, Rails API, and a bunch of Redux/React code to accomplish what I needed.

Still a fan of Couch/Pouch but wouldn't recommend it if your system requires any kind of serious per-user or role-based permissions or aggregated views.

[1] https://pouchdb.com/

[2] https://pouchdb.com/2015/04/05/filtered-replication.html

[3] https://github.com/cloudant-labs/envoy

[4] https://blog.couchdb.org/2016/08/03/feature-mango-query/

sagelywizard · 8y ago

The main feature which was added in 2.1 was the replication scheduler, which makes it possible to run millions of replications concurrently. That makes the database-per-user model much more feasible.

As for throttling, the CouchDB replicator gracefully handles HTTP 429, but that wouldn't help with PouchDB-managed replications. I can't really speak on PouchDB, but maybe it's possible to limit the replication on the client side, since that's where it's managed?
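As a rough illustration of what "gracefully handling HTTP 429" can look like on a client, here is a minimal backoff sketch. It is not the CouchDB replicator's or PouchDB's actual implementation; `fetch_with_backoff` and the `(status, retry_after, body)` tuple returned by `fetch` are hypothetical names chosen for the example.

```python
import time
from typing import Callable, Optional, Tuple

Response = Tuple[int, Optional[float], bytes]  # (status, Retry-After seconds, body)


def fetch_with_backoff(
    fetch: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, bytes]:
    """Retry a request while the server answers 429, doubling the delay each time."""
    delay = base_delay
    for _ in range(max_attempts):
        status, retry_after, body = fetch()
        if status != 429:
            return status, body
        # Prefer the server's Retry-After hint when it is present.
        sleep(retry_after if retry_after is not None else delay)
        delay *= 2
    raise RuntimeError(f"still throttled after {max_attempts} attempts")


# Usage with a fake server that throttles twice; sleeps are recorded, not executed.
responses = iter([(429, None, b""), (429, 2.0, b""), (200, None, b"ok")])
sleeps = []
result = fetch_with_backoff(lambda: next(responses), sleep=sleeps.append)
```

Injecting `sleep` keeps the sketch testable without real waiting.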

marknadal · 8y ago

If you are looking for something like CouchDB that only syncs the subsets of the data you request (rather than the whole thing), try checking out https://github.com/amark/gun (disclosure: I build it). However, Postgres is probably the best database choice out there right now, so you made a good decision.

tropshop (OP) · 8y ago

> So, systems with many users (tens of thousands -> millions) would be stuck replicating an equal number of couch databases constantly

I also ran into this when using `continuous: true` to create persistent replication, which suffers from this 1-to-1 resource problem. I now use a single listener on `_db_updates` and fire one-off replications to the aggregate db. I'm also keeping an eye on spiegel [1].
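The single-listener approach can be sketched roughly as follows: consume events from CouchDB's `/_db_updates` feed and turn each changed per-user db into one one-off replication request body for `POST /_replicate`. This is an illustrative sketch, not the commenter's code; the `userdb-` prefix, the `aggregate` db name, and the function name are assumptions.

```python
from typing import Dict, Iterable, List


def one_off_replications(
    events: Iterable[Dict[str, str]],
    server: str,
    aggregate_db: str = "aggregate",
    prefix: str = "userdb-",
) -> List[Dict[str, str]]:
    """Map a batch of _db_updates events to deduplicated /_replicate request bodies."""
    seen = set()
    jobs = []
    for event in events:
        name = event.get("db_name", "")
        # Only per-user databases that were updated are of interest.
        if event.get("type") != "updated" or not name.startswith(prefix):
            continue
        if name in seen:
            continue  # several updates to one db need only one replication
        seen.add(name)
        jobs.append({"source": f"{server}/{name}", "target": f"{server}/{aggregate_db}"})
    return jobs


events = [
    {"db_name": "userdb-616c696365", "type": "updated", "seq": "1-a"},
    {"db_name": "userdb-616c696365", "type": "updated", "seq": "2-b"},
    {"db_name": "userdb-626f62", "type": "created", "seq": "3-c"},
    {"db_name": "aggregate", "type": "updated", "seq": "4-d"},
]
jobs = one_off_replications(events, "http://localhost:5984")
```

Because the bodies omit `continuous`, each replication runs once and exits, so idle users cost nothing between updates.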

I'm curious about how you handle offline support, sync conflicts, and multiple devices. I started with CouchDB 1.x and have been through enough growing pains to wonder how things would have gone had I stuck with SQL, but I think the future is only brighter. CouchDB 2.1 looks great, and I hope the entire Erlang ecosystem continues to see growth and adoption.

[1] https://github.com/redgeoff/spiegel

tarr11 · 8y ago

I know it is a common trope in the Couch/Pouch world that SQL databases don't work for offline replication and that you need their "batteries-included" solution. However, it's not the only solution: every app in the app store that works offline has had to deal with this, and most of them use SQL databases.

In truth, what Couch does is force you to handle conflict events, which gives you a convenient platform-level place to put some of your automatic conflict/merge code. A lot of merge code is UI, however, and Couch does not help with that at all.

Detecting conflicts is a subtle thing, but it's not necessarily a one-size-fits-all problem as Couch would have us believe.

It also gives you versioning for free (which is pretty easy to implement in SQL).
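One way to read "versioning in SQL" is an optimistic version column: a write only succeeds if the row still has the version the client last saw, much like a CouchDB `_rev` check. The sketch below uses SQLite for self-containment; the `docs` table and `update_doc` helper are hypothetical, not something described in the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE docs (id TEXT PRIMARY KEY, body TEXT NOT NULL, version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO docs VALUES ('note-1', 'first draft', 1)")
conn.commit()


def update_doc(conn: sqlite3.Connection, doc_id: str, new_body: str, expected_version: int) -> bool:
    """Apply the write only if nobody else has bumped the version in the meantime."""
    cur = conn.execute(
        "UPDATE docs SET body = ?, version = version + 1 WHERE id = ? AND version = ?",
        (new_body, doc_id, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1  # zero rows touched means the caller's copy was stale


first = update_doc(conn, "note-1", "second draft", 1)  # version matches, write applies
stale = update_doc(conn, "note-1", "conflicting edit", 1)  # version is now 2, write rejected
```

A rejected write is the SQL analogue of a 409 conflict: the application still has to decide what to do next.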

However, at the end of the day, every developer must make application-level decisions on how to handle merge conflicts. Couch/Pouch does not obviate this; it just forces you to deal with it and lets you think about your application a little differently. Once you have to deal with merges and conflicts in your application code, it's more a matter of designing your application to have fewer conflicts (through better merging, vector clocks, pessimistic locking, etc.).
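As an example of the kind of application-level merge code meant here, the following is a minimal field-by-field three-way merge over flat JSON-like documents. It is one possible strategy under stated assumptions (a common ancestor `base` is available, and a value of `None` is treated the same as a missing field), not what CouchDB or PouchDB do for you.

```python
from typing import Any, Dict, List, Tuple

Doc = Dict[str, Any]


def three_way_merge(base: Doc, local: Doc, remote: Doc) -> Tuple[Doc, List[str]]:
    """Merge two edited copies of a document against their common ancestor."""
    merged: Doc = {}
    conflicts: List[str] = []
    for key in sorted(set(base) | set(local) | set(remote)):
        b, l, r = base.get(key), local.get(key), remote.get(key)
        if l == r:
            value = l  # both sides agree
        elif l == b:
            value = r  # only the remote side changed this field
        elif r == b:
            value = l  # only the local side changed this field
        else:
            conflicts.append(key)  # both changed it differently: needs a human or a rule
            value = l  # keep the local value provisionally
        if value is not None:
            merged[key] = value
    return merged, conflicts


base = {"title": "a", "body": "x"}
clean, clean_conflicts = three_way_merge(base, {"title": "b", "body": "x"}, {"title": "a", "body": "y"})
dirty, dirty_conflicts = three_way_merge(base, {"title": "b", "body": "x"}, {"title": "c", "body": "x"})
```

Fields listed in `conflicts` are where the UI work mentioned above comes in.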

I wish I could have used it but all my other data is in a battle-tested Postgres db, being nicely backed up regularly on Heroku without fail. My CouchDB was on a Google Cloud server, using a brand new replication framework as backup.

I don't really mean to rag on Couch - I think it's a promising idea. But I ended up wasting several weeks going down this path only to throw it away because I didn't feel like it was production-ready.

lmcardle · 8y ago · 4 in thread

Why use CouchDB instead of Couchbase?

gdelfino01 · 8y ago

Full RESTful HTTP/JSON API

stock_toaster · 8y ago

I presume the parent was talking about Couchbase's Sync Gateway offering, but maybe not.

granitosaurus · 8y ago

My favorite explanation: https://stackoverflow.com/a/15184612/3737009

They are very different things, and the name Couchbase is misleading, since it doesn't have much to do with the original CouchDB at all. In general, the consensus seems to be that CouchDB is better than Couchbase, but Couchbase is just being heavily marketed.

HodGreeley · 8y ago

That analysis is from 4 years ago and is wrong in many, many ways at this point. As for "the general consensus", I'd like to see evidence that's true among real enterprise users. (FD: I work for Couchbase.)

nothrows · 8y ago · 3 in thread

The lack of per-document authorization is still my biggest issue with Couch, and it keeps me from using it: https://wiki.apache.org/couchdb/PerDocumentAuthorization

feld · 8y ago

The answer I got to this problem was "design middleware that handles this for you" which has to be a joke

random023987 · 8y ago

> The answer I got to this problem was "design middleware that handles this for you" which has to be a joke

It's not a joke.

The Couch security model doesn't match the requirements of the multi-user, untrusted clients typical of internet-distributed applications. But then most databases have a similar limitation. It's only more visible in CouchDB because you can read/write documents directly from a browser without an application server, so the next logical step seems to be letting clients read/write directly to CouchDB over the internet without an app server.

If your data is in Postgres, you will need an application server handling access control, business logic, and serialization.

If your data is in CouchDB, you need a proxy server that handles access control, whitelisting certain URL patterns and body content based on user entitlements.
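The access-control core of such a proxy can be sketched as a pure function that decides whether a request may be forwarded to CouchDB. The rules here are assumptions chosen for illustration (a `userdb-<name>` database per user, an `admin` role that may do anything, design documents read-only for ordinary users); a real proxy would also need to inspect request bodies, as the comment notes.

```python
from typing import Iterable

READ_METHODS = {"GET", "HEAD"}
WRITE_METHODS = {"PUT", "POST", "DELETE"}


def is_allowed(user: str, roles: Iterable[str], method: str, path: str) -> bool:
    """Decide whether the proxy should forward this request to CouchDB."""
    parts = [p for p in path.split("/") if p]
    if not parts:
        return False  # no access to the server root
    if "admin" in set(roles):
        return True
    if parts[0] != f"userdb-{user}":
        return False  # ordinary users only ever see their own database
    if len(parts) > 1 and parts[1] == "_design":
        return method in READ_METHODS  # no uploading views or validation code
    return method in READ_METHODS | WRITE_METHODS


checks = [
    is_allowed("alice", [], "GET", "/userdb-alice/doc1"),
    is_allowed("alice", [], "PUT", "/userdb-bob/doc1"),
    is_allowed("alice", [], "PUT", "/userdb-alice/_design/app"),
    is_allowed("root", ["admin"], "GET", "/aggregate/_all_docs"),
]
```

Keeping the decision separate from the HTTP plumbing makes the whitelist easy to test on its own.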

oblib · 8y ago

That's not too difficult to do with a server side script that accesses the database as an Admin user.

I use Perl CGI scripts to handle that. There are Perl modules for interfacing with CouchDB, but you can also make a simple "curl" call to do it.

I have run into some issues with encoding/decoding JSON when doing that with Perl, though, and I've not yet looked far into how they might be solved.
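For comparison, here is roughly what such an admin-side script might do, sketched in Python rather than Perl: take the raw bytes of an `_all_docs?include_docs=true` response fetched as an admin, keep only the documents belonging to the requesting user, and re-encode the result as UTF-8 JSON. The `owner` field and the `docs_for_user` name are hypothetical. Decoding and encoding explicitly as UTF-8 at the boundaries is one common way to avoid the kind of encoding trouble mentioned above.

```python
import json


def docs_for_user(raw_response: bytes, username: str) -> bytes:
    """Filter an _all_docs (include_docs=true) response down to one user's documents."""
    payload = json.loads(raw_response.decode("utf-8"))
    owned = [
        row["doc"]
        for row in payload.get("rows", [])
        if (row.get("doc") or {}).get("owner") == username  # "owner" is an assumed field
    ]
    # ensure_ascii=False keeps non-ASCII text readable; the bytes are still valid UTF-8.
    return json.dumps({"docs": owned}, ensure_ascii=False).encode("utf-8")


raw = json.dumps(
    {
        "total_rows": 2,
        "offset": 0,
        "rows": [
            {"id": "a", "key": "a", "value": {"rev": "1-x"}, "doc": {"_id": "a", "owner": "alice", "title": "café"}},
            {"id": "b", "key": "b", "value": {"rev": "1-y"}, "doc": {"_id": "b", "owner": "bob", "title": "tea"}},
        ],
    }
).encode("utf-8")
filtered = json.loads(docs_for_user(raw, "alice").decode("utf-8"))
```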

dsun179 · 8y ago · 1 in thread

Has anyone tried out RxDB on the client with CouchDB as the server? I'm mostly afraid of scaling issues when you sync with many clients.

nwienert · 8y ago

Currently using this. Haven't tested heavy scaling, though, because our app doesn't have huge scale needs. BUT, I will say things are looking promising. There are some [1][2] improvements to replication and general speed landing/landed.

As for RxDB, in our limited tests, using only the query-sync feature speeds up everything a ton, not to mention optimistic updates. Overall it's been pretty great, and the developer is very active.

[1] https://github.com/apache/couchdb/pull/495

[2] https://github.com/apache/couchdb/pull/470
