> When you publish a Notion page to the web, the webpage’s metadata may include the names, profile photos, and email addresses associated with any Notion users that have contributed to the page.
The flaw itself is absurd but then just accepting it as "by design" makes it even worse.
Conceptually, I agree it should be easy, but I suspect they're stuck with legacy code and behaviors that rely on the current system. Not breaking anything else while fixing this is likely the time-consuming part.
Anyways, I think Notion has a learning curve that is a little longer than one expects. I can believe that with some dedicated learning time I could be turned into a believer. But I also distinctly had the impression that it was one of those things where it saved a ton of time for a few narrow-visioned people (the people who championed it), but added meaningful time to everyone else's. Those people were largely project managers or operations folks, and transitively the leaders they reported to. It heavily threw the switch towards "legibility" over reality.
It's like when someone new to a messy project creates a spreadsheet and says, "Let's not overthink this, everybody just fill in your project details in your row". If your work, which you are the expert on, doesn't fit nicely into the person's columns, it's not easy for you to fill out. Meanwhile, the person who created the spreadsheet gets what looks like a neat and orderly answer to everything. All the messy things (which are, or at least contain, the correct status of the thing) will be masked under a clean and simple, but rather incorrect, thing. That spreadsheet will also travel far specifically because it's neat and therefore portable. There aren't a bunch of "it depends" in it.
It never meant anything. Notion has always tried to be everything, do everything and work for absolutely everyone, and that has always meant it was just a jumbled mess and a pure waste of computing cycles. Notion has always been a disgrace of an app and a service; shoving AI into it is just the natural next step for a “whatever” company such as this.
What does this mean?
Like every other AI tool it mainly seems to exist to produce productivity porn. Summarize the meetings nobody could be bothered to summarize. Write the docs nobody can be bothered to read or write. Communicate as an end, not a means, because the company you work for has transitioned into the dead-weight phase.
First: This is documented and we also warn users when they publish a page. But, that’s not good enough!
Second: We don’t like this and are looking at ways to fix this either by removing the PII from the public endpoints or by replacing it with an email proxy similar to GitHub’s equivalent functionality for public commits.
P.S: Some folks here have speculated that this should be a 1 minute fix. Unfortunately that is not the case. :(
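For illustration only, the email-proxy idea could look roughly like the sketch below. This is not Notion's code and not GitHub's implementation; the secret, the domain, and the function name are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical values: neither the secret nor the domain comes from Notion.
PROXY_SECRET = b"server-side-secret"
PROXY_DOMAIN = "users.noreply.example.com"


def proxy_email(user_id: str) -> str:
    """Derive a stable alias that cannot be reversed without the server secret."""
    digest = hmac.new(PROXY_SECRET, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    # A truncated keyed hash keeps the alias short while staying stable per user.
    return f"{digest[:16]}@{PROXY_DOMAIN}"


alias = proxy_email("user-123")
```

The same user always maps to the same alias, so public metadata can still distinguish contributors without exposing a real address.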
4 years.
https://cleanshot.com/share/trYdqYFZ
This is pretty meh. We will deploy more explicit messaging while we mitigate this properly.
There is a way to mitigate this. Re-hash and cache the page to be meta-less for public URLs. I guess that requires a huge amount of coding for a team that has not built the product from the ground up. But I feel like a "copy and paste" could fix that (remove author data).
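A rough sketch of what "remove author data" could mean for a public payload. The field names are assumptions; the real shape of Notion's public page metadata is not shown in this thread.

```python
# Assumed field names, chosen only for illustration.
PII_FIELDS = {"name", "email", "profile_photo"}


def strip_pii(value):
    """Return a new JSON-like value with PII keys removed at every nesting level."""
    if isinstance(value, dict):
        return {k: strip_pii(v) for k, v in value.items() if k not in PII_FIELDS}
    if isinstance(value, list):
        return [strip_pii(item) for item in value]
    return value  # scalars are returned unchanged


page = {
    "title": "Roadmap",
    "contributors": [{"id": "u1", "name": "Ada", "email": "ada@example.com"}],
}
public_page = strip_pii(page)
```

The original `page` is left untouched, so authenticated views could keep using it while only `public_page` is cached for public URLs.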
Ignoring the “the bug was raised four years ago” part and assuming you just mean it isn't as easy as that and might break other things: what other things could resolving this potentially break? If the issue is that the PII needs to be present for private/authenticated views, wouldn't it be better to make it unavailable everywhere, including there, and fix that later, rather than leave the PII exposed in public views for a second longer?
Nonsense! It is a 1 minute fix. You just don't want to take a $ hit from inconveniencing users by breaking another part of your app.
Pull your thumb out and do the right thing. Implement the 1 minute fix, and then spend the rest of the week or month fixing the other parts of your app that might break as a result of fixing this.
What are you doing to address the support issues that allowed such a privacy issue to remain after being reported?
What are you doing to address the issues with the company's prioritisation framework that allowed such a privacy issue to remain for 4 years?
Which authorities are you reporting the privacy issue to in line with local requirements?
The sad thing is that people are by now used to the idea that anything they enter on a website is sooner or later going to be leaked, if not sold, as often happens with email addresses.
Yes, some users probably didn't realize their edits to public pages were saved publicly, and that's a legitimate UX complaint. But some of the responsibility has to sit with the user. Otherwise we'd be running daily headlines about Meta "leaking" user data to every advertiser with a checkbook.
I'm not saying it's the most likely project to survive, but they've been working in quiet mode for a good while now.
Notion looks to be pretty capable in that regard, so the knowledge graph options really fell short (Logseq, Obsidian, Joplin, Trilium, Craft). They are likely good if your use case is in their lane.
Anynote looks like a good option, except it doesn't have a web client, just the Android/iOS apps (and macOS, I guess?).
Milanote sounds like a possible option if my use were more inspiration-board heavy.
I'll probably give Anynote a try, but Notion really does seem to be a compelling product if it weren't for the jackassery that led to this thread to begin with.
I was just trying to get a list of building supplies, one of which was the doors I wanted to use, to have a page where I could put a link to the product page for the doors I found.
Anynote looks promising, if I could understand why I didn't have what look to be the "standard objects" in a new space.
I kinda dislike where Notion is heading though, forcing more and more things on their users without any way to disable them. But yes, it's capable of doing what you are looking for.
Maybe Affine could also work though, you can self-host it and it's more customizable: https://affine.pro/
We’ll also see more token-heavy services like Dependabot, SonarQube, etc. that specialize in providing security-related PR reviews and codebase audits.
This is one of the spaces where a small team could build something that quickly pulls great ARR numbers.
The reason for it is very simple: big companies bribe politicians and.... buy ads in media.
We need laws and a competent government to force these companies to care by levying significant fines or jail time for executives depending on severity. Not fines like 0.00002 cents per exposed customer, existential fines like 1% of annual revenue for each exposed customer. If you fuck up bad enough, your company burns to the ground and your CEO goes to jail type consequences.
If we also make the penalty for every crime the death penalty we'll have no more crime. Very simple solution no one has thought of.
some problems I've identified:
1. suppose you have x users and y groups, each of which requires data from some subset of the x users. joining the data on demand can become expensive, O(x*y).
2. the main usefulness of such an architecture is if the data itself is stored with the user, but as group sizes y increase, a single user's data being offline makes aggregate usecases more difficult. this would lend itself to replicating the data server side, but that would defeat the purpose
3. assuming the previous two are solved, which is very difficult to say the least, how do you secure the data for the user such that someone who knows about this architecture can't just go to the clients and trivially scrape all of the data (per user)?
4. how do you allow for these features without allowing people to modify their data in ways you don't want to allow? encryption?
a concrete example of this would be if HN had it so that each user had a sqlite database that stored all of the posts made by that user. then, the HN server would go and fetch the data from each of the posters to show the regular page. presumably, if a given user's data is inaccessible, their posts would simply be omitted.
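a toy sketch of that HN example, with in-memory sqlite databases standing in for each user's own store. everything here (table layout, function names, the `None` marker for an offline user) is made up for illustration, and it ignores the security questions in 3 and 4 entirely.

```python
import sqlite3


def make_user_db(posts):
    """Create an in-memory database standing in for one user's local store."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE posts (created_at INTEGER, body TEXT)")
    db.executemany("INSERT INTO posts VALUES (?, ?)", posts)
    return db


def build_page(user_dbs):
    """Merge posts from every reachable user; unreachable users (None) are omitted."""
    rows = []
    for user, db in user_dbs.items():
        if db is None:  # this user's store is offline
            continue
        for created_at, body in db.execute("SELECT created_at, body FROM posts"):
            rows.append((created_at, user, body))
    return sorted(rows)  # oldest first


user_dbs = {
    "alice": make_user_db([(1, "first"), (3, "third")]),
    "bob": None,  # offline, so bob's posts are missing from the page
    "carol": make_user_db([(2, "second")]),
}
page = build_page(user_dbs)
```

every page render touches every contributing user's store, which is where the cost in problem 1 comes from.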
Premise: treat it as certain that the server will eventually be compromised, subpoenaed, or misconfigured. So the server must hold nothing that can be decrypted or linked to a specific user's content. Users hold their own encryption keys, the server stores ciphertext, and there is no UUID→identity mapping at the sync layer. Sync runs over any-sync, which is peer-to-peer-capable; intermediate nodes see ciphertext.
On your four problems:
1. O(x*y) joins - pushed to the client, because the server can't decrypt enough to do them.
2. Offline members - eventual-consistency sync and CRDT.
3. Client-side theft - if an attacker has the user's keys, they have the data. Intentional: no server-side gate to break means no server-side gate to exfiltrate at scale. We're considering optional 2FA at the infrastructure layer as an additional barrier to data retrieval.
4. Unwanted modifications - content is signed with user keys and validated on read.
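To make point 2 concrete, here is a minimal sketch of one textbook CRDT, a last-writer-wins map. It is a generic illustration of how replicas that were offline can converge after syncing, not a description of any-sync's actual data structures; the function names and the `(timestamp, replica_id)` tie-break are assumptions.

```python
def lww_set(state, key, value, timestamp, replica_id):
    """Record a write; (timestamp, replica_id) decides which write wins."""
    new_state = dict(state)
    if key not in new_state or (timestamp, replica_id) > new_state[key][:2]:
        new_state[key] = (timestamp, replica_id, value)
    return new_state


def lww_merge(a, b):
    """Merge two replicas, keeping the newest entry for each key."""
    merged = dict(a)
    for key, entry in b.items():
        if key not in merged or entry[:2] > merged[key][:2]:
            merged[key] = entry
    return merged


# Two devices edit while disconnected from each other.
laptop = lww_set({}, "title", "Doors", 1, "laptop")
phone = lww_set({}, "title", "Front doors", 2, "phone")
phone = lww_set(phone, "notes", "oak", 3, "phone")

# Merging in either order gives the same result.
merged_a = lww_merge(laptop, phone)
merged_b = lww_merge(phone, laptop)
```

Because the merge only compares entries that each replica already holds, it can run entirely on clients, with no server needing to read the values.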
Real cost is on the product side: no server-side AI over your notes, no server-side full-text search, slower cold-start, and harder to build product analytics (no access to user data). Granular ACLs are also harder — permissions are enforced by key possession, so revoking access often requires key rotation rather than a permission-flag change.
But the exact bug this post is about (a server endpoint that maps a public UUID to an email) is structurally impossible in this model, because there's no such mapping on our servers to misuse.
any-sync and our data format (any-block) are MIT, if you want to poke at how it works: https://github.com/anyproto
These apps are a disease and no one should be using services that offer them.
Here's a Reddit post just as confirmation: https://www.reddit.com/r/Notion/comments/hqyxid/possible_sec.... I also reported it privately two months prior, of course.
I don’t love Confluence, but at least it doesn’t do this to me.
Obsidian is built on top of plain markdown files, so you can do whatever you want with them. E.g. if you need multiplayer editing you could use 3rd party solutions or even something like HedgeDoc.
Affine is closer to Notion and self-hostable.
Obsidian: https://obsidian.md/
Affine: https://affine.pro/
It’s open-source, easy to self-host and feature-packed.
GitHub: https://github.com/docmost/docmost.
I'm always disappointed by note-taking tools calling themselves a Notion alternative when they do not provide an alternative to Notion and are instead just another note-taking tool with a simple UI.
If you want to be a Notion alternative, provide the things that make Notion great, e.g. the database functionality. It's okay to be a simple collaborative notes tool, but that is not a Notion alternative.
We have support for team-spaces, permissions, diagrams, real-time collaboration, comments, page verification workflows, AI, SSO/LDAP, search, audit logs, API, public sharing, and a lot more.
Btw, we have plans to introduce a database-like feature.
Tells me everything I need to know about this industry. No regard or seriousness to security at all.