Facebook crawls links in PDFs you send in Messenger

(twitter.com)

476 points · ortekk · 6y ago · 155 comments

101 comments · 23 top-level

javagram6y ago· 12 in thread

This will keep happening until they enable e2e.

I’ve had Facebook block several links sent in private message groups, to completely legal and safe sites (Messenger prints out an obscure API error and refuses to send the content). They have done this for a long time.

rohug6y ago

Worth noting WhatsApp also provides link previews now. Although it is supposedly e2e communication, the link previews are likely generated by reaching out to a similar facebook unfurl service.

They can then have a single map of phone num -> links rendered between fb and whatsapp.

yawaramin6y ago

WhatsApp fetches a link preview on the sender's device before the message is encrypted, and packages it up with the message before sending. Depending on how exactly they implement the fetch, they may or may not know what links you sent.

exadeci6y ago

Try sending a link immediately and you'll see that it doesn't get a preview.

Give it a few seconds after pasting it and you'll get the preview, because it gets it from your device.

imanobody6y ago

WhatsApp also scans PDF files you send to a contact. Easy to confirm as well: get some random PDF with a Chinese filename and Chinese content. Send it to a contact. Watch the delay in send/receive. Now do the same for any random PDF that's all English. Watch the regular send/receive time occur.

The only conclusion: it takes a little time for a file that's flagged - based on its language - to pass the scanners?

yellow_lead6y ago

I experienced this too, Facebook will block most torrent links, regardless of if they're legal or not. I've taken to encoding these with Base64 first and instructing the recipient to decode them.
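A minimal sketch of the Base64 workaround described above, assuming Python's standard `base64` module. The helper names and the example URL are hypothetical, not from the thread. Base64 is only an encoding, so this hides the link from simple pattern matching but provides no confidentiality.

```python
import base64


def encode_link(url: str) -> str:
    """Return the URL as URL-safe Base64 text, so it no longer looks like a link."""
    return base64.urlsafe_b64encode(url.encode("utf-8")).decode("ascii")


def decode_link(token: str) -> str:
    """Reverse encode_link() on the recipient's side."""
    return base64.urlsafe_b64decode(token.encode("ascii")).decode("utf-8")


# Hypothetical example link, used only for illustration.
original = "https://example.com/files/debian.iso.torrent"
token = encode_link(original)
print(token)               # text to paste into the chat
print(decode_link(token))  # what the recipient recovers
```

The recipient only needs the matching decode step, which most languages and many command-line tools provide.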

BubRoss6y ago

Why not just make it a broken link and tell them how to correct it?

9HZZRfNlpR6y ago

I have had similar experiences, numerous ones to be more exact. The latest was a 10-year-old WordPress blog living on a WordPress.com subdomain, definitely not hacked. It was about science, to be more exact, about neurology.

TooCreative6y ago

e2e would not necessarily stop it. Since FB controls the apps that send and receive the message, they can do whatever they want to the unencrypted message on both sides.

espes6y ago

You can choose to enable e2e on Messenger

SlowRobotAhead6y ago

Because the key, nonce, result, and keyshare or Diffie-Hellman exchange are all done inside of messenger... why would anyone believe this is legit?

It might be, IDK, but if it’s all inside their system, how could you audit that?

kerng6y ago

Same can be done if e2e is enabled. Nothing prevents Facebook from sending links from client to a "validation" service.

They do this already in WhatsApp for instance.

JadeNB6y ago

Maybe I'm being foolish, but isn't the point of e2e that Facebook wouldn't even know what you were sending (a link or otherwise), it being encrypted in flight?

criddell6y ago· 10 in thread

Microsoft does this with Skype too. They say it's for detecting malicious links.

knzhou6y ago

As always in big tech, you're damned if you do and damned if you don't.

st0le6y ago

Honestly, this is good for preventing malware, but I imagine it breaks a bunch of things, e.g. if the link has a limited visit count. The link will "expire" before the recipient gets a chance to view it.

feanaro6y ago

I have absolutely no problems with the don't. I don't think any central body should be responsible for policing my private conversations and it just seems like a convenient excuse for these companies to perpetuate the surveillance.

barney546y ago

Oh, I don’t know. I refuse to use Facebook Messenger and I have yet to be damned (at least to my knowledge)!

journalctl6y ago

So they might as well don’t; at least then we get some modicum of privacy.

amelius6y ago

Why not simply ask the user then?

grumblepeet6y ago

And all email into Office 365. It gets pulled into a sandboxed environment where items such as PDFs, in fact all attachments, are ‘exploded’ and all the links investigated, i.e. executed and checked for malicious endpoints or payloads. I was in a meeting recently with someone from Microsoft where they explained this (I might be misrepresenting what she said; she was an expert in this field, and I’m definitely not). I was shocked, though, at the capability of the system to examine content to such an extent.

Gustomaximus6y ago

If Microsoft really wanted to help Skype security, it would be pretty easy to realise an account has been hacked when someone suddenly messages a link to their entire contact list, something they have never done in 10 years.

The number of times I've gotten those obviously spammy links from people I have not talked to in a decade plus... it can't be hard to red-flag these.

CydeWeys6y ago

And I do appreciate that they're doing that even. I want that, just like I want spam filtering on my email.

It's what else might be going on with the link analysis that's worrisome.

ecf6y ago

Just remember that almost every feature that’s “announced” is a masquerade for ad-tech software to do its thing.

Example, Facebook asking for phone numbers in the name of “security” when they don’t give a shit about security. They wanted to tie a phone number to the owner, and create a social graph based on their contact uploads.

crazygringo6y ago· 9 in thread

Huh, but why?

I can totally understand scanning a PDF for links to look for malicious links to protect users.

But that wouldn't involve actual HTTP requests to them.

I'm struggling to imagine what purpose this could have.

mdasen6y ago

How do you know if they're malicious if you don't make HTTP requests to them?

One of the things that phishers and others do is use link wrapping and other services to hide malicious links. So, I get something.wordpress.com/something-clean. I then put in an HTML or JS redirect on that page to something malicious. Given that browsers don't warn about HTTP, HTML, or JS redirects, it's an easy way for scammers to get around a list of malicious pages.

These kinds of attacks are very common in the email space.
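To illustrate why a scanner has to make requests to see where a wrapped link ends up, here is a rough sketch that follows a chain of HTTP redirects. The `fetch` callable, the `FAKE_WEB` table, and all URLs are hypothetical stand-ins so the example runs without network access. A real scanner would issue actual requests, and it would also need to render the page to catch the HTML and JS redirects mentioned above, which this sketch does not attempt.

```python
from typing import Callable, List, Optional, Tuple
from urllib.parse import urljoin

# A fetcher takes a URL and returns (status_code, Location header or None).
Fetcher = Callable[[str], Tuple[int, Optional[str]]]

REDIRECT_CODES = (301, 302, 303, 307, 308)


def resolve_redirects(url: str, fetch: Fetcher, max_hops: int = 10) -> List[str]:
    """Follow HTTP redirects and return every URL visited, in order."""
    chain = [url]
    for _ in range(max_hops):
        status, location = fetch(chain[-1])
        if status not in REDIRECT_CODES or not location:
            break  # reached a final page (or an error)
        next_url = urljoin(chain[-1], location)  # Location may be relative
        if next_url in chain:
            break  # redirect loop, stop here
        chain.append(next_url)
    return chain


# Hypothetical in-memory "web" standing in for real HTTP responses.
FAKE_WEB = {
    "https://something.example/something-clean": (302, "https://wrapper.example/r?id=1"),
    "https://wrapper.example/r?id=1": (301, "/landing"),
    "https://wrapper.example/landing": (200, None),
}


def fake_fetch(url: str) -> Tuple[int, Optional[str]]:
    return FAKE_WEB.get(url, (404, None))


chain = resolve_redirects("https://something.example/something-clean", fake_fetch)
print(" -> ".join(chain))
```

Only the last URL in the chain is the page the recipient would actually land on, so a blocklist check against the first URL alone would miss it.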

gruez6y ago

But in this case, that doesn't help at all because facebook's crawler uses a predictable user agent string. You give a clean result to the facebook crawler and a malicious result to everyone else.

bluntfang6y ago

>How do you know if they're malicious if you don't make HTTP requests to them?

Look-alike domains are a phishing vector that doesn't require you to make an HTTP request.
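As a rough illustration of that point, a look-alike check can run on the domain string alone, with no request made. The sketch below flags domains within a small edit distance of a known name. The `KNOWN_DOMAINS` list and the threshold of 2 are illustrative assumptions, not anything a real service is known to use.

```python
from typing import Optional


def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


# Illustrative list only; a real checker would use a much larger one.
KNOWN_DOMAINS = ["facebook.com", "paypal.com", "google.com"]


def lookalike_of(domain: str, max_distance: int = 2) -> Optional[str]:
    """Return the known domain this one imitates, or None if it looks unrelated."""
    domain = domain.lower().rstrip(".")
    for known in KNOWN_DOMAINS:
        distance = edit_distance(domain, known)
        if 0 < distance <= max_distance:  # close, but not an exact match
            return known
    return None


for candidate in ["faceb00k.com", "paypa1.com", "facebook.com", "example.org"]:
    print(candidate, "->", lookalike_of(candidate))
```

This catches simple character swaps only. Redirects and cloaked pages, raised elsewhere in the thread, still cannot be detected this way.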

TazeTSchnitzel6y ago

The malicious links could be camouflaged behind a redirect.

allie16y ago

Could be collecting the links so if a user blocks the sender after opening the PDF, and this is done at scale, they can infer it was one of the links and start blocking them?

Or link support requests to people who received a certain link via message.

So basically data mining to feed a model that takes future actions into consideration.

danielfoster6y ago

Probably anti-spam, particularly to catch groups of fake accounts sending the same or similar PDF.

austhrow7436y ago

How do you check if a link is serving up something terrible without http requests to them?

tialaramex6y ago

You _could_ ask a service like Google Safe Browsing.

Just in case you didn't follow any of the previous HN discussion of how that's done, consider the URL https://accounts.example.com/tmp/badmojo.exe

You (Facebook in this case) run a hypothetical method SafeBrowsing('accounts.example.com') and also SafeBrowsing('example.com') and SafeBrowsing('accounts.example.com/tmp') and SafeBrowsing('accounts.example.com/tmp/badmojo.exe')

SafeBrowsing(string) is defined as follows: you compute SHA(string) and that's your hash. You compare the start of this hash to a huge list of prefixes that Google provides, which you fetch updates for every few minutes. If there's no match, fine, done. If there's a match, you ask Google: OK, I saw this prefix you sent me, what hashes should I be scared of? Google gives you a list of hashes with that prefix. If your hash is in this new list, the original URL was scary, so warn users not to visit; otherwise continue what you were doing.
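The prefix-lookup scheme described in this comment can be sketched roughly as follows. This is a simplified illustration, not the real client. The 4-byte prefix length, the simplified way host and path combinations are generated, and the `fetch_full_hashes` callback (standing in for the round trip to the provider) are all assumptions made for the example. SHA-256 is used as the hash.

```python
import hashlib
from typing import Callable, Iterable, List, Set
from urllib.parse import urlsplit

PREFIX_LEN = 4  # bytes of the hash kept in the local list (assumed for this sketch)


def candidate_expressions(url: str) -> List[str]:
    """Host/path combinations to check, loosely following the comment's example."""
    parts = urlsplit(url)
    labels = (parts.hostname or "").split(".")
    # Host suffixes with at least two labels, e.g. accounts.example.com, example.com
    hosts = [".".join(labels[i:]) for i in range(max(1, len(labels) - 1))]
    segments = [s for s in parts.path.split("/") if s]
    # Path prefixes: "", "/tmp", "/tmp/badmojo.exe"
    paths = [""] + ["/" + "/".join(segments[:i]) for i in range(1, len(segments) + 1)]
    return [h + p for h in hosts for p in paths]


def full_hash(expression: str) -> bytes:
    return hashlib.sha256(expression.encode("utf-8")).digest()


def is_flagged(url: str,
               local_prefixes: Set[bytes],
               fetch_full_hashes: Callable[[bytes], Iterable[bytes]]) -> bool:
    """True if any expression derived from the URL is on the provider's list."""
    for expression in candidate_expressions(url):
        digest = full_hash(expression)
        prefix = digest[:PREFIX_LEN]
        if prefix not in local_prefixes:
            continue  # no local match: nothing is sent to the provider
        # Local match: ask the provider for the full hashes under this prefix.
        if digest in set(fetch_full_hashes(prefix)):
            return True
    return False


# Toy data standing in for the provider's list.
bad = full_hash("accounts.example.com/tmp/badmojo.exe")
local_prefixes = {bad[:PREFIX_LEN]}
server_side = {bad[:PREFIX_LEN]: [bad]}

url = "https://accounts.example.com/tmp/badmojo.exe"
print(is_flagged(url, local_prefixes, lambda p: server_side.get(p, [])))
print(is_flagged("https://example.org/", local_prefixes, lambda p: server_side.get(p, [])))
```

The provider only ever sees a short hash prefix, and only on a local match, which is why this kind of check reveals far less than fetching the URL itself.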

Frost1x6y ago

The obvious argument is they need to scan pages linked for malware and couldn't rely on a white/black list.

I'm sure if they're pulling data to do this analysis, it's not the only analysis they're doing.

buboard6y ago· 8 in thread

Side note: I wonder why FB doesn't launch a search engine, since they crawl most of the web anyway.

saagarjha6y ago

That would let people leave the Facebook platform and explore the open web.

kortex6y ago

I remember the sheer awe when I first learned there was a huge open web outside of AOL. I'm sure people nowadays are aware of the rest of the web, but if the draw is minimal, they will likely get stuck in the same loops of well-trodden space.

buboard6y ago

I don't think they're worried much about that anymore; people always return, they've established their position. OTOH, it would be nice for Google to have some serious competition on the web, esp. considering that FB has a great NLP AI team.

progval6y ago

They could create their own AMP-like tech to keep people on their website/application

EDIT: Actually they already did, it's called Facebook Instant Articles.

tantalor6y ago

Facebook is America Online.

catalogia6y ago

> "According to legend, upon hearing of Colt's entrance into the lever-action rifle market, Winchester began to develop a prototype revolver to compete with Colt's market. A "gentleman's agreement" then followed between Colt and Winchester, with Colt agreeing to drop production of the Burgess and Winchester abandoning its plans to develop a revolver. The truth of this story has never been fully verified, and as such, the reason for the Burgess rifle's short production history is unknown.[1][4][5]"

https://en.wikipedia.org/wiki/Colt-Burgess_rifle

bagacrap6y ago

There is a search bar in Facebook. That searches Facebook, which Facebook would tell you is all the web you need.

nuclear_eclipse6y ago

They did that, and it wasn't successful.

djohnston6y ago· 6 in thread

And if they didn't the headline would read: "Facebook fails to stop malicious and illegal content from being shared on their Network! Should they be shut down?!"

Barrin926y ago

This sounds like a strawman, to be honest, because I haven't heard anyone rant about illegal music for probably 15 years, and even then it was only ever politicians, not ordinary people.

If we're talking about really malicious stuff like child pornography, then in the context of file sharing these companies already have systems in place to distinguish content, so blanket banning of torrent files seems blatantly unnecessary.

wutbrodo6y ago

> this sounds like a strawman to be honest because I haven't heard anyone rant about illegal music since probably 15 years

This is the real strawman, as nobody on this entire thread is talking about illegal music. On the other hand, there's a strong and persistent thread of calls for tech platforms like Facebook to control "malicious or illegal" information being spread on their platforms: an obvious example is the NZ shooter's manifesto + video.

egypturnash6y ago

Malicious content could also include phishing and viruses.

fittslickare6y ago

I can not imagine this headline really. We are talking about private messages.

stevewilhelm6y ago

So we want Facebook to stop propagating fake news and hate speech and we want them to do so without crawling URLs?

djohnston6y ago

Yes. Those are mutually exclusive.

worldofmatthew6y ago· 6 in thread

This should not be news to anyone. Facebook scans all links posted in Messenger.

sushid6y ago

These are links INSIDE a PDF. That's one step further than most people assumed.

manojlds6y ago

Mostly to scan the PDF and ensure it's safe, I believe, or at least that would be the stated reason.

gryffin6y ago

PDF is an open format. Scanning PDF for the links is probably the least you could assume.

You gotta assume they will reconstruct generations of your family tree if they can from your chat history.

nitwit0056y ago

Then people have strange assumptions.

Once a service starts blocking malicious links, the obvious next step is to conceal the link through some sort of indirection.

joshspankit6y ago

I would also assume they are scanning links in zip files and (if anyone is crazy enough to do this) links visible in images

axegon_6y ago

This. I honestly don't get why this is news. I truly hate facebook with a passion, I really do. But on this occasion I don't really blame them: You know what you are getting yourself into, what did you expect? A tuna salad? You shouldn't really be sharing any personal information on any platform which you can't hold accountable, regardless of e2e encryption.

archie26y ago· 5 in thread

Facebook spies on everything you do. Stop using it. This is the only way this will end. It's not even a useful product. Stop being sheep.

harry86y ago

I did, more than a decade ago. Do you think they stopped? How would I know there is no shadow profile being constantly updated?

It's apparently not my data and privacy and I have no say in it at all when someone uploads their goddamn contact list to these frickin' crooks. Is that your understanding too?

So this solution is not a solution. This solution only works if we stop other people from using it. How's that going to work? Well, it isn't.

Cool? Now welcome to the "what regulation is appropriate" debate because this is clearly an example of market failure, same as national defence, same as national parks, same as pollution, same as any other. The sooner we treat "...using computers" exactly the same as any other industry the better. This is nuts.

freeflight6y ago

> It's apparently not my data and privacy and I have no say in it at all when someone uploads their goddamn contact list to these frickin' crooks.

Afaik that's actually the legal situation in the US due to the third-party doctrine [0]. It's one of the ways government mass surveillance has been "legalized" without having to comply with the Fourth Amendment.

[0] https://en.wikipedia.org/wiki/Third-party_doctrine

dropin6856y ago

> This solution is not a solution.

No, not a complete solution, but it's a step in the right direction. One of the reasons advertisers as well as ordinary users come to FB is to be in contact with you -- to have your attention. Stop using FB and you take that away. FB ends up with less of an audience with which to entice others.

biblytens6y ago

What are you claiming is the market failure here?

catacombs6y ago

This.

I quit several years after being on the platform for nearly a decade, which I mainly used to communicate with classmates.

Over time, I realized the platform was only good for pushing false information and giving a platform to people who shouldn't be sharing their thoughts. I also became more aware that I was a product, and it infuriated me to no end.

I'm glad to not be on it anymore. What a waste of time.

gravypod6y ago· 5 in thread

Could someone effectively DoS another site using this method by including a bunch of links that generate a lot of load?

Would be interesting to see if Facebook has a maximum number of links it'll follow.

GhostVII6y ago

The cost of uploading a PDF of links is probably not much less than the cost of following those links on your own. So I don't think you gain much by leveraging Facebook in this case.

ahbyb6y ago

What if I create a PDF with this content...

https://news.ycombinator.com/item?id=1

https://news.ycombinator.com/item?id=2

https://news.ycombinator.com/item?id=3

and so on, until 10,000,000? Perhaps Facebook starts opening every link using 10,000 parallel threads. Can you really replicate that from your connection at home? Perhaps even the sysadmin of your victim site has whitelisted all Facebook IP addresses so their crawlers get a free ride.

_underfl0w_6y ago

I'm sure you could just point all those links to a domain you own (on some poor, unsuspecting VPS) and see what happens? Might be playing with fire, though.

ahbyb6y ago

I remember something like this was possible back in the day with Google Sheets. You could embed a URL as if it was an image in each cell of a sheet and it would make thousands of requests. I don't remember the details.

cj6y ago

https://news.ycombinator.com/item?id=3890328

steelframe6y ago· 4 in thread

My company's malware detection crap on my work laptop once scanned a PDF of a security research paper I was reading for my project, found a link to a web site with malware on it because that's what the research was about, and then it summarily deleted the PDF to "protect" the company from that link.

TwoBit6y ago

Well it wasn't entirely wrong. Just because the doc was about malware doesn't mean somebody won't accidentally click it.

Spivak6y ago

I mean, how is the security scanner supposed to know that you’re working on a project which is a super specific edge case?

I mean almost all of the time that PDF will be malware designed to trick the reader into clicking that link and it did the right thing.

archie26y ago

Most security scanners allow you to create exclusion folders, where it doesn't scan files in those folders. Something somebody researching malware on a company computer with a malware detector should probably be aware of.

RaiseProfits6y ago

Presumably by disabling it on the workstations of security researchers. Or just the feature that flags linked content rather than content itself.

lukeschlather6y ago· 4 in thread

Are there any comparable hosted messaging services that don't do this?

s09dfhks6y ago

signal/telegram

tialaramex6y ago

Signal actually does optionally offer previews for a handful of services, and they really jump through some hoops to make that safer:

https://support.signal.org/hc/en-us/articles/360022474332-Li...

The service being previewed doesn't know who you are because Signal acts as a proxy, and Signal doesn't know what you previewed on that service because their client deliberately sends overlapping Range requests so that the preview size is rounded.

duskwuff6y ago

Telegram does crawl links sent in non-encrypted messages. Links in secret chats are left alone, though.

brennebeck6y ago

Probably Signal.

dontbenebby6y ago· 3 in thread

This feels like it could be an attack vector. Gather intel on what the user agent is, nmap the IP, possibly find a vulnerability in the parser or the server.

ressetera6y ago

I doubt the downloader isn't restricted in some kind of jail.

Enginerrrd6y ago

You'd think... but after that story about Microsoft just executing random threatening code it found on someone's computer and allowing it access to the internet, I have to question some of the wisdom these big companies show.

lostmsu6y ago

You can just point the link to your own server. That will tell you Facebook's IP at the very least.
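A minimal sketch of such a server, assuming Python's standard `http.server` module. It records the client IP, User-Agent, and path of every GET request. The names `LoggingHandler`, `make_server`, and `seen_requests` are made up for this example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Every request seen by the server is appended here.
seen_requests = []


class LoggingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Record who asked for what before answering.
        seen_requests.append({
            "ip": self.client_address[0],
            "user_agent": self.headers.get("User-Agent", ""),
            "path": self.path,
        })
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default per-request logging to stderr


def make_server(host: str = "127.0.0.1", port: int = 0) -> HTTPServer:
    """Create the server; port 0 lets the OS pick a free port."""
    return HTTPServer((host, port), LoggingHandler)
```

To try it, call `make_server("0.0.0.0", 8080).serve_forever()` on a machine you control, put a unique path for that host into the PDF, and inspect `seen_requests` after sending. A unique path per document makes it possible to tell which upload triggered which request.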

beager6y ago· 3 in thread

A good way to enable delivery tracking for PDFs over messenger, I guess!

NullPrefix6y ago

I assume, the tracking happens before the recipient sees the message. This could be used to track sent messages, not received.

beager6y ago

Yep there’s still no “open” tracking on PDFs, at least afaik from a pixel/beacon standpoint. Entire businesses like docusign are built with that value prop in mind.

fsociety6y ago

It's to protect from malware and phishing. Plenty of other companies do this too and it is reasonable. Someone else here put it really well.. damned if you do and damned if you don't.

_8j506y ago· 2 in thread

Not in the least surprising. Wouldn't be surprised if Gmail does this to..."detect phishing" (pdfs containing phish links are common). Always a plausible reason they can use.

supernova87a6y ago

There's no surprise. Gmail does.

If you search for a text string in Gmail, it will return emails that contain that text only in scanned images or PDFs that are in your mailbox.

odensc6y ago

That doesn't mean they're crawling the URLs, just that they're indexing the content of PDFs/images for your search. Which is, honestly, a pretty useful feature. Whereas Facebook is doing this without providing any value to the user.

h1fra6y ago· 1 in thread

While I don't like FB, this is not something specific to PDFs; they provide a quick preview for all links, like almost all social networks.

lostmyoldone6y ago

This isn't about rendering a preview, it's about following links inside the PDF. While there may be a case to do this to prevent phishing and malware attacks, they really should ask/tell the user they are doing it!

mikorym6y ago

You can also use canary tokens for this. [1]

I am not personally affiliated with them, but I believe they are South African.

[1] https://www.canarytokens.org/generate

egypturnash6y ago

The galaxy brain maneuver here is to start trying to fuzz whatever machines Facebook is using to do this.

w1nst0nsm1th6y ago

There is a way to send any type of file through messenger without facebook snooping into your private life.

1. Compress the file you want to send into a password-protected zip.
2. Change the extension of the zip file from .zip to .txt.
3. Send the file through Messenger.

I already did this to send a macOS application to a friend. To avoid the size restriction, compress into several zip parts, rename the extensions to .txt and send.

File size can be as high as 50mb+.

zzo38computer6y ago

Would making a passworded PDF file help a bit? Then you can tell them the password by a separate message.

(And, I don't use Facebook, anyways.)

omani6y ago

seriously. how many times has the world told you to stop using facebook?

stop using facebook. and no, there is not a single reason for you to be there. trust me. how do you think we lived our lives before there was a facebook?

sys_647386y ago

I only use the messenger thing on the Facebook webpage. I don’t generally install apps on my phone as I don’t trust them. I trust an ad company like Facebook the least.

ariyadi6y ago

Let’s play this game

burtonator6y ago

Facebook: we can crawl you but you can't crawl us!

tus886y ago

Not to be flippant, but if you choose to use Facebook you are already throwing your privacy to the wind, and you are probably getting what you deserve.

j / k navigate · click thread line to collapse

155 comments

101 comments · 23 top-level

javagram6y ago· 12 in thread

This will keep happening until they enable e2e.

rohug6y ago

Worth noting WhatsApp also provides link previews now. Although it is supposedly e2e communication, the link previews are likely generated by reaching out to a similar facebook unfurl service.

They can then have a single map of phone num -> links rendered between fb and whatsapp.

yawaramin6y ago

1 more reply

exadeci6y ago

Try sending a link immediately and you'll see that it doesn't get a preview.

Give it a few second after pasting it and you'll get the preview, because it gets it from your device.

imanobody6y ago

The only conclusion: it takes a little time for a file that's flagged - based on its language - to pass the scanners?

yellow_lead6y ago

I experienced this too, Facebook will block most torrent links, regardless of if they're legal or not. I've taken to encoding these with Base64 first and instructing the recipient to decode them.

BubRoss6y ago

Why not just make it a broken link and tell them how to correct it?

2 more replies

9HZZRfNlpR6y ago

TooCreative6y ago

e2e would not necessarily stop it. Since FB controls the apps that send and receive the message, they can do whatever they want to the unencrypted message on both sides.

espes6y ago

You can choose to enable e2e on Messenger

SlowRobotAhead6y ago

Because the key, nonce, result, and keyshare or Diffie-Hellman exchange are all done inside of messenger... why would anyone believe this is legit?

It might be, IDK, but if it’s all inside their system, how could you audit that?

2 more replies

kerng6y ago

Same can be done if e2e is enabled. Nothing prevents Facebook from sending links from client to a "validation" service.

They do this already in WhatsApp for instance.

JadeNB6y ago

Maybe I'm being foolish, but isn't the point of e2e that Facebook wouldn't even know what you were sending (a link or otherwise), it being encrypted in flight?

3 more replies

criddell6y ago· 10 in thread

Microsoft does this with Skype too. They say it's for detecting malicious links.

knzhou6y ago

As always in big tech, you're damned if you do and damned if you don't.

st0le6y ago

1 more reply

feanaro6y ago

barney546y ago

Oh, I don’t know. I refuse to use Facebook Messenger and I have yet to be damned (at least to my knowledge)!

1 more reply

journalctl6y ago

So they might as well don’t; at least then we get some modicum of privacy.

1 more reply

amelius6y ago

Why not simply ask the user then?

grumblepeet6y ago

Gustomaximus6y ago

The amount of time I've gotten those obviously spammy links form people I have never talked to in a decade plus.... cant be hard to red flag these.

CydeWeys6y ago

And I do appreciate that they're doing that even. I want that, just like I want spam filtering on my email.

It's what else might be going on with the link analysis that's worrisome.

ecf6y ago

Just remember that almost every feature that’s “announced” is a masquerade for ad-tech software to do its thing.

crazygringo6y ago· 9 in thread

Huh, but why?

I can totally understand scanning a PDF for links to look for malicious links to protect users.

But that wouldn't involve actual HTTP requests to them.

I'm struggling to imagine what purpose this could have.

mdasen6y ago

How do you know if they're malicious if you don't make HTTP requests to them?

These kinds of attacks are very common in the email space.

gruez6y ago

But in this case, that doesn't help at all because facebook's crawler uses a predictable user agent string. You give a clean result to the facebook crawler and a malicious result to everyone else.

3 more replies

bluntfang6y ago

>How do you know if they're malicious if you don't make HTTP requests to them?

look-alike domains are phishing vector that don't require you to make an http request.

TazeTSchnitzel6y ago

The malicious links could be camouflaged behind a redirect.

allie16y ago

Could be collecting the links so if a user blocks the sender after opening the pdf, and this is done at scale, they can infer it was one of the links and starts blocking them?

Or link support requests to people who received a certain link via message.

So basically data mining to feed a model that takes future actions in consideration.

danielfoster6y ago

Probably anti-spam, particularly to catch groups of fake accounts sending the same or similar PDF.

austhrow7436y ago

How do you check if a link is serving up something terrible without http requests to them?

tialaramex6y ago

You _could_ ask a service like Google Safe Search

Just in case you didn't follow any of the previous HN discussion of how that's done

consider the URL https://accounts.example.com/tmp/badmojo.exe

1 more reply

Frost1x6y ago

The obvious argument is they need to scan pages linked for malware and couldn't rely on a white/black list.

I'm sure if they're pulling data to do this analysis, it's not the only analysis they're doing.

buboard6y ago· 8 in thread

Sidenote i wonder why FB doesnt launch a search engine since they crawl most of the web anyways

saagarjha6y ago

That would let people leave the Facebook platform and explore the open web.

kortex6y ago

1 more reply

buboard6y ago

progval6y ago

They could create their own AMP-like tech to keep people on their website/application

EDIT: Actually they already did, it's called Facebook Instant Articles.

tantalor6y ago

Facebook is America Online.

catalogia6y ago

https://en.wikipedia.org/wiki/Colt-Burgess_rifle

bagacrap6y ago

There is a search bar in Facebook. That searches Facebook, which Facebook would tell you is all the web you need.

nuclear_eclipse6y ago

They did that, and it wasn't successful.

djohnston6y ago· 6 in thread

And if they didn't the headline would read: "Facebook fails to stop malicious and illegal content from being shared on their Network! Should they be shut down?!"

Barrin926y ago

this sounds like a strawman to be honest because I haven't heard anyone rant about illegal music since probably 15 years, and if anything ever only politicians and not ordinary people.

wutbrodo6y ago

> this sounds like a strawman to be honest because I haven't heard anyone rant about illegal music since probably 15 years

1 more reply

egypturnash6y ago

Malicious content could also include phishing and viruses.

fittslickare6y ago

I can not imagine this headline really. We are talking about private messages.

stevewilhelm6y ago

So we want Facebook to stop propagating fake news and hate speech and we want them to do so without crawling URLs?

djohnston6y ago

Yes. Those are mutually exclusive.

worldofmatthew6y ago· 6 in thread

This should not be news to anyone. Facebook scans all links posted in Messenger.

sushid6y ago

This is links INSIDE a pdf. Thats one step further than most people assumed.

manojlds6y ago

Mostly to scan the PDF and ensure it's safe I believe, or atleast that's would be the stated reason.

1 more reply

gryffin6y ago

PDF is an open format. Scanning PDF for the links is probably the least you could assume.

You gotta assume they will reconstruct generations of your family tree if they can from your chat history.

1 more reply

nitwit0056y ago

Then people have strange assumptions.

Once a service starts blocking malicious links, the obvious next step is to conceal the link through some sort of indirection.

joshspankit6y ago

I would also assume they are scanning links in zip files and (if anyone is crazy enough to do this) links visible in images

axegon_6y ago

archie26y ago· 5 in thread

Facebook spies on everything you do. Stop using it. This is the only way this will end. It's not even a useful product. Stop being sheep.

harry86y ago

I did, more than a decade ago. Do you think they stopped? How would I know there is no shadow profile being constantly updated?

It's apparently not my data and privacy and I have no say in it at all when someone uploads their goddamn contact list to these frickin' crooks. Is that your understanding too?

So This solution is not a solution. This solution only works if we stop other people from using it. How's that going to work? Well it isn't.

freeflight6y ago

> It's apparently not my data and privacy and I have no say in it at all when someone uploads their goddamn contact list to these frickin' crooks.

[0] https://en.wikipedia.org/wiki/Third-party_doctrine

dropin6856y ago

> This solution is not a solution.

1 more reply

biblytens6y ago

What are you claiming is the market failure here?

1 more reply

catacombs6y ago

This.

I quit several years after being on the platform for nearly a decade, which I mainly used to communicate with classmates.

I'm glad to not be on it anymore. What a waste of time.

gravypod6y ago· 5 in thread

Could someone effectively DOS another site using this method by including a bunch of links that generate a lot of load?

Would be interesting to see if Facebook has a maximum number of links it'll follow.

GhostVII6y ago

The cost of uploading a PDF of links is probably not much less than the cost of following those links on your own. So I don't think you gain much by leveraging Facebook in this case.

ahbyb6y ago

What if I create a PDF with this content...

https://news.ycombinator.com/item?id=1

https://news.ycombinator.com/item?id=2

https://news.ycombinator.com/item?id=3

_underfl0w_6y ago

I'm sure you could just point all those links to a domain you own (on some poor, unsuspecting VPS) and see what happens? Might be playing with fire, though.
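A rough sketch of how that experiment could be set up, assuming a hypothetical domain (`example.test`) that you control. It generates one uniquely numbered URL per link to embed in the PDF, then checks the request paths from your access log against that scheme to see how many links the crawler actually followed. The `/probe/<n>` path layout and both function names are invented for illustration.

```python
import re


def make_probe_urls(base_url, count):
    """Return `count` distinct URLs, one per link to embed in the PDF."""
    base = base_url.rstrip("/")
    return [f"{base}/probe/{i}" for i in range(count)]


# Matches only the hypothetical /probe/<number> paths generated above.
_PROBE_PATH = re.compile(r"^/probe/(\d+)$")


def followed_indices(requested_paths):
    """Given the request paths the server saw, return the sorted,
    de-duplicated probe numbers that were fetched."""
    hits = set()
    for path in requested_paths:
        match = _PROBE_PATH.match(path)
        if match:
            hits.add(int(match.group(1)))
    return sorted(hits)
```

If the highest index returned by `followed_indices` stays well below the number of links in the PDF, that would hint at a cap on how many links get followed.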

ahbyb6y ago

cj6y ago

https://news.ycombinator.com/item?id=3890328

steelframe6y ago· 4 in thread

TwoBit6y ago

Well it wasn't entirely wrong. Just because the doc was about malware doesn't mean somebody won't accidentally click it.

Spivak6y ago

I mean, how is the security scanner supposed to know that you’re working on a project which is a super specific edge case?

I mean almost all of the time that PDF will be malware designed to trick the reader into clicking that link and it did the right thing.

archie26y ago

RaiseProfits6y ago

Presumably by disabling it on the workstations of security researchers. Or just the feature that flags linked content rather than content itself.

lukeschlather6y ago· 4 in thread

Are there any comparable hosted messaging services that don't do this?

s09dfhks6y ago

signal/telegram

tialaramex6y ago

Signal actually does optionally offer previews for a handful of services, and they really jump through some hoops to make that safer:

https://support.signal.org/hc/en-us/articles/360022474332-Li...

duskwuff6y ago

Telegram does crawl links sent in non-encrypted messages. Links in secret chats are left alone, though.

brennebeck6y ago

Probably Signal.

dontbenebby6y ago· 3 in thread

This feels like it could be an attack vector. Gather intel on what the user agent is, nmap the IP, possibly find a vulnerability in the parser or the server.

ressetera6y ago

I doubt the downloader isn't restricted in some kind of jail.

Enginerrrd6y ago

lostmsu6y ago

You can just point the link to your own server. That will tell you Facebook's IP at the very least.
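A minimal sketch of such a server, using only Python's standard library. It records the client IP, the User-Agent header, and the requested path for every GET. The names `LogHandler`, `seen`, and `make_server` are made up for this example, and nothing here is specific to Facebook's crawler.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Each entry is (client_ip, user_agent, path) for one GET request.
seen = []


class LogHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Record who asked for what before answering.
        seen.append(
            (self.client_address[0], self.headers.get("User-Agent", ""), self.path)
        )
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"ok\n")

    def log_message(self, format, *args):
        # Suppress the default per-request stderr logging.
        pass


def make_server(host, port):
    """Create (but do not start) an HTTP server that logs into `seen`."""
    return HTTPServer((host, port), LogHandler)
```

To try it on a machine reachable from the internet, something like `make_server("0.0.0.0", 8000).serve_forever()` would start it. Whatever fetches the link then shows up in `seen`.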

beager6y ago· 3 in thread

A good way to enable delivery tracking for PDFs over Messenger, I guess!

NullPrefix6y ago

I assume the tracking happens before the recipient sees the message. This could be used to track sent messages, not received ones.

beager6y ago

Yep, there’s still no “open” tracking on PDFs, at least afaik from a pixel/beacon standpoint. Entire businesses like DocuSign are built with that value prop in mind.

fsociety6y ago

It's to protect from malware and phishing. Plenty of other companies do this too and it is reasonable. Someone else here put it really well: damned if you do and damned if you don't.

_8j506y ago· 2 in thread

Not in the least surprising. Wouldn't be surprised if Gmail does this to..."detect phishing" (pdfs containing phish links are common). Always a plausible reason they can use.

supernova87a6y ago

There's no surprise. Gmail does.

If you search for a text string in Gmail, it will return emails that contain that text only in scanned images or PDFs that are in your mailbox.

odensc6y ago

h1fra6y ago· 1 in thread

While I don't like FB, this is not something specific to PDFs. They provide quick previews for all links, like almost all social networks.

lostmyoldone6y ago

mikorym6y ago

You can also use canary tokens for this. [1]

I am not personally affiliated with them, but I believe they are South African.

[1] https://www.canarytokens.org/generate

egypturnash6y ago

The galaxy brain maneuver here is to start trying to fuzz whatever machines Facebook is using to do this.

w1nst0nsm1th6y ago

There is a way to send any type of file through messenger without facebook snooping into your private life.

1. Compress the file you want to send in a password-protected zip.
2. Change the extension of the zip file from .zip to .txt.
3. Send the file through Messenger.

I already did it to send a macOS application to a friend. To avoid the size restriction, compress into several zip parts, rename the extensions to .txt and send.

File size can be as high as 50 MB+.
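The split-and-rename step could be sketched roughly like this. It assumes the password-protected archive already exists (Python's standard `zipfile` module can read encrypted zips but cannot create them, so an external tool would be needed for that part). Note that this swaps the zip tool's own multi-part feature for a plain byte split, and the function names and the `.partNNN.txt` naming are invented for the example.

```python
from pathlib import Path


def split_as_txt(archive_path, part_size, out_dir):
    """Split an existing archive into numbered .txt chunks of at most
    `part_size` bytes each, and return the list of chunk paths."""
    source = Path(archive_path)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    parts = []
    with source.open("rb") as src:
        index = 0
        while True:
            chunk = src.read(part_size)
            if not chunk:
                break
            # Zero-padded index keeps the parts in order when sorted by name.
            part = out / f"{source.stem}.part{index:03d}.txt"
            part.write_bytes(chunk)
            parts.append(part)
            index += 1
    return parts


def join_parts(parts, dest_path):
    """Recipient side: concatenate the chunks (sorted by name) back
    into a single archive file."""
    dest = Path(dest_path)
    with dest.open("wb") as dst:
        for part in sorted(Path(p) for p in parts):
            dst.write(part.read_bytes())
    return dest
```

The recipient would collect all the `.txt` parts, run `join_parts` on them, and open the result with the password sent separately.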

zzo38computer6y ago

Would making a passworded PDF file help a bit? Then you can tell them the password by a separate message.

(And, I don't use Facebook, anyways.)

omani6y ago

seriously. how many times does the world have to tell you to stop using facebook?

stop using facebook. and no, there is not a single reason for you to be there. trust me. how do you think we lived our lives before there was a facebook?

sys_647386y ago

I only use the messenger thing on the Facebook webpage. I don’t generally install apps on my phone as I don’t trust them. I trust an ad company like Facebook the least.

ariyadi6y ago

Let’s play this game

burtonator6y ago

Facebook: we can crawl you but you can't crawl us!

tus886y ago

Not to be flippant, but if you choose to use Facebook you are already throwing your privacy to the wind, and you are probably getting what you deserve.
