I’ve had Facebook block several links sent in private message groups, to completely legal and safe sites (Messenger prints out an obscure API error and refuses to send the content). They have done this for a long time.
They can then have a single map of phone num -> links rendered between fb and whatsapp.
Give it a few seconds after pasting it and you'll get the preview, because it gets it from your device.
The only conclusion: it takes a little time for a file that's flagged - based on its language - to pass the scanners?
It might be, IDK, but if it’s all inside their system, how could you audit that?
They do this already in WhatsApp for instance.
The number of times I've gotten those obviously spammy links from people I have never talked to in a decade plus... it can't be hard to red-flag these.
It's what else might be going on with the link analysis that's worrisome.
For example, Facebook asking for phone numbers in the name of “security” when they don’t give a shit about security. They wanted to tie a phone number to the owner, and create a social graph based on their contact uploads.
I can totally understand scanning a PDF for links to look for malicious links to protect users.
But that wouldn't involve actual HTTP requests to them.
I'm struggling to imagine what purpose this could have.
One of the things that phishers and others do is use link wrapping and other services to hide malicious links. So, I get something.wordpress.com/something-clean. I then put in an HTML or JS redirect on that page to something malicious. Given that browsers don't warn about HTTP, HTML, or JS redirects, it's an easy way for scammers to get around a list of malicious pages.
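To make the redirect trick concrete, here is a minimal sketch of why a scanner has to follow the whole chain rather than check only the link that was posted. Everything in it is made up for illustration: the `REDIRECTS` table stands in for real HTTP, HTML, or JS redirects, and the hostnames and blocklist are hypothetical. No network requests are made.

```python
from urllib.parse import urlsplit

# Hypothetical data: which page redirects where. In reality these hops would be
# HTTP 3xx responses, <meta refresh> tags, or JavaScript redirects.
REDIRECTS = {
    "https://something.wordpress.com/something-clean": "https://hop.example/r",
    "https://hop.example/r": "https://malicious.example/login",
}
BLOCKED_HOSTS = {"malicious.example"}  # hypothetical list of known-bad hosts


def resolve_chain(url, redirects, max_hops=10):
    """Return every URL visited, stopping at a loop or after max_hops hops."""
    chain = [url]
    seen = {url}
    while url in redirects and len(chain) <= max_hops:
        url = redirects[url]
        if url in seen:  # redirect loop, stop here
            break
        chain.append(url)
        seen.add(url)
    return chain


def is_blocked(url, redirects=REDIRECTS, blocked=BLOCKED_HOSTS):
    """True if any hop in the chain lands on a blocked host."""
    return any(urlsplit(u).hostname in blocked for u in resolve_chain(url, redirects))
```

Checking only the first URL would pass the wordpress.com link; checking every hop catches the final destination.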
These kinds of attacks are very common in the email space.
Look-alike domains are a phishing vector that doesn't require you to make an HTTP request.
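As a rough illustration of how that check can be done on the string alone, here is a sketch that folds a few confusable characters and then compares edit distance against a watch list. The watch list, the confusables table, and the distance threshold are all assumptions for the example; a real detector would need a much larger table and proper Unicode handling.

```python
KNOWN = ["facebook.com", "paypal.com", "example.com"]  # hypothetical watch list

# A few visually confusable substitutions; deliberately tiny for the sketch.
CONFUSABLES = str.maketrans({"0": "o", "1": "l", "3": "e", "5": "s"})


def edit_distance(a, b):
    """Classic Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def lookalike_of(domain, known=KNOWN, max_distance=1):
    """Return the known domain this one imitates, or None. No network access."""
    d = domain.lower()
    if d in known:
        return None  # it is the real thing
    folded = d.translate(CONFUSABLES).replace("rn", "m")
    for target in known:
        if folded == target or edit_distance(folded, target) <= max_distance:
            return target
    return None
```

For instance, `lookalike_of("paypa1.com")` returns `"paypal.com"` without ever touching the network.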
Or link support requests to people who received a certain link via message.
So basically data mining to feed a model that takes future actions in consideration.
Just in case you didn't follow any of the previous HN discussion of how that's done,
consider the URL https://accounts.example.com/tmp/badmojo.exe
You (Facebook in this case) run a hypothetical method SafeSearch('accounts.example.com') and also SafeSearch('example.com') and SafeSearch('accounts.example.com/tmp') and SafeSearch('accounts.example.com/tmp/badmojo.exe')
SafeSearch(string) is defined as follows: you compute SHA(string), and that's your hash. You compare the start of this hash to a huge list of prefixes that Google provides, which you fetch updates for every few minutes. If there's no match, fine, done. If there's a match, you ask Google: OK, I saw this prefix you sent me, what hashes should I be scared of? Google gives you a list of hashes with that prefix. If your hash is in this new list, the original URL was scary, so warn users not to visit; otherwise continue what you were doing.
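The flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions: SHA-256 as the hash, a 4-byte prefix, simplified URL expressions (host suffixes crossed with path prefixes, no canonicalization), and a `FakeProvider` class standing in for the list provider so that nothing touches the network. None of these names come from a real client library.

```python
import hashlib

PREFIX_LEN = 4  # bytes; an assumption for this sketch


def sha(text):
    return hashlib.sha256(text.encode("utf-8")).digest()


def expressions(host, path):
    """Host suffixes crossed with path prefixes, e.g. example.com/ and
    accounts.example.com/tmp/ (simplified, no canonicalization)."""
    labels = host.split(".")
    hosts = [".".join(labels[i:]) for i in range(len(labels) - 1)]
    parts = [p for p in path.split("/") if p]
    paths = ["/"]
    for i in range(1, len(parts)):
        paths.append("/" + "/".join(parts[:i]) + "/")
    if parts:
        paths.append("/" + "/".join(parts))
    return [h + p for h in hosts for p in paths]


class FakeProvider:
    """Stands in for the list provider; holds full hashes of bad expressions."""

    def __init__(self, bad_expressions):
        self.full_hashes = {sha(e) for e in bad_expressions}

    def prefixes(self):
        # What the client downloads and refreshes periodically.
        return {h[:PREFIX_LEN] for h in self.full_hashes}

    def full_hashes_for(self, prefix):
        # What the client asks for only after a local prefix match.
        return {h for h in self.full_hashes if h.startswith(prefix)}


def is_unsafe(host, path, local_prefixes, provider):
    for expr in expressions(host, path):
        h = sha(expr)
        prefix = h[:PREFIX_LEN]
        if prefix not in local_prefixes:
            continue  # no match: nothing is sent to the provider
        if h in provider.full_hashes_for(prefix):
            return True
    return False
```

The point of the design is that the provider only ever sees a short hash prefix, and only when there is a local match, never the URL itself.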
I'm sure if they're pulling data to do this analysis, it's not the only analysis they're doing.
EDIT: Actually they already did, it's called Facebook Instant Articles.
If we were talking about really malicious stuff like child pornography, then in the context of filesharing these companies already have systems in place to distinguish content, so blanket banning of torrent files seems blatantly unnecessary.
This is the real strawman, as nobody on this entire thread is talking about illegal music. On the other hand, there's a strong and persistent thread of calls for tech platforms like Facebook to control "malicious or illegal" information being spread on their platforms: an obvious example is the NZ shooter's manifesto + video.
You gotta assume they will reconstruct generations of your family tree if they can from your chat history.
Once a service starts blocking malicious links, the obvious next step is to conceal the link through some sort of indirection.
It's apparently not my data and privacy and I have no say in it at all when someone uploads their goddamn contact list to these frickin' crooks. Is that your understanding too?
So this solution is not a solution. This solution only works if we stop other people from using it. How's that going to work? Well, it isn't.
Cool? Now welcome to the "what regulation is appropriate" debate because this is clearly an example of market failure, same as national defence, same as national parks, same as pollution, same as any other. The sooner we treat "...using computers" exactly the same as any other industry the better. This is nuts.
Afaik that's actually the legal situation in the US due to the third-party doctrine [0]. It's one of the ways government mass surveillance has been "legalized" without having to comply with the Fourth Amendment.
No, not a complete solution, but it's a step in the right direction. One of the reasons advertisers as well as ordinary users come to FB is to be in contact with you -- to have your attention. Stop using FB and you take that away. FB ends up with less of an audience with which to entice others.
I quit several years ago, after being on the platform for nearly a decade, which I mainly used to communicate with classmates.
Over time, I realized the platform was only good for pushing false information and giving a platform to people who shouldn't be sharing their thoughts. I also became more aware that I was a product, and it infuriated me to no end.
I'm glad to not be on it anymore. What a waste of time.
Would be interesting to see if Facebook has a maximum number of links it'll follow.
https://news.ycombinator.com/item?id=1
https://news.ycombinator.com/item?id=2
https://news.ycombinator.com/item?id=3
and so on, until 10,000,000? Perhaps Facebook starts opening every link using 10,000 parallel threads. Can you really replicate that from your connection at home? Perhaps even the sysadmin of your victim site has whitelisted all Facebook IP addresses so their crawlers get a free ride.
I mean, almost all of the time that PDF will be malware designed to trick the reader into clicking that link, and it did the right thing.
https://support.signal.org/hc/en-us/articles/360022474332-Li...
The service being previewed doesn't know who you are because Signal acts as a proxy, and Signal doesn't know what you previewed on that service because their client deliberately sends overlapping Range requests so that the preview size is rounded.
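One way to picture "overlapping Range requests so that the size is rounded" is the sketch below. It is a guess at the general idea, not Signal's actual implementation: the block size, the function names, and the choice to shift the last request backwards are all assumptions made for illustration.

```python
def padded_ranges(content_length, block=1024):
    """Return inclusive (start, end) byte ranges, each exactly `block` bytes
    whenever the content is at least one block long."""
    if content_length <= 0:
        return []
    if content_length <= block:
        # Too short to pad by overlapping; a single request covers it.
        return [(0, content_length - 1)]
    ranges = []
    start = 0
    while start + block <= content_length:
        ranges.append((start, start + block - 1))
        start += block
    if start < content_length:
        # Shift the final request back so it overlaps the previous one and is
        # still a full block; an observer sees a whole number of blocks.
        ranges.append((content_length - block, content_length - 1))
    return ranges


def range_headers(content_length, block=1024):
    """Format the ranges as HTTP Range header values."""
    return [f"bytes={s}-{e}" for s, e in padded_ranges(content_length, block)]
```

With a 2500-byte preview and 1024-byte blocks, three full blocks are requested, so someone counting bytes on the wire sees 3072 rather than 2500.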
If you search for a text string in Gmail, it will return emails that contain that text only in scanned images or PDFs that are in your mailbox.
I am not personally affiliated with them, but I believe they are South African.
1. Compress the file you want to send into a password-protected zip.
2. Change the extension of the zip file from .zip to .txt.
3. Send the file through Messenger.
I've already done this to send a macOS application to a friend. To get around the size restriction, compress into several zip parts, rename the extensions to .txt, and send.
File size can be as high as 50 MB+.
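If you'd rather script it, here is a rough sketch of the zip, split, and rename steps. Two caveats: Python's standard `zipfile` module cannot write password-protected archives, so the password step is left out here and would need an external archiver; and the parts below are plain byte slices of one zip, not a real multi-volume archive, so the recipient has to rejoin them with the matching function. The function names and the part-naming scheme are made up for this example.

```python
import io
import zipfile
from pathlib import Path


def pack_as_txt_parts(src, out_dir, part_size):
    """Zip `src` in memory, slice it into part_size-byte chunks, save each as .txt."""
    src, out_dir = Path(src), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(src, arcname=src.name)
    data = buf.getvalue()
    parts = []
    for n, offset in enumerate(range(0, len(data), part_size), 1):
        part = out_dir / f"{src.stem}.part{n:03d}.txt"
        part.write_bytes(data[offset:offset + part_size])
        parts.append(part)
    return parts


def unpack_txt_parts(parts, dest_dir):
    """Concatenate the parts in the given order and extract the zip."""
    data = b"".join(Path(p).read_bytes() for p in parts)
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        zf.extractall(dest_dir)
```

The recipient calls `unpack_txt_parts` with the parts in order to get the original file back.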
(And, I don't use Facebook, anyways.)
stop using facebook. and no, there is not a single reason for you to be there. trust me. how do you think we lived our lives before there was a facebook?