https://cloudbrowser.website/ runs a headless Google Chrome instance on a VPS, loading a website and taking a screenshot using NodeJS puppeteer. The links are iterated to create an image map, so you can browse (many) websites as normal, but through the cloud. Not all websites work yet, but many do.
Why is this useful? While this is only a simple prototype, it is an experiment in a new way to browse the web which I believe shows promise. Websites have become increasingly complex, heavy on JavaScript and HTML5 features. Cloud Browser pushes this complexity to the cloud, so you can browse websites requiring JavaScript without enabling JavaScript on your end-user browser, for example.
It is still in its incipient stages, but I welcome any feedback/suggestions on how to expand and refine this concept, or better yet, code contributions ;) feel free to fork on GitLab.
Source code: https://gitlab.com/epitactic/cloudbrowser/
Live demo of Cloud Browser loading Hacker News: http://cloudbrowser.website/b/https://news.ycombinator.com/
Static mirror if the VPS is down: https://epitactic.gitlab.io/cloudbrowser/
It didn't really teach me anything about programming, but it seemed like such a novel thing to put hyperlinks on parts of images.
https://web.archive.org/web/20010301200926/http://www.bonega...
An absolutely positioned CSS div is an interesting idea. If I recall correctly, I actually tried this first but ran into various problems (with scaling as the page is zoomed in/out, I think), and the image map worked perfectly. area shape="rect" works well, as it translates directly from the boundingBox() of each element handle returned by puppeteer's await page.$$('a').
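To make the bounding-box-to-area translation concrete, here is a minimal sketch, not the actual Cloud Browser code. It assumes the link data has already been collected into plain objects whose box has the { x, y, width, height } shape that puppeteer's boundingBox() resolves to (or null for invisible elements). The names buildImageMap and escapeAttr are made up for illustration.

```javascript
// Escape characters that are special inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// links: [{ href, box }] where box is { x, y, width, height } or null.
// Returns the markup for a <map> with one rectangular <area> per visible link.
function buildImageMap(links, mapName) {
  const areas = links
    // Skip links with no box (not rendered) or an empty box.
    .filter((link) => link.box && link.box.width > 0 && link.box.height > 0)
    .map(({ href, box }) => {
      // shape="rect" expects left,top,right,bottom.
      const coords = [box.x, box.y, box.x + box.width, box.y + box.height]
        .map(Math.round)
        .join(',');
      return `<area shape="rect" coords="${coords}" href="${escapeAttr(href)}">`;
    });
  return `<map name="${escapeAttr(mapName)}">\n${areas.join('\n')}\n</map>`;
}
```

The screenshot would then reference the map with something like img src="screenshot.png" usemap="#page", where "page" is whatever name was passed as mapName.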
Second: If my cloud browser has all of my credentials, do I care that my cloud browser is the process that’s pwned instead of my local browser? I think I don’t, so the main benefit would seem to be something like “view a certain static-ish subset of JS-required websites on a device that can’t/won’t run JS”.
That was essentially my envisioned use case, as well. Not only for devices that won't run JS but those where you might not want to. For example, a lot of news websites are surprisingly JavaScript heavy, slow and buggy, not something I necessarily want to run if I merely want to read what the news is talking about. Public read-only sites. Granted, arguably services like outline.com and archive.is already cover this use case better.
As for sites requiring credentials, the way I see it users could run the cloud browser on their own server. This has benefits even for public sites, adding additional privacy. However it is not something this Cloud Browser really supports yet, since there is no support for forms, text fields, or other interactivity with the exception of hyperlinks.
I've been toying with this configuration and think it might be usable for some opensource projects.
It also points out a significant limitation of Cloud Browser's imagemap-based architecture. Screenshotting the page locks away the text behind an image, inaccessible to screen readers, copy and paste, or other interactions. This doesn't seem easy to solve, since the most the img tag offers for accessibility is an "alt" attribute, which does not allow specifying which areas of the image contain what text. Sending the actual text (like html.brow.sh) would solve this problem, but then layout is up to the end-user browser again.
Client-side OCR of the image may be a possible alternative. Will look into it, thanks!
https://chrome.google.com/webstore/detail/prerendercloud-dis...
If you want to borrow the image map idea, feel free :). I'm not planning on developing cloudbrowser.website much further, moving onto other projects, but maybe I'll be a customer.
I'm pretty sure it would take less effort and give better results if you just hooked the viewer to something like a guacamole [1] session tethering you to a remote browser.
Looking into it more, maybe noVNC or Guacamole, with VNC or RDP, tethered to a remote browser instance. This would solve the problem of user interaction with the web page, which is currently very limited with Cloud Browser's image maps (hyperlinks only). On the other hand it would increase the end-user browser requirements (image maps date back to HTML 3.2! https://en.wikipedia.org/wiki/Image_map), but still not run the target website's content, so it could be worthwhile and most browsers actively in use now support HTML5.
I'd try it for myself, but as I take it you are not running this at production scale, and the HN effect is producing a lot of error messages, so I can't really see for myself.
Unfortunately, I'm only hosting the demo on a quad-core 6GB VPS, which frequently runs out of memory. There is a known issue (https://gitlab.com/epitactic/cloudbrowser/issues/2) where old Chrome processes are not cleaned up for some reason. I've restarted the VPS, should be available again for now, for at least a while.
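I don't know the cause of that issue, so this is not a diagnosis, just a generic pattern that guards against one common source of leftover processes: a browser that never gets closed because an error was thrown mid-render. The helper name withBrowser is made up. The launch argument is assumed to be something like () => puppeteer.launch(), which resolves to an object with an async close() method.

```javascript
// Run `task` with a freshly launched browser and always close it afterwards,
// whether the task succeeds or throws.
// launch: async function resolving to an object with an async close(),
//         e.g. () => puppeteer.launch() (assumes puppeteer is installed).
// task:   async function that receives the browser and returns a result.
async function withBrowser(launch, task) {
  const browser = await launch();
  try {
    return await task(browser);
  } finally {
    // Runs on both the success path and the error path.
    await browser.close();
  }
}
```

A render would then look like withBrowser(() => puppeteer.launch(), async (browser) => { ... }), with the screenshot logic inside the callback.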
So why not just sanitise the DOM? Embed images. Remove script before sending. Add an event listener to send all events back to your server. User clicks, your server emulates and sends back a DOM diff. You'd need to do more work for things like :hover (et al) but it all seems more sustainable than sending full on rasters.
Another inspiration is image boards such as 4chan, where screenshots are a very common means of sharing information, including articles on websites, or even tweets. Even though it may not be technically ideal, and annotated text seems like it would be more efficient, in practice images as the lowest-common-denominator seem to be a reasonably effective format for sharing information.
On the other hand, if someone does come up with a true "browser in browser" implementation like you propose, I would be very interested in trying it out. Could be a promising idea, but a lot of work to get right.
Cloud Browser does use a few modern features, a bit of CSS and (optional) JS, but not for anything strictly essential. Again I haven't tested it on any old browsers, but if anyone does I welcome bug reports/patches at https://gitlab.com/epitactic/cloudbrowser/issues.
(I wonder if it would work on NCSA Mosaic? https://news.ycombinator.com/item?id=18428682 - well, Mosaic added the img tag, but not sure if imagemap was yet available.)
Looks like they have been around a while (5+ years), and from their website https://www.authentic8.com, they are focused on the improved endpoint security aspect:
"The Browser for a Zero Trust Web"
> Traditional browsers run on blind trust. Silo assumes zero trust by running the browser in the cloud.
> Web code can’t be trusted. Organizations know that every page view means risk to the business. Silo restores your trust in the web through isolation, control and audit of the browser.
> Isolate: Silo executes all web code on our servers. Nothing touches your endpoint, and untrusted endpoints can’t corrupt your environment or your data.
> Mitigate risk: Shift your attack surface area off your network and devices to disposable, anonymous cloud infrastructure.
I am intrigued, wonder how well they are doing, and how well it works. Somewhat expensive, I've heard $10/month and $100/year for individuals. No online live free demo, but available on request.
With the Epitactic Cloud Browser, I'm only running the VPS temporarily as a demo. The way I envision it, end-users can run their own instance, either on a home server or a virtual server, maintaining control and privacy.
These remote browser services seem to be difficult to keep running... (expensive if not profitable, I assume. My VPS is good for a few more weeks.)
Almost none of the metadata is passed through. There are two exceptions I can think of:
1) Browser window size. This is actually a significant fingerprinting leak, since desktop users can resize the dimensions of their browser down to the pixel. Cloud Browser uses it to generate an appropriately-sized image, matching the Chrome instance in the cloud to the end-user's browser. Less of a problem with mobile devices where the browser window is fixed, but could help fingerprint the device type.
If you want to avoid this, disabling JavaScript will prevent Cloud Browser from using window.innerWidth, innerHeight, and devicePixelRatio, and it will default to 800x600x1. This may not match your device. The best way to solve this is probably to run your own Cloud Browser instance, configured for what you will browse it from.
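As a rough sketch of that fallback (not the actual Cloud Browser code), the server side could look something like this. The query parameter names w, h and dpr and the upper bounds are invented for the example. The returned object uses the width / height / deviceScaleFactor fields that puppeteer's page.setViewport() accepts.

```javascript
// Used when the client sends nothing, e.g. because JavaScript is disabled.
const DEFAULT_VIEWPORT = { width: 800, height: 600, deviceScaleFactor: 1 };

// query: plain object of query-string values (strings or undefined).
// The keys w, h, dpr are hypothetical parameter names.
function parseViewport(query) {
  // Accept a value only if it is a finite number within [min, max].
  const pick = (value, fallback, min, max) => {
    const n = Number(value);
    return Number.isFinite(n) && n >= min && n <= max ? n : fallback;
  };
  return {
    width: Math.round(pick(query.w, DEFAULT_VIEWPORT.width, 1, 4096)),
    height: Math.round(pick(query.h, DEFAULT_VIEWPORT.height, 1, 4096)),
    deviceScaleFactor: pick(query.dpr, DEFAULT_VIEWPORT.deviceScaleFactor, 0.5, 4),
  };
}
```

Someone running their own instance could simply change DEFAULT_VIEWPORT to match the device they browse from, and no size information would need to leave the client at all.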
Interestingly, Firefox is implementing a "letterboxing" feature, from TorBrowser, to reduce fingerprinting from this technique: https://nakedsecurity.sophos.com/2019/03/08/firefox-browser-...
2) Time of access. The time Cloud Browser accesses a website will be shortly after the end-user accesses the website, as you would expect from a proxy. This could allow some forms of fingerprinting, e.g. work hours, depending on your browsing habits, or correlating with other non-cloud website accesses.
If you are concerned about this, Cloud Browser makes it very easy to share the cached pages offline, in a time-independent manner. That is, you can access the files in cache/ offline as needed. The online browser will try to load from the cache first, but automatically refresh with a live version when it is available. But you could set up a cron job to fetch the websites you commonly visit on a fixed schedule, then only browse through the cache while offline, and then websites wouldn't be able to see when you read them.
I've thought about developing this feature further. It could lead to a better user experience and avoid some of the problems with running Cloud Browser on a VPS. The VPS would be needed for running headless Chrome, but it could upload the static HTML and images as plain files to any static hosting website, for quick and easy browsing. You would need to "subscribe" to the websites you want to visit, and they would have to be periodically refreshed, however.
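The cache-first behaviour and the offline-only mode could be sketched roughly like this. It is a simplified illustration, not the real implementation: loadPage, the offline option and the renderLive callback are hypothetical names, and an in-memory Map stands in for the cache/ directory.

```javascript
// cache:      object with get(url) / set(url, page); a Map works for a demo.
// renderLive: hypothetical async function that renders the page in headless
//             Chrome and resolves to the rendered result.
// offline:    when true, never contact the target site.
async function loadPage(url, cache, renderLive, { offline = false } = {}) {
  const cached = cache.get(url);

  if (offline) {
    // Cache only: the target website never sees a request at read time.
    return { page: cached === undefined ? null : cached, fromCache: cached !== undefined };
  }

  // Start a live render and store the result once it is available.
  const refreshed = renderLive(url).then((page) => {
    cache.set(url, page);
    return page;
  });

  if (cached !== undefined) {
    // Serve the cached copy right away; the refresh continues in the background.
    refreshed.catch(() => {}); // a failed refresh must not become an unhandled rejection
    return { page: cached, fromCache: true, refreshed };
  }

  // Nothing cached yet, so wait for the live render.
  return { page: await refreshed, fromCache: false, refreshed };
}
```

A scheduled job would call loadPage without the offline flag for each subscribed URL, while the reading side would always pass { offline: true }.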
What if it sent a DOM string instead of a picture?
The client would need some js, but mostly just click and input handlers. All the heavy JS runs server-side.
One could use v-dom diffing and only send updates to the client to reduce bandwidth.
But that is getting complicated. I really appreciate the simplicity and elegance of using image maps.
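For a sense of what the diffing step involves, here is a deliberately naive sketch. It assumes a toy tree format where a node is either a string (text) or { tag, children }, compares children by index only (no keys, no attributes), and emits a list of patches the server could send instead of a full page. Real virtual DOM libraries do considerably more; the diff function and the patch format here are made up.

```javascript
// Compare two toy trees and return a list of patches.
// A node is a string (text node) or { tag: string, children: node[] }.
// A patch is { op: 'replace' | 'insert', path, node } or { op: 'remove', path },
// where path is the list of child indices leading from the root to the node.
function diff(oldNode, newNode, path = []) {
  if (oldNode === undefined) return [{ op: 'insert', path, node: newNode }];
  if (newNode === undefined) return [{ op: 'remove', path }];

  // Text nodes: replace unless both are identical strings.
  if (typeof oldNode === 'string' || typeof newNode === 'string') {
    return oldNode === newNode ? [] : [{ op: 'replace', path, node: newNode }];
  }

  // Different element type: replace the whole subtree.
  if (oldNode.tag !== newNode.tag) return [{ op: 'replace', path, node: newNode }];

  const patches = [];
  const shared = Math.min(oldNode.children.length, newNode.children.length);

  // Children present in both trees are compared pairwise by index.
  for (let i = 0; i < shared; i++) {
    patches.push(...diff(oldNode.children[i], newNode.children[i], [...path, i]));
  }
  // Extra new children are inserted in order.
  for (let i = shared; i < newNode.children.length; i++) {
    patches.push({ op: 'insert', path: [...path, i], node: newNode.children[i] });
  }
  // Extra old children are removed last-first so earlier indices stay valid.
  for (let i = oldNode.children.length - 1; i >= shared; i--) {
    patches.push({ op: 'remove', path: [...path, i] });
  }
  return patches;
}
```

Even this toy version shows where the complexity creeps in: the client needs matching code to apply the patches, and index-based matching handles reordered lists poorly, which is part of why the image map approach is so much simpler.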