Indeed, for a complete media player, I need:
- access to files not directly opened by the user (playlists, MKV, DCP, MXF),
- read-only (and probably exclusive) access to raw devices (DVD, AudioCD, Blu-Rays, webcams-v4l2, SDI, DVB),
- direct access to raw audio output,
- access to X11 for YUV output, or at least a direct OpenGL context,
- access to network.
For access to files and network, there seemed to be a solution with a manifest to get $HOME access; for audio, a solution might come with kdbus and pulseaudio; but for the others, they refuse flatly, saying that my "use case is irrelevant and dangerous".
I hope this will evolve (maybe it already has), but so far, it's a bit hard to make a complete media player, tbh.
For instance, once any client has any kind of X11 access, it can snoop on any kind of keyboard events, including your password at the unlock screen.
If we want to support the full functionality of a media player in a sandboxed way, we need to start looking into each requirement and designing a safe way to access each item. This is gonna be a lot of work, but I don't see any way around that.
For your exact list:
Files access can either be granted to the app fully or partially. But we also want some kind of file selector service that runs in the session (outside the sandbox) that grants some kind of access to files the user chose.
Raw device access will not happen by just having the app open the raw device nodes. Instead we'll have some kind of service in the session that (via user interaction or "remembered" grants from the user) virtualizes access to these things. This could be anything from e.g. passing a file descriptor of the opened dvd device to a complete replacement of the subsystem. For an example of the latter, for webcams see the pulse-video project: https://github.com/wmanley/pulsevideo
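To make the "passing a file descriptor" end of that range concrete, here is a minimal sketch, assuming a Unix system and Python 3.9+ (for `socket.send_fds` / `socket.recv_fds`). The helper names `broker_send` and `sandbox_receive` are hypothetical, a temporary file stands in for the device node, and both ends run in one process over a socketpair purely for illustration; it is not how any existing session service is implemented.

```python
import os
import socket
import tempfile


def broker_send(sock, path):
    """Session-side helper (hypothetical): open `path` read-only and pass the fd."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # The descriptor travels as SCM_RIGHTS ancillary data next to a tiny payload.
        socket.send_fds(sock, [b"fd"], [fd])
    finally:
        os.close(fd)  # the receiver gets its own duplicate of the descriptor


def sandbox_receive(sock):
    """App-side helper (hypothetical): receive one descriptor, never the path."""
    _msg, fds, _flags, _addr = socket.recv_fds(sock, 16, 1)
    return fds[0]


def demo():
    """Pass a descriptor across a socketpair and read through it."""
    broker_end, app_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        with tempfile.NamedTemporaryFile() as f:
            f.write(b"disc image bytes")
            f.flush()
            broker_send(broker_end, f.name)
            fd = sandbox_receive(app_end)
            try:
                return os.read(fd, 64)
            finally:
                os.close(fd)
    finally:
        broker_end.close()
        app_end.close()
```

The point of the design is that the receiving side only ever holds an already-opened, read-only descriptor; whether that is sufficient for things like encrypted DVD playback is a separate question this sketch does not answer.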
Is raw audio output necessary? Why does not pulseaudio work?
OpenGL access is supported
Network access is (optionally, but i think this will be on for most apps) allowed
Well, we've seen sandboxes on other desktop platforms (OSX, WinRT, ChromeOS), and so far, they all are horribly limiting. So, I'm making sure the same does not happen on Linux, before we get kicked out of our own platform.
> kind of file selector service that runs in the session (outside the sandbox) that grants some kind of access to files the user chose.
This is not enough, as explained above.
> Raw device access will not happen by just having the app open the raw device nodes. Instead we'll have some kind of service in the session that (via user interaction or "remembered" grants from the user) virtualizes access to these things.
Well, this is a deal breaker so far. Are you going to do a pulsedvd, a pulsecd, a pulsedvb, a pulsesdi for all the access modules? Playing encrypted DVD requires direct access, as far as I know.
> For an example of the latter, for webcams see the pulse-video project: https://github.com/wmanley/pulsevideo
Something using GStreamer in Vala to get indirect access to webcams? I don't see how this could even work: how do you control brightness or other webcam controls from two applications, how do you get direct H26x access with preview synchronized? (And asking someone to use a competing project to get video input is also a bit rude, but that's beside the point)
> Why does not pulseaudio work?
libpulse requires X, as far as I know.
> OpenGL access is supported
Through Wayland? How do I get YUV surfaces? What about overlay? What about VDPAU/VAAPI? I guess this will get a lot of improvement, because wayland, wl_scaler (et al) are very limited, so far.
We'll see how it fares, but from the past interactions, the answers were pretty dismissive; and therefore, I'm not that optimistic about the outcome.
I don't know if this applies to VLC, too, but raw audio output (or something that isn't PulseAudio) is necessary wherever low (or at least constant) latency is required.
Edit: oh - now that I look at your username, I believe congratulations are in order re. the subject of this thread :)?
Wayland fixes this.
Is this true? Can you provide a proof of concept that can attack xscreensaver in this way? I was not under the impression that xscreensaver in particular is vulnerable to such an attack. I would like to see the code.
> Is raw audio output necessary? Why does not pulseaudio work?
Funny story, I have never gotten pulseaudio to successfully play any audio. Every so often I read somewhere that pulse makes audio scenario xyz really easy. I try it out and I cannot get any audio samples out to the sound card, period. The only other time I hear about pulse is when I'm telling people to kill it, which ends up fixing all their audio problems. So no thanks to pulse.
Meanwhile I can't help but think that Unix already has a security model for talking to devices, it is called enforcing security at open(2). What is bad about letting an app talk to alsa if it needs audio? If you can provide an example, is that not a privilege escalation in alsa that should be fixed?
It just strikes me that your article here is a technical solution in search of a problem. Apple did sandboxing so it must be right, huh?
I'm currently working on converting a PDF viewer to use privilege separation. I'd love to see a media player use privilege separation too. Does VLC currently have pretty clear separation between its various components? Do you think it would be much work to spin security-conscious parts like the decoders into separate processes?
> Does VLC currently have pretty clear separation between its various components?
Very clear separation. One of the best, tbh.
> Do you think it would be much work to spin security-conscious parts like the decoders into separate processes?
Extremely difficult. We've thought about it.
For a video player, the 3 sensitive parts are protocols (file, http), demuxers (mkv, avi) and decoders.
The crashes mostly happen in protocols and demuxers, but not in decoders (contrary to what people think).
The main issue is that the video decoder MUST be in the same process as the video output, for performance reasons (buffer sharing: memcpy is murder) and for hardware decoders. And video outputs usually have very privileged access to the kernel. Moreover, video outputs are almost necessarily in the same process as the UI thread.
For audio output, it's not good either, although some platforms are better (pulseaudio+Kdbus might work).
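For the parts where a process boundary is less painful (protocols and demuxers, per the above), the basic shape of privilege separation can be sketched as follows. This is only an illustration in Python: the "TOY1" container format, `CHILD_CODE` and `probe_untrusted` are all made up for the example, and a real design would additionally drop privileges in the child (seccomp, namespaces, a separate user), which this sketch does not do. It also says nothing about the decoder/output buffer-sharing problem.

```python
import json
import subprocess
import sys

# Hypothetical toy "demuxer": a 4-byte magic followed by a 1-byte track count.
# It runs in a throwaway interpreter so a crash only kills the child.
CHILD_CODE = r"""
import json, sys
data = sys.stdin.buffer.read()
if data[:4] != b"TOY1":
    raise SystemExit(2)  # malformed input: fail inside the child only
print(json.dumps({"tracks": data[4]}))
"""


def probe_untrusted(data, timeout=5.0):
    """Parse `data` in a separate process; return a dict, or None on any failure."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", CHILD_CODE],
            input=data,
            capture_output=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return None  # a hung parser is treated like a crashed one
    if proc.returncode != 0:
        return None
    # Only a small, structured result crosses back into the main process.
    return json.loads(proc.stdout)
```

Calling `probe_untrusted(b"TOY1\x02...")` yields the parsed header, while garbage input or a parser crash just yields `None` in the parent. Metadata like this is cheap to ship across a pipe; decoded video frames are not, which is the objection raised above.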
Privilege separation is an answer to the question: "I have a program whose authors I trust even though they are fallible, and the program needs to process untrusted data"
In desktop scenarios, I really want an answer to the question: "I have a program whose authors I don't trust, but I want to run it anyway"
For that, you really need full sandboxing; privilege separation is not enough.
It seems pretty impossible to create something which: 1) does sandboxing 2) has no limitations on what an application can do
Edit: What I meant is: Is it possible to lower your requirements or change how VLC works/can do when it is sandboxed?
Maybe for some, but not for others. You really need to be able to open a file without a file opener. Opening optical disks is really important too. I don't get why read-only access to a device poses such a threat, but I'm probably missing something.
FYI, so far, libpulse links to x11.
That would be a misnomer, because "important" file locations like ~/.config, ~/.share, etc. would always prompt for user acknowledgement when a program tries to open something in them and it is not the originating program (i.e., VLC can open something like ~/.config/vlc or ~/.share/vlc, but if it tried to open ~/.config/mplayer or ~/.share/mplayer you would get a prompt notification of it).
> read-only (and probably exclusive) access to raw devices
I don't think this is hard at all, device permissions are something that mostly already works. Just say "App X wants to access device Y, allow? Yes / No / Always".
With good package management and distributors, you could have most of these "prompts" configured with sane defaults so you don't get spam when VLC opens files - the distro can just trust VLC in ~/Music and ~/Video all the time, and any user-made directories, so just give it access everywhere except the config and share dirs, and so on. When you install something it would make sense to tell users what it uses in the same way Android does upon installation, and if it tries to access something post-install you get a dialog about it (i.e., VLC could say at install that it uses the network, or it could ask at runtime if it tries to access the network).
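A rough sketch of the file side of that policy, under the rules described in this comment: everything is allowed except the protected config/share directories, where an app may only touch its own subdirectory without a prompt, and an "Always" answer is remembered. The `Policy` class, its method names and the default home path are hypothetical, and a real implementation would also have to resolve symlinks, which plain path normalization does not.

```python
import posixpath
from pathlib import PurePosixPath

# Directories named in the comment above as needing acknowledgement.
PROTECTED = ("~/.config", "~/.share")


class Policy:
    """Hypothetical per-app file policy with remembered 'Always' grants."""

    def __init__(self, app, home="/home/user"):
        self.app = app
        self.home = home
        self.always = set()  # paths the user answered "Always" for

    def _expand(self, path):
        if path.startswith("~/"):
            path = self.home + path[1:]
        # normpath collapses "..", so "~/.config/vlc/../mplayer" cannot sneak by.
        return PurePosixPath(posixpath.normpath(path))

    def decide(self, path):
        """Return 'allow' or 'prompt' for an access to `path`."""
        target = self._expand(path)
        if target in self.always:
            return "allow"
        for root in PROTECTED:
            base = self._expand(root)
            if base == target or base in target.parents:
                own = base / self.app
                if own == target or own in target.parents:
                    return "allow"  # the app's own config/share directory
                return "prompt"  # someone else's config: ask the user
        return "allow"  # e.g. ~/Music, ~/Video, user-made directories

    def record(self, path, answer):
        """Remember the user's answer; only 'always' is persisted here."""
        if answer == "always":
            self.always.add(self._expand(path))
```

With `Policy("vlc")`, opening `~/.config/vlc/vlcrc` or `~/Music/a.flac` is allowed silently, while `~/.config/mplayer/config` triggers a prompt until the user answers "Always". The same Yes / No / Always bookkeeping would presumably apply to device nodes, keyed by device instead of path.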
Why would VLC need raw audio out? Route through pulseaudio, nobody should be opening raw audio devices anymore unless you are Ardour.
I think the real problem is nobody gives a shit about desktop security. I mean I run Archlinux where there isn't a single working MAC solution that does not require days of prep work. But having delved a lot into MAC and such, I have no faith in AppArmor or SELinux when PaX and grsecurity are doing a much better job and nobody is using them. And they all have holes somewhere: either they don't have fine-grained device permissions, or lack tunables, or don't harden the kernel enough to avoid simple exploits. It's a mess that nobody is really trying to solve upstream of the security implementations, and distros are doing a half-assed job of it across the board.
What's stopping you from running your filesystem and network requests through a local service... you can sandbox all the rendering and other access, and tighten controls on your service interface.
Hypothetical: A vulnerability in Chrome could allow a remote attacker the ability to read/write arbitrary files on your filesystem. /tmp/.X11-unix/X0 for example. This could let them cause all sorts of havoc on your X11 display including controlling/compromising other running programs (and capturing your keystrokes).
If Chrome were sandboxed as described in the article such an attack would not be possible without first breaking out of the sandbox.
Unless you personally built software from trusted source with a trusted toolchain, or reverse-engineered and inspected the binary package you are installing, you have no more confidence that your installed binary package is not malicious than any proprietary binary blob. Even presuming the maintainer and packager are not malicious (and why would you trust a random open source maintainer more than a random proprietary maintainer?), there have been cases in the past where repositories have been compromised and had malicious packages inserted. These are just the ones we know about. Signatures help but don't solve anything; especially with distributed projects, keys can be compromised in any number of ways. And they don't help against malicious actors.
That said, I don't see sandboxing Linux apps as particularly revolutionary. Users, ACLs, and chroots have been around for ages. While not especially sugary, they are pretty effective. For full-on app sandboxing, Android is Linux, and they've been doing this since 2008. The average Linux desktop user has legitimate reasons they don't want to sandbox any given desktop app to this extent. On the server side, "sandboxing" in the form of "separation of privileges" has been standard best practice in the UNIX realm as long as Richard Stallman's beard. This is just not particularly interesting. More interesting to me is what has been done for sandboxing in the browser, most especially Chrome.
Because it's not like anyone uses media players in the real world.
Seriously, WHAT?
But if all applications use the same OpenSSL that is managed by dpkg+apt (or something similar), a single update will fix all applications.
Personally, I don't think that this problem can be solved without losing the advantages of bundling (robustness, reproducibility, binary portability to other distributions).
It's how you ship 1000 different versions of the same library and have no idea what to do when they need to be upgraded. It's just a terrible way to manage a system. Easier doesn't mean better.
Be aware of typical Linus swearing: https://www.youtube.com/watch?v=5PmHRSeA2c8#t=358
security will stifle application innovation. Making cool things hard, and awesome things impossible.
P.S.: systemd now allows logging in via your Facebook account on every machine, per default since Ubuntu Timid Tamandu!
https://www.usenix.org/legacy/events/atc10/tech/full_papers/...
which was turned into
It works very well but I can't release it yet because there's an underlying problem and the author of this article pointed it out very clearly:
> because X11 is impossible to secure
So imagine the scenario: You want to have a single server that hosts desktop environments for multiple users over the web. X11 is multi-user! No problem, right? Wrong.
Your little web server daemon is going to run as some user. Let's say that user is root to keep things simple. So now your root user needs to spin up an X11 server for each user (so their sessions are separate). But that won't work so well because the X11 servers will all be running as the same user (root). This means that each user can mess with each other's applications, log their keystrokes, etc.
So what do you do? Well, you can just create random, one-time user accounts in /etc/passwd and spin up X11 using those accounts but now your little daemon has to run as root. Also, you now have to keep track of and maintain not just all those temporary users but all the files owned by those users. You also need to keep track of which user had what account and when (for auditing purposes). You also have to worry about UID conflicts (especially with external systems) and some other less common scenarios (e.g. LDAP integration with sudo).
Another option would be to give each user their own container and run X11 inside of that. Except now the application can't get access to OpenGL acceleration, and shared memory access (so your little daemon can capture the screen) becomes complicated. Then there's the fact that if you want to give the users access to more applications, those applications will need to be installed inside each user's container. You can do some tricks with mounts in order to work around that problem somewhat but it's complicated. REALLY complicated!
For now I've decided to just assume the daemon will be running as a single user (doesn't matter which one) while I work on some other things (e.g. improving audio support) but very soon I'm going to have to come back to the multi-user security problem. It's not easy to solve.
The way X11 was engineered just assumes that each user has their own processes and if you do have multiple users all their applications will be running under different accounts.
related to what you are trying to do?
As an example of the efficiency difference, when viewing an entire desktop with a single terminal application running 'top', Gate One used 1/10th the bandwidth of noVNC when I last performed benchmarking (I had them both displaying the same exact desktop; both Gate One and noVNC running simultaneously).
Also, noVNC CPU utilization goes through the roof if you try to do something like play back a video. When playing back a video inside Gate One the gateone.py process only eats up about 8% of a single core of my laptop's i7 (4th gen). That's with loads of debugging enabled (I tested it just now with SMPlayer playing Big Buck Bunny somewhere at ~1024x768 resolution).
My benchmark goal is to be able to play Minecraft @30fps (~1024x768) remotely using an AWS/Rackspace/OpenStack server. I've already achieved that except the audio delay sucks (~2 seconds) so that's what I'm currently working on (had to write my own Opus/WebM audio encoder).
Yet X11 was designed in the prime example world of a multi-user OS, UNIX. Hmm.
>> We also need to use kdbus to allow desktop integration that is properly filtered at the kernel level.
Didn't I read an article on HN recently talking about a vulnerability in Windows and the subject of too close a relationship between the kernel and the end user graphics came up?
Also, kdbus has nothing to do with graphics.
* Is independent of the host distribution
* Has no access to any system or user files other than the ones from the runtime and application itself
* Has no access to any hardware devices (GL rendering supported)
* Has restricted network access
* Can’t see any other processes in the system
* Can only get input via standard APIs
* Can only show graphics via DOM/Canvas/WebGL/SVG/MathML
* Can only output audio via Audio Tags/Web Audio/MSE
* plus more sandboxing details
I have a guide on running accelerated GUI apps in both privileged and unprivileged containers: http://www.flockport.com/run-gui-apps-in-lxc-containers and here's one more by Stephane Graber, lead developer of LXC using only unprivileged containers: https://www.stgraber.org/2014/02/09/lxc-1-0-gui-in-container...
On the one hand, I want to be able to run untrusted applications safely. And in a larger sense, I think that it might be a hopeless endeavor to try to get users to only run trustworthy applications on their machines.
On the other hand, I want to have full access to my system, and sometimes that means having full access to it through applications (jbk's VLC use case is a good example of that). And sandboxes are far from perfect (this may improve, but right now existing sandboxes still have lots of holes). Sandboxes may be just security theater at this point (although as I said, this might change).
This program wants to access your camera, mic, //home/personal, connect to irc://botnet.com, and install these dependencies: spyware1, 2, 3. Are you sure you want to install/run [insert innocent program]?
I think that would be a better option than sandboxing everything.
Is it even possible for the OS to know if a program is using the webcam?