'x86 virtualization is about basically placing another nearly full kernel, full of new bugs, on top of a nasty x86 architecture which barely has correct page protection. Then running your operating system on the other side of this brand new pile of shit.
You are absolutely deluded, if not stupid, if you think that a worldwide collection of software engineers who can't write operating systems or applications without security holes, can then turn around and suddenly write virtualization layers without security holes.'
Source: http://marc.info/?l=openbsd-misc&m=119318909016582
Personally, I have hope for things like cgroups/jails and MAC/SELinux over virtualization.
'barely has correct page protection' is just a way of saying 'has correct page protection, but I want to be really snotty about it'. So, highlighting a non-problem.
No one is claiming that virtualization makes a system magically completely secure, but do people actually believe that it makes it less secure (compared to running the same software on the same hardware under a single OS)? I don't think so.
You can be as insulting as you like if you compare x86 (and x86-64) to SPARC and POWER, which is what I assume he is doing here, considering he provides operating systems for multiple architectures. We're talking about an architecture that started with the 8086, and despite changes to the underlying microcode architecture, the front-end ISA and system interface are still plagued with poorly designed extensions hacked on.
Regarding virtualization, any sharing of resources, particularly at the hardware level, is an attack vector if not implemented correctly. Whether it is implemented correctly or is exploitable is merely a matter of time and effort, as demonstrated here. The exception would be a mathematically verified implementation, which this isn't, and given the evolved x86 architecture, verification probably isn't possible. So it can't be more secure and is unlikely to be as secure. That leaves only less secure.
If you had been running an OpenBSD instance on hardware as a single OS, you would not be vulnerable to having your system's memory read by this hypervisor bug. So... yes.
With virtualization you have the hypervisor, which has applications running along with it (e.g. ssh, a CLI, syslog, bash, etc.), and then you install a guest in a VM on top of the hypervisor. The OS on the VM is another vector, which has its own applications on it (DB, web, ftp, etc.).
If I have a bare metal server with just an OS installed on it and its applications on top of that, I only have to worry about that set of OS and applications and their associated risk vectors.
If I have a bare metal server with a hypervisor and then the above OS and set of applications, I have increased the number of risk vectors by however many applications are running along with the hypervisor.
“but do people actually believe that it makes it less secure?”
Just bringing a hypervisor into an environment does not of itself immediately make it less secure. I agree with you, I don’t think it makes it less secure. It does increase the risk of the environment, and appropriate architecture and action must be taken to keep it from becoming less secure. A large number of environments are not architected and managed properly.
Another very realistic, and happening today, example: Bare metal server with Windows OS installed. Bare metal server with a hypervisor which just happens to have bash on it (or is susceptible to this memory issue). The same Windows OS is installed as a VM. In the second instance with the hypervisor I would have, indirectly and out of my immediate control, made my environment less secure.
An increase in complexity or increase in components will increase the risk of an environment.
However, common virtualized platforms such as EC2 encourage you to run your whole server setup in an environment where a (possibly malicious) neighbour could be running arbitrary x86 code in an instance on the same physical machine. This is not an attack vector that is remotely possible in the traditional, non-virtualized setup.
So the answer to your question is yes: there have been cases where simply adding the hypervisor introduced a vulnerability.
I believe he was comparing it with jails and other similar technologies, which work at the OS level rather than loading another kernel.
The attack surface of a virtualization layer is much smaller than the attack surface of an operating system kernel.
PS- I have only done an undergraduate level course on computer architecture
0x3ff to 0xff
elsewhere a 256 rather than 1024 wide window is being used too
Simple but potentially dangerous right? :P
> While the write path change appears to be purely cosmetic (but still gets done here for consistency), the read side mistake permitted accesses beyond the virtual APIC page.
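To make the 0x3ff versus 0xff difference concrete, here is a minimal C sketch of the arithmetic. It assumes the standard x2APIC layout, where MSR 0x800 + n corresponds to the 16-byte register slot at offset n * 16 in the 4 KiB APIC page. The macro and function names are made up for illustration and are not Xen's.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed constants (illustrative names, not taken from the Xen source). */
#define X2APIC_MSR_BASE  0x800u  /* first x2APIC MSR */
#define APIC_PAGE_SIZE   4096u   /* size of the virtual APIC register page */
#define APIC_REG_STRIDE  16u     /* each MSR maps to one 16-byte register slot */

/* Byte offset, relative to the start of the APIC page, that an MSR index selects. */
static uint32_t x2apic_msr_to_offset(uint32_t msr)
{
    assert(msr >= X2APIC_MSR_BASE);
    return (msr - X2APIC_MSR_BASE) * APIC_REG_STRIDE;
}

/*
 * How many bytes past the end of the APIC page become addressable when the
 * handled MSR window is [X2APIC_MSR_BASE, X2APIC_MSR_BASE + window_mask].
 */
static uint32_t bytes_past_page(uint32_t window_mask)
{
    uint32_t end = x2apic_msr_to_offset(X2APIC_MSR_BASE + window_mask)
                   + APIC_REG_STRIDE;
    return end > APIC_PAGE_SIZE ? end - APIC_PAGE_SIZE : 0;
}
```

Under these assumptions, bytes_past_page(0xff) is 0 because 256 slots of 16 bytes cover exactly one 4 KiB page, while bytes_past_page(0x3ff) is 12288, i.e. offsets reaching three pages' worth of memory beyond the virtual APIC page.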
""" Yesterday we started notifying some of our customers of a timely security and operational update we need to perform on a small percentage (less than 10%) of our EC2 fleet globally.
AWS customers know that security and operational excellence are our top two priorities. These updates must be completed by October 1st before the issue is made public as part of an upcoming Xen Security Announcement (XSA). Following security best practices, the details of this update are embargoed until then. The issue in that notice affects many Xen environments, and is not specific to AWS. """
@@ -2189,6 +2190,11 @@
         *msr_content = vcpu_vlapic(v)->hw.apic_base_msr;
         break;
+    case MSR_IA32_APICBASE_MSR ... MSR_IA32_APICBASE_MSR + 0x3ff:
+        if ( hvm_x2apic_msr_read(v, msr, msr_content) )
+            goto gp_fault;
+        break;
+
     case MSR_IA32_CR_PAT:
         *msr_content = v->arch.hvm_vcpu.pat_cr;
         break;
@@ -2296,6 +2302,11 @@
         vlapic_msr_set(vcpu_vlapic(v), msr_content);
         break;
+    case MSR_IA32_APICBASE_MSR ... MSR_IA32_APICBASE_MSR + 0x3ff:
+        if ( hvm_x2apic_msr_write(v, msr, msr_content) )
+            goto gp_fault;
+        break;
+
     case MSR_IA32_CR_PAT:
         if ( !pat_msr_set(&v->arch.hvm_vcpu.pat_cr, msr_content) )
             goto gp_fault;
Since KVM isn't vulnerable to this cross-domain issue, it may be useful to compare with the equivalent code in KVM and/or Linux.
(Or, for that matter, whether they considered making an ad-hoc machine code patch - based on the source patch, it looks like it would probably be doable just by changing a few bytes. I guess it's a bit risky...)
From an older post @ https://news.ycombinator.com/item?id=2791756
The first is "Method of finding a safe time to modify code of a running computer program": http://bit.ly/ksplice-1
The second is "Method of determining which computer program functions are changed by an arbitrary source code modification": http://bit.ly/ksplice-2
It would be helpful if errata announcements included documentation of the static analysis tools, code review process or automated testing techniques which identified the weakness, along with a postmortem of previous audits of relevant code paths.
What made it possible for this issue to be identified now, when the issue escaped previous analysis, audits and tests? Such process improvement knowledge is possibly more valuable to the worldwide technical community than any point fix.
Heartbleed was discovered by an external party, but this issue which affects the data of millions of users was found by the originating open-source project. Kudos to Jan for finding this cross-domain escalation.
So if you get a reboot announcement from your Xen VPS provider after today 12:00Z, you should list them here.
I haven't checked after the reboot, but I hope the MSRs I'm using can still be accessed: IA32_MPERF and IA32_APERF (to calculate real CPU MHz); IA32_THERM_STATUS and MSR_TEMPERATURE_TARGET (to calculate CPU temperatures); and MSR_TURBO_RATIO_LIMIT and MSR_TURBO_RATIO_LIMIT1 (to see turbo ratios).
I use them in the showboost, cputemp, and cpuhot MSR-based tools: https://github.com/brendangregg/msr-cloud-tools
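For readers unfamiliar with these MSRs, the usual arithmetic looks roughly like this. It is a sketch based on Intel's documented meaning of the registers, not code from the tools linked above. Reading the MSRs themselves is left out, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Average effective clock over a sampling interval. APERF counts at the
 * actual clock rate and MPERF at the base (nominal) rate, so the ratio of
 * their deltas scales the base frequency. base_mhz is supplied by the caller.
 */
static double effective_mhz(double base_mhz,
                            uint64_t aperf0, uint64_t mperf0,
                            uint64_t aperf1, uint64_t mperf1)
{
    assert(base_mhz > 0.0);
    uint64_t da = aperf1 - aperf0;   /* unsigned subtraction tolerates wraparound */
    uint64_t dm = mperf1 - mperf0;
    if (dm == 0)
        return 0.0;                  /* no reference cycles elapsed */
    return base_mhz * (double)da / (double)dm;
}

/*
 * Core temperature in degrees C: the target (TjMax) in bits 23:16 of
 * MSR_TEMPERATURE_TARGET minus the digital readout in bits 22:16 of
 * IA32_THERM_STATUS, which reports degrees below the target.
 */
static int core_temp_c(uint64_t temperature_target, uint64_t therm_status)
{
    int tjmax = (int)((temperature_target >> 16) & 0xff);
    int below = (int)((therm_status >> 16) & 0x7f);
    return tjmax - below;
}
```

For example, if APERF advanced by 3000 counts while MPERF advanced by 2000 on a 2000 MHz base clock, effective_mhz returns 3000.0. A guest can only compute these if the hypervisor still lets the underlying MSR reads through.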
Got a reasonably long list of VPS providers to submit tickets to.
---
This seemingly looks like a serious problem, but if we think a little bit about the practical impact the conclusion might be quite different.
First, there are really no secrets or keys in the hypervisor memory that might make a good target for an exploit here. The Xen hypervisor does not do encryption, nor does it deal with any storage subsystems. Also, there is no explicit guest memory content intermixed with the hypervisor code and data.
But one place to see pieces of potentially sensitive data are the Xen internal structures where the guest _registers_ are stored whenever the guest execution is interrupted (e.g. because of a trap). These registers might contain e.g. (parts of) keys or other secrets, if the guest was executing some sensitive crypto operation just before it got interrupted.
The vulnerability allows reading only a few kB of the hypervisor memory, with only relative addressing from the emulated APIC registers page, whose address is not known to the attacker. Still, for exactly the same systems (same binaries running, same ACPI tables, etc.) it's likely that the attacker would be able to guess the address of the APIC page. However, it is much less probable that she would be able to predict which Xen structures are located in the adjacent memory. Even less would the attacker be able to control which structures are located there, as there don't seem to be many ways in which a malicious HVM might significantly affect the layout of the hypervisor heap (e.g. force arch_vcpu structures of interesting domains to appear nearby).
Nevertheless, it might happen, by pure coincidence, that an arch_vcpu structure with a content of an interesting VM will just happen to be located adjacently to the emulated APIC page.
In that case, the next problem for the attacker would be lack of control and knowledge over the target VM execution: even if the attacker were somehow lucky to find the other VM's register-holding-structure adjacent to the APIC page, it would still be unclear what the target VM was executing at the time it was suspended and so, whether the registers stored in the structure are worthwhile or not.
It is thinkable that the attacker might attempt to use some form of a heuristic, such as e.g. "if RIP == X, then RAX likely contains (parts of) the important key", hoping that this specific RIP would signify a specific interesting instruction (e.g. part of some crypto library) being executed while the VM was interrupted, and so the key is to be found in one of the registers.
But the attacker's memory reading exploit doesn't offer a comfort of synchronization, so even though the attacker might be so extremely lucky as to find out that
*(apic_page + guessed_offset_to_rip) == X
(the attacker here assumes the 'guessed_offset_to_rip' is the distance between the APIC page and the address where RIP is stored in the presumable arch_vcpu structure, that presumably is located adjacently), still there is no guarantees that the next read to *(apic_page + guest_offset_to_rax) will return the content of RAX from the same moment that RIP was snapshot (and which the attacker considered interesting).
Arguably the attacker might try to fire up the attack continuously, thus increasing chances of success. Assuming this won't cause system to crash due to accessing non-mapped memory, this might sound like a somehow good strategy.
However, in the case of a desktop system like Qubes OS, the attacker has very limited control over other domains. Unlike when attacking a VM playing the role of a Web server, for instance, the attacker probably won't be able to force the target VMs to do lots of repeated crypto operations, nor choose the moments when the target VM traps.
It seems like exploiting this bug in an IaaS scenario might be more practical, though, as the attacker also has some control of domain creation/termination, so can affect Xen heap to some extent. But on a system like Qubes OS, it seems unlikely.
So, are we doomed? We likely are, but probably not because of this bug.
---
Given that writes are no-ops, I don't understand this mechanism. Can someone explain it, please?