I wonder what software redesigns he has in mind. As far as I can tell, best practices are already trending toward only one trust zone per address space. Some might argue that that's the whole point of multiple address spaces. I suspect that Spectre will accelerate this trend.
I do know how difficult this kind of change can be. The example I have in mind started before Spectre, and is unique to one platform. On Windows, developers of third-party screen readers for the blind are going through a painful transition where they can no longer inject code into application processes in order to make numerous accessibility API calls with low overhead. This change particularly impacts the way screen readers have been making web pages accessible since 1999. For the curious, here's a blog post on this subject: https://www.marcozehe.de/2017/09/29/rethinking-web-accessibi...
All these isolation changes are ostensibly for "security", but I suspect DRM is at least part of the motivation; corporations want to be able to silo content more and restrict the free flow thereof. To a user, a screen reader is a benevolent helper; to them, it's a malicious "attack", a way of extracting and consuming content that they may not want.
Source: I'm the dev who built the foundation for this transition in Gecko.
But Spectre variant 1 is really a consequence of the CPU working correctly. For a large number of branches, perhaps most, we want loads to proceed during speculative execution. This is because the code accesses the same or closely related data on both sides of the branch, so priming the caches during speculation is very valuable even when the branch is mispredicted.
I remember reading a study of different binary search implementations which is probably the clearest example of this: when the data is laid out in a heap layout (with child nodes next to each other in an array) the branchy variant of the code performs better than the branchless variant due to this cache priming effect.
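For anyone who hasn't seen the two variants side by side, here is a minimal sketch in C. I don't have the study's code, so the function names and the 1-based array convention are my own assumptions, and whether the second loop body really compiles to branch-free code depends on the compiler and target.

```c
#include <stddef.h>

/* Copy sorted[0..n-1] into out[1..n] in heap order: the children of
   node k live at 2k and 2k+1, and out[0] is unused.
   Call with i = 0 and k = 1; returns how many elements were placed. */
static size_t build_heap_layout(const int *sorted, int *out,
                                size_t n, size_t i, size_t k)
{
    if (k <= n) {
        i = build_heap_layout(sorted, out, n, i, 2 * k);     /* left subtree  */
        out[k] = sorted[i++];                                /* this node     */
        i = build_heap_layout(sorted, out, n, i, 2 * k + 1); /* right subtree */
    }
    return i;
}

/* Branchy variant: the direction is a data-dependent branch, so the CPU
   predicts it and may start loading the predicted child early. */
static int contains_branchy(const int *heap, size_t n, int key)
{
    size_t k = 1;
    while (k <= n) {
        if (heap[k] == key)
            return 1;
        if (key < heap[k])
            k = 2 * k;
        else
            k = 2 * k + 1;
    }
    return 0;
}

/* Branchless variant: the direction is computed arithmetically, so the
   next load has to wait for the comparison result. */
static int contains_branchless(const int *heap, size_t n, int key)
{
    size_t k = 1;
    while (k <= n)
        k = 2 * k + (heap[k] < key);  /* 0 = went left, 1 = went right */

    /* Walk back up to the last node where we went left: drop the
       trailing right turns, then the left turn itself. */
    while (k & 1)
        k >>= 1;
    k >>= 1;

    /* k == 0 means every element was smaller than key. */
    return k != 0 && heap[k] == key;
}
```

The only data-dependent branch left in the second version is the loop exit, which is easy to predict. That is exactly why it loses the speculative loads of the child nodes that the branchy version gets for free.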
What CPU designers could and should probably help with is providing instructions to cheaply mark the (comparatively few!) cases where this speculative execution behaviour leaks secret information.
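Until such instructions exist, the marking has to be done in software. Below is a rough sketch of the classic variant 1 shape and one way to defuse it by clamping the index with arithmetic instead of relying on the branch. The table names, sizes and the HIDE_FROM_OPTIMIZER macro are hypothetical, the masking trick only works as written for power-of-two table sizes, and the empty inline asm is a GCC/Clang idiom rather than anything portable.

```c
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 16                /* hypothetical; must be a power of two */
static uint8_t table[TABLE_SIZE];    /* data the bounds check protects       */
static uint8_t probe[256 * 64];      /* second array indexed by loaded data  */

/* Make the compiler forget what it knows about v. Without this it can
   prove the mask below is redundant inside the bounds check and delete it.
   Assumption: GCC or Clang; on other compilers this sketch does nothing. */
#if defined(__GNUC__) || defined(__clang__)
#define HIDE_FROM_OPTIMIZER(v) __asm__ __volatile__("" : "+r"(v))
#else
#define HIDE_FROM_OPTIMIZER(v) ((void)0)
#endif

/* Vulnerable shape: if the branch is mispredicted, both loads can run
   speculatively with an out-of-range index, and the second load leaves
   a cache footprint that depends on the out-of-bounds byte. */
uint8_t lookup_unsafe(size_t untrusted_index)
{
    if (untrusted_index < TABLE_SIZE)
        return probe[table[untrusted_index] * 64];
    return 0;
}

/* Marked shape: the index is clamped by a data dependency, so even a
   mispredicted branch can only read inside table[]. */
uint8_t lookup_masked(size_t untrusted_index)
{
    if (untrusted_index < TABLE_SIZE) {
        size_t safe = untrusted_index;
        HIDE_FROM_OPTIMIZER(safe);
        safe &= (size_t)(TABLE_SIZE - 1);
        return probe[table[safe] * 64];
    }
    return 0;
}
```

Both functions return the same values architecturally; the difference is only in what a mispredicted branch is able to touch. As far as I know, the Linux kernel's array_index_nospec is the same idea applied by hand at each suspect site.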
How can we, as software developers, find these cases in our multi-megabyte code bases, and how can we be sure we haven't missed any?
One of the most common ways major ad networks get compromised to the extent that they serve malware to hundreds of thousands of web users (this happens at least once a year) is that they hotlink to JS libraries, that hotlink to JS libraries, that hotlink to more JS libraries.
If you use a script blocker, it's not that uncommon to see that once you get down far enough, scripts are being loaded from bare IP addresses rather than domain names. Every now and again, someone compromises one of these deep-nested hotlinked JS files and maliciously modifies the JavaScript, and random sites all over the web dutifully serve the malware.
It's not that I don't trust the first-party website owners, more like I don't trust their friend's friend's friend.
EDIT: I would love a list of minimum required scripts for certain sites. It's painful to fight through to what I need, and I really resent it when I am a PAYING FUCKING CUSTOMER.
I don't see a problem with that. "Web applications" are inherently untrusted code. If it were not for untrusted code these attacks would not be an issue, so it doesn't seem unfair for a mitigation to negatively affect them.
In particular, browsers could always run JS in a separate process that's appropriately virtualized (i.e. has limited access to host information and resources).
That is, we seem to be plagued by misplaced trust more so than untrusted applications.
The analogy to civil engineering is we trust building makers. Few of us enter buildings we don't trust to stay up around us.
Look what happened after the VW diesel scandal ('dieselgate'): VW had to pay for repairs, and pay buyers (my friend bought one of the cars and got about $6k IIRC). Some people even went to jail.
Intel (or any other CPU maker) will probably not suffer similar fates. This situation is a bit different, because they may not have known about the problem. Still, everyone who bought a CPU is going to get a 10-30% performance haircut because they made a mistake. And Intel isn't going to have to pay for it.
Per dictionary.com, the legal definition of negligence is "the failure to exercise that degree of care that, in the circumstances, the law requires for the protection of other persons or those interests of other persons that may be injuriously affected by the want of such care."
What Intel did was not recognize that a specific attack possibility existed. Nobody else recognized it either, for a decade. That's not negligence. That's failure to be omniscient.
In many ways, Spectre is one more kind of attack on code that doesn’t properly separate validating untrusted input from acting on that input, except that unlike overruns and TOCTOU races, this one is microarchitectural.
> The second thing is that it’s not just about speculation. We now live in a world with side channels in microarchitectures that leave no real trace in the machine’s architectural state. There is already work on leaks through prefetching, where someone learns about your activity by observing how it affected a reverse-engineered prefetcher. You can imagine similar attacks on TLB state, store buffer coalescing, coherence protocols, or even replacement policies. Suddenly, the SMT side channel doesn’t look so bad.