undefined | Better HN

0 pointsadrian_b4y ago0 comments

Instruction fusion is the magical rescue invoked by all those who believe that the RISC-V ISA is well designed.

Instruction fusion has no effect on code size, but only on execution speed.

For example RISC-V has combined compare-and-branch instructions, while the Intel/AMD ISA does not have such instructions, but all Intel & AMD CPUs fuse the compare and branch instruction pairs.

So there is no speed difference, but the separate compare and branch instructions of Intel/AMD remain longer at 5 bytes, instead of the 4 bytes of RISC-V.

Unfortunately for RISC-V, this is the only example favorable for it, because for a large number of ARM or Intel/AMD instructions RISC-V needs a pair of instructions or even more instructions.

Fusing instructions will not help RISC-V with the code density, but it is the only way available for RISC-V to match the speed of other CPUs.

Even if instruction fusion can enable an adequate speed, implementing such decoders is more expensive than implementing decoders for an ISA that does not need instruction fusion for the same performance

0 comments

snvzz4y ago

>Unfortunately for RISC-V, this is the only example favorable for it, because for a large number of ARM or Intel/AMD instructions RISC-V needs a pair of instructions or even more instructions.

Yet, as many pointed out to you already, RISC-V has the highest code density of all contemporary 64bit architectures. And aarch64, which you seem to like, is beyond bad.

>but it is the only way available for RISC-V to match the speed of other CPUs.

Higher code density and lack of flags helps the decoder a big deal. This means it is far cheaper for RISC-V to keep execution units well fed. It also enables smaller caches and conversely higher clock speeds. It's great for performance.

This, if anything, makes RISC-V the better ISA.

>Even if instruction fusion can enable an adequate speed, implementing such decoders is more expensive than implementing decoders for an ISA that does not need instruction fusion for the same performance

Grasping at straws. RISC-V has been designed for fusion, from the get-go. The cost of doing fusion with it has been quoted to be as low as 400 gates. This is something you've been told elsewhere in the discussion, but that you chose to ignore, for reasons unknown.

avianes4y ago

I see that you are pretty active here in debunking anti-RISC-V attacks, thanks for that! There are a bunch of poor criticisms about RISC-V.

> This is something you've been told elsewhere in the discussion, but that you chose to ignore, for reasons unknown.

I would call it RISC-V bashing.

Everyone loves to hate RISC-V, probably because it's new and heavily hyped.

It is really common to see irrelevant and uninformed criticism about RISC-V. The article, which seems to be enjoyed by the HN audience, literally says: "I believe that an average computer science student could come up with a better instruction set that Risc V in a single term project". How can anyone say such a thing about a collaborative project of more than 10 years, fed by many scientific works and projects and many companies in the industry?

I do not mean that RISC-V is perfect, there are some points which are source of debate (e.g. favoring a vector extension rather than the classic SIMD is a source of interesting discussion). But I would appreciate on HN to read better analysis and more interesting discussions.

imtringued4y ago

The more I think about CPU implementations the more I think that what RISC-V is doing isn't as bad as many people think. Everyone is going "more instructions = worse".

But the truth is that if you can build a CPU that fetches an infinite number of instructions per cycle, your biggest bottleneck isn't going to be the number of instructions, it's going to be unpredicted branches, jumps and function calls because fetching the entire function + everything behind it doesn't help, if you're going somewhere else. But the opposite is also true, adding more instructions than you need doesn't hurt as much as many people seem to think.

In practice the code density of RISC-V is not significantly worse than other architectures. So we don't even have to imagine an infinitely large fetcher, a finite fetcher that is bigger than what x86 CPUs have is good enough.

avianes4y ago

> But the truth is that if you can build a CPU that fetches an infinite number of instructions per cycle, your biggest bottleneck isn't going to be the number of instructions, it's going to be unpredicted branches

Note that: Within Superscalar processors a group of instructions that are decoded at the same cycle is called a decoding group.

Branching is a problem, but the branch predictors do an excellent job. (especially for function calls which are very well predicted by the RAS [Return Address Stack]) But the biggest bottleneck to fetch a large instruction group is decoding.

Especially the instruction size decoding. An ISA like RISC-V or ARM that drastically reduces the possible instruction sizes is a big advantage to decode large instruction groups.

And dependencies between instructions within the decoding group is also a concern. For example, register renaming will quickly require several cycles (several stages) when the decoding group scales up. RISC-V also addresses this since the register indexes are easily decoded earlier and the number of registers used can also be quickly decoded.

And you're right, these are topics that are rarely addressed by RISC-V detractors.

foxfluff4y ago

> How can anyone say such a thing about a collaborative project of more than 10 years, fed by many scientific works and projects and many companies in the industry?

Well the statement you quoted might be exaggerating things quite a bit but you're also just handwaving. The base ISA isn't a result of 10 years of industry experts doing their best; it's an academic project and a large proportion of it carried out by students:

> Krste Asanović at the University of California, Berkeley, had a research requirement for an open-source computer system, and in 2010, he decided to develop and publish one in a "short, three-month project over the summer" with several of his graduate students. [..] At this stage, students provided initial software, simulations, and CPU designs

The ISA specification was released in 2011! In one year! Of course there's been revisions since then, the most substantial being 2.0 in 2014 (I think). But if you look at the changes and skip anything that isn't just renaming / reordering / clarifying things, it always fits on half a page. It's by and large the same ISA, with some nice finetuning.

And here's the thing, a lot of people who originally read the spec felt like it is what it looks like, a "textbook isa", very much the kind of thing a group of students might come up with (I wonder why?).. just taken to completion. And what I remember from the spec (I read it long ago) is that cost of implementation was almost always a primary concern (and that high performance inmplementations would have to work harder but shrug itsnobigdealright?): it smelled like a small academic ISA tuned specifically for cheap microcontrollers. Not an ISA designed by industry experts for high performance cores. But the hype party is trying to sell it as a solution for all your computing needs, and almost seem to claim that no tradeoffs have been made against performance? And this is on a submission about performance, which is of course a subject a lot of people find interesting...

So I think there's very much reason to be critical of and discuss the ISA. Some critique may come from wrong assumptions, but being critical is not just bashing, and calling (attempts at) technical criticism uninformed & irrelevant and handwaving it away with 10 years of hype isn't helping the discussion. Better contribute that better analysis you refer to. (Unfortunately it seems like mostly everyone is just arguing without posting technical analysis)

avianes4y ago

> Some critique may come from wrong assumptions, but being critical is not just bashing

Focusing criticism only on this architecture, producing criticism without any solid argument, putting aside all the positive aspects, criticism when you clearly lack of expertise, sending similar criticisms in several conversations when already debunked on multiple occasions.. This is really systematic on discussion about RISC-V. That's what I call bashing, I cannot call that legitimate, correct or constructive criticism.

> The ISA specification was released in 2011! In one year! [...] But if you look at the changes and skip anything that isn't just renaming / reordering / clarifying things, it always fits on half a page.

I encourage anyone to open the original document published in 2011 [1] and compare it with current RISC-V specification documents [2].

There is very little left from the 3 month student work, it mainly remains the philosophy which is probably highly influenced by project supervisors. Moreover, it mainly consists of the basic RISC-V ISA which is indeed designed to be simple and minimalist, whereas the current RISC-V full spec. consists of a multitude of extensions.

At this stage your statement and the statement from the email is not just exaggerated, it is pure misinformation.

> Not an ISA designed by industry experts for high performance cores

Okay, the industry wasn't involved as much as today in the beginning. But RISC-V is really the product of experts in the field of high performance architectures

[1] https://www2.eecs.berkeley.edu/Pubs/TechRpts/2011/EECS-2011-...

[2] https://riscv.org/technical/specifications/

1 more reply

audunw4y ago

> Even if instruction fusion can enable an adequate speed, implementing such decoders is more expensive than implementing decoders for an ISA that does not need instruction fusion for the same performance

I'm very skeptical that a RISC-V decoder would be much more complex than an X86 one, even with instruction fusion. For the simpler fusion pairs, decoding the fused instructions wouldn't be more complex than matching some of the crazy instruction encoding in X86.

For ARM I'm not so sure, but RISC-V does have very significant instruction decoding benefits over ARM too, so my guess would be that they'd be similar enough.

socialdemocrat4y ago

You ignore compressed instructions on RISC-V which 64-bit ARM does not have.

And if you compare 32-bit CPUs then RISC-V has twice as many registers reducing the number of instructions needed to read and write from memory.

RISC-V branching takes less space and so does vector instructions. There are many case like that which adds up end results in RISC-V having the most dense ISA in all studies when using compressed instructions.

Dylan168074y ago

> Even if instruction fusion can enable an adequate speed, implementing such decoders is more expensive than implementing decoders for an ISA that does not need instruction fusion for the same performance

On the other hand just splitting up x86 instructions is very expensive, and decoding in general takes a lot of work before you even start to do fancy tricks.

j / k navigate · click thread line to collapse

0 comments

snvzz4y ago

>Unfortunately for RISC-V, this is the only example favorable for it, because for a large number of ARM or Intel/AMD instructions RISC-V needs a pair of instructions or even more instructions.

Yet, as many pointed out to you already, RISC-V has the highest code density of all contemporary 64bit architectures. And aarch64, which you seem to like, is beyond bad.

>but it is the only way available for RISC-V to match the speed of other CPUs.

This, if anything, makes RISC-V the better ISA.

avianes4y ago

I see that you are pretty active here in debunking anti-RISC-V attacks, thanks for that! There are a bunch of poor criticisms about RISC-V.

> This is something you've been told elsewhere in the discussion, but that you chose to ignore, for reasons unknown.

I would call it RISC-V bashing.

Everyone loves to hate RISC-V, probably because it's new and heavily hyped.

imtringued4y ago

The more I think about CPU implementations the more I think that what RISC-V is doing isn't as bad as many people think. Everyone is going "more instructions = worse".

avianes4y ago

Note that: Within Superscalar processors a group of instructions that are decoded at the same cycle is called a decoding group.

Especially the instruction size decoding. An ISA like RISC-V or ARM that drastically reduces the possible instruction sizes is a big advantage to decode large instruction groups.

And you're right, these are topics that are rarely addressed by RISC-V detractors.

foxfluff4y ago

> How can anyone say such a thing about a collaborative project of more than 10 years, fed by many scientific works and projects and many companies in the industry?

avianes4y ago

> Some critique may come from wrong assumptions, but being critical is not just bashing

I encourage anyone to open the original document published in 2011 [1] and compare it with current RISC-V specification documents [2].

At this stage your statement and the statement from the email is not just exaggerated, it is pure misinformation.

> Not an ISA designed by industry experts for high performance cores

Okay, the industry wasn't involved as much as today in the beginning. But RISC-V is really the product of experts in the field of high performance architectures

[1] https://www2.eecs.berkeley.edu/Pubs/TechRpts/2011/EECS-2011-...

[2] https://riscv.org/technical/specifications/

1 more reply

audunw4y ago

> Even if instruction fusion can enable an adequate speed, implementing such decoders is more expensive than implementing decoders for an ISA that does not need instruction fusion for the same performance

For ARM I'm not so sure, but RISC-V does have very significant instruction decoding benefits over ARM too, so my guess would be that they'd be similar enough.

socialdemocrat4y ago

You ignore compressed instructions on RISC-V which 64-bit ARM does not have.

And if you compare 32-bit CPUs then RISC-V has twice as many registers reducing the number of instructions needed to read and write from memory.

Dylan168074y ago

> Even if instruction fusion can enable an adequate speed, implementing such decoders is more expensive than implementing decoders for an ISA that does not need instruction fusion for the same performance

On the other hand just splitting up x86 instructions is very expensive, and decoding in general takes a lot of work before you even start to do fancy tricks.

j / k navigate · click thread line to collapse