undefined | Better HN

0 pointsahl4y ago0 comments

You mean the total lack of comments didn't help? ;-)

SPARC has two attributes (I hesitate to call them features) that this code interacts with: register windows and a delay slot. Register windows are a neat idea that leads to some challenging pathologies, but in short: the CPU has a bunch of registers, say 64, only 24 of which are visible at any moment. There are three classes of windowed registers: %iN (inputs), %lN (local), %oN (output). When you SAVE in a function preamble, the register windows rotate such that the callers %os become your %is and you get a new set of %ls and %os. There are also 8 %gN (global) registers. Problems? There's a fixed number of registers so a bunch of them effectively go to waste; also spilling and filling windows can lead to odd pathologies. The other attribute is the delay slot which simply means that in addition to a %pc you have an %npc (next program counter) and the instruction after a control flow instruction (e.g. branch, jmp, call) is also executed (usually, although branches may "annul" the slot).

This code is in DTrace where we want to know the value of parameters from elsewhere in the stack, but don't want to incur the penalty of a register window flush (i.e. writing all the registers to memory). This code reaches into the register windows to pluck out a particular value. It turns out that for very similar use cases, Bryan Cantrill and I devised the same mechanism completely independently in two unrelated areas of DTrace.

How's it work?

We rotate to the correct register window (note that this instruction is in the delay slot just for swag): https://github.com/illumos/illumos-gate/blob/master/usr/src/...

Then we jmp to %g3 which is an index into the table of instructions below (depending on the register we wanted to snag): https://github.com/illumos/illumos-gate/blob/master/usr/src/...

The subsequent instruction is a branch always (ba) to the next instruction. So:

%pc is the jmp and %npc is the ba. The jmp sets %npc to an instruction in dtrace_getreg_win_table and %pc is the old %npc thus points to the ba. The ba sets %npc to be the label 3f (the wrpr) and %pc is set to the old %npc, the instruction in the table. Finally the particular mov instruction from the table is executed and %pc is set to the old %npc at label 3f.

Why do it this way? Mostly because it was neat. This isn't particularly performance critical code; a big switch statement would probably have worked fine. In the Solaris kernel group I remember talking about interesting constructions like this a bit around the lunch table which is probably why Bryan and I both thought this would be a cool solution.

I'm not aware of another instance of instruction picking in the illumos code base (although the DTrace user-mode tracing does make use of the %pc/%npc split in some cases).

0 comments

richardfey4y ago

It would be nice to add this to the code in form of a comment, for historical preservation reasons

j / k navigate · click thread line to collapse