Despite the relatively thin and incomplete coverage of the material, I've heard from several people over the years who appreciated the work as a nice introduction to the topic and even once received a job offer because of it (which didn't work out, for a variety of reasons). All things considered, if I had to change anything it would be the title to make it a little more focused. It's not really a book about disassembly so much as it is an introduction to what high-level language features look like when translated into non-optimized x86 assembly code. Find me a short, catchy title that accurately describes that, and you win some kind of prize.
I doubt I'll ever get back to this book either. I haven't worked with this material at all since school, and don't feel like I have up-to-date knowledge of the subject. Unless somebody else wants to jump in and fill it out, it will probably stay the way you see it now.
I'm glad to see that this book is still around and I'm glad that people are benefiting from it in some small way. I know it doesn't cover nearly what would be needed for a real book on the subject (I do still recommend Eilam's Reversing for book-o-philes) but I think it should be a decent stepping stone to pique interest and get people moving on towards more in-depth treatments.
http://reocities.com/SiliconValley/heights/7052/opcode.txt
The 8080/8085/Z80 instruction sets also look much better in octal:
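To make that concrete, here is a minimal sketch (mine, not taken from the linked file) of why octal suits the 8080: the register-to-register MOV group is laid out as `01 ddd sss`, so each octal digit of the opcode is one field. The helper name `decode_mov` is made up for illustration, and it only handles this one opcode group.

```python
# 3-bit register field values on the 8080; M means "memory at (HL)".
REGS = ["B", "C", "D", "E", "H", "L", "M", "A"]

def decode_mov(opcode):
    """Decode a byte from the 8080's 01 ddd sss group (hypothetical helper)."""
    group = opcode >> 6          # top octal digit
    dst = (opcode >> 3) & 0o7    # middle octal digit: destination
    src = opcode & 0o7           # low octal digit: source
    if group != 0o1:
        raise ValueError(f"{opcode:03o} is not in the MOV group")
    if dst == 6 and src == 6:
        return "HLT"             # octal 166 sits where MOV M,M would be
    return f"MOV {REGS[dst]},{REGS[src]}"

for byte in (0x78, 0x41, 0x7E, 0x76):
    print(f"hex {byte:02X} = octal {byte:03o} -> {decode_mov(byte)}")
```

In hex, 78 and 41 look unrelated; in octal, 170 and 101 read directly as "group 1, destination, source".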
It does not surprise me that something so simple would be so well overlooked (or, at least, "forgotten"). I wonder if I ever would have figured this out from my own readings and experiments. Doubtful.
Great tip!
I'm currently using [0], which is quite good, as a helping hand among other resources; my maybe not-so-interesting results are at [1].
For example, to get a good overview of instruction encoding you have to 0/ read through and discard horrible blog posts, 1/ find the correct Intel manual, 2/ search and read through thousands of PDF pages until you find something interesting, and 3/ understand the environment and the facts that are either only implied or are in the documents but not easy to find.
For the handful of lines of actual code I wrote yesterday [1] I still have around 25 tabs open. Complexity and no end in sight.
Do you have any recommendations and hints as to where to start with this in the year 2015?
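As one small entry point into instruction encoding, here is a sketch of splitting an x86 ModRM byte into its three fields (2 bits of mod, 3 bits of reg or opcode extension, 3 bits of r/m). The function name `split_modrm` is hypothetical, and the example covers only this one byte of the format, not prefixes, SIB, or displacements.

```python
# 32-bit register numbering used in the reg and r/m fields.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def split_modrm(byte):
    """Return (mod, reg, rm) for a ModRM byte (hypothetical helper)."""
    mod = (byte >> 6) & 0o3   # addressing mode; 3 means register-direct
    reg = (byte >> 3) & 0o7   # register number or opcode extension
    rm = byte & 0o7           # register or memory operand selector
    return mod, reg, rm

# "sub esp, 4" assembles to 83 EC 04: opcode 83, ModRM EC, imm8 04.
mod, reg, rm = split_modrm(0xEC)
print(f"ModRM EC = octal {0xEC:03o}: mod={mod} ext={reg} rm={rm} ({REG32[rm]})")
```

Here mod=3 selects a register operand, the middle field 5 is the opcode extension that picks SUB for opcode 83, and rm=4 names esp.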
[0] http://0xax.blogspot.se/p/assembly-x8664-programming-for-lin...
* comprehensible, but far from complete (some blogs)
* complete, but hard to understand and requiring some implicit knowledge (Intel manual or [1])
Rather than a disassembler, I recommend writing a simple JIT compiler, with [2] as a starting point. You skip some problems this way.
[1] http://ref.x86asm.net/ this seems pretty cool as a reference, but I can't wrap my head around it
[2] http://eli.thegreenplace.net/2013/11/05/how-to-jit-an-introd...
You have to get over it and keep going anyways or you'll never become one of those people yourself! (Or even be able to make an informed decision about whether you want to).
Not coincidentally: part of the point of the company we just started. :)
[1] http://rada.re/
sub esp, 4
mov DWORD PTR SS:[esp], eax
A brief skim of Intel's "Software Developer’s Manual" (particularly ch. 6 on stacks) didn't turn up an answer. While hitting the ALU just for `sub` might be an extra step, doesn't hitting RAM make that a drop in the bucket? (`sub` may account for less than 1%?) Or is there some caching going on, so RAM may be updated in the background?
(I'm not an assembly programmer; very ignorant of what's happening.)
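For readers who want the comparison spelled out, here is a toy model (an illustration only, not a timing measurement) of `push eax` next to the explicit sub/mov pair. It assumes 32-bit mode, and the `Cpu` class and its starting values are invented for the sketch. The byte encodings are the standard ones: `50` for `push eax`, `83 EC 04` for `sub esp, 4`, and `89 04 24` for `mov [esp], eax`.

```python
PUSH_EAX = bytes([0x50])                                   # push eax
SUB_THEN_MOV = bytes([0x83, 0xEC, 0x04, 0x89, 0x04, 0x24])  # sub esp,4 / mov [esp],eax

class Cpu:
    """Toy state: ESP, EAX, stack memory, and whether flags were overwritten."""
    def __init__(self, esp=0x1000, eax=0xDEADBEEF):
        self.esp = esp
        self.eax = eax
        self.mem = {}                 # address -> 32-bit value
        self.flags_clobbered = False

def push_eax(cpu):
    # push decrements ESP and stores in one instruction; it leaves the flags alone.
    cpu.esp = (cpu.esp - 4) & 0xFFFFFFFF
    cpu.mem[cpu.esp] = cpu.eax

def sub_then_mov(cpu):
    cpu.esp = (cpu.esp - 4) & 0xFFFFFFFF   # sub esp, 4
    cpu.flags_clobbered = True             # sub is an ALU op, so it writes the flags
    cpu.mem[cpu.esp] = cpu.eax             # mov DWORD PTR SS:[esp], eax

a, b = Cpu(), Cpu()
push_eax(a)
sub_then_mov(b)
print("same stack state:", (a.esp, a.mem) == (b.esp, b.mem))
print("code bytes:", len(PUSH_EAX), "vs", len(SUB_THEN_MOV))
```

Both sequences leave ESP and the stack contents identical; the visible differences are code size (1 byte versus 6) and the fact that `sub` overwrites the flags. Which one runs faster depends on the microarchitecture, which the model does not try to capture.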
Memory reads/writes do take a few more cycles to complete, but since this is a write, the CPU can continue on with other non-dependent instructions following it. All the above information assumes a CPU based on P6 and its successors (Core, Nehalem, Sandy Bridge, Ivy Bridge, Haswell, etc.); NetBurst and Atom are very different.
Linus also has some interesting things to say about using the dedicated stack instructions: http://yarchive.net/comp/linux/pop_instruction_speed.html
Somewhat amusingly, GCC was well known to generate the explicit sub/mov instructions by default, while most other x86 C compilers I knew of, including MSVC and ICC, would always use push.
Beyond that, the best explanations I've seen are that the push/pop instructions are "optimized" in some way. I don't know if that means they are more optimized than just making the inc/read pair on the ESP atomic so it doesn't stall or if there is more to it.
http://www.agner.org/optimize/microarchitecture.pdf
§7.7.