This code smells of desperation | Better HN

89 comments

56 comments · 16 top-level

michaelcampbell2y ago· 14 in thread

The removal of "This..." from the title here really confuses it.

With "This", it's obvious the title is "(This code) smells of desperation". The submitted title is ambiguous; it could mean "(Code smells) of desperation".

The submitted title might very well have been "This code smells of desperation". HN has some strange rules that edit submitted titles, like stripping "10" or "How to".

kgwgk2y ago

It was already annoying that titles can be changed by a moderator without leaving any trace and now titles are also being mutilated automatically at submission time into something that may or may not make sense. At least in this case the submitter may notice what happened and change it back.

dang2y ago

They would seem less strange if you saw a list of the baity titles that get de-baited that way. Maybe we should publish that.

"How to" doesn't get edited out. Certain other leading hows do.

graton2y ago

This comment reminds me of those Amazon reviews that give a product 1 star because Amazon had a shipping issue. Yeah, sorry the product wasn't able to get to you, but you aren't helping me figure out whether the product itself is any good.

russh2y ago

Like the time I ordered some slate coasters but received an envelope of slate chips and fine powder. I was not able to leave a review that mentioned that the seller just wrapped the coasters in a paper towel and tossed the coasters in a sturdy envelope.

Lewton2y ago

I clicked expecting a joke article about Code Smells, I was a little disappointed

Thank you both for saving me early morning disappointment.

Code Smell is an industry accepted term. I am unclear how any other interpretation of the modified title could be expected.

The "smells" in the industry term "code smells" can function as both a verb and a noun.

As a noun, it can describe specific ways that hypothetical code might not follow best practices. For instance, a code fragment that has been copy-pasted many times rather than refactored into a function is a code smell. The use of many global variables is a code smell. Together, these are "code smells".

As a verb, it describes specific code which exhibits these sorts of attributes. A particular source file can smell. The code... smells. The phrase can also be used adjectively, to say that code is smelly.

The title "Code smells of desperation" could imply the noun form, in which an article discusses various code smells which could be a general indication that a hypothetical code base might be in desperate shape. Or that the team maintaining it is. It is an article about smells, in code.

Whereas, "this code smells of desperation" uses the verb form to indicate that the article is about a particular code base which appears to be in desperate shape, because of the smells it specifically gives off. It is an article about code, which smells.
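
To make the noun sense concrete, here is a minimal Python sketch of the copy-paste smell mentioned above and the refactoring that removes it. The function names and the clamp-to-percent fragment are made up for illustration; nothing here comes from the article.

```python
# Smelly version: the same clamp-and-scale fragment is pasted twice.
def report_smelly(cpu_fraction, mem_fraction):
    cpu_pct = max(0.0, min(1.0, cpu_fraction)) * 100
    mem_pct = max(0.0, min(1.0, mem_fraction)) * 100
    return cpu_pct, mem_pct


# Refactored version: the repeated fragment lives in one (hypothetical) helper.
def to_percent(fraction):
    """Clamp a fraction to [0, 1] and express it as a percentage."""
    return max(0.0, min(1.0, fraction)) * 100


def report_refactored(cpu_fraction, mem_fraction):
    return to_percent(cpu_fraction), to_percent(mem_fraction)
```

Both versions return the same values; the difference is that a later fix to the clamping rule only has to be made in one place.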

benchaney2y ago

Well, interpreting the title in that way is incorrect, so it seems like GP kind of has a point then.

You just proved OP's point since you misunderstood what the original title was.

Alternative interpretation based on the mangled title: here are things to look for in any code base which indicate the programmer was desperate.

All code smells of desperation.

picometer2y ago

I was also slightly disappointed not to find a discussion of <code smells>, but the post is interesting, and we can still discuss code smells here.

The post author (Michal Necasek) states, about the WIN87EM.DLL code:

> It bears all the hallmarks of code that was written, rewritten, rewritten again, hacked, tweaked, modified, and eventually beaten into submission even if the author(s) had no real idea why it finally worked.

From what I gather, here are those hallmarks:

- Looping a no-op action, presumably to slow things down.

- Unnecessarily performing actions multiple times. This happens for three things: (a) writing a zero to an I/O port to clear something; (b) executing an instruction to clear exceptions; and (c) repeating the aforementioned no-op loop at different points.

- Saving a status in a separate location, only to reinstate it to its original location after clearing things out.

- Communicating procedure state (an EOI, “end of interrupt”) to one entity (the master interrupt controller) but not another (the controller’s slave). Furthermore, this “end” signal was sent near the beginning of the procedure. (This final point is my own observation and not explicitly called out by the author. Perhaps it’s common and not “smelly” for interrupt handlers to do this up front.)

I’ve tried to reframe the technical terms as actions and signals in a way that could be recognizable to devs of higher-level systems. My familiarity with OS-level systems is minimal so my interpretations could be a little wrong.

But despite my lack of knowledge, and with the author’s help, it does seem clear that there were serious timing and state related bugs here. And as a dev at other levels of the stack, I can relate: it’s very hard to reason about async global state! And this code’s responsibility was handling math errors, not timing errors. It is - or, perhaps, should be - the responsibility of the OS to orchestrate these things appropriately so that math libraries can focus on math stuff.

So my takeaways, for “code smells of desperation”, would be:

- There are violations of module responsibility.

- There are modifications of process timing with no discernible reason.

- There are modifications of status/environment/state with no discernible reason.

- And finally, other experts (in this case, the post author) can’t make sense of the code.

layer82y ago· 10 in thread

This sounds like the kind of thing where Raymond Chen would write up a historically completely sensible rationale for why that code is the way it is.

bsder2y ago

Most likely culprit: AutoCAD.

AutoCAD did all manner of nasty things with floating point numbers in order to stash extra data into them. Denormals, NaNs and the like were painfully common. You had to make sure your trap handlers were fast or AutoCAD performance would suck and everybody would slag your computer.

AutoCAD was one of the banes of existence for the FX!32 guys.

Ballas2y ago

Did AutoCAD ever run inside Windows 3.xx? From my vague recollection, it was a DOS program in those days that was launched outside of Windows, or am I missing something?

If git had existed, one would anticipate the commit log for the code would read like a Lovecraftian descent into madness as the coder makes increasingly unhinged pleas to the Great Old Ones to accept the unit tests.

Having read plenty of version control system logs from around the turn of the millennium (i.e. when things were _far_ less crazy than 1987), I'd wager most of it would come in the form of commits marked “implemented EGA driver, faster AI in Reversi, improve FPU exception code”, aka “I'm done for today, committing”.

matheusmoreira2y ago

Wish this was more common. I really want to build a collection of such "unhinged religious text" commits. Right now I only know of one example: the mpv locale commit.

MBCook2y ago

Not knowing anything about x87 programming, but assuming the code is rational, I would guess this causes the code to work with a handful of _really_ broken or flakey FPUs. Or perhaps just one popular one.

anonymoushn2y ago

I've written to him to ask why huge pages cannot be used by 99% of software targeting Windows even though they speed up most software by 10-20%, but he hasn't written anything about it yet :(

layer82y ago

What do you mean “cannot be used”? Applications have to be coded specifically for huge/large pages. It’s similar on Linux, only certain applications support huge pages.

cratermoon2y ago

If there was one, he would have written it by now.

veave2y ago

It's been a few years since it seems he ran out of interesting historic things to tell.

marcosdumay2y ago· 5 in thread

The only way to have more fun than abstracting broken software is abstracting broken hardware.

I can imagine somebody spent months on those few lines of assembly.

I suppose it depends how the hardware was broken.

If you just needed a delay, this is bad code that's just been randomly iterated until it 'works'.

On the other hand, if the hardware does require such an incantation then it's impressive that someone managed to wade through the brokenness.

I'm inclined to believe it's the former though.

Palomides2y ago

it looks suspicious to me, like the kind of thing you find on page 30 of a processor errata document, something like "single writes to external device fail 0.5% of the time due to a register clearing bug on mask revisions prior to version 23. recommended workaround: write twice."

brmgb2y ago

I can tell from this post that you have probably had the good fortune of never having to work with or emulate broken hardware (which is to say every piece of hardware ever). At some point you just stop trying to be sane and just go with what works.

wruza2y ago

Otoh, these operations are too specific to come up with by random iteration. I believe it was some hardware nonsense that was both arcane and avoided by random iteration.

The delays introduced by the repeated PUSH/POPs would be quite short, even on the 8088. How would you propose making such high-precision waits for the external x87 chip? (Assuming they were needed at all.)

peterfirefly2y ago· 5 in thread

It is clearly written to not use the (F)WAIT instruction -- the "dumb" code is there to make sure the previous 80287 instruction has completed.

The first time-wasting code is long because it has to take longer than the slowest 287 instruction takes to complete after signaling an error. The other time wasters are shorter because they come after known instructions that are faster (FNSTSW just stores 2 bytes to memory, FNCLEX clears some bits inside the 287). Note also that they are FNSTSW and FNCLEX, the no-wait forms -- that means there is no implicit (F)WAIT instruction before the real 287 instruction.

Why two FNCLEX? I don't know.

Why 4 writes to port F0? Probably in case the FNSTSW and FNCLEX instructions lead to errors.

13of402y ago

> Why two FNCLEX?

There is a behavior on some CPUs where "out 0xf0" can leave IGNNE# active, but you can clear it after the "out" by running "fnclex".

Why are there two of them? Either the "out 0xf0" is affected by IGNNE# being active, or maybe the original draft had one "spin, out, fnclex" and that whole block of code was just copy+pasted when they added the second one.

anyfoo2y ago

You should write that answer as a comment to the blog post. The author of the blog is very thorough and likely to take an interest in it, if there’s anything to it.

(As an aside, why are we assuming 80287 and not 8087? I know nothing about either, so it’s quite likely that I missed obvious hints. EDIT: Ah, I guess because it’s the IRQ 13 handler specifically.)

peterfirefly2y ago

I did. Stuck in moderation. Correct, IRQ 13.

userbinator2y ago

That was my first thought too - they were trying to synchronise the CPU and FPU.

The mention of not using the wait instruction reminded me of this other post on the same site: https://www.os2museum.com/wp/learn-something-old-every-day-p...

Varriount2y ago

How does an FPU get out of sync with a CPU? Wouldn't the CPU automatically wait for the FPU logic to complete (just like with every other instruction)?

whoopdedo2y ago· 2 in thread

FPUs in the early x86 family are weird. They were typically on separate chips so you could have an 8088+8087, 80286+287, 80286+287XL (which was actually a 80387), 80386+387 (SX and DX models for 24 or 32 bit bus), 80386+287[1], 80386 or 486+Weitek[2], 80386+Weitek+387, 80486SX+80487 where the co-processor was a full CPU that disabled the main chip. And then there were the clones doing creative things such as the Nx586+587[3] which because of its lack of on-board FPU was often confused for a 386 by software and lost the advantage of its Pentium ops.

So I'm not surprised the exception handler is a mess. It's a domain built entirely out of corner-cases.

[1] https://old.reddit.com/r/retrobattlestations/comments/hj12ck...

[2] https://micro.magnet.fsu.edu/optics/olympusmicd/galleries/ch...

[3] https://en.wikipedia.org/wiki/NexGen

jftuga2y ago

A friend and I each bought 387s (physically a separate chip) for our 386s circa 1992. IIRC, I had an 80386/25 MHz with 4 MB of RAM.

I remember a tank game called Scorched Earth where you would have to set angle & power to try to hit the other person's tank. Some ordnance took 10-15 seconds to fire & complete because it was running FP ops on the CPU. Once the 387 was installed, this calculation was done almost instantly. That's about all I remember my FPU being good for. LOL good times!

EdwardDiego2y ago

The funky bomb or death's head? Especially when it was a large map with lots of ground to destroy.

Me and my siblings had a house rule to not use either when playing on our 286 because it took a minute or so to complete...

Izkata2y ago· 1 in thread

> But the code in WIN87EM.DLL looks very much like the result of changes made in desperation until it worked somehow, even though the changes made little or no sense.

This is how the characters in Coding Machines realized something was up: they found assembly instructions involving carry bits that made no sense, which they later realized was how an AI writes code. https://www.teamten.com/lawrence/writings/coding-machines/

> It took us the rest of the afternoon to pick through the convoluted jump targets and decode four consecutive instructions. That snippet, it turns out, was finding the sign of an integer. Anyone else would have done a simple comparison and a jump to set the output register to -1, 0, or 1, but the four instructions were a mess of instructions that all either set the carry bit as a side-effect, or used it in an unorthodox way.
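
For a sense of what such a carry-bit sequence could look like, here is a minimal Python sketch that models one well-known x86 idiom (`cdq` / `neg eax` / `adc edx, edx`) next to the plain comparison version. This is an assumption for illustration only: the story does not say which instructions were involved, and the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF
SIGN_BIT = 0x80000000


def sign_compare(x):
    """The 'simple comparison and a jump' version."""
    if x < 0:
        return -1
    if x > 0:
        return 1
    return 0


def sign_carry(x):
    """Model of cdq / neg eax / adc edx, edx on 32-bit registers."""
    eax = x & MASK32
    # cdq: edx becomes all ones if eax's sign bit is set, else zero.
    edx = MASK32 if eax & SIGN_BIT else 0
    # neg eax: the carry flag is set whenever the operand was nonzero.
    carry = 1 if eax != 0 else 0
    # adc edx, edx: edx = edx + edx + carry, truncated to 32 bits.
    edx = (edx + edx + carry) & MASK32
    # Reinterpret the 32-bit result as a signed integer.
    return edx - (1 << 32) if edx & SIGN_BIT else edx
```

Both functions agree for every value that fits in a signed 32-bit register; the second one just gets there with no comparison or jump, which is what makes it hard to read.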

Terr_2y ago

That reminds me of a case where an evolutionary-algorithm was being applied to FPGA circuits ("programmable" circuit layouts) with the goal of detecting the presence or absence of a particular tone. [0]

One of the results was a bizarre circuit that wasn't really digital anymore, because the pieces were arranged to exploit ways in which the digital circuit was imperfect, forming a system that was actually analog and idiosyncratic to the test environment.

[0] https://www.damninteresting.com/on-the-origin-of-circuits/

emoemwin-asm2y ago· 1 in thread

The Microsoft code leak mentioned by one of the comments has been out there for years so might as well paste it here to cut down on some of the speculation? Fair use - commercial value is zero, historical value for analysis and criticism is high.

The relevant code comments seem to be

"Fix timing problem??"

and

"486 bug - must wait till after last "out f0" to clear fp exceptions or IGNNE# will be permanently active."

    public __fpIRQ13
    __fpIRQ13:
     cli
    
     WASTE_TIME  70
    
     push    ax
     xor     al, al
     NULL_JMP
     out     0f0h, al        ; reset busy line.
     NULL_JMP
     mov     al, 65h
     NULL_JMP
     out     0a0h, al        ; EOI slave irq 5
     NULL_JMP
     mov     al, 62h
     NULL_JMP
     out     20h, al         ; EOI master irq 2
     NULL_JMP
     pop     ax
    
    
     sub     sp, 2
    
     push    bp
     mov     bp, sp
    
     fnstsw  [bp+2]
     WASTE_TIME
     push    ax
     xor     al, al
     NULL_JMP
     out     0f0h, al        ; reset busy line.
     NULL_JMP
     pop     ax
    
     pop     bp
    
    ;       fnclex                  ; 486 bug - must wait till after last
        ; "out f0" to clear fp exceptions
        ; or IGNNE# will be permanently active.
     WASTE_TIME
     push    ax
     xor     al, al
     NULL_JMP
     out     0f0h, al        ; reset busy line.
     NULL_JMP
     pop     ax
    
    ;       fnclex                  ; 486 bug - must wait till after last
        ; "out f0" to clear fp exceptions
        ; or IGNNE# will be permanently active.
     WASTE_TIME
     push    ax
     xor     al, al
     NULL_JMP
     out     0f0h, al        ; reset busy line.
     NULL_JMP
     pop     ax
    
     fnclex                  ;Now this is safe.
     WASTE_TIME 70           ;Fix timing problem??
    
     jmp     __FPEXCEPTION87P
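
A note on the two EOI writes in the listing: assuming the usual PC/AT arrangement of two cascaded 8259A interrupt controllers (slave attached to master IRQ 2, command ports 20h and A0h), the bytes 65h and 62h decode as "specific EOI" commands (60h plus the level) for slave level 5 and master level 2, which matches the comments. A minimal Python sketch of that arithmetic follows; the function and constant names are hypothetical, and this only models the values written, not real port I/O.

```python
MASTER_CMD_PORT = 0x20   # assumed PC/AT master 8259A command port
SLAVE_CMD_PORT = 0xA0    # assumed PC/AT slave 8259A command port
SPECIFIC_EOI = 0x60      # 8259A OCW2: specific EOI, level in the low 3 bits
CASCADE_LEVEL = 2        # slave is wired to master IRQ 2 on the PC/AT


def eoi_writes(irq):
    """Return the (port, value) writes that acknowledge the given IRQ."""
    if not 0 <= irq <= 15:
        raise ValueError("IRQ must be in the range 0..15")
    if irq < 8:
        return [(MASTER_CMD_PORT, SPECIFIC_EOI | irq)]
    # IRQs 8..15 live on the slave, so both controllers need an EOI.
    return [
        (SLAVE_CMD_PORT, SPECIFIC_EOI | (irq - 8)),
        (MASTER_CMD_PORT, SPECIFIC_EOI | CASCADE_LEVEL),
    ]
```

For IRQ 13 this gives a write of 65h to port A0h followed by 62h to port 20h, the same pair and order as in the handler.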

projektfu2y ago

Really interesting that it was a 486 bug, given the provenance listed in the article. Windows 3.0 was, indeed, released after the 80486 was. I am not sure why the reset busy code was repeated 3 times, I assume the bit must have been somewhat sticky.

"If an unmasked exception occurs when the numeric exception bit in CR0 is clear and the IGNNE# pin is active, the performance of the FPU will be retarded as long as the exception remains pending."

https://www.cs.earlham.edu/~dusko/cs63/prepentium.html

I wonder if that has anything to do with it all.

outside12342y ago· 1 in thread

This is what you’d run across in codebases before the internet, let alone Stack Overflow.

People didn’t have code to copy and paste, so they randomly wrote it like monkeys until it worked, based on their understanding of one page of a manual, which was literally the only documentation or description anywhere of how the system they were working with worked.

Source: I was there :)

jethkl2y ago

Add to the mix bug reports like, "This worked on the Gateway when the Epson was freshly plugged into the LPR port but crashed after the Epson had printed 5 pages. If we remove our sound card, then no more problems..." Microsoft's strategy was to support legacy and buggy hardware -- this reduced friction for OEMs and helped expand the market, but it also caused a lot of trouble.

xbar2y ago· 1 in thread

Somewhere there is a production codebase containing a particular sequence of check-ins that reflect the peak of my similar flailings.

I am not proud of my desperation, but I can acknowledge it now.

"This time for sure!"

h2odragon2y ago

those opening "wtf" sequences might be there as filler space; harmless instructions with a known pattern where you can come back later and insert different instructions. Most people use NOPs for that but perhaps they wanted a different signature or needed 3 separate, differentiated patch points at entry. Or maybe they wanted to help sell more 8087 chips.

Anybody recall if there was a notable performance difference between Borland's FP emulation lib and M$, then? My habit at the time was to religiously avoid all floats, to the point of shipping a home made arbitrary precision BCD math library. It was no faster than anything else but it gave the same results for the same inputs, every time on every machine.

I've inherited a similar bit of code that kicks in right after pivot (of Linux boot) and tries to disassemble and clean up whatever storage was concocted by the previous steps during boot, and then proceeds to assemble it using some user-supplied layout.

The code is awful, but, really, if anyone's to blame, it's the Linux people who never cared to systematize and unify system's understanding and representation of storage.

quickthrower22y ago

Got rabbit holed... I love this ad - https://www.os2museum.com/wp/os2-history/os2-beginnings/1987... - it is a sort of weird mixture of Steve Jobs's Apple smooth talking and desperate street seller at the same time.

I'm not nearly expert enough to judge, but to me it smells like heavy wizardry.

readyplayernull2y ago

"Desperation" or random iterations until it passed every test. It doesn't seem to have a lot of opcodes. How much time did it take to find the algorithm with the processing speed of their time?

Vecr2y ago

Somewhat off topic, but your network switches don't still come with metal cases? I get the cheapest stuff that's likely to be reasonably good quality and they all have metal cases.

This triggered my PTSD haha
