Harvard got the first Unix v7 outside of Bell Labs in, I think, '75, and my office mate in the old Harvard graduate computing center (a medieval beast compared to the new Gates/Ballmer cheese wedge building), Geoff Steckel, had some kernel listings on his desk which I of course devoured without really understanding their provenance.
Having cut my teeth (after systems hacking on IBM mainframes) on PDP-10 assembler, TOPS-10, TENEX and Lisp (Harvard's ECL was a Dylan-like algebraic syntax on top of a Lisp internally), reading the C kernel listings was a revelation: here was an operating system written in a high-level language!
The line printer listings were printed on an upper-case-only printer (not many lower-case-capable printers existed in the DEC world at the time) with strike-throughs (overprinted) used for upper-case letters.
In any case, it was immediately clear that this was something special. I didn't actually get involved with Unix until a few years later in grad school at Columbia, and then ended up as a Unix (BSD) kernel hacker at Multiflow, a Yale spin-off, some years later.
(Sorry for waxing "old duffer...")
Previously on HN for the Google code project: https://news.ycombinator.com/item?id=1132682 and https://news.ycombinator.com/item?id=2698685
Two days ago on HN: The init system from the same source code tree: https://news.ycombinator.com/item?id=10206309
ospace() {} /* fake */
waste() /* waste space */
{
waste(waste(waste),waste(waste),waste(waste));
waste(waste(waste),waste(waste),waste(waste));
waste(waste(waste),waste(waste),waste(waste));
waste(waste(waste),waste(waste),waste(waste));
waste(waste(waste),waste(waste),waste(waste));
waste(waste(waste),waste(waste),waste(waste));
waste(waste(waste),waste(waste),waste(waste));
waste(waste(waste),waste(waste),waste(waste));
}
It's creating space in the code section right after ospace. If you look at ospace, it's being used as a buffer. So we have a buffer in the code section instead of the data section. So the question is why? Remember that this is a PDP-11/20 and the maximum memory is only 56 KB (maybe the one at AT&T had less). The data section must be nearly out of space.
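To make the idea of "one region of memory, two uses at different times" concrete, here is a minimal sketch in modern C. It is not the original technique: the old code wrote into the bytes of a function, which modern C does not allow portably, so this swaps in a union instead. All names (`overlay`, `region`, `startup`, `later`) are hypothetical.

```c
#include <string.h>

/* Hypothetical illustration, not the original code: the same bytes
   serve as start-up data first and as a scratch buffer afterwards. */
union overlay {
    char init_table[64];   /* needed only during start-up */
    char scratch[64];      /* reused as a work buffer later */
};

static union overlay region;

/* Start-up phase: fill the table and return the sum of its bytes. */
int startup(void)
{
    int sum = 0;
    size_t i;

    memset(region.init_table, 1, sizeof region.init_table);
    for (i = 0; i < sizeof region.init_table; i++)
        sum += region.init_table[i];
    return sum;
}

/* Later phase: the start-up data is dead, so overwrite it freely. */
const char *later(const char *msg)
{
    strncpy(region.scratch, msg, sizeof region.scratch - 1);
    region.scratch[sizeof region.scratch - 1] = '\0';
    return region.scratch;
}
```

The union costs 64 bytes instead of 128, which is the whole point on a machine where every byte counts.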
There is more: notice that the variables are always allocated at the end of each file, and that "extern" is being used within each function as a forward reference. I think the purpose of this is to keep the symbol table usage within the compiler low: at the end of each function you get the symbol table space used by the externs back. The only symbols which remain until the end of the file are the function names themselves.
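The pattern described above still compiles today. A minimal sketch, with hypothetical names (`counter`, `bump`, `peek`), of a block-scope extern acting as a forward reference to a variable defined at the end of the file:

```c
/* Each function declares the variable it needs with a block-scope
   extern, so the name is only in scope inside that function body. */
int bump(void)
{
    extern int counter;   /* forward reference to the definition below */
    return ++counter;
}

int peek(void)
{
    extern int counter;   /* redeclared here; not visible otherwise */
    return counter;
}

/* The definition sits at the end of the file, as in the old sources. */
int counter = 0;
```

Whether the original authors did this to save compiler symbol table space is the guess made above; the sketch only shows the scoping behaviour.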
I never used UNIX on a PDP-11 (except simh), but I did use DEC's RSX-11 (on an 11/34). In that operating system a lot of effort was dedicated toward making a good overlay linker. Overlays were in the form of a tree: you had to carefully structure your code to maximize the efficiency of this, so that the most commonly referenced things were closer to the root. UNIX didn't have any of this.
> The data section must be nearly out of space.
It is for space, but the 11/20 didn't have split I/D. The specific answer¹ is ‘worse’:
> A second, less noticeable, but astonishing peculiarity is the space allocation: temporary storage is allocated that deliberately overwrites the beginning of the program, smashing its initialization code to save space.
That is, use of the ospace() ‘array’ overlaps the beginning of main().
e.g. http://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/...
IF ...
THEN
...
FI
translates to
if ( ...
) {
...
}
I remember last seeing this code around 1980-82 when I was working with 7th Edition Unix and wondering why someone would want to do that, since it would have made the program hard for another C programmer to read and maintain. (If I remember correctly, this programming style is unique to the shell code.)
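For anyone who hasn't seen the style, here is a small sketch of how such macros work. The definitions are written from memory in the spirit of the shell's macro header and should be treated as an assumption rather than a verbatim copy; `sign_name` is a hypothetical function for illustration.

```c
/* Algol-flavoured control-flow macros (reconstructed, not verbatim). */
#define IF    if (
#define THEN  ) {
#define ELSE  } else {
#define FI    ; }

/* Hypothetical example function written in that style. */
const char *sign_name(int n)
{
    IF n < 0
    THEN return "negative";
    ELSE return "non-negative";
    FI
}
```

After preprocessing, the body is an ordinary `if ( n < 0 ) { ... } else { ... ; }`, so the compiler never sees anything unusual; only the human reader does.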