V2: (PDP-11 Unix) Kernel is written in assembly. C compiler is written in assembly.
V3: Kernel is written in assembly. C compiler is written in C.
V4: Kernel is written in C. C compiler is written in C.
https://www.tuhs.org/cgi-bin/utree.pl?file=PDP7-Unix
https://www.tuhs.org/cgi-bin/utree.pl?file=V2
This isn't to deny the popularity of the PDP-11, of course, but its use of 16-bit words was more of a consequence of the System/360. DEC's use of 12-bit words goes back to the LINC (1962) but by 1970, bytes were obviously the way to go.
- Pick a language that's simple enough. A subset of ML would be good, but if you want to complete it, I'd recommend a simple LISP. This is your new language. This is C.
- Use a language you know and like to implement a compiler for this language. This is your bootstrap language. Compile this for example into C--, ASM or LLVM, depending on what you know. This is the target language. As a recommendation, keep this compiler as simple as possible so you have a reference for the next step. For C, both the bootstrap and the target language were ASM.
- And now iterate on extending the stdlib your language has, until you can implement a compiler for your new language in your new language. Again, keep this compiler simple, without optimization or extra passes; just generate the most trivial machine code possible. This usually takes a bit of back-and-forth. You'll need function evaluation and expression evaluation first (this is where a Lisp can be an advantage, as those are the same), then function definitions, then filesystem interactions, and so on. You kinda discover what you need as you implement.
- Once you have all of that, (i) compile the new-language compiler with the compiler written in your bootstrap language and (ii) compile the new-language compiler again using the result of (i). If you want to verify the results, (iii) compile the compiler once more with the output of (ii) and check that (ii) and (iii) are identical.
- Your new language is now self-hosted.
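The verification step in the list above boils down to comparing two compiler binaries byte for byte. A minimal sketch in Python, assuming the compiler's output is deterministic (no embedded timestamps or build paths); the function names `file_digest` and `bootstrap_is_stable` are made up for illustration and don't come from any real toolchain:

```python
import hashlib
from pathlib import Path


def file_digest(path) -> str:
    """Return the SHA-256 hex digest of the file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def bootstrap_is_stable(stage2, stage3) -> bool:
    """True if the stage-2 and stage-3 compiler binaries are byte-identical.

    stage2: the compiler built by the bootstrap-built compiler (step ii).
    stage3: the compiler built by stage2 from the same source (step iii).
    Assumption: the build is deterministic, so any difference is a real
    discrepancy rather than noise such as an embedded build date.
    """
    return file_digest(stage2) == file_digest(stage3)
```

If the two digests differ, the usual suspects are a code generation bug that only shows up when the compiler compiles itself, or nondeterminism in the build.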
This was fun, because it was accompanied by other courses, like how processor microcode implements processor instructions, how different kinds of assembly are mapped onto processor instructions, and then how higher-level languages are compiled down into assembly. All of this across 4-6 semesters resulted in a pretty good understanding of how Java ends up being flip-flop operations.
EDIT - got target & bootstrap mixed up in first part.
https://en.wikipedia.org/wiki/Compilers:_Principles,_Techniq...
That is why there was a nerd joke that the successor language to C should be P, not C++.
Originally BCPL wasn't going to be used for anything beyond bootstrapping the CPL compiler, but eventually it took on a life of its own.
https://en.m.wikipedia.org/wiki/CPL_(programming_language)
There is some irony that for UNIX workloads we are stuck with the evolution of a language whose main purpose was only to bootstrap a compiler and be done with it.
Then Dennis came up with C. C was just portable B (it added types for the purpose of getting around the fact that word sizes differed between archs). And then the B code was recompiled with C and modified as needed. Though C is back-compatible to B somewhat (hence why C has the thing where no type implies int).
That is not quite true: the C specification defines the standard headers, and many of their facilities don't make much sense outside of an OS (environment, filesystem, dynamic memory allocation, threads, ...).
There is a somewhat restricted subset of C called "freestanding", which is not required to accept every "strictly conforming program", specifically the standard only requires a small subset of the standard headers to be implemented: <float.h>, <iso646.h>, <limits.h>, <stdalign.h>, <stdarg.h>, <stdbool.h>, <stddef.h>, <stdint.h>, and <stdnoreturn.h>. Everything else is optional, and may be implementation-defined rather than standard-conforming.
> filesystem, dynamic memory allocation, threads
All of which can be (and have been) done without an OS. Besides, there is no native C language support for any of those things. Those things are usually (but certainly not always) done by library calls, not language primitives.
My underlying point is that the premise of the question is mistaken. It's assuming that an operating system is required in order to have a programming language. That is simply untrue, and is particularly untrue for C, which is very commonly used to program microcontrollers that have no operating system. I'm working on one such system right now.
Related: early OSes were not timesharing systems. They were batch oriented - you submit your stack of punch cards to the operator, and pick up your output later once they've had a chance to run it.
pre-C Unix was written in assembly
Generally it's an incremental process where the compiler for an early/subset version of the language is written in another, existing language (absent one, may be assembly code).
Once it's possible to rewrite the compiler in its own subset language, it becomes self-hosting. Then you can add a feature to the language, and once it works, enhance the compiler to use it, and so on.
Eventually the language and compiler go hand in hand: the only way to compile the language is with its compiler, and the only way to compile the compiler is with itself. This leads to interesting thought experiments such as:
https://www.cs.cmu.edu/~rdriley/487/papers/Thompson_1984_Ref...
https://www.youtube.com/watch?v=lJf2i87jgFA&list=RDLVlJf2i87...
(I'm just teasing, as you were.)
Now imagine how most things around you were made, how higher tech was made with lower tech. How did they make high-precision tools when only lower-precision tools were available? For example: how do you make a caliper precise to 0.001 mm when all you have is one precise to 0.1 mm? There were a lot of challenges like that, and we still run into new ones. I just wonder what general term is used for things like that.
UNIX is just a UNICS rewrite in C, and was done by the authors of UNICS and C.
There were Cs for MSDOS, Cs for CP/M, even Cs for Windows, etc, etc.
Doesn't that already imply that C predates UNIX? The question doesn't make sense.