> It may be thought of as what happens when a whole computer starts, since the CPU is the center of the computer and the place where the action begins.
I thought that too. Last year I spent a while getting as low-level as I could and trying to understand how to write a boot loader, a kernel, learn about clocks and pins and interrupts, etc. I thought, "I know, I'll get a Raspberry Pi! That way even if I brick it I didn't waste too much money".
Turns out the Raspberry Pi (and I'm guessing many other systems) is pretty confusing to understand at boot time. For one, it's the GPU that actually does the initial boot process, and good info on much of that is hard to find. (https://raspberrypi.stackexchange.com/questions/14862/why-do...)
I spent many many hours reading various specs & docs and watching tons of low-level YouTube videos. Compared to software development higher up the stack (my usual area), I found the material surprisingly sparse and poor for the most part. (Maybe that's reflective of the size of the audience and the value in producing it).
One slightly weird but fascinating path I have been playing with is writing programs for old game consoles, particularly the Nintendo DS. You can get full and comprehensive hardware documentation for these consoles, and there are now easy ways to run your own code on them, as well as libraries/tooling around it. But they run no OS: your program runs directly on the hardware, so you get a good feel for low-level programming while not being down at the level of Atmel chips.
It can be a little hard to work out how to get started, but it's really as simple as setting up `libnds` from devkitPro and then either hacking a DSi to run the TWiLight firmware, or buying a cheap flashcart from eBay to run your own programs. Read the example programs from the devkitPro GitHub and some posts on the hardware and you'll get the hang of it.
Oh yeah, like the Intellivision.
particularly the Nintendo DS.
Oh come on, the DS ain't that old.
How is threading/time scheduling/interrupt handling/context switching usually done?
Most of these processors are fixed-in-ROM sorts of machines, so the story for booting them individually is pretty simple. Much like a late-20th-century PC, when switched on (either by the power supply or by another processor) they start running BIOS code from the hardwired start address. Some need to have more software transferred to them after that.
Modern machines are really networks of computers in themselves. Networking and bringing all these parts together to support the main processors, at the low level, is not only poorly or completely undocumented, but it's probably impossible for one person to fit it all in their head these days.
So easily-understandable articles like this are essential for beginners, along with a short list of masterful books (e.g. those by Forrest Mims, for electronics) and playing with physical components! The rest is the endless variations, but they're all speaking that language, 'cuz the laws don't change.
Once you're familiar with that move to a more modern microcontroller like an Arm Cortex-M0, and after that maybe something with off-chip memory, a MMU, etc.
The proprietary blobs and GPU on the Raspberry Pi make it basically impossible to have a full understanding of what's happening. Instead, I'd recommend learning with the TI BeagleBone Black, which has an ARM Cortex-A8 with an MMU and lots of open documentation.
x86 is similar, since Intel ME (part of the PCH, whether in the chip or not) is needed to boot the CPU.
Here is a presentation about open firmware with a lot of stuff about boot process: https://www.youtube.com/watch?v=fTLsS_QZ8us
Those systems start up in "Cache Contained" mode, where the boot ROM is copied to CPU cache and there's no main memory yet. The code in the ROM has to initialize main memory before it can use it.
In a "real project" you would be relying on your suppliers and other teams/colleagues for many of the "hard questions". Chances are you'd have a contact at the silicon manufacturer, just as a major example.
There would be a boot team, or sometimes just one boot developer. If anything goes wrong, or you want to change anything, you would be pretty helpless without them. You can go to the various specs and code, but this can be quite in depth if you're starting from scratch.
Change project or micro? Chances are that all goes out the window, and you have to start from scratch.
But another suggestion if you want a modern alternative to understand: RISC-V has an open boot ecosystem. You can just try it in QEMU and maybe buy a board if you get more advanced?
That is because of how the ARM ecosystem functions. There is no standard way of integrating an ARM CPU into a product, as Arm, the company, just sells base IP and not the complete CPU. Every ARM licensee, from Qualcomm to Apple to Nvidia, is free to design their own extensions and integrations into their SoC. There is no standard for this. This makes it hard to write the kind of generic tutorial you see in the x86 world.
https://github.com/micropython/micropython/tree/master/ports
Just about anything else (including dodgy Chinese substitutes) is better, sadly.
6502 playlist: https://www.youtube.com/watch?v=LnzuMJLZRdU&list=PLowKtXNTBy...
8-bit build playlist: https://www.youtube.com/watch?v=HyznrdDSSGM&list=PLowKtXNTBy...
He also sells kits if one is interested in playing along.
Only watched the first video so far; after initializing itself the CPU actually runs this code (because D8-D15 are wired to zero):
addr opcode
FFFFF0 90 NOP
FFFFF1 00 90 00 90 ADD [BX+SI+9000],DL
FFFFF5 00 90 00 90 ADD [BX+SI+9000],DL
FFFFF9 00 90 00 90 ADD [BX+SI+9000],DL
FFFFFD 00 90 00 -- !! general protection fault
You can see it read and write the same address three times, then fetch the interrupt 0Dh vector and push flags+CS+IP to the stack :)

Among other things, since it uses magnetic core memory, you can run the program that was loaded when you shut it off.
Gameboy actually does a funny thing where the boot ROM gets mapped at the bottom of the address space, and then it writes to a MMIO address to unmap the ROM overlay and restore the first 256 bytes of the cartridge there instead. It’s quite amusing!
> On some computer platforms, the instruction pointer is called the "program counter", inexplicably abbreviated "PG"
Typo, maybe? Typically it’s called “pc”.
I can't see any date on this, but it's a bit antiquated. For security and reliability, modern CPUs have an on-chip ROM, which is executed first. That on-chip ROM will tend to do basic things like check clock, power, memory, etc. Once that's complete, it will then securely load firmware from the motherboard flash. Even modern cheapo microcontrollers are shipping with on-chip ROM these days.
https://www.bunniestudios.com/blog/?p=5127
> By pre-boot code, I’m not talking about the little ROM blob that gets run after reset to set up your peripherals so you can pull your bootloader from SD card or SSD. That part was a no-brainer to share. I’m talking about the code that gets run before the architecturally guaranteed “reset vector”. A number of software developers (and alarmingly, some security experts) believe that the life of a CPU begins at the reset vector. In fact, there’s often a significant body of code that gets executed on a CPU to set things up to meet the architectural guarantees of a hard reset – bringing all the registers to their reset state, tuning clock generators, gating peripherals, and so forth. Critically, chip makers heavily rely upon this pre-boot code to also patch all kinds of embarrassing silicon bugs, and to enforce binning rules.
I suspect that almost all the big "application processors" from Intel and AMD, and the exotic ARM/SPARC server chips, have equivalent embedded ICs to jump-start the "big cores".
[0] https://github.com/open-power/sbe/blob/master/src/sbefw/app/...
In a microcontroller, the clock generators and the peripherals are often set up by the main core just after boot, and are under user control - the chip's reset network (literally just a wire) handles bringing things into a known state before boot.
Just one small example: CPUs have many small SRAM arrays for micro-architecture features like branch predictors. Some of these need to be initialized after reset by a state machine that takes many cycles.
I have even heard of a large chip that pushes initialization vectors through the scan chains so that all flops begin in an initial state, without requiring a reset network.
https://embeddedartistry.com/blog/2019/04/08/a-general-overv...
On microcontrollers, there are often some preliminaries that are programmed into nonvolatile settings of the hardware, such as the type of clock oscillator. On a microprocessor like the Z80, your circuit was supposed to ensure that things like the clock oscillator and power supplies were stable before releasing the RESET pin.
https://9esec.io/blog/hardware-assisted-root-of-trust-mechan...
> The following are the memory ranges you get with a 2-to-4 converter on an 8-bit address bus:
And that's it. It looks truncated, or incomplete :thinking:
Wonder when this was written? I would guess mid-1980s; maybe earlier.
Back then, we were all a lot closer to the hardware.
These ad networks on top of ad networks on the oldest IE-compliant code, with its `document.write`, fighting it out for eyeballs since 1999. As Lycos' motto says: battling it out to complete obsolescence and will never give an inch. Go iframe go!
What does that ROM memory cell _physically look like_? How do we physically manipulate it to contain a 1 or a 0 (absence of something)?
The memory cell itself might be a handful of transistors forming a bistable flipflop (for SRAM), or it might be a capacitor and transistor (for DRAM), or a floating gate and transistor (for EPROM), or just a wire and transistor (for mask-ROM).
Then there's PROM (Programmable Read-Only Memory), which is "empty" (either all 0, or all 1, I don't know the full details) but can be programmed once and that's it. Then there's EPROM (Erasable Programmable Read-Only Memory), which can be erased (by exposing the actual chip to UV light), then reprogrammed. Then there is EEPROM (Electrically Erasable Programmable Read-Only Memory), which can be electrically erased and reprogrammed. For each type of ROM, once programmed, it's good for years, if not forever.
The programming of a PROM, EPROM or EEPROM usually consists of applying a higher-than-normal voltage to certain pins of the chip, and is usually done in a separate device. How these chips work internally (how the gates are arranged, erased, programmed, etc.) is not something I know (being into software). This is just stuff I picked up over the years.
on a die, you wouldn't go through that trouble, you would just directly synthesize the zeros and ones.
but we don't really use ROMs that much anymore, they are usually persistent and programmable
Now here's the clever bit - there are diodes between the vertical and horizontal wires. When you pull a vertical wire to ground it pulls all the horizontal wires connected through diodes to ground, putting a 0 on those pins.
In a real "mask programmed" ROM the grid is more square, so there are a lot of horizontal lines grouped into 8-bit bytes.
In an EPROM the diode is replaced by a little MOSFET. By applying a high voltage, its floating gate stays charged basically forever, switching it on and forming a "0" on the output.
I’m not an expert in modern hardware, but these are the basic principles I recollect. Happy to be corrected if this answer is dated :)
--The instruction set is the lowest level of user-visible software. The instruction set is implemented (in principle) by microcode.
--Microcode (if present) is a bunch of hardwired logic signals that cause data to move between subunits of the CPU in response to instructions. Register transfer specifications govern the design at this level.
--A CPU is a collection of subunits including registers, arithmetic units, and logic gates.
--Registers are collections of flip-flops.
--Arithmetic units are made of logic gates.
--Flip-flops are made from logic gates in feedback configurations to make them stateful. State machine theory governs the design of these circuits.
--Logic gates are stateless and made of transistors. Boolean algebra governs the design at this level.
--Transistors are made of NMOS or PMOS or bipolar silicon junctions. Device physics and electrical properties govern this level.
This abstraction hierarchy is useful when learning, but in the real world the abstraction layers are much blurrier: Everything's really just transistors, stateless stuff often has invisible state, electrical considerations matter at every level, etc.
nand2tetris.org/
Example:
https://timhutton.github.io/2010/03/10/30984.html
https://github.com/GollyGang/ruletablerepository/wiki/CoddsD...
This is only partially true. When digital chips power up, gate outputs are in an indeterminate state. The reset sets them to known initial (and valid) values/bits.
For the purposes of this, the nop instruction is useful as it's a great way to delay the processor one instruction at a time, and given you know the clock speed (and therefore instructions per second) you can:

* Set the IO high
* Use nops (or other instructions) to waste time
* Set the IO low
This is very useful in situations where the timing needs to be too precise to use interrupts, which are somewhat unpredictable. Obviously this means an entire CPU is held up running this code.
Another timing usage is for generating a signal to display a picture on a CRT using a microcontroller. Plenty of others as well.
Personally, though, I would never bother. It's always better to get a dedicated chip/controller for something like driving those stupid LEDs (or just get an SPI LED) or to generate a TV signal.
Reading up on it further, apparently it is also used as a way to reserve space in code memory, I imagine for self-modifying code (ie fill with nops which can be replaced with actual instructions by your code, depending on code execution). But I've never actually done this myself.
Additionally, the reset registers differ by platform. (Some platforms expect software to initialize and reset some registers.)
If you like this sort of plain text technical document, you might enjoy browsing the net with Gopher.
I recommend Gopherus[0] as a modern implementation that's cross-platform.
Say goodbye to a couple hours!