> It may be thought of as what happens when a whole computer starts, since the CPU is the center of the computer and the place where the action begins.
I thought that too. Last year I spent a while getting as low-level as I could and trying to understand how to write a boot loader, a kernel, learn about clocks and pins and interrupts, etc. I thought, "I know, I'll get a Raspberry Pi! That way even if I brick it I didn't waste too much money".
Turns out the Raspberry Pi (and I'm guessing many other systems) is pretty confusing to understand at boot time. For one, it's the GPU that actually does the initial boot process, and good info on much of that is hard to find. (https://raspberrypi.stackexchange.com/questions/14862/why-do...)
I spent many many hours reading various specs & docs and watching tons of low-level YouTube videos. Compared to software development higher up the stack (my usual area), I found the material surprisingly sparse and poor for the most part. (Maybe that's reflective of the size of the audience and the value in producing it).
One slightly weird but fascinating path I have been playing with is writing programs for old game consoles, particularly the Nintendo DS. You can get full and comprehensive hardware documentation for these consoles, and there are now easy ways to run your own code on them, as well as libraries/tooling around it. But they run no OS: your program runs directly on the hardware, so you get a good feel for low-level programming while not being down at the level of Atmel chips.
It can be a little hard to work out how to get started, but it's really as simple as setting up `libnds` from devkitPro and then either hacking a DSi to run the TWiLight firmware, or buying a cheap flashcart from eBay to run your own programs. Read the example programs from the devkitPro GitHub and some posts on the hardware and you'll get the hang of it.
Oh yeah, like the Intellivision.
particularly the Nintendo DS.
Oh come on, the DS ain't that old.
How is threading/time scheduling/interrupt handling/context switching usually done?
Most of these processors are fixed-in-ROM sorts of machines, so the story for booting them individually is pretty simple. Much like a late-20th-century PC, when switched on (either by the power supply or by another processor) they start running BIOS code from the hardwired start address. Some need to have more software transferred to them after that.
Modern machines are really networks of computers in themselves. Networking and bringing all these parts together to support the main processors, at the low level, is not only poorly or completely undocumented, but it's probably impossible for one person to fit it all in their head these days.
So easily-understandable articles like this are essential for beginners, along with a short list of masterful books (e.g. those by Forrest Mims, for electronics) and playing with physical components! The rest is the endless variations, but they're all speaking that language, 'cuz the laws don't change.
Once you're familiar with that move to a more modern microcontroller like an Arm Cortex-M0, and after that maybe something with off-chip memory, a MMU, etc.
The proprietary blobs and GPU on the Raspberry Pi make it basically impossible to have a full understanding of what's happening. Instead, I'd recommend learning with the TI BeagleBone Black, which has an ARM Cortex-A8 with an MMU and lots of open documentation.
x86 is similar, since Intel ME (part of the PCH, whether in the chip or not) is needed to boot the CPU.
Here is a presentation about open firmware with a lot of stuff about boot process: https://www.youtube.com/watch?v=fTLsS_QZ8us
Those systems start up in "Cache Contained" mode, where the boot ROM is copied to CPU cache and there's no main memory yet. The code in the ROM has to initialize main memory before it can use it.
In a "real project" you would be relying on your suppliers and other teams/colleagues for many of the "hard questions". Chances are you'd have a contact at the silicon manufacturer, just as a major example.
There would be a boot team, or sometimes just one boot developer. If anything goes wrong, or you want to change anything, you would be pretty helpless without them. You can go to the various specs and code, but this can be quite in depth if you're starting from scratch.
Change project or micro? Chances are that all goes out the window, and you have to start from scratch.
But another suggestion if you want a modern alternative to understand: RISC-V has an open boot ecosystem. You can just try it in QEMU and maybe buy a board if you get more advanced?
That is because of how the ARM ecosystem functions. There is no standard way of integrating an ARM CPU into a product, as Arm, the company, just sells base IP and not the complete CPU. Every ARM licensee, from Qualcomm to Apple to Nvidia, is free to design their own extensions and integrations into their SoC. There is no standard for this. This makes it hard to write the kind of generic tutorial you see in the x86 world.
https://github.com/micropython/micropython/tree/master/ports
Just about anything else (including dodgy Chinese substitutes) is better, sadly.
6502 playlist: https://www.youtube.com/watch?v=LnzuMJLZRdU&list=PLowKtXNTBy...
8-bit build playlist: https://www.youtube.com/watch?v=HyznrdDSSGM&list=PLowKtXNTBy...
He also sells kits if one is interested in playing along.
Only watched the first video so far; after initializing itself the CPU actually runs this code (because D8-D15 are wired to zero):
addr opcode
FFFFF0 90 NOP
FFFFF1 00 90 00 90 ADD [BX+SI+9000],DL
FFFFF5 00 90 00 90 ADD [BX+SI+9000],DL
FFFFF9 00 90 00 90 ADD [BX+SI+9000],DL
FFFFFD 00 90 00 -- !! general protection fault
You can see it read and write the same address three times, then fetch the interrupt 0Dh vector and push flags+CS+IP to the stack :)

Among other things, since it uses magnetic core memory, you can run the program that was loaded when you shut it off.
Gameboy actually does a funny thing where the boot ROM gets mapped at the bottom of the address space, and then it writes to a MMIO address to unmap the ROM overlay and restore the first 256 bytes of the cartridge there instead. It’s quite amusing!
> On some computer platforms, the instruction pointer is called the "program counter", inexplicably abbreviated "PG"
Typo, maybe? Typically it’s called “pc”.
I can't see any date on this, but it's a bit antiquated. For security and reliability, modern CPUs have an on-chip ROM, which is executed first. That on-chip ROM will tend to do basic things like check clock, power, memory, etc. Once that's complete, it will then securely load firmware from the motherboard flash. Even modern cheapo microcontrollers are shipping with on-chip ROM these days.
https://www.bunniestudios.com/blog/?p=5127
> By pre-boot code, I’m not talking about the little ROM blob that gets run after reset to set up your peripherals so you can pull your bootloader from SD card or SSD. That part was a no-brainer to share. I’m talking about the code that gets run before the architecturally guaranteed “reset vector”. A number of software developers (and alarmingly, some security experts) believe that the life of a CPU begins at the reset vector. In fact, there’s often a significant body of code that gets executed on a CPU to set things up to meet the architectural guarantees of a hard reset – bringing all the registers to their reset state, tuning clock generators, gating peripherals, and so forth. Critically, chip makers heavily rely upon this pre-boot code to also patch all kinds of embarrassing silicon bugs, and to enforce binning rules.
I suspect that almost all the big "application processors" from Intel and AMD, and the exotic ARM/SPARC server chips, have equivalent embedded ICs to jump-start the "big cores".
[0] https://github.com/open-power/sbe/blob/master/src/sbefw/app/...
In a microcontroller, the clock generators and the peripherals are often set up by the main core just after boot, and are under user control - the chip's reset network (literally just a wire) handles bringing things into a known state before boot.
Just one small example: CPUs have many small SRAM arrays for micro-architecture features like branch predictors. Some of these need to be initialized after reset by a state machine that takes many cycles.
I have even heard of a large chip that pushes initialization vectors through the scan chains so that all flops begin in an initial state, without requiring a reset network.
https://embeddedartistry.com/blog/2019/04/08/a-general-overv...
On microcontrollers, there are often some preliminaries that are programmed into nonvolatile settings of the hardware, such as the type of clock oscillator. On a microprocessor like the Z80, your circuit was supposed to ensure that things like the clock oscillator and power supplies were stable before releasing the RESET pin.
https://9esec.io/blog/hardware-assisted-root-of-trust-mechan...
> The following are the memory ranges you get with a 2-to-4 converter on an 8-bit address bus:
And that's it. It looks truncated, or incomplete :thinking:
Wonder when this was written? I would guess mid-1980s; maybe earlier.
Back then, we were all a lot closer to the hardware.
These ad networks on top of ad networks on the oldest IE-compliant code, with its `document.write`, fighting it out for eyeballs since 1999. As Lycos' motto says: battling it out to complete obsolescence and will never give an inch. Go iframe go!
What does that ROM memory cell _physically look like_? How do we physically manipulate it to contain a 1 or a 0 (absence of something)?
The memory cell itself might be a handful of transistors forming a bistable flipflop (for SRAM), or it might be a capacitor and transistor (for DRAM), or a floating gate and transistor (for EPROM), or just a wire and transistor (for mask-ROM).
Then there's PROM (Programmable Read-Only Memory), which is "empty" (either all 0, or all 1, I don't know the full details) but can be programmed once and that's it. Then there's EPROM (Erasable Programmable Read-Only Memory), which can be erased (by exposing the actual chip to UV light), then reprogrammed. Then there is EEPROM (Electrically Erasable Programmable Read-Only Memory), which can be electrically erased and reprogrammed. For each type of ROM, once programmed, it's good for years, if not forever.
The programming of a PROM, EPROM or EEPROM usually consists of applying a higher-than-normal voltage to certain pins of the chip, and is usually done in a separate device. How these chips work internally (how the gates are arranged, erased, programmed, etc.) is not something I know (being into software). This is just stuff I picked up over the years.
on a die, you wouldn't go through that trouble, you would just directly synthesize the zeros and ones.
but we don't really use ROMs that much anymore, they are usually persistent and programmable
Now here's the clever bit - there are diodes between the vertical and horizontal wires. When you pull a vertical wire to ground it pulls all the horizontal wires connected through diodes to ground, putting a 0 on those pins.
In a real "mask programmed" ROM the grid is more square, so there are a lot of horizontal lines grouped into 8-bit bytes.
In an EPROM the diode is replaced by a little MOSFET. By applying a high voltage, its floating gate stays charged basically forever, switching it on and forming a "0" on the output.
I’m not an expert in modern hardware, but these are the basic principles I recollect. Happy to be corrected if this answer is dated :)
--The instruction set is the lowest level of user-visible software. The instruction set is implemented (in principle) by microcode.
--Microcode (if present) is a bunch of hardwired logic signals that cause data to move between subunits of the CPU in response to instructions. Register transfer specifications govern the design at this level.
--A CPU is a collection of subunits including registers, arithmetic units, and logic gates.
--Registers are collections of flip-flops.
--Arithmetic units are made of logic gates.
--Flip-flops are made from logic gates in feedback configurations to make them stateful. State machine theory governs the design of these circuits.
--Logic gates are stateless and made of transistors. Boolean algebra governs the design at this level.
--Transistors are made of NMOS or PMOS or bipolar silicon junctions. Device physics and electrical properties govern this level.
This abstraction hierarchy is useful when learning, but in the real world the abstraction layers are much blurrier: Everything's really just transistors, stateless stuff often has invisible state, electrical considerations matter at every level, etc.
nand2tetris.org/
Example:
https://timhutton.github.io/2010/03/10/30984.html
https://github.com/GollyGang/ruletablerepository/wiki/CoddsD...
This is only partially true. When digital chips power up, gate outputs are in an indeterminate state. The reset sets them to known initial (and valid) values/bits.
For the purposes of this, the nop instruction is useful as it's a great way to delay the processor one instruction at a time, and given you know the clock speed (and therefore instructions per second) you can:

* Set the IO high
* Use nops (or other instructions) to waste time
* Set the IO low
This is very useful in situations where the timing needs to be too precise to use interrupts, which are somewhat unpredictable. Obviously this means an entire CPU is held up running this code.
Another timing usage is for generating a signal to display a picture on a CRT using a microcontroller. Plenty of others as well.
Personally, though, I would never bother. It's always better to get a dedicated chip/controller for something like driving those stupid LEDs (or just get an SPI LED) or to generate a TV signal.
Reading up on it further, apparently it is also used as a way to reserve space in code memory, I imagine for self-modifying code (ie fill with nops which can be replaced with actual instructions by your code, depending on code execution). But I've never actually done this myself.
Additionally, the reset registers differ by platform. (Some platforms expect software to initialize and reset some registers.)
If you like this sort of plain text technical document, you might enjoy browsing the net with Gopher.
I recommend Gopherus[0] as a modern implementation that's cross-platform.
Say goodbye to a couple hours!