I implemented a cheap subset of it used in Super Mario 64 DS for my online model viewer ( https://noclip.website/#sm64ds/44;-517.89,899.85,1300.08,0.3... ), but implementing all of the quirks and the weird feature sets might be nearly impossible in a modern graphics API. 2D rasterizers don't have to be slow (as SwiftShader and ryg show), and you can get the bizarre conditions exactly correct. I'm not sure what a GPU-based implementation would even add.
EDIT: The math to handle the bilinear quad interpolation on a GPU was worked out by reedbeta last year: http://reedbeta.com/blog/quadrilateral-interpolation-part-2/ . That's a big roadblock gone, but there are still a lot of other questionable things.
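The core of the quad-interpolation problem is inverse bilinear interpolation: given a point inside the quad, recover the (u, v) that the forward bilerp would have mapped there. A minimal CPU sketch of the standard approach (reduce to a quadratic in v via 2D wedge products) — the corner values in the comments are made up for illustration, and this is my reading of the math, not reedbeta's actual code:

```python
import math

def wedge(a, b):
    """2D cross product (a.x*b.y - a.y*b.x)."""
    return a[0]*b[1] - a[1]*b[0]

def bilerp(p00, p10, p01, p11, u, v):
    """Forward bilinear interpolation over the quad's four corners."""
    return tuple((1-u)*(1-v)*p00[i] + u*(1-v)*p10[i]
                 + (1-u)*v*p01[i] + u*v*p11[i] for i in range(2))

def inverse_bilerp(p00, p10, p01, p11, p):
    """Recover (u, v) with bilerp(p00, p10, p01, p11, u, v) == p.
    The condition reduces to A*v^2 + B*v + C = 0; pick the root in [0, 1]."""
    q  = (p[0] - p00[0],  p[1] - p00[1])
    b1 = (p10[0] - p00[0], p10[1] - p00[1])
    b2 = (p01[0] - p00[0], p01[1] - p00[1])
    b3 = (p00[0] - p10[0] - p01[0] + p11[0],
          p00[1] - p10[1] - p01[1] + p11[1])

    A = wedge(b3, b2)
    B = wedge(q, b3) - wedge(b2, b1)
    C = wedge(q, b1)

    if abs(A) < 1e-12:
        v = -C / B          # parallelogram case: equation degenerates to linear
    else:
        disc = math.sqrt(B*B - 4*A*C)
        v = (-B + disc) / (2*A)
        if not 0.0 <= v <= 1.0:
            v = (-B - disc) / (2*A)

    # With v fixed, q = u*(b1 + v*b3) + v*b2; solve for u on the larger axis.
    dx, dy = b1[0] + v*b3[0], b1[1] + v*b3[1]
    if abs(dx) >= abs(dy):
        u = (q[0] - v*b2[0]) / dx
    else:
        u = (q[1] - v*b2[1]) / dy
    return u, v
```

A fragment shader would do the same arithmetic per-pixel, with the quad corners fed in as varyings or a uniform.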
This sounds so ridiculously obvious when reading it that I would be surprised if nobody thought of working this out before. Or is the GPU code really difficult to work out compared to how simple the conceptual approach is?
> By the way, the fact that bilinear interpolation creates quadratic splines along diagonals can be exploited to evaluate splines in a GPU texture unit.
That also sounds very interesting!
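The claim is easy to check numerically: sampling a bilinearly-filtered 2x2 texel patch along the diagonal (t, t) gives exactly a quadratic Bézier whose middle control point is the average of the two off-diagonal texels. A quick sketch (the texel values are made up):

```python
def bilerp(c00, c10, c01, c11, u, v):
    """Bilinear filtering of a 2x2 texel patch, as a texture unit would do it."""
    return (1-u)*(1-v)*c00 + u*(1-v)*c10 + (1-u)*v*c01 + u*v*c11

def quad_bezier(p0, p1, p2, t):
    """Quadratic Bézier in Bernstein form."""
    return (1-t)**2 * p0 + 2*(1-t)*t * p1 + t**2 * p2

# Hypothetical texel values; along the diagonal u = v = t the bilerp
# collapses to a quadratic with middle control point (c10 + c01) / 2.
c00, c10, c01, c11 = 1.0, 4.0, 2.0, 9.0
for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    assert abs(bilerp(c00, c10, c01, c11, t, t)
               - quad_bezier(c00, (c10 + c01) / 2, c11, t)) < 1e-12
```

So one filtered texture fetch evaluates a quadratic spline segment for free, which is the trick being alluded to.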
One possibility might be to do two passes: one to build up linked lists of per-fragment data (polygon ID, color, depth, etc.) and a second pass to sort all the linked lists into the proper order and determine a final color. This is the standard order-independent transparency trick.
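A CPU mock of the two passes makes the idea concrete: pass 1 appends fragments per pixel in whatever order they arrive (on a GPU this would be an atomic-counter-backed linked list via image load/store), pass 2 sorts each pixel's list by depth and composites far-to-near. The fragment layout, the depth convention (larger = farther), and the blend operator here are all assumptions for illustration:

```python
def pass1_append(buckets, x, y, depth, rgba):
    """Pass 1: record a fragment unordered, keyed by pixel."""
    buckets.setdefault((x, y), []).append((depth, rgba))

def over(dst, src):
    """Standard 'over' blend of an (r, g, b, a) fragment onto an (r, g, b) dest."""
    r, g, b, a = src
    dr, dg, db = dst
    return (r*a + dr*(1-a), g*a + dg*(1-a), b*a + db*(1-a))

def pass2_resolve(buckets, background=(0.0, 0.0, 0.0)):
    """Pass 2: sort each pixel's fragment list and resolve a final color."""
    out = {}
    for pixel, frags in buckets.items():
        color = background
        # far-to-near, so each fragment composites on top of what's behind it
        for depth, rgba in sorted(frags, key=lambda f: -f[0]):
            color = over(color, rgba)
        out[pixel] = color
    return out
```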
You could build up tables as well--for instance, you could emulate the "one span per scanline/polygon" behavior by allocating a table of scanlines for each polygon that you fill with the lowest X coordinate for that scanline and discard fragments that don't belong to the triangle contributing the lowest such X coordinate.
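A rough CPU sketch of that span-table idea, under one possible reading of the rule: pass 1 records, per (polygon, scanline), the leftmost X seen; pass 2 keeps only the fragments of the polygon owning the lowest span start on each scanline. The fragment tuple layout and the winner rule are assumptions, not the DS's actual behavior:

```python
def build_span_table(fragments):
    """Pass 1. fragments: iterable of (poly_id, x, y).
    Returns {(poly_id, y): leftmost x seen for that polygon on that scanline}."""
    table = {}
    for poly_id, x, y in fragments:
        key = (poly_id, y)
        if key not in table or x < table[key]:
            table[key] = x
    return table

def filter_fragments(fragments, table):
    """Pass 2: per scanline, keep only fragments of the polygon whose
    recorded span start is lowest; everything else is discarded."""
    winner = {}
    for (poly_id, y), min_x in table.items():
        if y not in winner or min_x < winner[y][1]:
            winner[y] = (poly_id, min_x)
    return [f for f in fragments if winner[f[2]][0] == f[0]]
```

On a GPU, the table would live in an image or SSBO written with atomicMin in the first pass, with the second pass reading it back to decide which fragments survive.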
I have no idea if this will actually work--if I had to guess I'd put a 50% probability on it not working out at all. The fallback would be a SIMD scanline renderer. The Image Load/Store GPU implementation would be really fun though :)
I truly enjoyed messing around with the homebrew dev kits and loved putting my homemade demos on my flash cart in high school, though I mostly stuck to 2D demos.
From what I understand, the GPU is more akin to the GBA's and is a scanline-based renderer. Does that mean its 3D architecture is more similar to, for instance, the Sega Saturn's? (Incidentally, my other favourite system to write homebrew for.)
The main difference, of course, being its ability to display native triangles in addition to quads? (The Saturn did not have native 3D hardware as we understand it; it drew thousands of scaled and transformed sprites as quads instead of the triangles we are used to today.)
Edit: Reading other articles on the site, it seems like they started with a software renderer. I wonder why they decided to try OpenGL?
Kind of an interesting aesthetic. It looks less terrible than I expected. I think it would be interesting to apply some of the recent machine-learning-based upsampling techniques to DS games too.