But on the subject of extremely simple constructions, you can compose many random permutations of a particular form---state[i] = state[i] xor F(state[j], state[k]), where state[i] is the ith bit of the state and F is a randomly chosen boolean function--- and obtain something asymptotically secure [1].
Worth noting that bitwise AND also works (indeed, you can build addition from it), if you don't want to implement anything that operates on bits in non-parallel for some reason. (Which is to say/emphasize that it's linear GF(2) operations that are insufficient, not GF(2) operations in general.)
Past that: I spent a long time blowing off block cipher cryptanalysis (I'm pretty much exclusively interested in crypto bugs that will actually let me own up a system, and you are essentially never going to cryptanalyze a block cipher for a vulnerability research project) but a lot of block cipher design clicked for me once I took the time to work through linear and differential cryptanalysis.
So I guess my point here would be: if you're blown away that you can get such durably effective security --- so much so that 3DES is still in some ways resilient in 2021 --- well, (1) you should be and (2) Aleks' post here is I think probably a really good way to understand why this stuff works.
Also, 128 bit numbers are just very big. :)