Your standard cryptographic books from any undergraduate program will tell you about the basics of confusion / diffusion, but I don't think the concept is very difficult at all.
Invertibility is hugely important but not discussed very much. It seems like crypto-experts 'obviously' know about it so they just don't talk about it? Knuth and Bob Jenkins both talked about the importance of invertibility to random number generators. EDIT: Invertibility is the basis of bijective / 1-to-1 and onto permutation functions. To be fair, I think everyone discusses the concept but maybe with different words.
Here's Jenkin's discussion on (non-crypto) hash functions, including a few paragraphs on "invertibility" (or "funnels" as they put it): http://www.burtleburtle.net/bob/hash/doobs.html
The "pcg-random" generator also has a large discussion on invertibility. Honestly, one of the best writeups on the subject, and I felt like my IQ went up severely after reading the paper end-to-end: https://www.pcg-random.org/paper.html
--------
The study of the invertibility of (data XOR (rotate-right data) XOR (rotate-left data)) is covered in many blogposts. https://marc-b-reynolds.github.io/math/2017/10/13/XorRotate....
The study of invertibility in general was covered by Lemire: https://lemire.me/blog/2016/08/09/how-many-reversible-intege...
-----
So as you can see, invertibility is "important", but rarely discussed as important. Its just assumed that everyone knows how important it is. Once you realize that invertibility / lack of funnels is important, everything else makes sense.
Galois field multiplication is useful in AES because its invertible (multiply by the GF(2^8) reciprocal). Aaaaah. I see. Why didn't my undergrad teacher just start with the importance of invertibility before talking about prime numbers and extension fields?
-----
Ah right. Once you know that these things have great properties (ie: invertible), then you can just read the SHA256 paper directly and the rest is just "obviously" confusion + diffusion principles.
-----
Linear Cryptoanalysis is the "standard" run a distribution kinda thing. Anyone who has done Chi-squared tests over random numbers kinda-gets this principle. Honestly, your standard non-crypto RNGs are roughly at this level at this point, so the study of any statistical-test for any random-number generator is good. Knuth / Art of Computer Programming Vol2 is one source of information on Chi-squared tests, but the study of statistics in general also results in Chi-squared tests and the like.
Differential Cryptoanalysis is more difficult and looks at the change-of-bits at each step of every operation. I don't quite remember how I was introduced to the concept, but differential-cryptoanalysis of many popular ciphers is a well-studied thing throughout the field.
--------
Knowledge of "what is fast" and "what is not fast" is... not really easy to come by, and seems to change as computer architectures change. I know that memory has gotten relatively slower compared to CPU-speeds in the past 30 years. I can imagine that in the 90s, an XOR-rotate-add cipher would take "relatively more" time than today, and that SBox-based / lookup table based ciphers were far more popular back then.
I'm not sure where I learned that information. Maybe software optimization manuals in general?
Can you shed some light here why I'm getting these differences?
https://jsfiddle.net/7kf15dje/
Here is an example of the output I'm getting:
// X A B X^1 X^-1 :: Difference
471490377 6 13 = 1365552781 = 471490377 :: 0
1528396978 9 11 = -1576695076 = 1528396978 :: -0
1592322722 9 20 = 622346385 = 1592322722 :: -0
1214152986 8 16 = -1748578289 = 1214152985 :: -1
1193897367 2 16 = 907713766 = 1193897366 :: -1
335642564 9 10 = 318891964 = 335642564 :: -0
486208953 16 23 = 894211128 = 486208952 :: -1
629577059 13 14 = 1383225523 = 629577058 :: -1
1609442937 8 18 = 674046110 = 1609442937 :: -0
234450967 6 12 = -459008694 = 234450966 :: -1
1840721644 19 28 = -602984005 = 1840721644 :: -0