3. Calculating the cost of a Google DeepMind paper (152334h.github.io)
   303 points by 152334H 1y ago | 150 comments
4. Knowing Enough About MoE to Explain Dropped Tokens in GPT-4 (152334h.github.io)
   3 points by 152334H 2y ago | 1 comment
5. Non-determinism in GPT-4 is caused by Sparse MoE (152334h.github.io)
   397 points by 152334H 2y ago | 181 comments