1. You did find a tweet claiming 100 trillion parameters, just as the GP post did.
2. The video mentions he saw _a_ tweet about GPT... and we don't actually know what the tweet said; the moderator never finished their question.
3. I'm not sure what sort of claim "connected" is, other than an unfalsifiable one, like all of the confirmation-bias-motivated arguing on this topic. People know Geohot's name, and PyTorch is an open-source ML framework; neither of those makes them likely venues for a closely kept OpenAI trade secret. (And as the rest of this post shows, they were parroting claims made months earlier: I'm showing you claims through March '23, and Geohot didn't get around to repeating it until June!)
Recentering: it's not a mixture-of-experts model, regardless of whether people claimed 1 trillion, 100 trillion, or both. (BTW, easy proof of the extensive 1 trillion claims, innumerable, all in 2022: https://twitter.com/search?q=until%3A2022-12-31%20since%3A20...)
Now: suppose a reader just can't let go of the fact that some people also made 100 trillion claims, while I said most people made 1 trillion claims. I'm not sure what to say, because I never claimed that no one made 100 trillion claims as well, so I'm not sure how to give those readers peace so we can talk about mixture of experts. Apologize, I guess? I'm sorry.
Now we can definitely focus on mixture of experts.
Here are innumerable claims between January 1st and March 31st, 2023, that GPT-4 was a 1 trillion parameter mixture-of-experts model, as I claimed: https://www.google.com/search?q=mixture+of+experts+trillion+... [/r/MachineLearning](https://www.reddit.com/r/MachineLearning/comments/121q6nk/n_...) [the-decoder](https://the-decoder.com/gpt-4-has-a-trillion-parameters/) [rando boards](https://www.futuretimeline.net/forum/viewtopic.php?p=31145)