@mikachip
Interesting to finally get a glimpse into the inner workings of GPT-4.
TL;DR: GPT-4 is made up of 16 'expert' models of ~110B parameters each, for ~1.8 trillion parameters in total (more than 10x the 175B parameters of GPT-3.5).
https://www.semianalysis.com/p/gpt-4-architecture-infrastructure
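
Quick sanity check on the numbers in the TL;DR, as a rough sketch only. The constants are the approximate figures quoted above, not confirmed values, and the helper name `total_expert_params` is made up for illustration. The simple product also ignores any parameters that might be shared across experts.

```python
# Approximate figures as quoted in the post (assumptions, not confirmed values).
NUM_EXPERTS = 16
PARAMS_PER_EXPERT = 110e9   # ~110B per expert
GPT35_PARAMS = 175e9        # 175B figure cited in the post


def total_expert_params(num_experts: int, params_per_expert: float) -> float:
    """Naive total: number of experts times parameters per expert."""
    return num_experts * params_per_expert


total = total_expert_params(NUM_EXPERTS, PARAMS_PER_EXPERT)
ratio = total / GPT35_PARAMS

print(f"total ≈ {total / 1e12:.2f}T parameters")  # rounds to the ~1.8T quoted
print(f"ratio ≈ {ratio:.1f}x the 175B figure")
```

So 16 x ~110B comes to about 1.76T, which rounds to the ~1.8T quoted and is just over 10x 175B.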