Fahim In Tech
@fahimintech
1/ MiniMax-M1 just dropped and it’s beefy. we’re talkin' 456B params (only ~46B active per token), a wild 1 million token context window, and a hybrid MoE architecture that keeps it lean & mean. it’s like the giga-brain cousin of DeepSeek-R1 on turbo mode 💥
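quick aside on what "~46B active" actually means: mixture-of-experts layers route each token to only a couple of experts, so most of the 456B parameters sit idle on any single forward pass. here's a toy Python sketch of that routing idea (sizes, router, and experts are made up for illustration, not MiniMax-M1's real config):

import numpy as np

def moe_layer(x, experts, router_w, top_k=2):
    """x: (d,) token vector; experts: list of (W, b) toy FFNs; router_w: (d, n_experts)."""
    logits = x @ router_w                        # router score for every expert
    top = np.argsort(logits)[-top_k:]            # keep only the top-k experts
    gates = np.exp(logits[top]) / np.exp(logits[top]).sum()  # softmax over the chosen few
    out = np.zeros_like(x)
    for gate, idx in zip(gates, top):
        W, b = experts[idx]                      # only these weights are "active" for this token
        out += gate * np.tanh(x @ W + b)         # toy expert FFN
    return out

d, n_experts = 16, 8
rng = np.random.default_rng(0)
experts = [(rng.normal(size=(d, d)), rng.normal(size=d)) for _ in range(n_experts)]
router_w = rng.normal(size=(d, n_experts))
y = moe_layer(rng.normal(size=d), experts, router_w, top_k=2)
# 8 experts exist but each token only touches 2 of them — same idea, at scale,
# behind 456B total vs ~46B active params.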
2/ the game changer here is “Lightning Attention,” a linear-attention mechanism that cuts compute to roughly 25% of what DeepSeek-R1 needs at 100K-token generation lengths. basically, it reads a book and doesn't melt your GPU. throw in their CISPO reinforcement learning recipe and boom, it can do math, code, AND multi-turn reasoning in one go 🧠
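rough sketch of the linear-attention idea Lightning Attention builds on: vanilla softmax attention materializes an n×n score matrix (cost blows up quadratically with context length), while linear attention reorders the matmuls so cost grows roughly linearly with n. this is the generic trick, not MiniMax's actual tiled kernel or their exact hybrid layout:

import numpy as np

def softmax_attention(Q, K, V):
    scores = Q @ K.T / np.sqrt(Q.shape[-1])                    # (n, n) matrix — quadratic in seq length
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    Qp, Kp = phi(Q), phi(K)                                    # positive feature map
    kv = Kp.T @ V                                              # (d, d) — independent of seq length
    z = Kp.sum(axis=0)                                         # (d,) normalizer
    return (Qp @ kv) / (Qp @ z)[:, None]                       # (n, d), cost ~ O(n * d^2)

n, d = 1024, 64
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
print(softmax_attention(Q, K, V).shape, linear_attention(Q, K, V).shape)
# at 100K+ token generations, skipping the n×n matrix is where the
# "fraction of the FLOPs" long-context savings come from.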