https://opensea.io/collection/science-14
lfd
@lfd
Coded the Llama 3.2 model from scratch and shared it on the HF Hub. Why? I think 1B & 3B models are great for experimentation, and I…
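For anyone curious what "from scratch" involves: Llama-family blocks are built from a few small components. A minimal sketch of one of them, RMSNorm (the normalization Llama models use) — illustrative only, not lfd's actual code:

```python
import numpy as np

def rms_norm(x, weight, eps=1e-5):
    # Root-mean-square normalization as used in Llama-family transformer blocks:
    # scale each vector so its RMS is ~1, then apply a learned per-channel gain.
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return (x / rms) * weight

# With a unit gain, the output's RMS is ~1 regardless of input scale
x = np.array([[1.0, 2.0, 3.0, 4.0]])
y = rms_norm(x, np.ones(4))
print(y.shape)  # (1, 4)
```

Unlike LayerNorm, RMSNorm skips mean subtraction, which is one reason it is cheap enough for small 1B/3B models.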
serafinroscoe1
@serafinroscoe1
Me (earlier this year): "Llama models aren't optimized for production." Meta: "Bet. Here's the Llama 4 suite, MoE models with 16 & 128 experts." Me: "Yeah... maybe dense wasn't so bad after all."
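For readers outside the loop on the MoE jargon: a mixture-of-experts layer routes each token to a small subset of expert networks instead of one big dense feed-forward. A minimal top-k routing sketch (purely illustrative — not Meta's Llama 4 implementation; all names and shapes here are made up):

```python
import numpy as np

def moe_route(hidden, gate_w, experts, top_k=2):
    # hidden: (d,) token vector; gate_w: (d, n_experts) router weights;
    # experts: list of callables, one per expert network.
    logits = hidden @ gate_w                 # router score per expert
    top = np.argsort(logits)[-top_k:]        # indices of the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                 # softmax over the selected experts
    # Only the selected experts run; their outputs are mixed by router weight
    return sum(w * experts[i](hidden) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 16                         # 16 experts, as in the smaller Llama 4 MoE
gate_w = rng.normal(size=(d, n_experts))
experts = [(lambda W: (lambda x: x @ W))(rng.normal(size=(d, d)))
           for _ in range(n_experts)]
out = moe_route(rng.normal(size=d), gate_w, experts)
print(out.shape)  # (8,)
```

The appeal over dense models: total parameters scale with the expert count, but per-token compute only scales with `top_k`.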