https://opensea.io/collection/science-14
lfd
@lfd
Coded the Llama 3.2 model from scratch and shared it on the HF Hub. Why? I think 1B & 3B models are great for experimentation, and I…
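For anyone curious what "from scratch" involves: Llama-family blocks are built from a few small components. A minimal sketch of one of them, RMSNorm (the normalization Llama models use) — illustrative only, not lfd's actual code:

```python
import numpy as np

def rms_norm(x, weight, eps=1e-5):
    # Root-mean-square normalization as used in Llama-family transformer blocks:
    # scale each vector so its RMS is ~1, then apply a learned per-channel gain.
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return (x / rms) * weight

# With a unit gain, the output's RMS is ~1 regardless of input scale
x = np.array([[1.0, 2.0, 3.0, 4.0]])
y = rms_norm(x, np.ones(4))
print(y.shape)  # (1, 4)
```

Unlike LayerNorm, RMSNorm skips mean subtraction, which is one reason it is cheap enough for small 1B/3B models.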
serafinroscoe1
@serafinroscoe1
Me (earlier this year): "Llama models aren't optimized for production." Meta: "Bet. Here's the Llama 4 suite, MoE models with 16 & 128 experts." Me: "Yeah... maybe dense wasn't so bad after all."
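For readers outside the loop on the MoE jargon: a mixture-of-experts layer routes each token to a small subset of expert networks instead of one big dense feed-forward. A minimal top-k routing sketch (purely illustrative — not Meta's Llama 4 implementation; all names and shapes here are made up):

```python
import numpy as np

def moe_route(hidden, gate_w, experts, top_k=2):
    # hidden: (d,) token vector; gate_w: (d, n_experts) router weights;
    # experts: list of callables, one per expert network.
    logits = hidden @ gate_w                 # router score per expert
    top = np.argsort(logits)[-top_k:]        # indices of the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                 # softmax over the selected experts
    # Only the selected experts run; their outputs are mixed by router weight
    return sum(w * experts[i](hidden) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 16                         # 16 experts, as in the smaller Llama 4 MoE
gate_w = rng.normal(size=(d, n_experts))
experts = [(lambda W: (lambda x: x @ W))(rng.normal(size=(d, d)))
           for _ in range(n_experts)]
out = moe_route(rng.normal(size=d), gate_w, experts)
print(out.shape)  # (8,)
```

The appeal over dense models: total parameters scale with the expert count, but per-token compute only scales with `top_k`.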