sidhant pfp
sidhant

@sidhant

it's even more insane than that. "Gemma 3n models use selective parameter activation technology to reduce resource requirements. This technique allows the models to operate at an effective size of 2B and 4B parameters, which is lower than the total number of parameters they contain."
1 reply
0 recast
2 reactions