Steve
@stevedylandev.eth
Btw, this is who you’re talking to when using www.elroy.computer

Dean Pierce πŸ‘¨β€πŸ’»πŸŒŽπŸŒ
@deanpierce.eth
Neat, is it TinyLlama 1.1b on a Jetson Nano?

Steve
@stevedylandev.eth
At the moment I think I have it set to just llama3.2, so nothing crazy at all. I’ll need to experiment with some larger models and see what the latency is like.
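
For context, a minimal sketch of how a Jetson-hosted llama3.2 bot like this might be queried, assuming it is served through Ollama's local HTTP API (the thread never says which runtime is used; the host, port, and model tag below are illustrative assumptions):

```python
# Minimal sketch, assuming the bot's model is served through Ollama's
# local HTTP API on the Jetson. The thread never says which runtime is
# used; the host, port, and model tag here are illustrative assumptions.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def ask(prompt: str, model: str = "llama3.2") -> str:
    """Send one prompt to the locally served model and return its reply."""
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for a single JSON object instead of a token stream
    }).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    print(ask("Hello from the Jetson!"))
```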

Dean Pierce πŸ‘¨β€πŸ’»πŸŒŽπŸŒ
@deanpierce.eth
Any idea if you're running the 1b or 3b parameter Llama 3.2? I don't have a Jetson Nano so I'm not sure what it can reasonably handle. I'm super curious how Gemma 3 1b or 4b compare for the sort of stuff you're doing. I don't think the Jetson Nano can run 4b, that might be too big, but it would be cool if it could.

Steve
@stevedylandev.eth
So it’s running the 3b parameter model, a little smaller, but it also runs pretty fast. The Jetson does have an NVIDIA GPU, but this model only has 8GB of RAM, so that does hamper it a little bit.
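
A rough back-of-envelope on why a 3b-parameter model can sit comfortably in the Jetson's 8GB of shared RAM, assuming a 4-bit quantized checkpoint (the quantization level and overhead figures are assumptions for illustration, not measurements from this setup):

```python
# Back-of-envelope memory estimate for a 3b-parameter model on 8GB of
# shared RAM. The 4-bit quantization and overhead figures below are
# assumptions for illustration, not measurements from this Jetson.
PARAMS = 3e9            # ~3 billion parameters (Llama 3.2 3b)
BYTES_PER_PARAM = 0.5   # 4-bit quantized weights

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9
overhead_gb = 1.5       # rough allowance for KV cache, runtime, and the OS

print(f"weights  ~{weights_gb:.1f} GB")
print(f"overhead ~{overhead_gb:.1f} GB")
print(f"total    ~{weights_gb + overhead_gb:.1f} GB of the 8 GB shared RAM")
```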