@kazani
You probably won't find one model that ticks every capability box, so it's better to download a few and experiment with them.
Here are some of my favorites, in no particular order:
1. Gemma 3 12B QAT: good for vision tasks, and generally a solid non-reasoning model that's fast and produces good text
https://huggingface.co/google/gemma-3-12b-it-qat-q4_0-gguf
2. Qwen3 4B 2507 Thinking: The updated version of Qwen3 4B, which also has a non-reasoning variant; it's really small, fast, and good quality for its size
https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507
3. GPT-OSS 20B: The largest model that can run on my machine, with three levels of reasoning; it's rather slow, but it's the smartest and most capable of the bunch
https://huggingface.co/openai/gpt-oss-20b
4. Phi-4 (14B): My favorite before GPT-OSS; it now has reasoning and reasoning-plus variants, but I haven't used it lately
https://huggingface.co/microsoft/phi-4
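
If you'd rather script the downloads than click through each page, here's a rough sketch using the `huggingface_hub` package (`pip install huggingface_hub`). The repo IDs come straight from the links above; the `FAVORITES` dict, the `repo_url` and `download_all` helpers, and the `models` folder are just names I made up for illustration. Note that `snapshot_download` pulls every file in a repo, so expect large downloads, and some repos may want you to accept a license and log in first.

```python
# Repo IDs taken from the links in the list above.
FAVORITES = {
    "Gemma 3 12B QAT": "google/gemma-3-12b-it-qat-q4_0-gguf",
    "Qwen3 4B 2507 Thinking": "Qwen/Qwen3-4B-Thinking-2507",
    "GPT-OSS 20B": "openai/gpt-oss-20b",
    "Phi-4 (14B)": "microsoft/phi-4",
}


def repo_url(repo_id: str) -> str:
    """Build the Hugging Face page URL for a repo ID."""
    return f"https://huggingface.co/{repo_id}"


def download_all(target_dir: str = "models") -> None:
    """Download each repo into its own subfolder of target_dir.

    Hypothetical helper: the folder layout is an assumption, not a convention.
    """
    # Imported here so the rest of the script works without the package.
    from huggingface_hub import snapshot_download

    for name, repo_id in FAVORITES.items():
        local_dir = f"{target_dir}/{repo_id.split('/')[-1]}"
        print(f"Downloading {name} from {repo_url(repo_id)} -> {local_dir}")
        snapshot_download(repo_id=repo_id, local_dir=local_dir)
```

Call `download_all()` to fetch everything, or trim `FAVORITES` down to the ones you actually want to try first.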