@thief
So we were given VMs with GPUs and an open-ended task: serve a certain model with vLLM (no restriction on quantised variants etc.). Course instructors will be blasting/stress-testing it on Monday, last server standing wins. Optimising the fuck out of my vLLM config now. This is the most fun I've had in a while.