AravSrinivas
@ozogysozogysevaq
Perplexity serves MoEs such as post-trained versions of DeepSeek-v3. In multi-node deployments, these models can be made to use GPUs efficiently, achieving both higher throughput and lower latency than single-node deployments. https://t.co/pZwOaRb0oZ