https://warpcast.com/~/channel/theai
Web3Gen0
@web3gen0
Hey folks, I recently came across this and wanted to share!

VMoBA: Mixture-of-Block Attention for Video Diffusion Models

This paper tackles one of the core bottlenecks in Video Diffusion Models (VDMs): the quadratic complexity of full attention, which slows down training and inference, especially for long-duration, high-resolution videos.

The proposed solution, VMoBA (Video Mixture of Block Attention), introduces a sparse attention mechanism that:
- Adapts to spatio-temporal patterns
- Selects important blocks globally
- Dynamically reduces attention complexity

The results?
- ~3x FLOPs speedup in training
- ~1.5x faster inference latency
- Maintains or even improves video generation quality

Super exciting direction for scaling up video generation efficiently! Check it out: https://arxiv.org/abs/2506.22347
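
To make the block-selection idea concrete, here is a minimal pure-Python sketch of generic mixture-of-block attention. It is not the paper's implementation. It assumes a flat 1D sequence, mean-pooled keys as block summaries, and a per-query top-k gate, whereas the paper describes spatio-temporal partitioning and global selection. The names `block_sparse_attention` and `score_cost_ratio` are hypothetical, chosen only for illustration.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def block_sparse_attention(queries, keys, values, block_size, top_k):
    """Each query attends only to the top_k key blocks whose mean key
    scores highest against it (an assumed, simplified gate)."""
    n, dim = len(keys), len(keys[0])
    blocks = [range(s, min(s + block_size, n)) for s in range(0, n, block_size)]
    # Mean-pooled key per block acts as a cheap summary of that block.
    centroids = [
        [sum(keys[i][d] for i in blk) / len(blk) for d in range(dim)]
        for blk in blocks
    ]
    scale = 1.0 / math.sqrt(dim)
    outputs, chosen = [], []
    for q in queries:
        gate = [dot(q, c) for c in centroids]
        picked = sorted(range(len(blocks)), key=lambda b: gate[b], reverse=True)
        picked = sorted(picked[:top_k])
        idx = [i for b in picked for i in blocks[b]]
        # Ordinary softmax attention, restricted to the selected keys.
        weights = softmax([dot(q, keys[i]) * scale for i in idx])
        out = [
            sum(w * values[i][d] for w, i in zip(weights, idx))
            for d in range(len(values[0]))
        ]
        outputs.append(out)
        chosen.append(picked)
    return outputs, chosen


def score_cost_ratio(seq_len, block_size, top_k):
    """Full-attention score count divided by the sparse score count
    (gate scores against block summaries included)."""
    num_blocks = math.ceil(seq_len / block_size)
    full = seq_len * seq_len
    sparse = seq_len * (num_blocks + top_k * block_size)
    return full / sparse
```

With `top_k` equal to the number of blocks, the sketch reduces to ordinary full attention, which makes it easy to sanity-check. `score_cost_ratio` counts only query-key score computations, so it is a rough proxy for FLOPs savings, not a reproduction of the numbers reported in the paper.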