Artificial Intelligence (AI)

Kyutai STT is a streaming speech-to-text model architecture. Kyutai presents it as offering an unmatched trade-off between latency and accuracy, which suits interactive applications. Because it supports batching, a single GPU can process hundreds of concurrent conversations. Kyutai has released two models:
kyutai/stt-1b-en_fr, a low-latency model that understands English and French, and has a built-in semantic voice activity detector.
kyutai/stt-2.6b-en, a larger English-only model optimized to be as accurate as possible.
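
To make the batching idea concrete, here is a minimal sketch of how several audio streams might be advanced in lockstep, one batched step per frame. It is not Kyutai's implementation: `fake_model_step`, `split_frames`, `transcribe_concurrently`, and the toy `FRAME_SIZE` are all hypothetical names and values made up for illustration, and the "model" is a trivial loudness check standing in for a real batched forward pass.

```python
from typing import Dict, List

FRAME_SIZE = 4  # samples per frame; a toy value chosen for illustration


def split_frames(samples: List[float], frame_size: int = FRAME_SIZE) -> List[List[float]]:
    """Cut one audio stream into fixed-size frames, dropping a trailing partial frame."""
    n = len(samples) // frame_size
    return [samples[i * frame_size:(i + 1) * frame_size] for i in range(n)]


def fake_model_step(batch: List[List[float]]) -> List[str]:
    """Hypothetical stand-in for one batched model step: one output label per stream."""
    return ["speech" if max(abs(s) for s in frame) > 0.5 else "-" for frame in batch]


def transcribe_concurrently(streams: Dict[str, List[float]]) -> Dict[str, List[str]]:
    """Advance all streams together, issuing one batched step per frame index."""
    frames = {name: split_frames(s) for name, s in streams.items()}
    outputs: Dict[str, List[str]] = {name: [] for name in streams}
    steps = max((len(f) for f in frames.values()), default=0)
    for t in range(steps):
        # Only streams that still have audio at step t join the batch.
        active = [name for name, f in frames.items() if t < len(f)]
        batch = [frames[name][t] for name in active]
        for name, label in zip(active, fake_model_step(batch)):
            outputs[name].append(label)
    return outputs


if __name__ == "__main__":
    demo = {"a": [0.0] * 4 + [0.9] * 4, "b": [0.9] * 4}
    print(transcribe_concurrently(demo))
```

The point of the sketch is the loop shape: the cost of each step is shared across every active conversation, which is why batching lets one device serve many streams at once.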

https://kyutai.org/next/stt