Batching / Concurrent audio streams for different clients

Is it possible to handle multiple audio streams at once? I think [this comment](https://github.com/ufal/SimulStreaming/blob/main/whisper_streaming/whisper_server.py#L51-L52) is about how to _finish_ one stream and _then_ start another. I'm curious whether it's possible to start processing a new stream while still processing existing ones. I think there are two levels to this:

- Level 1: handle concurrent audio streams in a non-batching fashion, i.e. keep a separate state per stream and alternate inference calls between them, perhaps in a round-robin fashion.
- Level 2: handle concurrent audio streams in the same model call, optimized with batching, and dynamically push and pop streams as they start and finish.

It seems like the [ASR model itself](https://github.com/ufal/SimulStreaming/blob/main/simul_whisper/simul_whisper.py#L43) isn't stateful (it looks like KV caching happens [outside the model](https://github.com/ufal/SimulStreaming/blob/main/simul_whisper/simul_whisper.py#L74-L82)). So maybe a modest refactor could allow the same model instance to be shared across multiple active online ASR instances. Unless I'm missing something, this would accomplish Level 1, which would be good.
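
To make Level 1 concrete, here is a rough sketch of the shape I have in mind. Everything in it is hypothetical: `SharedModel`, `StreamState`, and `round_robin` are illustrative names and not part of SimulStreaming, and the "model" just echoes its input. The only point is that one model instance serves several streams as long as each stream owns its own cache.

```python
from collections import deque
from dataclasses import dataclass, field


class SharedModel:
    """Hypothetical stand-in for a stateless ASR model (not the SimulStreaming API)."""

    def __init__(self):
        self.calls = []  # order of inference calls, kept only for illustration

    def infer(self, chunk, cache):
        # A real model would run an encoder/decoder step here. This toy version
        # records the chunk in the caller-owned cache and echoes it back.
        self.calls.append(chunk)
        cache.append(chunk)
        return f"<{chunk}>"


@dataclass
class StreamState:
    """Per-client state that lives outside the model (hypothetical)."""
    pending: deque = field(default_factory=deque)  # audio chunks not yet processed
    cache: list = field(default_factory=list)      # stands in for a per-stream KV cache
    output: list = field(default_factory=list)     # results produced so far


def round_robin(model, streams):
    """Issue one inference call per stream in turn until no stream has pending audio."""
    active = deque(streams.values())
    while active:
        state = active.popleft()
        if not state.pending:
            continue  # stream has finished, so it leaves the rotation
        chunk = state.pending.popleft()
        state.output.append(model.infer(chunk, state.cache))
        active.append(state)


model = SharedModel()
streams = {
    "a": StreamState(pending=deque(["a1", "a2", "a3"])),
    "b": StreamState(pending=deque(["b1"])),
}
round_robin(model, streams)
print(model.calls)
```

In this toy run the calls interleave across the two streams while their caches stay separate, which is all Level 1 would need if the real model is indeed stateless.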

Level 2 seems like a more significant lift.
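
For reference, this is roughly the scheduling pattern I mean by Level 2, as a toy sketch only. `batched_infer` and `BatchScheduler` are hypothetical names, and the "model" just uppercases strings. A real version would also have to pad audio to a common length and batch the per-stream KV caches, which is where I'd expect most of the work to be.

```python
from collections import deque


def batched_infer(chunks):
    """Hypothetical batched model call: returns one result per input chunk."""
    return [c.upper() for c in chunks]


class BatchScheduler:
    """Toy scheduler that groups one chunk from each active stream into a batch."""

    def __init__(self, max_batch=2):
        self.max_batch = max_batch
        self.streams = {}   # stream id -> deque of pending chunks (insertion-ordered)
        self.outputs = {}   # stream id -> list of results

    def push(self, sid, chunks):
        """Register a new stream, or queue more audio for an existing one."""
        self.outputs.setdefault(sid, [])
        if chunks:
            self.streams.setdefault(sid, deque()).extend(chunks)

    def step(self):
        """Run one batched call; return how many streams were in the batch."""
        ids = list(self.streams)[: self.max_batch]
        batch = []
        for sid in ids:
            pending = self.streams.pop(sid)
            batch.append(pending.popleft())
            if pending:
                # Re-insert at the end so waiting streams get the next slot.
                self.streams[sid] = pending
            # Otherwise the stream is finished and stays popped.
        if batch:
            for sid, result in zip(ids, batched_infer(batch)):
                self.outputs[sid].append(result)
        return len(batch)


sched = BatchScheduler(max_batch=2)
sched.push("a", ["a1", "a2"])
sched.push("b", ["b1"])
sched.push("c", ["c1", "c2"])

batch_sizes = []
while sched.streams:
    batch_sizes.append(sched.step())
print(batch_sizes, sched.outputs)
```

Streams join via `push` at any time and drop out of the rotation once their queue is empty, so the batch composition changes from step to step.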

Have you guys thought about implementing either of these things?

Thanks!
