How do I chunk audio from a stream for downstream processing? #618

@shashi-netra

Overview

I need to run a speech recognition engine on an audio stream in (near) real time. My idea is to use PyAV to buffer a fixed amount of audio, say 10 audio frames, and then run speech recognition on that chunk. Is there a recommended way to chunk an incoming audio stream?

Expected behavior

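# File container for output:
import av

# Assumes `container` is an already opened input container and
# `audio_stream` is its audio stream.
out_container = av.open('test.wav', 'w')
out_stream = out_container.add_stream(template=audio_stream)
for i, packet in enumerate(container.demux(audio_stream)):
    if packet.dts is None:
        continue  # skip the empty flush packet that demux() emits at the end
    print(float(packet.pts * packet.stream.time_base))
    packet.stream = out_stream  # reassign the packet to the output stream
    out_container.mux(packet)
    if (i + 1) % 10 == 0:
        # Run the speech recognition module after every 10th audio packet.
        speech_recog()
out_container.close()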

Actual behavior

Is this the recommended approach to chunk audio from a stream?
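
To make the chunking idea concrete, here is a minimal sketch that regroups decoded audio into fixed-size chunks in memory instead of counting packets. It is only an illustration under stated assumptions: `chunk_pcm`, the constants, and the simulated frames are hypothetical names and values, and the audio is assumed to already be raw 16-bit mono PCM at 16 kHz. It does not call PyAV at all, so the decoding step is left out.

```python
from typing import Iterable, Iterator


def chunk_pcm(blocks: Iterable[bytes], chunk_bytes: int, flush: bool = True) -> Iterator[bytes]:
    """Regroup arbitrarily sized PCM byte blocks into fixed-size chunks.

    Hypothetical helper: `blocks` stands in for the raw bytes of decoded
    audio frames, whatever their individual sizes are.
    """
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")
    buffer = bytearray()
    for block in blocks:
        buffer.extend(block)
        # Emit as many complete chunks as the buffer currently holds.
        while len(buffer) >= chunk_bytes:
            yield bytes(buffer[:chunk_bytes])
            del buffer[:chunk_bytes]
    # Optionally emit the shorter remainder once the stream ends.
    if flush and buffer:
        yield bytes(buffer)


# Assumed format (not from the original post): 16 kHz, mono, 16-bit samples.
SAMPLE_RATE = 16000
BYTES_PER_SAMPLE = 2
CHUNK_SECONDS = 1
CHUNK_BYTES = SAMPLE_RATE * BYTES_PER_SAMPLE * CHUNK_SECONDS

if __name__ == "__main__":
    # Simulated decoded frames of 1024 silent samples each.
    fake_frames = [bytes(1024 * BYTES_PER_SAMPLE) for _ in range(40)]
    for n, chunk in enumerate(chunk_pcm(fake_frames, CHUNK_BYTES)):
        # A real pipeline would hand `chunk` to the recognizer here.
        print(n, len(chunk))
```

Chunking by a fixed number of bytes (and therefore a fixed duration) rather than by packet count keeps the chunk length predictable, since packet and frame sizes vary between codecs. With PyAV, the blocks would presumably come from decoded frames rather than demuxed packets, and PyAV's own `av.AudioFifo` (`write(frame)` / `read(samples)`) may already cover this buffering without a hand-written helper.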

