get webrtc adm into rust #1037
Conversation
…o a room and thus failing the audio mode switching
```rust
/// Tracks the number of active room connections.
/// Used to prevent audio mode switching while rooms are connected.
static ACTIVE_ROOM_COUNT: AtomicUsize = AtomicUsize::new(0);
```
suggestion: A potentially cleaner way to handle this is to have every room hold an `Arc<()>` and leverage `strong_count` to learn the number of active rooms; no need to manually decrement.
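A minimal sketch of that idea (names like `RoomTracker` are illustrative, not part of the SDK):

```rust
use std::sync::Arc;

/// Illustrative tracker: each active room holds a clone of `token`.
/// The live-room count is the strong count minus the tracker's own
/// reference, and dropping a room's clone "decrements" automatically.
struct RoomTracker {
    token: Arc<()>,
}

impl RoomTracker {
    fn new() -> Self {
        Self { token: Arc::new(()) }
    }

    /// Hand one clone to each room on connect.
    fn room_token(&self) -> Arc<()> {
        Arc::clone(&self.token)
    }

    fn active_rooms(&self) -> usize {
        Arc::strong_count(&self.token) - 1
    }
}

fn main() {
    let tracker = RoomTracker::new();
    let room_a = tracker.room_token();
    let room_b = tracker.room_token();
    println!("active rooms: {}", tracker.active_rooms()); // 2
    drop(room_a);
    println!("active rooms: {}", tracker.active_rooms()); // 1
    drop(room_b);
    println!("active rooms: {}", tracker.active_rooms()); // 0
}
```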
```rust
/// Test setting Platform mode.
#[test]
#[serial]
```
comment (non-blocking): Currently in CI, all tests are run serially. If we switch over to Nextest (outdated PR, #816), we can configure which tests are run in serial through that config and run everything else in parallel.
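A minimal sketch of what that nextest configuration might look like (group name and filter are illustrative, assuming the serial tests are identifiable by name):

```toml
# Hypothetical .config/nextest.toml: run tests whose names match the
# filter one at a time, and everything else in parallel.
[test-groups]
serial = { max-threads = 1 }

[[profile.default.overrides]]
filter = 'test(serial)'
test-group = 'serial'
```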
…different local audio tracks
Changeset
The following package versions will be affected by this PR:
Force-pushed from faf99f6 to 0ba69b0
…hub.com/livekit/rust-sdks into sxian/CLT-2765/bring-webrtc-adm-to-rust
> - Text-to-speech (TTS) audio
> - Audio from files or network streams
> - Testing without audio hardware
This is the original audio input right? Existing Unity clients who want to keep the "Unity" style microphone management would also use this.
> ### Hybrid Approach
>
> You can combine both approaches - use `PlatformAudio` for automatic speaker playback while also creating `NativeAudioStream` for audio processing/analysis:
Is this also possible from a Unity client? As we discussed, for the lip sync animation Unity clients might want read access to the audio data, but still want output through the platform audio.
```proto
// Set recording device
message SetRecordingDeviceRequest {
  uint64 platform_audio_handle = 1;
  uint32 index = 2;
}

message SetRecordingDeviceResponse {
  optional string error = 1;
}

// Set playout device
message SetPlayoutDeviceRequest {
  uint64 platform_audio_handle = 1;
  uint32 index = 2;
}
```
How does it handle switching the device at runtime?
> **Suitable for:**
> - Server-side agents
> - Text-to-speech (TTS) audio
> - Audio from files or network streams
Or screen share audio right?
```rust
async fn main() {
    env_logger::init();

    let args: Vec<String> = env::args().collect();
```
suggestion: Since we already use clap (with derive) for argument parsing in the other examples, it might be a good idea to use the same approach here. We also would get --help for free if the args have doc comments.
```rust
// Connect to a room using the specified env variables
// and print all incoming events
// Usage:
```
nitpick: Other examples put usage guide in a README in the example directory.
````rust
/// let audio = PlatformAudio::new()?;
/// println!("Found {} microphones", audio.recording_devices());
/// ```
pub fn recording_devices(&self) -> i16 {
````
suggestion: Index-based property accessors like this are not typical in Rust APIs. I would recommend encapsulating these info fields in a struct and making the API iterator based:

```rust
struct RecordingDeviceInfo {
    pub index: u16, // or new type
    pub name: String,
    pub guid: String, // or new type
}
```

The signature of this method becomes:

```rust
pub fn recording_devices(&self) -> impl IntoIterator<Item = AudioDeviceInfo>
```

Usage:

```rust
let audio = PlatformAudio::new()?;
for device in audio.recording_devices() {
    println!("{} (GUID: {})", device.name, device.guid);
}

// Alternatively, collect into a Vec
let device_list: Vec<_> = audio.recording_devices().collect();
```

This same pattern would apply to playout devices.
````rust
/// ```
///
/// [`recording_device_guid`]: Self::recording_device_guid
pub fn set_recording_device_by_guid(&self, guid: &str) -> AudioResult<()> {
````
suggestion: This is a good application for the new type pattern. The query API would provide a new type wrapper (e.g. RecordingDeviceGuid) instead of a string for device GUID, and this method would accept it. This method still has to be fallible since a device might no longer be available, but applying this pattern adds a level of type safety that enforces correct usage (e.g., providing an arbitrary string that is not a valid guid is not possible). Also applicable to playout devices.
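A self-contained sketch of that newtype pattern (all names and the setter body are illustrative, not the PR's actual API):

```rust
/// Hypothetical newtype for a recording-device GUID. In the real API
/// only the device-query layer would construct it, so callers cannot
/// pass an arbitrary string where a GUID is expected.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RecordingDeviceGuid(String);

impl RecordingDeviceGuid {
    /// Would be `pub(crate)` in the SDK; public here so the sketch runs.
    pub fn new(raw: impl Into<String>) -> Self {
        Self(raw.into())
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}

/// Illustrative setter signature: accepts the newtype, but stays
/// fallible since the device may have been unplugged meanwhile.
fn set_recording_device_by_guid(guid: &RecordingDeviceGuid) -> Result<(), String> {
    if guid.as_str().is_empty() {
        return Err("device no longer available".into());
    }
    Ok(())
}

fn main() {
    let guid = RecordingDeviceGuid::new("builtin-mic-0");
    println!("selecting {:?}: {:?}", guid, set_recording_device_by_guid(&guid));
}
```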
```rust
/// [`set_recording_device`]: Self::set_recording_device
pub fn switch_recording_device(&self, index: u16) -> AudioResult<()> {
    let count = self.recording_devices();
    if index >= count as u16 {
```
question: What happens if the device index is invalidated between this check and the call to runtime.set_recording_device(index)?
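One way to close that window, sketched with hypothetical std-only types: perform the bounds check and the selection in the same critical section (or equivalently, let the lower layer report the error itself) so the index cannot be invalidated in between.

```rust
use std::sync::Mutex;

/// Illustrative registry; the SDK's real device list lives in WebRTC.
struct DeviceRegistry {
    devices: Mutex<Vec<String>>,
}

impl DeviceRegistry {
    fn set_recording_device(&self, index: usize) -> Result<String, String> {
        // Check and use under one lock: no separate "count devices,
        // then act" steps for a hot-plug event to race against.
        let devices = self.devices.lock().unwrap();
        devices
            .get(index)
            .cloned()
            .ok_or_else(|| format!("device index {index} out of range"))
    }
}

fn main() {
    let registry = DeviceRegistry {
        devices: Mutex::new(vec!["Built-in Mic".to_string()]),
    };
    println!("{:?}", registry.set_recording_device(0));
    println!("{:?}", registry.set_recording_device(5));
}
```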
```toml
livekit = { workspace = true, features = ["rustls-tls-native-roots"] }
livekit-api = { workspace = true, features = ["rustls-tls-native-roots"] }
log = { workspace = true }
hound = "3.5"
```
suggestion: This should be made a workspace dependency since it is also used by livekit-wakeword, the basic_room example, and soxr-sys (as a dev dependency). This will ensure we only pull down one version.
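Concretely, that would mean declaring the version once at the root and inheriting it in each member crate (a sketch; the actual manifests may differ):

```toml
# Root Cargo.toml: declare the version once for the whole workspace
[workspace.dependencies]
hound = "3.5"
```

```toml
# Member crate Cargo.toml: inherit it
[dependencies]
hound = { workspace = true }
```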
```rust
// Log audio m-lines to debug sample rate issues
for line in sdp.lines() {
    if line.starts_with("m=audio") || line.contains("opus") || line.contains("a=rtpmap") {
        log::info!("SDP audio: {}", line);
```
issue: This should be removed (or made debug level) before merging.
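If the logging stays at debug level, the predicate itself is unchanged; a std-only sketch with the filter factored into a hypothetical helper (only the log level differs from the snippet above):

```rust
/// True for SDP lines relevant to audio sample-rate debugging
/// (same predicate as the snippet under review).
fn is_audio_sdp_line(line: &str) -> bool {
    line.starts_with("m=audio") || line.contains("opus") || line.contains("a=rtpmap")
}

fn main() {
    let sdp = "v=0\nm=audio 9 UDP/TLS/RTP/SAVPF 111\na=rtpmap:111 opus/48000/2\n";
    for line in sdp.lines().filter(|l| is_audio_sdp_line(l)) {
        // In the SDK this would be `log::debug!` instead of `log::info!`.
        eprintln!("SDP audio: {}", line);
    }
}
```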
```rust
use livekit::options::TrackPublishOptions;
use livekit::{prelude::*, AudioError, AudioResult, PlatformAudio, RtcAudioSource};
use serial_test::serial;
use tokio::time::timeout;
```
nitpick: There are a few unused imports here.
```rust
// ==================== Platform Audio ====================

/// FFI wrapper for PlatformAudio handle.
pub struct FfiPlatformAudio {
```
issue: To follow the established pattern, all request handlers and FFI wrappers for platform audio should be moved into their own module.
```rust
}

// ===== Device Management Methods =====
// These methods are primarily for FFI use. Use PlatformAudio for the public API.
```
issue: I do not see these methods being called from livekit-ffi. If they are never used directly for FFI, they should be marked `pub(crate)`.
Summary
This PR implements Platform Audio support for the LiveKit Rust SDK, enabling WebRTC's built-in audio device handling with microphone capture and speaker playout. The implementation introduces a handle-based PlatformAudio API that coexists with the existing NativeAudioSource for manual audio pushing.
Key Features
Design Document
See docs/ADM_PROXY_DESIGN.md for full architecture details including:
API Overview

```rust
use livekit::prelude::*;

// Create PlatformAudio instance (enables ADM recording)
let audio = PlatformAudio::new()?;

// Enumerate and select devices
for i in 0..audio.recording_devices() as u16 {
    println!("Mic [{}]: {}", i, audio.recording_device_name(i));
}
audio.set_recording_device(0)?;

// Connect and publish
let (room, _) = Room::connect(&url, &token, RoomOptions::default()).await?;
let track = LocalAudioTrack::create_audio_track("mic", audio.rtc_source());
room.local_participant().publish_track(LocalTrack::Audio(track), opts).await?;

// Cleanup - just drop the handle
room.close().await?;
drop(audio); // ADM recording disabled when all handles released
```
Testing

Run Standalone Tests (no LiveKit server required)

```sh
# Set custom WebRTC build path
export LK_CUSTOM_WEBRTC="/path/to/webrtc-sys/libwebrtc/mac-arm64-debug"

# Run standalone PlatformAudio tests
cargo test -p livekit --test platform_audio_test test_platform_audio_standalone -- --nocapture

# Run FFI request handler tests
cargo test -p livekit-ffi requests::tests -- --nocapture
```

Run E2E Integration Tests (requires LiveKit server)

Start a local LiveKit server first, then:

```sh
LIVEKIT_URL=ws://localhost:7880 \
LIVEKIT_API_KEY=devkey \
LIVEKIT_API_SECRET=secret \
cargo test -p livekit --test platform_audio_test --features __lk-e2e-test -- --nocapture
```
Test Coverage

| Category | Tests | Description |
| --- | --- | --- |
| Standalone - Creation | 1 | PlatformAudio creation, device enumeration |
| Standalone - Ref Counting | 1 | Clone, sharing, drop behavior |
| Standalone - Device Selection | 1 | Set devices, invalid index handling |
| Standalone - Processing | 1 | AEC/AGC/NS configuration, hardware availability |
| Standalone - Reset | 1 | reset_platform_audio() function |
| Standalone - Lifecycle | 1 | Full create→configure→use→release cycle |
| FFI - Handlers | 6 | NewPlatformAudio, GetDevices, SetDevice, handle lifecycle |
| E2E - Room Connection | 4+ | Platform audio with room, two participants, device switching |
All tests handle missing audio devices gracefully (CI-friendly).
Run the Example

```sh
# List audio devices
cargo run -p basic_room -- --list-devices

# Connect with platform audio (microphone capture)
LIVEKIT_URL=wss://your-server.livekit.cloud \
LIVEKIT_API_KEY=your-key \
LIVEKIT_API_SECRET=your-secret \
cargo run -p basic_room -- --platform-audio

# Connect with file audio
cargo run -p basic_room -- --file path/to/audio.raw

# Connect with both platform audio and a file
cargo run -p basic_room -- --platform-audio-and-file path/to/audio.raw
```
WebRTC Build Requirements
The `external_audio_source.patch` must be applied to WebRTC. The patch is automatically applied by all platform build scripts:
For local development, set `LK_CUSTOM_WEBRTC` to point to your patched WebRTC build.
Known Limitations

| Limitation | Details |
| --- | --- |
| Process-global | Audio configuration affects all rooms in the process |
| Device indices | May change on hot-plug; match by name for persistence |
| Single device track | One device audio track per ADM (use NativeAudioSource for additional streams) |