
Commit 8b19bac

docs: add Audio Converter guide
1 parent b06f578 commit 8b19bac

3 files changed

Lines changed: 109 additions & 0 deletions


docs/_sidebar.md

Lines changed: 1 addition & 0 deletions
@@ -27,6 +27,7 @@
2727
- [RTC Stats](guide/rtc_stats.md)
2828
- [Logging](guide/logging.md)
2929
- Utilities
30+
- [Audio Converter](guide/audio_converter.md)
3031
- [Voice Activity Detector](guide/voice_activity_detector.md)
3132

3233
- [**Build Notes**](build.md)

docs/guide/audio_converter.md

Lines changed: 107 additions & 0 deletions
@@ -0,0 +1,107 @@
1+
# Audio Converter
2+
3+
The `AudioConverter` remixes and resamples PCM audio frames to a target sample rate and channel count. It operates on 10 ms frames of 16‑bit PCM data and returns the number of output samples produced for each 10 ms frame.
4+
5+
API: `dev.onvoid.webrtc.media.audio.AudioConverter`
6+
7+
## Overview
8+
9+
- Input format: 16‑bit little‑endian PCM (`byte[]`)
10+
- Frame duration: exactly 10 ms per call
11+
- Channel remixing: up/down‑mix between mono/stereo (and other counts if supported by the native backend)
12+
- Resampling: arbitrary input/output sample rates (e.g., 48 kHz → 16 kHz)
13+
- Memory ownership: you provide both input and output buffers
14+
- Native resources: must be released with `dispose()` when done
15+
16+
Key methods:
17+
- `AudioConverter(int srcSampleRate, int srcChannels, int dstSampleRate, int dstChannels)` – configure the converter
18+
- `int getTargetBufferSize()` – bytes required for the destination buffer for one 10 ms frame
19+
- `int convert(byte[] src, byte[] dst)` – convert one 10 ms input frame into the destination buffer; returns the number of samples written for that frame, counted across all channels
20+
- `void dispose()` – free native resources
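
Because the converter takes 16‑bit little‑endian PCM as `byte[]`, pipelines that hold samples as `short[]` need a packing step first. The following is a minimal sketch using only `java.nio`; the `Pcm16` class and its method names are hypothetical helpers for illustration and are not part of the library.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration; not part of webrtc-java.
final class Pcm16 {

    private Pcm16() {
    }

    /** Packs 16-bit samples into little-endian bytes (2 bytes per sample). */
    static byte[] toBytes(short[] samples) {
        ByteBuffer buffer = ByteBuffer.allocate(samples.length * 2)
                .order(ByteOrder.LITTLE_ENDIAN);
        // The short view inherits the little-endian order set above.
        buffer.asShortBuffer().put(samples);
        return buffer.array();
    }

    /** Unpacks little-endian bytes into 16-bit samples. */
    static short[] toShorts(byte[] bytes) {
        if (bytes.length % 2 != 0) {
            throw new IllegalArgumentException("PCM16 data must have an even byte count");
        }
        short[] samples = new short[bytes.length / 2];
        ByteBuffer.wrap(bytes)
                .order(ByteOrder.LITTLE_ENDIAN)
                .asShortBuffer()
                .get(samples);
        return samples;
    }
}
```

With such a helper, a frame held as `short[]` could be passed as `converter.convert(Pcm16.toBytes(frame), dstFrame)`, and the result read back with `Pcm16.toShorts(dstFrame)`.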
21+
22+
## Frame sizing
23+
24+
The converter operates on 10 ms frames. For a given sample rate and channel count, the number of samples per 10 ms is:
25+
26+
- Samples per channel = sampleRate / 100
27+
- Total samples (all channels) = samples per channel × channels
28+
- Bytes required = total samples × 2 (because 16‑bit PCM)
29+
30+
Examples:
31+
- 48 kHz stereo input: samples = (48000 / 100) × 2 = 480 × 2 = 960 samples → 1920 bytes
32+
- 16 kHz mono output: samples = (16000 / 100) × 1 = 160 samples → 320 bytes
33+
34+
The method `getTargetBufferSize()` returns the exact number of bytes you need for the destination buffer for one 10 ms frame of the configured output format.
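
The sizing rules in this section can be sketched as a small helper for computing the source buffer size, which the API does not report for you. `FrameSizing` and its method names are hypothetical and exist only for this illustration; the integer division mirrors the `sampleRate / 100` formula given here.

```java
// Hypothetical helper for illustration; not part of webrtc-java.
final class FrameSizing {

    /** 16-bit PCM uses two bytes per sample. */
    static final int BYTES_PER_SAMPLE = 2;

    private FrameSizing() {
    }

    /** Samples per channel in one 10 ms frame (integer division). */
    static int samplesPerChannel(int sampleRate) {
        if (sampleRate <= 0) {
            throw new IllegalArgumentException("sampleRate must be positive");
        }
        return sampleRate / 100;
    }

    /** Total samples across all channels in one 10 ms frame. */
    static int totalSamples(int sampleRate, int channels) {
        if (channels <= 0) {
            throw new IllegalArgumentException("channels must be positive");
        }
        return samplesPerChannel(sampleRate) * channels;
    }

    /** Bytes needed to hold one 10 ms frame of 16-bit PCM. */
    static int bytesPer10Ms(int sampleRate, int channels) {
        return totalSamples(sampleRate, channels) * BYTES_PER_SAMPLE;
    }
}
```

For example, `FrameSizing.bytesPer10Ms(48000, 2)` gives 1920 (480 samples per channel, 960 samples in total), and `FrameSizing.bytesPer10Ms(16000, 1)` gives 320.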
35+
36+
## Basic usage
37+
38+
```java
39+
import dev.onvoid.webrtc.media.audio.AudioConverter;
40+
41+
// Convert 48 kHz stereo to 16 kHz mono
42+
int srcSampleRate = 48000;
43+
int srcChannels = 2;
44+
int dstSampleRate = 16000;
45+
int dstChannels = 1;
46+
47+
AudioConverter converter = new AudioConverter(srcSampleRate, srcChannels, dstSampleRate, dstChannels);
48+
49+
try {
50+
// Compute 10 ms frame sizes
51+
int srcSamplesPer10ms = (srcSampleRate / 100) * srcChannels; // 480 * 2 = 960 samples
52+
int srcBytesPer10ms = srcSamplesPer10ms * 2; // 1920 bytes
53+
54+
byte[] srcFrame = new byte[srcBytesPer10ms];
55+
56+
// Destination buffer for one 10 ms frame of output
57+
byte[] dstFrame = new byte[converter.getTargetBufferSize()]; // e.g., 320 bytes for 16 kHz mono
58+
59+
// Fill srcFrame from your capture or pipeline (exactly 10 ms of PCM 16‑bit data)
60+
// ...
61+
62+
int outSamples = converter.convert(srcFrame, dstFrame);
63+
// outSamples equals (dstSampleRate / 100) * dstChannels, e.g., 160 for 16 kHz mono
64+
65+
// Process/use dstFrame (contains 10 ms of resampled/remixed PCM 16‑bit data)
66+
}
67+
finally {
68+
converter.dispose();
69+
}
70+
```
71+
72+
## Continuous conversion loop
73+
74+
```java
75+
AudioConverter converter = new AudioConverter(48000, 2, 48000, 1); // stereo to mono, same rate
76+
77+
try {
78+
int srcBytesPer10ms = (48000 / 100) * 2 /*channels*/ * 2 /*bytes*/; // 480 * 2 * 2 = 1920
79+
byte[] srcFrame = new byte[srcBytesPer10ms];
80+
byte[] dstFrame = new byte[converter.getTargetBufferSize()];
81+
82+
while (running) {
83+
// Read exactly 10 ms of input into srcFrame
84+
// ...
85+
86+
converter.convert(srcFrame, dstFrame);
87+
88+
// Write/queue dstFrame to the next stage (encoder, file, etc.)
89+
}
90+
}
91+
finally {
92+
converter.dispose();
93+
}
94+
```
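
The loop above assumes each read yields exactly 10 ms of input. When a source delivers chunks of other sizes, one possible approach is to accumulate bytes and hand off a frame each time the buffer fills. The sketch below is an assumption about how that might look; `FrameAccumulator` is a hypothetical class, not part of the library.

```java
import java.util.function.Consumer;

// Hypothetical helper for illustration; not part of webrtc-java.
final class FrameAccumulator {

    private final byte[] frame;
    private int filled;

    /** @param frameBytes size of one 10 ms frame in bytes */
    FrameAccumulator(int frameBytes) {
        if (frameBytes <= 0) {
            throw new IllegalArgumentException("frameBytes must be positive");
        }
        this.frame = new byte[frameBytes];
    }

    /**
     * Appends a chunk of any size and calls onFrame once per completed frame.
     * The array passed to onFrame is reused, so consume it before returning.
     */
    void push(byte[] chunk, int offset, int length, Consumer<byte[]> onFrame) {
        while (length > 0) {
            int n = Math.min(length, frame.length - filled);
            System.arraycopy(chunk, offset, frame, filled, n);
            filled += n;
            offset += n;
            length -= n;

            if (filled == frame.length) {
                onFrame.accept(frame);
                filled = 0;
            }
        }
    }

    /** Bytes buffered toward the next, still incomplete frame. */
    int pendingBytes() {
        return filled;
    }
}
```

A capture callback could then call `accumulator.push(chunk, 0, chunk.length, f -> converter.convert(f, dstFrame))`, so that `convert` only ever sees full frames.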
95+
96+
## Error handling and caveats
97+
98+
- Frame length must be exactly 10 ms. If `src` has fewer samples than required, `convert` throws `IllegalArgumentException`.
99+
- Ensure `dst` is at least `getTargetBufferSize()` bytes. Otherwise, `IllegalArgumentException` is thrown.
100+
- Audio is assumed to be 16‑bit PCM. Do not pass float or 24‑bit samples.
101+
- Always call `dispose()` to free native resources when the converter is no longer needed.
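
If the upstream audio is in float format, it has to be converted to 16‑bit PCM before it reaches the converter. A minimal sketch follows, assuming samples normalized to [-1.0, 1.0] and a scale factor of 32767 (one common convention; some pipelines use 32768). `FloatToPcm16` is a hypothetical helper, not part of the library.

```java
// Hypothetical helper for illustration; not part of webrtc-java.
final class FloatToPcm16 {

    private FloatToPcm16() {
    }

    /**
     * Converts normalized float samples to 16-bit little-endian PCM.
     * Values outside [-1.0, 1.0] are clamped. The 32767 scale factor
     * is an assumption; adjust it to match your pipeline.
     */
    static byte[] convert(float[] samples) {
        byte[] out = new byte[samples.length * 2];

        for (int i = 0; i < samples.length; i++) {
            float clamped = Math.max(-1.0f, Math.min(1.0f, samples[i]));
            short value = (short) Math.round(clamped * 32767.0f);

            out[2 * i] = (byte) (value & 0xFF);            // low byte first
            out[2 * i + 1] = (byte) ((value >> 8) & 0xFF); // then high byte
        }
        return out;
    }
}
```

The resulting array holds two bytes per input sample, so a 10 ms frame of floats converts directly into a `src` buffer of the size `convert` expects.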
102+
103+
## Related guides
104+
105+
- [Audio Processing](guide/audio_processing.md)
106+
- [Headless Audio](guide/headless_audio_device_module.md)
107+
- [Voice Activity Detector](guide/voice_activity_detector.md)

docs/guide/overview.md

Lines changed: 1 addition & 0 deletions
@@ -37,6 +37,7 @@ This section provides detailed guides for various features of the webrtc-java li
3737

3838
## Utilities
3939

40+
- [Audio Converter](guide/audio_converter.md) - Resample and remix 10 ms PCM frames between different rates and channel layouts
4041
- [Voice Activity Detector](guide/voice_activity_detector.md) - Detect speech activity in PCM audio streams
4142

4243
## Additional Resources
