66 changes: 58 additions & 8 deletions api-reference/client/android/transports/gemini-websocket.mdx
@@ -14,24 +14,49 @@ The Gemini Live Websocket transport implementation enables real-time audio commu

## Installation

Add the transport dependency to your `build.gradle`:
Add the transport dependency to your `app/build.gradle.kts`:

```gradle
implementation "ai.pipecat:gemini-live-websocket-transport:0.3.7"
```kotlin
dependencies {
implementation("ai.pipecat:gemini-live-websocket-transport:0.3.7")
}
```

## Usage
Add the microphone permission to `AndroidManifest.xml`:

```xml
<uses-permission android:name="android.permission.RECORD_AUDIO" />
```
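
Editor's note, not part of the diff: on Android 6.0 (API 23) and later, `RECORD_AUDIO` is a dangerous permission, so the manifest entry alone is not enough and the user must also grant it at runtime before the microphone can be used. A minimal sketch of that check follows, assuming an AndroidX `ComponentActivity` host; `MainActivity` and `startVoiceSession` are hypothetical names that stand in for wherever the app creates and connects its client.

```kotlin
import android.Manifest
import android.content.pm.PackageManager
import android.os.Bundle
import androidx.activity.ComponentActivity
import androidx.activity.result.contract.ActivityResultContracts
import androidx.core.content.ContextCompat

class MainActivity : ComponentActivity() {

    // Registered as a property initializer so the launcher exists before the activity starts.
    private val requestMicPermission =
        registerForActivityResult(ActivityResultContracts.RequestPermission()) { granted ->
            if (granted) startVoiceSession()
        }

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)

        val alreadyGranted = ContextCompat.checkSelfPermission(
            this, Manifest.permission.RECORD_AUDIO
        ) == PackageManager.PERMISSION_GRANTED

        if (alreadyGranted) {
            startVoiceSession()
        } else {
            // Shows the system permission dialog; the result arrives in the callback above.
            requestMicPermission.launch(Manifest.permission.RECORD_AUDIO)
        }
    }

    // Hypothetical placeholder: build the transport and client here, then connect.
    private fun startVoiceSession() {
    }
}
```

Deferring client creation until the permission is granted avoids starting a session that cannot capture audio.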

Create a client:
## Usage

```kotlin
import ai.pipecat.client.RTVIClient
import ai.pipecat.client.RTVIEventCallbacks
import ai.pipecat.client.gemini_live_websocket.GeminiLiveWebsocketTransport
import ai.pipecat.client.types.BotReadyData
import ai.pipecat.client.types.RTVIClientOptions
import ai.pipecat.client.types.RTVIClientParams
import ai.pipecat.client.types.Value

val callbacks = object : RTVIEventCallbacks() {
override fun onBackendError(message: String) {
Log.e(TAG, "Backend error: $message")
}

override fun onBotReady(data: BotReadyData) {
Log.d(TAG, "Bot is ready")
}
}

val transport = GeminiLiveWebsocketTransport.Factory(context)

val options = RTVIClientOptions(
params = RTVIClientParams(
baseUrl = null,
config = GeminiLiveWebsocketTransport.buildConfig(
apiKey = "<your Gemini api key>",
apiKey = "your-gemini-api-key",
model = "models/gemini-2.0-flash-exp",
generationConfig = Value.Object(
"speech_config" to Value.Object(
"voice_config" to Value.Object(
@@ -48,11 +73,36 @@ val options = RTVIClientOptions(

val client = RTVIClient(transport, callbacks, options)

client.start().withCallback {
// ...
client.connect().withCallback { result ->
result.errorOrNull?.let { Log.e(TAG, "Connection failed: $it") }
}
```

## Configuration

### buildConfig

`GeminiLiveWebsocketTransport.buildConfig()` accepts the following parameters:

| Parameter | Type | Description |
|---|---|---|
| `apiKey` | `String` | Your Gemini API key |
| `model` | `String` | Model name (default: `"models/gemini-2.0-flash-exp"`) |
| `initialUserMessage` | `String?` | Optional message to send at session start |
| `generationConfig` | `Value.Object` | Generation config (voice, language, etc.) |
| `systemInstruction` | `Value?` | Optional system instruction |
| `tools` | `Value.Array` | Optional tools/function definitions |

### Audio devices

The transport exposes static constants for audio routing:

```kotlin
// Route audio to speakerphone (default) or earpiece
client.updateMic(GeminiLiveWebsocketTransport.AudioDevices.Speakerphone.id)
client.updateMic(GeminiLiveWebsocketTransport.AudioDevices.Earpiece.id)
```

## Resources

<CardGroup cols={2}>
111 changes: 85 additions & 26 deletions api-reference/client/android/transports/openai-webrtc.mdx
@@ -5,44 +5,103 @@ description: "WebRTC implementation for Android using OpenAI"

The OpenAI Realtime WebRTC transport implementation enables real-time audio communication with the OpenAI Realtime service, using a direct WebRTC connection.

<Note>
Transports of this type connect directly to OpenAI's API from the client,
which exposes your API key. This is designed primarily for development and
testing. For production applications, proxy through a server component to
keep credentials secure.
</Note>

## Installation

Add the transport dependency to your `build.gradle`:
Add the transport dependency to your `app/build.gradle.kts`:

```gradle
implementation "ai.pipecat:openai-realtime-webrtc-transport:0.3.7"
```kotlin
dependencies {
implementation("ai.pipecat:openai-realtime-webrtc-transport:1.2.0")
Contributor: This version doesn't exist, the OpenAI realtime transport is still on 0.3.7.

Contributor: Or more specifically, it hasn't been published -- I think it was a work in progress when I was pulled onto other tasks.

}
```

## Usage
Add the microphone permission to `AndroidManifest.xml`:

Create a client:
```xml
<uses-permission android:name="android.permission.RECORD_AUDIO" />
```

## Usage

```kotlin
val transport = OpenAIRealtimeWebRTCTransport.Factory(context)

val options = RTVIClientOptions(
params = RTVIClientParams(
baseUrl = null,
config = OpenAIRealtimeWebRTCTransport.buildConfig(
apiKey = apiKey,
initialMessages = listOf(
LLMContextMessage(role = "user", content = "How tall is the Eiffel Tower?")
),
initialConfig = OpenAIRealtimeSessionConfig(
voice = "ballad",
turnDetection = Value.Object("type" to Value.Str("semantic_vad")),
inputAudioNoiseReduction = Value.Object("type" to Value.Str("near_field")),
inputAudioTranscription = Value.Object("model" to Value.Str("gpt-4o-transcribe"))
)
import ai.pipecat.client.PipecatClientOptions
import ai.pipecat.client.PipecatEventCallbacks
import ai.pipecat.client.openai_realtime_webrtc.OpenAIRealtimeSessionConfig
import ai.pipecat.client.openai_realtime_webrtc.OpenAIServiceOptions
import ai.pipecat.client.openai_realtime_webrtc.PipecatClientOpenAIRealtimeWebRTC
import ai.pipecat.client.openai_realtime_webrtc.OpenAIRealtimeWebRTCTransport
import ai.pipecat.client.types.BotReadyData
import ai.pipecat.client.types.Value

val callbacks = object : PipecatEventCallbacks() {
override fun onBackendError(message: String) {
Log.e(TAG, "Backend error: $message")
}

override fun onBotReady(data: BotReadyData) {
Log.d(TAG, "Bot is ready")
}
}

val options = PipecatClientOptions(callbacks = callbacks, enableMic = true)
val client = PipecatClientOpenAIRealtimeWebRTC(OpenAIRealtimeWebRTCTransport(context), options)

client.connect(
OpenAIServiceOptions(
apiKey = "your-openai-api-key",
model = "gpt-4o-realtime-preview",
sessionConfig = OpenAIRealtimeSessionConfig(
voice = "alloy",
instructions = "You are a helpful assistant.",
turnDetection = Value.Object("type" to Value.Str("semantic_vad")),
inputAudioTranscription = Value.Object("model" to Value.Str("gpt-4o-transcribe"))
)
)
)
).withCallback { result ->
result.errorOrNull?.let { Log.e(TAG, "Connection failed: $it") }
}
```

val client = RTVIClient(transport, callbacks, options)
## Configuration

client.start().withCallback {
// ...
}
### OpenAIServiceOptions

| Parameter | Type | Description |
|---|---|---|
| `apiKey` | `String` | Your OpenAI API key |
| `sessionConfig` | `OpenAIRealtimeSessionConfig` | Session configuration |
| `model` | `String?` | Model name (default: `"gpt-realtime"`) |
| `initialMessages` | `List<LLMContextMessage>` | Messages to inject at session start |

### OpenAIRealtimeSessionConfig

| Parameter | Type | Description |
|---|---|---|
| `modalities` | `List<String>?` | Output modalities (e.g. `["audio", "text"]`) |
| `instructions` | `String?` | System instructions for the model |
| `voice` | `String?` | Voice name (e.g. `"alloy"`, `"ballad"`) |
| `turnDetection` | `Value?` | Turn detection config |
| `inputAudioNoiseReduction` | `Value?` | Noise reduction config |
| `inputAudioTranscription` | `Value?` | Transcription model config |
| `tools` | `Value?` | Tool/function definitions |
| `toolChoice` | `String?` | Tool choice strategy |
| `temperature` | `Float?` | Sampling temperature |

### Audio devices

The transport exposes static constants for audio routing:

```kotlin
// Route audio to speakerphone (default) or earpiece
client.updateMic(OpenAIRealtimeWebRTCTransport.AudioDevices.Speakerphone.id)
client.updateMic(OpenAIRealtimeWebRTCTransport.AudioDevices.Earpiece.id)
```

## Resources
91 changes: 76 additions & 15 deletions api-reference/client/android/transports/small-webrtc.mdx
@@ -3,34 +3,95 @@ title: "Small WebRTC Transport"
description: "WebRTC implementation for Android"
---

The Small WebRTC transport implementation enables real-time audio communication with the Small WebRTC Pipecat transport, using a direct WebRTC connection.
The Small WebRTC transport enables real-time audio communication with a Pipecat bot over a direct WebRTC connection, with no third-party account required.

## Installation

Add the transport dependency to your `build.gradle`:
Add the transport dependency to your `app/build.gradle.kts`:

```gradle
implementation "ai.pipecat:small-webrtc-transport:0.3.7"
```kotlin
dependencies {
implementation("ai.pipecat:small-webrtc-transport:1.2.0")
Contributor: This version has also never existed

}
```

## Usage
Add the microphone permission to `AndroidManifest.xml`:

Create a client:
```xml
<uses-permission android:name="android.permission.RECORD_AUDIO" />
```

## Usage

```kotlin
val transport = SmallWebRTCTransport.Factory(context, baseUrl)
import ai.pipecat.client.PipecatClientOptions
import ai.pipecat.client.PipecatEventCallbacks
import ai.pipecat.client.small_webrtc_transport.PipecatClientSmallWebRTC
import ai.pipecat.client.small_webrtc_transport.SmallWebRTCTransport
import ai.pipecat.client.types.APIRequest
import ai.pipecat.client.types.BotReadyData
import ai.pipecat.client.types.Value

val options = RTVIClientOptions(
params = RTVIClientParams(baseUrl = null),
enableMic = true,
enableCam = true
)
val callbacks = object : PipecatEventCallbacks() {
override fun onBackendError(message: String) {
Log.e(TAG, "Backend error: $message")
}

val client = RTVIClient(transport, callbacks, options)
override fun onBotReady(data: BotReadyData) {
Log.d(TAG, "Bot is ready")
}
}

val options = PipecatClientOptions(callbacks = callbacks, enableMic = true)
val client = PipecatClientSmallWebRTC(SmallWebRTCTransport(context), options)

client.start().withCallback {
// ...
// Connect via your server endpoint (recommended)
client.startBotAndConnect(
APIRequest(endpoint = "https://your-server.com/api/offer", requestData = Value.Object())
).withCallback { result ->
result.errorOrNull?.let { Log.e(TAG, "Connection failed: $it") }
}

// Or connect with coroutines
// client.startBotAndConnect(...).await()
```

## Configuration

### IceConfig

Pass custom ICE servers for TURN/STUN support:

```kotlin
val iceConfig = IceConfig(
iceServers = listOf(
IceServer(
urls = listOf("turn:your-turn-server.com:3478"),
username = "user",
credential = "pass"
)
)
)

val transport = SmallWebRTCTransport(context, iceConfig = iceConfig)
```

### Audio devices

The transport exposes static constants for audio routing:

```kotlin
// Route audio to speakerphone (default) or earpiece
client.updateMic(SmallWebRTCTransport.AudioDevices.Speakerphone.id)
client.updateMic(SmallWebRTCTransport.AudioDevices.Earpiece.id)
```

### Camera selection

```kotlin
// Switch between front and rear cameras
client.updateCam(SmallWebRTCTransport.Cameras.Front.id)
client.updateCam(SmallWebRTCTransport.Cameras.Rear.id)
```

## Resources