Release v6.0.0-beta.2 · deepgram/deepgram-python-sdk

Release Notes: v6.0.0-beta.2

Overview

This beta release (v6.0.0-beta.2) fixes binary audio handling across all WebSocket clients, regenerates the SDK from the latest API specifications, and updates the Agent, Listen, and Speak modules.

🎯 What's Changed

🐛 Bug Fixes

Binary Audio Support Improvements

Listen (WebSocket): Fixed send_media() parameter type from str to bytes for proper binary audio handling in both V1 and V2 clients
Speak (WebSocket): Added comprehensive binary audio response support in socket client
- Updated response union type from str to bytes
- Enhanced recv(), __iter__(), __aiter__(), and start_listening() methods to handle both binary bytes and JSON text messages
Agent (WebSocket):
- Added missing send_media(message: bytes) method to both async and sync Agent V1 clients
- Updated response union type from str to bytes for binary audio data
- Enhanced all response handling methods to support both binary bytes and JSON text messages
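
To illustrate what handling both binary bytes and JSON text messages means for calling code, here is a minimal sketch of a dispatcher. It does not use the SDK: `classify_message` is a hypothetical helper, and the `"Metadata"` event payload is a made-up example, not a documented message type.

```python
import json
from typing import Union


def classify_message(message: Union[bytes, str]) -> dict:
    """Split a raw WebSocket frame into audio or a parsed JSON event.

    Hypothetical helper for illustration; not part of the SDK.
    """
    if isinstance(message, (bytes, bytearray)):
        # Binary frame: treat it as raw audio and pass it through unchanged.
        return {"kind": "audio", "data": bytes(message)}
    # Text frame: parse it as a JSON message.
    return {"kind": "event", "data": json.loads(message)}


# Made-up frames: one binary audio chunk and one JSON text message.
frames = [b"\x00\x01\x02\x03", '{"type": "Metadata"}']
for frame in frames:
    result = classify_message(frame)
    print(result["kind"])
```

Code that previously assumed every received message was a str would need a branch like this before calling `json.loads`.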

🔄 SDK Updates

SDK Regeneration: Complete regeneration of SDK with latest API specifications
- Added entity detection support in Listen V1 with new entities field in transcription results
- Updated Agent V1 think provider models with support for:
  - Anthropic models
  - AWS Bedrock with new credential handling
  - Google models
  - Groq models
- Improved Agent V1 speak provider structure with dedicated provider classes (Cartesia, Deepgram, ElevenLabs, OpenAI)
- Enhanced Agent V1 listen provider with V1 and V2 configurations
- Improved the HTTP client and expanded test coverage

📚 Documentation & Configuration

Updated .fernignore to protect manual socket client fixes from regeneration
Added .github, docs, and examples folders to .fernignore
Documented all manual binary audio fixes in socket clients

📊 Stats

107 files changed: 1,940 insertions(+), 1,036 deletions(-)
Key modules affected: Agent, Listen, Speak, Core HTTP Client

⚠️ Breaking Changes

send_media() methods now require bytes instead of str for all WebSocket clients (Listen V1, Listen V2, Agent V1, Speak V1)
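
A rough sketch of what this change implies for callers: audio should be read in binary mode and sent as bytes chunks. `StubSocket` below is a stand-in, not the SDK client; it only mimics a `send_media(message: bytes)` method, and its `TypeError` check is an assumption about how a bytes-only contract might be enforced. `iter_chunks` and the 4096-byte chunk size are also illustrative choices.

```python
from typing import Iterator, List


class StubSocket:
    """Stand-in for a WebSocket client; only mimics a bytes-only send_media()."""

    def __init__(self) -> None:
        self.sent: List[bytes] = []

    def send_media(self, message: bytes) -> None:
        # Assumed behavior for illustration: reject anything that is not bytes.
        if not isinstance(message, bytes):
            raise TypeError("send_media() expects bytes, got " + type(message).__name__)
        self.sent.append(message)


def iter_chunks(audio: bytes, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield fixed-size slices of an audio buffer; the last may be shorter."""
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]


socket = StubSocket()
audio = bytes(10000)  # placeholder for data read with open(path, "rb")
for chunk in iter_chunks(audio):
    socket.send_media(chunk)

print(len(socket.sent), "chunks sent")
```

Code that previously passed text to `send_media()` would instead keep the audio as bytes from the point where it is read.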

📦 Installation

pip install deepgram-sdk==6.0.0b2
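
The version in the pip command, 6.0.0b2, is the PEP 440 normalized spelling of the v6.0.0-beta.2 tag. As a quick check after installing, the sketch below reads the installed version through the standard library; `installed_version` is a hypothetical helper name, and the check assumes the distribution is registered as `deepgram-sdk`, as in the command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("deepgram-sdk")
if version is None:
    print("deepgram-sdk is not installed")
else:
    print("deepgram-sdk", version)
```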

Full Changelog: v6.0.0-alpha.4...v6.0.0-beta.2
