v6.0.0-beta.2

Pre-release

@lukeocodes lukeocodes released this 08 Jan 15:28
430a46a

Release Notes: v6.0.0-beta.2

Overview

This beta release (v6.0.0-beta.2) fixes binary audio handling across all WebSocket clients, regenerates the SDK from the latest API specifications, and improves the Agent, Listen, and Speak modules.

🎯 What's Changed

🐛 Bug Fixes

Binary Audio Support Improvements

  • Listen (WebSocket): Fixed send_media() parameter type from str to bytes for proper binary audio handling in both V1 and V2 clients
  • Speak (WebSocket): Added comprehensive binary audio response support in socket client
    • Updated response union type from str to bytes
    • Enhanced recv(), __iter__(), __aiter__(), and start_listening() methods to handle both binary bytes and JSON text messages
  • Agent (WebSocket):
    • Added missing send_media(message: bytes) method to both async and sync Agent V1 clients
    • Updated response union type from str to bytes for binary audio data
    • Enhanced all response handling methods to support both binary bytes and JSON text messages
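Because the socket clients can now yield either binary bytes or JSON text, code that consumes messages may need to branch on the message type. The following is a minimal sketch of that branching, not the SDK's own implementation: `classify_message` is a hypothetical helper, and it assumes raw `bytes` and `str` payloads, whereas the SDK may hand back parsed objects for JSON messages.

```python
import json


def classify_message(message):
    """Sort a raw WebSocket payload into audio or a decoded JSON event.

    Hypothetical helper for illustration; not part of deepgram-sdk.
    """
    if isinstance(message, (bytes, bytearray)):
        # Binary frames carry audio data and must not be decoded as text.
        return ("audio", bytes(message))
    if isinstance(message, str):
        # Text frames carry JSON-encoded events.
        return ("event", json.loads(message))
    raise TypeError(f"unsupported message type: {type(message).__name__}")


kind, payload = classify_message(b"\x00\x01\x02")
print(kind, len(payload))
```

A consumer loop could call a helper like this on each received item, writing audio payloads to a buffer and routing events to a handler.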

🔄 SDK Updates

  • SDK Regeneration: Complete regeneration of SDK with latest API specifications
    • Added entity detection support in Listen V1 with new entities field in transcription results
    • Updated Agent V1 think provider models with support for:
      • Anthropic models
      • AWS Bedrock with new credential handling
      • Google models
      • Groq models
    • Improved Agent V1 speak provider structure with dedicated provider classes (Cartesia, Deepgram, ElevenLabs, OpenAI)
    • Enhanced Agent V1 listen provider with V1 and V2 configurations
    • Added comprehensive HTTP client improvements and test coverage

📚 Documentation & Configuration

  • Updated .fernignore to protect manual socket client fixes from regeneration
  • Added .github, docs, and examples folders to .fernignore
  • Documented all manual binary audio fixes in socket clients

📊 Stats

  • 107 files changed: 1,940 insertions(+), 1,036 deletions(-)
  • Key modules affected: Agent, Listen, Speak, Core HTTP Client

⚠️ Breaking Changes

  • send_media() methods now require bytes instead of str for all WebSocket clients (Listen V1, Listen V2, Agent V1, Speak V1)
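For callers migrating to the bytes-only `send_media()`, the simplest fix is usually to read audio in binary mode and pass the chunks through unchanged. The sketch below shows one way to do that under stated assumptions: `iter_audio_chunks` is a hypothetical helper, the 4096-byte chunk size is arbitrary, and `connection` in the trailing comment stands in for an already-open socket client.

```python
import io


def iter_audio_chunks(stream, chunk_size=4096):
    """Yield bytes chunks from a binary stream until it is exhausted.

    Hypothetical helper for illustration; not part of deepgram-sdk.
    """
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        if not isinstance(chunk, bytes):
            # A text-mode stream returns str, which send_media() no longer accepts.
            raise TypeError("open the audio source in binary mode ('rb')")
        yield chunk


# Demonstration with an in-memory stream standing in for an audio file.
for chunk in iter_audio_chunks(io.BytesIO(b"\x00" * 10), chunk_size=4):
    print(len(chunk))

# With a real connection (assumed to be open already), usage would look like:
# with open("audio.wav", "rb") as f:
#     for chunk in iter_audio_chunks(f):
#         connection.send_media(chunk)
```

The type check is there to fail early: a file opened without `"rb"` produces `str` chunks, which is the most likely way old code ends up passing the wrong type.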

📦 Installation

pip install deepgram-sdk==6.0.0b2

Full Changelog: v6.0.0-alpha.4...v6.0.0-beta.2