fix: Handle Edge TTS failures with Azure fallback and stabilize media pipeline (Issue #785) #787
JAAAACCCCCCKKKK wants to merge 1 commit into harry0703:main from [...]
Reproduction log:
You can now view your Streamlit app in your browser.
Local URL: http://localhost:8501
******** sys.path ********
2025-10-01 22:06:54.502 | INFO | app.config.config:load_config:23 - load config from file: C:\Users\a1523\Downloads\MoneyPrinterTurbo-Portable-Windows-1.2.6\MoneyPrinterTurbo/config.toml
generating video script
2025-10-01 22:10:54 | DEBUG | "./app\services\task.py:26": generate_script - video script:
generating audio
2025-10-01 22:10:54 | INFO | "./app\services\voice.py:1137": azure_tts_v1 - Edge TTS start, voice name: zh-CN-XiaoxiaoNeural, try: 1
generating subtitle, provider: edge
2025-10-01 22:10:55 | INFO | "./app\services\voice.py:1532": create_subtitle - completed, subtitle file created: .\storage\tasks\3f3a6f4b-4251-4670-8074-7adcbd6542ef\subtitle.srt, duration: 4.237
preprocess local materials
{'video_found': True, 'audio_found': False, 'metadata': {'major_brand': 'isom', 'minor_version': '512', 'compatible_brands': 'isomiso2avc1mp41', 'encoder': 'Lavf61.7.100'}, 'inputs': [{'streams': [{'input_number': 0, 'stream_number': 0, 'stream_type': 'video', 'language': None, 'default': True, 'size': [580, 751], 'bitrate': 180, 'fps': 30.0, 'codec_name': 'h264', 'profile': '(High 4:4:4 Predictive)', 'metadata': {'Metadata': '', 'handler_name': 'VideoHandler', 'vendor_id': '[0][0][0][0]', 'encoder': 'Lavc61.19.100 libx264'}}], 'input_number': 0}], 'duration': 3.0, 'bitrate': 185, 'start': 0.0, 'default_video_input_number': 0, 'default_video_stream_number': 0, 'video_codec_name': 'h264', 'video_profile': '(High 4:4:4 Predictive)', 'video_size': [580, 751], 'video_bitrate': 180, 'video_fps': 30.0, 'video_duration': 3.0, 'video_n_frames': 90}
combining video: 1 => .\storage\tasks\3f3a6f4b-4251-4670-8074-7adcbd6542ef\combined-1.mp4
2025-10-01 22:10:56 | INFO | "./app\services\video.py:129": combine_videos - audio duration: 4.82 seconds
generating video script
2025-10-01 22:12:30 | DEBUG | "./app\services\task.py:26": generate_script - video script:
generating audio
2025-10-01 22:12:30 | INFO | "./app\services\voice.py:1137": azure_tts_v1 - Edge TTS start, voice name: zh-CN-XiaoxiaoNeural, try: 1
generating subtitle, provider: edge
2025-10-01 22:12:32 | INFO | "./app\services\voice.py:1532": create_subtitle - completed, subtitle file created: .\storage\tasks\a4b3ebcd-00b1-4942-9fce-a6642c9fcc7a\subtitle.srt, duration: 4.237
preprocess local materials
{'video_found': True, 'audio_found': False, 'metadata': {'major_brand': 'isom', 'minor_version': '512', 'compatible_brands': 'isomiso2avc1mp41', 'encoder': 'Lavf61.7.100'}, 'inputs': [{'streams': [{'input_number': 0, 'stream_number': 0, 'stream_type': 'video', 'language': None, 'default': True, 'size': [580, 751], 'bitrate': 180, 'fps': 30.0, 'codec_name': 'h264', 'profile': '(High 4:4:4 Predictive)', 'metadata': {'Metadata': '', 'handler_name': 'VideoHandler', 'vendor_id': '[0][0][0][0]', 'encoder': 'Lavc61.19.100 libx264'}}], 'input_number': 0}], 'duration': 3.0, 'bitrate': 185, 'start': 0.0, 'default_video_input_number': 0, 'default_video_stream_number': 0, 'video_codec_name': 'h264', 'video_profile': '(High 4:4:4 Predictive)', 'video_size': [580, 751], 'video_bitrate': 180, 'video_fps': 30.0, 'video_duration': 3.0, 'video_n_frames': 90}
combining video: 1 => .\storage\tasks\a4b3ebcd-00b1-4942-9fce-a6642c9fcc7a\combined-1.mp4
2025-10-01 22:12:33 | INFO | "./app\services\video.py:129": combine_videos - audio duration: 4.82 seconds
generating video: 1 => .\storage\tasks\3f3a6f4b-4251-4670-8074-7adcbd6542ef\final-1.mp4
2025-10-01 22:13:07 | INFO | "./app\services\video.py:379": generate_video - generating video: 1080 x 1920
generating video script
2025-10-01 22:13:39 | DEBUG | "./app\services\task.py:26": generate_script - video script:
generating audio
2025-10-01 22:13:39 | INFO | "./app\services\voice.py:1137": azure_tts_v1 - Edge TTS start, voice name: zh-CN-XiaoxiaoNeural, try: 1
generating subtitle, provider: edge
2025-10-01 22:13:41 | INFO | "./app\services\voice.py:1532": create_subtitle - completed, subtitle file created: .\storage\tasks\eb0762bc-f525-4e5b-b16f-d0ff6c57c102\subtitle.srt, duration: 4.938
preprocess local materials
{'video_found': True, 'audio_found': False, 'metadata': {'major_brand': 'isom', 'minor_version': '512', 'compatible_brands': 'isomiso2avc1mp41', 'encoder': 'Lavf61.7.100'}, 'inputs': [{'streams': [{'input_number': 0, 'stream_number': 0, 'stream_type': 'video', 'language': None, 'default': True, 'size': [580, 751], 'bitrate': 180, 'fps': 30.0, 'codec_name': 'h264', 'profile': '(High 4:4:4 Predictive)', 'metadata': {'Metadata': '', 'handler_name': 'VideoHandler', 'vendor_id': '[0][0][0][0]', 'encoder': 'Lavc61.19.100 libx264'}}], 'input_number': 0}], 'duration': 3.0, 'bitrate': 185, 'start': 0.0, 'default_video_input_number': 0, 'default_video_stream_number': 0, 'video_codec_name': 'h264', 'video_profile': '(High 4:4:4 Predictive)', 'video_size': [580, 751], 'video_bitrate': 180, 'video_fps': 30.0, 'video_duration': 3.0, 'video_n_frames': 90}
combining video: 1 => .\storage\tasks\eb0762bc-f525-4e5b-b16f-d0ff6c57c102\combined-1.mp4
2025-10-01 22:13:42 | INFO | "./app\services\video.py:129": combine_videos - audio duration: 5.52 seconds
generating video: 1 => .\storage\tasks\eb0762bc-f525-4e5b-b16f-d0ff6c57c102\final-1.mp4
2025-10-01 22:15:53 | INFO | "./app\services\video.py:379": generate_video - generating video: 1080 x 1920
Unit test log (with v1 disconnected):
Ran 7 tests in 273.605s
OK
Rejected after local validation: this fallback only handles timeout/connector errors and does not fix the current Edge TTS 403 failure mode, so the reported issue remains unresolved. |
This pull request introduces improvements to video and voice processing services, focusing on robustness, error handling, and resource management. The most significant changes include enhanced file handling for video merging, improved cleanup of video/image clip resources, and a new fallback mechanism for voice synthesis using Azure Speech SDK when Edge TTS fails. Additionally, the test suite now covers the voice synthesis fallback logic.
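The fallback idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the code from this pull request: `synthesize_with_fallback`, `RETRYABLE_ERRORS`, and the `fallback_configured` flag are hypothetical names, the two synthesizers are passed in as plain callables, and the built-in `TimeoutError` and `ConnectionError` stand in for whatever network exceptions the real Edge TTS client raises.

```python
import logging
from typing import Callable, Optional, Tuple, Type

logger = logging.getLogger("tts_fallback_sketch")

# Hypothetical: the exception types that should trigger the fallback.
# Built-in errors are used here as stand-ins for real client exceptions.
RETRYABLE_ERRORS: Tuple[Type[BaseException], ...] = (TimeoutError, ConnectionError)


def synthesize_with_fallback(
    text: str,
    primary: Callable[[str], bytes],
    fallback: Callable[[str], bytes],
    fallback_configured: bool,
    retryable: Tuple[Type[BaseException], ...] = RETRYABLE_ERRORS,
) -> Optional[bytes]:
    """Try the primary synthesizer; on a listed error, use the fallback.

    Returns None when the primary fails with a listed error and the
    fallback has no credentials. Any exception type that is not listed
    in `retryable` propagates to the caller unchanged.
    """
    try:
        return primary(text)
    except retryable as exc:
        logger.warning("primary TTS failed (%r); considering fallback", exc)

    # Credential check before touching the fallback service.
    if not fallback_configured:
        logger.error("fallback credentials are not configured; giving up")
        return None
    return fallback(text)
```

Note the design consequence: only the listed exception types reach the fallback. A failure reported through a different exception type, such as an HTTP 403 rejection, would bypass it unless that type is added to `retryable`, which matches the gap described in the review comment above.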
Video Processing Improvements:
- Changed os.rename to os.replace for safer file operations in combine_videos (app/services/video.py). [1] [2]
- [...] preprocess_video (app/services/video.py). [1] [2]
- [...] (app/services/video.py).
Voice Synthesis Enhancements:
- [...] azure_tts_v1 when Edge TTS encounters network errors, with improved logging and credential checks (app/services/voice.py). [1] [2]
- [...] _ensure_voice_directory to create necessary directories before saving voice files, used throughout voice synthesis functions (app/services/voice.py). [1] [2] [3]
Testing Improvements:
- [...] (test/services/test_voice.py).
Miscellaneous:
- [...] (test/services/test_video.py).
- [...] ClientConnectorError in voice service (app/services/voice.py).
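Two of the file-handling points in the summary, moving a finished file with os.replace and creating the output directory before writing, can be illustrated with the standard library alone. This is a minimal sketch, not the code in app/services/video.py or app/services/voice.py: `ensure_parent_directory` and `publish_file` are hypothetical helper names chosen for this example.

```python
import os
import tempfile


def ensure_parent_directory(path: str) -> None:
    """Create the directory that will contain `path` if it is missing."""
    parent = os.path.dirname(os.path.abspath(path))
    os.makedirs(parent, exist_ok=True)


def publish_file(temp_path: str, final_path: str) -> None:
    """Move a finished temporary file into place, overwriting any old copy.

    os.replace overwrites an existing destination on every platform,
    whereas os.rename raises FileExistsError on Windows when the
    destination already exists. Both paths should be on the same
    filesystem; a cross-device move raises OSError.
    """
    ensure_parent_directory(final_path)
    os.replace(temp_path, final_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        temp_path = os.path.join(workdir, "combined.tmp.mp4")
        final_path = os.path.join(workdir, "tasks", "demo", "combined-1.mp4")
        with open(temp_path, "wb") as handle:
            handle.write(b"placeholder bytes")
        publish_file(temp_path, final_path)
        print(os.path.exists(final_path))
```

Writing to a temporary name first and replacing at the end means a reader of the final path never sees a half-written file, and a rerun of the same task does not fail because a previous output is still present.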