feat: add MiniMax Cloud TTS as a new dubbing provider #534

Open
octo-patch wants to merge 1 commit into Huanshere:main from octo-patch:feature/add-minimax-tts-provider

Summary

Add MiniMax Cloud TTS as a new dubbing provider for VideoLingo, complementing the existing TTS options (Azure, OpenAI, Fish, Edge, etc.).

What's included

  • New TTS provider (core/tts_backend/minimax_tts.py): Supports speech-2.8-hd (recommended, HD quality) and speech-2.8-turbo (faster) models via the MiniMax T2A v2 API
  • 12 built-in voices: 5 English voices + 7 multilingual voices, selectable from the Streamlit sidebar
  • Streamlit UI integration: Voice selector with human-readable labels, model selector, and API key input in the Dubbing Settings panel
  • Full i18n support: Translation keys added for all 7 locales (en, zh-CN, zh-HK, ja, es, fr, ru)
  • Documentation: MiniMax TTS added to the TTS comparison tables in both English and Chinese docs
  • Config: minimax_tts section added to config.yaml with sensible defaults

How it works

MiniMax TTS uses a simple REST API (OpenAI-compatible auth pattern). The provider:

  1. Sends text to https://api.minimax.io/v1/t2a_v2
  2. Receives hex-encoded MP3 audio
  3. Converts to WAV using pydub (already a project dependency)
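The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code in core/tts_backend/minimax_tts.py: the request field names (`voice_setting`, `audio_setting`), the `data.audio` response path, the Bearer header, and every function name are assumptions made for the example. Only the endpoint URL, the model name, the hex-encoded MP3 response, and the use of pydub come from this PR.

```python
import json
import urllib.request

API_URL = "https://api.minimax.io/v1/t2a_v2"  # endpoint named in this PR


def build_payload(text, voice_id, model="speech-2.8-hd"):
    """Assemble the request body (field names are assumptions, not from the PR)."""
    return {
        "model": model,
        "text": text,
        "stream": False,
        "voice_setting": {"voice_id": voice_id},
        "audio_setting": {"format": "mp3"},
    }


def decode_hex_audio(response_json):
    """Turn the hex string into raw MP3 bytes (assumes it lives at data.audio)."""
    return bytes.fromhex(response_json["data"]["audio"])


def mp3_bytes_to_wav(mp3_bytes, wav_path):
    """Convert MP3 bytes to a WAV file with pydub (requires ffmpeg on PATH)."""
    import io
    from pydub import AudioSegment  # imported lazily so the helpers above stay stdlib-only

    AudioSegment.from_file(io.BytesIO(mp3_bytes), format="mp3").export(wav_path, format="wav")


def synthesize(text, voice_id, api_key, wav_path):
    """Step 1: send text. Step 2: receive hex MP3. Step 3: write WAV."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(text, voice_id)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed Bearer-token auth
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    mp3_bytes_to_wav(decode_hex_audio(body), wav_path)
```

Keeping payload construction and hex decoding in separate pure functions makes them easy to unit-test without network access or ffmpeg.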

Users only need a MiniMax API key from minimax.io. MiniMax is already listed as a recommended LLM provider in VideoLingo's docs.

Files changed (15 files, ~500 additions)

| File | Change |
| --- | --- |
| core/tts_backend/minimax_tts.py | New TTS provider |
| core/tts_backend/tts_main.py | Router entry for minimax_tts |
| core/st_utils/sidebar_setting.py | UI controls for voice/model selection |
| config.yaml | Default config section |
| docs/pages/docs/start.en-US.md | EN docs update |
| docs/pages/docs/start.zh-CN.md | ZH docs update |
| translations/*.json (7 files) | i18n keys |
| tests/test_minimax_tts.py | 8 unit + 3 integration tests |

Test plan

  • 8 unit tests pass (mocked API, voice validation, parameter checking, directory creation)
  • 3 integration tests pass with real MiniMax API (English text, Chinese text, turbo model)
  • Generated WAV files are valid audio with reasonable duration
  • Manual test: select minimax_tts in Streamlit sidebar, configure API key, generate dubbed audio
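For reviewers who want a feel for the mocked-API style of unit test, here is a rough sketch. It does not reproduce tests/test_minimax_tts.py: `fetch_audio_bytes` is a hypothetical stand-in for the provider's request step, and the response shape (`data.audio`) and Bearer header are assumptions. The point is only to show how the HTTP call can be replaced with `unittest.mock` so no real API key is needed.

```python
import io
import json
import urllib.request
from unittest import mock


def fetch_audio_bytes(text, api_key, url="https://api.minimax.io/v1/t2a_v2"):
    """Hypothetical stand-in for the provider's request step."""
    if not text.strip():
        raise ValueError("text must not be empty")
    request = urllib.request.Request(
        url,
        data=json.dumps({"text": text}).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        # Assumed response shape: {"data": {"audio": "<hex string>"}}
        return bytes.fromhex(json.load(response)["data"]["audio"])


def run_mocked_check():
    """Call fetch_audio_bytes with urlopen patched out; return what was observed."""
    fake_body = io.BytesIO(json.dumps({"data": {"audio": "494433"}}).encode("utf-8"))
    with mock.patch("urllib.request.urlopen") as fake_urlopen:
        # urlopen(...) is used as a context manager, so stub __enter__.
        fake_urlopen.return_value.__enter__.return_value = fake_body
        audio = fetch_audio_bytes("hello", api_key="test-key")
    sent_request = fake_urlopen.call_args.args[0]
    return audio, sent_request.get_header("Authorization")
```

The same pattern extends to parameter checks: inspect the request object captured by the mock and assert on its headers or body.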

Add MiniMax Cloud TTS (speech-2.8-hd / speech-2.8-turbo) as a new TTS
provider for video dubbing, with 12 built-in voices, Streamlit sidebar
integration, full i18n support (7 locales), and unit + integration tests.

- New provider: core/tts_backend/minimax_tts.py
- Config: minimax_tts section in config.yaml
- UI: voice/model selectors in sidebar settings
- Docs: TTS comparison tables updated (EN + ZH)
- Tests: 8 unit tests + 3 integration tests

Co-Authored-By: Octopus <liyuan851277048@icloud.com>