Spaces: Nymbo/Tools

Nymbo committed 5 days ago

Commit d1c6c5d (verified) · 1 Parent(s): d6038df

no more 30 second time limit on kokoro, can go for several minutes.

Files changed (1)

app.py CHANGED

@@ -583,18 +583,18 @@ def Generate_Speech(  # <-- MCP tool #4 (Generate Speech)
     Japanese, Portuguese, and Chinese speakers.
     Enhanced for longer audio generation:
         - Can generate audio of any length based on input text
         - Concatenates multiple segments for seamless longer audio
     Default behavior:
-        - Speed defaults to 1.25 (slightly brisk cadence).
-        - Voice defaults to "af_heart".
     Args:
         text: The text to synthesize. Works best with English but supports multiple languages.
         speed: Speech speed multiplier in 0.5–2.0; 1.0 = normal speed. Default: 1.25 (slightly brisk).
-        voice: Voice identifier from 54 available options. Use List_Kokoro_Voices() to see all choices.
-               Examples: 'af_heart' (US female), 'am_adam' (US male), 'bf_bella' (British female),
     Returns:
         A tuple of (sample_rate_hz, audio_waveform) where:
@@ -605,6 +605,7 @@ def Generate_Speech(  # <-- MCP tool #4 (Generate Speech)
         - Requires the 'kokoro' package (>=0.9.4). If unavailable, an error is raised.
         - Runs on CUDA if available; otherwise CPU.
         - Supports 54 voices across 9 language/accent categories.
         - Use List_Kokoro_Voices() MCP tool to discover all available voice options.
     """
     if not text or not text.strip():

     Japanese, Portuguese, and Chinese speakers.
     Enhanced for longer audio generation:
+        - Processes ALL text segments (not just the first one)
         - Can generate audio of any length based on input text
         - Concatenates multiple segments for seamless longer audio
     Default behavior:
+        - Speed defaults to 1.25 (slightly brisk cadence) for clearer, snappier delivery.
+        - Voice defaults to "af_heart" (American Female, Heart voice)
     Args:
         text: The text to synthesize. Works best with English but supports multiple languages.
         speed: Speech speed multiplier in 0.5–2.0; 1.0 = normal speed. Default: 1.25 (slightly brisk).
+        voice: Voice identifier from 54 available options. Use List_Kokoro_Voices() to see all choices. Default: 'af_heart'.
     Returns:
         A tuple of (sample_rate_hz, audio_waveform) where:
         - Requires the 'kokoro' package (>=0.9.4). If unavailable, an error is raised.
         - Runs on CUDA if available; otherwise CPU.
         - Supports 54 voices across 9 language/accent categories.
+        - Can generate audio of any length - no 30 second limit!
         - Use List_Kokoro_Voices() MCP tool to discover all available voice options.
     """
     if not text or not text.strip():
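
The updated docstring says the function now processes all text segments and concatenates them, rather than stopping after the first one. The diff does not show that part of the body, so the following is only a minimal sketch of the idea, not the code from app.py. `fake_pipeline`, `synthesize_all`, and `SAMPLE_RATE_HZ` are hypothetical names, the pipeline is a stand-in that emits silence, and plain lists are used instead of audio arrays so the sketch runs without any dependencies.

```python
from itertools import chain
from typing import Iterator, List, Tuple

# Assumed sample rate for illustration only; not taken from the diff.
SAMPLE_RATE_HZ = 24_000


def fake_pipeline(text: str) -> Iterator[Tuple[str, List[float]]]:
    """Hypothetical stand-in for a TTS pipeline.

    Yields one (segment_text, samples) pair per sentence, the way a
    segmenting synthesizer yields one audio chunk per text segment.
    """
    for sentence in filter(None, (s.strip() for s in text.split("."))):
        # Pretend every character produces 10 samples of silence.
        yield sentence, [0.0] * (10 * len(sentence))


def synthesize_all(text: str) -> Tuple[int, List[float]]:
    """Consume every segment and return (sample_rate_hz, waveform)."""
    if not text or not text.strip():
        raise ValueError("text must not be empty")

    # Collect ALL segments; taking only the first one would truncate
    # long inputs to a single chunk of audio.
    segments = [samples for _, samples in fake_pipeline(text)]
    if not segments:
        raise RuntimeError("no audio was generated")

    # Join the per-segment waveforms end to end into one waveform.
    waveform = list(chain.from_iterable(segments))
    return SAMPLE_RATE_HZ, waveform
```

With real audio arrays the join step would typically be an array concatenation instead of `chain.from_iterable`; the control flow (iterate over every yielded segment, then join once) stays the same.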