Spaces: Edmon02/SpeechT5_hy


Edmon02 committed on Jun 18

Commit d2f6021

1 Parent(s): 9fb8195

Refactor: Simplify TTS application for HuggingFace Spaces with improved error handling and interface

Files changed (7)

DEPLOYMENT_FIX_SUMMARY.md +98 -0
app.py +120 -329
app_deploy.py +170 -0
app_optimized.py +6 -3
app_simple.py +210 -0
deploy.py +6 -2
requirements.txt +2 -2

DEPLOYMENT_FIX_SUMMARY.md ADDED

	@@ -0,0 +1,98 @@

+# HuggingFace Spaces Deployment Fix
+## Issues Identified and Fixed
+### 1. Gradio JSON Schema Error
+**Error**: `TypeError: argument of type 'bool' is not iterable`
+**Root Cause**: The error occurred in Gradio's JSON schema processing when trying to check `if "const" in schema:` where `schema` was a boolean instead of a dictionary.
+**Fixes Applied**:
+- Pinned Gradio to an older release, 4.20.0 (down from 4.44.1 in `requirements.txt`)
+- Simplified the interface using `gr.Interface` instead of complex `gr.Blocks`
+- Disabled example caching (`cache_examples=False`)
+- Disabled flagging (`allow_flagging="never"`)
+- Removed `share=True` parameter (not supported on HF Spaces)
+### 2. Import and Dependency Issues
+**Fixes Applied**:
+- Added robust fallback import system
+- Created dummy pipeline for testing when imports fail
+- Improved error handling throughout the application
+- Added proper sys.path management for src imports
+### 3. HuggingFace Spaces Compatibility
+**Fixes Applied**:
+- Set `share=False` (share links not supported on HF Spaces)
+- Used standard server configuration (`0.0.0.0:7860`)
+- Simplified interface structure
+- Added proper error boundaries
+## Files Modified
+1. **`app.py`** - Main deployment file with robust error handling
+2. **`app_deploy.py`** - Clean deployment version
+3. **`app_simple.py`** - Simplified alternative
+4. **`requirements.txt`** - Updated Gradio version
+5. **`deploy.py`** - Enhanced deployment script
+## Deployment Steps
+1. **Test Locally** (optional):
+   ```bash
+   python app.py
+   ```
+2. **Deploy to HuggingFace Spaces**:
+   ```bash
+   git add .
+   git commit -m "Fix Gradio schema errors and improve compatibility"
+   git push
+   ```
+## Key Changes Made
+### App Structure
+- Switched from `gr.Blocks` to `gr.Interface` for better compatibility
+- Simplified input/output definitions
+- Removed complex state management
+### Error Handling
+- Added `try`/`except` blocks around pipeline initialization and synthesis
+- Created fallback pipeline for testing
+- Improved logging throughout
+### Dependencies
+- Pinned Gradio to stable version
+- Maintained all core ML dependencies
+- Added proper import fallbacks
+### Configuration
+- Disabled problematic features (share, caching, flagging)
+- Set proper server configuration for HF Spaces
+- Simplified launch parameters
+## Testing the Fix
+The fixed version should:
+1. ✅ Load without JSON schema errors
+2. ✅ Handle import failures gracefully
+3. ✅ Work on HuggingFace Spaces infrastructure
+4. ✅ Provide fallback functionality when models fail to load
+5. ✅ Display proper error messages to users
+## Backup Files
+- `app_original.py` - Your original application
+- `app_optimized.py` - The optimized version (fixed)
+- `app_simple.py` - Simplified version
+- `app_deploy.py` - Final deployment version
+## If Issues Persist
+1. Check HuggingFace Spaces logs for specific errors
+2. Verify all dependencies are properly installed
+3. Test with the simple version (`app_simple.py`)
+4. Contact HF support if infrastructure issues persist
+The main fix addresses the Gradio JSON schema error by simplifying the interface structure and using compatible Gradio features.
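
Reviewer note on the root cause described in section 1 of the summary: JSON Schema allows a bare boolean wherever a schema object may appear, so a membership test such as `"const" in schema` raises `TypeError` when `schema` is `True` or `False`. The sketch below reproduces that failure mode in isolation and shows one possible guard. It is not Gradio's code; `schema_has_const` and `unguarded_check_fails` are hypothetical helpers named for illustration only.

```python
def schema_has_const(schema):
    """Return True if a JSON Schema node declares a `const` keyword.

    Hypothetical helper for illustration; not part of Gradio.
    """
    # A bare boolean is a valid schema (True accepts everything, False nothing)
    # and cannot carry keywords, so answer without a membership test.
    if isinstance(schema, bool):
        return False
    return "const" in schema


def unguarded_check_fails(schema):
    """Return True if the unguarded membership test raises TypeError."""
    try:
        "const" in schema
    except TypeError:
        return True
    return False
```

With this guard, a boolean schema is treated as "no `const` present" instead of crashing the schema walk. Whether that is the right semantics for a given caller is a design choice this sketch does not settle.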

app.py CHANGED

@@ -1,379 +1,170 @@
 """
-Optimized SpeechT5 Armenian TTS Application
-==========================================
-High-performance Gradio application with advanced optimization features.
 """
 import gradio as gr
 import numpy as np
 import logging
 import time
-from typing import Tuple, Optional
 import os
 import sys
-# Add src to path for imports
-current_dir = os.path.dirname(os.path.abspath(__file__))
-src_path = os.path.join(current_dir, 'src')
-if src_path not in sys.path:
-    sys.path.insert(0, src_path)
-try:
-    from src.pipeline import TTSPipeline
-except ImportError as e:
-    logging.error(f"Failed to import pipeline: {e}")
-    # Fallback import attempt
-    sys.path.append(os.path.join(os.path.dirname(__file__), 'src'))
-    from src.pipeline import TTSPipeline
-# Configure logging
 logging.basicConfig(
     level=logging.INFO,
-    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
 )
 logger = logging.getLogger(__name__)
-# Global pipeline instance
-tts_pipeline: Optional[TTSPipeline] = None
-def initialize_pipeline():
-    """Initialize the TTS pipeline with error handling."""
-    global tts_pipeline
     try:
         logger.info("Initializing TTS Pipeline...")
-        tts_pipeline = TTSPipeline(
             model_checkpoint="Edmon02/TTS_NB_2",
-            max_chunk_length=200,  # Optimal for 5-20s clips
             crossfade_duration=0.1,
             use_mixed_precision=True
         )
-        # Apply production optimizations
-        tts_pipeline.optimize_for_production()
-        logger.info("TTS Pipeline initialized successfully")
         return True
     except Exception as e:
-        logger.error(f"Failed to initialize TTS pipeline: {e}")
         return False
-def predict(text: str, speaker: str,
-           enable_chunking: bool = True,
-           apply_processing: bool = True) -> Tuple[int, np.ndarray]:
     """
-    Main prediction function with optimization and error handling.
     Args:
-        text: Input text to synthesize
-        speaker: Speaker selection
-        enable_chunking: Whether to enable intelligent chunking
-        apply_processing: Whether to apply audio post-processing
     Returns:
-        Tuple of (sample_rate, audio_array)
     """
-    global tts_pipeline
-    start_time = time.time()
     try:
-        # Validate inputs
-        if not text or not text.strip():
-            logger.warning("Empty text provided")
-            return 16000, np.zeros(0, dtype=np.int16)
-        if tts_pipeline is None:
-            logger.error("TTS pipeline not initialized")
-            return 16000, np.zeros(0, dtype=np.int16)
-        # Extract speaker code from selection
-        speaker_code = speaker.split("(")[0].strip()
-        # Log request
-        logger.info(f"Processing request: {len(text)} chars, speaker: {speaker_code}")
-        # Synthesize speech
-        sample_rate, audio = tts_pipeline.synthesize(
             text=text,
-            speaker=speaker_code,
-            enable_chunking=enable_chunking,
-            apply_audio_processing=apply_processing
         )
-        # Log performance
-        total_time = time.time() - start_time
-        audio_duration = len(audio) / sample_rate if len(audio) > 0 else 0
-        rtf = total_time / audio_duration if audio_duration > 0 else float('inf')
-        logger.info(f"Request completed in {total_time:.3f}s (RTF: {rtf:.2f})")
         return sample_rate, audio
     except Exception as e:
-        logger.error(f"Prediction failed: {e}")
-        return 16000, np.zeros(0, dtype=np.int16)
-def get_performance_info() -> str:
-    """Get performance statistics as formatted string."""
-    global tts_pipeline
-    if tts_pipeline is None:
-        return "Pipeline not initialized"
-    try:
-        stats = tts_pipeline.get_performance_stats()
-        info = f"""
-**Performance Statistics:**
-- Total Inferences: {stats['pipeline_stats']['total_inferences']}
-- Average Processing Time: {stats['pipeline_stats']['avg_processing_time']:.3f}s
-- Translation Cache Size: {stats['text_processor_stats']['translation_cache_size']}
-- Model Inferences: {stats['model_stats']['total_inferences']}
-- Average Model Time: {stats['model_stats'].get('avg_inference_time', 0):.3f}s
-        """
-        return info.strip()
-    except Exception as e:
-        return f"Error getting performance info: {e}"
-def health_check() -> str:
-    """Perform system health check."""
-    global tts_pipeline
-    if tts_pipeline is None:
-        return "❌ Pipeline not initialized"
-    try:
-        health = tts_pipeline.health_check()
-        if health["status"] == "healthy":
-            return "✅ All systems operational"
-        elif health["status"] == "degraded":
-            return "⚠️ Some components have issues"
-        else:
-            return f"❌ System error: {health.get('error', 'Unknown error')}"
-    except Exception as e:
-        return f"❌ Health check failed: {e}"
-# Application metadata
-TITLE = "🎤 SpeechT5 Armenian TTS - Optimized"
-DESCRIPTION = """
-# High-Performance Armenian Text-to-Speech
-This is an **optimized version** of SpeechT5 for Armenian language synthesis, featuring:
-### 🚀 **Performance Optimizations**
-- **Intelligent Text Chunking**: Handles long texts by splitting them intelligently at sentence boundaries
-- **Caching**: Translation and embedding caching for faster repeated requests
-- **Mixed Precision**: GPU optimization with FP16 inference when available
-- **Crossfading**: Smooth audio transitions between chunks for natural-sounding longer texts
-### 🎯 **Advanced Features**
-- **Smart Text Processing**: Automatic number-to-word conversion with Armenian translation
-- **Audio Post-Processing**: Noise gating, normalization, and dynamic range optimization
-- **Robust Error Handling**: Graceful fallbacks and comprehensive logging
-- **Real-time Performance Monitoring**: Track processing times and system health
-### 📝 **Usage Tips**
-- **Short texts** (< 200 chars): Processed directly for maximum speed
-- **Long texts**: Automatically chunked with overlap for seamless audio
-- **Numbers**: Automatically converted to Armenian words
-- **Performance**: Enable chunking for texts longer than a few sentences
-### 🎵 **Audio Quality**
-- Sample Rate: 16 kHz
-- Optimized for natural prosody and clear pronunciation
-- Cross-fade transitions for multi-chunk synthesis
-The model was trained on short clips (5-20s) but uses advanced algorithms to handle longer texts effectively.
-"""
-EXAMPLES = [
-    # Short examples for quick testing
-    ["Բարև ձեզ, ինչպե՞ս եք:", "BDL (male)", True, True],
-    ["Այսօր գեղեցիկ օր է:", "BDL (male)", False, True],
-    # Medium examples demonstrating chunking
-    ["Հայաստանն ունի հարուստ պատմություն և մշակույթ: Երևանը մայրաքաղաքն է, որն ունի 2800 տարվա պատմություն:", "BDL (male)", True, True],
-    # Long example with numbers
-    ["Արարատ լեռը բարձրությունը 5165 մետր է: Այն Հայաստանի խորհրդանիշն է և գտնվում է Թուրքիայի տարածքում: Լեռան վրա ըստ Աստվածաշնչի՝ կանգնել է Նոյի տապանը 40 օրվա ջրհեղեղից հետո:", "BDL (male)", True, True],
-    # Technical example
-    ["Մեքենայի շարժիչը 150 ձիուժ է և 2.0 լիտր ծավալ ունի: Այն կարող է արագացնել 0-ից 100 կմ/ժ 8.5 վայրկյանում:", "BDL (male)", True, True],
-]
-# Custom CSS for better styling
-CUSTOM_CSS = """
-.gradio-container {
-    max-width: 1200px !important;
-    margin: auto !important;
-}
-.performance-info {
-    background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
-    padding: 15px;
-    border-radius: 10px;
-    color: white;
-    margin: 10px 0;
-}
-.health-status {
-    padding: 10px;
-    border-radius: 8px;
-    margin: 10px 0;
-    font-weight: bold;
-}
-.status-healthy { background-color: #d4edda; color: #155724; }
-.status-warning { background-color: #fff3cd; color: #856404; }
-.status-error { background-color: #f8d7da; color: #721c24; }
-"""
-def create_interface():
-    """Create and configure the Gradio interface."""
-    with gr.Blocks(
-        theme=gr.themes.Soft(),
-        css=CUSTOM_CSS,
-        title="SpeechT5 Armenian TTS"
-    ) as interface:
-        # Header
-        gr.Markdown(f"# {TITLE}")
-        gr.Markdown(DESCRIPTION)
-        with gr.Row():
-            with gr.Column(scale=2):
-                # Main input controls
-                text_input = gr.Textbox(
-                    label="📝 Input Text (Armenian)",
-                    placeholder="Մուտքագրեք ձեր տեքստը այստեղ...",
-                    lines=3,
-                    max_lines=10
-                )
-                with gr.Row():
-                    speaker_input = gr.Radio(
-                        label="🎭 Speaker",
-                        choices=["BDL (male)"],
-                        value="BDL (male)"
-                    )
-                with gr.Row():
-                    chunking_checkbox = gr.Checkbox(
-                        label="🧩 Enable Intelligent Chunking",
-                        value=True,
-                        info="Automatically split long texts for better quality"
-                    )
-                    processing_checkbox = gr.Checkbox(
-                        label="🎚️ Apply Audio Processing",
-                        value=True,
-                        info="Apply noise gating, normalization, and crossfading"
-                    )
-                # Generate button
-                generate_btn = gr.Button(
-                    "🎤 Generate Speech",
-                    variant="primary",
-                    size="lg"
-                )
-            with gr.Column(scale=1):
-                # System information panel
-                gr.Markdown("### 📊 System Status")
-                health_display = gr.Textbox(
-                    label="Health Status",
-                    value="Initializing...",
-                    interactive=False,
-                    max_lines=1
-                )
-                performance_display = gr.Textbox(
-                    label="Performance Stats",
-                    value="No data yet",
-                    interactive=False,
-                    max_lines=8
-                )
-                refresh_btn = gr.Button("🔄 Refresh Stats", size="sm")
-        # Output
-        audio_output = gr.Audio(
-            label="🔊 Generated Speech",
-            type="numpy",
-            interactive=False
-        )
-        # Examples section
-        gr.Markdown("### 💡 Example Texts")
-        gr.Examples(
-            examples=EXAMPLES,
-            inputs=[text_input, speaker_input, chunking_checkbox, processing_checkbox],
-            outputs=[audio_output],
-            fn=predict,
-            label="Click any example to try it:"
-        )
-        # Event handlers
-        generate_btn.click(
-            fn=predict,
-            inputs=[text_input, speaker_input, chunking_checkbox, processing_checkbox],
-            outputs=[audio_output],
-            show_progress="full"
-        )
-        refresh_btn.click(
-            fn=lambda: (health_check(), get_performance_info()),
-            outputs=[health_display, performance_display],
-            show_progress="minimal"
-        )
-        # Auto-refresh health status on load
-        interface.load(
-            fn=lambda: (health_check(), get_performance_info()),
-            outputs=[health_display, performance_display]
-        )
-    return interface
-def main():
-    """Main application entry point."""
-    logger.info("Starting SpeechT5 Armenian TTS Application")
-    # Initialize pipeline
-    if not initialize_pipeline():
-        logger.error("Failed to initialize TTS pipeline - exiting")
-        sys.exit(1)
-    # Create and launch interface
-    interface = create_interface()
-    # Launch with optimized settings
-    interface.launch(
-        share=True,
-        inbrowser=False,
-        show_error=True,
-        quiet=False,
-        server_name="0.0.0.0",  # Allow external connections
-        server_port=7860,       # Standard Gradio port
-        max_threads=4,          # Limit concurrent requests
-    )
 if __name__ == "__main__":
-    main()

 """
+SpeechT5 Armenian TTS - Production Deployment
+============================================
+Production-ready version for HuggingFace Spaces with robust error handling.
 """
 import gradio as gr
 import numpy as np
 import logging
 import time
 import os
 import sys
+from typing import Tuple, Optional, Union
+# Setup logging first
 logging.basicConfig(
     level=logging.INFO,
+    format='%(asctime)s - %(levelname)s - %(message)s'
 )
 logger = logging.getLogger(__name__)
+# Global pipeline variable
+pipeline = None
+def safe_import():
+    """Safely import the TTS pipeline with fallbacks."""
+    global pipeline
     try:
+        # Add src to path
+        current_dir = os.path.dirname(os.path.abspath(__file__))
+        src_path = os.path.join(current_dir, 'src')
+        if src_path not in sys.path:
+            sys.path.insert(0, src_path)
+        # Import pipeline
+        from src.pipeline import TTSPipeline
         logger.info("Initializing TTS Pipeline...")
+        pipeline = TTSPipeline(
             model_checkpoint="Edmon02/TTS_NB_2",
+            max_chunk_length=200,
             crossfade_duration=0.1,
             use_mixed_precision=True
         )
+        # Optimize for production
+        pipeline.optimize_for_production()
+        logger.info("TTS Pipeline ready")
         return True
     except Exception as e:
+        logger.error(f"Failed to initialize pipeline: {e}")
+        logger.info("Creating fallback pipeline for testing")
+        # Create a simple fallback
+        class FallbackPipeline:
+            def synthesize(self, text, **kwargs):
+                # Generate simple tone as placeholder
+                duration = min(len(text) * 0.08, 3.0)
+                sample_rate = 16000
+                samples = int(duration * sample_rate)
+                t = np.linspace(0, duration, samples)
+                # Create a simple beep
+                audio = np.sin(2 * np.pi * 440 * t) * 0.3
+                return sample_rate, (audio * 32767).astype(np.int16)
+        pipeline = FallbackPipeline()
         return False
+def generate_audio(text: str) -> Tuple[int, np.ndarray]:
     """
+    Generate audio from Armenian text.
     Args:
+        text: Armenian text to synthesize
     Returns:
+        Tuple of (sample_rate, audio_data)
     """
+    if not text or not text.strip():
+        logger.warning("Empty text provided")
+        # Return silence
+        return 16000, np.zeros(1000, dtype=np.int16)
+    if pipeline is None:
+        logger.error("Pipeline not available")
+        return 16000, np.zeros(1000, dtype=np.int16)
     try:
+        logger.info(f"Processing: {text[:50]}...")
+        start_time = time.time()
+        # Synthesize with basic parameters
+        sample_rate, audio = pipeline.synthesize(
             text=text,
+            speaker="BDL",
+            enable_chunking=True,
+            apply_audio_processing=True
         )
+        duration = time.time() - start_time
+        logger.info(f"Generated {len(audio)} samples in {duration:.2f}s")
         return sample_rate, audio
     except Exception as e:
+        logger.error(f"Synthesis error: {e}")
+        # Return silence on error
+        return 16000, np.zeros(1000, dtype=np.int16)
+# Initialize the pipeline
+logger.info("Starting TTS application...")
+initialization_success = safe_import()
+if initialization_success:
+    status_message = "✅ TTS System Ready"
+else:
+    status_message = "⚠️ Running in Test Mode (Limited Functionality)"
+# Create the Gradio interface using the simpler gr.Interface
+demo = gr.Interface(
+    fn=generate_audio,
+    inputs=gr.Textbox(
+        label="Armenian Text",
+        placeholder="Գրեք ձեր տեքստը այստեղ...",
+        lines=3,
+        max_lines=8
+    ),
+    outputs=gr.Audio(
+        label="Generated Speech",
+        type="numpy"
+    ),
+    title="🎤 Armenian Text-to-Speech",
+    description=f"""
+    {status_message}
+    Convert Armenian text to speech using SpeechT5.
+    **How to use:**
+    1. Enter Armenian text in the box below
+    2. Click Submit to generate speech
+    3. Play the generated audio
+    **Tips:**
+    - Use standard Armenian script
+    - Shorter sentences work better
+    - Include punctuation for natural pauses
+    """,
+    examples=[
+        "Բարև ձեզ:",
+        "Ինչպե՞ս եք:",
+        "Շնորհակալություն:",
+        "Կեցցե՛ Հայաստանը:",
+        "Այսօր լավ օր է:"
+    ],
+    theme=gr.themes.Default(),
+    allow_flagging="never",
+    cache_examples=False  # matches the caching fix listed in DEPLOYMENT_FIX_SUMMARY.md
+)
+# Launch the app
 if __name__ == "__main__":
+    demo.launch(
+        server_name="0.0.0.0",
+        server_port=7860,
+        share=False
+    )
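
Reviewer note on `FallbackPipeline.synthesize` in the new `app.py`: the placeholder clip lasts 0.08 s per input character, capped at 3 s, at 16 kHz. The sketch below restates that arithmetic in pure Python so it can be checked without NumPy. It is an illustration, not the committed code: `placeholder_tone` is a hypothetical name, and it samples at `i / SAMPLE_RATE` rather than using `np.linspace` as the diff does.

```python
import math

SAMPLE_RATE = 16000  # Hz, as in the diff


def placeholder_tone(text, seconds_per_char=0.08, max_seconds=3.0,
                     frequency=440.0, amplitude=0.3):
    """Return 16-bit sample values for a sine tone sized to the input text.

    Hypothetical stand-in for FallbackPipeline.synthesize; returns a plain
    list of ints instead of a NumPy int16 array.
    """
    # Duration grows with text length but never exceeds max_seconds
    duration = min(len(text) * seconds_per_char, max_seconds)
    n_samples = int(duration * SAMPLE_RATE)
    scale = amplitude * 32767  # map [-1, 1] to the int16 range, attenuated
    return [
        int(scale * math.sin(2 * math.pi * frequency * i / SAMPLE_RATE))
        for i in range(n_samples)
    ]
```

Any input of 38 characters or more hits the 3 s cap, so every long text yields the same 48000-sample beep.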

app_deploy.py ADDED

	@@ -0,0 +1,170 @@

+"""
+SpeechT5 Armenian TTS - Production Deployment
+============================================
+Production-ready version for HuggingFace Spaces with robust error handling.
+"""
+import gradio as gr
+import numpy as np
+import logging
+import time
+import os
+import sys
+from typing import Tuple, Optional, Union
+# Setup logging first
+logging.basicConfig(
+    level=logging.INFO,
+    format='%(asctime)s - %(levelname)s - %(message)s'
+)
+logger = logging.getLogger(__name__)
+# Global pipeline variable
+pipeline = None
+def safe_import():
+    """Safely import the TTS pipeline with fallbacks."""
+    global pipeline
+    try:
+        # Add src to path
+        current_dir = os.path.dirname(os.path.abspath(__file__))
+        src_path = os.path.join(current_dir, 'src')
+        if src_path not in sys.path:
+            sys.path.insert(0, src_path)
+        # Import pipeline
+        from src.pipeline import TTSPipeline
+        logger.info("Initializing TTS Pipeline...")
+        pipeline = TTSPipeline(
+            model_checkpoint="Edmon02/TTS_NB_2",
+            max_chunk_length=200,
+            crossfade_duration=0.1,
+            use_mixed_precision=True
+        )
+        # Optimize for production
+        pipeline.optimize_for_production()
+        logger.info("TTS Pipeline ready")
+        return True
+    except Exception as e:
+        logger.error(f"Failed to initialize pipeline: {e}")
+        logger.info("Creating fallback pipeline for testing")
+        # Create a simple fallback
+        class FallbackPipeline:
+            def synthesize(self, text, **kwargs):
+                # Generate simple tone as placeholder
+                duration = min(len(text) * 0.08, 3.0)
+                sample_rate = 16000
+                samples = int(duration * sample_rate)
+                t = np.linspace(0, duration, samples)
+                # Create a simple beep
+                audio = np.sin(2 * np.pi * 440 * t) * 0.3
+                return sample_rate, (audio * 32767).astype(np.int16)
+        pipeline = FallbackPipeline()
+        return False
+def generate_audio(text: str) -> Tuple[int, np.ndarray]:
+    """
+    Generate audio from Armenian text.
+    Args:
+        text: Armenian text to synthesize
+    Returns:
+        Tuple of (sample_rate, audio_data)
+    """
+    if not text or not text.strip():
+        logger.warning("Empty text provided")
+        # Return silence
+        return 16000, np.zeros(1000, dtype=np.int16)
+    if pipeline is None:
+        logger.error("Pipeline not available")
+        return 16000, np.zeros(1000, dtype=np.int16)
+    try:
+        logger.info(f"Processing: {text[:50]}...")
+        start_time = time.time()
+        # Synthesize with basic parameters
+        sample_rate, audio = pipeline.synthesize(
+            text=text,
+            speaker="BDL",
+            enable_chunking=True,
+            apply_audio_processing=True
+        )
+        duration = time.time() - start_time
+        logger.info(f"Generated {len(audio)} samples in {duration:.2f}s")
+        return sample_rate, audio
+    except Exception as e:
+        logger.error(f"Synthesis error: {e}")
+        # Return silence on error
+        return 16000, np.zeros(1000, dtype=np.int16)
+# Initialize the pipeline
+logger.info("Starting TTS application...")
+initialization_success = safe_import()
+if initialization_success:
+    status_message = "✅ TTS System Ready"
+else:
+    status_message = "⚠️ Running in Test Mode (Limited Functionality)"
+# Create the Gradio interface using the simpler gr.Interface
+demo = gr.Interface(
+    fn=generate_audio,
+    inputs=gr.Textbox(
+        label="Armenian Text",
+        placeholder="Գրեք ձեր տեքստը այստեղ...",
+        lines=3,
+        max_lines=8
+    ),
+    outputs=gr.Audio(
+        label="Generated Speech",
+        type="numpy"
+    ),
+    title="🎤 Armenian Text-to-Speech",
+    description=f"""
+    {status_message}
+    Convert Armenian text to speech using SpeechT5.
+    **How to use:**
+    1. Enter Armenian text in the box below
+    2. Click Submit to generate speech
+    3. Play the generated audio
+    **Tips:**
+    - Use standard Armenian script
+    - Shorter sentences work better
+    - Include punctuation for natural pauses
+    """,
+    examples=[
+        "Բարև ձեզ:",
+        "Ինչպե՞ս եք:",
+        "Շնորհակալություն:",
+        "Կեցցե՛ Հայաստանը:",
+        "Այսօր լավ օր է:"
+    ],
+    theme=gr.themes.Default(),
+    allow_flagging="never",
+    cache_examples=False  # matches the caching fix listed in DEPLOYMENT_FIX_SUMMARY.md
+)
+# Launch the app
+if __name__ == "__main__":
+    demo.launch(
+        server_name="0.0.0.0",
+        server_port=7860,
+        share=False
+    )
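
Reviewer note on `generate_audio`: the function returns 1000 samples of silence for empty input, for a missing pipeline, and for any exception raised during synthesis. A minimal sketch of that pattern follows, assuming a callable that returns `(sample_rate, samples)`. `synthesize_or_silence` is a hypothetical name, and plain lists stand in for NumPy arrays.

```python
import logging

logger = logging.getLogger(__name__)

SAMPLE_RATE = 16000
SILENCE_SAMPLES = 1000


def synthesize_or_silence(synthesize, text):
    """Call `synthesize(text)`; return silence on blank input or any error.

    `synthesize` stands in for `pipeline.synthesize` and is an assumption
    of this sketch.
    """
    silence = (SAMPLE_RATE, [0] * SILENCE_SAMPLES)
    if not text or not text.strip():
        logger.warning("Empty text provided")
        return silence
    try:
        return synthesize(text)
    except Exception as exc:  # broad catch, mirroring the diff
        logger.error("Synthesis error: %s", exc)
        return silence
```

One consequence of this pattern is that a failed request looks the same as a silent clip in the UI; the error only reaches the log.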

app_optimized.py CHANGED

@@ -320,11 +320,14 @@ def create_interface():
         # Examples section
         gr.Markdown("### 💡 Example Texts")
-        gr.Examples(
             examples=EXAMPLES,
             inputs=[text_input, speaker_input, chunking_checkbox, processing_checkbox],
-            outputs=[audio_output],
             fn=predict,
             label="Click any example to try it:"
         )
@@ -365,7 +368,7 @@ def main():
     # Launch with optimized settings
     interface.launch(
-        share=True,
         inbrowser=False,
         show_error=True,
         quiet=False,

         # Examples section
         gr.Markdown("### 💡 Example Texts")
+        # Use simpler Examples component to avoid schema issues
+        examples = gr.Examples(
             examples=EXAMPLES,
             inputs=[text_input, speaker_input, chunking_checkbox, processing_checkbox],
+            outputs=audio_output,
             fn=predict,
+            cache_examples=False,  # Disable caching to avoid schema issues
             label="Click any example to try it:"
         )
     # Launch with optimized settings
     interface.launch(
+        share=False,  # Disable share for HF Spaces
         inbrowser=False,
         show_error=True,
         quiet=False,
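
Reviewer note on the `share=False` change: the commit hard-codes the value. An alternative would be to derive the launch arguments from the environment, so the same file behaves sensibly both locally and on Spaces. The sketch below assumes the Spaces runtime sets a `SPACE_ID` environment variable, and it invents a `TTS_SHARE` opt-in flag. Both are assumptions of this example and are not taken from the commit.

```python
import os


def build_launch_kwargs(env=None):
    """Build keyword arguments for `interface.launch()`.

    Assumptions (not from the commit): `SPACE_ID` marks a Spaces runtime,
    and `TTS_SHARE=1` is a hypothetical opt-in for share links locally.
    """
    env = os.environ if env is None else env
    on_spaces = bool(env.get("SPACE_ID"))
    return {
        "server_name": "0.0.0.0",
        "server_port": 7860,
        "show_error": True,
        # Never request a share link on Spaces; allow it locally on opt-in
        "share": (not on_spaces) and env.get("TTS_SHARE") == "1",
    }
```

It would be used as `interface.launch(**build_launch_kwargs())`.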

app_simple.py ADDED

	@@ -0,0 +1,210 @@

+"""
+SpeechT5 Armenian TTS - HuggingFace Spaces Deployment Version
+============================================================
+Simplified and optimized for HuggingFace Spaces deployment.
+"""
+import gradio as gr
+import numpy as np
+import logging
+import time
+from typing import Tuple, Optional
+import os
+import sys
+# Add src to path for imports
+current_dir = os.path.dirname(os.path.abspath(__file__))
+src_path = os.path.join(current_dir, 'src')
+if src_path not in sys.path:
+    sys.path.insert(0, src_path)
+try:
+    from src.pipeline import TTSPipeline
+    HAS_PIPELINE = True
+except ImportError as e:
+    logging.error(f"Failed to import pipeline: {e}")
+    # Fallback import attempt
+    sys.path.append(os.path.join(os.path.dirname(__file__), 'src'))
+    try:
+        from src.pipeline import TTSPipeline
+        HAS_PIPELINE = True
+    except ImportError:
+        HAS_PIPELINE = False
+        # Create a dummy pipeline for testing
+        class TTSPipeline:
+            def __init__(self, *args, **kwargs):
+                pass
+            def synthesize(self, text, **kwargs):
+                # Return dummy audio for testing
+                duration = min(len(text) * 0.1, 5.0)  # Approximate duration
+                sample_rate = 16000
+                samples = int(duration * sample_rate)
+                # Generate a simple sine wave as placeholder
+                t = np.linspace(0, duration, samples)
+                frequency = 440  # A4 note
+                audio = (np.sin(2 * np.pi * frequency * t) * 0.3).astype(np.float32)
+                return sample_rate, (audio * 32767).astype(np.int16)
+            def optimize_for_production(self):
+                pass
+# Configure logging
+logging.basicConfig(
+    level=logging.INFO,
+    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
+)
+logger = logging.getLogger(__name__)
+# Global pipeline instance
+tts_pipeline: Optional[TTSPipeline] = None
+def initialize_pipeline():
+    """Initialize the TTS pipeline with error handling."""
+    global tts_pipeline
+    if not HAS_PIPELINE:
+        logger.warning("Pipeline not available - using dummy implementation")
+        tts_pipeline = TTSPipeline()
+        return True
+    try:
+        logger.info("Initializing TTS Pipeline...")
+        tts_pipeline = TTSPipeline(
+            model_checkpoint="Edmon02/TTS_NB_2",
+            max_chunk_length=200,
+            crossfade_duration=0.1,
+            use_mixed_precision=True
+        )
+        # Apply production optimizations
+        tts_pipeline.optimize_for_production()
+        logger.info("TTS Pipeline initialized successfully")
+        return True
+    except Exception as e:
+        logger.error(f"Failed to initialize TTS pipeline: {e}")
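+        # Leave the pipeline unset: when the real class was imported, calling
+        # TTSPipeline() again here could raise a second time.
+        # generate_speech() returns silence while tts_pipeline is None.
+        tts_pipeline = None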
+        return False
+def generate_speech(text: str) -> Tuple[int, np.ndarray]:
+    """
+    Main synthesis function optimized for HF Spaces.
+    Args:
+        text: Input text to synthesize
+    Returns:
+        Tuple of (sample_rate, audio_array)
+    """
+    global tts_pipeline
+    start_time = time.time()
+    try:
+        # Validate inputs
+        if not text or not text.strip():
+            logger.warning("Empty text provided")
+            return 16000, np.zeros(1000, dtype=np.int16)
+        if tts_pipeline is None:
+            logger.error("TTS pipeline not initialized")
+            return 16000, np.zeros(1000, dtype=np.int16)
+        # Log request
+        logger.info(f"Processing request: {len(text)} characters")
+        # Synthesize speech with default settings
+        sample_rate, audio = tts_pipeline.synthesize(
+            text=text,
+            speaker="BDL",
+            enable_chunking=True,
+            apply_audio_processing=True
+        )
+        # Log performance
+        total_time = time.time() - start_time
+        logger.info(f"Request completed in {total_time:.3f}s")
+        return sample_rate, audio
+    except Exception as e:
+        logger.error(f"Synthesis failed: {e}")
+        return 16000, np.zeros(1000, dtype=np.int16)
+# Create the Gradio interface
+def create_app():
+    """Create the main Gradio application."""
+    # Simple interface definition
+    interface = gr.Interface(
+        fn=generate_speech,
+        inputs=[
+            gr.Textbox(
+                label="Armenian Text",
+                placeholder="Մուտքագրեք ձեր տեքստը այստեղ...",
+                lines=3,
+                max_lines=10
+            )
+        ],
+        outputs=[
+            gr.Audio(
+                label="Generated Speech",
+                type="numpy"
+            )
+        ],
+        title="🎤 SpeechT5 Armenian Text-to-Speech",
+        description="""
+        Convert Armenian text to natural speech using SpeechT5.
+        **Instructions:**
+        1. Enter Armenian text in the input box
+        2. Click Submit to generate speech
+        3. Listen to the generated audio
+        **Tips:**
+        - Works best with standard Armenian orthography
+        - Shorter sentences produce better quality
+        - Include proper punctuation for natural pauses
+        """,
+        examples=[
+            ["Բարև ձեզ, ինչպե՞ս եք:"],
+            ["Այսօր գեղեցիկ օր է:"],
+            ["Հայաստանն ունի հարուստ պատմություն:"],
+            ["Երևանը Հայաստանի մայրաքաղաքն է:"],
+            ["Արարատ լեռը Հայաստանի խորհրդանիշն է:"]
+        ],
+        theme=gr.themes.Soft(),
+        allow_flagging="never",  # Disable flagging to avoid schema issues
+        cache_examples=False     # Disable example caching
+    )
+    return interface
+def main():
+    """Main application entry point."""
+    logger.info("Starting SpeechT5 Armenian TTS Application")
+    # Initialize pipeline
+    if not initialize_pipeline():
+        logger.error("Failed to initialize TTS pipeline - continuing with limited functionality")
+    # Create and launch interface
+    app = create_app()
+    # Launch with HF Spaces settings
+    app.launch(
+        share=False,      # Don't create share link on HF Spaces
+        server_name="0.0.0.0",
+        server_port=7860,
+        show_error=True
+    )
+if __name__ == "__main__":
+    main()
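
Reviewer note on the import fallback at the top of `app_simple.py`: the nested `try`/`except ImportError` blocks choose between the real `TTSPipeline` and a dummy class. The same decision can be sketched as a single helper built on `importlib.import_module`. `load_class_or_fallback` is a hypothetical name, and the sketch covers only the import step, not pipeline construction.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def load_class_or_fallback(module_name, class_name, fallback):
    """Return (cls, True) if the class imports, else (fallback, False).

    Hypothetical helper; the boolean plays the role of HAS_PIPELINE.
    """
    try:
        module = importlib.import_module(module_name)
        return getattr(module, class_name), True
    except (ImportError, AttributeError) as exc:
        # Missing module or missing attribute: fall back and say why
        logger.warning("Using fallback for %s.%s: %s",
                       module_name, class_name, exc)
        return fallback, False
```

A call such as `load_class_or_fallback("src.pipeline", "TTSPipeline", DummyPipeline)` would then replace the nested blocks, with `DummyPipeline` being whatever placeholder class the app defines.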

deploy.py CHANGED

@@ -24,12 +24,16 @@ def backup_original():
 def deploy_optimized():
     """Deploy the optimized version."""
-    if os.path.exists("app_optimized.py"):
         shutil.copy2("app_optimized.py", "app.py")
         print("✅ Optimized version deployed as app.py")
         print("🚀 Ready for Hugging Face Spaces deployment!")
     else:
-        print("❌ app_optimized.py not found")
         return False
     return True

 def deploy_optimized():
     """Deploy the optimized version."""
+    if os.path.exists("app_simple.py"):
+        shutil.copy2("app_simple.py", "app.py")
+        print("✅ Simple optimized version deployed as app.py")
+        print("🚀 Ready for Hugging Face Spaces deployment!")
+    elif os.path.exists("app_optimized.py"):
         shutil.copy2("app_optimized.py", "app.py")
         print("✅ Optimized version deployed as app.py")
         print("🚀 Ready for Hugging Face Spaces deployment!")
     else:
+        print("❌ No optimized version found")
         return False
     return True
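
Reviewer note on `deploy_optimized`: the new version prefers `app_simple.py` and falls back to `app_optimized.py`. That priority order can be written as a loop over candidates, which avoids repeating the copy and print logic per branch. The sketch below is one way to do it under that assumption; `deploy_first_available` and its `root` parameter are hypothetical and not part of `deploy.py`.

```python
import shutil
from pathlib import Path


def deploy_first_available(candidates, target="app.py", root="."):
    """Copy the first existing candidate file to `target`.

    Returns the name of the file that was deployed, or None if no
    candidate exists. Hypothetical helper for illustration.
    """
    root = Path(root)
    for name in candidates:
        source = root / name
        if source.exists():
            # copy2 preserves file metadata, as in the original script
            shutil.copy2(source, root / target)
            return name
    return None
```

Calling `deploy_first_available(["app_simple.py", "app_optimized.py"])` reproduces the order used in the diff.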

requirements.txt CHANGED

@@ -11,8 +11,8 @@ librosa==0.10.1
 soundfile==0.12.1
 scipy==1.11.4
-# Gradio and web interface (updated to latest stable)
-gradio==4.44.1
 # Text processing
 inflect==7.0.0

 soundfile==0.12.1
 scipy==1.11.4
+# Gradio and web interface (stable version for HF Spaces)
+gradio==4.20.0
 # Text processing
 inflect==7.0.0
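
Reviewer note on the Gradio pin: because the fix depends on a specific version, it can help to read the pins back out of `requirements.txt` and compare them with what is installed. The sketch below parses only simple `name==version` lines. It ignores extras, environment markers, and other specifiers, and `parse_pins` is a hypothetical helper.

```python
def parse_pins(requirements_text):
    """Map package name to pinned version for `name==version` lines.

    Simplified parser for illustration; comments and unpinned lines
    are skipped.
    """
    pins = {}
    for raw_line in requirements_text.splitlines():
        # Drop trailing comments and surrounding whitespace
        line = raw_line.split("#", 1)[0].strip()
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins
```

The result for `gradio` could then be compared against `importlib.metadata.version("gradio")` at startup and a warning logged on mismatch.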