Upload app.py
app.py CHANGED
@@ -268,13 +268,17 @@ class UltimateModelLoader:
         for model_name, config in self.model_configs.items():
             # Skip resource-intensive models on limited systems
             if not has_gpu and config["params"] > 500_000_000:
+                print(f"⚠️ Skipping {config['display_name']} - too large for CPU ({config['params']:,} > 500M)")
                 continue
-            if memory_gb <
+            if memory_gb < 3 and config["params"] > 150_000_000:
+                print(f"⚠️ Skipping {config['display_name']} - insufficient RAM ({memory_gb:.1f}GB < 3GB for {config['params']:,})")
                 continue
             # More reasonable Mamba filtering - only skip very large models on low memory
             if memory_gb < 12 and "mamba" in model_name.lower() and config["params"] > 1_000_000_000:
+                print(f"⚠️ Skipping {config['display_name']} - large Mamba model needs more RAM")
                 continue
 
+            print(f"✅ Available: {config['display_name']} ({config['params']:,} params)")
             available_models.append((model_name, config))
 
         # Sort by preference and priority
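The filter above relies on has_gpu and memory_gb, which app.py defines before this loop; their definitions are outside this diff. A minimal sketch of how such values are commonly derived, assuming torch and psutil are installed (the helper name detect_resources is hypothetical):

import psutil
import torch

def detect_resources():
    # True when PyTorch can see a CUDA device
    has_gpu = torch.cuda.is_available()
    # Total system RAM in gigabytes
    memory_gb = psutil.virtual_memory().total / (1024 ** 3)
    return has_gpu, memory_gb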
@@ -867,8 +871,10 @@ class UltimateMambaSwarm:
 
         # Generate response
         if self.model_loaded:
+            print(f"🧠 Using actual model inference: {self.model_loader.model_name}")
             response = self._generate_with_ultimate_model(prompt, max_length, temperature, top_p, domain)
         else:
+            print(f"🔄 Using fallback response system (no model loaded)")
             response = self._generate_ultimate_fallback(prompt, domain)
 
         # Quality validation
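The new print calls write to stdout, which is where Hugging Face Spaces surfaces container logs. If levels or timestamps become useful later, the same diagnostics could route through the standard logging module; a sketch under that assumption (log_generation_path is a hypothetical helper, not part of app.py):

import logging

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger("mamba_swarm")

def log_generation_path(model_loaded: bool, model_name: str) -> None:
    # Mirrors the branch diagnostics added in the hunk above
    if model_loaded:
        logger.info("Using actual model inference: %s", model_name)
    else:
        logger.info("Using fallback response system (no model loaded)")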
@@ -1378,7 +1384,6 @@ Continued research, development, and practical application will likely yield add
 
 **⚡ Mamba Swarm Performance:**
 - **Architecture**: Mamba Encoder Swarm (CPU Alternative Mode)
-- **Active Model**: {model_info}
 - **Model Size**: {routing_info['model_size'].title()}
 - **Selected Encoders**: {routing_info['total_active']}/100
 - **Hardware**: {self.model_loader.device}
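The bullet list in this hunk sits inside an f-string template, so the {...} placeholders are resolved when the report is built; the commit simply drops the Active Model line. A self-contained rendering sketch (the routing_info and device values below are made up for illustration):

routing_info = {"model_size": "small", "total_active": 8}
device = "cpu"

report = f"""**⚡ Mamba Swarm Performance:**
- **Architecture**: Mamba Encoder Swarm (CPU Alternative Mode)
- **Model Size**: {routing_info['model_size'].title()}
- **Selected Encoders**: {routing_info['total_active']}/100
- **Hardware**: {device}"""
print(report)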