2025-05-29 14:50:28,409 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-29 14:50:28,409 - __main__ - DEBUG - API key found, length: 39
2025-05-29 14:50:28,409 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-29 14:50:28,409 - auto_diffusers - DEBUG - API key length: 39
2025-05-29 14:50:28,409 - auto_diffusers - INFO - Successfully configured Gemini AI model
2025-05-29 14:50:28,409 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-29 14:50:28,409 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-29 14:50:28,409 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-29 14:50:28,409 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-29 14:50:28,409 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-29 14:50:28,413 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-29 14:50:28,413 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-29 14:50:28,856 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-29 14:50:28,856 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-29 14:50:28,856 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-29 14:50:28,856 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-29 14:50:28,856 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-29 14:50:28,856 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-29 14:50:28,856 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-29 14:50:28,856 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-29 14:50:28,856 - __main__ - ERROR - Failed to initialize SimpleMemoryCalculator: 'SimpleMemoryCalculator' object has no attribute 'known_models'
2025-05-29 14:52:16,109 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-29 14:52:16,109 - __main__ - DEBUG - API key found, length: 39
2025-05-29 14:52:16,109 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-29 14:52:16,109 - auto_diffusers - DEBUG - API key length: 39
2025-05-29 14:52:16,109 - auto_diffusers - INFO - Successfully configured Gemini AI model
2025-05-29 14:52:16,109 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-29 14:52:16,109 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-29 14:52:16,109 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-29 14:52:16,109 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-29 14:52:16,109 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-29 14:52:16,113 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-29 14:52:16,113 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-29 14:52:16,551 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-29 14:52:16,551 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-29 14:52:16,551 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-29 14:52:16,551 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-29 14:52:16,551 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-29 14:52:16,551 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-29 14:52:16,551 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-29 14:52:16,551 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-29 14:52:16,551 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-29 14:52:16,551 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-29 14:52:16,551 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-29 14:52:16,553 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-29 14:52:16,566 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-29 14:52:16,572 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-29 14:52:16,648 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-29 14:52:16,683 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-29 14:52:16,684 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 14:52:16,684 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 14:52:16,684 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 14:52:16,684 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 14:52:16,684 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 14:52:16,685 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 14:52:16,685 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 05:52:16 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-29 14:52:16,685 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-29 14:52:16,685 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 14:52:16,685 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 14:52:16,685 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 14:52:16,685 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 14:52:16,685 - httpcore.connection - DEBUG - close.started
2025-05-29 14:52:16,685 - httpcore.connection - DEBUG - close.complete
2025-05-29 14:52:16,686 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-29 14:52:16,686 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 14:52:16,686 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 14:52:16,686 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 14:52:16,686 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 14:52:16,686 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 14:52:16,686 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 14:52:16,692 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 05:52:16 GMT'), (b'server', b'uvicorn'), (b'content-length', b'73070'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-29 14:52:16,692 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-29 14:52:16,692 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 14:52:16,692 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 14:52:16,692 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 14:52:16,692 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 14:52:16,692 - httpcore.connection - DEBUG - close.started
2025-05-29 14:52:16,692 - httpcore.connection - DEBUG - close.complete
2025-05-29 14:52:16,703 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-29 14:52:16,842 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 14:52:16,842 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-29 14:52:16,852 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-29 14:52:16,861 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 14:52:16,861 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-29 14:52:17,179 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-29 14:52:17,179 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 14:52:17,180 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 14:52:17,180 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 14:52:17,180 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 14:52:17,180 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 14:52:17,182 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-29 14:52:17,182 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 14:52:17,182 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 14:52:17,182 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 14:52:17,183 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 14:52:17,183 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 14:52:17,340 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 05:52:17 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-29 14:52:17,340 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-29 14:52:17,340 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 14:52:17,340 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 14:52:17,340 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 14:52:17,340 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 14:52:17,340 - httpcore.connection - DEBUG - close.started
2025-05-29 14:52:17,340 - httpcore.connection - DEBUG - close.complete
2025-05-29 14:52:17,354 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 05:52:17 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-29 14:52:17,355 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-29 14:52:17,355 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 14:52:17,355 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 14:52:17,355 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 14:52:17,355 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 14:52:17,355 - httpcore.connection - DEBUG - close.started
2025-05-29 14:52:17,355 - httpcore.connection - DEBUG - close.complete
2025-05-29 14:52:18,360 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-29 14:52:18,573 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-29 15:59:34,212 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-29 15:59:34,212 - __main__ - DEBUG - API key found, length: 39
2025-05-29 15:59:34,212 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-29 15:59:34,212 - auto_diffusers - DEBUG - API key length: 39
2025-05-29 15:59:34,212 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-29 15:59:34,212 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-29 15:59:34,212 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-29 15:59:34,212 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-29 15:59:34,212 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-29 15:59:34,212 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-29 15:59:34,216 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-29 15:59:34,216 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-29 15:59:34,645 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-29 15:59:34,646 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-29 15:59:34,646 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-29 15:59:34,646 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-29 15:59:34,646 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-29 15:59:34,646 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-29 15:59:34,646 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-29 15:59:34,646 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-29 15:59:34,646 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-29 15:59:34,646 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-29 15:59:34,646 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
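The hardware_detector records above follow a fixed sequence: platform and CPU info, an nvidia-smi probe (absent on this Apple-silicon Mac), then a PyTorch CUDA/MPS check. A minimal sketch of that sequence is below; the function name and dict keys mirror the logged `Detected specs` output but are illustrative, not the project's actual API.

```python
# Hedged sketch of the detection flow implied by the hardware_detector log
# lines. Keys mirror the logged "Detected specs" dict; names are assumptions.
import os
import platform
import shutil
import subprocess

def detect_hardware() -> dict:
    specs = {
        "platform": platform.system(),        # e.g. 'Darwin'
        "architecture": platform.machine(),   # e.g. 'arm64'
        "cpu_count": os.cpu_count(),
        "python_version": platform.python_version(),
        "gpu_info": None,
        "cuda_available": False,
        "mps_available": False,
        "torch_version": None,
    }
    # GPU probe: nvidia-smi is not on PATH on this machine, as in the log.
    if shutil.which("nvidia-smi"):
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,memory.total",
             "--format=csv,noheader"],
            capture_output=True, text=True,
        )
        if result.returncode == 0:
            specs["gpu_info"] = result.stdout.strip().splitlines()
    # PyTorch check is optional; the detector degrades gracefully without it.
    try:
        import torch
        specs["torch_version"] = torch.__version__
        specs["cuda_available"] = torch.cuda.is_available()
        specs["mps_available"] = torch.backends.mps.is_available()
    except ImportError:
        pass
    return specs
```

On the machine in this log, such a probe would fall through the nvidia-smi branch (leaving `gpu_info` as None) and report CUDA False / MPS True from the PyTorch check.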
2025-05-29 15:59:34,648 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-29 15:59:34,661 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-29 15:59:34,667 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-29 15:59:34,749 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-29 15:59:34,784 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-29 15:59:34,785 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 15:59:34,785 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 15:59:34,785 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 15:59:34,785 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 15:59:34,785 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 15:59:34,785 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 15:59:34,785 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 06:59:34 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-29 15:59:34,786 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-29 15:59:34,786 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 15:59:34,786 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 15:59:34,786 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 15:59:34,786 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 15:59:34,786 - httpcore.connection - DEBUG - close.started
2025-05-29 15:59:34,786 - httpcore.connection - DEBUG - close.complete
2025-05-29 15:59:34,786 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-29 15:59:34,787 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 15:59:34,787 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 15:59:34,787 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 15:59:34,787 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 15:59:34,787 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 15:59:34,787 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 15:59:34,792 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 06:59:34 GMT'), (b'server', b'uvicorn'), (b'content-length', b'73058'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-29 15:59:34,792 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-29 15:59:34,792 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 15:59:34,792 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 15:59:34,792 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 15:59:34,792 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 15:59:34,793 - httpcore.connection - DEBUG - close.started
2025-05-29 15:59:34,793 - httpcore.connection - DEBUG - close.complete
2025-05-29 15:59:34,803 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-29 15:59:34,825 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 15:59:34,825 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-29 15:59:34,940 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 15:59:34,940 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-29 15:59:34,971 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-29 15:59:35,099 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-29 15:59:35,099 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 15:59:35,100 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 15:59:35,100 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 15:59:35,100 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 15:59:35,100 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 15:59:35,222 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-29 15:59:35,222 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 15:59:35,223 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 15:59:35,223 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 15:59:35,223 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 15:59:35,223 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 15:59:35,237 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 06:59:35 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-29 15:59:35,238 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-29 15:59:35,238 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 15:59:35,238 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 15:59:35,238 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 15:59:35,238 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 15:59:35,238 - httpcore.connection - DEBUG - close.started
2025-05-29 15:59:35,239 - httpcore.connection - DEBUG - close.complete
2025-05-29 15:59:35,362 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 06:59:35 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-29 15:59:35,363 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-29 15:59:35,363 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 15:59:35,364 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 15:59:35,364 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 15:59:35,364 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 15:59:35,365 - httpcore.connection - DEBUG - close.started
2025-05-29 15:59:35,365 - httpcore.connection - DEBUG - close.complete
2025-05-29 15:59:36,012 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-29 15:59:36,260 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-29 16:02:12,960 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:12,961 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:12,961 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-29 16:02:12,961 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-29 16:02:12,961 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:12,962 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:12,962 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-29 16:02:12,962 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:12,962 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:32,785 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:32,785 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:32,785 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-29 16:02:32,785 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:32,785 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:32,786 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-29 16:02:32,786 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:32,786 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:47,460 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:47,460 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:47,460 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-29 16:02:47,460 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:47,460 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:47,460 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-29 16:02:47,460 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:47,460 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:52,300 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:52,300 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:52,300 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 32.0GB VRAM
2025-05-29 16:02:52,300 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:52,300 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:52,300 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-29 16:02:52,300 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:52,300 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:52,452 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:52,452 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:52,452 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 32.0GB VRAM
2025-05-29 16:02:52,452 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:52,452 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:52,452 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-29 16:02:52,452 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:52,452 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:54,804 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:54,804 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:54,804 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 32.0GB VRAM
2025-05-29 16:02:54,804 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:54,804 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:54,804 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-29 16:02:54,804 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:54,804 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:54,804 - auto_diffusers - INFO - Starting code generation for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:02:54,804 - auto_diffusers - DEBUG - Parameters: prompt='A cat holding a sign that says hello world...', size=(768, 1360), steps=4
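The simple_memory_calculator records above suggest a small known-models table, a per-model cache ("Using cached memory data"), and a recommendation computed from inference memory versus available VRAM; the first run's AttributeError also indicates that `known_models` must be defined in `__init__` before any lookup. A hedged sketch of that shape follows; all method names and recommendation strings are illustrative, and only the FLUX.1-schnell figures come from the log.

```python
# Hedged sketch of what the simple_memory_calculator log entries suggest.
# Only the FLUX.1-schnell numbers are from the log; names are assumptions.
class SimpleMemoryCalculator:
    def __init__(self):
        # Defining this up front avoids the error logged in the first run:
        # 'SimpleMemoryCalculator' object has no attribute 'known_models'
        self.known_models = {
            "black-forest-labs/FLUX.1-schnell": {
                "params_billions": 12.0,
                "fp16_gb": 24.0,            # model weights in fp16
                "inference_fp16_gb": 36.0,  # weights plus activation headroom
            },
        }
        self._cache = {}

    def get_memory_requirements(self, model_id: str) -> dict:
        if model_id in self._cache:
            return self._cache[model_id]        # "Using cached memory data"
        data = self.known_models.get(model_id)  # "Using known memory data"
        if data is None:
            raise KeyError(f"No memory data for {model_id}")
        self._cache[model_id] = data
        return data

    def recommend(self, model_id: str, vram_gb: float) -> str:
        data = self.get_memory_requirements(model_id)
        if vram_gb >= data["inference_fp16_gb"]:
            return "full fp16 inference fits in VRAM"
        if vram_gb >= data["fp16_gb"]:
            return "weights fit; reduce activation memory (e.g. slicing/tiling)"
        return "use CPU offload or quantization"

calc = SimpleMemoryCalculator()
print(calc.recommend("black-forest-labs/FLUX.1-schnell", 8.0))
print(calc.recommend("black-forest-labs/FLUX.1-schnell", 32.0))
```

With the logged figures, the 8.0 GB case falls below the 24.0 GB weight footprint (offload/quantization path), while the 32.0 GB case fits the weights but not the 36.0 GB inference total.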
2025-05-29 16:02:54,804 - auto_diffusers - DEBUG - Manual specs: True, Memory analysis provided: True
2025-05-29 16:02:54,804 - auto_diffusers - INFO - Using manual hardware specifications
2025-05-29 16:02:54,804 - auto_diffusers - DEBUG - Manual specs: {'platform': 'Linux', 'architecture': 'manual_input', 'cpu_count': 8, 'python_version': '3.11', 'cuda_available': False, 'mps_available': False, 'torch_version': '2.0+', 'manual_input': True, 'ram_gb': 16, 'user_dtype': None, 'gpu_info': [{'name': 'RTX 5090', 'memory_mb': 32768}]}
2025-05-29 16:02:54,804 - auto_diffusers - DEBUG - GPU detected with 32.0 GB VRAM
2025-05-29 16:02:54,804 - auto_diffusers - INFO - Selected optimization profile: performance
2025-05-29 16:02:54,804 - auto_diffusers - DEBUG - Creating generation prompt for Gemini API
2025-05-29 16:02:54,804 - auto_diffusers - DEBUG - Prompt length: 3456 characters
2025-05-29 16:02:54,804 - auto_diffusers - INFO - Sending request to Gemini API with tool calling enabled
2025-05-29 16:03:10,966 - auto_diffusers - INFO - Successfully received response from Gemini API (no tools used)
2025-05-29 16:03:10,966 - auto_diffusers - DEBUG - Response length: 1710 characters
2025-05-29 16:08:00,894 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-29 16:08:00,894 - __main__ - DEBUG - API key found, length: 39
2025-05-29 16:08:00,894 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-29 16:08:00,894 - auto_diffusers - DEBUG - API key length: 39
2025-05-29 16:08:00,894 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-29 16:08:00,894 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-29 16:08:00,894 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-29 16:08:00,894 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-29 16:08:00,894 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-29 16:08:00,894 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-29 16:08:00,898 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-29 16:08:00,898 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-29 16:08:01,310 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-29 16:08:01,310 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-29 16:08:01,310 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-29 16:08:01,310 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-29 16:08:01,310 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-29 16:08:01,310 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-29 16:08:01,310 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-29 16:08:01,310 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-29 16:08:01,310 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-29 16:08:01,310 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-29 16:08:01,310 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-29 16:08:01,312 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-29 16:08:01,325 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-29 16:08:01,325 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-29 16:08:01,404 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-29 16:08:01,439 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-29 16:08:01,440 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 16:08:01,440 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 16:08:01,440 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 16:08:01,440 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 16:08:01,441 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 16:08:01,441 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 16:08:01,441 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 07:08:01 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-29 16:08:01,441 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-29 16:08:01,441 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 16:08:01,441 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 16:08:01,441 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 16:08:01,441 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 16:08:01,441 - httpcore.connection - DEBUG - close.started
2025-05-29 16:08:01,441 - httpcore.connection - DEBUG - close.complete
2025-05-29 16:08:01,442 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-29 16:08:01,442 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 16:08:01,442 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 16:08:01,442 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 16:08:01,442 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 16:08:01,442 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 16:08:01,442 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 16:08:01,447 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 07:08:01 GMT'), (b'server', b'uvicorn'), (b'content-length', b'73065'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-29 16:08:01,448 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-29 16:08:01,448 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 16:08:01,448 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 16:08:01,448 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 16:08:01,448 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 16:08:01,448 - httpcore.connection - DEBUG - close.started
2025-05-29 16:08:01,448 - httpcore.connection - DEBUG - close.complete
2025-05-29 16:08:01,459 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-29 16:08:01,611 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-29 16:08:01,746 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 16:08:01,746 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-29 16:08:01,764 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 16:08:01,764 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-29 16:08:02,034 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-29 16:08:02,035 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 16:08:02,035 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 16:08:02,036 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 16:08:02,036 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 16:08:02,036 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 16:08:02,101 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-29 16:08:02,101 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 16:08:02,101 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 16:08:02,101 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 16:08:02,101 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 16:08:02,101 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 16:08:02,186 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 07:08:02 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-29 16:08:02,186 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-29 16:08:02,187 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 16:08:02,187 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 16:08:02,187 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 16:08:02,187 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 16:08:02,187 - httpcore.connection - DEBUG - close.started
2025-05-29 16:08:02,187 - httpcore.connection - DEBUG - close.complete
2025-05-29 16:08:02,272 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 07:08:02 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection',
b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-29 16:08:02,273 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-29 16:08:02,273 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 16:08:02,273 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 16:08:02,273 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 16:08:02,273 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 16:08:02,273 - httpcore.connection - DEBUG - close.started
2025-05-29 16:08:02,273 - httpcore.connection - DEBUG - close.complete
2025-05-29 16:08:02,845 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-29 16:08:03,061 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-29 16:08:25,941 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:25,942 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:25,942 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-29 16:08:25,942 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-29 16:08:25,942 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:25,942 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:25,942 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-29 16:08:25,942 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:25,943 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:30,760 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:30,761 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:30,761 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-29 16:08:30,761 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:30,761 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:30,761 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-29 16:08:30,761 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:30,761 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:37,477 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:37,477 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:37,478 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 32.0GB VRAM
2025-05-29 16:08:37,478 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:37,478 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:37,478 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-29 16:08:37,480 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:37,480 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:37,527 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:37,527 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:37,527 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 32.0GB VRAM
2025-05-29 16:08:37,527 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:37,527 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:37,527 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-29 16:08:37,527 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:37,527 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:39,349 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:39,350 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:39,350 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 32.0GB VRAM
2025-05-29 16:08:39,350 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:39,350 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:39,351 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-29 16:08:39,351 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:39,351 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:39,351 - auto_diffusers - INFO - Starting code generation for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:08:39,351 - auto_diffusers - DEBUG - Parameters: prompt='A cat holding a sign that says hello world...', size=(768, 1360), steps=4
2025-05-29 16:08:39,351 - auto_diffusers - DEBUG - Manual specs: True, Memory analysis provided: True
2025-05-29 16:08:39,351 - auto_diffusers - INFO - Using manual hardware specifications
2025-05-29 16:08:39,351 - auto_diffusers - DEBUG - Manual specs: {'platform': 'Linux', 'architecture': 'manual_input', 'cpu_count': 8, 'python_version': '3.11', 'cuda_available': False, 'mps_available': False, 'torch_version': '2.0+', 'manual_input': True, 'ram_gb': 16, 'user_dtype': None, 'gpu_info': [{'name': 'RTX 5090', 'memory_mb': 32768}]}
2025-05-29 16:08:39,352 - auto_diffusers - DEBUG - GPU detected with 32.0 GB VRAM
2025-05-29 16:08:39,352 - auto_diffusers - INFO - Selected optimization profile: performance
2025-05-29 16:08:39,352 - auto_diffusers - DEBUG - Creating generation prompt for Gemini API
2025-05-29 16:08:39,352 - auto_diffusers - DEBUG - Prompt length: 3456 characters
2025-05-29 16:08:39,352 - auto_diffusers - INFO - ================================================================================
2025-05-29 16:08:39,352 - auto_diffusers - INFO - PROMPT SENT TO GEMINI API:
2025-05-29 16:08:39,352 - auto_diffusers - INFO - ================================================================================
2025-05-29 16:08:39,352 - auto_diffusers - INFO - You are an expert in optimizing diffusers library code for different hardware configurations. NOTE: Advanced tool calling features are available when dependencies are installed.
TASK: Generate optimized Python code for running a diffusion model with the following specifications:
- Model: black-forest-labs/FLUX.1-schnell
- Prompt: "A cat holding a sign that says hello world"
- Image size: 768x1360
- Inference steps: 4

HARDWARE SPECIFICATIONS:
- Platform: Linux (manual_input)
- CPU Cores: 8
- CUDA Available: False
- MPS Available: False
- Optimization Profile: performance
- GPU: RTX 5090 (32.0 GB VRAM)

MEMORY ANALYSIS:
- Model Memory Requirements: 36.0 GB (FP16 inference)
- Model Weights Size: 24.0 GB (FP16)
- Memory Recommendation: ⚠️ Model weights fit, enable memory optimizations
- Recommended Precision: float16
- Attention Slicing Recommended: True
- VAE Slicing Recommended: True

OPTIMIZATION REQUIREMENTS:
Please scrape and analyze the latest optimization techniques from this URL: https://huggingface.co/docs/diffusers/main/en/optimization

IMPORTANT: For FLUX.1-schnell models, do NOT include guidance_scale parameter as it's not needed.

Based on the hardware specs and optimization profile, generate Python code that includes:
1. **Memory Optimizations** (if low VRAM):
   - Model offloading (enable_model_cpu_offload, enable_sequential_cpu_offload)
   - Attention slicing (enable_attention_slicing)
   - VAE slicing (enable_vae_slicing)
   - Memory efficient attention
2. **Speed Optimizations**:
   - Appropriate torch.compile() usage
   - Optimal dtype selection (torch.float16, torch.bfloat16)
   - Device placement optimization
3. **Hardware-Specific Optimizations**:
   - CUDA optimizations for NVIDIA GPUs
   - MPS optimizations for Apple Silicon
   - CPU fallbacks when needed
4. **Model-Specific Optimizations**:
   - Appropriate scheduler selection
   - Optimal inference parameters
   - Pipeline configuration
5. **Data Type (dtype) Selection**:
   - If user specified a dtype, use that exact dtype in the code
   - If no dtype specified, automatically select the optimal dtype based on hardware:
     * Apple Silicon (MPS): prefer torch.bfloat16
     * NVIDIA GPUs: prefer torch.float16 or torch.bfloat16 based on capability
     * CPU only: use torch.float32
   - Add a comment explaining why that dtype was chosen

IMPORTANT GUIDELINES:
- Include all necessary imports
- Add brief comments explaining optimization choices
- Use the most current and effective optimization techniques
- Ensure code is production-ready

CODE STYLE REQUIREMENTS - GENERATE COMPACT CODE:
- Assign static values directly to function arguments instead of using variables when possible
- Minimize variable declarations - inline values where it improves readability
- Reduce exception handling to essential cases only - assume normal operation
- Use concise, direct code patterns
- Combine operations where logical and readable
- Avoid unnecessary intermediate variables
- Keep code clean and minimal while maintaining functionality

Examples of preferred compact style:
- pipe = Pipeline.from_pretrained("model", torch_dtype=torch.float16) instead of storing dtype in variable
- image = pipe("prompt", num_inference_steps=4, height=768, width=1360) instead of separate variables
- Direct assignment: device = "cuda" if torch.cuda.is_available() else "cpu"

Generate ONLY the Python code, no explanations before or after the code block.
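The log records only the length of the Gemini response (2512 characters), never its content, so the generated script itself is not available here. As a point of reference, the following is a minimal sketch of the kind of code the prompt above requests, assuming the `diffusers` `FluxPipeline` API; the function names `select_dtype` and `generate` are illustrative, not part of the logged application.

```python
def select_dtype(cuda_available: bool, mps_available: bool) -> str:
    # Dtype rule stated in the prompt: Apple Silicon (MPS) -> bfloat16,
    # NVIDIA GPU -> float16, CPU only -> float32.
    # The caller can map the name to a real dtype via getattr(torch, name).
    if mps_available:
        return "bfloat16"
    if cuda_available:
        return "float16"
    return "float32"


def generate(prompt: str = "A cat holding a sign that says hello world"):
    # Heavy imports are kept inside the function so the module can be
    # imported without torch/diffusers installed.
    import torch
    from diffusers import FluxPipeline

    # Per the memory analysis: 24 GB of FP16 weights fit in 32 GB of VRAM,
    # but the ~36 GB inference peak does not, so the memory-saving switches
    # recommended in the prompt are enabled.
    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.float16
    )
    pipe.enable_model_cpu_offload()  # stream submodules to the GPU on demand
    pipe.enable_attention_slicing()  # lower attention memory at some speed cost
    pipe.enable_vae_slicing()        # decode latents slice by slice
    # FLUX.1-schnell is a few-step distilled model; guidance_scale is omitted,
    # as the prompt instructs.
    return pipe(prompt, num_inference_steps=4, height=768, width=1360).images[0]
```

Whether the actual Gemini output matched this shape cannot be confirmed from the log.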
2025-05-29 16:08:39,353 - auto_diffusers - INFO - ================================================================================
2025-05-29 16:08:39,353 - auto_diffusers - INFO - Sending request to Gemini API
2025-05-29 16:08:54,821 - auto_diffusers - INFO - Successfully received response from Gemini API (no tools used)
2025-05-29 16:08:54,821 - auto_diffusers - DEBUG - Response length: 2512 characters
2025-05-29 16:16:14,940 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-29 16:16:14,940 - __main__ - DEBUG - API key found, length: 39
2025-05-29 16:16:14,940 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-29 16:16:14,940 - auto_diffusers - DEBUG - API key length: 39
2025-05-29 16:16:14,940 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-29 16:16:14,940 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-29 16:16:14,940 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-29 16:16:14,940 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-29 16:16:14,940 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-29 16:16:14,940 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-29 16:16:14,943 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-29 16:16:14,943 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-29 16:16:15,359 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-29 16:16:15,359 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-29 16:16:15,359 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-29 16:16:15,359 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-29 16:16:15,359 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-29 16:16:15,359 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-29 16:16:15,359 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-29 16:16:15,359 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-29 16:16:15,359 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-29 16:16:15,359 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-29 16:16:15,359 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-29 16:16:15,362 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-29 16:16:15,374 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-29 16:16:15,381 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-29 16:16:15,454 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-29 16:16:15,488 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-29 16:16:15,489 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 16:16:15,489 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 16:16:15,489 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 16:16:15,489 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 16:16:15,489 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 16:16:15,489 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 16:16:15,490 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 07:16:15 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-29 16:16:15,490 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-29 16:16:15,490 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 16:16:15,490 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 16:16:15,490 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 16:16:15,490 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 16:16:15,490 - httpcore.connection - DEBUG - close.started
2025-05-29 16:16:15,490 - httpcore.connection - DEBUG - close.complete
2025-05-29 16:16:15,490 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-29 16:16:15,491 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 16:16:15,491 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 16:16:15,491 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 16:16:15,491 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 16:16:15,491 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 16:16:15,491 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 16:16:15,496 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 07:16:15 GMT'), (b'server', b'uvicorn'), (b'content-length', b'73064'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-29 16:16:15,496 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-29 16:16:15,496 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 16:16:15,496 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 16:16:15,496 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 16:16:15,496 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 16:16:15,496 - httpcore.connection - DEBUG - close.started
2025-05-29 16:16:15,496 - httpcore.connection - DEBUG - close.complete
2025-05-29 16:16:15,507 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-29 16:16:15,593 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 16:16:15,593 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-29 16:16:15,648 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 16:16:15,648 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-29 16:16:15,663 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-29 16:16:15,894 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-29 16:16:15,895 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 16:16:15,896 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 16:16:15,896 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 16:16:15,896 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 16:16:15,896 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 16:16:15,930 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-29 16:16:15,931 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 16:16:15,931 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 16:16:15,931 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 16:16:15,931 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 16:16:15,931 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 16:16:16,047 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 07:16:16 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-29 16:16:16,047 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-29 16:16:16,047 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 16:16:16,048 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 16:16:16,048 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 16:16:16,048 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 16:16:16,048 - httpcore.connection - DEBUG - close.started
2025-05-29 16:16:16,048 - httpcore.connection - DEBUG - close.complete
2025-05-29 16:16:16,073 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 07:16:16 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-29 16:16:16,074 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-29 16:16:16,074 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 16:16:16,074 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 16:16:16,074 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 16:16:16,074 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 16:16:16,074 - httpcore.connection - DEBUG - close.started
2025-05-29 16:16:16,074 - httpcore.connection - DEBUG - close.complete
2025-05-29 16:16:16,750 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-29 16:16:16,967 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-29 16:16:50,011 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:16:50,012 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:16:50,012 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-29 16:16:50,012 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-29 16:16:50,012 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:16:50,012 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:16:50,012 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-29 16:16:50,012 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:16:50,012 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:16:56,212 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:16:56,212 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:16:56,212 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-29 16:16:56,212 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:16:56,212 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:16:56,212 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-29 16:16:56,212 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:16:56,212 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:17:00,382 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:17:00,382 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:17:00,383 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 32.0GB VRAM
2025-05-29 16:17:00,383 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:17:00,383 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:17:00,383 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-29 16:17:00,383 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:17:00,383 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:17:00,534 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:17:00,534 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:17:00,534 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 32.0GB VRAM
2025-05-29 16:17:00,534 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:17:00,534 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:17:00,534 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-29 16:17:00,534 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:17:00,535 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:17:02,112 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:17:02,112 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:17:02,112 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 32.0GB VRAM
2025-05-29 16:17:02,112 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:17:02,112 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:17:02,112 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-29 16:17:02,112 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:17:02,112 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:17:02,112 - auto_diffusers - INFO - Starting code generation for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:17:02,112 - auto_diffusers - DEBUG - Parameters: prompt='A cat holding a sign that says hello world...', size=(768, 1360), steps=4
2025-05-29 16:17:02,112 - auto_diffusers - DEBUG - Manual specs: True, Memory analysis provided: True
2025-05-29 16:17:02,112 - auto_diffusers - INFO - Using manual hardware specifications
2025-05-29 16:17:02,112 - auto_diffusers - DEBUG - Manual specs: {'platform': 'Linux', 'architecture': 'manual_input', 'cpu_count': 8, 'python_version': '3.11', 'cuda_available': False, 'mps_available': False, 'torch_version': '2.0+', 'manual_input': True, 'ram_gb': 16, 'user_dtype': None, 'gpu_info': [{'name': 'RTX 5090', 'memory_mb': 32768}]}
2025-05-29 16:17:02,112 - auto_diffusers - DEBUG - GPU detected with 32.0 GB VRAM
2025-05-29 16:17:02,112 - auto_diffusers - INFO - Selected optimization profile: performance
2025-05-29 16:17:02,112 - auto_diffusers - DEBUG - Creating generation prompt for Gemini API
2025-05-29 16:17:02,113 - auto_diffusers - DEBUG - Prompt length: 3456 characters
2025-05-29 16:17:02,113 - auto_diffusers - INFO - ================================================================================
2025-05-29 16:17:02,113 - auto_diffusers - INFO - PROMPT SENT TO GEMINI API:
2025-05-29 16:17:02,113 - auto_diffusers - INFO - ================================================================================
2025-05-29 16:17:02,113 - auto_diffusers - INFO - You are an expert in optimizing diffusers library code for different hardware configurations. NOTE: Advanced tool calling features are available when dependencies are installed.
TASK: Generate optimized Python code for running a diffusion model with the following specifications:
- Model: black-forest-labs/FLUX.1-schnell
- Prompt: "A cat holding a sign that says hello world"
- Image size: 768x1360
- Inference steps: 4

HARDWARE SPECIFICATIONS:
- Platform: Linux (manual_input)
- CPU Cores: 8
- CUDA Available: False
- MPS Available: False
- Optimization Profile: performance
- GPU: RTX 5090 (32.0 GB VRAM)

MEMORY ANALYSIS:
- Model Memory Requirements: 36.0 GB (FP16 inference)
- Model Weights Size: 24.0 GB (FP16)
- Memory Recommendation: ⚠️ Model weights fit, enable memory optimizations
- Recommended Precision: float16
- Attention Slicing Recommended: True
- VAE Slicing Recommended: True

OPTIMIZATION REQUIREMENTS:
Please scrape and analyze the latest optimization techniques from this URL: https://huggingface.co/docs/diffusers/main/en/optimization

IMPORTANT: For FLUX.1-schnell models, do NOT include guidance_scale parameter as it's not needed.

Based on the hardware specs and optimization profile, generate Python code that includes:
1. **Memory Optimizations** (if low VRAM):
   - Model offloading (enable_model_cpu_offload, enable_sequential_cpu_offload)
   - Attention slicing (enable_attention_slicing)
   - VAE slicing (enable_vae_slicing)
   - Memory efficient attention
2. **Speed Optimizations**:
   - Appropriate torch.compile() usage
   - Optimal dtype selection (torch.float16, torch.bfloat16)
   - Device placement optimization
3. **Hardware-Specific Optimizations**:
   - CUDA optimizations for NVIDIA GPUs
   - MPS optimizations for Apple Silicon
   - CPU fallbacks when needed
4. **Model-Specific Optimizations**:
   - Appropriate scheduler selection
   - Optimal inference parameters
   - Pipeline configuration
5. **Data Type (dtype) Selection**:
   - If user specified a dtype, use that exact dtype in the code
   - If no dtype specified, automatically select the optimal dtype based on hardware:
     * Apple Silicon (MPS): prefer torch.bfloat16
     * NVIDIA GPUs: prefer torch.float16 or torch.bfloat16 based on capability
     * CPU only: use torch.float32
   - Add a comment explaining why that dtype was chosen

IMPORTANT GUIDELINES:
- Include all necessary imports
- Add brief comments explaining optimization choices
- Use the most current and effective optimization techniques
- Ensure code is production-ready

CODE STYLE REQUIREMENTS - GENERATE COMPACT CODE:
- Assign static values directly to function arguments instead of using variables when possible
- Minimize variable declarations - inline values where it improves readability
- Reduce exception handling to essential cases only - assume normal operation
- Use concise, direct code patterns
- Combine operations where logical and readable
- Avoid unnecessary intermediate variables
- Keep code clean and minimal while maintaining functionality

Examples of preferred compact style:
- pipe = Pipeline.from_pretrained("model", torch_dtype=torch.float16) instead of storing dtype in variable
- image = pipe("prompt", num_inference_steps=4, height=768, width=1360) instead of separate variables
- Direct assignment: device = "cuda" if torch.cuda.is_available() else "cpu"

Generate ONLY the Python code, no explanations before or after the code block.
2025-05-29 16:17:02,113 - auto_diffusers - INFO - ================================================================================
2025-05-29 16:17:02,113 - auto_diffusers - INFO - Sending request to Gemini API
2025-05-29 16:17:17,152 - auto_diffusers - INFO - Successfully received response from Gemini API (no tools used)
2025-05-29 16:17:17,153 - auto_diffusers - DEBUG - Response length: 2451 characters
2025-05-29 16:47:23,476 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-29 16:47:23,476 - __main__ - DEBUG - API key found, length: 39
2025-05-29 16:47:23,476 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-29 16:47:23,476 - auto_diffusers - DEBUG - API key length: 39
2025-05-29 16:47:23,477 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-29 16:47:23,477 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-29 16:47:23,477 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-29 16:47:23,477 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-29 16:47:23,477 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-29 16:47:23,477 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-29 16:47:23,480 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-29 16:47:23,480 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-29 16:47:23,928 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-29 16:47:23,928 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-29 16:47:23,928 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-29 16:47:23,928 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-29 16:47:23,928 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-29 16:47:23,928 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-29 16:47:23,928 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-29 16:47:23,928 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-29 16:47:23,928 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-29 16:47:23,928 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-29 16:47:23,928 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-29 16:47:23,930 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-29 16:47:23,944 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-29 16:47:23,950 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-29 16:47:24,025 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-29 16:47:24,059 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-29 16:47:24,060 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 16:47:24,060 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 16:47:24,060 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 16:47:24,060 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 16:47:24,061 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 16:47:24,061 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 16:47:24,061 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 07:47:24 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-29 16:47:24,061 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-29 16:47:24,061 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 16:47:24,061 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 16:47:24,061 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 16:47:24,061 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 16:47:24,061 - httpcore.connection - DEBUG - close.started
2025-05-29 16:47:24,061 - httpcore.connection - DEBUG - close.complete
2025-05-29 16:47:24,062 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-29 16:47:24,062 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 16:47:24,062 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 16:47:24,062 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 16:47:24,062 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 16:47:24,062 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 16:47:24,062 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 16:47:24,068 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 07:47:24 GMT'), (b'server', b'uvicorn'), (b'content-length', b'73064'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-29 16:47:24,068 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-29 16:47:24,068 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 16:47:24,068 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 16:47:24,068 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 16:47:24,068 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 16:47:24,068 - httpcore.connection - DEBUG - close.started
2025-05-29 16:47:24,068 - httpcore.connection - DEBUG - close.complete
2025-05-29 16:47:24,079 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-29 16:47:24,140 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 16:47:24,140 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-29 16:47:24,220 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 16:47:24,220 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-29 16:47:24,223 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-29 16:47:24,415 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-29 16:47:24,415 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 16:47:24,415 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 16:47:24,415 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 16:47:24,415 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 16:47:24,415 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 16:47:24,504 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-29 16:47:24,505 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 16:47:24,505 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 16:47:24,505 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 16:47:24,505 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 16:47:24,505 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 16:47:24,553 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 07:47:24 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-29 16:47:24,554 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-29 16:47:24,554 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 16:47:24,554 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 16:47:24,554 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 16:47:24,554 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 16:47:24,554 - httpcore.connection - DEBUG - close.started
2025-05-29 16:47:24,554 - httpcore.connection - DEBUG - close.complete
2025-05-29 16:47:24,648 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 07:47:24 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-29 16:47:24,648 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-29 16:47:24,648 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 16:47:24,648 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 16:47:24,648 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 16:47:24,648 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 16:47:24,649 - httpcore.connection - DEBUG - close.started
2025-05-29 16:47:24,649 - httpcore.connection - DEBUG - close.complete
2025-05-29 16:47:25,332 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-29 16:47:25,554 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-29 16:47:35,239 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:35,239 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:35,239 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-29 16:47:35,239 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-29 16:47:35,239 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:35,239 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:35,239 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-29 16:47:35,240 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:35,240 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:40,282 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:40,282 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:40,282 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-29 16:47:40,282 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:40,282 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:40,282 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-29 16:47:40,282 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:40,282 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:43,895 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:43,895 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:43,895 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 32.0GB VRAM
2025-05-29 16:47:43,895 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:43,895 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:43,895 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-29 16:47:43,895 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:43,896 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:44,048 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:44,048 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:44,048 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 32.0GB VRAM
2025-05-29 16:47:44,048 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:44,048 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:44,048 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-29 16:47:44,048 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:44,048 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:48,011 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:48,011 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:48,011 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 32.0GB VRAM
2025-05-29 16:47:48,011 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:48,011 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:48,011 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-29 16:47:48,011 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:48,012 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:48,012 - auto_diffusers - INFO - Starting code generation for model: black-forest-labs/FLUX.1-schnell
2025-05-29 16:47:48,012 - auto_diffusers - DEBUG - Parameters: prompt='A cat holding a sign that says hello world...', size=(768, 1360), steps=4
2025-05-29 16:47:48,012 - auto_diffusers - DEBUG - Manual specs: True, Memory analysis provided: True
2025-05-29 16:47:48,012 - auto_diffusers - INFO - Using manual hardware specifications
2025-05-29 16:47:48,012 - auto_diffusers - DEBUG - Manual specs: {'platform': 'Linux', 'architecture': 'manual_input', 'cpu_count': 8, 'python_version': '3.11', 'cuda_available': False, 'mps_available': False, 'torch_version': '2.0+', 'manual_input': True, 'ram_gb': 16, 'user_dtype': None, 'gpu_info': [{'name': 'RTX 5090', 'memory_mb': 32768}]}
2025-05-29 16:47:48,012 - auto_diffusers - DEBUG - GPU detected with 32.0 GB VRAM
2025-05-29 16:47:48,012 - auto_diffusers - INFO - Selected optimization profile: performance
2025-05-29 16:47:48,012 - auto_diffusers - DEBUG - Creating generation prompt for Gemini API
2025-05-29 16:47:48,012 - auto_diffusers - DEBUG - Prompt length: 7613 characters
2025-05-29 16:47:48,012 - auto_diffusers - INFO - ================================================================================
2025-05-29 16:47:48,012 - auto_diffusers - INFO - PROMPT SENT TO GEMINI API:
2025-05-29 16:47:48,013 - auto_diffusers - INFO - ================================================================================
2025-05-29 16:47:48,013 - auto_diffusers - INFO - You are an expert in optimizing diffusers library code for different hardware configurations. NOTE: This system includes curated optimization knowledge from HuggingFace documentation.

TASK: Generate optimized Python code for running a diffusion model with the following specifications:
- Model: black-forest-labs/FLUX.1-schnell
- Prompt: "A cat holding a sign that says hello world"
- Image size: 768x1360
- Inference steps: 4

HARDWARE SPECIFICATIONS:
- Platform: Linux (manual_input)
- CPU Cores: 8
- CUDA Available: False
- MPS Available: False
- Optimization Profile: performance
- GPU: RTX 5090 (32.0 GB VRAM)

MEMORY ANALYSIS:
- Model Memory Requirements: 36.0 GB (FP16 inference)
- Model Weights Size: 24.0 GB (FP16)
- Memory Recommendation: ⚠️ Model weights fit, enable memory optimizations
- Recommended Precision: float16
- Attention Slicing Recommended: True
- VAE Slicing Recommended: True

OPTIMIZATION KNOWLEDGE BASE:

# DIFFUSERS OPTIMIZATION TECHNIQUES

## Memory Optimization Techniques

### 1. Model CPU Offloading
Use `enable_model_cpu_offload()` to move models between GPU and CPU automatically:
```python
pipe.enable_model_cpu_offload()
```
- Saves significant VRAM by keeping only active models on GPU
- Automatic management, no manual intervention needed
- Compatible with all pipelines

### 2. Sequential CPU Offloading
Use `enable_sequential_cpu_offload()` for more aggressive memory saving:
```python
pipe.enable_sequential_cpu_offload()
```
- More memory efficient than model offloading
- Moves models to CPU after each forward pass
- Best for very limited VRAM scenarios

### 3. Attention Slicing
Use `enable_attention_slicing()` to reduce memory during attention computation:
```python
pipe.enable_attention_slicing()
# or specify slice size
pipe.enable_attention_slicing("max")  # maximum slicing
pipe.enable_attention_slicing(1)  # slice_size = 1
```
- Trades compute time for memory
- Most effective for high-resolution images
- Can be combined with other techniques

### 4. VAE Slicing
Use `enable_vae_slicing()` for large batch processing:
```python
pipe.enable_vae_slicing()
```
- Decodes images one at a time instead of all at once
- Essential for batch sizes > 4
- Minimal performance impact on single images

### 5. VAE Tiling
Use `enable_vae_tiling()` for high-resolution image generation:
```python
pipe.enable_vae_tiling()
```
- Enables 4K+ image generation on 8GB VRAM
- Splits images into overlapping tiles
- Automatically disabled for 512x512 or smaller images

### 6. Memory Efficient Attention (xFormers)
Use `enable_xformers_memory_efficient_attention()` if xFormers is installed:
```python
pipe.enable_xformers_memory_efficient_attention()
```
- Significantly reduces memory usage and improves speed
- Requires xformers library installation
- Compatible with most models

## Performance Optimization Techniques

### 1. Half Precision (FP16/BF16)
Use lower precision for better memory and speed:
```python
# FP16 (widely supported)
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
# BF16 (better numerical stability, newer hardware)
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
```
- FP16: Halves memory usage, widely supported
- BF16: Better numerical stability, requires newer GPUs
- Essential for most optimization scenarios

### 2. Torch Compile (PyTorch 2.0+)
Use `torch.compile()` for significant speed improvements:
```python
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
# For some models, compile VAE too:
pipe.vae.decode = torch.compile(pipe.vae.decode, mode="reduce-overhead", fullgraph=True)
```
- 5-50% speed improvement
- Requires PyTorch 2.0+
- First run is slower due to compilation

### 3. Fast Schedulers
Use faster schedulers for fewer steps:
```python
from diffusers import LMSDiscreteScheduler, UniPCMultistepScheduler
# LMS Scheduler (good quality, fast)
pipe.scheduler = LMSDiscreteScheduler.from_config(pipe.scheduler.config)
# UniPC Scheduler (fastest)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
```

## Hardware-Specific Optimizations

### NVIDIA GPU Optimizations
```python
# Enable Tensor Cores
torch.backends.cudnn.benchmark = True
# Optimal data type for NVIDIA
torch_dtype = torch.float16  # or torch.bfloat16 for RTX 30/40 series
```

### Apple Silicon (MPS) Optimizations
```python
# Use MPS device
device = "mps" if torch.backends.mps.is_available() else "cpu"
pipe = pipe.to(device)
# Recommended dtype for Apple Silicon
torch_dtype = torch.bfloat16  # Better than float16 on Apple Silicon
# Attention slicing often helps on MPS
pipe.enable_attention_slicing()
```

### CPU Optimizations
```python
# Use float32 for CPU
torch_dtype = torch.float32
# Enable optimized attention
pipe.enable_attention_slicing()
```

## Model-Specific Guidelines

### FLUX Models
- Do NOT use guidance_scale parameter (not needed for FLUX)
- Use 4-8 inference steps maximum
- BF16 dtype recommended
- Enable attention slicing for memory optimization

### Stable Diffusion XL
- Enable attention slicing for high resolutions
- Use refiner model sparingly to save memory
- Consider VAE tiling for >1024px images

### Stable Diffusion 1.5/2.1
- Very memory efficient base models
- Can often run without optimizations on 8GB+ VRAM
- Enable VAE slicing for batch processing

## Memory Usage Estimation
- FLUX.1: ~24GB for full precision, ~12GB for FP16
- SDXL: ~7GB for FP16, ~14GB for FP32
- SD 1.5: ~2GB for FP16, ~4GB for FP32

## Optimization Combinations by VRAM

### 24GB+ VRAM (High-end)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
pipe = pipe.to("cuda")
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
```

### 12-24GB VRAM (Mid-range)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to("cuda")
pipe.enable_model_cpu_offload()
pipe.enable_xformers_memory_efficient_attention()
```

### 8-12GB VRAM (Entry-level)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing()
pipe.enable_vae_slicing()
pipe.enable_xformers_memory_efficient_attention()
```

### <8GB VRAM (Low-end)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing("max")
pipe.enable_vae_slicing()
pipe.enable_vae_tiling()
```

IMPORTANT: For FLUX.1-schnell models, do NOT include guidance_scale parameter as it's not needed.

Using the OPTIMIZATION KNOWLEDGE BASE above, generate Python code that:
1. **Selects the best optimization techniques** for the specific hardware profile
2. **Applies appropriate memory optimizations** based on available VRAM
3. **Uses optimal data types** for the target hardware:
   - User specified dtype (if provided): Use exactly as specified
   - Apple Silicon (MPS): prefer torch.bfloat16
   - NVIDIA GPUs: prefer torch.float16 or torch.bfloat16
   - CPU only: use torch.float32
4. **Implements hardware-specific optimizations** (CUDA, MPS, CPU)
5. **Follows model-specific guidelines** (e.g., FLUX guidance_scale handling)

IMPORTANT GUIDELINES:
- Reference the OPTIMIZATION KNOWLEDGE BASE to select appropriate techniques
- Include all necessary imports
- Add brief comments explaining optimization choices
- Generate compact, production-ready code
- Inline values where possible for concise code
- Generate ONLY the Python code, no explanations before or after the code block
2025-05-29 16:47:48,013 - auto_diffusers - INFO - ================================================================================
2025-05-29 16:47:48,013 - auto_diffusers - INFO - Sending request to Gemini API
2025-05-29 16:48:09,467 - auto_diffusers - INFO - Successfully received response from Gemini API (no tools used)
2025-05-29 16:48:09,467 - auto_diffusers - DEBUG - Response length: 3996 characters
2025-05-29 16:57:34,668 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-29 16:57:34,668 - __main__ - DEBUG - API key found, length: 39
2025-05-29 16:57:34,668 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-29 16:57:34,668 - auto_diffusers - DEBUG - API key length: 39
2025-05-29 16:57:34,668 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-29 16:57:34,668 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-29 16:57:34,668 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-29 16:57:34,668 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-29 16:57:34,668 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-29 16:57:34,668 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-29 16:57:34,672 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-29 16:57:34,672 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-29 16:57:35,129 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-29 16:57:35,129 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-29 16:57:35,129 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-29 16:57:35,129 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-29 16:57:35,129 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-29 16:57:35,129 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-29 16:57:35,129 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-29 16:57:35,129 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-29 16:57:35,129 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-29 16:57:35,129 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-29 16:57:35,129 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-29 16:57:35,131 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-29 16:57:35,145 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-29 16:57:35,145 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-29 16:57:35,222 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-29 16:57:35,257 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-29 16:57:35,257 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 16:57:35,258 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 16:57:35,258 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 16:57:35,258 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 16:57:35,258 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 16:57:35,258 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 16:57:35,258 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 07:57:35 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-29 16:57:35,259 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-29 16:57:35,259 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 16:57:35,259 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 16:57:35,259 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 16:57:35,259 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 16:57:35,259 - httpcore.connection - DEBUG - close.started
2025-05-29 16:57:35,259 - httpcore.connection - DEBUG - close.complete
2025-05-29 16:57:35,259 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-29 16:57:35,260 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 16:57:35,260 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 16:57:35,260 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 16:57:35,260 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 16:57:35,260 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 16:57:35,260 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 16:57:35,265 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 07:57:35 GMT'), (b'server', b'uvicorn'), (b'content-length', b'75554'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-29 16:57:35,265 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-29 16:57:35,265 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 16:57:35,265 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 16:57:35,265 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 16:57:35,266 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 16:57:35,266 - httpcore.connection - DEBUG - close.started
2025-05-29 16:57:35,266 - httpcore.connection - DEBUG - close.complete
2025-05-29 16:57:35,276 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-29 16:57:35,346 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 16:57:35,346 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-29 16:57:35,425 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-29 16:57:35,434 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 16:57:35,434 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-29 16:57:35,637 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-29 16:57:35,638 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 16:57:35,638 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 16:57:35,638 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 16:57:35,638
- httpcore.http11 - DEBUG - send_request_body.complete 2025-05-29 16:57:35,638 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-29 16:57:35,751 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-29 16:57:35,751 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-29 16:57:35,752 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-29 16:57:35,752 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-29 16:57:35,752 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-29 16:57:35,752 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-29 16:57:35,786 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 07:57:35 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-29 16:57:35,787 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-29 16:57:35,787 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-29 16:57:35,787 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-29 16:57:35,787 - httpcore.http11 - DEBUG - response_closed.started 2025-05-29 16:57:35,787 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-29 16:57:35,787 - httpcore.connection - DEBUG - close.started 2025-05-29 16:57:35,788 - httpcore.connection - DEBUG - close.complete 2025-05-29 16:57:35,912 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 07:57:35 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), 
(b'Content-Encoding', b'gzip')]) 2025-05-29 16:57:35,912 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-29 16:57:35,912 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-29 16:57:35,912 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-29 16:57:35,912 - httpcore.http11 - DEBUG - response_closed.started 2025-05-29 16:57:35,912 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-29 16:57:35,912 - httpcore.connection - DEBUG - close.started 2025-05-29 16:57:35,912 - httpcore.connection - DEBUG - close.complete 2025-05-29 16:57:36,487 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-29 16:57:36,707 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-29 16:57:49,246 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-29 16:57:49,246 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-29 16:57:49,246 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-29 16:57:49,246 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-29 16:57:49,246 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-29 16:57:49,246 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-29 16:57:49,247 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-29 16:57:49,247 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-29 16:57:49,247 - simple_memory_calculator - DEBUG - Using cached memory data for 
black-forest-labs/FLUX.1-schnell 2025-05-29 17:00:22,113 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-29 17:00:22,113 - __main__ - DEBUG - API key found, length: 39 2025-05-29 17:00:22,113 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-29 17:00:22,113 - auto_diffusers - DEBUG - API key length: 39 2025-05-29 17:00:22,113 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-29 17:00:22,113 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-29 17:00:22,113 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-29 17:00:22,113 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-29 17:00:22,113 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-29 17:00:22,113 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-29 17:00:22,117 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-29 17:00:22,117 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-29 17:00:22,530 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-29 17:00:22,530 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-29 17:00:22,530 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-29 17:00:22,530 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-29 17:00:22,530 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-29 17:00:22,530 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-29 17:00:22,530 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-29 17:00:22,530 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-29 17:00:22,530 - 
simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-29 17:00:22,530 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-29 17:00:22,530 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-29 17:00:22,532 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-29 17:00:22,545 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-29 17:00:22,550 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-29 17:00:22,624 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-29 17:00:22,657 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-29 17:00:22,657 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-29 17:00:22,657 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-29 17:00:22,657 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-29 17:00:22,658 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-29 17:00:22,658 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-29 17:00:22,658 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-29 17:00:22,658 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 08:00:22 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-29 17:00:22,658 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-29 17:00:22,658 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-29 17:00:22,658 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-29 17:00:22,658 - httpcore.http11 - DEBUG - 
response_closed.started 2025-05-29 17:00:22,658 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-29 17:00:22,658 - httpcore.connection - DEBUG - close.started 2025-05-29 17:00:22,658 - httpcore.connection - DEBUG - close.complete 2025-05-29 17:00:22,659 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-29 17:00:22,659 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-29 17:00:22,659 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-29 17:00:22,659 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-29 17:00:22,659 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-29 17:00:22,659 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-29 17:00:22,659 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-29 17:00:22,665 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 08:00:22 GMT'), (b'server', b'uvicorn'), (b'content-length', b'75615'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-29 17:00:22,665 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-29 17:00:22,665 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-29 17:00:22,665 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-29 17:00:22,665 - httpcore.http11 - DEBUG - response_closed.started 2025-05-29 17:00:22,665 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-29 17:00:22,665 - httpcore.connection - DEBUG - close.started 2025-05-29 17:00:22,665 - httpcore.connection - DEBUG - close.complete 2025-05-29 17:00:22,676 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-29 17:00:22,750 - httpcore.connection - DEBUG - connect_tcp.complete 
return_value= 2025-05-29 17:00:22,750 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-29 17:00:22,815 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-29 17:00:22,815 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-29 17:00:22,823 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-29 17:00:23,027 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-29 17:00:23,028 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-29 17:00:23,028 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-29 17:00:23,029 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-29 17:00:23,029 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-29 17:00:23,029 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-29 17:00:23,090 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-29 17:00:23,090 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-29 17:00:23,091 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-29 17:00:23,091 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-29 17:00:23,091 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-29 17:00:23,091 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-29 17:00:23,199 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 08:00:23 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-29 17:00:23,201 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 
OK" 2025-05-29 17:00:23,201 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-29 17:00:23,201 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-29 17:00:23,201 - httpcore.http11 - DEBUG - response_closed.started 2025-05-29 17:00:23,201 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-29 17:00:23,201 - httpcore.connection - DEBUG - close.started 2025-05-29 17:00:23,202 - httpcore.connection - DEBUG - close.complete 2025-05-29 17:00:23,232 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 08:00:23 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-29 17:00:23,232 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-29 17:00:23,232 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-29 17:00:23,233 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-29 17:00:23,233 - httpcore.http11 - DEBUG - response_closed.started 2025-05-29 17:00:23,233 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-29 17:00:23,233 - httpcore.connection - DEBUG - close.started 2025-05-29 17:00:23,233 - httpcore.connection - DEBUG - close.complete 2025-05-29 17:00:23,883 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-29 17:00:24,103 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-29 17:00:34,004 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-29 17:00:34,004 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 
2025-05-29 17:00:34,005 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-29 17:00:34,005 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-29 17:00:34,005 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-29 17:00:34,005 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-29 17:00:34,005 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-29 17:00:34,005 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-29 17:00:34,005 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-29 17:03:33,448 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-29 17:03:33,448 - __main__ - DEBUG - API key found, length: 39 2025-05-29 17:03:33,448 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-29 17:03:33,448 - auto_diffusers - DEBUG - API key length: 39 2025-05-29 17:03:33,448 - auto_diffusers - DEBUG - Creating tools for Gemini 2025-05-29 17:03:33,448 - auto_diffusers - INFO - Created 3 tools for Gemini 2025-05-29 17:03:33,448 - auto_diffusers - INFO - Successfully configured Gemini AI model with tools 2025-05-29 17:03:33,448 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-29 17:03:33,448 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-29 17:03:33,448 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-29 17:03:33,448 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.12.9 2025-05-29 17:03:33,448 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-29 17:03:33,452 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU 
detected 2025-05-29 17:03:33,452 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-29 17:03:33,924 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-29 17:03:33,924 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-29 17:03:33,924 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-29 17:03:33,924 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.12.9', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-29 17:03:33,924 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-29 17:03:33,924 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-29 17:03:33,925 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-29 17:03:33,925 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-29 17:03:33,925 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-29 17:03:33,925 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-29 17:03:33,925 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-29 17:03:33,927 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-29 17:03:33,940 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-29 17:03:33,947 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-29 17:03:33,995 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-29 17:03:34,042 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-29 17:03:34,042 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-29 17:03:34,042 - httpcore.http11 - DEBUG - 
send_request_headers.started request= 2025-05-29 17:03:34,043 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-29 17:03:34,043 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-29 17:03:34,043 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-29 17:03:34,043 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-29 17:03:34,043 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 08:03:33 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-29 17:03:34,043 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-29 17:03:34,043 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-29 17:03:34,044 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-29 17:03:34,044 - httpcore.http11 - DEBUG - response_closed.started 2025-05-29 17:03:34,044 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-29 17:03:34,044 - httpcore.connection - DEBUG - close.started 2025-05-29 17:03:34,044 - httpcore.connection - DEBUG - close.complete 2025-05-29 17:03:34,044 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-29 17:03:34,044 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-29 17:03:34,044 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-29 17:03:34,045 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-29 17:03:34,045 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-29 17:03:34,045 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-29 17:03:34,045 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-29 17:03:34,051 - httpcore.http11 - DEBUG - receive_response_headers.complete 
return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 08:03:33 GMT'), (b'server', b'uvicorn'), (b'content-length', b'74295'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-29 17:03:34,051 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-29 17:03:34,051 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-29 17:03:34,051 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-29 17:03:34,051 - httpcore.http11 - DEBUG - response_closed.started 2025-05-29 17:03:34,051 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-29 17:03:34,051 - httpcore.connection - DEBUG - close.started 2025-05-29 17:03:34,051 - httpcore.connection - DEBUG - close.complete 2025-05-29 17:03:34,063 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-29 17:03:34,200 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-29 17:03:34,469 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-29 17:03:34,470 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-29 17:03:34,476 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-29 17:03:34,476 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-29 17:03:34,760 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-29 17:03:34,761 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-29 17:03:34,761 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-29 17:03:34,761 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-29 17:03:34,761 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-29 17:03:34,762 - httpcore.http11 - DEBUG - 
receive_response_headers.started request= 2025-05-29 17:03:34,771 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-29 17:03:34,771 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-29 17:03:34,772 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-29 17:03:34,772 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-29 17:03:34,772 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-29 17:03:34,772 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-29 17:03:34,907 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 08:03:34 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-29 17:03:34,907 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-29 17:03:34,907 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-29 17:03:34,908 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-29 17:03:34,908 - httpcore.http11 - DEBUG - response_closed.started 2025-05-29 17:03:34,908 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-29 17:03:34,908 - httpcore.connection - DEBUG - close.started 2025-05-29 17:03:34,908 - httpcore.connection - DEBUG - close.complete 2025-05-29 17:03:34,919 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 08:03:34 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-29 17:03:34,919 - httpx - INFO - HTTP Request: GET 
https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-29 17:03:34,919 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-29 17:03:34,919 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-29 17:03:34,920 - httpcore.http11 - DEBUG - response_closed.started 2025-05-29 17:03:34,920 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-29 17:03:34,920 - httpcore.connection - DEBUG - close.started 2025-05-29 17:03:34,920 - httpcore.connection - DEBUG - close.complete 2025-05-29 17:03:35,503 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-29 17:03:35,733 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-29 17:05:44,828 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-29 17:05:44,828 - __main__ - DEBUG - API key found, length: 39 2025-05-29 17:05:44,828 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-29 17:05:44,828 - auto_diffusers - DEBUG - API key length: 39 2025-05-29 17:05:44,828 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-29 17:05:44,828 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-29 17:05:44,828 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-29 17:05:44,828 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-29 17:05:44,828 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-29 17:05:44,828 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-29 17:05:44,831 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-29 17:05:44,832 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-29 17:05:45,252 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-29 17:05:45,252 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 
2025-05-29 17:05:45,252 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-29 17:05:45,252 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-29 17:05:45,252 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-29 17:05:45,252 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-29 17:05:45,252 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-29 17:05:45,252 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-29 17:05:45,252 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-29 17:05:45,252 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-29 17:05:45,252 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-29 17:05:45,254 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-29 17:05:45,267 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-29 17:05:45,272 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-29 17:05:45,344 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-29 17:05:45,377 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-29 17:05:45,377 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-29 17:05:45,378 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-29 17:05:45,378 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-29 17:05:45,378 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-29 17:05:45,378 - httpcore.http11 - DEBUG - 
send_request_body.complete 2025-05-29 17:05:45,378 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-29 17:05:45,378 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 08:05:45 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-29 17:05:45,379 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-29 17:05:45,379 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-29 17:05:45,379 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-29 17:05:45,379 - httpcore.http11 - DEBUG - response_closed.started 2025-05-29 17:05:45,379 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-29 17:05:45,379 - httpcore.connection - DEBUG - close.started 2025-05-29 17:05:45,379 - httpcore.connection - DEBUG - close.complete 2025-05-29 17:05:45,379 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-29 17:05:45,379 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-29 17:05:45,380 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-29 17:05:45,380 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-29 17:05:45,380 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-29 17:05:45,380 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-29 17:05:45,380 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-29 17:05:45,385 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 08:05:45 GMT'), (b'server', b'uvicorn'), (b'content-length', b'75706'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-29 17:05:45,385 - httpx - INFO - HTTP Request: HEAD 
http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-29 17:05:45,385 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 17:05:45,385 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 17:05:45,385 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 17:05:45,385 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 17:05:45,385 - httpcore.connection - DEBUG - close.started
2025-05-29 17:05:45,385 - httpcore.connection - DEBUG - close.complete
2025-05-29 17:05:45,396 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-29 17:05:45,466 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 17:05:45,466 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-29 17:05:45,538 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 17:05:45,538 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-29 17:05:45,548 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-29 17:05:45,746 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-29 17:05:45,746 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 17:05:45,747 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 17:05:45,747 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 17:05:45,747 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 17:05:45,747 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 17:05:45,821 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-29 17:05:45,821 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 17:05:45,822 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 17:05:45,822 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 17:05:45,822 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 17:05:45,822 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 17:05:45,885 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 08:05:45 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-29 17:05:45,886 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-29 17:05:45,886 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 17:05:45,887 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 17:05:45,887 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 17:05:45,887 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 17:05:45,887 - httpcore.connection - DEBUG - close.started
2025-05-29 17:05:45,888 - httpcore.connection - DEBUG - close.complete
2025-05-29 17:05:45,965 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 08:05:45 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-29 17:05:45,965 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-29 17:05:45,966 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 17:05:45,967 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 17:05:45,967 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 17:05:45,967 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 17:05:45,968 - httpcore.connection - DEBUG - close.started
2025-05-29 17:05:45,968 - httpcore.connection - DEBUG - close.complete
2025-05-29 17:05:46,631 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-29 17:05:46,857 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-29 17:05:55,606 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 17:05:55,606 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 17:05:55,606 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-29 17:05:55,606 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-29 17:05:55,606 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 17:05:55,607 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 17:05:55,607 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-29 17:05:55,607 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 17:05:55,607 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 23:38:26,490 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-29 23:38:26,490 - __main__ - DEBUG - API key found, length: 39
2025-05-29 23:38:26,490 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-29 23:38:26,490 - auto_diffusers - DEBUG - API key length: 39
2025-05-29 23:38:26,490 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-29 23:38:26,490 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-29 23:38:26,490 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-29 23:38:26,490 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-29 23:38:26,491 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-29 23:38:26,491 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-29 23:38:26,494 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-29 23:38:26,494 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-29 23:38:26,909 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-29 23:38:26,909 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-29 23:38:26,909 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-29 23:38:26,909 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-29 23:38:26,909 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-29 23:38:26,909 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-29 23:38:26,909 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-29 23:38:26,909 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-29 23:38:26,909 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-29 23:38:26,909 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-29 23:38:26,909 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-29 23:38:26,911 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-29 23:38:26,924 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-29 23:38:26,929 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-29 23:38:27,000 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-29 23:38:27,034 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-29 23:38:27,035 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 23:38:27,035 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 23:38:27,035 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 23:38:27,035 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 23:38:27,035 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 23:38:27,035 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 23:38:27,035 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 14:38:27 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-29 23:38:27,036 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-29 23:38:27,036 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 23:38:27,036 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 23:38:27,036 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 23:38:27,036 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 23:38:27,036 - httpcore.connection - DEBUG - close.started
2025-05-29 23:38:27,036 - httpcore.connection - DEBUG - close.complete
2025-05-29 23:38:27,036 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-29 23:38:27,037 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 23:38:27,037 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 23:38:27,037 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 23:38:27,037 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 23:38:27,037 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 23:38:27,037 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 23:38:27,042 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 14:38:27 GMT'), (b'server', b'uvicorn'), (b'content-length', b'75707'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-29 23:38:27,042 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-29 23:38:27,043 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 23:38:27,043 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 23:38:27,043 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 23:38:27,043 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 23:38:27,043 - httpcore.connection - DEBUG - close.started
2025-05-29 23:38:27,043 - httpcore.connection - DEBUG - close.complete
2025-05-29 23:38:27,053 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-29 23:38:27,124 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 23:38:27,124 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-29 23:38:27,198 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 23:38:27,198 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-29 23:38:27,215 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-29 23:38:27,405 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-29 23:38:27,406 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 23:38:27,406 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 23:38:27,406 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 23:38:27,407 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 23:38:27,407 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 23:38:27,488 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-29 23:38:27,489 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 23:38:27,489 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 23:38:27,489 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 23:38:27,490 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 23:38:27,490 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 23:38:27,549 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 14:38:27 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-29 23:38:27,550 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-29 23:38:27,550 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 23:38:27,550 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 23:38:27,550 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 23:38:27,550 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 23:38:27,551 - httpcore.connection - DEBUG - close.started
2025-05-29 23:38:27,551 - httpcore.connection - DEBUG - close.complete
2025-05-29 23:38:27,636 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 14:38:27 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-29 23:38:27,636 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-29 23:38:27,637 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 23:38:27,637 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 23:38:27,638 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 23:38:27,638 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 23:38:27,638 - httpcore.connection - DEBUG - close.started
2025-05-29 23:38:27,638 - httpcore.connection - DEBUG - close.complete
2025-05-29 23:38:28,286 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-29 23:38:28,884 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-29 23:38:42,661 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 23:38:42,662 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 23:38:42,662 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-29 23:38:42,662 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-29 23:38:42,662 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 23:38:42,662 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 23:38:42,662 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-29 23:38:42,662 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 23:38:42,662 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 23:43:46,493 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-29 23:43:46,494 - __main__ - DEBUG - API key found, length: 39
2025-05-29 23:43:46,494 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-29 23:43:46,494 - auto_diffusers - DEBUG - API key length: 39
2025-05-29 23:43:46,494 - auto_diffusers - DEBUG - Creating tools for Gemini
2025-05-29 23:43:46,494 - auto_diffusers - INFO - Created 3 tools for Gemini
2025-05-29 23:43:46,494 - auto_diffusers - INFO - Successfully configured Gemini AI model with tools
2025-05-29 23:43:46,494 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-29 23:43:46,494 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-29 23:43:46,494 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-29 23:43:46,494 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.12.9
2025-05-29 23:43:46,494 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-29 23:43:46,497 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-29 23:43:46,497 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-29 23:43:46,942 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-29 23:43:46,942 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-29 23:43:46,942 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-29 23:43:46,942 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.12.9', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-29 23:43:46,942 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-29 23:43:46,942 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-29 23:43:46,942 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-29 23:43:46,942 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-29 23:43:46,942 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-29 23:43:46,942 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-29 23:43:46,942 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-29 23:43:46,944 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-29 23:43:46,945 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-29 23:43:46,963 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-29 23:43:47,014 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-29 23:43:47,169 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 23:43:47,170 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-29 23:43:47,218 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-29 23:43:47,500 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-29 23:43:47,500 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 23:43:47,501 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 23:43:47,501 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 23:43:47,502 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 23:43:47,502 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 23:43:47,667 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 14:43:47 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-29 23:43:47,669 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-29 23:43:47,669 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 23:43:47,669 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 23:43:47,670 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 23:43:47,670 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 23:43:47,670 - httpcore.connection - DEBUG - close.started
2025-05-29 23:43:47,671 - httpcore.connection - DEBUG - close.complete
2025-05-29 23:45:37,625 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-29 23:45:37,625 - __main__ - DEBUG - API key found, length: 39
2025-05-29 23:45:37,625 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-29 23:45:37,625 - auto_diffusers - DEBUG - API key length: 39
2025-05-29 23:45:37,625 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-29 23:45:37,625 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-29 23:45:37,625 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-29 23:45:37,626 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-29 23:45:37,626 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-29 23:45:37,626 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-29 23:45:37,628 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-29 23:45:37,629 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-29 23:45:38,048 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-29 23:45:38,048 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-29 23:45:38,048 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-29 23:45:38,048 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-29 23:45:38,048 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-29 23:45:38,048 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-29 23:45:38,048 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-29 23:45:38,048 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-29 23:45:38,048 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-29 23:45:38,048 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-29 23:45:38,048 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-29 23:45:38,051 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-29 23:45:38,064 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-29 23:45:38,071 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-29 23:45:38,143 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-29 23:45:38,178 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=None socket_options=None
2025-05-29 23:45:38,179 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 23:45:38,179 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 23:45:38,179 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 23:45:38,179 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 23:45:38,180 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 23:45:38,180 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 23:45:38,180 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 14:45:38 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-29 23:45:38,180 - httpx - INFO - HTTP Request: GET http://localhost:7861/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-29 23:45:38,180 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 23:45:38,180 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 23:45:38,180 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 23:45:38,180 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 23:45:38,180 - httpcore.connection - DEBUG - close.started
2025-05-29 23:45:38,180 - httpcore.connection - DEBUG - close.complete
2025-05-29 23:45:38,181 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=3 socket_options=None
2025-05-29 23:45:38,181 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 23:45:38,181 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 23:45:38,181 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 23:45:38,181 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 23:45:38,181 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 23:45:38,181 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 23:45:38,188 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 14:45:38 GMT'), (b'server', b'uvicorn'), (b'content-length', b'113569'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-29 23:45:38,188 - httpx - INFO - HTTP Request: HEAD http://localhost:7861/ "HTTP/1.1 200 OK"
2025-05-29 23:45:38,188 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 23:45:38,188 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 23:45:38,188 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 23:45:38,188 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 23:45:38,188 - httpcore.connection - DEBUG - close.started
2025-05-29 23:45:38,189 - httpcore.connection - DEBUG - close.complete
2025-05-29 23:45:38,200 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-29 23:45:38,227 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 23:45:38,227 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-29 23:45:38,337 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 23:45:38,338 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-29 23:45:38,349 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-29 23:45:38,503 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-29 23:45:38,503 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 23:45:38,503 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 23:45:38,503 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 23:45:38,503 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 23:45:38,503 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 23:45:38,611 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-29 23:45:38,611 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 23:45:38,611 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 23:45:38,611 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 23:45:38,611 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 23:45:38,611 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 23:45:38,641 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 14:45:38 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-29 23:45:38,642 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-29 23:45:38,642 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 23:45:38,642 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 23:45:38,642 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 23:45:38,642 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 23:45:38,642 - httpcore.connection - DEBUG - close.started
2025-05-29 23:45:38,642 - httpcore.connection - DEBUG - close.complete
2025-05-29 23:45:38,750 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 14:45:38 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-29 23:45:38,750 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-29 23:45:38,750 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 23:45:38,751 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 23:45:38,751 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 23:45:38,751 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 23:45:38,751 - httpcore.connection - DEBUG - close.started
2025-05-29 23:45:38,751 - httpcore.connection - DEBUG - close.complete
2025-05-29 23:45:39,336 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-29 23:45:39,554 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-29 23:45:56,868 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 23:45:56,868 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 23:45:56,868 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-29 23:45:56,869 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-29 23:45:56,869 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 23:45:56,869 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 23:45:56,869 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-29 23:45:56,869 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 23:45:56,869 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 23:46:55,462 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 23:46:55,462 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 23:46:55,462 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-29 23:46:55,462 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 23:46:55,462 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 23:46:55,462 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-29 23:46:55,462 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 23:46:55,462 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 23:47:01,722 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 23:47:01,722 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 23:47:01,722 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 32.0GB VRAM
2025-05-29 23:47:01,723 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 23:47:01,723 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 23:47:01,723 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-29 23:47:01,723 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 23:47:01,723 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 23:47:01,774 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 23:47:01,775 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 23:47:01,775 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 32.0GB VRAM
2025-05-29 23:47:01,775 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 23:47:01,775 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 23:47:01,775 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-29 23:47:01,775 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 23:47:01,775 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 23:48:44,089 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-29 23:48:44,089 - __main__ - DEBUG - API key found, length: 39
2025-05-29 23:48:44,089 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-29 23:48:44,089 - auto_diffusers - DEBUG - API key length: 39
2025-05-29 23:48:44,089 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-29 23:48:44,089 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-29 23:48:44,089 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-29 23:48:44,089 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-29 23:48:44,089 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-29 23:48:44,089 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-29 23:48:44,092 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-29 23:48:44,092 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-29 23:48:44,496 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-29 23:48:44,497 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-29 23:48:44,497 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-29 23:48:44,497 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-29 23:48:44,497 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-29 23:48:44,497 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-29 23:48:44,497 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-29 23:48:44,497 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-29 23:48:44,497 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-29 23:48:44,497 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-29 23:48:44,497 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-29 23:48:44,499 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-29 23:48:44,512 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-29 23:48:44,517 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-29 23:48:44,590 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-29 23:48:44,625 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=None socket_options=None
2025-05-29 23:48:44,626 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 23:48:44,626 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 23:48:44,626 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 23:48:44,626 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 23:48:44,626 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 23:48:44,627 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 23:48:44,627 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 14:48:44 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-29 23:48:44,627 - httpx - INFO - HTTP Request: GET http://localhost:7861/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-29 23:48:44,627 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 23:48:44,627 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 23:48:44,627 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 23:48:44,627 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 23:48:44,627 - httpcore.connection - DEBUG - close.started
2025-05-29 23:48:44,627 - httpcore.connection - DEBUG - close.complete
2025-05-29 23:48:44,628 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=3 socket_options=None
2025-05-29 23:48:44,628 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 23:48:44,628 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 23:48:44,628 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 23:48:44,629 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 23:48:44,629 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 23:48:44,629 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 23:48:44,635 - httpcore.http11 - DEBUG -
receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 14:48:44 GMT'), (b'server', b'uvicorn'), (b'content-length', b'113500'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-29 23:48:44,635 - httpx - INFO - HTTP Request: HEAD http://localhost:7861/ "HTTP/1.1 200 OK" 2025-05-29 23:48:44,635 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-29 23:48:44,635 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-29 23:48:44,635 - httpcore.http11 - DEBUG - response_closed.started 2025-05-29 23:48:44,635 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-29 23:48:44,636 - httpcore.connection - DEBUG - close.started 2025-05-29 23:48:44,636 - httpcore.connection - DEBUG - close.complete 2025-05-29 23:48:44,647 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-29 23:48:44,676 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-29 23:48:44,676 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-29 23:48:44,796 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-29 23:48:44,796 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-29 23:48:44,801 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-29 23:48:44,957 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-29 23:48:44,958 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-29 23:48:44,958 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-29 23:48:44,958 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-29 23:48:44,958 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-29 23:48:44,958 - 
httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-29 23:48:45,095 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-29 23:48:45,096 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-29 23:48:45,096 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-29 23:48:45,096 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-29 23:48:45,096 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-29 23:48:45,096 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-29 23:48:45,100 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 14:48:45 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-29 23:48:45,100 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-29 23:48:45,101 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-29 23:48:45,101 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-29 23:48:45,101 - httpcore.http11 - DEBUG - response_closed.started 2025-05-29 23:48:45,101 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-29 23:48:45,101 - httpcore.connection - DEBUG - close.started 2025-05-29 23:48:45,101 - httpcore.connection - DEBUG - close.complete 2025-05-29 23:48:45,247 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 14:48:45 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-29 23:48:45,248 - httpx - INFO - HTTP Request: GET 
https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-29 23:48:45,248 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-29 23:48:45,249 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-29 23:48:45,249 - httpcore.http11 - DEBUG - response_closed.started 2025-05-29 23:48:45,249 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-29 23:48:45,250 - httpcore.connection - DEBUG - close.started 2025-05-29 23:48:45,250 - httpcore.connection - DEBUG - close.complete 2025-05-29 23:48:45,928 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-29 23:48:46,505 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-29 23:48:47,065 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-29 23:48:47,065 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-29 23:48:47,065 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-29 23:48:47,065 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-29 23:48:47,065 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-29 23:48:47,065 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-29 23:48:47,065 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-29 23:48:47,065 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-29 23:48:47,065 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-29 23:50:45,671 - __main__ - INFO - Initializing 
GradioAutodiffusers 2025-05-29 23:50:45,671 - __main__ - DEBUG - API key found, length: 39 2025-05-29 23:50:45,671 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-29 23:50:45,671 - auto_diffusers - DEBUG - API key length: 39 2025-05-29 23:50:45,671 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-29 23:50:45,671 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-29 23:50:45,671 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-29 23:50:45,671 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-29 23:50:45,671 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-29 23:50:45,671 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-29 23:50:45,675 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-29 23:50:45,675 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-29 23:50:46,156 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-29 23:50:46,156 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-29 23:50:46,156 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-29 23:50:46,156 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-29 23:50:46,156 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-29 23:50:46,156 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-29 23:50:46,156 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-29 23:50:46,156 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-29 23:50:46,156 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-29 23:50:46,156 - 
__main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-29 23:50:46,156 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-29 23:50:46,158 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-29 23:50:46,172 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-29 23:50:46,178 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-29 23:50:46,266 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-29 23:50:46,302 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=None socket_options=None 2025-05-29 23:50:46,303 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-29 23:50:46,303 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-29 23:50:46,303 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-29 23:50:46,303 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-29 23:50:46,303 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-29 23:50:46,304 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-29 23:50:46,304 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 14:50:46 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-29 23:50:46,304 - httpx - INFO - HTTP Request: GET http://localhost:7861/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-29 23:50:46,304 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-29 23:50:46,304 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-29 23:50:46,304 - httpcore.http11 - DEBUG - response_closed.started 2025-05-29 23:50:46,304 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-29 
23:50:46,304 - httpcore.connection - DEBUG - close.started 2025-05-29 23:50:46,305 - httpcore.connection - DEBUG - close.complete 2025-05-29 23:50:46,305 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=3 socket_options=None 2025-05-29 23:50:46,305 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-29 23:50:46,305 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-29 23:50:46,306 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-29 23:50:46,306 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-29 23:50:46,306 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-29 23:50:46,306 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-29 23:50:46,312 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 14:50:46 GMT'), (b'server', b'uvicorn'), (b'content-length', b'111775'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-29 23:50:46,312 - httpx - INFO - HTTP Request: HEAD http://localhost:7861/ "HTTP/1.1 200 OK" 2025-05-29 23:50:46,313 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-29 23:50:46,313 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-29 23:50:46,313 - httpcore.http11 - DEBUG - response_closed.started 2025-05-29 23:50:46,313 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-29 23:50:46,313 - httpcore.connection - DEBUG - close.started 2025-05-29 23:50:46,313 - httpcore.connection - DEBUG - close.complete 2025-05-29 23:50:46,324 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-29 23:50:46,376 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-29 23:50:46,376 - httpcore.connection - DEBUG - start_tls.started ssl_context= 
server_hostname='api.gradio.app' timeout=3 2025-05-29 23:50:46,463 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-29 23:50:46,464 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-29 23:50:46,471 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-29 23:50:46,648 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-29 23:50:46,648 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-29 23:50:46,648 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-29 23:50:46,648 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-29 23:50:46,649 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-29 23:50:46,649 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-29 23:50:46,744 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-29 23:50:46,744 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-29 23:50:46,744 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-29 23:50:46,744 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-29 23:50:46,744 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-29 23:50:46,744 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-29 23:50:46,786 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 14:50:46 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-29 23:50:46,786 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-29 23:50:46,786 - httpcore.http11 - DEBUG - receive_response_body.started request= 
2025-05-29 23:50:46,786 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-29 23:50:46,786 - httpcore.http11 - DEBUG - response_closed.started 2025-05-29 23:50:46,786 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-29 23:50:46,786 - httpcore.connection - DEBUG - close.started 2025-05-29 23:50:46,787 - httpcore.connection - DEBUG - close.complete 2025-05-29 23:50:46,885 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 14:50:46 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-29 23:50:46,885 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-29 23:50:46,885 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-29 23:50:46,885 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-29 23:50:46,885 - httpcore.http11 - DEBUG - response_closed.started 2025-05-29 23:50:46,885 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-29 23:50:46,886 - httpcore.connection - DEBUG - close.started 2025-05-29 23:50:46,886 - httpcore.connection - DEBUG - close.complete 2025-05-29 23:50:47,013 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-29 23:50:47,013 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-29 23:50:47,014 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-29 23:50:47,014 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-29 23:50:47,014 - simple_memory_calculator - INFO - 
Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-29 23:50:47,014 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-29 23:50:47,014 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-29 23:50:47,014 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-29 23:50:47,014 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-29 23:50:47,517 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-29 23:50:47,739 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-29 23:54:53,253 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-29 23:54:53,253 - __main__ - DEBUG - API key found, length: 39 2025-05-29 23:54:53,253 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-29 23:54:53,253 - auto_diffusers - DEBUG - API key length: 39 2025-05-29 23:54:53,253 - auto_diffusers - DEBUG - Creating tools for Gemini 2025-05-29 23:54:53,253 - auto_diffusers - INFO - Created 3 tools for Gemini 2025-05-29 23:54:53,253 - auto_diffusers - INFO - Successfully configured Gemini AI model with tools 2025-05-29 23:54:53,253 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-29 23:54:53,253 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-29 23:54:53,253 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-29 23:54:53,253 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.12.9 2025-05-29 23:54:53,253 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-29 23:54:53,258 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-29 23:54:53,258 - hardware_detector - DEBUG - Checking PyTorch availability 
2025-05-29 23:54:53,724 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-29 23:54:53,724 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-29 23:54:53,724 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-29 23:54:53,724 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.12.9', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-29 23:54:53,724 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-29 23:54:53,724 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-29 23:54:53,724 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-29 23:54:53,724 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-29 23:54:53,724 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-29 23:54:53,724 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-29 23:54:53,724 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-29 23:54:53,726 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-29 23:54:53,734 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-29 23:54:53,740 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-29 23:54:53,988 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-29 23:54:53,989 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-29 23:54:53,989 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-29 23:54:54,271 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-29 
23:54:54,271 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-29 23:54:54,272 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-29 23:54:54,272 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-29 23:54:54,272 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-29 23:54:54,272 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-29 23:54:54,415 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 14:54:54 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-29 23:54:54,416 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-29 23:54:54,416 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-29 23:54:54,416 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-29 23:54:54,416 - httpcore.http11 - DEBUG - response_closed.started 2025-05-29 23:54:54,416 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-29 23:54:54,416 - httpcore.connection - DEBUG - close.started 2025-05-29 23:54:54,416 - httpcore.connection - DEBUG - close.complete 2025-05-29 23:55:03,477 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-29 23:55:03,477 - __main__ - DEBUG - API key found, length: 39 2025-05-29 23:55:03,477 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-29 23:55:03,477 - auto_diffusers - DEBUG - API key length: 39 2025-05-29 23:55:03,477 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-29 23:55:03,477 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-29 23:55:03,477 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-29 23:55:03,477 - hardware_detector - DEBUG - Platform: 
Darwin, Architecture: arm64 2025-05-29 23:55:03,477 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-29 23:55:03,477 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-29 23:55:03,481 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-29 23:55:03,481 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-29 23:55:03,929 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-29 23:55:03,929 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-29 23:55:03,929 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-29 23:55:03,929 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-29 23:55:03,929 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-29 23:55:03,929 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-29 23:55:03,929 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-29 23:55:03,929 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-29 23:55:03,929 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-29 23:55:03,929 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-29 23:55:03,929 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-29 23:55:03,931 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-29 23:55:03,944 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-29 23:55:03,950 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-29 23:55:04,086 - httpcore.connection - DEBUG - connect_tcp.complete 
return_value= 2025-05-29 23:55:04,086 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-29 23:55:04,359 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-29 23:55:04,359 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-29 23:55:04,360 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-29 23:55:04,360 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-29 23:55:04,360 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-29 23:55:04,360 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-29 23:55:04,410 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-29 23:55:04,498 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 14:55:04 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-29 23:55:04,500 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-29 23:55:04,500 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-29 23:55:04,501 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-29 23:55:04,501 - httpcore.http11 - DEBUG - response_closed.started 2025-05-29 23:55:04,501 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-29 23:55:04,501 - httpcore.connection - DEBUG - close.started 2025-05-29 23:55:04,502 - httpcore.connection - DEBUG - close.complete 2025-05-29 23:55:14,094 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-29 23:55:14,094 - __main__ - DEBUG - API key found, length: 39 2025-05-29 23:55:14,094 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-29 23:55:14,094 - 
auto_diffusers - DEBUG - API key length: 39
2025-05-29 23:55:14,094 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-29 23:55:14,094 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-29 23:55:14,094 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-29 23:55:14,094 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-29 23:55:14,094 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-29 23:55:14,094 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-29 23:55:14,097 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-29 23:55:14,098 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-29 23:55:14,515 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-29 23:55:14,515 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-29 23:55:14,515 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-29 23:55:14,516 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-29 23:55:14,516 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-29 23:55:14,516 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-29 23:55:14,516 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-29 23:55:14,516 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-29 23:55:14,516 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-29 23:55:14,516 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-29 23:55:14,516 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-29 23:55:14,518 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-29 23:55:14,530 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-29 23:55:14,536 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-29 23:55:14,678 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 23:55:14,679 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-29 23:55:14,759 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-29 23:55:14,964 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-29 23:55:14,965 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 23:55:14,965 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 23:55:14,965 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 23:55:14,965 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 23:55:14,965 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 23:55:15,107 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 14:55:15 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-29 23:55:15,108 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-29 23:55:15,108 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 23:55:15,108 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 23:55:15,108 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 23:55:15,108 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 23:55:15,108 - httpcore.connection - DEBUG - close.started
2025-05-29 23:55:15,109 - httpcore.connection - DEBUG - close.complete
2025-05-29 23:55:43,365 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-29 23:55:43,366 - __main__ - DEBUG - API key found, length: 39
2025-05-29 23:55:43,366 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-29 23:55:43,366 - auto_diffusers - DEBUG - API key length: 39
2025-05-29 23:55:43,366 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-29 23:55:43,366 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-29 23:55:43,366 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-29 23:55:43,366 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-29 23:55:43,366 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-29 23:55:43,366 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-29 23:55:43,369 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-29 23:55:43,369 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-29 23:55:43,790 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-29 23:55:43,790 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-29 23:55:43,790 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-29 23:55:43,790 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-29 23:55:43,790 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-29 23:55:43,790 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-29 23:55:43,790 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-29 23:55:43,790 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-29 23:55:43,790 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-29 23:55:43,790 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-29 23:55:43,790 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-29 23:55:43,792 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-29 23:55:43,804 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-29 23:55:43,811 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-29 23:55:43,947 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 23:55:43,947 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-29 23:55:44,023 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-29 23:55:44,220 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-29 23:55:44,220 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 23:55:44,221 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 23:55:44,221 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 23:55:44,221 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 23:55:44,221 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 23:55:44,358 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 14:55:44 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-29 23:55:44,359 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-29 23:55:44,359 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 23:55:44,359 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 23:55:44,359 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 23:55:44,359 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 23:55:44,359 - httpcore.connection - DEBUG - close.started
2025-05-29 23:55:44,359 - httpcore.connection - DEBUG - close.complete
2025-05-29 23:56:37,153 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-29 23:56:37,153 - __main__ - DEBUG - API key found, length: 39
2025-05-29 23:56:37,153 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-29 23:56:37,153 - auto_diffusers - DEBUG - API key length: 39
2025-05-29 23:56:37,153 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-29 23:56:37,153 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-29 23:56:37,153 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-29 23:56:37,153 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-29 23:56:37,153 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-29 23:56:37,153 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-29 23:56:37,156 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-29 23:56:37,156 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-29 23:56:37,560 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-29 23:56:37,560 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-29 23:56:37,560 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-29 23:56:37,560 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-29 23:56:37,560 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-29 23:56:37,560 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-29 23:56:37,560 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-29 23:56:37,560 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-29 23:56:37,560 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-29 23:56:37,560 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-29 23:56:37,560 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-29 23:56:37,562 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-29 23:56:37,575 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-29 23:56:37,581 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-29 23:56:37,658 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-29 23:56:37,691 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=None socket_options=None
2025-05-29 23:56:37,692 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 23:56:37,692 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 23:56:37,692 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 23:56:37,692 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 23:56:37,692 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 23:56:37,692 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 23:56:37,692 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 14:56:37 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-29 23:56:37,693 - httpx - INFO - HTTP Request: GET http://localhost:7861/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-29 23:56:37,693 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 23:56:37,693 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 23:56:37,693 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 23:56:37,693 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 23:56:37,693 - httpcore.connection - DEBUG - close.started
2025-05-29 23:56:37,693 - httpcore.connection - DEBUG - close.complete
2025-05-29 23:56:37,693 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=3 socket_options=None
2025-05-29 23:56:37,694 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 23:56:37,694 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 23:56:37,694 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 23:56:37,694 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 23:56:37,694 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 23:56:37,694 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 23:56:37,700 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 14:56:37 GMT'), (b'server', b'uvicorn'), (b'content-length', b'106594'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-29 23:56:37,700 - httpx - INFO - HTTP Request: HEAD http://localhost:7861/ "HTTP/1.1 200 OK"
2025-05-29 23:56:37,700 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 23:56:37,700 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 23:56:37,700 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 23:56:37,700 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 23:56:37,701 - httpcore.connection - DEBUG - close.started
2025-05-29 23:56:37,701 - httpcore.connection - DEBUG - close.complete
2025-05-29 23:56:37,711 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-29 23:56:37,874 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-29 23:56:37,902 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 23:56:37,902 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-29 23:56:37,903 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-29 23:56:37,903 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-29 23:56:38,185 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-29 23:56:38,185 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 23:56:38,185 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 23:56:38,185 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 23:56:38,185 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 23:56:38,185 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 23:56:38,187 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-29 23:56:38,187 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-29 23:56:38,187 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-29 23:56:38,187 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-29 23:56:38,187 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-29 23:56:38,187 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-29 23:56:38,330 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 14:56:38 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-29 23:56:38,330 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-29 23:56:38,331 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 23:56:38,331 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 14:56:38 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-29 23:56:38,331 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 23:56:38,332 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-29 23:56:38,332 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 23:56:38,332 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-29 23:56:38,332 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 23:56:38,333 - httpcore.connection - DEBUG - close.started
2025-05-29 23:56:38,333 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-29 23:56:38,333 - httpcore.http11 - DEBUG - response_closed.started
2025-05-29 23:56:38,333 - httpcore.connection - DEBUG - close.complete
2025-05-29 23:56:38,333 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-29 23:56:38,334 - httpcore.connection - DEBUG - close.started
2025-05-29 23:56:38,334 - httpcore.connection - DEBUG - close.complete
2025-05-29 23:56:39,018 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-29 23:56:39,237 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-29 23:56:42,308 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 23:56:42,308 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 23:56:42,308 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-29 23:56:42,308 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-29 23:56:42,308 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 23:56:42,308 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-29 23:56:42,308 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-29 23:56:42,308 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-29 23:56:42,308 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:00:01,476 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 00:00:01,476 - __main__ - DEBUG - API key found, length: 39
2025-05-30 00:00:01,476 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 00:00:01,476 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 00:00:01,476 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 00:00:01,476 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 00:00:01,476 - hardware_detector - DEBUG - Starting system hardware detection
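The hardware_detector sequence that repeats in these runs follows a fixed order: collect platform and CPU info, probe for an NVIDIA GPU via nvidia-smi, then read the CUDA and MPS flags from PyTorch if it imports. A rough sketch of that flow is below; the function name `detect_specs` and the exact calls are assumptions for illustration, not the project's actual code.

```python
# Illustrative sketch of the detection flow visible in the hardware_detector
# log entries above. Not the project's real implementation.
import os
import platform
import shutil
import subprocess

def detect_specs() -> dict:
    specs = {
        "platform": platform.system(),        # e.g. 'Darwin'
        "architecture": platform.machine(),   # e.g. 'arm64'
        "cpu_count": os.cpu_count(),
        "python_version": platform.python_version(),
        "gpu_info": None,
        "cuda_available": False,
        "mps_available": False,
        "torch_version": None,
    }
    # GPU detection via nvidia-smi; if the binary is missing, as in the log,
    # gpu_info stays None ("nvidia-smi not found, no NVIDIA GPU detected").
    if shutil.which("nvidia-smi"):
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True, text=True,
        )
        if out.returncode == 0:
            specs["gpu_info"] = out.stdout.strip().splitlines()
    # PyTorch is treated as an optional dependency; on Apple Silicon the log
    # shows cuda_available=False, mps_available=True.
    try:
        import torch
        specs["torch_version"] = torch.__version__
        specs["cuda_available"] = torch.cuda.is_available()
        specs["mps_available"] = torch.backends.mps.is_available()
    except ImportError:
        pass
    return specs
```

On the machine in this log, such a sketch would produce a dict shaped like the "Detected specs" entry (Darwin/arm64, 16 CPU cores, CUDA off, MPS on).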
2025-05-30 00:00:01,476 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 00:00:01,476 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 00:00:01,476 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 00:00:01,480 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 00:00:01,481 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 00:00:01,916 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 00:00:01,916 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 00:00:01,916 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 00:00:01,916 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 00:00:01,916 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 00:00:01,916 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 00:00:01,916 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 00:00:01,916 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 00:00:01,916 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 00:00:01,916 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 00:00:01,916 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 00:00:01,918 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 00:00:01,930 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 00:00:01,936 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 00:00:02,010 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 00:00:02,048 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=None socket_options=None
2025-05-30 00:00:02,049 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:00:02,049 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:00:02,049 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:00:02,049 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:00:02,049 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:00:02,049 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:00:02,050 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:00:02 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 00:00:02,050 - httpx - INFO - HTTP Request: GET http://localhost:7861/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 00:00:02,050 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:00:02,050 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:00:02,050 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:00:02,050 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:00:02,050 - httpcore.connection - DEBUG - close.started
2025-05-30 00:00:02,050 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:00:02,050 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=3 socket_options=None
2025-05-30 00:00:02,052 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:00:02,052 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:00:02,052 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:00:02,052 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:00:02,052 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:00:02,052 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:00:02,058 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:00:02 GMT'), (b'server', b'uvicorn'), (b'content-length', b'109706'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 00:00:02,058 - httpx - INFO - HTTP Request: HEAD http://localhost:7861/ "HTTP/1.1 200 OK"
2025-05-30 00:00:02,058 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:00:02,058 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:00:02,058 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:00:02,058 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:00:02,058 - httpcore.connection - DEBUG - close.started
2025-05-30 00:00:02,058 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:00:02,069 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 00:00:02,138 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:00:02,138 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 00:00:02,207 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:00:02,207 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 00:00:02,293 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 00:00:02,420 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 00:00:02,420 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:00:02,420 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:00:02,421 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:00:02,421 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:00:02,421 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:00:02,486 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 00:00:02,486 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:00:02,487 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:00:02,487 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:00:02,487 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:00:02,487 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:00:02,561 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:00:02 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 00:00:02,562 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 00:00:02,563 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:00:02,563 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:00:02,564 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:00:02,564 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:00:02,564 - httpcore.connection - DEBUG - close.started
2025-05-30 00:00:02,565 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:00:02,627 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:00:02 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 00:00:02,629 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 00:00:02,630 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:00:02,630 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:00:02,630 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:00:02,631 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:00:02,635 - httpcore.connection - DEBUG - close.started
2025-05-30 00:00:02,636 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:00:02,709 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:00:02,709 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:00:02,709 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 00:00:02,710 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 00:00:02,710 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:00:02,710 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:00:02,710 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 00:00:02,710 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:00:02,710 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:00:03,267 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 00:00:03,490 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 00:04:08,649 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 00:04:08,650 - __main__ - DEBUG - API key found, length: 39
2025-05-30 00:04:08,650 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 00:04:08,650 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 00:04:08,650 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 00:04:08,650 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 00:04:08,650 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 00:04:08,650 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 00:04:08,650 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 00:04:08,650 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 00:04:08,653 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 00:04:08,654 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 00:04:09,095 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 00:04:09,095 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 00:04:09,095 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 00:04:09,095 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 00:04:09,095 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 00:04:09,095 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 00:04:09,095 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 00:04:09,095 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 00:04:09,095 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 00:04:09,095 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 00:04:09,096 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 00:04:09,098 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 00:04:09,119 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 00:04:09,125 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 00:04:09,205 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 00:04:09,239 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=None socket_options=None
2025-05-30 00:04:09,240 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:04:09,240 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:04:09,240 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:04:09,240 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:04:09,240 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:04:09,240 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:04:09,240 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:04:09 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 00:04:09,241 - httpx - INFO - HTTP Request: GET http://localhost:7861/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 00:04:09,241 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:04:09,241 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:04:09,241 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:04:09,241 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:04:09,241 - httpcore.connection - DEBUG - close.started
2025-05-30 00:04:09,241 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:04:09,241 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=3 socket_options=None
2025-05-30 00:04:09,242 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:04:09,242 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:04:09,242 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:04:09,242 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:04:09,242 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:04:09,242 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:04:09,248 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:04:09 GMT'), (b'server', b'uvicorn'), (b'content-length', b'107757'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 00:04:09,248 - httpx - INFO - HTTP Request: HEAD http://localhost:7861/ "HTTP/1.1 200 OK"
2025-05-30 00:04:09,248 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:04:09,248 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:04:09,248 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:04:09,248 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:04:09,248 - httpcore.connection - DEBUG - close.started
2025-05-30 00:04:09,248 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:04:09,259 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 00:04:09,286 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:04:09,286 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 00:04:09,399 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:04:09,399 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 00:04:09,404 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 00:04:09,567 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 00:04:09,567 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:04:09,567 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:04:09,567 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:04:09,567 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:04:09,567 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:04:09,678 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 00:04:09,678 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:04:09,678 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:04:09,678 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:04:09,678 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:04:09,678 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:04:09,710 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:04:09 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 00:04:09,710 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 00:04:09,710 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:04:09,710 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:04:09,710 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:04:09,710 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:04:09,710 - httpcore.connection - DEBUG - close.started
2025-05-30 00:04:09,711 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:04:09,820 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:04:09 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 00:04:09,820 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 00:04:09,820 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:04:09,820 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:04:09,820 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:04:09,820 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:04:09,820 - httpcore.connection - DEBUG - close.started
2025-05-30 00:04:09,820 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:04:10,304 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:04:10,304 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:04:10,305 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 00:04:10,305 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 00:04:10,305 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:04:10,305 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:04:10,305 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 00:04:10,305 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:04:10,305 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:04:10,453 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 00:04:10,668 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 00:05:09,731 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 00:05:09,732 - __main__ - DEBUG - API key found, length: 39
2025-05-30 00:05:09,732 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 00:05:09,732 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 00:05:09,732 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 00:05:09,732 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 00:05:09,732 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 00:05:09,732 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 00:05:09,732 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 00:05:09,732 -
hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 00:05:09,736 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 00:05:09,736 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 00:05:10,155 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 00:05:10,155 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 00:05:10,155 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 00:05:10,155 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 00:05:10,155 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 00:05:10,155 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 00:05:10,155 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 00:05:10,155 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 00:05:10,155 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 00:05:10,155 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 00:05:10,155 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 00:05:10,157 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 00:05:10,170 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 00:05:10,177 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 00:05:10,271 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 00:05:10,302 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=None 
socket_options=None 2025-05-30 00:05:10,303 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:05:10,303 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:05:10,303 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:05:10,303 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:05:10,303 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:05:10,304 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:05:10,304 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:05:10 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 00:05:10,304 - httpx - INFO - HTTP Request: GET http://localhost:7861/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 00:05:10,304 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:05:10,304 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:05:10,304 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:05:10,304 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:05:10,304 - httpcore.connection - DEBUG - close.started 2025-05-30 00:05:10,304 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:05:10,304 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=3 socket_options=None 2025-05-30 00:05:10,305 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:05:10,305 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:05:10,305 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:05:10,305 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:05:10,305 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 
00:05:10,305 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:05:10,311 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:05:10 GMT'), (b'server', b'uvicorn'), (b'content-length', b'107689'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 00:05:10,311 - httpx - INFO - HTTP Request: HEAD http://localhost:7861/ "HTTP/1.1 200 OK" 2025-05-30 00:05:10,311 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:05:10,311 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:05:10,311 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:05:10,311 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:05:10,311 - httpcore.connection - DEBUG - close.started 2025-05-30 00:05:10,311 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:05:10,315 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:05:10,315 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 00:05:10,322 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 00:05:10,453 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 00:05:10,464 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:05:10,464 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 00:05:10,591 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 00:05:10,591 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:05:10,592 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:05:10,592 - httpcore.http11 - DEBUG - 
send_request_body.started request= 2025-05-30 00:05:10,592 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:05:10,592 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:05:10,732 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:05:10 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 00:05:10,733 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 00:05:10,733 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:05:10,734 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:05:10,734 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:05:10,734 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:05:10,734 - httpcore.connection - DEBUG - close.started 2025-05-30 00:05:10,735 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:05:10,750 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 00:05:10,751 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:05:10,751 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:05:10,751 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:05:10,751 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:05:10,751 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:05:10,896 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:05:10 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', 
b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 00:05:10,897 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 00:05:10,898 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:05:10,898 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:05:10,898 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:05:10,898 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:05:10,899 - httpcore.connection - DEBUG - close.started 2025-05-30 00:05:10,899 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:05:11,318 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:05:11,318 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 00:05:11,318 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 00:05:11,318 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 00:05:11,318 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:05:11,318 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 00:05:11,318 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 00:05:11,318 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:05:11,318 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 00:05:11,467 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 00:05:11,688 - 
urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 00:06:35,442 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 00:06:35,442 - __main__ - DEBUG - API key found, length: 39 2025-05-30 00:06:35,442 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 00:06:35,442 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 00:06:35,442 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 00:06:35,442 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 00:06:35,442 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 00:06:35,442 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 00:06:35,442 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 00:06:35,442 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 00:06:35,446 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 00:06:35,446 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 00:06:35,858 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 00:06:35,858 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 00:06:35,858 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 00:06:35,858 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 00:06:35,858 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 00:06:35,858 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 00:06:35,858 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 00:06:35,858 - 
simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 00:06:35,858 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 00:06:35,858 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 00:06:35,858 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 00:06:35,860 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 00:06:35,873 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 00:06:35,878 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 00:06:35,949 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 00:06:35,984 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=None socket_options=None 2025-05-30 00:06:35,984 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:06:35,985 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:06:35,985 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:06:35,985 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:06:35,985 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:06:35,985 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:06:35,985 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:06:35 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 00:06:35,985 - httpx - INFO - HTTP Request: GET http://localhost:7861/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 00:06:35,985 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:06:35,985 - httpcore.http11 - DEBUG - 
receive_response_body.complete 2025-05-30 00:06:35,986 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:06:35,986 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:06:35,986 - httpcore.connection - DEBUG - close.started 2025-05-30 00:06:35,986 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:06:35,986 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=3 socket_options=None 2025-05-30 00:06:35,986 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:06:35,986 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:06:35,987 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:06:35,987 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:06:35,987 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:06:35,987 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:06:35,993 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:06:35 GMT'), (b'server', b'uvicorn'), (b'content-length', b'107788'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 00:06:35,993 - httpx - INFO - HTTP Request: HEAD http://localhost:7861/ "HTTP/1.1 200 OK" 2025-05-30 00:06:35,993 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:06:35,993 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:06:35,993 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:06:35,993 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:06:35,993 - httpcore.connection - DEBUG - close.started 2025-05-30 00:06:35,993 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:06:36,003 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 
socket_options=None 2025-05-30 00:06:36,066 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:06:36,066 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 00:06:36,146 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:06:36,146 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 00:06:36,188 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 00:06:36,349 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 00:06:36,350 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:06:36,351 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:06:36,351 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:06:36,351 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:06:36,351 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:06:36,439 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 00:06:36,441 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:06:36,443 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:06:36,444 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:06:36,444 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:06:36,444 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:06:36,493 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:06:36 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 
00:06:36,494 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 00:06:36,494 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:06:36,495 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:06:36,495 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:06:36,495 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:06:36,495 - httpcore.connection - DEBUG - close.started 2025-05-30 00:06:36,496 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:06:36,588 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:06:36 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 00:06:36,589 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 00:06:36,589 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:06:36,590 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:06:36,590 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:06:36,590 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:06:36,590 - httpcore.connection - DEBUG - close.started 2025-05-30 00:06:36,590 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:06:37,231 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 00:06:37,458 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 00:06:55,393 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:06:55,393 - 
simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 00:06:55,393 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 00:06:55,393 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 00:06:55,393 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:06:55,393 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 00:06:55,393 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 00:06:55,393 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:06:55,394 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 00:08:02,686 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 00:08:02,686 - __main__ - DEBUG - API key found, length: 39 2025-05-30 00:08:02,686 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 00:08:02,686 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 00:08:02,686 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 00:08:02,686 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 00:08:02,687 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 00:08:02,687 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 00:08:02,687 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 00:08:02,687 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 00:08:02,690 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 00:08:02,690 - 
hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 00:08:03,143 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 00:08:03,143 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 00:08:03,144 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 00:08:03,144 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 00:08:03,144 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 00:08:03,144 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 00:08:03,144 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 00:08:03,144 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 00:08:03,144 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 00:08:03,144 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 00:08:03,144 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 00:08:03,146 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 00:08:03,160 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 00:08:03,166 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 00:08:03,239 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 00:08:03,273 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=None socket_options=None 2025-05-30 00:08:03,274 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:08:03,274 - httpcore.http11 - DEBUG - send_request_headers.started request= 
2025-05-30 00:08:03,274 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:08:03,275 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:08:03,275 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:08:03,275 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:08:03,275 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:08:03 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 00:08:03,275 - httpx - INFO - HTTP Request: GET http://localhost:7861/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 00:08:03,275 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:08:03,275 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:08:03,275 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:08:03,275 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:08:03,275 - httpcore.connection - DEBUG - close.started 2025-05-30 00:08:03,275 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:08:03,276 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=3 socket_options=None 2025-05-30 00:08:03,276 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:08:03,276 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:08:03,276 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:08:03,276 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:08:03,276 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:08:03,276 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:08:03,283 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', 
[(b'date', b'Thu, 29 May 2025 15:08:03 GMT'), (b'server', b'uvicorn'), (b'content-length', b'107135'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 00:08:03,283 - httpx - INFO - HTTP Request: HEAD http://localhost:7861/ "HTTP/1.1 200 OK"
2025-05-30 00:08:03,283 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:08:03,283 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:08:03,283 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:08:03,283 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:08:03,283 - httpcore.connection - DEBUG - close.started
2025-05-30 00:08:03,283 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:08:03,295 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 00:08:03,328 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:08:03,328 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 00:08:03,436 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:08:03,436 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 00:08:03,443 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 00:08:03,617 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 00:08:03,618 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:08:03,618 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:08:03,619 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:08:03,619 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:08:03,619 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:08:03,715 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 00:08:03,715 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:08:03,716 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:08:03,716 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:08:03,716 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:08:03,716 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:08:03,763 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:08:03 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 00:08:03,763 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 00:08:03,763 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:08:03,763 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:08:03,763 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:08:03,763 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:08:03,763 - httpcore.connection - DEBUG - close.started
2025-05-30 00:08:03,763 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:08:03,855 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:08:03 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 00:08:03,856 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 00:08:03,856 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:08:03,856 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:08:03,856 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:08:03,856 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:08:03,856 - httpcore.connection - DEBUG - close.started
2025-05-30 00:08:03,856 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:08:04,087 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:08:04,087 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:08:04,087 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 00:08:04,087 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 00:08:04,087 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:08:04,087 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:08:04,087 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 00:08:04,087 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:08:04,087 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:08:04,452 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 00:08:04,684 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 00:12:12,860 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 00:12:12,860 - __main__ - DEBUG - API key found, length: 39
2025-05-30 00:12:12,860 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 00:12:12,860 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 00:12:12,860 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 00:12:12,860 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 00:12:12,860 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 00:12:12,860 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 00:12:12,860 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 00:12:12,860 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 00:12:12,864 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 00:12:12,865 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 00:12:13,341 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 00:12:13,341 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 00:12:13,341 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 00:12:13,341 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 00:12:13,341 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 00:12:13,342 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 00:12:13,342 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 00:12:13,342 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 00:12:13,342 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 00:12:13,342 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
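The simple_memory_calculator entries above report, for a 12.0B-parameter model, 24.0 GB of fp16 weights and 36.0 GB at inference. Those figures are consistent with the usual back-of-envelope estimate: fp16 stores 2 bytes per parameter, and inference adds roughly 50% overhead for activations. The sketch below is a hypothetical illustration of that arithmetic, not the app's actual SimpleMemoryCalculator implementation:

```python
def estimate_memory_gb(params_billions: float, inference_overhead: float = 1.5):
    """Rough fp16 memory estimate.

    Weights: 1e9 params/billion * 2 bytes/param = 2 GB per billion params.
    Inference: weights scaled by an assumed activation-overhead factor.
    """
    fp16_gb = params_billions * 2.0
    inference_gb = fp16_gb * inference_overhead
    return fp16_gb, inference_gb

# Reproduces the figures logged for black-forest-labs/FLUX.1-schnell (12B params)
weights_gb, inference_gb = estimate_memory_gb(12.0)
print(weights_gb, inference_gb)  # 24.0 36.0
```

Against the 8.0 GB VRAM mentioned in the recommendation entries, both figures exceed available memory, which is why offloading or quantization recommendations would be generated.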
2025-05-30 00:12:13,342 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 00:12:13,344 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 00:12:13,358 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 00:12:13,365 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 00:12:13,448 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 00:12:13,481 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=None socket_options=None
2025-05-30 00:12:13,482 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:12:13,482 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:12:13,482 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:12:13,483 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:12:13,483 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:12:13,483 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:12:13,483 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:12:13 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 00:12:13,483 - httpx - INFO - HTTP Request: GET http://localhost:7861/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 00:12:13,483 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:12:13,483 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:12:13,483 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:12:13,483 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:12:13,483 - httpcore.connection - DEBUG - close.started
2025-05-30 00:12:13,483 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:12:13,484 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=3 socket_options=None
2025-05-30 00:12:13,484 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:12:13,484 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:12:13,484 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:12:13,484 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:12:13,484 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:12:13,484 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:12:13,490 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:12:13 GMT'), (b'server', b'uvicorn'), (b'content-length', b'106328'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 00:12:13,491 - httpx - INFO - HTTP Request: HEAD http://localhost:7861/ "HTTP/1.1 200 OK"
2025-05-30 00:12:13,491 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:12:13,491 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:12:13,491 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:12:13,491 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:12:13,491 - httpcore.connection - DEBUG - close.started
2025-05-30 00:12:13,491 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:12:13,502 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 00:12:13,604 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:12:13,604 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 00:12:13,643 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:12:13,643 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 00:12:13,650 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 00:12:13,887 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 00:12:13,888 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:12:13,888 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:12:13,888 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:12:13,888 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:12:13,888 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:12:13,919 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 00:12:13,919 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:12:13,919 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:12:13,919 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:12:13,919 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:12:13,919 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:12:14,033 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:12:14 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 00:12:14,033 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 00:12:14,034 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:12:14,034 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:12:14,034 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:12:14,034 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:12:14,034 - httpcore.connection - DEBUG - close.started
2025-05-30 00:12:14,034 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:12:14,060 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:12:14 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 00:12:14,061 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 00:12:14,061 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:12:14,061 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:12:14,062 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:12:14,062 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:12:14,062 - httpcore.connection - DEBUG - close.started
2025-05-30 00:12:14,062 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:12:14,141 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:12:14,141 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:12:14,141 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 00:12:14,141 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 00:12:14,141 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:12:14,142 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:12:14,142 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 00:12:14,142 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:12:14,142 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:12:14,655 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 00:12:14,884 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 00:13:17,944 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 00:13:17,944 - __main__ - DEBUG - API key found, length: 39
2025-05-30 00:13:17,944 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 00:13:17,944 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 00:13:17,944 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 00:13:17,944 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 00:13:17,944 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 00:13:17,944 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 00:13:17,944 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 00:13:17,944 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 00:13:17,948 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 00:13:17,948 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 00:13:18,354 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 00:13:18,354 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 00:13:18,354 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 00:13:18,354 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 00:13:18,354 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 00:13:18,354 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 00:13:18,354 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 00:13:18,354 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 00:13:18,354 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 00:13:18,354 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 00:13:18,354 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 00:13:18,356 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 00:13:18,369 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 00:13:18,375 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 00:13:18,445 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 00:13:18,480 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=None socket_options=None
2025-05-30 00:13:18,481 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:13:18,481 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:13:18,481 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:13:18,481 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:13:18,481 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:13:18,481 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:13:18,481 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:13:18 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 00:13:18,482 - httpx - INFO - HTTP Request: GET http://localhost:7861/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 00:13:18,482 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:13:18,482 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:13:18,482 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:13:18,482 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:13:18,482 - httpcore.connection - DEBUG - close.started
2025-05-30 00:13:18,482 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:13:18,482 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=3 socket_options=None
2025-05-30 00:13:18,482 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:13:18,483 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:13:18,483 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:13:18,483 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:13:18,483 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:13:18,483 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:13:18,489 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:13:18 GMT'), (b'server', b'uvicorn'), (b'content-length', b'106650'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 00:13:18,489 - httpx - INFO - HTTP Request: HEAD http://localhost:7861/ "HTTP/1.1 200 OK"
2025-05-30 00:13:18,489 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:13:18,489 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:13:18,489 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:13:18,489 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:13:18,489 - httpcore.connection - DEBUG - close.started
2025-05-30 00:13:18,489 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:13:18,500 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 00:13:18,512 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:13:18,512 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 00:13:18,637 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:13:18,637 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 00:13:18,786 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 00:13:18,787 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:13:18,789 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:13:18,792 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:13:18,792 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:13:18,793 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:13:18,873 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 00:13:18,912 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 00:13:18,912 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:13:18,913 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:13:18,913 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:13:18,913 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:13:18,913 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:13:18,927 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:13:18 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 00:13:18,927 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 00:13:18,927 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:13:18,927 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:13:18,928 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:13:18,928 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:13:18,928 - httpcore.connection - DEBUG - close.started
2025-05-30 00:13:18,928 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:13:19,052 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:13:19 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 00:13:19,052 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 00:13:19,052 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:13:19,053 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:13:19,053 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:13:19,053 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:13:19,053 - httpcore.connection - DEBUG - close.started
2025-05-30 00:13:19,053 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:13:19,638 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 00:13:19,859 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 00:14:24,341 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 00:14:24,341 - __main__ - DEBUG - API key found, length: 39
2025-05-30 00:14:24,341 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 00:14:24,341 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 00:14:24,341 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 00:14:24,341 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 00:14:24,341 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 00:14:24,341 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 00:14:24,341 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 00:14:24,341 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 00:14:24,345 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 00:14:24,345 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 00:14:24,811 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 00:14:24,811 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 00:14:24,811 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 00:14:24,811 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 00:14:24,811 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 00:14:24,811 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 00:14:24,812 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 00:14:24,812 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 00:14:24,812 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 00:14:24,812 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 00:14:24,812 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 00:14:24,814 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 00:14:24,827 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 00:14:24,833 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 00:14:24,917 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 00:14:24,952 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=None socket_options=None
2025-05-30 00:14:24,952 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:14:24,953 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:14:24,953 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:14:24,953 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:14:24,953 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:14:24,953 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:14:24,953 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:14:24 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 00:14:24,954 - httpx - INFO - HTTP Request: GET http://localhost:7861/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 00:14:24,954 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:14:24,954 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:14:24,954 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:14:24,954 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:14:24,954 - httpcore.connection - DEBUG - close.started
2025-05-30 00:14:24,954 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:14:24,954 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=3 socket_options=None
2025-05-30 00:14:24,955 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:14:24,955 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:14:24,955 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:14:24,955 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:14:24,955 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:14:24,955 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:14:24,961 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:14:24 GMT'), (b'server', b'uvicorn'), (b'content-length', b'106654'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 00:14:24,961 - httpx - INFO - HTTP Request: HEAD http://localhost:7861/ "HTTP/1.1 200 OK"
2025-05-30 00:14:24,961 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:14:24,961 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:14:24,961 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:14:24,961 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:14:24,961 - httpcore.connection - DEBUG - close.started
2025-05-30 00:14:24,961 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:14:24,972 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 00:14:24,993 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:14:24,993 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 00:14:25,111 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:14:25,111 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 00:14:25,273 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 00:14:25,273 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:14:25,273 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:14:25,274 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:14:25,274 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:14:25,274 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:14:25,389 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 00:14:25,389 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:14:25,389 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:14:25,389 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:14:25,389 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:14:25,390 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:14:25,414 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:14:25 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 00:14:25,414 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 00:14:25,414 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:14:25,415 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:14:25,415 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:14:25,415 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:14:25,415 - httpcore.connection - DEBUG - close.started
2025-05-30 00:14:25,415 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:14:25,495 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 00:14:25,530 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:14:25 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 00:14:25,531 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 00:14:25,531 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:14:25,532 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:14:25,532 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:14:25,532 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:14:25,532 - httpcore.connection - DEBUG - close.started
2025-05-30 00:14:25,533 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:14:26,127 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 00:14:26,354 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 00:14:28,000 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:14:28,001 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:14:28,001 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 00:14:28,001 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 00:14:28,001 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:14:28,001 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:14:28,001 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 00:14:28,001 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:14:28,001 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:16:16,934 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 00:16:16,934 - __main__ - DEBUG - API key found, length: 39
2025-05-30 00:16:16,934 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 00:16:16,934 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 00:16:16,934 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 00:16:16,934 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 00:16:16,934 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 00:16:16,934 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 00:16:16,934 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 00:16:16,934 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 00:16:16,939 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 00:16:16,939 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 00:16:17,437 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 00:16:17,437 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 00:16:17,437 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 00:16:17,437 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 00:16:17,437 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 00:16:17,437 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 00:16:17,437 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 00:16:17,437 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 00:16:17,437 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 00:16:17,437 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 00:16:17,437 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 00:16:17,439 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 00:16:17,452 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 00:16:17,458 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 00:16:17,529 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 00:16:17,563 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=None socket_options=None
2025-05-30 00:16:17,563 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:16:17,563 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:16:17,563 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:16:17,563 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:16:17,564 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:16:17,564 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:16:17,564 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:16:17 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 00:16:17,564 - httpx - INFO - HTTP Request: GET http://localhost:7861/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 00:16:17,564 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:16:17,564 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:16:17,564 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:16:17,564 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:16:17,564 - httpcore.connection - DEBUG - close.started
2025-05-30 00:16:17,564 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:16:17,565 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=3 socket_options=None
2025-05-30 00:16:17,565 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:16:17,565 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:16:17,565 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:16:17,565 -
httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:16:17,565 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:16:17,565 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:16:17,571 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:16:17 GMT'), (b'server', b'uvicorn'), (b'content-length', b'106648'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 00:16:17,571 - httpx - INFO - HTTP Request: HEAD http://localhost:7861/ "HTTP/1.1 200 OK" 2025-05-30 00:16:17,571 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:16:17,571 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:16:17,571 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:16:17,571 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:16:17,572 - httpcore.connection - DEBUG - close.started 2025-05-30 00:16:17,572 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:16:17,582 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 00:16:17,653 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:16:17,654 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 00:16:17,724 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:16:17,724 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 00:16:17,729 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 00:16:17,936 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 00:16:17,936 - httpcore.http11 - DEBUG - send_request_headers.started request= 
2025-05-30 00:16:17,937 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:16:17,937 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:16:17,937 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:16:17,937 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:16:18,009 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 00:16:18,009 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:16:18,011 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:16:18,011 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:16:18,011 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:16:18,011 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:16:18,080 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:16:18 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 00:16:18,080 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 00:16:18,080 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:16:18,081 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:16:18,081 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:16:18,081 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:16:18,081 - httpcore.connection - DEBUG - close.started 2025-05-30 00:16:18,082 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:16:18,155 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:16:18 GMT'), (b'Content-Type', b'text/html; 
charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 00:16:18,156 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 00:16:18,156 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:16:18,157 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:16:18,157 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:16:18,157 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:16:18,157 - httpcore.connection - DEBUG - close.started 2025-05-30 00:16:18,157 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:16:18,761 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 00:16:18,778 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:16:18,778 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 00:16:18,778 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 00:16:18,779 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 00:16:18,779 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:16:18,779 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 00:16:18,779 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 00:16:18,779 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:16:18,779 - 
simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 00:16:18,982 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 00:16:30,473 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:16:30,473 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 00:16:30,473 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 00:16:30,473 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:16:30,474 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 00:16:30,474 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 00:16:30,474 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:16:30,474 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 00:16:30,474 - auto_diffusers - INFO - Starting code generation for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:16:30,474 - auto_diffusers - DEBUG - Parameters: prompt='A cat holding a sign that says hello world...', size=(768, 1360), steps=4 2025-05-30 00:16:30,474 - auto_diffusers - DEBUG - Manual specs: True, Memory analysis provided: True 2025-05-30 00:16:30,474 - auto_diffusers - INFO - Using manual hardware specifications 2025-05-30 00:16:30,474 - auto_diffusers - DEBUG - Manual specs: {'platform': 'Linux', 'architecture': 'manual_input', 'cpu_count': 8, 'python_version': '3.11', 'cuda_available': False, 'mps_available': False, 'torch_version': '2.0+', 'manual_input': True, 'ram_gb': 16, 'user_dtype': None, 
'gpu_info': None} 2025-05-30 00:16:30,474 - auto_diffusers - INFO - No GPU detected, using CPU-only profile 2025-05-30 00:16:30,475 - auto_diffusers - INFO - Selected optimization profile: cpu_only 2025-05-30 00:16:30,475 - auto_diffusers - DEBUG - Creating generation prompt for Gemini API 2025-05-30 00:16:30,475 - auto_diffusers - DEBUG - Prompt length: 7566 characters 2025-05-30 00:16:30,475 - auto_diffusers - INFO - ================================================================================ 2025-05-30 00:16:30,475 - auto_diffusers - INFO - PROMPT SENT TO GEMINI API: 2025-05-30 00:16:30,475 - auto_diffusers - INFO - ================================================================================ 2025-05-30 00:16:30,475 - auto_diffusers - INFO - You are an expert in optimizing diffusers library code for different hardware configurations. NOTE: This system includes curated optimization knowledge from HuggingFace documentation. TASK: Generate optimized Python code for running a diffusion model with the following specifications: - Model: black-forest-labs/FLUX.1-schnell - Prompt: "A cat holding a sign that says hello world" - Image size: 768x1360 - Inference steps: 4 HARDWARE SPECIFICATIONS: - Platform: Linux (manual_input) - CPU Cores: 8 - CUDA Available: False - MPS Available: False - Optimization Profile: cpu_only MEMORY ANALYSIS: - Model Memory Requirements: 36.0 GB (FP16 inference) - Model Weights Size: 24.0 GB (FP16) - Memory Recommendation: 🔄 Requires sequential CPU offloading - Recommended Precision: float16 - Attention Slicing Recommended: True - VAE Slicing Recommended: True OPTIMIZATION KNOWLEDGE BASE: # DIFFUSERS OPTIMIZATION TECHNIQUES ## Memory Optimization Techniques ### 1. 
Model CPU Offloading Use `enable_model_cpu_offload()` to move models between GPU and CPU automatically: ```python pipe.enable_model_cpu_offload() ``` - Saves significant VRAM by keeping only active models on GPU - Automatic management, no manual intervention needed - Compatible with all pipelines ### 2. Sequential CPU Offloading Use `enable_sequential_cpu_offload()` for more aggressive memory saving: ```python pipe.enable_sequential_cpu_offload() ``` - More memory efficient than model offloading - Moves models to CPU after each forward pass - Best for very limited VRAM scenarios ### 3. Attention Slicing Use `enable_attention_slicing()` to reduce memory during attention computation: ```python pipe.enable_attention_slicing() # or specify slice size pipe.enable_attention_slicing("max") # maximum slicing pipe.enable_attention_slicing(1) # slice_size = 1 ``` - Trades compute time for memory - Most effective for high-resolution images - Can be combined with other techniques ### 4. VAE Slicing Use `enable_vae_slicing()` for large batch processing: ```python pipe.enable_vae_slicing() ``` - Decodes images one at a time instead of all at once - Essential for batch sizes > 4 - Minimal performance impact on single images ### 5. VAE Tiling Use `enable_vae_tiling()` for high-resolution image generation: ```python pipe.enable_vae_tiling() ``` - Enables 4K+ image generation on 8GB VRAM - Splits images into overlapping tiles - Automatically disabled for 512x512 or smaller images ### 6. Memory Efficient Attention (xFormers) Use `enable_xformers_memory_efficient_attention()` if xFormers is installed: ```python pipe.enable_xformers_memory_efficient_attention() ``` - Significantly reduces memory usage and improves speed - Requires xformers library installation - Compatible with most models ## Performance Optimization Techniques ### 1. 
Half Precision (FP16/BF16) Use lower precision for better memory and speed: ```python # FP16 (widely supported) pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16) # BF16 (better numerical stability, newer hardware) pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16) ``` - FP16: Halves memory usage, widely supported - BF16: Better numerical stability, requires newer GPUs - Essential for most optimization scenarios ### 2. Torch Compile (PyTorch 2.0+) Use `torch.compile()` for significant speed improvements: ```python pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True) # For some models, compile VAE too: pipe.vae.decode = torch.compile(pipe.vae.decode, mode="reduce-overhead", fullgraph=True) ``` - 5-50% speed improvement - Requires PyTorch 2.0+ - First run is slower due to compilation ### 3. Fast Schedulers Use faster schedulers for fewer steps: ```python from diffusers import LMSDiscreteScheduler, UniPCMultistepScheduler # LMS Scheduler (good quality, fast) pipe.scheduler = LMSDiscreteScheduler.from_config(pipe.scheduler.config) # UniPC Scheduler (fastest) pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config) ``` ## Hardware-Specific Optimizations ### NVIDIA GPU Optimizations ```python # Enable Tensor Cores torch.backends.cudnn.benchmark = True # Optimal data type for NVIDIA torch_dtype = torch.float16 # or torch.bfloat16 for RTX 30/40 series ``` ### Apple Silicon (MPS) Optimizations ```python # Use MPS device device = "mps" if torch.backends.mps.is_available() else "cpu" pipe = pipe.to(device) # Recommended dtype for Apple Silicon torch_dtype = torch.bfloat16 # Better than float16 on Apple Silicon # Attention slicing often helps on MPS pipe.enable_attention_slicing() ``` ### CPU Optimizations ```python # Use float32 for CPU torch_dtype = torch.float32 # Enable optimized attention pipe.enable_attention_slicing() ``` ## Model-Specific Guidelines ### FLUX Models - Do 
NOT use guidance_scale parameter (not needed for FLUX) - Use 4-8 inference steps maximum - BF16 dtype recommended - Enable attention slicing for memory optimization ### Stable Diffusion XL - Enable attention slicing for high resolutions - Use refiner model sparingly to save memory - Consider VAE tiling for >1024px images ### Stable Diffusion 1.5/2.1 - Very memory efficient base models - Can often run without optimizations on 8GB+ VRAM - Enable VAE slicing for batch processing ## Memory Usage Estimation - FLUX.1: ~24GB for full precision, ~12GB for FP16 - SDXL: ~7GB for FP16, ~14GB for FP32 - SD 1.5: ~2GB for FP16, ~4GB for FP32 ## Optimization Combinations by VRAM ### 24GB+ VRAM (High-end) ```python pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16) pipe = pipe.to("cuda") pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True) ``` ### 12-24GB VRAM (Mid-range) ```python pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16) pipe = pipe.to("cuda") pipe.enable_model_cpu_offload() pipe.enable_xformers_memory_efficient_attention() ``` ### 8-12GB VRAM (Entry-level) ```python pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16) pipe.enable_sequential_cpu_offload() pipe.enable_attention_slicing() pipe.enable_vae_slicing() pipe.enable_xformers_memory_efficient_attention() ``` ### <8GB VRAM (Low-end) ```python pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16) pipe.enable_sequential_cpu_offload() pipe.enable_attention_slicing("max") pipe.enable_vae_slicing() pipe.enable_vae_tiling() ``` IMPORTANT: For FLUX.1-schnell models, do NOT include guidance_scale parameter as it's not needed. Using the OPTIMIZATION KNOWLEDGE BASE above, generate Python code that: 1. **Selects the best optimization techniques** for the specific hardware profile 2. **Applies appropriate memory optimizations** based on available VRAM 3. 
**Uses optimal data types** for the target hardware: - User specified dtype (if provided): Use exactly as specified - Apple Silicon (MPS): prefer torch.bfloat16 - NVIDIA GPUs: prefer torch.float16 or torch.bfloat16 - CPU only: use torch.float32 4. **Implements hardware-specific optimizations** (CUDA, MPS, CPU) 5. **Follows model-specific guidelines** (e.g., FLUX guidance_scale handling) IMPORTANT GUIDELINES: - Reference the OPTIMIZATION KNOWLEDGE BASE to select appropriate techniques - Include all necessary imports - Add brief comments explaining optimization choices - Generate compact, production-ready code - Inline values where possible for concise code - Generate ONLY the Python code, no explanations before or after the code block 2025-05-30 00:16:30,475 - auto_diffusers - INFO - ================================================================================ 2025-05-30 00:16:30,475 - auto_diffusers - INFO - Sending request to Gemini API 2025-05-30 00:16:45,777 - auto_diffusers - INFO - Successfully received response from Gemini API (no tools used) 2025-05-30 00:16:45,777 - auto_diffusers - DEBUG - Response length: 2393 characters 2025-05-30 00:20:26,203 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 00:20:26,203 - __main__ - DEBUG - API key found, length: 39 2025-05-30 00:20:26,203 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 00:20:26,203 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 00:20:26,203 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 00:20:26,203 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 00:20:26,203 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 00:20:26,204 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 00:20:26,204 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 00:20:26,204 - hardware_detector - DEBUG - Attempting GPU detection 
via nvidia-smi 2025-05-30 00:20:26,207 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 00:20:26,208 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 00:20:26,674 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 00:20:26,674 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 00:20:26,674 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 00:20:26,674 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 00:20:26,674 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 00:20:26,674 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 00:20:26,674 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 00:20:26,674 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 00:20:26,674 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 00:20:26,674 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 00:20:26,674 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 00:20:26,676 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 00:20:26,689 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 00:20:26,695 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 00:20:26,775 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 00:20:26,809 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=None socket_options=None 2025-05-30 00:20:26,809 - 
httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:20:26,810 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:20:26,810 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:20:26,810 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:20:26,810 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:20:26,810 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:20:26,810 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:20:26 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 00:20:26,810 - httpx - INFO - HTTP Request: GET http://localhost:7861/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 00:20:26,811 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:20:26,811 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:20:26,811 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:20:26,811 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:20:26,811 - httpcore.connection - DEBUG - close.started 2025-05-30 00:20:26,811 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:20:26,811 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=3 socket_options=None 2025-05-30 00:20:26,812 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:20:26,812 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:20:26,812 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:20:26,812 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:20:26,812 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:20:26,812 - httpcore.http11 - DEBUG - 
receive_response_headers.started request= 2025-05-30 00:20:26,818 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:20:26 GMT'), (b'server', b'uvicorn'), (b'content-length', b'104665'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 00:20:26,818 - httpx - INFO - HTTP Request: HEAD http://localhost:7861/ "HTTP/1.1 200 OK" 2025-05-30 00:20:26,818 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:20:26,818 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:20:26,819 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:20:26,819 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:20:26,819 - httpcore.connection - DEBUG - close.started 2025-05-30 00:20:26,819 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:20:26,829 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 00:20:26,983 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 00:20:27,005 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:20:27,005 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:20:27,005 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 00:20:27,005 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 00:20:27,291 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 00:20:27,291 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 00:20:27,291 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:20:27,292 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 
00:20:27,292 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:20:27,292 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:20:27,292 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:20:27,292 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:20:27,292 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:20:27,293 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:20:27,293 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:20:27,293 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:20:27,437 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:20:27 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 00:20:27,437 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 00:20:27,437 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:20:27,438 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:20:27,438 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:20:27,438 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:20:27,438 - httpcore.connection - DEBUG - close.started 2025-05-30 00:20:27,439 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:20:27 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 00:20:27,439 - httpx - INFO - 
HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 00:20:27,439 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:20:27,439 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:20:27,440 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:20:27,440 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:20:27,440 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:20:27,440 - httpcore.connection - DEBUG - close.started 2025-05-30 00:20:27,440 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:20:28,103 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 00:20:28,371 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 00:20:28,686 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:20:28,686 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 00:20:28,686 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 00:20:28,686 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 00:20:28,686 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:20:28,687 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 00:20:28,687 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 00:20:28,687 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:20:28,687 - simple_memory_calculator - DEBUG - Using cached memory data for 
black-forest-labs/FLUX.1-schnell 2025-05-30 00:20:30,112 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:20:30,112 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 00:20:30,112 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 00:20:30,112 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:20:30,112 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 00:20:30,112 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 00:20:30,112 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:20:30,112 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 00:20:30,112 - auto_diffusers - INFO - Starting code generation for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:20:30,112 - auto_diffusers - DEBUG - Parameters: prompt='A cat holding a sign that says hello world...', size=(768, 1360), steps=4 2025-05-30 00:20:30,112 - auto_diffusers - DEBUG - Manual specs: True, Memory analysis provided: True 2025-05-30 00:20:30,112 - auto_diffusers - INFO - Using manual hardware specifications 2025-05-30 00:20:30,112 - auto_diffusers - DEBUG - Manual specs: {'platform': 'Linux', 'architecture': 'manual_input', 'cpu_count': 8, 'python_version': '3.11', 'cuda_available': False, 'mps_available': False, 'torch_version': '2.0+', 'manual_input': True, 'ram_gb': 16, 'user_dtype': None, 'gpu_info': None} 2025-05-30 00:20:30,112 - auto_diffusers - INFO - No GPU detected, using CPU-only profile 2025-05-30 00:20:30,112 - auto_diffusers - INFO - Selected optimization profile: cpu_only 
2025-05-30 00:20:30,112 - auto_diffusers - DEBUG - Creating generation prompt for Gemini API
2025-05-30 00:20:30,112 - auto_diffusers - DEBUG - Prompt length: 7566 characters
2025-05-30 00:20:30,112 - auto_diffusers - INFO - ================================================================================
2025-05-30 00:20:30,112 - auto_diffusers - INFO - PROMPT SENT TO GEMINI API:
2025-05-30 00:20:30,112 - auto_diffusers - INFO - ================================================================================
2025-05-30 00:20:30,112 - auto_diffusers - INFO - You are an expert in optimizing diffusers library code for different hardware configurations.

NOTE: This system includes curated optimization knowledge from HuggingFace documentation.

TASK: Generate optimized Python code for running a diffusion model with the following specifications:
- Model: black-forest-labs/FLUX.1-schnell
- Prompt: "A cat holding a sign that says hello world"
- Image size: 768x1360
- Inference steps: 4

HARDWARE SPECIFICATIONS:
- Platform: Linux (manual_input)
- CPU Cores: 8
- CUDA Available: False
- MPS Available: False
- Optimization Profile: cpu_only

MEMORY ANALYSIS:
- Model Memory Requirements: 36.0 GB (FP16 inference)
- Model Weights Size: 24.0 GB (FP16)
- Memory Recommendation: 🔄 Requires sequential CPU offloading
- Recommended Precision: float16
- Attention Slicing Recommended: True
- VAE Slicing Recommended: True

OPTIMIZATION KNOWLEDGE BASE:

# DIFFUSERS OPTIMIZATION TECHNIQUES

## Memory Optimization Techniques

### 1. Model CPU Offloading
Use `enable_model_cpu_offload()` to move models between GPU and CPU automatically:
```python
pipe.enable_model_cpu_offload()
```
- Saves significant VRAM by keeping only active models on GPU
- Automatic management, no manual intervention needed
- Compatible with all pipelines

### 2. Sequential CPU Offloading
Use `enable_sequential_cpu_offload()` for more aggressive memory saving:
```python
pipe.enable_sequential_cpu_offload()
```
- More memory efficient than model offloading
- Moves models to CPU after each forward pass
- Best for very limited VRAM scenarios

### 3. Attention Slicing
Use `enable_attention_slicing()` to reduce memory during attention computation:
```python
pipe.enable_attention_slicing()
# or specify slice size
pipe.enable_attention_slicing("max")  # maximum slicing
pipe.enable_attention_slicing(1)      # slice_size = 1
```
- Trades compute time for memory
- Most effective for high-resolution images
- Can be combined with other techniques

### 4. VAE Slicing
Use `enable_vae_slicing()` for large batch processing:
```python
pipe.enable_vae_slicing()
```
- Decodes images one at a time instead of all at once
- Essential for batch sizes > 4
- Minimal performance impact on single images

### 5. VAE Tiling
Use `enable_vae_tiling()` for high-resolution image generation:
```python
pipe.enable_vae_tiling()
```
- Enables 4K+ image generation on 8GB VRAM
- Splits images into overlapping tiles
- Automatically disabled for 512x512 or smaller images

### 6. Memory Efficient Attention (xFormers)
Use `enable_xformers_memory_efficient_attention()` if xFormers is installed:
```python
pipe.enable_xformers_memory_efficient_attention()
```
- Significantly reduces memory usage and improves speed
- Requires xformers library installation
- Compatible with most models

## Performance Optimization Techniques

### 1. Half Precision (FP16/BF16)
Use lower precision for better memory and speed:
```python
# FP16 (widely supported)
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
# BF16 (better numerical stability, newer hardware)
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
```
- FP16: Halves memory usage, widely supported
- BF16: Better numerical stability, requires newer GPUs
- Essential for most optimization scenarios

### 2. Torch Compile (PyTorch 2.0+)
Use `torch.compile()` for significant speed improvements:
```python
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
# For some models, compile VAE too:
pipe.vae.decode = torch.compile(pipe.vae.decode, mode="reduce-overhead", fullgraph=True)
```
- 5-50% speed improvement
- Requires PyTorch 2.0+
- First run is slower due to compilation

### 3. Fast Schedulers
Use faster schedulers for fewer steps:
```python
from diffusers import LMSDiscreteScheduler, UniPCMultistepScheduler
# LMS Scheduler (good quality, fast)
pipe.scheduler = LMSDiscreteScheduler.from_config(pipe.scheduler.config)
# UniPC Scheduler (fastest)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
```

## Hardware-Specific Optimizations

### NVIDIA GPU Optimizations
```python
# Enable Tensor Cores
torch.backends.cudnn.benchmark = True
# Optimal data type for NVIDIA
torch_dtype = torch.float16  # or torch.bfloat16 for RTX 30/40 series
```

### Apple Silicon (MPS) Optimizations
```python
# Use MPS device
device = "mps" if torch.backends.mps.is_available() else "cpu"
pipe = pipe.to(device)
# Recommended dtype for Apple Silicon
torch_dtype = torch.bfloat16  # Better than float16 on Apple Silicon
# Attention slicing often helps on MPS
pipe.enable_attention_slicing()
```

### CPU Optimizations
```python
# Use float32 for CPU
torch_dtype = torch.float32
# Enable optimized attention
pipe.enable_attention_slicing()
```

## Model-Specific Guidelines

### FLUX Models
- Do NOT use guidance_scale parameter (not needed for FLUX)
- Use 4-8 inference steps maximum
- BF16 dtype recommended
- Enable attention slicing for memory optimization

### Stable Diffusion XL
- Enable attention slicing for high resolutions
- Use refiner model sparingly to save memory
- Consider VAE tiling for >1024px images

### Stable Diffusion 1.5/2.1
- Very memory efficient base models
- Can often run without optimizations on 8GB+ VRAM
- Enable VAE slicing for batch processing

## Memory Usage Estimation
- FLUX.1: ~24GB for full precision, ~12GB for FP16
- SDXL: ~7GB for FP16, ~14GB for FP32
- SD 1.5: ~2GB for FP16, ~4GB for FP32

## Optimization Combinations by VRAM

### 24GB+ VRAM (High-end)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
pipe = pipe.to("cuda")
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
```

### 12-24GB VRAM (Mid-range)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to("cuda")
pipe.enable_model_cpu_offload()
pipe.enable_xformers_memory_efficient_attention()
```

### 8-12GB VRAM (Entry-level)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing()
pipe.enable_vae_slicing()
pipe.enable_xformers_memory_efficient_attention()
```

### <8GB VRAM (Low-end)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing("max")
pipe.enable_vae_slicing()
pipe.enable_vae_tiling()
```

IMPORTANT: For FLUX.1-schnell models, do NOT include guidance_scale parameter as it's not needed.

Using the OPTIMIZATION KNOWLEDGE BASE above, generate Python code that:
1. **Selects the best optimization techniques** for the specific hardware profile
2. **Applies appropriate memory optimizations** based on available VRAM
3. **Uses optimal data types** for the target hardware:
   - User specified dtype (if provided): Use exactly as specified
   - Apple Silicon (MPS): prefer torch.bfloat16
   - NVIDIA GPUs: prefer torch.float16 or torch.bfloat16
   - CPU only: use torch.float32
4. **Implements hardware-specific optimizations** (CUDA, MPS, CPU)
5. **Follows model-specific guidelines** (e.g., FLUX guidance_scale handling)

IMPORTANT GUIDELINES:
- Reference the OPTIMIZATION KNOWLEDGE BASE to select appropriate techniques
- Include all necessary imports
- Add brief comments explaining optimization choices
- Generate compact, production-ready code
- Inline values where possible for concise code
- Generate ONLY the Python code, no explanations before or after the code block
2025-05-30 00:20:30,113 - auto_diffusers - INFO - ================================================================================
2025-05-30 00:20:30,113 - auto_diffusers - INFO - Sending request to Gemini API
2025-05-30 00:20:43,867 - auto_diffusers - INFO - Successfully received response from Gemini API (no tools used)
2025-05-30 00:20:43,868 - auto_diffusers - DEBUG - Response length: 1716 characters
2025-05-30 00:23:06,277 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 00:23:06,277 - __main__ - DEBUG - API key found, length: 39
2025-05-30 00:23:06,277 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 00:23:06,277 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 00:23:06,277 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 00:23:06,277 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 00:23:06,277 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 00:23:06,277 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 00:23:06,277 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 00:23:06,277 - hardware_detector - DEBUG - Attempting GPU detection
via nvidia-smi
2025-05-30 00:23:06,281 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 00:23:06,281 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 00:23:06,749 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 00:23:06,749 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 00:23:06,749 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 00:23:06,749 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 00:23:06,749 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 00:23:06,749 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 00:23:06,750 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 00:23:06,750 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 00:23:06,750 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 00:23:06,750 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 00:23:06,750 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 00:23:06,752 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 00:23:06,764 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 00:23:06,770 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 00:23:06,840 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 00:23:06,876 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=None socket_options=None
2025-05-30 00:23:06,877 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:23:06,877 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:23:06,877 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:23:06,877 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:23:06,877 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:23:06,877 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:23:06,878 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:23:06 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 00:23:06,878 - httpx - INFO - HTTP Request: GET http://localhost:7861/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 00:23:06,878 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:23:06,878 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:23:06,878 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:23:06,878 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:23:06,878 - httpcore.connection - DEBUG - close.started
2025-05-30 00:23:06,878 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:23:06,879 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=3 socket_options=None
2025-05-30 00:23:06,879 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:23:06,879 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:23:06,879 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:23:06,879 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:23:06,879 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:23:06,879 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:23:06,885 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:23:06 GMT'), (b'server', b'uvicorn'), (b'content-length', b'105144'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 00:23:06,885 - httpx - INFO - HTTP Request: HEAD http://localhost:7861/ "HTTP/1.1 200 OK"
2025-05-30 00:23:06,885 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:23:06,885 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:23:06,885 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:23:06,885 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:23:06,885 - httpcore.connection - DEBUG - close.started
2025-05-30 00:23:06,885 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:23:06,896 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 00:23:06,961 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:23:06,961 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 00:23:07,038 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:23:07,038 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 00:23:07,047 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 00:23:07,239 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 00:23:07,240 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:23:07,240 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:23:07,240 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:23:07,240 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:23:07,240 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:23:07,323 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 00:23:07,324 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:23:07,324 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:23:07,324 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:23:07,324 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:23:07,325 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:23:07,379 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:23:07 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 00:23:07,380 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 00:23:07,380 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:23:07,380 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:23:07,381 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:23:07,381 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:23:07,381 - httpcore.connection - DEBUG - close.started
2025-05-30 00:23:07,381 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:23:07,469 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:23:07 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 00:23:07,471 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 00:23:07,471 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:23:07,476 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:23:07,477 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:23:07,477 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:23:07,477 - httpcore.connection - DEBUG - close.started
2025-05-30 00:23:07,478 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:23:08,047 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:23:08,048 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:23:08,048 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 00:23:08,048 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 00:23:08,048 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:23:08,048 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:23:08,048 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 00:23:08,048 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:23:08,049 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:23:08,068 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 00:23:08,293 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 00:23:09,169 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:23:09,169 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:23:09,169 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 00:23:09,169 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:23:09,169 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:23:09,169 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 00:23:09,169 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:23:09,169 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:23:09,169 - auto_diffusers - INFO - Starting code generation for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:23:09,169 - auto_diffusers - DEBUG - Parameters: prompt='A cat holding a sign that says hello world...', size=(768, 1360), steps=4
2025-05-30 00:23:09,169 - auto_diffusers - DEBUG - Manual specs: True, Memory analysis provided: True
2025-05-30 00:23:09,169 - auto_diffusers - INFO - Using manual hardware specifications
2025-05-30 00:23:09,169 - auto_diffusers - DEBUG - Manual specs: {'platform': 'Linux', 'architecture': 'manual_input', 'cpu_count': 8, 'python_version': '3.11', 'cuda_available': False, 'mps_available': False, 'torch_version': '2.0+', 'manual_input': True, 'ram_gb': 16, 'user_dtype': None, 'gpu_info': None}
2025-05-30 00:23:09,169 - auto_diffusers - INFO - No GPU detected, using CPU-only profile
2025-05-30 00:23:09,169 - auto_diffusers - INFO - Selected optimization profile:
cpu_only
2025-05-30 00:23:09,169 - auto_diffusers - DEBUG - Creating generation prompt for Gemini API
2025-05-30 00:23:09,169 - auto_diffusers - DEBUG - Prompt length: 7566 characters
2025-05-30 00:23:09,169 - auto_diffusers - INFO - ================================================================================
2025-05-30 00:23:09,169 - auto_diffusers - INFO - PROMPT SENT TO GEMINI API:
2025-05-30 00:23:09,169 - auto_diffusers - INFO - ================================================================================
[prompt body identical to the 00:20:30,112 dump above; duplicate omitted]
2025-05-30 00:23:09,170 - auto_diffusers - INFO - ================================================================================
2025-05-30 00:23:09,170 - auto_diffusers - INFO - Sending request to Gemini API
2025-05-30 00:23:21,834 - auto_diffusers - INFO - Successfully received response from Gemini API (no tools used)
2025-05-30 00:23:21,836 - auto_diffusers - DEBUG - Response length: 1661 characters
2025-05-30 00:27:11,690 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 00:27:11,690 - __main__ - DEBUG - API key found, length: 39
2025-05-30 00:27:11,690 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 00:27:11,690 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 00:27:11,690 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 00:27:11,690 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 00:27:11,690 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 00:27:11,690 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 00:27:11,690 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 00:27:11,690 - hardware_detector - DEBUG - Attempting GPU detection
via nvidia-smi
2025-05-30 00:27:11,695 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 00:27:11,695 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 00:27:12,187 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 00:27:12,187 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 00:27:12,187 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 00:27:12,187 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 00:27:12,187 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 00:27:12,187 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 00:27:12,187 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 00:27:12,187 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 00:27:12,187 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 00:27:12,188 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 00:27:12,188 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 00:27:12,190 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 00:27:12,203 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 00:27:12,207 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 00:27:12,282 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 00:27:12,315 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=None socket_options=None
2025-05-30 00:27:12,315 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:27:12,315 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:27:12,316 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:27:12,316 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:27:12,316 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:27:12,316 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:27:12,316 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:27:12 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 00:27:12,316 - httpx - INFO - HTTP Request: GET http://localhost:7861/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 00:27:12,316 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:27:12,316 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:27:12,316 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:27:12,316 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:27:12,316 - httpcore.connection - DEBUG - close.started
2025-05-30 00:27:12,317 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:27:12,317 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=3 socket_options=None
2025-05-30 00:27:12,317 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:27:12,317 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:27:12,318 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:27:12,318 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:27:12,318 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:27:12,318 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:27:12,324 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:27:12 GMT'), (b'server', b'uvicorn'), (b'content-length', b'105338'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 00:27:12,324 - httpx - INFO - HTTP Request: HEAD http://localhost:7861/ "HTTP/1.1 200 OK"
2025-05-30 00:27:12,324 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:27:12,324 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:27:12,324 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:27:12,325 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:27:12,325 - httpcore.connection - DEBUG - close.started
2025-05-30 00:27:12,325 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:27:12,335 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 00:27:12,366 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:27:12,366 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 00:27:12,478 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:27:12,478 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 00:27:12,484 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 00:27:12,648 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 00:27:12,649 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:27:12,649 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:27:12,649 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:27:12,649
- httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:27:12,649 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:27:12,762 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 00:27:12,762 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:27:12,762 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:27:12,762 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:27:12,762 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:27:12,762 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:27:12,793 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:27:12 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 00:27:12,793 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 00:27:12,793 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:27:12,793 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:27:12,793 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:27:12,793 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:27:12,793 - httpcore.connection - DEBUG - close.started 2025-05-30 00:27:12,793 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:27:12,907 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:27:12 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), 
(b'Content-Encoding', b'gzip')]) 2025-05-30 00:27:12,907 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 00:27:12,908 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:27:12,908 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:27:12,908 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:27:12,909 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:27:12,909 - httpcore.connection - DEBUG - close.started 2025-05-30 00:27:12,909 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:27:13,500 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 00:27:13,583 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:27:13,583 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 00:27:13,583 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 00:27:13,584 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 00:27:13,584 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:27:13,584 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 00:27:13,584 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 00:27:13,584 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:27:13,584 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 00:27:13,720 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD 
/api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 00:30:00,106 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 00:30:00,106 - __main__ - DEBUG - API key found, length: 39 2025-05-30 00:30:00,106 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 00:30:00,106 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 00:30:00,106 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 00:30:00,106 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 00:30:00,106 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 00:30:00,106 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 00:30:00,106 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 00:30:00,106 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 00:30:00,110 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 00:30:00,111 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 00:30:00,561 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 00:30:00,561 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 00:30:00,561 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 00:30:00,561 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 00:30:00,561 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 00:30:00,561 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 00:30:00,561 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 00:30:00,561 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 
00:30:00,561 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 00:30:00,561 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 00:30:00,561 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 00:30:00,563 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 00:30:00,576 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 00:30:00,582 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 00:30:00,660 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 00:30:00,699 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=None socket_options=None 2025-05-30 00:30:00,699 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:30:00,699 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:30:00,699 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:30:00,700 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:30:00,700 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:30:00,700 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:30:00,700 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:30:00 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 00:30:00,701 - httpx - INFO - HTTP Request: GET http://localhost:7861/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 00:30:00,701 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:30:00,701 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:30:00,701 - httpcore.http11 - DEBUG - 
response_closed.started 2025-05-30 00:30:00,701 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:30:00,701 - httpcore.connection - DEBUG - close.started 2025-05-30 00:30:00,701 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:30:00,701 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=3 socket_options=None 2025-05-30 00:30:00,703 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:30:00,704 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:30:00,704 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:30:00,704 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:30:00,704 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:30:00,704 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:30:00,713 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:30:00 GMT'), (b'server', b'uvicorn'), (b'content-length', b'105821'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 00:30:00,713 - httpx - INFO - HTTP Request: HEAD http://localhost:7861/ "HTTP/1.1 200 OK" 2025-05-30 00:30:00,713 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:30:00,713 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:30:00,713 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:30:00,713 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:30:00,713 - httpcore.connection - DEBUG - close.started 2025-05-30 00:30:00,713 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:30:00,723 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 00:30:00,737 - httpcore.connection - DEBUG - connect_tcp.complete 
return_value= 2025-05-30 00:30:00,737 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 00:30:00,857 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 00:30:00,882 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:30:00,882 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 00:30:01,011 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 00:30:01,012 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:30:01,012 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:30:01,012 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:30:01,013 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:30:01,013 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:30:01,152 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:30:01 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 00:30:01,152 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 00:30:01,152 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:30:01,152 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:30:01,152 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:30:01,152 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:30:01,152 - httpcore.connection - DEBUG - close.started 2025-05-30 00:30:01,152 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:30:01,200 - httpcore.connection - 
DEBUG - start_tls.complete return_value= 2025-05-30 00:30:01,200 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:30:01,200 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:30:01,200 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:30:01,200 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:30:01,200 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:30:01,362 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:30:01 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 00:30:01,362 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 00:30:01,362 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:30:01,363 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:30:01,363 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:30:01,363 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:30:01,363 - httpcore.connection - DEBUG - close.started 2025-05-30 00:30:01,363 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:30:01,967 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 00:30:01,967 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:30:01,968 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 00:30:01,968 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 
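The calculator entries above keep reducing the same three inputs (24.0 GB of FP16 weights, a 36.0 GB inference footprint, 8.0 GB of available VRAM) to an offloading recommendation. A minimal sketch of such a decision rule follows; the function name and thresholds are hypothetical reconstructions, not the actual SimpleMemoryCalculator API:

```python
def recommend_offloading(inference_gb: float, model_gb: float, vram_gb: float) -> str:
    """Map memory requirements vs. available VRAM to an offloading strategy.

    Hypothetical rule inferred from the logged inputs and outputs; the real
    calculator may use different thresholds or extra signals.
    """
    if vram_gb >= inference_gb:
        # Weights and activations both fit on the device.
        return "fits in VRAM, no offloading needed"
    if vram_gb >= model_gb:
        # Weights fit but the full inference footprint does not.
        return "model CPU offloading recommended"
    # Neither fits: stream submodules through the device one at a time.
    return "requires sequential CPU offloading"

# 36.0 GB inference / 24.0 GB weights vs. 8.0 GB VRAM, as in the log:
print(recommend_offloading(36.0, 24.0, 8.0))  # requires sequential CPU offloading
```

With 8 GB against a 24 GB model, only the most aggressive option is left, which matches the "Requires sequential CPU offloading" recommendation the log reports.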
2025-05-30 00:30:01,968 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 00:30:01,968 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:30:01,968 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:30:01,968 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 00:30:01,968 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:30:01,968 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:30:02,190 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 00:30:05,722 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:30:05,722 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:30:05,723 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 00:30:05,723 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:30:05,723 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:30:05,723 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 00:30:05,723 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:30:05,723 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:30:05,723 - auto_diffusers - INFO - Starting code generation for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:30:05,724 - auto_diffusers - DEBUG - Parameters: prompt='A cat holding a sign that says hello world...', size=(768, 1360), steps=4
2025-05-30 00:30:05,724 - auto_diffusers - DEBUG - Manual specs: True, Memory analysis provided: True
2025-05-30 00:30:05,724 - auto_diffusers - INFO - Using manual hardware specifications
2025-05-30 00:30:05,724 - auto_diffusers - DEBUG - Manual specs: {'platform': 'Linux', 'architecture': 'manual_input', 'cpu_count': 8, 'python_version': '3.11', 'cuda_available': False, 'mps_available': False, 'torch_version': '2.0+', 'manual_input': True, 'ram_gb': 16, 'user_dtype': None, 'gpu_info': None}
2025-05-30 00:30:05,724 - auto_diffusers - INFO - No GPU detected, using CPU-only profile
2025-05-30 00:30:05,724 - auto_diffusers - INFO - Selected optimization profile: cpu_only
2025-05-30 00:30:05,724 - auto_diffusers - DEBUG - Creating generation prompt for Gemini API
2025-05-30 00:30:05,725 - auto_diffusers - DEBUG - Prompt length: 7566 characters
2025-05-30 00:30:05,725 - auto_diffusers - INFO - ================================================================================
2025-05-30 00:30:05,725 - auto_diffusers - INFO - PROMPT SENT TO GEMINI API:
2025-05-30 00:30:05,725 - auto_diffusers - INFO - ================================================================================
2025-05-30 00:30:05,725 - auto_diffusers - INFO - You are an expert in optimizing diffusers library code for different hardware configurations.
NOTE: This system includes curated optimization knowledge from HuggingFace documentation.
TASK: Generate optimized Python code for running a diffusion model with the following specifications: - Model: black-forest-labs/FLUX.1-schnell - Prompt: "A cat holding a sign that says hello world" - Image size: 768x1360 - Inference steps: 4 HARDWARE SPECIFICATIONS: - Platform: Linux (manual_input) - CPU Cores: 8 - CUDA Available: False - MPS Available: False - Optimization Profile: cpu_only MEMORY ANALYSIS: - Model Memory Requirements: 36.0 GB (FP16 inference) - Model Weights Size: 24.0 GB (FP16) - Memory Recommendation: 🔄 Requires sequential CPU offloading - Recommended Precision: float16 - Attention Slicing Recommended: True - VAE Slicing Recommended: True OPTIMIZATION KNOWLEDGE BASE: # DIFFUSERS OPTIMIZATION TECHNIQUES ## Memory Optimization Techniques ### 1. Model CPU Offloading Use `enable_model_cpu_offload()` to move models between GPU and CPU automatically: ```python pipe.enable_model_cpu_offload() ``` - Saves significant VRAM by keeping only active models on GPU - Automatic management, no manual intervention needed - Compatible with all pipelines ### 2. Sequential CPU Offloading Use `enable_sequential_cpu_offload()` for more aggressive memory saving: ```python pipe.enable_sequential_cpu_offload() ``` - More memory efficient than model offloading - Moves models to CPU after each forward pass - Best for very limited VRAM scenarios ### 3. Attention Slicing Use `enable_attention_slicing()` to reduce memory during attention computation: ```python pipe.enable_attention_slicing() # or specify slice size pipe.enable_attention_slicing("max") # maximum slicing pipe.enable_attention_slicing(1) # slice_size = 1 ``` - Trades compute time for memory - Most effective for high-resolution images - Can be combined with other techniques ### 4. 
VAE Slicing Use `enable_vae_slicing()` for large batch processing: ```python pipe.enable_vae_slicing() ``` - Decodes images one at a time instead of all at once - Essential for batch sizes > 4 - Minimal performance impact on single images ### 5. VAE Tiling Use `enable_vae_tiling()` for high-resolution image generation: ```python pipe.enable_vae_tiling() ``` - Enables 4K+ image generation on 8GB VRAM - Splits images into overlapping tiles - Automatically disabled for 512x512 or smaller images ### 6. Memory Efficient Attention (xFormers) Use `enable_xformers_memory_efficient_attention()` if xFormers is installed: ```python pipe.enable_xformers_memory_efficient_attention() ``` - Significantly reduces memory usage and improves speed - Requires xformers library installation - Compatible with most models ## Performance Optimization Techniques ### 1. Half Precision (FP16/BF16) Use lower precision for better memory and speed: ```python # FP16 (widely supported) pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16) # BF16 (better numerical stability, newer hardware) pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16) ``` - FP16: Halves memory usage, widely supported - BF16: Better numerical stability, requires newer GPUs - Essential for most optimization scenarios ### 2. Torch Compile (PyTorch 2.0+) Use `torch.compile()` for significant speed improvements: ```python pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True) # For some models, compile VAE too: pipe.vae.decode = torch.compile(pipe.vae.decode, mode="reduce-overhead", fullgraph=True) ``` - 5-50% speed improvement - Requires PyTorch 2.0+ - First run is slower due to compilation ### 3. 
Fast Schedulers Use faster schedulers for fewer steps: ```python from diffusers import LMSDiscreteScheduler, UniPCMultistepScheduler # LMS Scheduler (good quality, fast) pipe.scheduler = LMSDiscreteScheduler.from_config(pipe.scheduler.config) # UniPC Scheduler (fastest) pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config) ``` ## Hardware-Specific Optimizations ### NVIDIA GPU Optimizations ```python # Enable Tensor Cores torch.backends.cudnn.benchmark = True # Optimal data type for NVIDIA torch_dtype = torch.float16 # or torch.bfloat16 for RTX 30/40 series ``` ### Apple Silicon (MPS) Optimizations ```python # Use MPS device device = "mps" if torch.backends.mps.is_available() else "cpu" pipe = pipe.to(device) # Recommended dtype for Apple Silicon torch_dtype = torch.bfloat16 # Better than float16 on Apple Silicon # Attention slicing often helps on MPS pipe.enable_attention_slicing() ``` ### CPU Optimizations ```python # Use float32 for CPU torch_dtype = torch.float32 # Enable optimized attention pipe.enable_attention_slicing() ``` ## Model-Specific Guidelines ### FLUX Models - Do NOT use guidance_scale parameter (not needed for FLUX) - Use 4-8 inference steps maximum - BF16 dtype recommended - Enable attention slicing for memory optimization ### Stable Diffusion XL - Enable attention slicing for high resolutions - Use refiner model sparingly to save memory - Consider VAE tiling for >1024px images ### Stable Diffusion 1.5/2.1 - Very memory efficient base models - Can often run without optimizations on 8GB+ VRAM - Enable VAE slicing for batch processing ## Memory Usage Estimation - FLUX.1: ~24GB for full precision, ~12GB for FP16 - SDXL: ~7GB for FP16, ~14GB for FP32 - SD 1.5: ~2GB for FP16, ~4GB for FP32 ## Optimization Combinations by VRAM ### 24GB+ VRAM (High-end) ```python pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16) pipe = pipe.to("cuda") pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", 
fullgraph=True) ``` ### 12-24GB VRAM (Mid-range) ```python pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16) pipe = pipe.to("cuda") pipe.enable_model_cpu_offload() pipe.enable_xformers_memory_efficient_attention() ``` ### 8-12GB VRAM (Entry-level) ```python pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16) pipe.enable_sequential_cpu_offload() pipe.enable_attention_slicing() pipe.enable_vae_slicing() pipe.enable_xformers_memory_efficient_attention() ``` ### <8GB VRAM (Low-end) ```python pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16) pipe.enable_sequential_cpu_offload() pipe.enable_attention_slicing("max") pipe.enable_vae_slicing() pipe.enable_vae_tiling() ``` IMPORTANT: For FLUX.1-schnell models, do NOT include guidance_scale parameter as it's not needed. Using the OPTIMIZATION KNOWLEDGE BASE above, generate Python code that: 1. **Selects the best optimization techniques** for the specific hardware profile 2. **Applies appropriate memory optimizations** based on available VRAM 3. **Uses optimal data types** for the target hardware: - User specified dtype (if provided): Use exactly as specified - Apple Silicon (MPS): prefer torch.bfloat16 - NVIDIA GPUs: prefer torch.float16 or torch.bfloat16 - CPU only: use torch.float32 4. **Implements hardware-specific optimizations** (CUDA, MPS, CPU) 5. 
**Follows model-specific guidelines** (e.g., FLUX guidance_scale handling) IMPORTANT GUIDELINES: - Reference the OPTIMIZATION KNOWLEDGE BASE to select appropriate techniques - Include all necessary imports - Add brief comments explaining optimization choices - Generate compact, production-ready code - Inline values where possible for concise code - Generate ONLY the Python code, no explanations before or after the code block 2025-05-30 00:30:05,726 - auto_diffusers - INFO - ================================================================================ 2025-05-30 00:30:05,726 - auto_diffusers - INFO - Sending request to Gemini API 2025-05-30 00:30:17,996 - auto_diffusers - INFO - Successfully received response from Gemini API (no tools used) 2025-05-30 00:30:17,997 - auto_diffusers - DEBUG - Response length: 2079 characters 2025-05-30 00:33:31,874 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 00:33:31,874 - __main__ - DEBUG - API key found, length: 39 2025-05-30 00:33:31,874 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 00:33:31,874 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 00:33:31,874 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 00:33:31,874 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 00:33:31,874 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 00:33:31,874 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 00:33:31,874 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 00:33:31,874 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 00:33:31,879 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 00:33:31,879 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 00:33:32,357 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 00:33:32,357 - 
hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 00:33:32,357 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 00:33:32,357 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 00:33:32,357 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 00:33:32,357 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 00:33:32,357 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 00:33:32,357 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 00:33:32,357 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 00:33:32,357 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 00:33:32,357 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 00:33:32,359 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 00:33:32,371 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 00:33:32,371 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 00:33:32,465 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 00:33:32,498 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=None socket_options=None 2025-05-30 00:33:32,498 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:33:32,498 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:33:32,499 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:33:32,499 - httpcore.http11 - DEBUG - send_request_body.started 
request= 2025-05-30 00:33:32,499 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:33:32,499 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:33:32,499 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:33:32 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 00:33:32,499 - httpx - INFO - HTTP Request: GET http://localhost:7861/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 00:33:32,500 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:33:32,500 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:33:32,500 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:33:32,500 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:33:32,500 - httpcore.connection - DEBUG - close.started 2025-05-30 00:33:32,500 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:33:32,500 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=3 socket_options=None 2025-05-30 00:33:32,500 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:33:32,500 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:33:32,501 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:33:32,501 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:33:32,501 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:33:32,501 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:33:32,507 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:33:32 GMT'), (b'server', b'uvicorn'), (b'content-length', b'105814'), (b'content-type', b'text/html; charset=utf-8')]) 
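The hardware-detection run recorded above (platform probe, `nvidia-smi` lookup, PyTorch backend checks ending in the `Detected specs` dict) can be sketched roughly as below. This is a hypothetical reconstruction, not the actual `hardware_detector` source: the function name `detect_specs` and the exact `nvidia-smi` arguments are assumptions.

```python
# Hypothetical sketch of the detection sequence seen in the log.
import os
import platform
import shutil
import subprocess

def detect_specs():
    specs = {
        "platform": platform.system(),          # e.g. 'Darwin'
        "architecture": platform.machine(),     # e.g. 'arm64'
        "cpu_count": os.cpu_count(),
        "python_version": platform.python_version(),
        "gpu_info": None,
    }
    # GPU detection via nvidia-smi; absence means no NVIDIA GPU, as in the log
    if shutil.which("nvidia-smi"):
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True, text=True,
        )
        if out.returncode == 0:
            specs["gpu_info"] = out.stdout.strip() or None
    # PyTorch is optional; degrade gracefully if it is not installed
    try:
        import torch
        specs["torch_version"] = torch.__version__
        specs["cuda_available"] = torch.cuda.is_available()
        mps = getattr(torch.backends, "mps", None)
        specs["mps_available"] = bool(mps and mps.is_available())
    except ImportError:
        specs.update(torch_version=None, cuda_available=False, mps_available=False)
    return specs
```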
2025-05-30 00:33:32,507 - httpx - INFO - HTTP Request: HEAD http://localhost:7861/ "HTTP/1.1 200 OK" 2025-05-30 00:33:32,507 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:33:32,507 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:33:32,507 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:33:32,507 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:33:32,507 - httpcore.connection - DEBUG - close.started 2025-05-30 00:33:32,507 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:33:32,517 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 00:33:32,537 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:33:32,537 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 00:33:32,654 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:33:32,654 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 00:33:32,677 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 00:33:32,813 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 00:33:32,813 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:33:32,813 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:33:32,813 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:33:32,813 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:33:32,813 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:33:32,928 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 00:33:32,929 - httpcore.http11 - DEBUG - send_request_headers.started 
request= 2025-05-30 00:33:32,929 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:33:32,929 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:33:32,929 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:33:32,929 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:33:32,955 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:33:32 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 00:33:32,955 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 00:33:32,955 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:33:32,955 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:33:32,955 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:33:32,955 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:33:32,955 - httpcore.connection - DEBUG - close.started 2025-05-30 00:33:32,956 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:33:33,068 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:33:33 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 00:33:33,069 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 00:33:33,069 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:33:33,069 - httpcore.http11 - DEBUG - receive_response_body.complete 
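The `simple_memory_calculator` records below (24.0 GB FP16 weights, 36.0 GB inference footprint, 8.0 GB VRAM, "Requires sequential CPU offloading") imply a simple thresholding rule. The sketch below is an illustrative reconstruction under that assumption; the function name and the wording of the middle tier are guesses, not the real `SimpleMemoryCalculator` API.

```python
# Illustrative thresholding rule implied by the logged recommendations.
def recommend(model_gb: float, inference_gb: float, vram_gb: float) -> str:
    if vram_gb >= inference_gb:
        # Whole inference pass fits on the GPU
        return "Fits fully in VRAM"
    if vram_gb >= model_gb:
        # Weights fit, but activations do not: offload idle submodels
        return "Requires model CPU offloading"
    # Not even the weights fit: page layers in and out per forward pass
    return "Requires sequential CPU offloading"

# FLUX.1-schnell numbers from the log: 24.0 GB weights, 36.0 GB inference, 8.0 GB VRAM
print(recommend(24.0, 36.0, 8.0))  # -> Requires sequential CPU offloading
```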
2025-05-30 00:33:33,069 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:33:33,069 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:33:33,069 - httpcore.connection - DEBUG - close.started
2025-05-30 00:33:33,069 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:33:33,701 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 00:33:33,917 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 00:33:34,332 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:33:34,332 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:33:34,332 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 00:33:34,332 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 00:33:34,333 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:33:34,333 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:33:34,333 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 00:33:34,333 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:33:34,333 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:33:38,242 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:33:38,242 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:33:38,242 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 00:33:38,242 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:33:38,242 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:33:38,243 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 00:33:38,243 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:33:38,243 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:33:38,243 - auto_diffusers - INFO - Starting code generation for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:33:38,243 - auto_diffusers - DEBUG - Parameters: prompt='A cat holding a sign that says hello world...', size=(768, 1360), steps=4
2025-05-30 00:33:38,243 - auto_diffusers - DEBUG - Manual specs: True, Memory analysis provided: True
2025-05-30 00:33:38,243 - auto_diffusers - INFO - Using manual hardware specifications
2025-05-30 00:33:38,243 - auto_diffusers - DEBUG - Manual specs: {'platform': 'Linux', 'architecture': 'manual_input', 'cpu_count': 8, 'python_version': '3.11', 'cuda_available': False, 'mps_available': False, 'torch_version': '2.0+', 'manual_input': True, 'ram_gb': 16, 'user_dtype': None, 'gpu_info': None}
2025-05-30 00:33:38,243 - auto_diffusers - INFO - No GPU detected, using CPU-only profile
2025-05-30 00:33:38,243 - auto_diffusers - INFO - Selected optimization profile: cpu_only
2025-05-30 00:33:38,243 - auto_diffusers - DEBUG - Creating generation prompt for Gemini API
2025-05-30 00:33:38,243 - auto_diffusers - DEBUG - Prompt length: 7566 characters
2025-05-30 00:33:38,243 - auto_diffusers - INFO - ================================================================================
2025-05-30 00:33:38,243 - auto_diffusers - INFO - PROMPT SENT TO GEMINI API:
2025-05-30 00:33:38,243 - auto_diffusers - INFO - ================================================================================
2025-05-30 00:33:38,243 - auto_diffusers - INFO -

You are an expert in optimizing diffusers library code for different hardware configurations.

NOTE: This system includes curated optimization knowledge from HuggingFace documentation.

TASK: Generate optimized Python code for running a diffusion model with the following specifications:
- Model: black-forest-labs/FLUX.1-schnell
- Prompt: "A cat holding a sign that says hello world"
- Image size: 768x1360
- Inference steps: 4

HARDWARE SPECIFICATIONS:
- Platform: Linux (manual_input)
- CPU Cores: 8
- CUDA Available: False
- MPS Available: False
- Optimization Profile: cpu_only

MEMORY ANALYSIS:
- Model Memory Requirements: 36.0 GB (FP16 inference)
- Model Weights Size: 24.0 GB (FP16)
- Memory Recommendation: 🔄 Requires sequential CPU offloading
- Recommended Precision: float16
- Attention Slicing Recommended: True
- VAE Slicing Recommended: True

OPTIMIZATION KNOWLEDGE BASE:

# DIFFUSERS OPTIMIZATION TECHNIQUES

## Memory Optimization Techniques

### 1. Model CPU Offloading
Use `enable_model_cpu_offload()` to move models between GPU and CPU automatically:
```python
pipe.enable_model_cpu_offload()
```
- Saves significant VRAM by keeping only active models on GPU
- Automatic management, no manual intervention needed
- Compatible with all pipelines

### 2. Sequential CPU Offloading
Use `enable_sequential_cpu_offload()` for more aggressive memory saving:
```python
pipe.enable_sequential_cpu_offload()
```
- More memory efficient than model offloading
- Moves models to CPU after each forward pass
- Best for very limited VRAM scenarios

### 3. Attention Slicing
Use `enable_attention_slicing()` to reduce memory during attention computation:
```python
pipe.enable_attention_slicing()       # or specify slice size
pipe.enable_attention_slicing("max")  # maximum slicing
pipe.enable_attention_slicing(1)      # slice_size = 1
```
- Trades compute time for memory
- Most effective for high-resolution images
- Can be combined with other techniques

### 4. VAE Slicing
Use `enable_vae_slicing()` for large batch processing:
```python
pipe.enable_vae_slicing()
```
- Decodes images one at a time instead of all at once
- Essential for batch sizes > 4
- Minimal performance impact on single images

### 5. VAE Tiling
Use `enable_vae_tiling()` for high-resolution image generation:
```python
pipe.enable_vae_tiling()
```
- Enables 4K+ image generation on 8GB VRAM
- Splits images into overlapping tiles
- Automatically disabled for 512x512 or smaller images

### 6. Memory Efficient Attention (xFormers)
Use `enable_xformers_memory_efficient_attention()` if xFormers is installed:
```python
pipe.enable_xformers_memory_efficient_attention()
```
- Significantly reduces memory usage and improves speed
- Requires xformers library installation
- Compatible with most models

## Performance Optimization Techniques

### 1. Half Precision (FP16/BF16)
Use lower precision for better memory and speed:
```python
# FP16 (widely supported)
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
# BF16 (better numerical stability, newer hardware)
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
```
- FP16: Halves memory usage, widely supported
- BF16: Better numerical stability, requires newer GPUs
- Essential for most optimization scenarios

### 2. Torch Compile (PyTorch 2.0+)
Use `torch.compile()` for significant speed improvements:
```python
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
# For some models, compile VAE too:
pipe.vae.decode = torch.compile(pipe.vae.decode, mode="reduce-overhead", fullgraph=True)
```
- 5-50% speed improvement
- Requires PyTorch 2.0+
- First run is slower due to compilation

### 3. Fast Schedulers
Use faster schedulers for fewer steps:
```python
from diffusers import LMSDiscreteScheduler, UniPCMultistepScheduler
# LMS Scheduler (good quality, fast)
pipe.scheduler = LMSDiscreteScheduler.from_config(pipe.scheduler.config)
# UniPC Scheduler (fastest)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
```

## Hardware-Specific Optimizations

### NVIDIA GPU Optimizations
```python
# Enable Tensor Cores
torch.backends.cudnn.benchmark = True
# Optimal data type for NVIDIA
torch_dtype = torch.float16  # or torch.bfloat16 for RTX 30/40 series
```

### Apple Silicon (MPS) Optimizations
```python
# Use MPS device
device = "mps" if torch.backends.mps.is_available() else "cpu"
pipe = pipe.to(device)
# Recommended dtype for Apple Silicon
torch_dtype = torch.bfloat16  # Better than float16 on Apple Silicon
# Attention slicing often helps on MPS
pipe.enable_attention_slicing()
```

### CPU Optimizations
```python
# Use float32 for CPU
torch_dtype = torch.float32
# Enable optimized attention
pipe.enable_attention_slicing()
```

## Model-Specific Guidelines

### FLUX Models
- Do NOT use guidance_scale parameter (not needed for FLUX)
- Use 4-8 inference steps maximum
- BF16 dtype recommended
- Enable attention slicing for memory optimization

### Stable Diffusion XL
- Enable attention slicing for high resolutions
- Use refiner model sparingly to save memory
- Consider VAE tiling for >1024px images

### Stable Diffusion 1.5/2.1
- Very memory efficient base models
- Can often run without optimizations on 8GB+ VRAM
- Enable VAE slicing for batch processing

## Memory Usage Estimation
- FLUX.1: ~24GB for full precision, ~12GB for FP16
- SDXL: ~7GB for FP16, ~14GB for FP32
- SD 1.5: ~2GB for FP16, ~4GB for FP32

## Optimization Combinations by VRAM

### 24GB+ VRAM (High-end)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
pipe = pipe.to("cuda")
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
```

### 12-24GB VRAM (Mid-range)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to("cuda")
pipe.enable_model_cpu_offload()
pipe.enable_xformers_memory_efficient_attention()
```

### 8-12GB VRAM (Entry-level)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing()
pipe.enable_vae_slicing()
pipe.enable_xformers_memory_efficient_attention()
```

### <8GB VRAM (Low-end)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing("max")
pipe.enable_vae_slicing()
pipe.enable_vae_tiling()
```

IMPORTANT: For FLUX.1-schnell models, do NOT include guidance_scale parameter as it's not needed.

Using the OPTIMIZATION KNOWLEDGE BASE above, generate Python code that:

1. **Selects the best optimization techniques** for the specific hardware profile
2. **Applies appropriate memory optimizations** based on available VRAM
3. **Uses optimal data types** for the target hardware:
   - User specified dtype (if provided): Use exactly as specified
   - Apple Silicon (MPS): prefer torch.bfloat16
   - NVIDIA GPUs: prefer torch.float16 or torch.bfloat16
   - CPU only: use torch.float32
4. **Implements hardware-specific optimizations** (CUDA, MPS, CPU)
5.
**Follows model-specific guidelines** (e.g., FLUX guidance_scale handling)

IMPORTANT GUIDELINES:
- Reference the OPTIMIZATION KNOWLEDGE BASE to select appropriate techniques
- Include all necessary imports
- Add brief comments explaining optimization choices
- Generate compact, production-ready code
- Inline values where possible for concise code
- Generate ONLY the Python code, no explanations before or after the code block

2025-05-30 00:33:38,244 - auto_diffusers - INFO - ================================================================================
2025-05-30 00:33:38,244 - auto_diffusers - INFO - Sending request to Gemini API
2025-05-30 00:33:51,592 - auto_diffusers - INFO - Successfully received response from Gemini API (no tools used)
2025-05-30 00:33:51,593 - auto_diffusers - DEBUG - Response length: 1708 characters
2025-05-30 00:36:05,006 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 00:36:05,006 - __main__ - DEBUG - API key found, length: 39
2025-05-30 00:36:05,006 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 00:36:05,006 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 00:36:05,006 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 00:36:05,006 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 00:36:05,006 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 00:36:05,006 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 00:36:05,006 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 00:36:05,006 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 00:36:05,009 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 00:36:05,009 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 00:36:05,422 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 00:36:05,422 -
hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 00:36:05,422 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 00:36:05,422 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 00:36:05,422 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 00:36:05,422 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 00:36:05,422 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 00:36:05,422 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 00:36:05,422 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 00:36:05,422 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 00:36:05,422 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 00:36:05,424 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 00:36:05,437 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 00:36:05,442 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 00:36:05,513 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 00:36:05,548 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=None socket_options=None 2025-05-30 00:36:05,548 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:36:05,548 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:36:05,548 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:36:05,549 - httpcore.http11 - DEBUG - send_request_body.started 
request= 2025-05-30 00:36:05,549 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:36:05,549 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:36:05,549 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:36:05 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 00:36:05,549 - httpx - INFO - HTTP Request: GET http://localhost:7861/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 00:36:05,549 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:36:05,549 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:36:05,549 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:36:05,549 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:36:05,549 - httpcore.connection - DEBUG - close.started 2025-05-30 00:36:05,550 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:36:05,550 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=3 socket_options=None 2025-05-30 00:36:05,550 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:36:05,550 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:36:05,550 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:36:05,550 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:36:05,551 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:36:05,551 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:36:05,557 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:36:05 GMT'), (b'server', b'uvicorn'), (b'content-length', b'105805'), (b'content-type', b'text/html; charset=utf-8')]) 
2025-05-30 00:36:05,557 - httpx - INFO - HTTP Request: HEAD http://localhost:7861/ "HTTP/1.1 200 OK" 2025-05-30 00:36:05,557 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:36:05,557 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:36:05,557 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:36:05,557 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:36:05,557 - httpcore.connection - DEBUG - close.started 2025-05-30 00:36:05,557 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:36:05,568 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 00:36:05,681 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:36:05,681 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 00:36:05,718 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:36:05,718 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 00:36:05,734 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 00:36:05,966 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 00:36:05,966 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:36:05,967 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:36:05,967 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:36:05,967 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:36:05,967 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:36:06,020 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 00:36:06,020 - httpcore.http11 - DEBUG - send_request_headers.started 
request= 2025-05-30 00:36:06,020 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:36:06,020 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:36:06,021 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:36:06,021 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:36:06,109 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:36:06 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 00:36:06,110 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 00:36:06,110 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:36:06,110 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:36:06,110 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:36:06,110 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:36:06,110 - httpcore.connection - DEBUG - close.started 2025-05-30 00:36:06,110 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:36:06,172 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:36:06 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 00:36:06,172 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 00:36:06,172 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:36:06,172 - httpcore.http11 - DEBUG - receive_response_body.complete 
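The generation prompt logged earlier spells out a dtype-selection order: a user-specified dtype wins, then MPS prefers `torch.bfloat16`, NVIDIA GPUs prefer `torch.float16` (or `torch.bfloat16` on newer cards), and CPU-only falls back to `torch.float32`. A minimal sketch of that rule, with an assumed helper name and dtype names returned as strings so the snippet stays torch-free:

```python
# Sketch of the dtype rule stated in the logged generation prompt.
# choose_dtype is a hypothetical name, not part of auto_diffusers.
def choose_dtype(user_dtype, cuda_available, mps_available):
    if user_dtype:        # user-specified dtype: use exactly as given
        return user_dtype
    if mps_available:     # Apple Silicon: bfloat16 preferred over float16
        return "torch.bfloat16"
    if cuda_available:    # NVIDIA: float16 (bfloat16 on RTX 30/40 series)
        return "torch.float16"
    return "torch.float32"  # CPU-only profile: full precision

print(choose_dtype(None, False, True))   # -> torch.bfloat16
print(choose_dtype(None, False, False))  # -> torch.float32
```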
2025-05-30 00:36:06,172 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:36:06,172 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:36:06,172 - httpcore.connection - DEBUG - close.started 2025-05-30 00:36:06,172 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:36:06,788 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 00:36:07,005 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 00:36:25,970 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:36:25,971 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 00:36:25,971 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 00:36:25,971 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 00:36:25,971 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:36:25,971 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 00:36:25,971 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 00:36:25,971 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:36:25,972 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 00:41:24,618 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 00:41:24,618 - __main__ - DEBUG - API key found, length: 39 2025-05-30 00:41:24,618 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 00:41:24,618 - auto_diffusers - DEBUG - API key 
length: 39
2025-05-30 00:41:24,618 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 00:41:24,618 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 00:41:24,618 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 00:41:24,618 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 00:41:24,618 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 00:41:24,618 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 00:41:24,622 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 00:41:24,623 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 00:41:25,060 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 00:41:25,060 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 00:41:25,060 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 00:41:25,060 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 00:41:25,060 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 00:41:25,060 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 00:41:25,060 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 00:41:25,060 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 00:41:25,060 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 00:41:25,060 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 00:41:25,060 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 00:41:25,062 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 00:41:25,076 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 00:41:25,082 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 00:41:25,164 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 00:41:25,195 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=None socket_options=None
2025-05-30 00:41:25,196 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:41:25,196 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:41:25,196 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:41:25,196 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:41:25,197 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:41:25,197 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:41:25,197 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:41:25 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 00:41:25,197 - httpx - INFO - HTTP Request: GET http://localhost:7861/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 00:41:25,197 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:41:25,197 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:41:25,197 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:41:25,197 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:41:25,197 - httpcore.connection - DEBUG - close.started
2025-05-30 00:41:25,197 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:41:25,198 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=3 socket_options=None
2025-05-30 00:41:25,198 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:41:25,198 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:41:25,198 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:41:25,198 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:41:25,198 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:41:25,198 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:41:25,205 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:41:25 GMT'), (b'server', b'uvicorn'), (b'content-length', b'104937'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 00:41:25,206 - httpx - INFO - HTTP Request: HEAD http://localhost:7861/ "HTTP/1.1 200 OK"
2025-05-30 00:41:25,206 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:41:25,206 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:41:25,206 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:41:25,206 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:41:25,206 - httpcore.connection - DEBUG - close.started
2025-05-30 00:41:25,206 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:41:25,217 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 00:41:25,326 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:41:25,326 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 00:41:25,367 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 00:41:25,371 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:41:25,372 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 00:41:25,608 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 00:41:25,608 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:41:25,609 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:41:25,609 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:41:25,609 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:41:25,609 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:41:25,681 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 00:41:25,682 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:41:25,682 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:41:25,682 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:41:25,682 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:41:25,682 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:41:25,750 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:41:25 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 00:41:25,751 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 00:41:25,751 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:41:25,751 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:41:25,751 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:41:25,751 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:41:25,751 - httpcore.connection - DEBUG - close.started
2025-05-30 00:41:25,751 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:41:25,839 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:41:25 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 00:41:25,840 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 00:41:25,840 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:41:25,841 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:41:25,841 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:41:25,841 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:41:25,841 - httpcore.connection - DEBUG - close.started
2025-05-30 00:41:25,841 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:41:26,445 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 00:41:26,683 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 00:41:27,126 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:41:27,127 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:41:27,127 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 00:41:27,127 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 00:41:27,127 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:41:27,127 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:41:27,127 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 00:41:27,128 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:41:27,128 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:41:28,862 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:41:28,863 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:41:28,863 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 00:41:28,863 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:41:28,863 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:41:28,863 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 00:41:28,863 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:41:28,863 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:41:30,138 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:41:30,139 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:41:30,139 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 00:41:30,139 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:41:30,139 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:41:30,139 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 00:41:30,139 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:41:30,140 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:43:45,643 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 00:43:45,643 - __main__ - DEBUG - API key found, length: 39
2025-05-30 00:43:45,643 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 00:43:45,643 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 00:43:45,644 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 00:43:45,644 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 00:43:45,644 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 00:43:45,644 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 00:43:45,644 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 00:43:45,644 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 00:43:45,647 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 00:43:45,647 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 00:43:46,100 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 00:43:46,100 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 00:43:46,100 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 00:43:46,100 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 00:43:46,100 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 00:43:46,100 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 00:43:46,100 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 00:43:46,100 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 00:43:46,100 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 00:43:46,100 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 00:43:46,100 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 00:43:46,103 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 00:43:46,116 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 00:43:46,120 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 00:43:46,191 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 00:43:46,228 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=None socket_options=None
2025-05-30 00:43:46,228 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:43:46,228 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:43:46,229 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:43:46,229 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:43:46,229 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:43:46,229 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:43:46,229 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:43:46 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 00:43:46,229 - httpx - INFO - HTTP Request: GET http://localhost:7861/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 00:43:46,229 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:43:46,230 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:43:46,230 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:43:46,230 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:43:46,230 - httpcore.connection - DEBUG - close.started
2025-05-30 00:43:46,230 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:43:46,230 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=3 socket_options=None
2025-05-30 00:43:46,230 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:43:46,230 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:43:46,231 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:43:46,231 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:43:46,231 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:43:46,231 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:43:46,237 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:43:46 GMT'), (b'server', b'uvicorn'), (b'content-length', b'105071'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 00:43:46,237 - httpx - INFO - HTTP Request: HEAD http://localhost:7861/ "HTTP/1.1 200 OK"
2025-05-30 00:43:46,237 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:43:46,237 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:43:46,237 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:43:46,237 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:43:46,237 - httpcore.connection - DEBUG - close.started
2025-05-30 00:43:46,237 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:43:46,247 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 00:43:46,317 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:43:46,317 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 00:43:46,384 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:43:46,384 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 00:43:46,394 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 00:43:46,603 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 00:43:46,604 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:43:46,604 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:43:46,605 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:43:46,605 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:43:46,605 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:43:46,659 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 00:43:46,660 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:43:46,660 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:43:46,661 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:43:46,661 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:43:46,661 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:43:46,753 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:43:46 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 00:43:46,753 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 00:43:46,754 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:43:46,754 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:43:46,754 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:43:46,754 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:43:46,754 - httpcore.connection - DEBUG - close.started
2025-05-30 00:43:46,755 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:43:46,800 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:43:46 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 00:43:46,800 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 00:43:46,800 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:43:46,801 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:43:46,801 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:43:46,801 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:43:46,801 - httpcore.connection - DEBUG - close.started
2025-05-30 00:43:46,801 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:43:47,349 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:43:47,349 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:43:47,349 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 00:43:47,349 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 00:43:47,349 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:43:47,349 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:43:47,350 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 00:43:47,350 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:43:47,350 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:43:47,379 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 00:43:47,604 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 00:45:54,947 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 00:45:54,947 - __main__ - DEBUG - API key found, length: 39
2025-05-30 00:45:54,947 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 00:45:54,947 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 00:45:54,947 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 00:45:54,947 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 00:45:54,947 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 00:45:54,947 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 00:45:54,947 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 00:45:54,947 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 00:45:54,950 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 00:45:54,950 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 00:45:55,354 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 00:45:55,354 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 00:45:55,354 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 00:45:55,354 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 00:45:55,354 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 00:45:55,354 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 00:45:55,354 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 00:45:55,354 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 00:45:55,354 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 00:45:55,354 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 00:45:55,354 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 00:45:55,356 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 00:45:55,369 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 00:45:55,374 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 00:45:55,445 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 00:45:55,480 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=None socket_options=None
2025-05-30 00:45:55,480 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:45:55,480 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:45:55,481 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:45:55,481 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:45:55,481 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:45:55,481 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:45:55,481 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:45:55 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 00:45:55,482 - httpx - INFO - HTTP Request: GET http://localhost:7861/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 00:45:55,482 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:45:55,482 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:45:55,482 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:45:55,482 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:45:55,482 - httpcore.connection - DEBUG - close.started
2025-05-30 00:45:55,482 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:45:55,482 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=3 socket_options=None
2025-05-30 00:45:55,483 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:45:55,483 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:45:55,483 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:45:55,483 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:45:55,483 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:45:55,483 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:45:55,489 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:45:55 GMT'), (b'server', b'uvicorn'), (b'content-length', b'103733'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 00:45:55,489 - httpx - INFO - HTTP Request: HEAD http://localhost:7861/ "HTTP/1.1 200 OK"
2025-05-30 00:45:55,489 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:45:55,489 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:45:55,489 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:45:55,489 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:45:55,489 - httpcore.connection - DEBUG - close.started
2025-05-30 00:45:55,489 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:45:55,500 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 00:45:55,532 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:45:55,532 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 00:45:55,637 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:45:55,638 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 00:45:55,646 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 00:45:55,812 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 00:45:55,812 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:45:55,812 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:45:55,812 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:45:55,812 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:45:55,812 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:45:55,912 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 00:45:55,913 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:45:55,913 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:45:55,913 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:45:55,913 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:45:55,913 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:45:55,954 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:45:55 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 00:45:55,954 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 00:45:55,954 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:45:55,954 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:45:55,955 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:45:55,955 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:45:55,955 - httpcore.connection - DEBUG - close.started
2025-05-30 00:45:55,955 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:45:56,053 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:45:55 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 00:45:56,054 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 00:45:56,054 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:45:56,055 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:45:56,055 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:45:56,055 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:45:56,055 - httpcore.connection - DEBUG - close.started
2025-05-30 00:45:56,056 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:45:56,657 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 00:45:56,876 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 00:45:58,569 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:45:58,569 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:45:58,569 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 00:45:58,569 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 00:45:58,569 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:45:58,569 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:45:58,569 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 00:45:58,570 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:45:58,570 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:47:58,681 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 00:47:58,682 - __main__ - DEBUG - API key found, length: 39
2025-05-30 00:47:58,682 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 00:47:58,682 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 00:47:58,682 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 00:47:58,682 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 00:47:58,682 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 00:47:58,682 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 00:47:58,682 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 00:47:58,682 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 00:47:58,685 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 00:47:58,685 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 00:47:59,088 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 00:47:59,088 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 00:47:59,088 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 00:47:59,088 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 00:47:59,088 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 00:47:59,088 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 00:47:59,088 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 00:47:59,088 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 00:47:59,089 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 00:47:59,089 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 00:47:59,089 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 00:47:59,091 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 00:47:59,104 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 00:47:59,109 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 00:47:59,179 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 00:47:59,214 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=None socket_options=None
2025-05-30 00:47:59,215 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:47:59,215 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:47:59,215 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:47:59,215 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:47:59,215 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:47:59,216 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:47:59,216 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:47:59 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 00:47:59,216 - httpx - INFO - HTTP Request: GET http://localhost:7861/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 00:47:59,216 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:47:59,216 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:47:59,216 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:47:59,216 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:47:59,216 - httpcore.connection - DEBUG - close.started
2025-05-30 00:47:59,216 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:47:59,217 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=3 socket_options=None
2025-05-30 00:47:59,217 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:47:59,217 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:47:59,217 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:47:59,217 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:47:59,217 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:47:59,217 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:47:59,223 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:47:59 GMT'), (b'server', b'uvicorn'), (b'content-length', b'103526'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 00:47:59,224 - httpx - INFO - HTTP Request: HEAD http://localhost:7861/ "HTTP/1.1 200 OK"
2025-05-30 00:47:59,224 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:47:59,224 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:47:59,224 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:47:59,224 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:47:59,224 - httpcore.connection - DEBUG - close.started
2025-05-30 00:47:59,224 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:47:59,235 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 00:47:59,375 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 00:47:59,424 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:47:59,424 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 00:47:59,426 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:47:59,426 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 00:47:59,699 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 00:47:59,699 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:47:59,700 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:47:59,700 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:47:59,700 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:47:59,700 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:47:59,705 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 00:47:59,705 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:47:59,706 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:47:59,706 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:47:59,706 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:47:59,707 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:47:59,840 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:47:59 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 00:47:59,841 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 00:47:59,841 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:47:59,842 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:47:59,842 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:47:59,842 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:47:59,843 - httpcore.connection - DEBUG - close.started
2025-05-30 00:47:59,843 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:47:59,854 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:47:59 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 00:47:59,856 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 00:47:59,856 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:47:59,856 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:47:59,856 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:47:59,856 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:47:59,857 - httpcore.connection - DEBUG - close.started
2025-05-30 00:47:59,857 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:48:00,536 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 00:48:00,763 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 00:48:01,404 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:48:01,404 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:48:01,404 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 00:48:01,404 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 00:48:01,404 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:48:01,404 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:48:01,404 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 00:48:01,404 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 00:48:01,404 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 00:49:30,798 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 00:49:30,798 - __main__ - DEBUG - API key found, length: 39
2025-05-30 00:49:30,798 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 00:49:30,798 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 00:49:30,798 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 00:49:30,798 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 00:49:30,798 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 00:49:30,798 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 00:49:30,798 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 00:49:30,798 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 00:49:30,803 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 00:49:30,803 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 00:49:31,302 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 00:49:31,302 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 00:49:31,302 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 00:49:31,302 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 00:49:31,302 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 00:49:31,302 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 00:49:31,302 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 00:49:31,302 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 00:49:31,302 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 00:49:31,302 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 00:49:31,302 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 00:49:31,304 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 00:49:31,318 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 00:49:31,323 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 00:49:31,393 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 00:49:31,429 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=None socket_options=None
2025-05-30 00:49:31,429 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:49:31,429 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:49:31,429 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:49:31,429 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:49:31,430 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:49:31,430 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 00:49:31,430 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:49:31 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 00:49:31,430 - httpx - INFO - HTTP Request: GET http://localhost:7861/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 00:49:31,430 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 00:49:31,430 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 00:49:31,430 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 00:49:31,430 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 00:49:31,430 - httpcore.connection - DEBUG - close.started
2025-05-30 00:49:31,430 - httpcore.connection - DEBUG - close.complete
2025-05-30 00:49:31,431 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=3 socket_options=None
2025-05-30 00:49:31,431 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 00:49:31,431 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 00:49:31,431 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 00:49:31,431 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 00:49:31,431 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 00:49:31,431 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:49:31,437 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:49:31 GMT'), (b'server', b'uvicorn'), (b'content-length', b'102528'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 00:49:31,437 - httpx - INFO - HTTP Request: HEAD http://localhost:7861/ "HTTP/1.1 200 OK" 2025-05-30 00:49:31,438 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:49:31,438 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:49:31,438 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:49:31,438 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:49:31,438 - httpcore.connection - DEBUG - close.started 2025-05-30 00:49:31,438 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:49:31,448 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 00:49:31,483 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:49:31,483 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 00:49:31,588 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:49:31,588 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 00:49:31,648 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 00:49:31,769 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 00:49:31,770 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:49:31,770 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:49:31,771 - httpcore.http11 - DEBUG - 
send_request_body.started request= 2025-05-30 00:49:31,771 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:49:31,771 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:49:31,957 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:49:31 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 00:49:31,958 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 00:49:31,959 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 00:49:31,959 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:49:31,959 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:49:31,960 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:49:31,960 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:49:31,960 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:49:31,960 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:49:31,960 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:49:31,960 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:49:31,961 - httpcore.connection - DEBUG - close.started 2025-05-30 00:49:31,961 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:49:31,961 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:49:32,105 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:49:32 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', 
b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 00:49:32,106 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 00:49:32,106 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:49:32,107 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:49:32,107 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:49:32,107 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:49:32,107 - httpcore.connection - DEBUG - close.started 2025-05-30 00:49:32,108 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:49:32,646 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:49:32,647 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 00:49:32,647 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 00:49:32,647 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 00:49:32,647 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:49:32,647 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 00:49:32,647 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 00:49:32,647 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:49:32,647 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 00:49:32,765 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 00:49:33,005 - 
urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 00:53:47,386 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 00:53:47,386 - __main__ - DEBUG - API key found, length: 39 2025-05-30 00:53:47,386 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 00:53:47,386 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 00:53:47,386 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 00:53:47,386 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 00:53:47,386 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 00:53:47,386 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 00:53:47,386 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 00:53:47,386 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 00:53:47,392 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 00:53:47,392 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 00:53:47,845 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 00:53:47,845 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 00:53:47,845 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 00:53:47,845 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 00:53:47,845 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 00:53:47,845 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 00:53:47,845 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 00:53:47,845 - 
simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 00:53:47,845 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 00:53:47,845 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 00:53:47,845 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 00:53:47,847 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 00:53:47,861 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 00:53:47,861 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 00:53:47,946 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 00:53:47,981 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=None socket_options=None 2025-05-30 00:53:47,982 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:53:47,982 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:53:47,983 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:53:47,985 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:53:47,985 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:53:47,985 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:53:47,985 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:53:47 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 00:53:47,986 - httpx - INFO - HTTP Request: GET http://localhost:7861/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 00:53:47,986 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:53:47,986 - httpcore.http11 - DEBUG - 
receive_response_body.complete 2025-05-30 00:53:47,986 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:53:47,986 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:53:47,986 - httpcore.connection - DEBUG - close.started 2025-05-30 00:53:47,986 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:53:47,986 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=3 socket_options=None 2025-05-30 00:53:47,987 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:53:47,987 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:53:47,987 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:53:47,987 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:53:47,987 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:53:47,988 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:53:47,994 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 15:53:47 GMT'), (b'server', b'uvicorn'), (b'content-length', b'95944'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 00:53:47,994 - httpx - INFO - HTTP Request: HEAD http://localhost:7861/ "HTTP/1.1 200 OK" 2025-05-30 00:53:47,994 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:53:47,994 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:53:47,994 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:53:47,994 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:53:47,994 - httpcore.connection - DEBUG - close.started 2025-05-30 00:53:47,994 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:53:48,005 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 
socket_options=None 2025-05-30 00:53:48,021 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:53:48,022 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 00:53:48,142 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 00:53:48,142 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 00:53:48,149 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 00:53:48,296 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 00:53:48,296 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:53:48,297 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:53:48,297 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:53:48,297 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:53:48,297 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:53:48,420 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 00:53:48,420 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 00:53:48,420 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 00:53:48,420 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 00:53:48,420 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 00:53:48,420 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 00:53:48,434 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:53:48 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 
00:53:48,435 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 00:53:48,435 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:53:48,435 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:53:48,435 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:53:48,435 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:53:48,436 - httpcore.connection - DEBUG - close.started 2025-05-30 00:53:48,436 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:53:48,560 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 15:53:48 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 00:53:48,561 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 00:53:48,562 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 00:53:48,562 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 00:53:48,563 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 00:53:48,563 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 00:53:48,563 - httpcore.connection - DEBUG - close.started 2025-05-30 00:53:48,563 - httpcore.connection - DEBUG - close.complete 2025-05-30 00:53:49,159 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 00:53:49,382 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 00:56:51,723 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:56:51,724 - 
simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 00:56:51,724 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 00:56:51,724 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 00:56:51,725 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:56:51,725 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 00:56:51,725 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 00:56:51,725 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 00:56:51,725 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 01:00:36,891 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 01:00:36,891 - __main__ - DEBUG - API key found, length: 39 2025-05-30 01:00:36,891 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 01:00:36,891 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 01:00:36,891 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 01:00:36,891 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 01:00:36,891 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 01:00:36,891 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 01:00:36,891 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 01:00:36,891 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 01:00:36,895 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 01:00:36,895 - 
hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 01:00:37,384 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 01:00:37,384 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 01:00:37,384 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 01:00:37,384 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 01:00:37,384 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 01:00:37,384 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 01:00:37,384 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 01:00:37,384 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 01:00:37,384 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 01:00:37,384 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 01:00:37,384 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 01:00:37,386 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 01:00:37,399 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 01:00:37,399 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 01:00:37,476 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 01:00:37,512 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=None socket_options=None 2025-05-30 01:00:37,512 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 01:00:37,513 - httpcore.http11 - DEBUG - send_request_headers.started request= 
2025-05-30 01:00:37,513 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 01:00:37,513 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 01:00:37,513 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 01:00:37,513 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 01:00:37,513 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 16:00:37 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 01:00:37,513 - httpx - INFO - HTTP Request: GET http://localhost:7861/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 01:00:37,514 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 01:00:37,514 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 01:00:37,514 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 01:00:37,514 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 01:00:37,514 - httpcore.connection - DEBUG - close.started
2025-05-30 01:00:37,514 - httpcore.connection - DEBUG - close.complete
2025-05-30 01:00:37,514 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7861 local_address=None timeout=3 socket_options=None
2025-05-30 01:00:37,515 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 01:00:37,515 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 01:00:37,515 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 01:00:37,515 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 01:00:37,515 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 01:00:37,515 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 01:00:37,521 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Thu, 29 May 2025 16:00:37 GMT'), (b'server', b'uvicorn'), (b'content-length', b'95123'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 01:00:37,521 - httpx - INFO - HTTP Request: HEAD http://localhost:7861/ "HTTP/1.1 200 OK"
2025-05-30 01:00:37,521 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 01:00:37,521 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 01:00:37,521 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 01:00:37,521 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 01:00:37,521 - httpcore.connection - DEBUG - close.started
2025-05-30 01:00:37,521 - httpcore.connection - DEBUG - close.complete
2025-05-30 01:00:37,532 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 01:00:37,634 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 01:00:37,635 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 01:00:37,672 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 01:00:37,672 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 01:00:37,681 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 01:00:37,911 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 01:00:37,911 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 01:00:37,912 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 01:00:37,912 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 01:00:37,913 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 01:00:37,913 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 01:00:37,953 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 01:00:37,954 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 01:00:37,955 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 01:00:37,955 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 01:00:37,955 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 01:00:37,955 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 01:00:38,052 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 16:00:37 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 01:00:38,053 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 01:00:38,054 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 01:00:38,054 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 01:00:38,055 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 01:00:38,055 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 01:00:38,055 - httpcore.connection - DEBUG - close.started
2025-05-30 01:00:38,056 - httpcore.connection - DEBUG - close.complete
2025-05-30 01:00:38,097 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Thu, 29 May 2025 16:00:38 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 01:00:38,098 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30
01:00:38,098 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 01:00:38,099 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 01:00:38,099 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 01:00:38,099 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 01:00:38,099 - httpcore.connection - DEBUG - close.started 2025-05-30 01:00:38,100 - httpcore.connection - DEBUG - close.complete 2025-05-30 01:00:38,699 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 01:00:38,831 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 01:00:38,831 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 01:00:38,832 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 01:00:38,832 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 01:00:38,832 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 01:00:38,832 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 01:00:38,832 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 01:00:38,832 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 01:00:38,832 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 01:00:38,915 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 01:00:43,208 - simple_memory_calculator - INFO - Getting memory requirements for model: 
black-forest-labs/FLUX.1-schnell 2025-05-30 01:00:43,209 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 01:00:43,209 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 01:00:43,209 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 01:00:43,209 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 01:00:43,209 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 01:00:43,209 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 01:00:43,209 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 01:00:53,193 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 01:00:53,194 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 01:00:53,194 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 16.0GB VRAM 2025-05-30 01:00:53,194 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 01:00:53,194 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 01:00:53,194 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 01:00:53,194 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 01:00:53,194 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 01:00:53,245 - simple_memory_calculator - INFO 
- Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 01:00:53,245 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 01:00:53,245 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 16.0GB VRAM 2025-05-30 01:00:53,245 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 01:00:53,245 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 01:00:53,245 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 01:00:53,245 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 01:00:53,245 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 01:01:20,157 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 01:01:20,157 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 01:01:20,157 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 16.0GB VRAM 2025-05-30 01:01:20,157 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 01:01:20,158 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 01:01:20,158 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 01:01:20,158 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 01:01:20,158 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 
01:01:20,158 - auto_diffusers - INFO - Starting code generation for model: black-forest-labs/FLUX.1-schnell
2025-05-30 01:01:20,158 - auto_diffusers - DEBUG - Parameters: prompt='A cat holding a sign that says hello world...', size=(768, 1360), steps=4
2025-05-30 01:01:20,158 - auto_diffusers - DEBUG - Manual specs: True, Memory analysis provided: True
2025-05-30 01:01:20,158 - auto_diffusers - INFO - Using manual hardware specifications
2025-05-30 01:01:20,158 - auto_diffusers - DEBUG - Manual specs: {'platform': 'Linux', 'architecture': 'manual_input', 'cpu_count': 8, 'python_version': '3.11', 'cuda_available': False, 'mps_available': False, 'torch_version': '2.0+', 'manual_input': True, 'ram_gb': 16, 'user_dtype': None, 'gpu_info': [{'name': 'RTX 5070 Ti', 'memory_mb': 16384}]}
2025-05-30 01:01:20,158 - auto_diffusers - DEBUG - GPU detected with 16.0 GB VRAM
2025-05-30 01:01:20,158 - auto_diffusers - INFO - Selected optimization profile: performance
2025-05-30 01:01:20,158 - auto_diffusers - DEBUG - Creating generation prompt for Gemini API
2025-05-30 01:01:20,158 - auto_diffusers - DEBUG - Prompt length: 7603 characters
2025-05-30 01:01:20,158 - auto_diffusers - INFO - ================================================================================
2025-05-30 01:01:20,158 - auto_diffusers - INFO - PROMPT SENT TO GEMINI API:
2025-05-30 01:01:20,158 - auto_diffusers - INFO - ================================================================================
2025-05-30 01:01:20,159 - auto_diffusers - INFO - You are an expert in optimizing diffusers library code for different hardware configurations. NOTE: This system includes curated optimization knowledge from HuggingFace documentation.

TASK: Generate optimized Python code for running a diffusion model with the following specifications:
- Model: black-forest-labs/FLUX.1-schnell
- Prompt: "A cat holding a sign that says hello world"
- Image size: 768x1360
- Inference steps: 4

HARDWARE SPECIFICATIONS:
- Platform: Linux (manual_input)
- CPU Cores: 8
- CUDA Available: False
- MPS Available: False
- Optimization Profile: performance
- GPU: RTX 5070 Ti (16.0 GB VRAM)

MEMORY ANALYSIS:
- Model Memory Requirements: 36.0 GB (FP16 inference)
- Model Weights Size: 24.0 GB (FP16)
- Memory Recommendation: 🔄 Requires sequential CPU offloading
- Recommended Precision: float16
- Attention Slicing Recommended: True
- VAE Slicing Recommended: True

OPTIMIZATION KNOWLEDGE BASE:

# DIFFUSERS OPTIMIZATION TECHNIQUES

## Memory Optimization Techniques

### 1. Model CPU Offloading
Use `enable_model_cpu_offload()` to move models between GPU and CPU automatically:
```python
pipe.enable_model_cpu_offload()
```
- Saves significant VRAM by keeping only active models on GPU
- Automatic management, no manual intervention needed
- Compatible with all pipelines

### 2. Sequential CPU Offloading
Use `enable_sequential_cpu_offload()` for more aggressive memory saving:
```python
pipe.enable_sequential_cpu_offload()
```
- More memory efficient than model offloading
- Moves models to CPU after each forward pass
- Best for very limited VRAM scenarios

### 3. Attention Slicing
Use `enable_attention_slicing()` to reduce memory during attention computation:
```python
pipe.enable_attention_slicing()
# or specify slice size
pipe.enable_attention_slicing("max")  # maximum slicing
pipe.enable_attention_slicing(1)  # slice_size = 1
```
- Trades compute time for memory
- Most effective for high-resolution images
- Can be combined with other techniques

### 4. VAE Slicing
Use `enable_vae_slicing()` for large batch processing:
```python
pipe.enable_vae_slicing()
```
- Decodes images one at a time instead of all at once
- Essential for batch sizes > 4
- Minimal performance impact on single images

### 5. VAE Tiling
Use `enable_vae_tiling()` for high-resolution image generation:
```python
pipe.enable_vae_tiling()
```
- Enables 4K+ image generation on 8GB VRAM
- Splits images into overlapping tiles
- Automatically disabled for 512x512 or smaller images

### 6. Memory Efficient Attention (xFormers)
Use `enable_xformers_memory_efficient_attention()` if xFormers is installed:
```python
pipe.enable_xformers_memory_efficient_attention()
```
- Significantly reduces memory usage and improves speed
- Requires xformers library installation
- Compatible with most models

## Performance Optimization Techniques

### 1. Half Precision (FP16/BF16)
Use lower precision for better memory and speed:
```python
# FP16 (widely supported)
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
# BF16 (better numerical stability, newer hardware)
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
```
- FP16: Halves memory usage, widely supported
- BF16: Better numerical stability, requires newer GPUs
- Essential for most optimization scenarios

### 2. Torch Compile (PyTorch 2.0+)
Use `torch.compile()` for significant speed improvements:
```python
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
# For some models, compile VAE too:
pipe.vae.decode = torch.compile(pipe.vae.decode, mode="reduce-overhead", fullgraph=True)
```
- 5-50% speed improvement
- Requires PyTorch 2.0+
- First run is slower due to compilation

### 3. Fast Schedulers
Use faster schedulers for fewer steps:
```python
from diffusers import LMSDiscreteScheduler, UniPCMultistepScheduler
# LMS Scheduler (good quality, fast)
pipe.scheduler = LMSDiscreteScheduler.from_config(pipe.scheduler.config)
# UniPC Scheduler (fastest)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
```

## Hardware-Specific Optimizations

### NVIDIA GPU Optimizations
```python
# Enable Tensor Cores
torch.backends.cudnn.benchmark = True
# Optimal data type for NVIDIA
torch_dtype = torch.float16  # or torch.bfloat16 for RTX 30/40 series
```

### Apple Silicon (MPS) Optimizations
```python
# Use MPS device
device = "mps" if torch.backends.mps.is_available() else "cpu"
pipe = pipe.to(device)
# Recommended dtype for Apple Silicon
torch_dtype = torch.bfloat16  # Better than float16 on Apple Silicon
# Attention slicing often helps on MPS
pipe.enable_attention_slicing()
```

### CPU Optimizations
```python
# Use float32 for CPU
torch_dtype = torch.float32
# Enable optimized attention
pipe.enable_attention_slicing()
```

## Model-Specific Guidelines

### FLUX Models
- Do NOT use guidance_scale parameter (not needed for FLUX)
- Use 4-8 inference steps maximum
- BF16 dtype recommended
- Enable attention slicing for memory optimization

### Stable Diffusion XL
- Enable attention slicing for high resolutions
- Use refiner model sparingly to save memory
- Consider VAE tiling for >1024px images

### Stable Diffusion 1.5/2.1
- Very memory efficient base models
- Can often run without optimizations on 8GB+ VRAM
- Enable VAE slicing for batch processing

## Memory Usage Estimation
- FLUX.1: ~24GB for full precision, ~12GB for FP16
- SDXL: ~7GB for FP16, ~14GB for FP32
- SD 1.5: ~2GB for FP16, ~4GB for FP32

## Optimization Combinations by VRAM

### 24GB+ VRAM (High-end)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
pipe = pipe.to("cuda")
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
```

### 12-24GB VRAM (Mid-range)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to("cuda")
pipe.enable_model_cpu_offload()
pipe.enable_xformers_memory_efficient_attention()
```

### 8-12GB VRAM (Entry-level)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing()
pipe.enable_vae_slicing()
pipe.enable_xformers_memory_efficient_attention()
```

### <8GB VRAM (Low-end)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing("max")
pipe.enable_vae_slicing()
pipe.enable_vae_tiling()
```

IMPORTANT: For FLUX.1-schnell models, do NOT include guidance_scale parameter as it's not needed.

Using the OPTIMIZATION KNOWLEDGE BASE above, generate Python code that:
1. **Selects the best optimization techniques** for the specific hardware profile
2. **Applies appropriate memory optimizations** based on available VRAM
3. **Uses optimal data types** for the target hardware:
   - User specified dtype (if provided): Use exactly as specified
   - Apple Silicon (MPS): prefer torch.bfloat16
   - NVIDIA GPUs: prefer torch.float16 or torch.bfloat16
   - CPU only: use torch.float32
4. **Implements hardware-specific optimizations** (CUDA, MPS, CPU)
5. **Follows model-specific guidelines** (e.g., FLUX guidance_scale handling)

IMPORTANT GUIDELINES:
- Reference the OPTIMIZATION KNOWLEDGE BASE to select appropriate techniques
- Include all necessary imports
- Add brief comments explaining optimization choices
- Generate compact, production-ready code
- Inline values where possible for concise code
- Generate ONLY the Python code, no explanations before or after the code block

2025-05-30 01:01:20,159 - auto_diffusers - INFO - ================================================================================
2025-05-30 01:01:20,159 - auto_diffusers - INFO - Sending request to Gemini API
2025-05-30 01:01:43,263 - auto_diffusers - INFO - Successfully received response from Gemini API (no tools used)
2025-05-30 01:01:43,263 - auto_diffusers - DEBUG - Response length: 2591 characters
2025-05-30 09:07:29,183 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 09:07:29,183 - __main__ - DEBUG - API key found, length: 39
2025-05-30 09:07:29,183 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 09:07:29,183 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 09:07:29,183 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 09:07:29,183 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 09:07:29,183 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 09:07:29,183 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 09:07:29,183 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 09:07:29,183 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 09:07:29,187 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 09:07:29,187 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 09:07:29,635 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 09:07:29,635 - 
hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 09:07:29,635 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 09:07:29,635 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 09:07:29,635 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 09:07:29,635 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 09:07:29,635 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 09:07:29,635 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 09:07:29,635 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 09:07:29,635 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 09:07:29,635 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 09:07:29,637 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 09:07:29,650 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 09:07:29,655 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 09:07:29,730 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 09:07:29,765 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 09:07:29,766 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:07:29,766 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:07:29,766 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:07:29,766 - httpcore.http11 - DEBUG - send_request_body.started 
request= 2025-05-30 09:07:29,767 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:07:29,767 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:07:29,767 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:07:29 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 09:07:29,767 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 09:07:29,767 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:07:29,767 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:07:29,767 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:07:29,767 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:07:29,767 - httpcore.connection - DEBUG - close.started 2025-05-30 09:07:29,767 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:07:29,768 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 09:07:29,768 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:07:29,768 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:07:29,768 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:07:29,768 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:07:29,768 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:07:29,768 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:07:29,774 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:07:29 GMT'), (b'server', b'uvicorn'), (b'content-length', b'95123'), (b'content-type', b'text/html; charset=utf-8')]) 
2025-05-30 09:07:29,774 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 09:07:29,774 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:07:29,774 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:07:29,774 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:07:29,774 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:07:29,774 - httpcore.connection - DEBUG - close.started 2025-05-30 09:07:29,775 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:07:29,785 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 09:07:29,931 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 09:07:29,998 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:07:29,999 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 09:07:30,001 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:07:30,001 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 09:07:30,282 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 09:07:30,283 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:07:30,283 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:07:30,283 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:07:30,283 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:07:30,284 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:07:30,292 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 09:07:30,292 - httpcore.http11 - DEBUG - send_request_headers.started 
request= 2025-05-30 09:07:30,292 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:07:30,292 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:07:30,293 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:07:30,293 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:07:30,425 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:07:30 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 09:07:30,426 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 09:07:30,426 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:07:30,427 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:07:30,427 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:07:30,427 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:07:30,427 - httpcore.connection - DEBUG - close.started 2025-05-30 09:07:30,427 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:07:30,441 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:07:30 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 09:07:30,442 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 09:07:30,442 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:07:30,442 - httpcore.http11 - DEBUG - receive_response_body.complete 
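The simple_memory_calculator entries in this log follow a clear pattern: one "Using known memory data" lookup per model, then repeated "Using cached memory data" hits, plus a VRAM-based recommendation ("Requires sequential CPU offloading" when inference needs exceed the GPU). A minimal sketch of that behavior, assuming hypothetical names and structure (the real class internals are not shown in the log):

```python
# Hypothetical reconstruction of the SimpleMemoryCalculator behavior traced in
# this log. Class/method names and the table layout are assumptions; only the
# numeric values come from the "Known data" DEBUG entry above.
KNOWN_MODELS = {
    "black-forest-labs/FLUX.1-schnell": {
        "params_billions": 12.0,
        "fp16_gb": 24.0,           # model weights in FP16
        "inference_fp16_gb": 36.0, # peak memory during FP16 inference
    },
}

class SimpleMemoryCalculator:
    def __init__(self):
        self._cache = {}  # per-instance cache, hit on repeated lookups

    def get_memory_requirements(self, model_id):
        if model_id in self._cache:
            return self._cache[model_id]   # "Using cached memory data"
        data = KNOWN_MODELS[model_id]      # "Using known memory data"
        self._cache[model_id] = data
        return data

    def recommend(self, model_id, vram_gb):
        req = self.get_memory_requirements(model_id)
        # Mirrors the log's recommendation: sequential CPU offloading when
        # inference memory exceeds available VRAM (36.0 GB > 16.0 GB here).
        if req["inference_fp16_gb"] > vram_gb:
            return "sequential_cpu_offload"
        return "full_gpu"

calc = SimpleMemoryCalculator()
print(calc.recommend("black-forest-labs/FLUX.1-schnell", 16.0))  # sequential_cpu_offload
```

This would explain why the log shows only one "known data" line per process but many cached lookups as the UI re-queries the same model.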
2025-05-30 09:07:30,442 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:07:30,442 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:07:30,443 - httpcore.connection - DEBUG - close.started
2025-05-30 09:07:30,443 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:07:31,087 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 09:07:31,309 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 09:07:44,601 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:07:44,601 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:07:44,601 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 09:07:44,601 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 09:07:44,601 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:07:44,601 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:07:44,601 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 09:07:44,601 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:07:44,601 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:11:09,156 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 09:11:09,156 - __main__ - DEBUG - API key found, length: 39
2025-05-30 09:11:09,156 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 09:11:09,156 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 09:11:09,156 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 09:11:09,156 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 09:11:09,156 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 09:11:09,156 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 09:11:09,156 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 09:11:09,156 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 09:11:09,160 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 09:11:09,161 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 09:11:09,617 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 09:11:09,617 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 09:11:09,617 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 09:11:09,617 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 09:11:09,617 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 09:11:09,617 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 09:11:09,617 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 09:11:09,617 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 09:11:09,617 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 09:11:09,617 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 09:11:09,617 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 09:11:09,619 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 09:11:09,632 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 09:11:09,640 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 09:11:09,721 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 09:11:09,757 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 09:11:09,757 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:11:09,757 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:11:09,758 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:11:09,758 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:11:09,758 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:11:09,758 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:11:09,758 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:11:09 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 09:11:09,758 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 09:11:09,759 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:11:09,759 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:11:09,759 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:11:09,759 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:11:09,759 - httpcore.connection - DEBUG - close.started
2025-05-30 09:11:09,759 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:11:09,759 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 09:11:09,760 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:11:09,760 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:11:09,760 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:11:09,760 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:11:09,760 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:11:09,760 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:11:09,766 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:11:09 GMT'), (b'server', b'uvicorn'), (b'content-length', b'95128'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 09:11:09,766 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 09:11:09,766 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:11:09,766 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:11:09,766 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:11:09,766 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:11:09,766 - httpcore.connection - DEBUG - close.started
2025-05-30 09:11:09,766 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:11:09,778 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 09:11:09,975 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 09:11:09,981 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:11:09,981 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 09:11:09,986 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:11:09,986 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 09:11:10,272 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 09:11:10,272 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:11:10,273 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:11:10,273 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:11:10,273 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:11:10,273 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:11:10,284 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 09:11:10,284 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:11:10,284 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:11:10,284 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:11:10,284 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:11:10,284 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:11:10,419 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:11:10 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 09:11:10,419 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 09:11:10,419 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:11:10,419 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:11:10,419 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:11:10,419 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:11:10,419 - httpcore.connection - DEBUG - close.started
2025-05-30 09:11:10,420 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:11:10,435 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:11:10 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 09:11:10,435 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 09:11:10,435 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:11:10,435 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:11:10,435 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:11:10,435 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:11:10,436 - httpcore.connection - DEBUG - close.started
2025-05-30 09:11:10,436 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:11:11,033 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 09:11:11,259 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 09:11:13,467 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:11:13,467 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:11:13,467 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 09:11:13,467 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 09:11:13,467 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:11:13,467 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:11:13,467 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 09:11:13,467 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:11:13,467 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:13:36,094 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 09:13:36,094 - __main__ - DEBUG - API key found, length: 39
2025-05-30 09:13:36,094 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 09:13:36,094 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 09:13:36,094 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 09:13:36,094 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 09:13:36,094 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 09:13:36,094 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 09:13:36,094 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 09:13:36,094 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 09:13:36,098 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 09:13:36,099 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 09:13:36,533 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 09:13:36,533 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 09:13:36,533 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 09:13:36,533 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 09:13:36,533 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 09:13:36,533 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 09:13:36,533 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 09:13:36,533 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 09:13:36,533 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 09:13:36,533 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 09:13:36,533 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 09:13:36,535 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 09:13:36,548 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 09:13:36,554 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 09:13:36,628 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 09:13:36,666 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 09:13:36,667 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:13:36,667 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:13:36,667 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:13:36,667 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:13:36,667 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:13:36,667 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:13:36,667 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:13:36 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 09:13:36,668 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 09:13:36,668 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:13:36,668 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:13:36,668 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:13:36,668 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:13:36,668 - httpcore.connection - DEBUG - close.started
2025-05-30 09:13:36,668 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:13:36,668 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 09:13:36,669 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:13:36,669 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:13:36,669 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:13:36,669 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:13:36,669 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:13:36,669 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:13:36,675 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:13:36 GMT'), (b'server', b'uvicorn'), (b'content-length', b'95123'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 09:13:36,675 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 09:13:36,675 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:13:36,675 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:13:36,675 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:13:36,675 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:13:36,675 - httpcore.connection - DEBUG - close.started
2025-05-30 09:13:36,675 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:13:36,686 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 09:13:36,845 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:13:36,845 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 09:13:36,848 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:13:36,848 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 09:13:36,946 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 09:13:37,121 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 09:13:37,121 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:13:37,122 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:13:37,122 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:13:37,122 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:13:37,122 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:13:37,133 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 09:13:37,134 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:13:37,134 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:13:37,134 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:13:37,134 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:13:37,134 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:13:37,261 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:13:37 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 09:13:37,262 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 09:13:37,262 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:13:37,262 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:13:37,262 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:13:37,262 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:13:37,263 - httpcore.connection - DEBUG - close.started
2025-05-30 09:13:37,263 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:13:37,277 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:13:37 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 09:13:37,277 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 09:13:37,278 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:13:37,278 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:13:37,278 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:13:37,278 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:13:37,278 - httpcore.connection - DEBUG - close.started
2025-05-30 09:13:37,278 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:13:38,071 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 09:13:38,138 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:13:38,138 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:13:38,138 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 09:13:38,138 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 09:13:38,138 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:13:38,138 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:13:38,138 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 09:13:38,138 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:13:38,139 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:13:38,290 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 09:15:53,436 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 09:15:53,436 - __main__ - DEBUG - API key found, length: 39
2025-05-30 09:15:53,436 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 09:15:53,436 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 09:15:53,436 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 09:15:53,436 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 09:15:53,436 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 09:15:53,436 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 09:15:53,436 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 09:15:53,436 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 09:15:53,439 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 09:15:53,439 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 09:15:53,862 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 09:15:53,862 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 09:15:53,862 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 09:15:53,862 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 09:15:53,862 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 09:15:53,862 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 09:15:53,862 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 09:15:53,862 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 09:15:53,862 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 09:15:53,862 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 09:15:53,862 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 09:15:53,864 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 09:15:53,878 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 09:15:53,878 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 09:15:53,959 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 09:15:53,996 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 09:15:53,996 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:15:53,997 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:15:53,997 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:15:53,997 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:15:53,997 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:15:53,997 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:15:53,997 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:15:53 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 09:15:53,998 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 09:15:53,998 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:15:53,998 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:15:53,998 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:15:53,998 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:15:53,998 - httpcore.connection - DEBUG - close.started
2025-05-30 09:15:53,998 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:15:53,998 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 09:15:53,999 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:15:53,999 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:15:53,999 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:15:53,999 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:15:53,999 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:15:53,999 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:15:54,005 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:15:53 GMT'), (b'server', b'uvicorn'), (b'content-length', b'93371'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 09:15:54,005 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 09:15:54,005 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:15:54,005 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:15:54,005 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:15:54,005 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:15:54,005 - httpcore.connection - DEBUG - close.started
2025-05-30 09:15:54,005 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:15:54,017 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 09:15:54,037 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:15:54,037 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 09:15:54,165 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 09:15:54,177 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:15:54,177 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 09:15:54,311 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 09:15:54,311 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:15:54,311 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:15:54,311 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:15:54,311 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:15:54,311 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:15:54,450 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:15:54 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 09:15:54,451 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 09:15:54,451 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:15:54,451 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:15:54,451 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:15:54,451 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:15:54,451 - httpcore.connection - DEBUG - close.started
2025-05-30 09:15:54,451 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:15:54,500 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 09:15:54,500 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:15:54,500 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:15:54,500 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:15:54,500 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:15:54,500 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:15:54,673 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:15:54 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 09:15:54,674 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 09:15:54,675 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:15:54,676 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:15:54,676 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:15:54,676 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:15:54,677 - httpcore.connection - DEBUG - close.started
2025-05-30 09:15:54,677 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:15:55,285 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 09:15:55,504 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 09:16:00,140 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:16:00,140 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:16:00,140 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 09:16:00,140 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 09:16:00,141 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:16:00,141 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:16:00,141 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 09:16:00,141 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:16:00,141 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:16:04,018 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:16:04,018 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:16:04,018 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 09:16:04,018 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:16:04,018 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:16:04,018 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 09:16:04,018 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:16:04,018 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:17:42,173 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 09:17:42,173 - __main__ - DEBUG - API key found, length: 39
2025-05-30 09:17:42,173 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 09:17:42,173 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 09:17:42,173 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 09:17:42,173 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 09:17:42,173 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 09:17:42,173 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 09:17:42,173 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 09:17:42,173 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 09:17:42,177 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 09:17:42,177 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 09:17:42,617 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 09:17:42,617 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 09:17:42,617 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 09:17:42,617 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 09:17:42,617 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 09:17:42,617 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 09:17:42,617 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 09:17:42,618 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 09:17:42,618 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 09:17:42,618 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 09:17:42,618 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 09:17:42,620 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 09:17:42,633 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 09:17:42,639 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 09:17:42,720 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 09:17:42,756 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 09:17:42,756 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:17:42,756 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:17:42,757 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:17:42,757 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:17:42,757 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:17:42,757 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:17:42,757 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:17:42 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 09:17:42,757 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 09:17:42,757 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:17:42,757 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:17:42,757 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:17:42,757 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:17:42,758 - httpcore.connection - DEBUG - close.started
2025-05-30 09:17:42,758 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:17:42,758 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 09:17:42,758 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:17:42,758 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:17:42,759 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:17:42,759 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:17:42,759 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:17:42,759 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:17:42,765 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:17:42 GMT'), (b'server', b'uvicorn'), (b'content-length', b'93382'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 09:17:42,765 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 09:17:42,765 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:17:42,765 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:17:42,765 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:17:42,765 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:17:42,765 - httpcore.connection - DEBUG - close.started
2025-05-30 09:17:42,765 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:17:42,775 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 09:17:42,928 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 09:17:42,973 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:17:42,973 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 09:17:42,974 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:17:42,974 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 09:17:43,245 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 09:17:43,245 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:17:43,245 -
httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:17:43,245 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:17:43,245 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:17:43,246 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:17:43,256 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 09:17:43,256 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:17:43,257 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:17:43,257 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:17:43,257 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:17:43,257 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:17:43,387 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:17:43 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 09:17:43,389 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 09:17:43,389 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:17:43,389 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:17:43,389 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:17:43,389 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:17:43,389 - httpcore.connection - DEBUG - close.started 2025-05-30 09:17:43,389 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:17:43,402 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 
May 2025 00:17:43 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 09:17:43,404 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 09:17:43,404 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:17:43,405 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:17:43,405 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:17:43,405 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:17:43,405 - httpcore.connection - DEBUG - close.started 2025-05-30 09:17:43,405 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:17:44,007 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 09:17:44,224 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 09:17:47,229 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:17:47,230 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:17:47,230 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 09:17:47,230 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 09:17:47,230 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:17:47,230 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:17:47,230 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 09:17:47,230 - simple_memory_calculator - INFO - 
Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:17:47,230 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:18:40,149 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 09:18:40,149 - __main__ - DEBUG - API key found, length: 39 2025-05-30 09:18:40,149 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 09:18:40,149 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 09:18:40,149 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 09:18:40,149 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 09:18:40,149 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 09:18:40,149 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 09:18:40,149 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 09:18:40,149 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 09:18:40,153 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 09:18:40,153 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 09:18:40,577 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 09:18:40,577 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 09:18:40,577 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 09:18:40,577 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 09:18:40,577 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 09:18:40,577 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 09:18:40,577 - 
simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 09:18:40,577 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 09:18:40,577 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 09:18:40,577 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 09:18:40,577 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 09:18:40,580 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 09:18:40,593 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 09:18:40,601 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 09:18:40,683 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 09:18:40,721 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 09:18:40,722 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:18:40,722 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:18:40,722 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:18:40,722 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:18:40,723 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:18:40,723 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:18:40,723 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:18:40 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 09:18:40,723 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 09:18:40,723 - httpcore.http11 - DEBUG - 
receive_response_body.started request= 2025-05-30 09:18:40,723 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:18:40,723 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:18:40,723 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:18:40,723 - httpcore.connection - DEBUG - close.started 2025-05-30 09:18:40,723 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:18:40,724 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 09:18:40,724 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:18:40,724 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:18:40,724 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:18:40,724 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:18:40,725 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:18:40,725 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:18:40,730 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:18:40 GMT'), (b'server', b'uvicorn'), (b'content-length', b'93376'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 09:18:40,731 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 09:18:40,731 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:18:40,731 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:18:40,731 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:18:40,731 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:18:40,731 - httpcore.connection - DEBUG - close.started 2025-05-30 09:18:40,731 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:18:40,735 - httpcore.connection - DEBUG - 
connect_tcp.complete return_value= 2025-05-30 09:18:40,735 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 09:18:40,741 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 09:18:40,867 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 09:18:40,885 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:18:40,885 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 09:18:41,061 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 09:18:41,062 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:18:41,062 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:18:41,063 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:18:41,063 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:18:41,063 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:18:41,177 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 09:18:41,178 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:18:41,178 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:18:41,178 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:18:41,178 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:18:41,179 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:18:41,199 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:18:41 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), 
(b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 09:18:41,200 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 09:18:41,200 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:18:41,200 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:18:41,200 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:18:41,200 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:18:41,200 - httpcore.connection - DEBUG - close.started 2025-05-30 09:18:41,200 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:18:41,324 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:18:41 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 09:18:41,326 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 09:18:41,327 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:18:41,328 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:18:41,328 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:18:41,328 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:18:41,329 - httpcore.connection - DEBUG - close.started 2025-05-30 09:18:41,329 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:18:41,938 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 09:18:42,158 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 09:18:43,365 - simple_memory_calculator - INFO - Getting memory 
requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:18:43,365 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:18:43,365 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 09:18:43,365 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 09:18:43,365 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:18:43,365 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:18:43,365 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 09:18:43,365 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:18:43,366 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:18:45,715 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:18:45,715 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:18:45,715 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 09:18:45,715 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:18:45,716 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:18:45,716 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 09:18:45,716 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 
09:18:45,716 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:21:02,035 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 09:21:02,035 - __main__ - DEBUG - API key found, length: 39 2025-05-30 09:21:02,035 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 09:21:02,035 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 09:21:02,035 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 09:21:02,035 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 09:21:02,035 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 09:21:02,035 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 09:21:02,035 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 09:21:02,035 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 09:21:02,038 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 09:21:02,039 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 09:21:02,480 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 09:21:02,480 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 09:21:02,480 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 09:21:02,480 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 09:21:02,480 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 09:21:02,480 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 09:21:02,480 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 09:21:02,480 - 
simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 09:21:02,480 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 09:21:02,480 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 09:21:02,480 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 09:21:02,482 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 09:21:02,496 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 09:21:02,502 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 09:21:02,579 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 09:21:02,617 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 09:21:02,618 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:21:02,618 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:21:02,618 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:21:02,618 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:21:02,619 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:21:02,619 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:21:02,619 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:21:02 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 09:21:02,619 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 09:21:02,619 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:21:02,619 - httpcore.http11 - DEBUG - 
receive_response_body.complete 2025-05-30 09:21:02,619 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:21:02,619 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:21:02,619 - httpcore.connection - DEBUG - close.started 2025-05-30 09:21:02,619 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:21:02,620 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 09:21:02,620 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:21:02,620 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:21:02,620 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:21:02,620 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:21:02,620 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:21:02,621 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:21:02,627 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:21:02 GMT'), (b'server', b'uvicorn'), (b'content-length', b'92301'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 09:21:02,627 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 09:21:02,627 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:21:02,627 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:21:02,627 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:21:02,627 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:21:02,627 - httpcore.connection - DEBUG - close.started 2025-05-30 09:21:02,627 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:21:02,637 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 
socket_options=None 2025-05-30 09:21:02,678 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:21:02,678 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 09:21:02,783 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:21:02,783 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 09:21:02,826 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 09:21:02,998 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 09:21:02,998 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:21:02,999 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:21:02,999 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:21:02,999 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:21:02,999 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:21:03,077 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 09:21:03,078 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:21:03,078 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:21:03,079 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:21:03,079 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:21:03,079 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:21:03,159 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:21:03 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 
09:21:03,161 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 09:21:03,161 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:21:03,162 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:21:03,162 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:21:03,162 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:21:03,162 - httpcore.connection - DEBUG - close.started 2025-05-30 09:21:03,163 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:21:03,227 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:21:03 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 09:21:03,228 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 09:21:03,228 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:21:03,228 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:21:03,228 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:21:03,228 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:21:03,229 - httpcore.connection - DEBUG - close.started 2025-05-30 09:21:03,229 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:21:03,865 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 09:21:04,081 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 09:21:05,618 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:21:05,618 - 
simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:21:05,618 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 09:21:05,618 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 09:21:05,619 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:21:05,619 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:21:05,619 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 09:21:05,619 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:21:05,619 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:21:07,815 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:21:07,816 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:21:07,816 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 09:21:07,816 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:21:07,816 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:21:07,816 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 09:21:07,816 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:21:07,816 - simple_memory_calculator - DEBUG - Using cached memory data for 
black-forest-labs/FLUX.1-schnell
2025-05-30 09:23:08,080 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 09:23:08,080 - __main__ - DEBUG - API key found, length: 39
2025-05-30 09:23:08,080 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 09:23:08,080 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 09:23:08,080 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 09:23:08,080 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 09:23:08,080 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 09:23:08,080 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 09:23:08,080 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 09:23:08,080 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 09:23:08,084 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 09:23:08,084 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 09:23:08,503 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 09:23:08,503 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 09:23:08,503 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 09:23:08,503 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 09:23:08,503 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 09:23:08,503 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 09:23:08,503 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 09:23:08,503 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 09:23:08,503 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 09:23:08,503 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 09:23:08,503 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 09:23:08,505 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 09:23:08,518 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 09:23:08,524 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 09:23:08,598 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 09:23:08,637 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 09:23:08,638 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:23:08,638 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:23:08,639 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:23:08,639 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:23:08,639 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:23:08,639 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:23:08,639 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:23:08 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 09:23:08,639 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 09:23:08,639 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:23:08,639 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:23:08,639 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:23:08,639 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:23:08,640 - httpcore.connection - DEBUG - close.started
2025-05-30 09:23:08,640 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:23:08,640 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 09:23:08,640 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:23:08,640 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:23:08,641 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:23:08,641 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:23:08,641 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:23:08,641 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:23:08,647 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:23:08 GMT'), (b'server', b'uvicorn'), (b'content-length', b'91293'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 09:23:08,647 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 09:23:08,647 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:23:08,647 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:23:08,647 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:23:08,647 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:23:08,647 - httpcore.connection - DEBUG - close.started
2025-05-30 09:23:08,647 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:23:08,658 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 09:23:08,687 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:23:08,687 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 09:23:08,793 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:23:08,793 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 09:23:08,797 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 09:23:08,979 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 09:23:08,980 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:23:08,980 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:23:08,980 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:23:08,980 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:23:08,981 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:23:09,066 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 09:23:09,066 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:23:09,067 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:23:09,067 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:23:09,067 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:23:09,067 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:23:09,130 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:23:09 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 09:23:09,131 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 09:23:09,131 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:23:09,132 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:23:09,132 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:23:09,132 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:23:09,132 - httpcore.connection - DEBUG - close.started
2025-05-30 09:23:09,133 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:23:09,244 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:23:09 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 09:23:09,245 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 09:23:09,245 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:23:09,246 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:23:09,246 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:23:09,246 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:23:09,246 - httpcore.connection - DEBUG - close.started
2025-05-30 09:23:09,247 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:23:09,841 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 09:23:10,058 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 09:23:10,267 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:23:10,267 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
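The memory figures this calculator logs for black-forest-labs/FLUX.1-schnell (12.0 B parameters, 24.0 GB fp16 weights, 36.0 GB inference peak) are consistent with a simple rule of thumb: 2 bytes per parameter in fp16, and roughly 1.5x the weight footprint at inference time. A minimal sketch of that estimate; the 1.5x overhead factor and the function name are assumptions, not the calculator's actual internals:

```python
def estimate_memory_gb(params_billions: float,
                       bytes_per_param: float = 2.0,      # fp16 = 2 bytes/param
                       inference_overhead: float = 1.5):  # assumed peak factor
    """Rough fp16 memory estimate in decimal GB (hypothetical helper).

    Weights: params * 2 bytes; 1e9 params * 2 bytes = 2 GB per billion params.
    Inference peak: weights * 1.5 (activations, text encoders, etc.).
    """
    model_gb = params_billions * bytes_per_param
    return model_gb, model_gb * inference_overhead

# FLUX.1-schnell as logged: 12.0 B params -> 24.0 GB model, 36.0 GB inference
model_gb, inference_gb = estimate_memory_gb(12.0)
```

With only 8.0 GB of VRAM, as in the "Generating memory recommendations ... with 8.0GB VRAM" records, both figures exceed available memory, which is why offloading or quantization recommendations are generated at all.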
2025-05-30 09:23:10,267 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 09:23:10,267 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 09:23:10,267 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:23:10,267 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:23:10,268 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 09:23:10,268 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:23:10,268 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:23:12,394 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:23:12,394 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:23:12,394 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 09:23:12,394 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:23:12,394 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:23:12,394 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 09:23:12,394 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:23:12,395 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:24:16,687 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:24:16,687 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:24:16,687 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 09:24:16,687 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:24:16,687 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:24:16,688 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 09:24:16,688 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:24:16,688 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:26:11,528 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 09:26:11,528 - __main__ - DEBUG - API key found, length: 39
2025-05-30 09:26:11,528 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 09:26:11,528 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 09:26:11,528 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 09:26:11,528 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 09:26:11,528 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 09:26:11,528 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 09:26:11,528 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 09:26:11,528 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 09:26:11,532 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 09:26:11,532 - hardware_detector - DEBUG - Checking PyTorch availability
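The hardware-detection sequence in these records (platform and architecture, CPU count, Python version, then a guarded PyTorch probe for CUDA and MPS) can be sketched with standard-library calls plus an optional `torch` import. This is a sketch of the logged behavior, not the actual `HardwareDetector` source; the function name is an assumption:

```python
import os
import platform

def detect_hardware():
    """Collect the fields seen in the 'Detected specs' log records (sketch)."""
    specs = {
        "platform": platform.system(),            # e.g. 'Darwin'
        "architecture": platform.machine(),       # e.g. 'arm64'
        "cpu_count": os.cpu_count(),              # e.g. 16
        "python_version": platform.python_version(),
        "gpu_info": None,
        "cuda_available": False,
        "mps_available": False,
        "torch_version": None,
    }
    try:
        # Optional dependency: skipped cleanly when PyTorch is not installed.
        import torch
        specs["torch_version"] = torch.__version__
        specs["cuda_available"] = torch.cuda.is_available()
        specs["mps_available"] = torch.backends.mps.is_available()
    except ImportError:
        pass
    return specs
```

On the Apple-silicon machine in this log, such a probe reports CUDA unavailable and MPS available, matching the `'cuda_available': False, 'mps_available': True` specs.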
2025-05-30 09:26:11,962 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 09:26:11,962 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 09:26:11,962 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 09:26:11,962 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 09:26:11,962 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 09:26:11,962 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 09:26:11,962 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 09:26:11,962 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 09:26:11,962 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 09:26:11,962 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 09:26:11,962 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 09:26:11,964 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 09:26:11,978 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 09:26:11,978 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 09:26:12,060 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 09:26:12,090 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 09:26:12,091 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:26:12,091 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:26:12,091 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:26:12,091 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:26:12,091 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:26:12,092 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:26:12,092 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:26:12 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 09:26:12,092 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 09:26:12,092 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:26:12,092 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:26:12,092 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:26:12,092 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:26:12,092 - httpcore.connection - DEBUG - close.started
2025-05-30 09:26:12,092 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:26:12,093 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 09:26:12,095 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:26:12,095 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:26:12,095 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:26:12,095 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:26:12,095 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:26:12,095 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:26:12,101 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:26:12 GMT'), (b'server', b'uvicorn'), (b'content-length', b'89315'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 09:26:12,101 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 09:26:12,101 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:26:12,101 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:26:12,101 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:26:12,101 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:26:12,101 - httpcore.connection - DEBUG - close.started
2025-05-30 09:26:12,102 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:26:12,112 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 09:26:12,204 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:26:12,204 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 09:26:12,258 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:26:12,258 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 09:26:12,263 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 09:26:12,546 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 09:26:12,546 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:26:12,546 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:26:12,546 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:26:12,546 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:26:12,546 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:26:12,550 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 09:26:12,550 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:26:12,550 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:26:12,550 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:26:12,551 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:26:12,551 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:26:12,698 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:26:12 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 09:26:12,700 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 09:26:12,700 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:26:12,701 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:26:12,702 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:26:12,703 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:26:12,705 - httpcore.connection - DEBUG - close.started
2025-05-30 09:26:12,706 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:26:12,724 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:26:12 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 09:26:12,725 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 09:26:12,726 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:26:12,726 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:26:12,726 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:26:12,726 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:26:12,726 - httpcore.connection - DEBUG - close.started
2025-05-30 09:26:12,726 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:26:13,321 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 09:26:13,519 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:26:13,519 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:26:13,519 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 09:26:13,520 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 09:26:13,520 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:26:13,520 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:26:13,520 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 09:26:13,520 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:26:13,520 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:26:13,550 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 09:29:17,309 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 09:29:17,309 - __main__ - DEBUG - API key found, length: 39
2025-05-30 09:29:17,309 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 09:29:17,309 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 09:29:17,309 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 09:29:17,309 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 09:29:17,309 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 09:29:17,309 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 09:29:17,309 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 09:29:17,309 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 09:29:17,312 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 09:29:17,312 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 09:29:17,727 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 09:29:17,727 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 09:29:17,727 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 09:29:17,727 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 09:29:17,727 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 09:29:17,727 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 09:29:17,727 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 09:29:17,727 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 09:29:17,727 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 09:29:17,727 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 09:29:17,727 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 09:29:17,729 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 09:29:17,742 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 09:29:17,747 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 09:29:17,819 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 09:29:17,854 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 09:29:17,854 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:29:17,854 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:29:17,854 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:29:17,855 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:29:17,855 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:29:17,855 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:29:17,855 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:29:17 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 09:29:17,855 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 09:29:17,855 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:29:17,855 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:29:17,855 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:29:17,855 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:29:17,856 - httpcore.connection - DEBUG - close.started
2025-05-30 09:29:17,856 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:29:17,856 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 09:29:17,856 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:29:17,856 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:29:17,857 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:29:17,857 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:29:17,857 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:29:17,857 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:29:17,862 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:29:17 GMT'), (b'server', b'uvicorn'), (b'content-length', b'89336'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 09:29:17,862 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 09:29:17,862 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:29:17,862 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:29:17,863 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:29:17,863 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:29:17,863 - httpcore.connection - DEBUG - close.started
2025-05-30 09:29:17,863 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:29:17,873 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 09:29:17,952 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:29:17,952 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 09:29:18,019 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:29:18,019 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 09:29:18,028 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 09:29:18,249 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 09:29:18,249 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:29:18,249 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:29:18,249 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:29:18,249 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:29:18,249 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:29:18,310 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 09:29:18,310 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:29:18,310 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:29:18,311 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:29:18,311 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:29:18,311 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:29:18,397 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:29:18 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 09:29:18,398 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 09:29:18,398 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:29:18,398 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:29:18,398 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:29:18,399 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:29:18,399 - httpcore.connection - DEBUG - close.started
2025-05-30 09:29:18,399 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:29:18,459 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:29:18 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 09:29:18,459 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 09:29:18,459 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:29:18,459 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:29:18,459 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:29:18,459 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:29:18,460 - httpcore.connection - DEBUG - close.started
2025-05-30 09:29:18,460 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:29:19,063 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 09:29:19,292 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 09:29:19,583 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:29:19,583 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:29:19,584 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 09:29:19,584 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 09:29:19,584 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:29:19,584 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:29:19,584 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 09:29:19,584 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:29:19,584 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:30:37,148 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 09:30:37,148 - __main__ - DEBUG - API key found, length: 39
2025-05-30 09:30:37,148 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 09:30:37,148 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 09:30:37,148 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 09:30:37,148 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 09:30:37,148 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 09:30:37,148 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 09:30:37,148 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 09:30:37,148 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 09:30:37,152 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 09:30:37,152 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 09:30:37,581 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 09:30:37,581 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 09:30:37,581 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 09:30:37,581 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 09:30:37,581 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 09:30:37,581 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 09:30:37,581 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 09:30:37,581 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 09:30:37,581 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 09:30:37,581 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 09:30:37,581 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 09:30:37,584 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 09:30:37,597 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 09:30:37,603 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 09:30:37,676 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 09:30:37,716 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 09:30:37,716 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:30:37,716 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:30:37,716 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:30:37,717 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:30:37,717 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:30:37,717 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:30:37,717 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:30:37 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 09:30:37,717 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 09:30:37,717 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:30:37,717 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:30:37,717 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:30:37,717 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:30:37,718 - httpcore.connection - DEBUG - close.started
2025-05-30 09:30:37,718 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:30:37,718 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 09:30:37,719 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:30:37,719 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:30:37,719 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:30:37,719 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:30:37,719 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:30:37,719 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:30:37,726 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:30:37 GMT'), (b'server', b'uvicorn'), (b'content-length', b'90180'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 09:30:37,726 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 09:30:37,726 - httpcore.http11 -
DEBUG - receive_response_body.started request= 2025-05-30 09:30:37,726 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:30:37,726 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:30:37,726 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:30:37,727 - httpcore.connection - DEBUG - close.started 2025-05-30 09:30:37,727 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:30:37,737 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 09:30:37,755 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:30:37,755 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 09:30:37,878 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 09:30:37,883 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:30:37,883 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 09:30:38,027 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 09:30:38,027 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:30:38,027 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:30:38,028 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:30:38,028 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:30:38,028 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:30:38,229 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:30:38 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), 
(b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 09:30:38,229 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 09:30:38,231 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 09:30:38,232 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:30:38,232 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:30:38,233 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:30:38,233 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:30:38,233 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:30:38,233 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:30:38,233 - httpcore.connection - DEBUG - close.started 2025-05-30 09:30:38,233 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:30:38,234 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:30:38,234 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:30:38,234 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:30:38,378 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:30:38,378 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:30:38,379 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 09:30:38,379 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 09:30:38,379 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:30:38,379 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:30:38,379 - simple_memory_calculator - DEBUG 
- Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 09:30:38,379 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:30:38,379 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:30:38,380 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:30:38 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 09:30:38,380 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 09:30:38,380 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:30:38,380 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:30:38,380 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:30:38,380 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:30:38,380 - httpcore.connection - DEBUG - close.started 2025-05-30 09:30:38,381 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:30:38,978 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 09:30:39,192 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 09:32:39,242 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 09:32:39,242 - __main__ - DEBUG - API key found, length: 39 2025-05-30 09:32:39,242 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 09:32:39,242 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 09:32:39,242 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 
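The memory figures logged above (12.0B params, 24.0 GB fp16, 36.0 GB inference, checked against 8.0 GB VRAM) follow a simple arithmetic pattern. The sketch below reproduces that arithmetic under stated assumptions: fp16 weights at 2 bytes per parameter, and an inference footprint of roughly 1.5x the weights. The function and dict names are hypothetical, not the actual SimpleMemoryCalculator API.

```python
# Hedged sketch of the relationship between the logged numbers.
# Assumption: fp16 = 2 bytes/param; inference ~ 1.5x the weight footprint.
# Names below are illustrative, not the real SimpleMemoryCalculator internals.
KNOWN_MODELS = {
    "black-forest-labs/FLUX.1-schnell": {"params_billions": 12.0},
}

def estimate_fp16_gb(params_billions: float) -> float:
    """2 bytes per parameter: 12.0B params -> 24.0 GB, as in the log."""
    return params_billions * 2.0

def estimate_inference_fp16_gb(params_billions: float) -> float:
    """~1.5x the weights (activations, encoders, VAE): 24.0 -> 36.0 GB."""
    return estimate_fp16_gb(params_billions) * 1.5

def fits_in_vram(model_id: str, vram_gb: float) -> bool:
    """Compare the inference estimate against available VRAM."""
    params = KNOWN_MODELS[model_id]["params_billions"]
    return estimate_inference_fp16_gb(params) <= vram_gb
```

With the logged 8.0 GB VRAM, the 36.0 GB inference estimate does not fit, which is presumably why the calculator goes on to generate offloading/quantization recommendations.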
2025-05-30 09:32:39,242 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 09:32:39,242 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 09:32:39,242 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 09:32:39,242 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 09:32:39,242 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 09:32:39,245 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 09:32:39,246 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 09:32:39,674 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 09:32:39,674 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 09:32:39,674 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 09:32:39,674 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 09:32:39,674 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 09:32:39,674 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 09:32:39,674 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 09:32:39,674 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 09:32:39,674 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 09:32:39,674 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 09:32:39,674 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 09:32:39,676 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 09:32:39,690 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 
local_address=None timeout=3 socket_options=None 2025-05-30 09:32:39,694 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 09:32:39,767 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 09:32:39,806 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 09:32:39,806 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:32:39,807 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:32:39,807 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:32:39,807 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:32:39,807 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:32:39,807 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:32:39,807 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:32:39 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 09:32:39,807 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 09:32:39,808 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:32:39,808 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:32:39,808 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:32:39,808 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:32:39,808 - httpcore.connection - DEBUG - close.started 2025-05-30 09:32:39,808 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:32:39,808 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 09:32:39,809 - httpcore.connection - DEBUG - 
connect_tcp.complete return_value= 2025-05-30 09:32:39,809 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:32:39,809 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:32:39,809 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:32:39,809 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:32:39,809 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:32:39,814 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:32:39 GMT'), (b'server', b'uvicorn'), (b'content-length', b'90468'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 09:32:39,815 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 09:32:39,815 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:32:39,815 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:32:39,815 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:32:39,815 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:32:39,815 - httpcore.connection - DEBUG - close.started 2025-05-30 09:32:39,815 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:32:39,825 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 09:32:39,928 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:32:39,928 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 09:32:39,968 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:32:39,968 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 09:32:39,981 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 
"HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 09:32:40,204 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 09:32:40,204 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:32:40,204 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:32:40,205 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:32:40,205 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:32:40,205 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:32:40,261 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 09:32:40,261 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:32:40,261 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:32:40,261 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:32:40,262 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:32:40,262 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:32:40,342 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:32:40 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 09:32:40,343 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 09:32:40,343 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:32:40,343 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:32:40,343 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:32:40,343 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:32:40,343 - httpcore.connection - DEBUG - close.started 2025-05-30 09:32:40,343 - httpcore.connection - 
DEBUG - close.complete 2025-05-30 09:32:40,407 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:32:40 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 09:32:40,407 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 09:32:40,408 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:32:40,408 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:32:40,408 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:32:40,409 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:32:40,409 - httpcore.connection - DEBUG - close.started 2025-05-30 09:32:40,409 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:32:41,061 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 09:32:41,280 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 09:32:47,262 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:32:47,262 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:32:47,262 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 09:32:47,263 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 09:32:47,263 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:32:47,263 - 
simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:32:47,263 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 09:32:47,263 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:32:47,263 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:33:51,807 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 09:33:51,807 - __main__ - DEBUG - API key found, length: 39 2025-05-30 09:33:51,807 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 09:33:51,807 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 09:33:51,807 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 09:33:51,807 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 09:33:51,807 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 09:33:51,807 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 09:33:51,807 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 09:33:51,807 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 09:33:51,811 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 09:33:51,812 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 09:33:52,239 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 09:33:52,239 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 09:33:52,239 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 09:33:52,239 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': 
True, 'torch_version': '2.7.0'} 2025-05-30 09:33:52,239 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 09:33:52,239 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 09:33:52,239 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 09:33:52,239 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 09:33:52,239 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 09:33:52,239 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 09:33:52,239 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 09:33:52,241 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 09:33:52,254 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 09:33:52,260 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 09:33:52,334 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 09:33:52,372 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 09:33:52,372 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:33:52,372 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:33:52,373 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:33:52,373 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:33:52,373 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:33:52,373 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:33:52,373 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:33:52 GMT'), (b'server', b'uvicorn'), 
(b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 09:33:52,373 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 09:33:52,373 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:33:52,373 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:33:52,374 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:33:52,374 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:33:52,374 - httpcore.connection - DEBUG - close.started 2025-05-30 09:33:52,374 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:33:52,374 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 09:33:52,374 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:33:52,375 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:33:52,375 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:33:52,375 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:33:52,375 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:33:52,375 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:33:52,380 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:33:52 GMT'), (b'server', b'uvicorn'), (b'content-length', b'90684'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 09:33:52,380 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 09:33:52,380 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:33:52,381 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:33:52,381 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:33:52,381 - 
httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:33:52,381 - httpcore.connection - DEBUG - close.started 2025-05-30 09:33:52,381 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:33:52,391 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 09:33:52,403 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:33:52,403 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 09:33:52,515 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 09:33:52,532 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:33:52,532 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 09:33:52,689 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 09:33:52,690 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:33:52,691 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:33:52,691 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:33:52,691 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:33:52,691 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:33:52,816 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 09:33:52,817 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:33:52,818 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:33:52,818 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:33:52,818 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:33:52,818 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:33:52,835 - 
httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:33:52 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 09:33:52,835 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 09:33:52,835 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:33:52,835 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:33:52,835 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:33:52,835 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:33:52,836 - httpcore.connection - DEBUG - close.started 2025-05-30 09:33:52,836 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:33:52,962 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:33:52 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 09:33:52,963 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 09:33:52,963 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:33:52,963 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:33:52,963 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:33:52,963 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:33:52,963 - httpcore.connection - DEBUG - close.started 2025-05-30 09:33:52,963 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:33:53,568 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection 
(1): huggingface.co:443
2025-05-30 09:33:53,692 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:33:53,692 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:33:53,692 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 09:33:53,692 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 09:33:53,692 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:33:53,693 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:33:53,693 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 09:33:53,693 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:33:53,693 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:33:53,974 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 09:35:18,862 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 09:35:18,862 - __main__ - DEBUG - API key found, length: 39
2025-05-30 09:35:18,862 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 09:35:18,862 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 09:35:18,862 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 09:35:18,863 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 09:35:18,863 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 09:35:18,863 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 09:35:18,863 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 09:35:18,863 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 09:35:18,867 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 09:35:18,867 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 09:35:19,363 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 09:35:19,363 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 09:35:19,363 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 09:35:19,363 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 09:35:19,363 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 09:35:19,363 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 09:35:19,363 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 09:35:19,363 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 09:35:19,363 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 09:35:19,363 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 09:35:19,363 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 09:35:19,366 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 09:35:19,379 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 09:35:19,386 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 09:35:19,461 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 09:35:19,494 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 09:35:19,495 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:35:19,495 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:35:19,495 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:35:19,495 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:35:19,495 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:35:19,495 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:35:19,496 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:35:19 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 09:35:19,496 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 09:35:19,496 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:35:19,496 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:35:19,496 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:35:19,496 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:35:19,496 - httpcore.connection - DEBUG - close.started
2025-05-30 09:35:19,496 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:35:19,496 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 09:35:19,497 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:35:19,497 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:35:19,497 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:35:19,497 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:35:19,497 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:35:19,497 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:35:19,503 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:35:19 GMT'), (b'server', b'uvicorn'), (b'content-length', b'90526'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 09:35:19,503 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 09:35:19,503 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:35:19,503 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:35:19,503 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:35:19,503 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:35:19,503 - httpcore.connection - DEBUG - close.started
2025-05-30 09:35:19,503 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:35:19,514 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 09:35:19,639 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:35:19,639 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 09:35:19,661 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:35:19,661 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 09:35:19,694 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 09:35:19,915 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 09:35:19,915 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:35:19,915 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:35:19,915 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:35:19,915 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:35:19,915 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:35:19,956 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 09:35:19,957 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:35:19,957 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:35:19,958 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:35:19,958 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:35:19,958 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:35:20,053 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:35:20 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 09:35:20,054 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 09:35:20,054 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:35:20,054 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:35:20,054 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:35:20,054 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:35:20,055 - httpcore.connection - DEBUG - close.started
2025-05-30 09:35:20,055 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:35:20,105 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:35:20 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 09:35:20,106 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 09:35:20,106 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:35:20,106 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:35:20,106 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:35:20,107 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:35:20,107 - httpcore.connection - DEBUG - close.started
2025-05-30 09:35:20,107 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:35:20,659 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:35:20,660 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:35:20,660 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 09:35:20,660 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 09:35:20,660 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:35:20,660 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:35:20,660 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 09:35:20,660 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:35:20,660 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:35:20,743 -
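The "Known data" and "Model memory" entries above encode a simple sizing rule that is consistent throughout the log: fp16 weights take 2 bytes per parameter (12.0B params -> 24.0 GB), and peak inference memory is about 1.5x the weight size (24.0 GB -> 36.0 GB). A minimal sketch of that arithmetic, with the 8.0GB-VRAM recommendation case from the log; the function names here are hypothetical, not taken from the actual SimpleMemoryCalculator module:

```python
def estimate_memory_gb(params_billions: float,
                       bytes_per_param: float = 2.0,      # fp16 = 2 bytes/param
                       inference_overhead: float = 1.5):  # activations, temp buffers
    """Rough estimate matching the logged FLUX.1-schnell figures (hypothetical helper)."""
    model_gb = params_billions * bytes_per_param      # 12.0 B params -> 24.0 GB
    inference_gb = model_gb * inference_overhead      # 24.0 GB -> 36.0 GB
    return model_gb, inference_gb


def recommend(inference_gb: float, vram_gb: float) -> str:
    """Toy recommendation rule for the '8.0GB VRAM' case in the log."""
    if vram_gb >= inference_gb:
        return "fits in VRAM"
    # e.g. 36.0 GB needed vs 8.0 GB available
    return "enable CPU offload or quantization"
```

With these assumptions, `estimate_memory_gb(12.0)` reproduces the `(24.0, 36.0)` pair the calculator logs for FLUX.1-schnell.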
urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 09:35:20,960 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 09:40:31,645 - __main__ - INFO - Initializing
GradioAutodiffusers
2025-05-30 09:40:31,645 - __main__ - DEBUG - API key found, length: 39
2025-05-30 09:40:31,645 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 09:40:31,645 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 09:40:31,645 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 09:40:31,645 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 09:40:31,645 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 09:40:31,645 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 09:40:31,645 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 09:40:31,645 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 09:40:31,649 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 09:40:31,649 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 09:40:32,097 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 09:40:32,097 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 09:40:32,097 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 09:40:32,097 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 09:40:32,097 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 09:40:32,097 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 09:40:32,097 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 09:40:32,097 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 09:40:32,097 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 09:40:32,097 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 09:40:32,097 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 09:40:32,099 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 09:40:32,106 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 09:40:32,112 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 09:40:32,195 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 09:40:32,231 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 09:40:32,231 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:40:32,231 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:40:32,232 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:40:32,232 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:40:32,232 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:40:32,232 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:40:32,232 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:40:32 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 09:40:32,232 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 09:40:32,232 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:40:32,232 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:40:32,232 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:40:32,232 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:40:32,233 - httpcore.connection - DEBUG - close.started
2025-05-30 09:40:32,233 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:40:32,233 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 09:40:32,233 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:40:32,233 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:40:32,233 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:40:32,233 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:40:32,234 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:40:32,234 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:40:32,240 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:40:32 GMT'), (b'server', b'uvicorn'), (b'content-length', b'92502'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 09:40:32,240 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 09:40:32,240 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:40:32,240 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:40:32,240 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:40:32,240 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:40:32,240 - httpcore.connection - DEBUG - close.started
2025-05-30 09:40:32,240 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:40:32,251 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 09:40:32,335 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:40:32,335 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 09:40:32,388 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:40:32,388 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 09:40:32,389 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 09:40:32,637 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 09:40:32,638 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:40:32,638 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:40:32,638 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:40:32,638 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:40:32,638 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:40:32,664 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 09:40:32,664 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:40:32,665 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:40:32,665 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:40:32,665 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:40:32,665 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:40:32,792 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:40:32 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 09:40:32,792 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 09:40:32,793 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:40:32,793 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:40:32,793 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:40:32,793 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:40:32,793 - httpcore.connection - DEBUG - close.started 2025-05-30 09:40:32,794 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:40:32,804 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:40:32 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 09:40:32,805 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 09:40:32,805 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:40:32,805 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:40:32,805 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:40:32,805 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:40:32,806 - httpcore.connection - DEBUG - close.started 2025-05-30 09:40:32,806 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:40:32,946 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:40:32,947 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:40:32,947 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 09:40:32,947 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 09:40:32,947 - simple_memory_calculator - INFO - 
Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:40:32,947 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:40:32,947 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 09:40:32,947 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:40:32,947 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:40:33,598 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 09:40:33,881 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 09:41:21,842 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 09:41:21,842 - __main__ - DEBUG - API key found, length: 39 2025-05-30 09:41:21,842 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 09:41:21,842 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 09:41:21,842 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 09:41:21,842 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 09:41:21,842 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 09:41:21,842 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 09:41:21,842 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 09:41:21,842 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 09:41:21,846 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 09:41:21,846 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 09:41:22,316 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 09:41:22,316 - hardware_detector - DEBUG - CUDA 
available: False, MPS available: True 2025-05-30 09:41:22,316 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 09:41:22,316 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 09:41:22,316 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 09:41:22,316 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 09:41:22,316 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 09:41:22,316 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 09:41:22,316 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 09:41:22,316 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 09:41:22,316 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 09:41:22,318 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 09:41:22,331 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 09:41:22,338 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 09:41:22,428 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 09:41:22,460 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 09:41:22,460 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:41:22,460 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:41:22,461 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:41:22,461 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:41:22,461 - 
httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:41:22,461 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:41:22,461 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:41:22 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 09:41:22,461 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 09:41:22,461 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:41:22,461 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:41:22,461 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:41:22,461 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:41:22,461 - httpcore.connection - DEBUG - close.started 2025-05-30 09:41:22,461 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:41:22,462 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 09:41:22,462 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:41:22,462 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:41:22,462 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:41:22,462 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:41:22,462 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:41:22,462 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:41:22,468 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:41:22 GMT'), (b'server', b'uvicorn'), (b'content-length', b'92492'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 09:41:22,468 - httpx - INFO - 
HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 09:41:22,468 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:41:22,468 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:41:22,468 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:41:22,469 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:41:22,469 - httpcore.connection - DEBUG - close.started 2025-05-30 09:41:22,469 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:41:22,473 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:41:22,473 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 09:41:22,480 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 09:41:22,600 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 09:41:22,615 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:41:22,615 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 09:41:22,746 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 09:41:22,747 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:41:22,748 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:41:22,748 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:41:22,748 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:41:22,748 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:41:22,883 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:41:22 GMT'), (b'Content-Type', b'application/json'), 
(b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 09:41:22,883 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 09:41:22,884 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:41:22,884 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:41:22,884 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:41:22,884 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:41:22,884 - httpcore.connection - DEBUG - close.started 2025-05-30 09:41:22,884 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:41:22,895 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 09:41:22,896 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:41:22,896 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:41:22,896 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:41:22,896 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:41:22,896 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:41:23,034 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:41:23 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 09:41:23,034 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 09:41:23,034 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:41:23,035 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:41:23,035 - httpcore.http11 - 
DEBUG - response_closed.started 2025-05-30 09:41:23,035 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:41:23,035 - httpcore.connection - DEBUG - close.started 2025-05-30 09:41:23,036 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:41:23,324 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:41:23,324 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:41:23,324 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 09:41:23,325 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 09:41:23,325 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:41:23,325 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:41:23,325 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 09:41:23,325 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:41:23,325 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:41:23,626 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 09:41:23,847 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 09:44:34,217 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 09:44:34,217 - __main__ - DEBUG - API key found, length: 39 2025-05-30 09:44:34,217 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 09:44:34,217 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 09:44:34,217 - 
auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 09:44:34,217 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 09:44:34,217 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 09:44:34,217 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 09:44:34,217 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 09:44:34,217 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 09:44:34,221 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 09:44:34,221 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 09:44:34,732 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 09:44:34,732 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 09:44:34,732 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 09:44:34,732 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 09:44:34,732 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 09:44:34,732 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 09:44:34,732 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 09:44:34,732 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 09:44:34,732 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 09:44:34,732 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 09:44:34,732 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 09:44:34,735 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 09:44:34,748 
- httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 09:44:34,755 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 09:44:34,833 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 09:44:34,866 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 09:44:34,866 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:44:34,866 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:44:34,866 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:44:34,867 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:44:34,867 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:44:34,867 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:44:34,867 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:44:34 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 09:44:34,867 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 09:44:34,867 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:44:34,867 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:44:34,867 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:44:34,867 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:44:34,867 - httpcore.connection - DEBUG - close.started 2025-05-30 09:44:34,867 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:44:34,868 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 
socket_options=None 2025-05-30 09:44:34,868 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:44:34,868 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:44:34,868 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:44:34,868 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:44:34,869 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:44:34,869 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:44:34,875 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:44:34 GMT'), (b'server', b'uvicorn'), (b'content-length', b'92835'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 09:44:34,875 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 09:44:34,875 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:44:34,875 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:44:34,875 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:44:34,875 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:44:34,875 - httpcore.connection - DEBUG - close.started 2025-05-30 09:44:34,875 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:44:34,887 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 09:44:34,910 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:44:34,910 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 09:44:35,032 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:44:35,032 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 
09:44:35,170 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 09:44:35,181 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 09:44:35,181 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:44:35,182 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:44:35,183 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:44:35,183 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:44:35,183 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:44:35,321 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:44:35 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 09:44:35,322 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 09:44:35,322 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:44:35,322 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:44:35,323 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:44:35,323 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 09:44:35,323 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:44:35,323 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:44:35,324 - httpcore.connection - DEBUG - close.started 2025-05-30 09:44:35,324 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:44:35,324 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:44:35,324 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:44:35,325 - httpcore.http11 - DEBUG - send_request_body.complete 
2025-05-30 09:44:35,325 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:44:35,470 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:44:35 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 09:44:35,470 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 09:44:35,471 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:44:35,471 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:44:35,471 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:44:35,471 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:44:35,472 - httpcore.connection - DEBUG - close.started 2025-05-30 09:44:35,472 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:44:36,137 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 09:44:36,192 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:44:36,192 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:44:36,192 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 09:44:36,192 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 09:44:36,193 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:44:36,193 - simple_memory_calculator - DEBUG - Using cached memory data for 
black-forest-labs/FLUX.1-schnell 2025-05-30 09:44:36,193 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 09:44:36,193 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:44:36,193 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:44:36,351 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 09:45:03,951 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 09:45:03,951 - __main__ - DEBUG - API key found, length: 39 2025-05-30 09:45:03,951 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 09:45:03,951 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 09:45:03,951 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 09:45:03,951 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 09:45:03,951 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 09:45:03,951 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 09:45:03,951 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 09:45:03,951 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 09:45:03,955 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 09:45:03,955 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 09:45:04,444 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 09:45:04,444 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 09:45:04,444 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 09:45:04,444 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 
'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 09:45:04,444 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 09:45:04,444 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 09:45:04,444 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 09:45:04,444 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 09:45:04,444 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 09:45:04,444 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 09:45:04,444 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 09:45:04,447 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 09:45:04,460 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 09:45:04,460 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 09:45:04,544 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 09:45:04,577 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 09:45:04,577 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:45:04,577 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:45:04,578 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:45:04,578 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:45:04,578 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:45:04,578 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:45:04,578 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:45:04 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 09:45:04,578 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 09:45:04,578 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:45:04,578 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:45:04,579 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:45:04,579 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:45:04,579 - httpcore.connection - DEBUG - close.started
2025-05-30 09:45:04,579 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:45:04,579 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 09:45:04,579 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:45:04,579 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:45:04,580 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:45:04,580 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:45:04,580 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:45:04,580 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:45:04,586 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:45:04 GMT'), (b'server', b'uvicorn'), (b'content-length', b'92844'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 09:45:04,586 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 09:45:04,586 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:45:04,586 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:45:04,586 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:45:04,586 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:45:04,586 - httpcore.connection - DEBUG - close.started
2025-05-30 09:45:04,586 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:45:04,597 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 09:45:04,614 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:45:04,614 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 09:45:04,723 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 09:45:04,735 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:45:04,735 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 09:45:04,906 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 09:45:04,907 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:45:04,907 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:45:04,907 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:45:04,907 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:45:04,908 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:45:05,016 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 09:45:05,017 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:45:05,017 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:45:05,017 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:45:05,017 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:45:05,018 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:45:05,055 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:45:05 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 09:45:05,055 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 09:45:05,055 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:45:05,055 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:45:05,056 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:45:05,056 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:45:05,056 - httpcore.connection - DEBUG - close.started
2025-05-30 09:45:05,056 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:45:05,161 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:45:05 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 09:45:05,161 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 09:45:05,162 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:45:05,162 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:45:05,162 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:45:05,162 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:45:05,163 - httpcore.connection - DEBUG - close.started
2025-05-30 09:45:05,163 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:45:05,643 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:45:05,643 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:45:05,643 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 09:45:05,643 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 09:45:05,643 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:45:05,643 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:45:05,643 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 09:45:05,643 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:45:05,644 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:45:05,849 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 09:45:06,066 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 09:45:23,213 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 09:45:23,213 - __main__ - DEBUG - API key found, length: 39
2025-05-30 09:45:23,213 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 09:45:23,213 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 09:45:23,213 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 09:45:23,213 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 09:45:23,213 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 09:45:23,213 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 09:45:23,213 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 09:45:23,213 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 09:45:23,217 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 09:45:23,217 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 09:45:23,677 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 09:45:23,677 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 09:45:23,677 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 09:45:23,677 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 09:45:23,677 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 09:45:23,677 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 09:45:23,677 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 09:45:23,677 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 09:45:23,677 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 09:45:23,677 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 09:45:23,677 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 09:45:23,679 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 09:45:23,696 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 09:45:23,700 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 09:45:23,783 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 09:45:23,818 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 09:45:23,819 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:45:23,819 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:45:23,819 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:45:23,820 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:45:23,820 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:45:23,820 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:45:23,820 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:45:23 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 09:45:23,820 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 09:45:23,820 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:45:23,820 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:45:23,820 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:45:23,820 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:45:23,820 - httpcore.connection - DEBUG - close.started
2025-05-30 09:45:23,820 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:45:23,821 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 09:45:23,821 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:45:23,821 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:45:23,822 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:45:23,822 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:45:23,822 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:45:23,822 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:45:23,828 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:45:23 GMT'), (b'server', b'uvicorn'), (b'content-length', b'92831'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 09:45:23,828 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 09:45:23,828 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:45:23,828 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:45:23,828 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:45:23,828 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:45:23,828 - httpcore.connection - DEBUG - close.started
2025-05-30 09:45:23,829 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:45:23,838 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:45:23,838 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 09:45:23,841 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 09:45:24,013 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 09:45:24,013 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:45:24,013 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 09:45:24,151 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 09:45:24,152 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:45:24,152 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:45:24,152 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:45:24,152 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:45:24,153 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:45:24,290 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:45:24 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 09:45:24,290 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 09:45:24,290 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:45:24,290 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:45:24,290 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:45:24,290 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:45:24,290 - httpcore.connection - DEBUG - close.started
2025-05-30 09:45:24,291 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:45:24,297 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 09:45:24,297 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:45:24,298 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:45:24,298 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:45:24,298 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:45:24,298 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:45:24,439 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:45:24 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 09:45:24,439 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 09:45:24,439 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:45:24,440 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:45:24,440 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:45:24,440 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:45:24,440 - httpcore.connection - DEBUG - close.started
2025-05-30 09:45:24,440 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:45:25,006 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 09:45:25,228 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 09:45:27,524 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:45:27,524 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:45:27,524 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 09:45:27,524 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 09:45:27,524 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:45:27,524 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:45:27,524 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 09:45:27,524 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:45:27,524 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:45:49,950 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 09:45:49,950 - __main__ - DEBUG - API key found, length: 39
2025-05-30 09:45:49,950 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 09:45:49,950 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 09:45:49,950 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 09:45:49,950 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 09:45:49,950 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 09:45:49,950 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 09:45:49,951 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 09:45:49,951 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 09:45:49,955 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 09:45:49,955 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 09:45:50,413 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 09:45:50,413 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 09:45:50,413 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 09:45:50,413 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 09:45:50,413 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 09:45:50,413 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 09:45:50,413 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 09:45:50,413 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 09:45:50,413 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 09:45:50,413 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 09:45:50,413 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 09:45:50,415 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 09:45:50,428 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 09:45:50,435 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 09:45:50,514 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 09:45:50,548 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 09:45:50,548 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:45:50,548 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:45:50,548 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:45:50,549 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:45:50,549 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:45:50,549 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:45:50,549 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:45:50 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 09:45:50,549 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 09:45:50,549 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:45:50,549 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:45:50,549 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:45:50,549 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:45:50,549 - httpcore.connection - DEBUG - close.started
2025-05-30 09:45:50,549 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:45:50,550 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 09:45:50,550 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:45:50,550 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:45:50,550 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:45:50,551 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:45:50,551 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:45:50,551 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:45:50,557 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:45:50 GMT'), (b'server', b'uvicorn'), (b'content-length', b'92836'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 09:45:50,557 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 09:45:50,557 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:45:50,557 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:45:50,557 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:45:50,557 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:45:50,557 - httpcore.connection - DEBUG - close.started
2025-05-30 09:45:50,557 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:45:50,568 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 09:45:50,595 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:45:50,595 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 09:45:50,711 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:45:50,711 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 09:45:50,906 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 09:45:50,907 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:45:50,907 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:45:50,907 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:45:50,907 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:45:50,908 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:45:50,997 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 09:45:50,997 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:45:50,997 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:45:50,997 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:45:50,997 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:45:50,997 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:45:51,063 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:45:51 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 09:45:51,063 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 09:45:51,063 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:45:51,063 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:45:51,063 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:45:51,063 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:45:51,063 - httpcore.connection - DEBUG - close.started
2025-05-30 09:45:51,063 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:45:51,065 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 09:45:51,142 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:45:51 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 09:45:51,142 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 09:45:51,143 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:45:51,144 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:45:51,144 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:45:51,144 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:45:51,144 - httpcore.connection - DEBUG - close.started
2025-05-30 09:45:51,147 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:45:51,613 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:45:51,613 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:45:51,613 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 09:45:51,613 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 09:45:51,613 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:45:51,613 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:45:51,613 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 09:45:51,613 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:45:51,613 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:45:51,757 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 09:45:51,977 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 09:48:41,328 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 09:48:41,328 - __main__ - DEBUG - API key found, length: 39
2025-05-30 09:48:41,328 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 09:48:41,328 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 09:48:41,328 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 09:48:41,328 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 09:48:41,328 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 09:48:41,328 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 09:48:41,328 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 09:48:41,328 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 09:48:41,332 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 09:48:41,333 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 09:48:41,794 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 09:48:41,794 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 09:48:41,794 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 09:48:41,794 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 09:48:41,794 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 09:48:41,794 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 09:48:41,794 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 09:48:41,794 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 09:48:41,794 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 09:48:41,794 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 09:48:41,794 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 09:48:41,797 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 09:48:41,810 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 09:48:41,817 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 09:48:41,897 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 09:48:41,930 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 09:48:41,931 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:48:41,931 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:48:41,931 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:48:41,932 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:48:41,932 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:48:41,932 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:48:41,932 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:48:41 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 09:48:41,932 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 09:48:41,932 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:48:41,932 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:48:41,932 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:48:41,932 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:48:41,932 - httpcore.connection - DEBUG - close.started
2025-05-30 09:48:41,932 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:48:41,933 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 09:48:41,933 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:48:41,933 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:48:41,933 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:48:41,933 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:48:41,933 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:48:41,933 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:48:41,939 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:48:41 GMT'), (b'server', b'uvicorn'), (b'content-length', b'92836'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 09:48:41,939 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 09:48:41,939 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:48:41,939 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:48:41,939 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:48:41,939 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:48:41,940 - httpcore.connection - DEBUG - close.started
2025-05-30 09:48:41,940 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:48:41,951 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 09:48:42,310 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:48:42,311 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 09:48:42,327 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 09:48:42,327 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 09:48:42,505 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 09:48:42,589 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 09:48:42,589 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:48:42,590 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:48:42,590 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:48:42,590 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:48:42,590 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:48:42,644 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 09:48:42,644 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 09:48:42,645 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 09:48:42,645 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 09:48:42,645 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 09:48:42,645 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 09:48:42,727 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:48:42 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 09:48:42,728 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 09:48:42,728 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:48:42,729 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:48:42,729 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:48:42,729 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:48:42,729 - httpcore.connection - DEBUG - close.started
2025-05-30 09:48:42,730 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:48:42,803 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:48:42 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 09:48:42,803 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 09:48:42,804 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 09:48:42,804 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 09:48:42,804 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 09:48:42,804 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 09:48:42,804 - httpcore.connection - DEBUG - close.started
2025-05-30 09:48:42,804 - httpcore.connection - DEBUG - close.complete
2025-05-30 09:48:43,200 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:48:43,201 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:48:43,201 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 09:48:43,201 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 09:48:43,201 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:48:43,201 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:48:43,201 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 09:48:43,201 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 09:48:43,201 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 09:48:43,715 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 09:48:43,930 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD 
/api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 09:50:27,837 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:50:27,838 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:50:27,838 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 09:50:27,838 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:50:27,838 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:50:27,838 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 09:50:27,838 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:50:27,839 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:52:41,974 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 09:52:41,975 - __main__ - DEBUG - API key found, length: 39 2025-05-30 09:52:41,975 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 09:52:41,975 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 09:52:41,975 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 09:52:41,975 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 09:52:41,975 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 09:52:41,975 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 09:52:41,975 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 09:52:41,975 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 09:52:41,978 - hardware_detector - DEBUG - nvidia-smi not 
found, no NVIDIA GPU detected 2025-05-30 09:52:41,978 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 09:52:42,426 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 09:52:42,426 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 09:52:42,426 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 09:52:42,426 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 09:52:42,426 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 09:52:42,426 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 09:52:42,426 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 09:52:42,426 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 09:52:42,426 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 09:52:42,426 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 09:52:42,426 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 09:52:42,428 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 09:52:42,441 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 09:52:42,448 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 09:52:42,529 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 09:52:42,560 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 09:52:42,560 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:52:42,560 - 
httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:52:42,560 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:52:42,561 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:52:42,561 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:52:42,561 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:52:42,561 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:52:42 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 09:52:42,561 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 09:52:42,561 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:52:42,561 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:52:42,561 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:52:42,561 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:52:42,561 - httpcore.connection - DEBUG - close.started 2025-05-30 09:52:42,561 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:52:42,562 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 09:52:42,562 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:52:42,562 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:52:42,562 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:52:42,562 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:52:42,562 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:52:42,562 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:52:42,569 - httpcore.http11 - DEBUG - 
receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:52:42 GMT'), (b'server', b'uvicorn'), (b'content-length', b'92851'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 09:52:42,569 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 09:52:42,569 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:52:42,569 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:52:42,569 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:52:42,569 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:52:42,569 - httpcore.connection - DEBUG - close.started 2025-05-30 09:52:42,569 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:52:42,580 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 09:52:42,646 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:52:42,646 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 09:52:42,718 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:52:42,718 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 09:52:42,727 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 09:52:42,928 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 09:52:42,929 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:52:42,929 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:52:42,929 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:52:42,929 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:52:42,929 - 
httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:52:42,998 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 09:52:42,998 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:52:42,998 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:52:42,999 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:52:42,999 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:52:42,999 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:52:43,071 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:52:43 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 09:52:43,071 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 09:52:43,072 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:52:43,072 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:52:43,072 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:52:43,072 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:52:43,073 - httpcore.connection - DEBUG - close.started 2025-05-30 09:52:43,073 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:52:43,141 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:52:43 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 09:52:43,141 - httpx - INFO - HTTP Request: GET 
https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 09:52:43,141 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:52:43,141 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:52:43,142 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:52:43,142 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:52:43,142 - httpcore.connection - DEBUG - close.started 2025-05-30 09:52:43,142 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:52:43,220 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:52:43,220 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:52:43,220 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 09:52:43,220 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 09:52:43,220 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:52:43,220 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:52:43,220 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 09:52:43,220 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:52:43,220 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:52:43,776 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 09:52:44,018 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 09:53:56,264 - __main__ - INFO - Initializing 
GradioAutodiffusers 2025-05-30 09:53:56,264 - __main__ - DEBUG - API key found, length: 39 2025-05-30 09:53:56,264 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 09:53:56,264 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 09:53:56,264 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 09:53:56,264 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 09:53:56,264 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 09:53:56,264 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 09:53:56,264 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 09:53:56,264 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 09:53:56,268 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 09:53:56,268 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 09:53:56,755 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 09:53:56,755 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 09:53:56,755 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 09:53:56,755 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 09:53:56,755 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 09:53:56,755 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 09:53:56,755 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 09:53:56,755 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 09:53:56,755 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 09:53:56,755 - 
__main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 09:53:56,755 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 09:53:56,757 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 09:53:56,765 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 09:53:56,772 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 09:53:56,861 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 09:53:56,894 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 09:53:56,894 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:53:56,895 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:53:56,895 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:53:56,895 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:53:56,895 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:53:56,895 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:53:56,895 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:53:56 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 09:53:56,896 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 09:53:56,896 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:53:56,896 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:53:56,896 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:53:56,896 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 
09:53:56,896 - httpcore.connection - DEBUG - close.started 2025-05-30 09:53:56,896 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:53:56,896 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 09:53:56,897 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:53:56,897 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:53:56,897 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:53:56,897 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:53:56,897 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:53:56,897 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:53:56,903 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:53:56 GMT'), (b'server', b'uvicorn'), (b'content-length', b'92851'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 09:53:56,903 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 09:53:56,903 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:53:56,903 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:53:56,903 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:53:56,903 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:53:56,903 - httpcore.connection - DEBUG - close.started 2025-05-30 09:53:56,903 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:53:56,914 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 09:53:56,926 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:53:56,926 - httpcore.connection - DEBUG - start_tls.started ssl_context= 
server_hostname='api.gradio.app' timeout=3 2025-05-30 09:53:57,043 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 09:53:57,054 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:53:57,054 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 09:53:57,227 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 09:53:57,228 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:53:57,228 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:53:57,228 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:53:57,228 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:53:57,228 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:53:57,336 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 09:53:57,336 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:53:57,336 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:53:57,336 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:53:57,337 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:53:57,337 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:53:57,377 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:53:57 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 09:53:57,378 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 09:53:57,378 - httpcore.http11 - DEBUG - receive_response_body.started request= 
2025-05-30 09:53:57,378 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:53:57,378 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:53:57,378 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:53:57,378 - httpcore.connection - DEBUG - close.started 2025-05-30 09:53:57,379 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:53:57,481 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:53:57 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 09:53:57,481 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 09:53:57,481 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:53:57,482 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:53:57,482 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:53:57,482 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:53:57,482 - httpcore.connection - DEBUG - close.started 2025-05-30 09:53:57,482 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:53:58,100 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 09:53:58,265 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:53:58,266 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:53:58,266 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 09:53:58,266 - simple_memory_calculator - INFO - Generating memory recommendations for 
black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 09:53:58,266 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:53:58,267 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:53:58,267 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 09:53:58,267 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:53:58,267 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:53:58,796 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 09:57:16,387 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 09:57:16,387 - __main__ - DEBUG - API key found, length: 39 2025-05-30 09:57:16,387 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 09:57:16,387 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 09:57:16,387 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 09:57:16,387 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 09:57:16,387 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 09:57:16,387 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 09:57:16,387 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 09:57:16,387 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 09:57:16,391 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 09:57:16,391 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 09:57:16,894 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 09:57:16,894 - hardware_detector - DEBUG - CUDA available: 
False, MPS available: True 2025-05-30 09:57:16,894 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 09:57:16,894 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 09:57:16,894 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 09:57:16,894 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 09:57:16,894 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 09:57:16,894 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 09:57:16,894 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 09:57:16,894 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 09:57:16,894 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 09:57:16,897 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 09:57:16,911 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 09:57:16,911 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 09:57:17,001 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 09:57:17,035 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 09:57:17,036 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:57:17,036 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:57:17,036 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:57:17,036 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:57:17,036 - 
httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:57:17,036 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:57:17,037 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:57:17 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 09:57:17,037 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 09:57:17,037 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:57:17,037 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:57:17,037 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:57:17,037 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:57:17,037 - httpcore.connection - DEBUG - close.started 2025-05-30 09:57:17,037 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:57:17,038 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 09:57:17,038 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:57:17,038 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:57:17,038 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:57:17,038 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:57:17,039 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:57:17,039 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:57:17,045 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 00:57:17 GMT'), (b'server', b'uvicorn'), (b'content-length', b'92854'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 09:57:17,045 - httpx - INFO - 
HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 09:57:17,045 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:57:17,045 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:57:17,045 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:57:17,045 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:57:17,045 - httpcore.connection - DEBUG - close.started 2025-05-30 09:57:17,045 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:57:17,058 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 09:57:17,214 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 09:57:17,321 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:57:17,321 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 09:57:17,323 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 09:57:17,323 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 09:57:17,612 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 09:57:17,612 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:57:17,612 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:57:17,612 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:57:17,612 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:57:17,612 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:57:17,619 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 09:57:17,619 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 09:57:17,619 - 
httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 09:57:17,619 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 09:57:17,619 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 09:57:17,619 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 09:57:17,760 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:57:17 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 09:57:17,760 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 09:57:17,760 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:57:17,760 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:57:17,760 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:57:17,761 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:57:17,761 - httpcore.connection - DEBUG - close.started 2025-05-30 09:57:17,761 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:57:17,771 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 00:57:17 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 09:57:17,772 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 09:57:17,772 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 09:57:17,772 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 09:57:17,772 - 
httpcore.http11 - DEBUG - response_closed.started 2025-05-30 09:57:17,772 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 09:57:17,772 - httpcore.connection - DEBUG - close.started 2025-05-30 09:57:17,773 - httpcore.connection - DEBUG - close.complete 2025-05-30 09:57:17,847 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:57:17,847 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:57:17,847 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 09:57:17,847 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 09:57:17,847 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:57:17,847 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:57:17,847 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 09:57:17,847 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 09:57:17,847 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 09:57:18,344 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 09:57:18,559 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 10:00:19,614 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 10:00:19,614 - __main__ - DEBUG - API key found, length: 39 2025-05-30 10:00:19,614 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 10:00:19,614 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 
10:00:19,614 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 10:00:19,614 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 10:00:19,614 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 10:00:19,614 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 10:00:19,614 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 10:00:19,614 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 10:00:19,617 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 10:00:19,618 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 10:00:20,081 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 10:00:20,081 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 10:00:20,081 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 10:00:20,081 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 10:00:20,081 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 10:00:20,081 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 10:00:20,081 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 10:00:20,081 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 10:00:20,081 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 10:00:20,081 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 10:00:20,081 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 10:00:20,083 - asyncio - DEBUG - Using selector: KqueueSelector 
2025-05-30 10:00:20,091 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 10:00:20,097 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 10:00:20,186 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 10:00:20,216 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 10:00:20,216 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 10:00:20,216 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 10:00:20,216 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 10:00:20,217 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 10:00:20,217 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 10:00:20,217 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 10:00:20,217 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 01:00:20 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 10:00:20,217 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 10:00:20,217 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 10:00:20,217 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 10:00:20,217 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 10:00:20,217 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 10:00:20,217 - httpcore.connection - DEBUG - close.started 2025-05-30 10:00:20,217 - httpcore.connection - DEBUG - close.complete 2025-05-30 10:00:20,218 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 
local_address=None timeout=3 socket_options=None 2025-05-30 10:00:20,218 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 10:00:20,218 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 10:00:20,218 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 10:00:20,218 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 10:00:20,218 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 10:00:20,218 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 10:00:20,225 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 01:00:20 GMT'), (b'server', b'uvicorn'), (b'content-length', b'98793'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 10:00:20,225 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 10:00:20,225 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 10:00:20,225 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 10:00:20,225 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 10:00:20,225 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 10:00:20,225 - httpcore.connection - DEBUG - close.started 2025-05-30 10:00:20,225 - httpcore.connection - DEBUG - close.complete 2025-05-30 10:00:20,237 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 10:00:20,268 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 10:00:20,268 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 10:00:20,378 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 10:00:20,378 - httpcore.connection - DEBUG - start_tls.started ssl_context= 
server_hostname='api.gradio.app' timeout=30 2025-05-30 10:00:20,386 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 10:00:20,552 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 10:00:20,552 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 10:00:20,552 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 10:00:20,552 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 10:00:20,552 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 10:00:20,553 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 10:00:20,660 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 10:00:20,660 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 10:00:20,660 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 10:00:20,660 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 10:00:20,660 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 10:00:20,660 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 10:00:20,696 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 01:00:20 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 10:00:20,696 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 10:00:20,697 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 10:00:20,697 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 10:00:20,697 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 10:00:20,697 - httpcore.http11 - DEBUG - 
response_closed.complete 2025-05-30 10:00:20,697 - httpcore.connection - DEBUG - close.started 2025-05-30 10:00:20,697 - httpcore.connection - DEBUG - close.complete 2025-05-30 10:00:20,806 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 01:00:20 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 10:00:20,806 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 10:00:20,806 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 10:00:20,806 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 10:00:20,806 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 10:00:20,806 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 10:00:20,807 - httpcore.connection - DEBUG - close.started 2025-05-30 10:00:20,807 - httpcore.connection - DEBUG - close.complete 2025-05-30 10:00:21,398 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 10:00:21,430 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 10:00:21,430 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 10:00:21,430 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 10:00:21,430 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 10:00:21,430 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 10:00:21,430 - 
simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 10:00:21,430 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 10:00:21,431 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 10:00:21,431 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 10:00:21,772 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 10:00:22,804 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 10:00:22,804 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 10:00:22,804 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 10:00:22,804 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 10:00:22,804 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 10:00:22,804 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 10:00:22,804 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 10:00:22,804 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 10:00:22,804 - auto_diffusers - INFO - Starting code generation for model: black-forest-labs/FLUX.1-schnell 2025-05-30 10:00:22,804 - auto_diffusers - DEBUG - Parameters: prompt='A cat holding a sign that says hello world...', size=(768, 1360), steps=4 2025-05-30 10:00:22,804 - auto_diffusers - DEBUG - Manual specs: True, Memory analysis provided: True 2025-05-30 
10:00:22,804 - auto_diffusers - INFO - Using manual hardware specifications 2025-05-30 10:00:22,804 - auto_diffusers - DEBUG - Manual specs: {'platform': 'Linux', 'architecture': 'manual_input', 'cpu_count': 8, 'python_version': '3.11', 'cuda_available': False, 'mps_available': False, 'torch_version': '2.0+', 'manual_input': True, 'ram_gb': 16, 'user_dtype': None, 'gpu_info': [{'name': 'Custom GPU', 'memory_mb': 8192}]} 2025-05-30 10:00:22,804 - auto_diffusers - DEBUG - GPU detected with 8.0 GB VRAM 2025-05-30 10:00:22,804 - auto_diffusers - INFO - Selected optimization profile: balanced 2025-05-30 10:00:22,805 - auto_diffusers - DEBUG - Creating generation prompt for Gemini API 2025-05-30 10:00:22,805 - auto_diffusers - DEBUG - Prompt length: 7598 characters 2025-05-30 10:00:22,805 - auto_diffusers - INFO - ================================================================================ 2025-05-30 10:00:22,805 - auto_diffusers - INFO - PROMPT SENT TO GEMINI API: 2025-05-30 10:00:22,805 - auto_diffusers - INFO - ================================================================================ 2025-05-30 10:00:22,805 - auto_diffusers - INFO - You are an expert in optimizing diffusers library code for different hardware configurations. NOTE: This system includes curated optimization knowledge from HuggingFace documentation. 
TASK: Generate optimized Python code for running a diffusion model with the following specifications:
- Model: black-forest-labs/FLUX.1-schnell
- Prompt: "A cat holding a sign that says hello world"
- Image size: 768x1360
- Inference steps: 4

HARDWARE SPECIFICATIONS:
- Platform: Linux (manual_input)
- CPU Cores: 8
- CUDA Available: False
- MPS Available: False
- Optimization Profile: balanced
- GPU: Custom GPU (8.0 GB VRAM)

MEMORY ANALYSIS:
- Model Memory Requirements: 36.0 GB (FP16 inference)
- Model Weights Size: 24.0 GB (FP16)
- Memory Recommendation: 🔄 Requires sequential CPU offloading
- Recommended Precision: float16
- Attention Slicing Recommended: True
- VAE Slicing Recommended: True

OPTIMIZATION KNOWLEDGE BASE:

# DIFFUSERS OPTIMIZATION TECHNIQUES

## Memory Optimization Techniques

### 1. Model CPU Offloading
Use `enable_model_cpu_offload()` to move models between GPU and CPU automatically:
```python
pipe.enable_model_cpu_offload()
```
- Saves significant VRAM by keeping only active models on GPU
- Automatic management, no manual intervention needed
- Compatible with all pipelines

### 2. Sequential CPU Offloading
Use `enable_sequential_cpu_offload()` for more aggressive memory saving:
```python
pipe.enable_sequential_cpu_offload()
```
- More memory efficient than model offloading
- Moves models to CPU after each forward pass
- Best for very limited VRAM scenarios

### 3. Attention Slicing
Use `enable_attention_slicing()` to reduce memory during attention computation:
```python
pipe.enable_attention_slicing()
# or specify slice size
pipe.enable_attention_slicing("max")  # maximum slicing
pipe.enable_attention_slicing(1)      # slice_size = 1
```
- Trades compute time for memory
- Most effective for high-resolution images
- Can be combined with other techniques

### 4. VAE Slicing
Use `enable_vae_slicing()` for large batch processing:
```python
pipe.enable_vae_slicing()
```
- Decodes images one at a time instead of all at once
- Essential for batch sizes > 4
- Minimal performance impact on single images

### 5. VAE Tiling
Use `enable_vae_tiling()` for high-resolution image generation:
```python
pipe.enable_vae_tiling()
```
- Enables 4K+ image generation on 8GB VRAM
- Splits images into overlapping tiles
- Automatically disabled for 512x512 or smaller images

### 6. Memory Efficient Attention (xFormers)
Use `enable_xformers_memory_efficient_attention()` if xFormers is installed:
```python
pipe.enable_xformers_memory_efficient_attention()
```
- Significantly reduces memory usage and improves speed
- Requires xformers library installation
- Compatible with most models

## Performance Optimization Techniques

### 1. Half Precision (FP16/BF16)
Use lower precision for better memory and speed:
```python
# FP16 (widely supported)
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)

# BF16 (better numerical stability, newer hardware)
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
```
- FP16: Halves memory usage, widely supported
- BF16: Better numerical stability, requires newer GPUs
- Essential for most optimization scenarios

### 2. Torch Compile (PyTorch 2.0+)
Use `torch.compile()` for significant speed improvements:
```python
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)

# For some models, compile VAE too:
pipe.vae.decode = torch.compile(pipe.vae.decode, mode="reduce-overhead", fullgraph=True)
```
- 5-50% speed improvement
- Requires PyTorch 2.0+
- First run is slower due to compilation

### 3. Fast Schedulers
Use faster schedulers for fewer steps:
```python
from diffusers import LMSDiscreteScheduler, UniPCMultistepScheduler

# LMS Scheduler (good quality, fast)
pipe.scheduler = LMSDiscreteScheduler.from_config(pipe.scheduler.config)

# UniPC Scheduler (fastest)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
```

## Hardware-Specific Optimizations

### NVIDIA GPU Optimizations
```python
# Enable Tensor Cores
torch.backends.cudnn.benchmark = True

# Optimal data type for NVIDIA
torch_dtype = torch.float16  # or torch.bfloat16 for RTX 30/40 series
```

### Apple Silicon (MPS) Optimizations
```python
# Use MPS device
device = "mps" if torch.backends.mps.is_available() else "cpu"
pipe = pipe.to(device)

# Recommended dtype for Apple Silicon
torch_dtype = torch.bfloat16  # Better than float16 on Apple Silicon

# Attention slicing often helps on MPS
pipe.enable_attention_slicing()
```

### CPU Optimizations
```python
# Use float32 for CPU
torch_dtype = torch.float32

# Enable optimized attention
pipe.enable_attention_slicing()
```

## Model-Specific Guidelines

### FLUX Models
- Do NOT use guidance_scale parameter (not needed for FLUX)
- Use 4-8 inference steps maximum
- BF16 dtype recommended
- Enable attention slicing for memory optimization

### Stable Diffusion XL
- Enable attention slicing for high resolutions
- Use refiner model sparingly to save memory
- Consider VAE tiling for >1024px images

### Stable Diffusion 1.5/2.1
- Very memory efficient base models
- Can often run without optimizations on 8GB+ VRAM
- Enable VAE slicing for batch processing

## Memory Usage Estimation
- FLUX.1: ~24GB for full precision, ~12GB for FP16
- SDXL: ~7GB for FP16, ~14GB for FP32
- SD 1.5: ~2GB for FP16, ~4GB for FP32

## Optimization Combinations by VRAM

### 24GB+ VRAM (High-end)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
pipe = pipe.to("cuda")
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
```

### 12-24GB VRAM (Mid-range)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to("cuda")
pipe.enable_model_cpu_offload()
pipe.enable_xformers_memory_efficient_attention()
```

### 8-12GB VRAM (Entry-level)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing()
pipe.enable_vae_slicing()
pipe.enable_xformers_memory_efficient_attention()
```

### <8GB VRAM (Low-end)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing("max")
pipe.enable_vae_slicing()
pipe.enable_vae_tiling()
```

IMPORTANT: For FLUX.1-schnell models, do NOT include guidance_scale parameter as it's not needed.

Using the OPTIMIZATION KNOWLEDGE BASE above, generate Python code that:

1. **Selects the best optimization techniques** for the specific hardware profile
2. **Applies appropriate memory optimizations** based on available VRAM
3. **Uses optimal data types** for the target hardware:
   - User specified dtype (if provided): Use exactly as specified
   - Apple Silicon (MPS): prefer torch.bfloat16
   - NVIDIA GPUs: prefer torch.float16 or torch.bfloat16
   - CPU only: use torch.float32
4. **Implements hardware-specific optimizations** (CUDA, MPS, CPU)
5.
**Follows model-specific guidelines** (e.g., FLUX guidance_scale handling)

IMPORTANT GUIDELINES:
- Reference the OPTIMIZATION KNOWLEDGE BASE to select appropriate techniques
- Include all necessary imports
- Add brief comments explaining optimization choices
- Generate compact, production-ready code
- Inline values where possible for concise code
- Generate ONLY the Python code, no explanations before or after the code block

2025-05-30 10:00:22,805 - auto_diffusers - INFO - ================================================================================ 2025-05-30 10:00:22,805 - auto_diffusers - INFO - Sending request to Gemini API 2025-05-30 10:00:48,063 - auto_diffusers - INFO - Successfully received response from Gemini API (no tools used) 2025-05-30 10:00:48,064 - auto_diffusers - DEBUG - Response length: 2595 characters 2025-05-30 10:03:59,180 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 10:03:59,180 - __main__ - DEBUG - API key found, length: 39 2025-05-30 10:03:59,180 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 10:03:59,180 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 10:03:59,181 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 10:03:59,181 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 10:03:59,181 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 10:03:59,181 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 10:03:59,181 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 10:03:59,181 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 10:03:59,184 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 10:03:59,185 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 10:03:59,676 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 10:03:59,676 -
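The prompt's "Optimization Combinations by VRAM" tiers, which the log shows resolving to sequential CPU offloading for the 8.0 GB "Custom GPU", can be sketched as a small selection helper. This is an illustrative sketch only, not code from the application: the function name `techniques_for_vram` and the returned labels are assumptions introduced here to make the tier boundaries concrete.

```python
def techniques_for_vram(vram_gb: float) -> list[str]:
    """Map available VRAM (GB) to the optimization tier described in the
    knowledge base. Labels are illustrative, not application identifiers."""
    if vram_gb >= 24:  # High-end: bf16 on CUDA plus torch.compile
        return ["bfloat16", "cuda", "torch_compile"]
    if vram_gb >= 12:  # Mid-range: fp16 with model-level CPU offload
        return ["float16", "cuda", "model_cpu_offload", "xformers_attention"]
    if vram_gb >= 8:   # Entry-level: fp16 with sequential offload and slicing
        return ["float16", "sequential_cpu_offload", "attention_slicing",
                "vae_slicing", "xformers_attention"]
    # Low-end: most aggressive memory savers, including VAE tiling
    return ["float16", "sequential_cpu_offload", "attention_slicing_max",
            "vae_slicing", "vae_tiling"]

# The 8.0 GB GPU from the logged manual specs falls in the entry-level tier:
print(techniques_for_vram(8.0))
```

Note that the 8.0 GB case sits exactly on the entry-level boundary, which matches the memory calculator's "Requires sequential CPU offloading" recommendation in the log.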
hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 10:03:59,676 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 10:03:59,676 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 10:03:59,676 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 10:03:59,676 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 10:03:59,676 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 10:03:59,676 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 10:03:59,676 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 10:03:59,676 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 10:03:59,676 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 10:03:59,678 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 10:03:59,692 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 10:03:59,699 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 10:03:59,780 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 10:03:59,811 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 10:03:59,811 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 10:03:59,812 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 10:03:59,812 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 10:03:59,812 - httpcore.http11 - DEBUG - send_request_body.started 
request= 2025-05-30 10:03:59,812 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 10:03:59,812 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 10:03:59,812 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 01:03:59 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 10:03:59,813 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 10:03:59,813 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 10:03:59,813 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 10:03:59,813 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 10:03:59,813 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 10:03:59,813 - httpcore.connection - DEBUG - close.started 2025-05-30 10:03:59,813 - httpcore.connection - DEBUG - close.complete 2025-05-30 10:03:59,813 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 10:03:59,814 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 10:03:59,814 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 10:03:59,814 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 10:03:59,814 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 10:03:59,814 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 10:03:59,814 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 10:03:59,820 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 01:03:59 GMT'), (b'server', b'uvicorn'), (b'content-length', b'101856'), (b'content-type', b'text/html; charset=utf-8')]) 
2025-05-30 10:03:59,820 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 10:03:59,820 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 10:03:59,820 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 10:03:59,820 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 10:03:59,820 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 10:03:59,820 - httpcore.connection - DEBUG - close.started
2025-05-30 10:03:59,820 - httpcore.connection - DEBUG - close.complete
2025-05-30 10:03:59,832 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 10:03:59,933 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 10:03:59,933 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 10:03:59,974 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 10:04:00,006 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 10:04:00,006 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 10:04:00,312 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 10:04:00,312 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 10:04:00,312 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 10:04:00,312 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 10:04:00,313 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 10:04:00,313 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 10:04:00,349 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 10:04:00,349 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 10:04:00,349 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 10:04:00,349 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 10:04:00,349 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 10:04:00,349 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 10:04:00,474 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 01:04:00 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 10:04:00,474 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 10:04:00,475 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 10:04:00,475 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 10:04:00,475 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 10:04:00,475 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 10:04:00,475 - httpcore.connection - DEBUG - close.started
2025-05-30 10:04:00,476 - httpcore.connection - DEBUG - close.complete
2025-05-30 10:04:00,517 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 01:04:00 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 10:04:00,517 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 10:04:00,518 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 10:04:00,518 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 10:04:00,519 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 10:04:00,519 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 10:04:00,519 - httpcore.connection - DEBUG - close.started
2025-05-30 10:04:00,519 - httpcore.connection - DEBUG - close.complete
2025-05-30 10:04:00,986 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 10:04:00,986 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 10:04:00,986 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 10:04:00,986 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 10:04:00,986 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 10:04:00,987 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 10:04:00,987 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 10:04:00,987 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 10:04:00,987 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 10:04:01,252 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 10:04:01,463 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 10:07:15,267 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 10:07:15,267 - __main__ - DEBUG - API key found, length: 39
2025-05-30 10:07:15,267 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 10:07:15,267 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 10:07:15,267 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 10:07:15,267 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 10:07:15,267 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 10:07:15,267 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 10:07:15,267 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 10:07:15,267 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 10:07:15,271 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 10:07:15,271 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 10:07:15,728 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 10:07:15,728 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 10:07:15,728 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 10:07:15,728 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 10:07:15,728 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 10:07:15,728 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 10:07:15,728 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 10:07:15,728 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 10:07:15,728 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 10:07:15,728 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 10:07:15,728 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 10:07:15,731 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 10:07:15,744 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 10:07:15,751 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 10:07:15,830 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 10:07:15,863 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 10:07:15,864 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 10:07:15,864 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 10:07:15,864 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 10:07:15,864 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 10:07:15,865 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 10:07:15,865 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 10:07:15,865 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 01:07:15 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 10:07:15,865 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 10:07:15,865 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 10:07:15,865 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 10:07:15,865 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 10:07:15,865 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 10:07:15,865 - httpcore.connection - DEBUG - close.started
2025-05-30 10:07:15,866 - httpcore.connection - DEBUG - close.complete
2025-05-30 10:07:15,866 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 10:07:15,866 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 10:07:15,866 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 10:07:15,866 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 10:07:15,866 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 10:07:15,867 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 10:07:15,867 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 10:07:15,873 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 01:07:15 GMT'), (b'server', b'uvicorn'), (b'content-length', b'101855'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 10:07:15,873 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 10:07:15,873 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 10:07:15,873 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 10:07:15,873 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 10:07:15,873 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 10:07:15,873 - httpcore.connection - DEBUG - close.started
2025-05-30 10:07:15,873 - httpcore.connection - DEBUG - close.complete
2025-05-30 10:07:15,884 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 10:07:16,009 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 10:07:16,009 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 10:07:16,021 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 10:07:16,021 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 10:07:16,087 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 10:07:16,283 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 10:07:16,285 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 10:07:16,285 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 10:07:16,286 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 10:07:16,286 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 10:07:16,286 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 10:07:16,296 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 10:07:16,296 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 10:07:16,296 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 10:07:16,296 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 10:07:16,297 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 10:07:16,297 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 10:07:16,423 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 01:07:16 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 10:07:16,424 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 10:07:16,424 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 10:07:16,424 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 10:07:16,424 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 10:07:16,424 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 10:07:16,424 - httpcore.connection - DEBUG - close.started
2025-05-30 10:07:16,425 - httpcore.connection - DEBUG - close.complete
2025-05-30 10:07:16,436 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 01:07:16 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 10:07:16,437 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 10:07:16,437 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 10:07:16,437 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 10:07:16,437 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 10:07:16,437 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 10:07:16,437 - httpcore.connection - DEBUG - close.started
2025-05-30 10:07:16,438 - httpcore.connection - DEBUG - close.complete
2025-05-30 10:07:16,932 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 10:07:16,932 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 10:07:16,932 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 10:07:16,932 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 10:07:16,933 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 10:07:16,933 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 10:07:16,933 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 10:07:16,933 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 10:07:16,933 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 10:07:17,071 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 10:07:17,305 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 10:07:21,656 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 10:07:21,657 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 10:07:21,657 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 10:07:21,657 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 10:07:21,657 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 10:07:21,657 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 10:07:21,658 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 10:07:21,658 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 10:07:23,033 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 10:07:23,033 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 10:07:23,033 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 10:07:23,033 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 10:07:23,033 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 10:07:23,034 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 10:07:23,034 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 10:07:23,034 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 10:08:46,621 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 10:08:46,622 - __main__ - DEBUG - API key found, length: 39
2025-05-30 10:08:46,622 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 10:08:46,622 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 10:08:46,622 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 10:08:46,622 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 10:08:46,622 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 10:08:46,622 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 10:08:46,622 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 10:08:46,622 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 10:08:46,625 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 10:08:46,625 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 10:08:47,104 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 10:08:47,104 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 10:08:47,105 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 10:08:47,105 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 10:08:47,105 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 10:08:47,105 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 10:08:47,105 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 10:08:47,105 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 10:08:47,105 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 10:08:47,105 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 10:08:47,105 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 10:08:47,107 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 10:08:47,119 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 10:08:47,127 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 10:08:47,208 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 10:08:47,242 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 10:08:47,242 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 10:08:47,242 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 10:08:47,243 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 10:08:47,243 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 10:08:47,243 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 10:08:47,243 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 10:08:47,243 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 01:08:47 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 10:08:47,243 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 10:08:47,243 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 10:08:47,244 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 10:08:47,244 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 10:08:47,244 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 10:08:47,244 - httpcore.connection - DEBUG - close.started
2025-05-30 10:08:47,244 - httpcore.connection - DEBUG - close.complete
2025-05-30 10:08:47,244 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 10:08:47,244 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 10:08:47,244 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 10:08:47,245 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 10:08:47,245 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 10:08:47,245 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 10:08:47,245 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 10:08:47,251 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 01:08:47 GMT'), (b'server', b'uvicorn'), (b'content-length', b'102088'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 10:08:47,251 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 10:08:47,252 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 10:08:47,252 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 10:08:47,252 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 10:08:47,252 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 10:08:47,252 - httpcore.connection - DEBUG - close.started
2025-05-30 10:08:47,252 - httpcore.connection - DEBUG - close.complete
2025-05-30 10:08:47,263 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 10:08:47,295 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 10:08:47,295 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 10:08:47,403 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 10:08:47,403 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 10:08:47,419 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 10:08:47,592 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 10:08:47,592 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 10:08:47,593 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 10:08:47,593 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 10:08:47,593 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 10:08:47,594 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 10:08:47,686 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 10:08:47,686 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 10:08:47,687 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 10:08:47,687 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 10:08:47,687 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 10:08:47,687 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 10:08:47,743 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 01:08:47 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 10:08:47,743 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 10:08:47,743 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 10:08:47,744 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 10:08:47,744 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 10:08:47,744 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 10:08:47,744 - httpcore.connection - DEBUG - close.started
2025-05-30 10:08:47,745 - httpcore.connection - DEBUG - close.complete
2025-05-30 10:08:47,830 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 01:08:47 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 10:08:47,830 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 10:08:47,830 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 10:08:47,830 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 10:08:47,830 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 10:08:47,830 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 10:08:47,830 - httpcore.connection - DEBUG - close.started
2025-05-30 10:08:47,830 - httpcore.connection - DEBUG - close.complete
2025-05-30 10:08:48,416 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 10:08:48,667 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 10:08:49,278 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 10:08:49,278 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 10:08:49,278 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 10:08:49,278 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 10:08:49,279 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 10:08:49,279 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 10:08:49,279 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 10:08:49,279 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 10:08:49,280 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 10:10:21,389 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 10:10:21,389 - __main__ - DEBUG - API key found, length: 39
2025-05-30 10:10:21,389 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 10:10:21,389 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 10:10:21,389 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 10:10:21,389 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 10:10:21,389 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 10:10:21,389 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 10:10:21,389 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 10:10:21,389 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 10:10:21,394 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 10:10:21,394 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 10:10:21,861 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 10:10:21,861 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 10:10:21,861 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 10:10:21,861 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 10:10:21,861 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 10:10:21,861 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 10:10:21,861 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 10:10:21,861 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 10:10:21,861 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 10:10:21,861 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 10:10:21,861 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 10:10:21,863 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 10:10:21,876 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 10:10:21,883 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 10:10:21,964 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 10:10:21,998 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 10:10:21,999 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 10:10:21,999 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 10:10:21,999 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 10:10:21,999 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 10:10:22,000 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 10:10:22,000 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 10:10:22,000 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 01:10:21 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 10:10:22,000 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 10:10:22,000 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 10:10:22,000 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 10:10:22,000 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 10:10:22,000 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 10:10:22,000 - httpcore.connection - DEBUG - close.started
2025-05-30 10:10:22,000 - httpcore.connection - DEBUG - close.complete
2025-05-30 10:10:22,000 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 10:10:22,001 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 10:10:22,001 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 10:10:22,001 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 10:10:22,001 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 10:10:22,001 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 10:10:22,001 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 10:10:22,007 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 01:10:21 GMT'), (b'server', b'uvicorn'), (b'content-length', b'99284'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 10:10:22,007 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 10:10:22,007 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 10:10:22,007 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 10:10:22,007 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 10:10:22,007 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 10:10:22,007 - httpcore.connection - DEBUG - close.started
2025-05-30 10:10:22,007 - httpcore.connection - DEBUG - close.complete
2025-05-30 10:10:22,019 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 10:10:22,115 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 10:10:22,115 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 10:10:22,157 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 10:10:22,157 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 10:10:22,206 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 10:10:22,389 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 10:10:22,389 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 10:10:22,389 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 10:10:22,389 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 10:10:22,389 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 10:10:22,390 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 10:10:22,434 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 10:10:22,434 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 10:10:22,434 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 10:10:22,434 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 10:10:22,435 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 10:10:22,435 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 10:10:22,565 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 01:10:22 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 10:10:22,566 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 10:10:22,566 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 10:10:22,566 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 10:10:22,567 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 10:10:22,567 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 10:10:22,567 - httpcore.connection - DEBUG - close.started
2025-05-30 10:10:22,567 - httpcore.connection - DEBUG - close.complete
2025-05-30 10:10:22,576 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 01:10:22 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 10:10:22,577 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 10:10:22,577 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 10:10:22,577 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 10:10:22,577 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 10:10:22,577 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 10:10:22,578 - httpcore.connection - DEBUG - close.started
2025-05-30 10:10:22,578 - httpcore.connection - DEBUG - close.complete
2025-05-30 10:10:23,064 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 10:10:23,065 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 10:10:23,065 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 10:10:23,065 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 10:10:23,065 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 10:10:23,065 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 10:10:23,065 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 10:10:23,065 - simple_memory_calculator -
INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 10:10:23,065 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 10:10:23,165 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 10:10:23,430 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 10:10:27,241 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 10:10:27,241 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 10:10:27,241 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 10:10:27,241 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 10:10:27,241 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 10:10:27,241 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 10:10:27,241 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 10:10:27,241 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 10:10:27,241 - auto_diffusers - INFO - Starting code generation for model: black-forest-labs/FLUX.1-schnell 2025-05-30 10:10:27,241 - auto_diffusers - DEBUG - Parameters: prompt='A cat holding a sign that says hello world...', size=(768, 1360), steps=4 2025-05-30 10:10:27,242 - auto_diffusers - DEBUG - Manual specs: True, Memory analysis provided: True 2025-05-30 10:10:27,242 - auto_diffusers - INFO - Using manual hardware specifications 2025-05-30 10:10:27,242 - auto_diffusers - DEBUG - Manual specs: 
{'platform': 'Linux', 'architecture': 'manual_input', 'cpu_count': 8, 'python_version': '3.11', 'cuda_available': False, 'mps_available': False, 'torch_version': '2.0+', 'manual_input': True, 'ram_gb': 16, 'user_dtype': None, 'gpu_info': [{'name': 'Custom GPU', 'memory_mb': 8192}]}
2025-05-30 10:10:27,242 - auto_diffusers - DEBUG - GPU detected with 8.0 GB VRAM
2025-05-30 10:10:27,242 - auto_diffusers - INFO - Selected optimization profile: balanced
2025-05-30 10:10:27,242 - auto_diffusers - DEBUG - Creating generation prompt for Gemini API
2025-05-30 10:10:27,242 - auto_diffusers - DEBUG - Prompt length: 7598 characters
2025-05-30 10:10:27,242 - auto_diffusers - INFO - ================================================================================
2025-05-30 10:10:27,242 - auto_diffusers - INFO - PROMPT SENT TO GEMINI API:
2025-05-30 10:10:27,242 - auto_diffusers - INFO - ================================================================================
2025-05-30 10:10:27,242 - auto_diffusers - INFO - You are an expert in optimizing diffusers library code for different hardware configurations.

NOTE: This system includes curated optimization knowledge from HuggingFace documentation.

TASK: Generate optimized Python code for running a diffusion model with the following specifications:
- Model: black-forest-labs/FLUX.1-schnell
- Prompt: "A cat holding a sign that says hello world"
- Image size: 768x1360
- Inference steps: 4

HARDWARE SPECIFICATIONS:
- Platform: Linux (manual_input)
- CPU Cores: 8
- CUDA Available: False
- MPS Available: False
- Optimization Profile: balanced
- GPU: Custom GPU (8.0 GB VRAM)

MEMORY ANALYSIS:
- Model Memory Requirements: 36.0 GB (FP16 inference)
- Model Weights Size: 24.0 GB (FP16)
- Memory Recommendation: 🔄 Requires sequential CPU offloading
- Recommended Precision: float16
- Attention Slicing Recommended: True
- VAE Slicing Recommended: True

OPTIMIZATION KNOWLEDGE BASE:

# DIFFUSERS OPTIMIZATION TECHNIQUES

## Memory Optimization Techniques

### 1. Model CPU Offloading
Use `enable_model_cpu_offload()` to move models between GPU and CPU automatically:
```python
pipe.enable_model_cpu_offload()
```
- Saves significant VRAM by keeping only active models on GPU
- Automatic management, no manual intervention needed
- Compatible with all pipelines

### 2. Sequential CPU Offloading
Use `enable_sequential_cpu_offload()` for more aggressive memory saving:
```python
pipe.enable_sequential_cpu_offload()
```
- More memory efficient than model offloading
- Moves models to CPU after each forward pass
- Best for very limited VRAM scenarios

### 3. Attention Slicing
Use `enable_attention_slicing()` to reduce memory during attention computation:
```python
pipe.enable_attention_slicing()
# or specify slice size
pipe.enable_attention_slicing("max")  # maximum slicing
pipe.enable_attention_slicing(1)  # slice_size = 1
```
- Trades compute time for memory
- Most effective for high-resolution images
- Can be combined with other techniques

### 4. VAE Slicing
Use `enable_vae_slicing()` for large batch processing:
```python
pipe.enable_vae_slicing()
```
- Decodes images one at a time instead of all at once
- Essential for batch sizes > 4
- Minimal performance impact on single images

### 5. VAE Tiling
Use `enable_vae_tiling()` for high-resolution image generation:
```python
pipe.enable_vae_tiling()
```
- Enables 4K+ image generation on 8GB VRAM
- Splits images into overlapping tiles
- Automatically disabled for 512x512 or smaller images

### 6. Memory Efficient Attention (xFormers)
Use `enable_xformers_memory_efficient_attention()` if xFormers is installed:
```python
pipe.enable_xformers_memory_efficient_attention()
```
- Significantly reduces memory usage and improves speed
- Requires xformers library installation
- Compatible with most models

## Performance Optimization Techniques

### 1. Half Precision (FP16/BF16)
Use lower precision for better memory and speed:
```python
# FP16 (widely supported)
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
# BF16 (better numerical stability, newer hardware)
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
```
- FP16: Halves memory usage, widely supported
- BF16: Better numerical stability, requires newer GPUs
- Essential for most optimization scenarios

### 2. Torch Compile (PyTorch 2.0+)
Use `torch.compile()` for significant speed improvements:
```python
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
# For some models, compile VAE too:
pipe.vae.decode = torch.compile(pipe.vae.decode, mode="reduce-overhead", fullgraph=True)
```
- 5-50% speed improvement
- Requires PyTorch 2.0+
- First run is slower due to compilation

### 3. Fast Schedulers
Use faster schedulers for fewer steps:
```python
from diffusers import LMSDiscreteScheduler, UniPCMultistepScheduler
# LMS Scheduler (good quality, fast)
pipe.scheduler = LMSDiscreteScheduler.from_config(pipe.scheduler.config)
# UniPC Scheduler (fastest)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
```

## Hardware-Specific Optimizations

### NVIDIA GPU Optimizations
```python
# Enable Tensor Cores
torch.backends.cudnn.benchmark = True
# Optimal data type for NVIDIA
torch_dtype = torch.float16  # or torch.bfloat16 for RTX 30/40 series
```

### Apple Silicon (MPS) Optimizations
```python
# Use MPS device
device = "mps" if torch.backends.mps.is_available() else "cpu"
pipe = pipe.to(device)
# Recommended dtype for Apple Silicon
torch_dtype = torch.bfloat16  # Better than float16 on Apple Silicon
# Attention slicing often helps on MPS
pipe.enable_attention_slicing()
```

### CPU Optimizations
```python
# Use float32 for CPU
torch_dtype = torch.float32
# Enable optimized attention
pipe.enable_attention_slicing()
```

## Model-Specific Guidelines

### FLUX Models
- Do NOT use guidance_scale parameter (not needed for FLUX)
- Use 4-8 inference steps maximum
- BF16 dtype recommended
- Enable attention slicing for memory optimization

### Stable Diffusion XL
- Enable attention slicing for high resolutions
- Use refiner model sparingly to save memory
- Consider VAE tiling for >1024px images

### Stable Diffusion 1.5/2.1
- Very memory efficient base models
- Can often run without optimizations on 8GB+ VRAM
- Enable VAE slicing for batch processing

## Memory Usage Estimation
- FLUX.1: ~24GB for full precision, ~12GB for FP16
- SDXL: ~7GB for FP16, ~14GB for FP32
- SD 1.5: ~2GB for FP16, ~4GB for FP32

## Optimization Combinations by VRAM

### 24GB+ VRAM (High-end)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
pipe = pipe.to("cuda")
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
```

### 12-24GB VRAM (Mid-range)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to("cuda")
pipe.enable_model_cpu_offload()
pipe.enable_xformers_memory_efficient_attention()
```

### 8-12GB VRAM (Entry-level)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing()
pipe.enable_vae_slicing()
pipe.enable_xformers_memory_efficient_attention()
```

### <8GB VRAM (Low-end)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing("max")
pipe.enable_vae_slicing()
pipe.enable_vae_tiling()
```

IMPORTANT: For FLUX.1-schnell models, do NOT include guidance_scale parameter as it's not needed.

Using the OPTIMIZATION KNOWLEDGE BASE above, generate Python code that:
1. **Selects the best optimization techniques** for the specific hardware profile
2. **Applies appropriate memory optimizations** based on available VRAM
3. **Uses optimal data types** for the target hardware:
   - User specified dtype (if provided): Use exactly as specified
   - Apple Silicon (MPS): prefer torch.bfloat16
   - NVIDIA GPUs: prefer torch.float16 or torch.bfloat16
   - CPU only: use torch.float32
4. **Implements hardware-specific optimizations** (CUDA, MPS, CPU)
5. **Follows model-specific guidelines** (e.g., FLUX guidance_scale handling)

IMPORTANT GUIDELINES:
- Reference the OPTIMIZATION KNOWLEDGE BASE to select appropriate techniques
- Include all necessary imports
- Add brief comments explaining optimization choices
- Generate compact, production-ready code
- Inline values where possible for concise code
- Generate ONLY the Python code, no explanations before or after the code block
2025-05-30 10:10:27,242 - auto_diffusers - INFO - ================================================================================
2025-05-30 10:10:27,243 - auto_diffusers - INFO - Sending request to Gemini API
2025-05-30 10:10:38,525 - auto_diffusers - INFO - Successfully received response from Gemini API (no tools used)
2025-05-30 10:10:38,525 - auto_diffusers - DEBUG - Response length: 1928 characters
2025-05-30 10:14:26,020 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 10:14:26,020 - __main__ - DEBUG - API key found, length: 39
2025-05-30 10:14:26,020 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 10:14:26,020 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 10:14:26,020 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 10:14:26,020 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 10:14:26,020 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 10:14:26,020 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 10:14:26,020 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 10:14:26,020 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 10:14:26,024 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 10:14:26,024 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 10:14:26,483 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 10:14:26,483 -
hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 10:14:26,483 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 10:14:26,483 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 10:14:26,483 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 10:14:26,483 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 10:14:26,483 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 10:14:26,483 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 10:14:26,483 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 10:14:26,483 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 10:14:26,483 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 10:14:26,485 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 10:14:26,492 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 10:14:26,498 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 10:14:26,669 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 10:14:26,669 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 10:14:26,802 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 10:14:26,966 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 10:14:26,966 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 10:14:26,967 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 10:14:26,967 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 10:14:26,967 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 10:14:26,967 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 10:14:27,117 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 01:14:27 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 10:14:27,118 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 10:14:27,118 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 10:14:27,119 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 10:14:27,119 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 10:14:27,119 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 10:14:27,120 - httpcore.connection - DEBUG - close.started
2025-05-30 10:14:27,120 - httpcore.connection - DEBUG - close.complete
2025-05-30 10:21:46,310 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 10:21:46,310 - __main__ - DEBUG - API key found, length: 39
2025-05-30 10:21:46,310 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 10:21:46,310 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 10:21:46,310 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 10:21:46,310 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 10:21:46,310 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 10:21:46,310 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 10:21:46,310 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 10:21:46,310 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 10:21:46,313 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 10:21:46,313 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 10:21:46,758 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 10:21:46,758 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 10:21:46,758 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 10:21:46,758 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 10:21:46,758 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 10:21:46,758 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 10:21:46,758 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 10:21:46,758 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 10:21:46,758 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 10:21:46,758 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 10:21:46,758 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 10:21:46,760 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 10:21:46,773 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 10:21:46,781 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 10:21:46,861 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 10:21:46,889 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 10:21:46,890 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 10:21:46,890 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 10:21:46,890 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 10:21:46,890 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 10:21:46,890 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 10:21:46,891 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 10:21:46,891 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 01:21:46 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 10:21:46,891 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 10:21:46,891 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 10:21:46,891 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 10:21:46,891 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 10:21:46,891 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 10:21:46,891 - httpcore.connection - DEBUG - close.started
2025-05-30 10:21:46,891 - httpcore.connection - DEBUG - close.complete
2025-05-30 10:21:46,891 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 10:21:46,892 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 10:21:46,892 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 10:21:46,892 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 10:21:46,892 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 10:21:46,892 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 10:21:46,892 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 10:21:46,898 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 01:21:46 GMT'), (b'server', b'uvicorn'), (b'content-length', b'101391'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 10:21:46,898 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 10:21:46,898 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 10:21:46,898 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 10:21:46,898 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 10:21:46,898 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 10:21:46,898 - httpcore.connection - DEBUG - close.started
2025-05-30 10:21:46,898 - httpcore.connection - DEBUG - close.complete
2025-05-30 10:21:46,910 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 10:21:46,988 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 10:21:46,988 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 10:21:47,057 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 10:21:47,057 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 10:21:47,063 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 10:21:47,322 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 10:21:47,322 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 10:21:47,323 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 10:21:47,323 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 10:21:47,323 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 10:21:47,323 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 10:21:47,349 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 10:21:47,349 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 10:21:47,349 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 10:21:47,349 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 10:21:47,349 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 10:21:47,349 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 10:21:47,466 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 01:21:47 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 10:21:47,466 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 10:21:47,466 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 10:21:47,467 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 10:21:47,467 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 10:21:47,467 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 10:21:47,467 - httpcore.connection - DEBUG - close.started
2025-05-30 10:21:47,468 - httpcore.connection - DEBUG - close.complete
2025-05-30 10:21:47,495 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 01:21:47 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 10:21:47,496 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 10:21:47,496 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 10:21:47,496 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 10:21:47,497 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 10:21:47,497 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 10:21:47,497 - httpcore.connection - DEBUG - close.started
2025-05-30 10:21:47,497 - httpcore.connection - DEBUG - close.complete
2025-05-30 10:21:48,266 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 10:21:48,489 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 10:21:57,114 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 10:21:57,114 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 10:21:57,114 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 10:21:57,114 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 10:21:57,114 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 10:21:57,114 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 10:21:57,114 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 10:21:57,114 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 10:21:57,114 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 10:21:59,091 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 10:21:59,091 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 10:21:59,091 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 10:21:59,091 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 10:21:59,091 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 10:21:59,091 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 10:21:59,091 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 10:21:59,091 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 10:21:59,091 - auto_diffusers - INFO - Starting code generation for model: black-forest-labs/FLUX.1-schnell
2025-05-30 10:21:59,091 - auto_diffusers - DEBUG - Parameters: prompt='A cat holding a sign that says hello world...', size=(768, 1360), steps=4
2025-05-30 10:21:59,092 - auto_diffusers - DEBUG - Manual specs: True, Memory analysis provided: True
2025-05-30 10:21:59,092 - auto_diffusers - INFO - Using manual hardware specifications
2025-05-30 10:21:59,092 - auto_diffusers - DEBUG - Manual specs: {'platform': 'Linux', 'architecture': 'manual_input', 'cpu_count': 8, 'python_version': '3.11', 'cuda_available': False, 'mps_available': False, 'torch_version': '2.0+', 'manual_input': True, 'ram_gb': 16, 'user_dtype': None, 'gpu_info': [{'name': 'Custom GPU', 'memory_mb': 8192}]}
2025-05-30 10:21:59,092 - auto_diffusers - DEBUG - GPU detected
with 8.0 GB VRAM
2025-05-30 10:21:59,092 - auto_diffusers - INFO - Selected optimization profile: balanced
2025-05-30 10:21:59,092 - auto_diffusers - DEBUG - Creating generation prompt for Gemini API
2025-05-30 10:21:59,092 - auto_diffusers - DEBUG - Prompt length: 7598 characters
2025-05-30 10:21:59,092 - auto_diffusers - INFO - ================================================================================
2025-05-30 10:21:59,092 - auto_diffusers - INFO - PROMPT SENT TO GEMINI API:
2025-05-30 10:21:59,092 - auto_diffusers - INFO - ================================================================================
2025-05-30 10:21:59,092 - auto_diffusers - INFO - You are an expert in optimizing diffusers library code for different hardware configurations.

NOTE: This system includes curated optimization knowledge from HuggingFace documentation.

TASK: Generate optimized Python code for running a diffusion model with the following specifications:
- Model: black-forest-labs/FLUX.1-schnell
- Prompt: "A cat holding a sign that says hello world"
- Image size: 768x1360
- Inference steps: 4

HARDWARE SPECIFICATIONS:
- Platform: Linux (manual_input)
- CPU Cores: 8
- CUDA Available: False
- MPS Available: False
- Optimization Profile: balanced
- GPU: Custom GPU (8.0 GB VRAM)

MEMORY ANALYSIS:
- Model Memory Requirements: 36.0 GB (FP16 inference)
- Model Weights Size: 24.0 GB (FP16)
- Memory Recommendation: 🔄 Requires sequential CPU offloading
- Recommended Precision: float16
- Attention Slicing Recommended: True
- VAE Slicing Recommended: True

OPTIMIZATION KNOWLEDGE BASE:

# DIFFUSERS OPTIMIZATION TECHNIQUES

## Memory Optimization Techniques

### 1. Model CPU Offloading
Use `enable_model_cpu_offload()` to move models between GPU and CPU automatically:
```python
pipe.enable_model_cpu_offload()
```
- Saves significant VRAM by keeping only active models on GPU
- Automatic management, no manual intervention needed
- Compatible with all pipelines

### 2. Sequential CPU Offloading
Use `enable_sequential_cpu_offload()` for more aggressive memory saving:
```python
pipe.enable_sequential_cpu_offload()
```
- More memory efficient than model offloading
- Moves models to CPU after each forward pass
- Best for very limited VRAM scenarios

### 3. Attention Slicing
Use `enable_attention_slicing()` to reduce memory during attention computation:
```python
pipe.enable_attention_slicing()
# or specify slice size
pipe.enable_attention_slicing("max")  # maximum slicing
pipe.enable_attention_slicing(1)  # slice_size = 1
```
- Trades compute time for memory
- Most effective for high-resolution images
- Can be combined with other techniques

### 4. VAE Slicing
Use `enable_vae_slicing()` for large batch processing:
```python
pipe.enable_vae_slicing()
```
- Decodes images one at a time instead of all at once
- Essential for batch sizes > 4
- Minimal performance impact on single images

### 5. VAE Tiling
Use `enable_vae_tiling()` for high-resolution image generation:
```python
pipe.enable_vae_tiling()
```
- Enables 4K+ image generation on 8GB VRAM
- Splits images into overlapping tiles
- Automatically disabled for 512x512 or smaller images

### 6. Memory Efficient Attention (xFormers)
Use `enable_xformers_memory_efficient_attention()` if xFormers is installed:
```python
pipe.enable_xformers_memory_efficient_attention()
```
- Significantly reduces memory usage and improves speed
- Requires xformers library installation
- Compatible with most models

## Performance Optimization Techniques

### 1. Half Precision (FP16/BF16)
Use lower precision for better memory and speed:
```python
# FP16 (widely supported)
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
# BF16 (better numerical stability, newer hardware)
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
```
- FP16: Halves memory usage, widely supported
- BF16: Better numerical stability, requires newer GPUs
- Essential for most optimization scenarios

### 2. Torch Compile (PyTorch 2.0+)
Use `torch.compile()` for significant speed improvements:
```python
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
# For some models, compile VAE too:
pipe.vae.decode = torch.compile(pipe.vae.decode, mode="reduce-overhead", fullgraph=True)
```
- 5-50% speed improvement
- Requires PyTorch 2.0+
- First run is slower due to compilation

### 3. Fast Schedulers
Use faster schedulers for fewer steps:
```python
from diffusers import LMSDiscreteScheduler, UniPCMultistepScheduler
# LMS Scheduler (good quality, fast)
pipe.scheduler = LMSDiscreteScheduler.from_config(pipe.scheduler.config)
# UniPC Scheduler (fastest)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
```

## Hardware-Specific Optimizations

### NVIDIA GPU Optimizations
```python
# Enable Tensor Cores
torch.backends.cudnn.benchmark = True
# Optimal data type for NVIDIA
torch_dtype = torch.float16  # or torch.bfloat16 for RTX 30/40 series
```

### Apple Silicon (MPS) Optimizations
```python
# Use MPS device
device = "mps" if torch.backends.mps.is_available() else "cpu"
pipe = pipe.to(device)
# Recommended dtype for Apple Silicon
torch_dtype = torch.bfloat16  # Better than float16 on Apple Silicon
# Attention slicing often helps on MPS
pipe.enable_attention_slicing()
```

### CPU Optimizations
```python
# Use float32 for CPU
torch_dtype = torch.float32
# Enable optimized attention
pipe.enable_attention_slicing()
```

## Model-Specific Guidelines

### FLUX Models
- Do NOT use guidance_scale parameter (not needed for FLUX)
- Use 4-8 inference steps maximum
- BF16 dtype recommended
- Enable attention slicing for memory optimization

### Stable Diffusion XL
- Enable attention slicing for high resolutions
- Use refiner model sparingly to save memory
- Consider VAE tiling for >1024px images

### Stable Diffusion 1.5/2.1
- Very memory efficient base models
- Can often run without optimizations on 8GB+ VRAM
- Enable VAE slicing for batch processing

## Memory Usage Estimation
- FLUX.1: ~24GB for full precision, ~12GB for FP16
- SDXL: ~7GB for FP16, ~14GB for FP32
- SD 1.5: ~2GB for FP16, ~4GB for FP32

## Optimization Combinations by VRAM

### 24GB+ VRAM (High-end)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
pipe = pipe.to("cuda")
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
```

### 12-24GB VRAM (Mid-range)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to("cuda")
pipe.enable_model_cpu_offload()
pipe.enable_xformers_memory_efficient_attention()
```

### 8-12GB VRAM (Entry-level)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing()
pipe.enable_vae_slicing()
pipe.enable_xformers_memory_efficient_attention()
```

### <8GB VRAM (Low-end)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing("max")
pipe.enable_vae_slicing()
pipe.enable_vae_tiling()
```

IMPORTANT: For FLUX.1-schnell models, do NOT include guidance_scale parameter as it's not needed.

Using the OPTIMIZATION KNOWLEDGE BASE above, generate Python code that:
1. **Selects the best optimization techniques** for the specific hardware profile
2. **Applies appropriate memory optimizations** based on available VRAM
3. **Uses optimal data types** for the target hardware:
   - User specified dtype (if provided): Use exactly as specified
   - Apple Silicon (MPS): prefer torch.bfloat16
   - NVIDIA GPUs: prefer torch.float16 or torch.bfloat16
   - CPU only: use torch.float32
4. **Implements hardware-specific optimizations** (CUDA, MPS, CPU)
5. **Follows model-specific guidelines** (e.g., FLUX guidance_scale handling)

IMPORTANT GUIDELINES:
- Reference the OPTIMIZATION KNOWLEDGE BASE to select appropriate techniques
- Include all necessary imports
- Add brief comments explaining optimization choices
- Generate compact, production-ready code
- Inline values where possible for concise code
- Generate ONLY the Python code, no explanations before or after the code block
2025-05-30 10:21:59,092 - auto_diffusers - INFO - ================================================================================
2025-05-30 10:21:59,092 - auto_diffusers - INFO - Sending request to Gemini API
2025-05-30 10:22:16,528 - auto_diffusers - INFO - Successfully received response from Gemini API (no tools used)
2025-05-30 10:22:16,528 - auto_diffusers - DEBUG - Response length: 1776 characters
2025-05-30 10:29:16,567 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 10:29:16,568 - __main__ - DEBUG - API key found, length: 39
2025-05-30 10:29:16,568 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 10:29:16,568 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 10:29:16,568 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 10:29:16,568 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 10:29:16,568 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 10:29:16,568 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 10:29:16,568 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 10:29:16,568 - hardware_detector - DEBUG - Attempting GPU detection
via nvidia-smi 2025-05-30 10:29:16,572 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 10:29:16,572 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 10:29:17,088 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 10:29:17,088 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 10:29:17,088 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 10:29:17,088 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 10:29:17,088 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 10:29:17,088 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 10:29:17,088 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 10:29:17,088 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 10:29:17,088 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 10:29:17,088 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 10:29:17,088 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 10:29:17,091 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 10:29:17,104 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 10:29:17,112 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 10:29:17,193 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 10:29:17,225 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 10:29:17,225 - 
httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 10:29:17,225 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 10:29:17,225 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 10:29:17,225 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 10:29:17,226 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 10:29:17,226 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 10:29:17,226 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 01:29:17 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 10:29:17,226 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 10:29:17,226 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 10:29:17,226 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 10:29:17,226 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 10:29:17,226 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 10:29:17,226 - httpcore.connection - DEBUG - close.started 2025-05-30 10:29:17,226 - httpcore.connection - DEBUG - close.complete 2025-05-30 10:29:17,227 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 10:29:17,227 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 10:29:17,227 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 10:29:17,227 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 10:29:17,227 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 10:29:17,228 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 10:29:17,228 - httpcore.http11 - DEBUG - 
receive_response_headers.started request= 2025-05-30 10:29:17,233 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 01:29:17 GMT'), (b'server', b'uvicorn'), (b'content-length', b'100912'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 10:29:17,233 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 10:29:17,233 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 10:29:17,233 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 10:29:17,233 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 10:29:17,234 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 10:29:17,234 - httpcore.connection - DEBUG - close.started 2025-05-30 10:29:17,234 - httpcore.connection - DEBUG - close.complete 2025-05-30 10:29:17,245 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 10:29:17,304 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 10:29:17,304 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 10:29:17,388 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 10:29:17,388 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 10:29:17,629 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 10:29:17,629 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 10:29:17,630 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 10:29:17,630 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 10:29:17,630 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 10:29:17,630 
- httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 10:29:17,630 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 10:29:17,675 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 10:29:17,675 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 10:29:17,676 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 10:29:17,676 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 10:29:17,676 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 10:29:17,676 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 10:29:17,814 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 01:29:17 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 10:29:17,815 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 10:29:17,815 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 10:29:17,815 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 10:29:17,815 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 10:29:17,815 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 10:29:17,816 - httpcore.connection - DEBUG - close.started 2025-05-30 10:29:17,816 - httpcore.connection - DEBUG - close.complete 2025-05-30 10:29:17,821 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 01:29:17 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), 
(b'Content-Encoding', b'gzip')]) 2025-05-30 10:29:17,822 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 10:29:17,822 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 10:29:17,823 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 10:29:17,823 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 10:29:17,823 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 10:29:17,823 - httpcore.connection - DEBUG - close.started 2025-05-30 10:29:17,823 - httpcore.connection - DEBUG - close.complete 2025-05-30 10:29:18,430 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 10:29:18,542 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 10:29:18,542 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 10:29:18,542 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 10:29:18,542 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 10:29:18,542 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 10:29:18,542 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 10:29:18,542 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 10:29:18,542 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 10:29:18,542 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 10:29:18,652 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD 
/api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 10:29:19,602 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 10:29:19,602 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 10:29:19,602 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 10:29:19,602 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 10:29:19,602 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 10:29:19,602 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 10:29:19,603 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 10:29:19,603 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 10:29:19,603 - auto_diffusers - INFO - Starting code generation for model: black-forest-labs/FLUX.1-schnell 2025-05-30 10:29:19,603 - auto_diffusers - DEBUG - Parameters: prompt='A cat holding a sign that says hello world...', size=(768, 1360), steps=4 2025-05-30 10:29:19,603 - auto_diffusers - DEBUG - Manual specs: True, Memory analysis provided: True 2025-05-30 10:29:19,603 - auto_diffusers - INFO - Using manual hardware specifications 2025-05-30 10:29:19,603 - auto_diffusers - DEBUG - Manual specs: {'platform': 'Linux', 'architecture': 'manual_input', 'cpu_count': 8, 'python_version': '3.11', 'cuda_available': False, 'mps_available': False, 'torch_version': '2.0+', 'manual_input': True, 'ram_gb': 16, 'user_dtype': None, 'gpu_info': [{'name': 'Custom GPU', 'memory_mb': 8192}]} 2025-05-30 10:29:19,603 - auto_diffusers - DEBUG - GPU detected with 8.0 GB VRAM 2025-05-30 10:29:19,603 - auto_diffusers - INFO - 
Selected optimization profile: balanced
2025-05-30 10:29:19,603 - auto_diffusers - DEBUG - Creating generation prompt for Gemini API
2025-05-30 10:29:19,603 - auto_diffusers - DEBUG - Prompt length: 7598 characters
2025-05-30 10:29:19,603 - auto_diffusers - INFO - ================================================================================
2025-05-30 10:29:19,603 - auto_diffusers - INFO - PROMPT SENT TO GEMINI API:
2025-05-30 10:29:19,603 - auto_diffusers - INFO - ================================================================================
2025-05-30 10:29:19,603 - auto_diffusers - INFO - ================================================================================
2025-05-30 10:29:19,603 - auto_diffusers - INFO - Sending request to Gemini API
2025-05-30 10:29:31,763 - auto_diffusers - INFO - Successfully received response from Gemini API (no tools used)
2025-05-30 10:29:31,764 - auto_diffusers - DEBUG - Response length: 1665 characters
2025-05-30 10:32:32,108 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 10:32:32,108 - __main__ - DEBUG - API key found, length: 39
2025-05-30 10:32:32,108 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 10:32:32,108 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 10:32:32,108 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 10:32:32,108 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 10:32:32,108 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 10:32:32,108 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 10:32:32,108 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 10:32:32,109 - hardware_detector - DEBUG - Attempting GPU detection
via nvidia-smi 2025-05-30 10:32:32,112 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 10:32:32,112 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 10:32:32,574 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 10:32:32,575 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 10:32:32,575 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 10:32:32,575 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 10:32:32,575 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 10:32:32,575 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 10:32:32,575 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 10:32:32,575 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 10:32:32,575 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 10:32:32,575 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 10:32:32,575 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 10:32:32,577 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 10:32:32,591 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 10:32:32,599 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 10:32:32,683 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 10:32:32,714 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 10:32:32,715 - 
httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 10:32:32,715 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 10:32:32,715 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 10:32:32,715 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 10:32:32,715 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 10:32:32,716 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 10:32:32,716 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 01:32:32 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 10:32:32,716 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 10:32:32,716 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 10:32:32,716 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 10:32:32,716 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 10:32:32,716 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 10:32:32,716 - httpcore.connection - DEBUG - close.started 2025-05-30 10:32:32,716 - httpcore.connection - DEBUG - close.complete 2025-05-30 10:32:32,717 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 10:32:32,717 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 10:32:32,717 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 10:32:32,717 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 10:32:32,717 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 10:32:32,718 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 10:32:32,718 - httpcore.http11 - DEBUG - 
receive_response_headers.started request= 2025-05-30 10:32:32,723 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 01:32:32 GMT'), (b'server', b'uvicorn'), (b'content-length', b'99751'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 10:32:32,724 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 10:32:32,724 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 10:32:32,724 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 10:32:32,724 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 10:32:32,724 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 10:32:32,724 - httpcore.connection - DEBUG - close.started 2025-05-30 10:32:32,724 - httpcore.connection - DEBUG - close.complete 2025-05-30 10:32:32,736 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 10:32:32,802 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 10:32:32,802 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 10:32:32,885 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 10:32:32,885 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 10:32:32,931 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 10:32:33,098 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 10:32:33,098 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 10:32:33,098 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 10:32:33,098 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 10:32:33,098 
- httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 10:32:33,099 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 10:32:33,178 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 10:32:33,178 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 10:32:33,178 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 10:32:33,179 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 10:32:33,179 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 10:32:33,179 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 10:32:33,248 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 01:32:33 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 10:32:33,248 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 10:32:33,248 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 10:32:33,248 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 10:32:33,249 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 10:32:33,249 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 10:32:33,249 - httpcore.connection - DEBUG - close.started 2025-05-30 10:32:33,249 - httpcore.connection - DEBUG - close.complete 2025-05-30 10:32:33,327 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 01:32:33 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), 
(b'Content-Encoding', b'gzip')]) 2025-05-30 10:32:33,327 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 10:32:33,328 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 10:32:33,329 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 10:32:33,329 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 10:32:33,329 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 10:32:33,330 - httpcore.connection - DEBUG - close.started 2025-05-30 10:32:33,330 - httpcore.connection - DEBUG - close.complete 2025-05-30 10:32:33,629 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 10:32:33,630 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 10:32:33,630 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 10:32:33,630 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 10:32:33,630 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 10:32:33,631 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 10:32:33,631 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 10:32:33,631 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 10:32:33,631 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 10:32:33,912 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 10:32:34,135 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD 
/api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 10:32:34,638 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 10:32:34,638 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 10:32:34,638 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 10:32:34,638 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 10:32:34,638 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 10:32:34,638 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 10:32:34,638 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 10:32:34,638 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 10:32:34,638 - auto_diffusers - INFO - Starting code generation for model: black-forest-labs/FLUX.1-schnell 2025-05-30 10:32:34,638 - auto_diffusers - DEBUG - Parameters: prompt='A cat holding a sign that says hello world...', size=(768, 1360), steps=4 2025-05-30 10:32:34,639 - auto_diffusers - DEBUG - Manual specs: True, Memory analysis provided: True 2025-05-30 10:32:34,639 - auto_diffusers - INFO - Using manual hardware specifications 2025-05-30 10:32:34,639 - auto_diffusers - DEBUG - Manual specs: {'platform': 'Linux', 'architecture': 'manual_input', 'cpu_count': 8, 'python_version': '3.11', 'cuda_available': False, 'mps_available': False, 'torch_version': '2.0+', 'manual_input': True, 'ram_gb': 16, 'user_dtype': None, 'gpu_info': [{'name': 'Custom GPU', 'memory_mb': 8192}]} 2025-05-30 10:32:34,639 - auto_diffusers - DEBUG - GPU detected with 8.0 GB VRAM 2025-05-30 10:32:34,639 - auto_diffusers - INFO - 
Selected optimization profile: balanced
2025-05-30 10:32:34,639 - auto_diffusers - DEBUG - Creating generation prompt for Gemini API
2025-05-30 10:32:34,639 - auto_diffusers - DEBUG - Prompt length: 7598 characters
2025-05-30 10:32:34,640 - auto_diffusers - INFO - ================================================================================
2025-05-30 10:32:34,640 - auto_diffusers - INFO - PROMPT SENT TO GEMINI API:
2025-05-30 10:32:34,640 - auto_diffusers - INFO - ================================================================================
2025-05-30 10:32:34,640 - auto_diffusers - INFO - You are an expert in optimizing diffusers library code for different hardware configurations.

NOTE: This system includes curated optimization knowledge from HuggingFace documentation.

TASK: Generate optimized Python code for running a diffusion model with the following specifications:
- Model: black-forest-labs/FLUX.1-schnell
- Prompt: "A cat holding a sign that says hello world"
- Image size: 768x1360
- Inference steps: 4

HARDWARE SPECIFICATIONS:
- Platform: Linux (manual_input)
- CPU Cores: 8
- CUDA Available: False
- MPS Available: False
- Optimization Profile: balanced
- GPU: Custom GPU (8.0 GB VRAM)

MEMORY ANALYSIS:
- Model Memory Requirements: 36.0 GB (FP16 inference)
- Model Weights Size: 24.0 GB (FP16)
- Memory Recommendation: 🔄 Requires sequential CPU offloading
- Recommended Precision: float16
- Attention Slicing Recommended: True
- VAE Slicing Recommended: True

OPTIMIZATION KNOWLEDGE BASE:

# DIFFUSERS OPTIMIZATION TECHNIQUES

## Memory Optimization Techniques

### 1. Model CPU Offloading
Use `enable_model_cpu_offload()` to move models between GPU and CPU automatically:
```python
pipe.enable_model_cpu_offload()
```
- Saves significant VRAM by keeping only active models on GPU
- Automatic management, no manual intervention needed
- Compatible with all pipelines

### 2. Sequential CPU Offloading
Use `enable_sequential_cpu_offload()` for more aggressive memory saving:
```python
pipe.enable_sequential_cpu_offload()
```
- More memory efficient than model offloading
- Moves models to CPU after each forward pass
- Best for very limited VRAM scenarios

### 3. Attention Slicing
Use `enable_attention_slicing()` to reduce memory during attention computation:
```python
pipe.enable_attention_slicing()
# or specify slice size
pipe.enable_attention_slicing("max")  # maximum slicing
pipe.enable_attention_slicing(1)  # slice_size = 1
```
- Trades compute time for memory
- Most effective for high-resolution images
- Can be combined with other techniques

### 4. VAE Slicing
Use `enable_vae_slicing()` for large batch processing:
```python
pipe.enable_vae_slicing()
```
- Decodes images one at a time instead of all at once
- Essential for batch sizes > 4
- Minimal performance impact on single images

### 5. VAE Tiling
Use `enable_vae_tiling()` for high-resolution image generation:
```python
pipe.enable_vae_tiling()
```
- Enables 4K+ image generation on 8GB VRAM
- Splits images into overlapping tiles
- Automatically disabled for 512x512 or smaller images

### 6. Memory Efficient Attention (xFormers)
Use `enable_xformers_memory_efficient_attention()` if xFormers is installed:
```python
pipe.enable_xformers_memory_efficient_attention()
```
- Significantly reduces memory usage and improves speed
- Requires xformers library installation
- Compatible with most models

## Performance Optimization Techniques

### 1. Half Precision (FP16/BF16)
Use lower precision for better memory and speed:
```python
# FP16 (widely supported)
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
# BF16 (better numerical stability, newer hardware)
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
```
- FP16: Halves memory usage, widely supported
- BF16: Better numerical stability, requires newer GPUs
- Essential for most optimization scenarios

### 2. Torch Compile (PyTorch 2.0+)
Use `torch.compile()` for significant speed improvements:
```python
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
# For some models, compile VAE too:
pipe.vae.decode = torch.compile(pipe.vae.decode, mode="reduce-overhead", fullgraph=True)
```
- 5-50% speed improvement
- Requires PyTorch 2.0+
- First run is slower due to compilation

### 3. Fast Schedulers
Use faster schedulers for fewer steps:
```python
from diffusers import LMSDiscreteScheduler, UniPCMultistepScheduler
# LMS Scheduler (good quality, fast)
pipe.scheduler = LMSDiscreteScheduler.from_config(pipe.scheduler.config)
# UniPC Scheduler (fastest)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
```

## Hardware-Specific Optimizations

### NVIDIA GPU Optimizations
```python
# Enable Tensor Cores
torch.backends.cudnn.benchmark = True
# Optimal data type for NVIDIA
torch_dtype = torch.float16  # or torch.bfloat16 for RTX 30/40 series
```

### Apple Silicon (MPS) Optimizations
```python
# Use MPS device
device = "mps" if torch.backends.mps.is_available() else "cpu"
pipe = pipe.to(device)
# Recommended dtype for Apple Silicon
torch_dtype = torch.bfloat16  # Better than float16 on Apple Silicon
# Attention slicing often helps on MPS
pipe.enable_attention_slicing()
```

### CPU Optimizations
```python
# Use float32 for CPU
torch_dtype = torch.float32
# Enable optimized attention
pipe.enable_attention_slicing()
```

## Model-Specific Guidelines

### FLUX Models
- Do NOT use guidance_scale parameter (not needed for FLUX)
- Use 4-8 inference steps maximum
- BF16 dtype recommended
- Enable attention slicing for memory optimization

### Stable Diffusion XL
- Enable attention slicing for high resolutions
- Use refiner model sparingly to save memory
- Consider VAE tiling for >1024px images

### Stable Diffusion 1.5/2.1
- Very memory efficient base models
- Can often run without optimizations on 8GB+ VRAM
- Enable VAE slicing for batch processing

## Memory Usage Estimation
- FLUX.1: ~24GB for full precision, ~12GB for FP16
- SDXL: ~7GB for FP16, ~14GB for FP32
- SD 1.5: ~2GB for FP16, ~4GB for FP32

## Optimization Combinations by VRAM

### 24GB+ VRAM (High-end)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
pipe = pipe.to("cuda")
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
```

### 12-24GB VRAM (Mid-range)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to("cuda")
pipe.enable_model_cpu_offload()
pipe.enable_xformers_memory_efficient_attention()
```

### 8-12GB VRAM (Entry-level)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing()
pipe.enable_vae_slicing()
pipe.enable_xformers_memory_efficient_attention()
```

### <8GB VRAM (Low-end)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing("max")
pipe.enable_vae_slicing()
pipe.enable_vae_tiling()
```

IMPORTANT: For FLUX.1-schnell models, do NOT include guidance_scale parameter as it's not needed.

Using the OPTIMIZATION KNOWLEDGE BASE above, generate Python code that:
1. **Selects the best optimization techniques** for the specific hardware profile
2. **Applies appropriate memory optimizations** based on available VRAM
3. **Uses optimal data types** for the target hardware:
   - User specified dtype (if provided): Use exactly as specified
   - Apple Silicon (MPS): prefer torch.bfloat16
   - NVIDIA GPUs: prefer torch.float16 or torch.bfloat16
   - CPU only: use torch.float32
4. **Implements hardware-specific optimizations** (CUDA, MPS, CPU)
5. **Follows model-specific guidelines** (e.g., FLUX guidance_scale handling)

IMPORTANT GUIDELINES:
- Reference the OPTIMIZATION KNOWLEDGE BASE to select appropriate techniques
- Include all necessary imports
- Add brief comments explaining optimization choices
- Generate compact, production-ready code
- Inline values where possible for concise code
- Generate ONLY the Python code, no explanations before or after the code block
2025-05-30 10:32:34,640 - auto_diffusers - INFO - ================================================================================
2025-05-30 10:32:34,641 - auto_diffusers - INFO - Sending request to Gemini API
2025-05-30 10:33:08,633 - auto_diffusers - INFO - Successfully received response from Gemini API (no tools used)
2025-05-30 10:33:08,634 - auto_diffusers - DEBUG - Response length: 2670 characters
2025-05-30 11:43:56,204 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 11:43:56,204 - __main__ - DEBUG - API key found, length: 39
2025-05-30 11:43:56,204 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 11:43:56,204 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 11:43:56,204 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 11:43:56,204 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 11:43:56,204 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 11:43:56,204 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 11:43:56,204 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 11:43:56,204 - hardware_detector - DEBUG - Attempting GPU detection
via nvidia-smi 2025-05-30 11:43:56,208 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 11:43:56,208 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 11:43:56,690 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 11:43:56,690 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 11:43:56,690 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 11:43:56,690 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 11:43:56,690 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 11:43:56,690 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 11:43:56,690 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 11:43:56,690 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 11:43:56,690 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 11:43:56,690 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 11:43:56,690 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 11:43:56,692 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 11:43:56,704 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 11:43:56,712 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 11:43:56,814 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 11:43:56,848 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 11:43:56,849 - 
httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 11:43:56,849 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 11:43:56,849 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 11:43:56,849 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 11:43:56,849 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 11:43:56,849 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 11:43:56,850 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 02:43:56 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 11:43:56,850 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 11:43:56,850 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 11:43:56,850 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 11:43:56,850 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 11:43:56,850 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 11:43:56,850 - httpcore.connection - DEBUG - close.started 2025-05-30 11:43:56,850 - httpcore.connection - DEBUG - close.complete 2025-05-30 11:43:56,850 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 11:43:56,851 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 11:43:56,851 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 11:43:56,851 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 11:43:56,851 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 11:43:56,851 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 11:43:56,851 - httpcore.http11 - DEBUG - 
receive_response_headers.started request= 2025-05-30 11:43:56,857 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 02:43:56 GMT'), (b'server', b'uvicorn'), (b'content-length', b'101572'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 11:43:56,857 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 11:43:56,858 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 11:43:56,858 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 11:43:56,858 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 11:43:56,858 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 11:43:56,858 - httpcore.connection - DEBUG - close.started 2025-05-30 11:43:56,858 - httpcore.connection - DEBUG - close.complete 2025-05-30 11:43:56,869 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 11:43:56,886 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 11:43:56,886 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 11:43:56,996 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 11:43:57,019 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 11:43:57,019 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 11:43:57,182 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 11:43:57,183 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 11:43:57,183 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 11:43:57,183 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 11:43:57,184 
- httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 11:43:57,184 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 11:43:57,319 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 11:43:57,320 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 11:43:57,320 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 11:43:57,320 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 11:43:57,320 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 11:43:57,320 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 11:43:57,334 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 02:43:57 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 11:43:57,334 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 11:43:57,334 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 11:43:57,334 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 11:43:57,335 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 11:43:57,335 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 11:43:57,335 - httpcore.connection - DEBUG - close.started 2025-05-30 11:43:57,335 - httpcore.connection - DEBUG - close.complete 2025-05-30 11:43:57,473 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 02:43:57 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), 
(b'Content-Encoding', b'gzip')]) 2025-05-30 11:43:57,474 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 11:43:57,474 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 11:43:57,474 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 11:43:57,474 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 11:43:57,474 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 11:43:57,475 - httpcore.connection - DEBUG - close.started 2025-05-30 11:43:57,475 - httpcore.connection - DEBUG - close.complete 2025-05-30 11:43:58,049 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 11:43:58,271 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 11:43:58,333 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 11:43:58,334 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 11:43:58,334 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 11:43:58,334 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 11:43:58,334 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 11:43:58,334 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 11:43:58,334 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 11:43:58,335 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 11:43:58,335 - simple_memory_calculator - DEBUG - Using cached memory data for 
black-forest-labs/FLUX.1-schnell 2025-05-30 11:56:55,415 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 11:56:55,415 - __main__ - DEBUG - API key found, length: 39 2025-05-30 11:56:55,416 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 11:56:55,416 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 11:56:55,416 - auto_diffusers - DEBUG - Creating tools for Gemini 2025-05-30 11:56:55,416 - auto_diffusers - INFO - Created 3 tools for Gemini 2025-05-30 11:56:55,416 - auto_diffusers - INFO - Successfully configured Gemini AI model with tools 2025-05-30 11:56:55,416 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 11:56:55,416 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 11:56:55,416 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 11:56:55,416 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.12.9 2025-05-30 11:56:55,416 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 11:56:55,419 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 11:56:55,419 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 11:56:55,900 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 11:56:55,900 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 11:56:55,900 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 11:56:55,900 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.12.9', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 11:56:55,900 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 11:56:55,900 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 11:56:55,900 - simple_memory_calculator - INFO - Initializing 
SimpleMemoryCalculator 2025-05-30 11:56:55,900 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 11:56:55,900 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 11:56:55,900 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 11:56:55,900 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 11:56:55,902 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 11:56:55,909 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 11:56:55,916 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 11:56:55,979 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 11:56:56,083 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 11:56:56,083 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 11:56:56,166 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 11:56:56,418 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 11:56:56,419 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 11:56:56,419 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 11:56:56,420 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 11:56:56,420 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 11:56:56,420 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 11:56:56,560 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 02:56:56 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', 
b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 11:56:56,561 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 11:56:56,561 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 11:56:56,561 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 11:56:56,561 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 11:56:56,561 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 11:56:56,561 - httpcore.connection - DEBUG - close.started 2025-05-30 11:56:56,562 - httpcore.connection - DEBUG - close.complete 2025-05-30 11:57:37,480 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 11:57:37,480 - __main__ - DEBUG - API key found, length: 39 2025-05-30 11:57:37,480 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 11:57:37,480 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 11:57:37,480 - auto_diffusers - DEBUG - Creating tools for Gemini 2025-05-30 11:57:37,480 - auto_diffusers - INFO - Created 3 tools for Gemini 2025-05-30 11:57:37,481 - auto_diffusers - INFO - Successfully configured Gemini AI model with tools 2025-05-30 11:57:37,481 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 11:57:37,481 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 11:57:37,481 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 11:57:37,481 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.12.9 2025-05-30 11:57:37,481 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 11:57:37,485 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 11:57:37,485 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 11:57:37,976 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 11:57:37,976 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 
2025-05-30 11:57:37,976 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 11:57:37,976 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.12.9', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 11:57:37,976 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 11:57:37,976 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 11:57:37,976 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 11:57:37,976 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 11:57:37,976 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 11:57:37,976 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 11:57:37,976 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 11:57:37,979 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 11:57:37,980 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 11:57:37,992 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 11:57:38,053 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 11:57:38,141 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 11:57:38,141 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 11:57:38,242 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 11:57:38,420 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 11:57:38,420 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 11:57:38,421 - 
httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 11:57:38,422 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 11:57:38,422 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 11:57:38,422 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 11:57:38,563 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 02:57:38 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 11:57:38,563 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 11:57:38,564 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 11:57:38,564 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 11:57:38,564 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 11:57:38,564 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 11:57:38,564 - httpcore.connection - DEBUG - close.started 2025-05-30 11:57:38,565 - httpcore.connection - DEBUG - close.complete 2025-05-30 11:58:21,477 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 11:58:21,477 - __main__ - DEBUG - API key found, length: 39 2025-05-30 11:58:21,477 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 11:58:21,478 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 11:58:21,478 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 11:58:21,478 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 11:58:21,478 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 11:58:21,478 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 11:58:21,478 - hardware_detector - DEBUG - CPU cores: 16, Python: 
3.11.11 2025-05-30 11:58:21,478 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 11:58:21,482 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 11:58:21,482 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 11:58:22,009 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 11:58:22,009 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 11:58:22,009 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 11:58:22,009 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 11:58:22,009 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 11:58:22,009 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 11:58:22,009 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 11:58:22,009 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 11:58:22,009 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 11:58:22,009 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 11:58:22,009 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 11:58:22,011 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 11:58:22,025 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 11:58:22,031 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 11:58:22,110 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 11:58:22,141 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 
local_address=None timeout=None socket_options=None 2025-05-30 11:58:22,141 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 11:58:22,141 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 11:58:22,141 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 11:58:22,142 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 11:58:22,142 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 11:58:22,142 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 11:58:22,142 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 02:58:22 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 11:58:22,142 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 11:58:22,142 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 11:58:22,142 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 11:58:22,142 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 11:58:22,142 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 11:58:22,143 - httpcore.connection - DEBUG - close.started 2025-05-30 11:58:22,143 - httpcore.connection - DEBUG - close.complete 2025-05-30 11:58:22,143 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 11:58:22,143 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 11:58:22,143 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 11:58:22,143 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 11:58:22,143 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 11:58:22,143 - httpcore.http11 - DEBUG - 
send_request_body.complete 2025-05-30 11:58:22,143 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 11:58:22,150 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 02:58:22 GMT'), (b'server', b'uvicorn'), (b'content-length', b'101579'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 11:58:22,150 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 11:58:22,150 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 11:58:22,150 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 11:58:22,150 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 11:58:22,150 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 11:58:22,150 - httpcore.connection - DEBUG - close.started 2025-05-30 11:58:22,150 - httpcore.connection - DEBUG - close.complete 2025-05-30 11:58:22,162 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 11:58:22,262 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 11:58:22,262 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 11:58:22,311 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 11:58:22,311 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 11:58:22,416 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 11:58:22,535 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 11:58:22,535 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 11:58:22,536 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 11:58:22,536 - 
httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 11:58:22,536 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 11:58:22,536 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 11:58:22,611 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 11:58:22,611 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 11:58:22,612 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 11:58:22,612 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 11:58:22,612 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 11:58:22,612 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 11:58:22,674 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 02:58:22 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 11:58:22,674 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 11:58:22,674 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 11:58:22,674 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 11:58:22,674 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 11:58:22,675 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 11:58:22,675 - httpcore.connection - DEBUG - close.started 2025-05-30 11:58:22,675 - httpcore.connection - DEBUG - close.complete 2025-05-30 11:58:22,762 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 02:58:22 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), 
(b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 11:58:22,762 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 11:58:22,762 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 11:58:22,762 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 11:58:22,763 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 11:58:22,763 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 11:58:22,763 - httpcore.connection - DEBUG - close.started 2025-05-30 11:58:22,763 - httpcore.connection - DEBUG - close.complete 2025-05-30 11:58:23,210 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 11:58:23,210 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 11:58:23,210 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 11:58:23,210 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 11:58:23,210 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 11:58:23,210 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 11:58:23,210 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 11:58:23,210 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 11:58:23,210 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 11:58:23,418 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 11:58:23,633 - 
urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 12:00:05,794 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 12:00:05,794 - __main__ - DEBUG - API key found, length: 39 2025-05-30 12:00:05,794 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 12:00:05,794 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 12:00:05,795 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 12:00:05,795 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 12:00:05,795 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 12:00:05,795 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 12:00:05,795 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 12:00:05,795 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 12:00:05,798 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 12:00:05,798 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 12:00:06,286 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 12:00:06,286 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 12:00:06,286 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 12:00:06,286 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 12:00:06,287 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 12:00:06,287 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 12:00:06,287 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 12:00:06,287 - 
simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 12:00:06,287 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 12:00:06,287 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 12:00:06,287 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 12:00:06,289 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:00:06,308 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 12:00:06,309 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:00:06,392 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:00:06,432 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 12:00:06,432 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:00:06,432 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:00:06,432 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:00:06,432 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:00:06,432 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:00:06,432 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:00:06,433 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:00:06 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 12:00:06,434 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 12:00:06,434 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:00:06,434 - httpcore.http11 - DEBUG - 
receive_response_body.complete 2025-05-30 12:00:06,434 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:00:06,434 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:00:06,434 - httpcore.connection - DEBUG - close.started 2025-05-30 12:00:06,434 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:00:06,435 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 12:00:06,435 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:00:06,436 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:00:06,436 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:00:06,436 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:00:06,436 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:00:06,436 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:00:06,446 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:00:06 GMT'), (b'server', b'uvicorn'), (b'content-length', b'101732'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 12:00:06,447 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 12:00:06,447 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:00:06,447 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:00:06,447 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:00:06,447 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:00:06,447 - httpcore.connection - DEBUG - close.started 2025-05-30 12:00:06,447 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:00:06,459 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 
socket_options=None 2025-05-30 12:00:06,471 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:00:06,472 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 12:00:06,619 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:00:06,619 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 12:00:06,647 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 12:00:06,750 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:00:06,750 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:00:06,750 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:00:06,751 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:00:06,751 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:00:06,751 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:00:06,890 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:00:06 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 12:00:06,891 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 12:00:06,891 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:00:06,891 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:00:06,892 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:00:06,892 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:00:06,892 - httpcore.connection - DEBUG - close.started 2025-05-30 12:00:06,893 
- httpcore.connection - DEBUG - close.complete 2025-05-30 12:00:06,948 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:00:06,948 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:00:06,948 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:00:06,948 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:00:06,949 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:00:06,949 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:00:07,111 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:00:07 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 12:00:07,111 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 12:00:07,111 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:00:07,112 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:00:07,112 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:00:07,112 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:00:07,112 - httpcore.connection - DEBUG - close.started 2025-05-30 12:00:07,113 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:00:07,799 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:00:07,951 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:00:07,951 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:00:07,952 - simple_memory_calculator 
- DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 12:00:07,952 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 12:00:07,952 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:00:07,952 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:00:07,952 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 12:00:07,952 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:00:07,952 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:00:08,014 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 12:00:57,110 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 12:00:57,110 - __main__ - DEBUG - API key found, length: 39 2025-05-30 12:00:57,110 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 12:00:57,110 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 12:00:57,110 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 12:00:57,110 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 12:00:57,110 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 12:00:57,110 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 12:00:57,110 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 12:00:57,110 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 12:00:57,113 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 12:00:57,113 - 
hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 12:00:57,568 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 12:00:57,568 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 12:00:57,568 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 12:00:57,568 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 12:00:57,568 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 12:00:57,568 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 12:00:57,568 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 12:00:57,568 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 12:00:57,568 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 12:00:57,568 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 12:00:57,568 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 12:00:57,570 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:00:57,584 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 12:00:57,591 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:00:57,670 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:00:57,703 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 12:00:57,704 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:00:57,704 - httpcore.http11 - DEBUG - send_request_headers.started request= 
2025-05-30 12:00:57,704 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:00:57,704 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:00:57,704 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:00:57,704 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:00:57,705 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:00:57 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 12:00:57,705 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 12:00:57,705 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:00:57,705 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:00:57,705 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:00:57,705 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:00:57,705 - httpcore.connection - DEBUG - close.started 2025-05-30 12:00:57,705 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:00:57,706 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 12:00:57,706 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:00:57,706 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:00:57,706 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:00:57,706 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:00:57,706 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:00:57,706 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:00:57,712 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', 
[(b'date', b'Fri, 30 May 2025 03:00:57 GMT'), (b'server', b'uvicorn'), (b'content-length', b'101701'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 12:00:57,712 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 12:00:57,713 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:00:57,713 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:00:57,713 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:00:57,713 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:00:57,713 - httpcore.connection - DEBUG - close.started 2025-05-30 12:00:57,713 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:00:57,724 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 12:00:57,728 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:00:57,728 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 12:00:57,863 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:00:57,863 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 12:00:57,896 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 12:00:58,002 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:00:58,003 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:00:58,003 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:00:58,004 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:00:58,004 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:00:58,004 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 
12:00:58,141 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:00:58 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 12:00:58,141 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 12:00:58,141 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:00:58,142 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:00:58,142 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:00:58,142 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:00:58,142 - httpcore.connection - DEBUG - close.started 2025-05-30 12:00:58,142 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:00:58,144 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:00:58,144 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:00:58,144 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:00:58,144 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:00:58,144 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:00:58,144 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:00:58,285 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:00:58 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 12:00:58,285 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 
12:00:58,286 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:00:58,286 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:00:58,286 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:00:58,287 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:00:58,287 - httpcore.connection - DEBUG - close.started
2025-05-30 12:00:58,287 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:00:58,874 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 12:00:59,128 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 12:00:59,836 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:00:59,837 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:00:59,837 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 12:00:59,837 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 12:00:59,837 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:00:59,837 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:00:59,837 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 12:00:59,837 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:00:59,837 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:01:50,907 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 12:01:50,907 - __main__ - DEBUG - API key found, length: 39
2025-05-30 12:01:50,907 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 12:01:50,907 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 12:01:50,907 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 12:01:50,907 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 12:01:50,907 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 12:01:50,907 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 12:01:50,907 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 12:01:50,907 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 12:01:50,911 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 12:01:50,912 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 12:01:51,381 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 12:01:51,381 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 12:01:51,381 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 12:01:51,381 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 12:01:51,381 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 12:01:51,381 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 12:01:51,381 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 12:01:51,381 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 12:01:51,381 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 12:01:51,381 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
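The SimpleMemoryCalculator behavior traced in these records — a small table of known models, cached lookups, and a recommendation derived from the VRAM budget — can be sketched roughly as follows. This is a hypothetical reconstruction: the function and key names are invented; only the FLUX.1-schnell figures (12.0B params, 24.0 GB FP16 weights, 36.0 GB FP16 inference) and the sequential-offload outcome at 8.0 GB VRAM come from the log.

```python
# Hypothetical sketch of the memory-recommendation logic implied by the log.
# The FLUX.1-schnell figures match the "Known data" entry logged above.
KNOWN_MODELS = {
    "black-forest-labs/FLUX.1-schnell": {
        "params_billions": 12.0,
        "fp16_gb": 24.0,            # model weights, FP16
        "inference_fp16_gb": 36.0,  # peak memory during FP16 inference
    },
}

def recommend_memory(model_id: str, vram_gb: float) -> dict:
    """Return a memory recommendation for the given model and VRAM budget."""
    data = KNOWN_MODELS[model_id]
    if vram_gb >= data["inference_fp16_gb"]:
        recommendation = "fits fully in VRAM"
    elif vram_gb >= data["fp16_gb"]:
        recommendation = "model CPU offloading"
    else:
        recommendation = "sequential CPU offloading"
    tight = vram_gb < data["inference_fp16_gb"]
    return {
        "model_gb": data["fp16_gb"],
        "inference_gb": data["inference_fp16_gb"],
        "recommendation": recommendation,
        "attention_slicing": tight,
        "vae_slicing": tight,
    }
```

With 8.0 GB of VRAM this yields the "sequential CPU offloading" recommendation that later shows up in the memory analysis embedded in the Gemini prompt.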
2025-05-30 12:01:51,381 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 12:01:51,384 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:01:51,397 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 12:01:51,397 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:01:51,493 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:01:51,535 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 12:01:51,536 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:01:51,536 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:01:51,536 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:01:51,536 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:01:51,537 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:01:51,537 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:01:51,537 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:01:51 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 12:01:51,537 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 12:01:51,537 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:01:51,537 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:01:51,537 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:01:51,537 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:01:51,537 - httpcore.connection - DEBUG - close.started 
2025-05-30 12:01:51,537 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:01:51,538 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 12:01:51,538 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:01:51,538 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:01:51,538 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:01:51,539 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:01:51,539 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:01:51,539 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:01:51,545 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:01:51 GMT'), (b'server', b'uvicorn'), (b'content-length', b'98531'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 12:01:51,545 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 12:01:51,545 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:01:51,545 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:01:51,545 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:01:51,545 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:01:51,545 - httpcore.connection - DEBUG - close.started 2025-05-30 12:01:51,546 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:01:51,557 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 12:01:51,604 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:01:51,605 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 12:01:51,694 - 
urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 12:01:51,714 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:01:51,714 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 12:01:51,894 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:01:51,895 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:01:51,895 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:01:51,895 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:01:51,896 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:01:51,896 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:01:52,036 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:01:52,036 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:01:52,036 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:01:52,036 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:01:52,036 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:01:52,037 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:01:52,045 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:01:52 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 12:01:52,045 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 12:01:52,045 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:01:52,045 - httpcore.http11 - DEBUG - 
receive_response_body.complete 2025-05-30 12:01:52,045 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:01:52,045 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:01:52,046 - httpcore.connection - DEBUG - close.started 2025-05-30 12:01:52,046 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:01:52,195 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:01:52 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 12:01:52,195 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 12:01:52,195 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:01:52,195 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:01:52,195 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:01:52,195 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:01:52,195 - httpcore.connection - DEBUG - close.started 2025-05-30 12:01:52,196 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:01:52,768 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:01:52,998 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 12:01:53,107 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:01:53,107 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:01:53,107 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 
12:01:53,107 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 12:01:53,107 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:01:53,107 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:01:53,107 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 12:01:53,107 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:01:53,108 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:01:54,428 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:01:54,428 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:01:54,428 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 12:01:54,428 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:01:54,428 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:01:54,429 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 12:01:54,429 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:01:54,429 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:01:54,429 - auto_diffusers - INFO - Starting code generation for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:01:54,429 - auto_diffusers - DEBUG - Parameters: prompt='A cat holding a sign that says hello world...', size=(768, 1360), steps=4
2025-05-30 12:01:54,430 - auto_diffusers - DEBUG - Manual specs: True, Memory analysis provided: True
2025-05-30 12:01:54,430 - auto_diffusers - INFO - Using manual hardware specifications
2025-05-30 12:01:54,430 - auto_diffusers - DEBUG - Manual specs: {'platform': 'Linux', 'architecture': 'manual_input', 'cpu_count': 8, 'python_version': '3.11', 'cuda_available': False, 'mps_available': False, 'torch_version': '2.0+', 'manual_input': True, 'ram_gb': 16, 'user_dtype': None, 'gpu_info': [{'name': 'Custom GPU', 'memory_mb': 8192}]}
2025-05-30 12:01:54,430 - auto_diffusers - DEBUG - GPU detected with 8.0 GB VRAM
2025-05-30 12:01:54,430 - auto_diffusers - INFO - Selected optimization profile: balanced
2025-05-30 12:01:54,430 - auto_diffusers - DEBUG - Creating generation prompt for Gemini API
2025-05-30 12:01:54,430 - auto_diffusers - DEBUG - Prompt length: 7598 characters
2025-05-30 12:01:54,430 - auto_diffusers - INFO - ================================================================================
2025-05-30 12:01:54,430 - auto_diffusers - INFO - PROMPT SENT TO GEMINI API:
2025-05-30 12:01:54,430 - auto_diffusers - INFO - ================================================================================
2025-05-30 12:01:54,430 - auto_diffusers - INFO - You are an expert in optimizing diffusers library code for different hardware configurations. NOTE: This system includes curated optimization knowledge from HuggingFace documentation.

TASK: Generate optimized Python code for running a diffusion model with the following specifications:
- Model: black-forest-labs/FLUX.1-schnell
- Prompt: "A cat holding a sign that says hello world"
- Image size: 768x1360
- Inference steps: 4

HARDWARE SPECIFICATIONS:
- Platform: Linux (manual_input)
- CPU Cores: 8
- CUDA Available: False
- MPS Available: False
- Optimization Profile: balanced
- GPU: Custom GPU (8.0 GB VRAM)

MEMORY ANALYSIS:
- Model Memory Requirements: 36.0 GB (FP16 inference)
- Model Weights Size: 24.0 GB (FP16)
- Memory Recommendation: 🔄 Requires sequential CPU offloading
- Recommended Precision: float16
- Attention Slicing Recommended: True
- VAE Slicing Recommended: True

OPTIMIZATION KNOWLEDGE BASE:

# DIFFUSERS OPTIMIZATION TECHNIQUES

## Memory Optimization Techniques

### 1. Model CPU Offloading
Use `enable_model_cpu_offload()` to move models between GPU and CPU automatically:
```python
pipe.enable_model_cpu_offload()
```
- Saves significant VRAM by keeping only active models on GPU
- Automatic management, no manual intervention needed
- Compatible with all pipelines

### 2. Sequential CPU Offloading
Use `enable_sequential_cpu_offload()` for more aggressive memory saving:
```python
pipe.enable_sequential_cpu_offload()
```
- More memory efficient than model offloading
- Moves models to CPU after each forward pass
- Best for very limited VRAM scenarios

### 3. Attention Slicing
Use `enable_attention_slicing()` to reduce memory during attention computation:
```python
pipe.enable_attention_slicing()
# or specify slice size
pipe.enable_attention_slicing("max")  # maximum slicing
pipe.enable_attention_slicing(1)  # slice_size = 1
```
- Trades compute time for memory
- Most effective for high-resolution images
- Can be combined with other techniques

### 4. VAE Slicing
Use `enable_vae_slicing()` for large batch processing:
```python
pipe.enable_vae_slicing()
```
- Decodes images one at a time instead of all at once
- Essential for batch sizes > 4
- Minimal performance impact on single images

### 5. VAE Tiling
Use `enable_vae_tiling()` for high-resolution image generation:
```python
pipe.enable_vae_tiling()
```
- Enables 4K+ image generation on 8GB VRAM
- Splits images into overlapping tiles
- Automatically disabled for 512x512 or smaller images

### 6. Memory Efficient Attention (xFormers)
Use `enable_xformers_memory_efficient_attention()` if xFormers is installed:
```python
pipe.enable_xformers_memory_efficient_attention()
```
- Significantly reduces memory usage and improves speed
- Requires xformers library installation
- Compatible with most models

## Performance Optimization Techniques

### 1. Half Precision (FP16/BF16)
Use lower precision for better memory and speed:
```python
# FP16 (widely supported)
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
# BF16 (better numerical stability, newer hardware)
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
```
- FP16: Halves memory usage, widely supported
- BF16: Better numerical stability, requires newer GPUs
- Essential for most optimization scenarios

### 2. Torch Compile (PyTorch 2.0+)
Use `torch.compile()` for significant speed improvements:
```python
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
# For some models, compile VAE too:
pipe.vae.decode = torch.compile(pipe.vae.decode, mode="reduce-overhead", fullgraph=True)
```
- 5-50% speed improvement
- Requires PyTorch 2.0+
- First run is slower due to compilation

### 3. Fast Schedulers
Use faster schedulers for fewer steps:
```python
from diffusers import LMSDiscreteScheduler, UniPCMultistepScheduler
# LMS Scheduler (good quality, fast)
pipe.scheduler = LMSDiscreteScheduler.from_config(pipe.scheduler.config)
# UniPC Scheduler (fastest)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
```

## Hardware-Specific Optimizations

### NVIDIA GPU Optimizations
```python
# Enable Tensor Cores
torch.backends.cudnn.benchmark = True
# Optimal data type for NVIDIA
torch_dtype = torch.float16  # or torch.bfloat16 for RTX 30/40 series
```

### Apple Silicon (MPS) Optimizations
```python
# Use MPS device
device = "mps" if torch.backends.mps.is_available() else "cpu"
pipe = pipe.to(device)
# Recommended dtype for Apple Silicon
torch_dtype = torch.bfloat16  # Better than float16 on Apple Silicon
# Attention slicing often helps on MPS
pipe.enable_attention_slicing()
```

### CPU Optimizations
```python
# Use float32 for CPU
torch_dtype = torch.float32
# Enable optimized attention
pipe.enable_attention_slicing()
```

## Model-Specific Guidelines

### FLUX Models
- Do NOT use guidance_scale parameter (not needed for FLUX)
- Use 4-8 inference steps maximum
- BF16 dtype recommended
- Enable attention slicing for memory optimization

### Stable Diffusion XL
- Enable attention slicing for high resolutions
- Use refiner model sparingly to save memory
- Consider VAE tiling for >1024px images

### Stable Diffusion 1.5/2.1
- Very memory efficient base models
- Can often run without optimizations on 8GB+ VRAM
- Enable VAE slicing for batch processing

## Memory Usage Estimation
- FLUX.1: ~24GB for full precision, ~12GB for FP16
- SDXL: ~7GB for FP16, ~14GB for FP32
- SD 1.5: ~2GB for FP16, ~4GB for FP32

## Optimization Combinations by VRAM

### 24GB+ VRAM (High-end)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
pipe = pipe.to("cuda")
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
```

### 12-24GB VRAM (Mid-range)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to("cuda")
pipe.enable_model_cpu_offload()
pipe.enable_xformers_memory_efficient_attention()
```

### 8-12GB VRAM (Entry-level)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing()
pipe.enable_vae_slicing()
pipe.enable_xformers_memory_efficient_attention()
```

### <8GB VRAM (Low-end)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing("max")
pipe.enable_vae_slicing()
pipe.enable_vae_tiling()
```

IMPORTANT: For FLUX.1-schnell models, do NOT include guidance_scale parameter as it's not needed.

Using the OPTIMIZATION KNOWLEDGE BASE above, generate Python code that:
1. **Selects the best optimization techniques** for the specific hardware profile
2. **Applies appropriate memory optimizations** based on available VRAM
3. **Uses optimal data types** for the target hardware:
   - User specified dtype (if provided): Use exactly as specified
   - Apple Silicon (MPS): prefer torch.bfloat16
   - NVIDIA GPUs: prefer torch.float16 or torch.bfloat16
   - CPU only: use torch.float32
4. **Implements hardware-specific optimizations** (CUDA, MPS, CPU)
5. **Follows model-specific guidelines** (e.g., FLUX guidance_scale handling)

IMPORTANT GUIDELINES:
- Reference the OPTIMIZATION KNOWLEDGE BASE to select appropriate techniques
- Include all necessary imports
- Add brief comments explaining optimization choices
- Generate compact, production-ready code
- Inline values where possible for concise code
- Generate ONLY the Python code, no explanations before or after the code block
2025-05-30 12:01:54,430 - auto_diffusers - INFO - ================================================================================
2025-05-30 12:01:54,431 - auto_diffusers - INFO - Sending request to Gemini API
2025-05-30 12:02:14,766 - auto_diffusers - INFO - Successfully received response from Gemini API (no tools used)
2025-05-30 12:02:14,766 - auto_diffusers - DEBUG - Response length: 2109 characters
2025-05-30 12:04:37,376 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 12:04:37,376 - __main__ - DEBUG - API key found, length: 39
2025-05-30 12:04:37,376 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 12:04:37,376 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 12:04:37,376 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 12:04:37,376 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 12:04:37,376 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 12:04:37,376 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 12:04:37,376 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 12:04:37,376 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 12:04:37,380 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 12:04:37,380 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 12:04:37,850 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 12:04:37,850 -
hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 12:04:37,850 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 12:04:37,850 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 12:04:37,850 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 12:04:37,850 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 12:04:37,850 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 12:04:37,850 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 12:04:37,850 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 12:04:37,850 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 12:04:37,850 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 12:04:37,852 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:04:37,866 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 12:04:37,866 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:04:37,956 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:04:37,989 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 12:04:37,990 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:04:37,990 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:04:37,990 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:04:37,990 - httpcore.http11 - DEBUG - send_request_body.started 
request= 2025-05-30 12:04:37,990 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:04:37,991 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:04:37,991 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:04:37 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 12:04:37,991 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 12:04:37,991 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:04:37,991 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:04:37,991 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:04:37,991 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:04:37,991 - httpcore.connection - DEBUG - close.started 2025-05-30 12:04:37,991 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:04:37,991 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 12:04:37,992 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:04:37,992 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:04:37,992 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:04:37,992 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:04:37,992 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:04:37,992 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:04:37,998 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:04:37 GMT'), (b'server', b'uvicorn'), (b'content-length', b'101436'), (b'content-type', b'text/html; charset=utf-8')]) 
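The two localhost probes logged in this stretch (GET /gradio_api/startup-events followed by HEAD /) are how the launcher confirms that the freshly started server is reachable. A minimal stand-alone version of that readiness check, using only the standard library, might look like the sketch below; the endpoint path comes from the log, while the function name is invented:

```python
from urllib.request import urlopen
from urllib.error import URLError

def server_ready(base: str = "http://localhost:7860", timeout: float = 3.0) -> bool:
    """Probe the Gradio startup-events endpoint; True once it answers HTTP 200."""
    try:
        with urlopen(f"{base}/gradio_api/startup-events", timeout=timeout) as resp:
            return resp.status == 200
    except (URLError, OSError):
        # Connection refused or timed out: the server is not up yet.
        return False
```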
2025-05-30 12:04:37,998 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 12:04:37,998 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:04:37,998 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:04:37,999 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:04:37,999 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:04:37,999 - httpcore.connection - DEBUG - close.started
2025-05-30 12:04:37,999 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:04:38,010 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 12:04:38,116 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:04:38,116 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 12:04:38,147 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:04:38,147 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 12:04:38,161 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 12:04:38,406 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 12:04:38,407 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:04:38,407 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:04:38,407 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:04:38,407 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:04:38,407 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:04:38,424 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 12:04:38,424 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:04:38,424 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:04:38,424 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:04:38,424 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:04:38,424 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:04:38,553 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:04:38 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 12:04:38,553 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 12:04:38,553 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:04:38,554 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:04:38,554 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:04:38,554 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:04:38,555 - httpcore.connection - DEBUG - close.started
2025-05-30 12:04:38,555 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:04:38,565 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:04:38 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 12:04:38,565 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 12:04:38,565 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:04:38,565 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:04:38,565 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:04:38,566 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:04:38,566 - httpcore.connection - DEBUG - close.started
2025-05-30 12:04:38,566 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:04:39,438 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 12:04:39,659 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 12:04:39,711 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:04:39,711 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:04:39,711 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 12:04:39,711 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 12:04:39,711 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:04:39,711 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:04:39,711 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 12:04:39,711 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:04:39,712 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:04:40,570 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:04:40,570 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:04:40,571 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 12:04:40,571 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:04:40,571 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:04:40,571 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 12:04:40,571 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:04:40,571 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:04:40,571 - auto_diffusers - INFO - Starting code generation for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:04:40,571 - auto_diffusers - DEBUG - Parameters: prompt='A cat holding a sign that says hello world...', size=(768, 1360), steps=4
2025-05-30 12:04:40,571 - auto_diffusers - DEBUG - Manual specs: True, Memory analysis provided: True
2025-05-30 12:04:40,571 - auto_diffusers - INFO - Using manual hardware specifications
2025-05-30 12:04:40,571 - auto_diffusers - DEBUG - Manual specs: {'platform': 'Linux', 'architecture': 'manual_input', 'cpu_count': 8, 'python_version': '3.11', 'cuda_available': False, 'mps_available': False, 'torch_version': '2.0+', 'manual_input': True, 'ram_gb': 16, 'user_dtype': None, 'gpu_info': [{'name': 'Custom GPU', 'memory_mb': 8192}]}
2025-05-30 12:04:40,571 - auto_diffusers - DEBUG - GPU detected with 8.0 GB VRAM
2025-05-30 12:04:40,572 - auto_diffusers - INFO - Selected optimization profile: balanced
2025-05-30 12:04:40,572 - auto_diffusers - DEBUG - Creating generation prompt for Gemini API
2025-05-30 12:04:40,572 - auto_diffusers - DEBUG - Prompt length: 7598 characters
2025-05-30 12:04:40,572 - auto_diffusers - INFO - ================================================================================
2025-05-30 12:04:40,572 - auto_diffusers - INFO - PROMPT SENT TO GEMINI API:
2025-05-30 12:04:40,572 - auto_diffusers - INFO - ================================================================================
2025-05-30 12:04:40,572 - auto_diffusers - INFO - You are an expert in optimizing diffusers library code for different hardware configurations.

NOTE: This system includes curated optimization knowledge from HuggingFace documentation.

TASK: Generate optimized Python code for running a diffusion model with the following specifications:
- Model: black-forest-labs/FLUX.1-schnell
- Prompt: "A cat holding a sign that says hello world"
- Image size: 768x1360
- Inference steps: 4

HARDWARE SPECIFICATIONS:
- Platform: Linux (manual_input)
- CPU Cores: 8
- CUDA Available: False
- MPS Available: False
- Optimization Profile: balanced
- GPU: Custom GPU (8.0 GB VRAM)

MEMORY ANALYSIS:
- Model Memory Requirements: 36.0 GB (FP16 inference)
- Model Weights Size: 24.0 GB (FP16)
- Memory Recommendation: 🔄 Requires sequential CPU offloading
- Recommended Precision: float16
- Attention Slicing Recommended: True
- VAE Slicing Recommended: True

OPTIMIZATION KNOWLEDGE BASE:

# DIFFUSERS OPTIMIZATION TECHNIQUES

## Memory Optimization Techniques

### 1. Model CPU Offloading
Use `enable_model_cpu_offload()` to move models between GPU and CPU automatically:
```python
pipe.enable_model_cpu_offload()
```
- Saves significant VRAM by keeping only active models on GPU
- Automatic management, no manual intervention needed
- Compatible with all pipelines

### 2. Sequential CPU Offloading
Use `enable_sequential_cpu_offload()` for more aggressive memory saving:
```python
pipe.enable_sequential_cpu_offload()
```
- More memory efficient than model offloading
- Moves models to CPU after each forward pass
- Best for very limited VRAM scenarios

### 3. Attention Slicing
Use `enable_attention_slicing()` to reduce memory during attention computation:
```python
pipe.enable_attention_slicing()
# or specify slice size
pipe.enable_attention_slicing("max")  # maximum slicing
pipe.enable_attention_slicing(1)      # slice_size = 1
```
- Trades compute time for memory
- Most effective for high-resolution images
- Can be combined with other techniques

### 4. VAE Slicing
Use `enable_vae_slicing()` for large batch processing:
```python
pipe.enable_vae_slicing()
```
- Decodes images one at a time instead of all at once
- Essential for batch sizes > 4
- Minimal performance impact on single images

### 5. VAE Tiling
Use `enable_vae_tiling()` for high-resolution image generation:
```python
pipe.enable_vae_tiling()
```
- Enables 4K+ image generation on 8GB VRAM
- Splits images into overlapping tiles
- Automatically disabled for 512x512 or smaller images

### 6. Memory Efficient Attention (xFormers)
Use `enable_xformers_memory_efficient_attention()` if xFormers is installed:
```python
pipe.enable_xformers_memory_efficient_attention()
```
- Significantly reduces memory usage and improves speed
- Requires xformers library installation
- Compatible with most models

## Performance Optimization Techniques

### 1. Half Precision (FP16/BF16)
Use lower precision for better memory and speed:
```python
# FP16 (widely supported)
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)

# BF16 (better numerical stability, newer hardware)
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
```
- FP16: Halves memory usage, widely supported
- BF16: Better numerical stability, requires newer GPUs
- Essential for most optimization scenarios

### 2. Torch Compile (PyTorch 2.0+)
Use `torch.compile()` for significant speed improvements:
```python
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
# For some models, compile VAE too:
pipe.vae.decode = torch.compile(pipe.vae.decode, mode="reduce-overhead", fullgraph=True)
```
- 5-50% speed improvement
- Requires PyTorch 2.0+
- First run is slower due to compilation

### 3. Fast Schedulers
Use faster schedulers for fewer steps:
```python
from diffusers import LMSDiscreteScheduler, UniPCMultistepScheduler

# LMS Scheduler (good quality, fast)
pipe.scheduler = LMSDiscreteScheduler.from_config(pipe.scheduler.config)

# UniPC Scheduler (fastest)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
```

## Hardware-Specific Optimizations

### NVIDIA GPU Optimizations
```python
# Enable Tensor Cores
torch.backends.cudnn.benchmark = True

# Optimal data type for NVIDIA
torch_dtype = torch.float16  # or torch.bfloat16 for RTX 30/40 series
```

### Apple Silicon (MPS) Optimizations
```python
# Use MPS device
device = "mps" if torch.backends.mps.is_available() else "cpu"
pipe = pipe.to(device)

# Recommended dtype for Apple Silicon
torch_dtype = torch.bfloat16  # Better than float16 on Apple Silicon

# Attention slicing often helps on MPS
pipe.enable_attention_slicing()
```

### CPU Optimizations
```python
# Use float32 for CPU
torch_dtype = torch.float32

# Enable optimized attention
pipe.enable_attention_slicing()
```

## Model-Specific Guidelines

### FLUX Models
- Do NOT use guidance_scale parameter (not needed for FLUX)
- Use 4-8 inference steps maximum
- BF16 dtype recommended
- Enable attention slicing for memory optimization

### Stable Diffusion XL
- Enable attention slicing for high resolutions
- Use refiner model sparingly to save memory
- Consider VAE tiling for >1024px images

### Stable Diffusion 1.5/2.1
- Very memory efficient base models
- Can often run without optimizations on 8GB+ VRAM
- Enable VAE slicing for batch processing

## Memory Usage Estimation
- FLUX.1: ~24GB for full precision, ~12GB for FP16
- SDXL: ~7GB for FP16, ~14GB for FP32
- SD 1.5: ~2GB for FP16, ~4GB for FP32

## Optimization Combinations by VRAM

### 24GB+ VRAM (High-end)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
pipe = pipe.to("cuda")
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
```

### 12-24GB VRAM (Mid-range)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to("cuda")
pipe.enable_model_cpu_offload()
pipe.enable_xformers_memory_efficient_attention()
```

### 8-12GB VRAM (Entry-level)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing()
pipe.enable_vae_slicing()
pipe.enable_xformers_memory_efficient_attention()
```

### <8GB VRAM (Low-end)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing("max")
pipe.enable_vae_slicing()
pipe.enable_vae_tiling()
```

IMPORTANT: For FLUX.1-schnell models, do NOT include guidance_scale parameter as it's not needed.

Using the OPTIMIZATION KNOWLEDGE BASE above, generate Python code that:
1. **Selects the best optimization techniques** for the specific hardware profile
2. **Applies appropriate memory optimizations** based on available VRAM
3. **Uses optimal data types** for the target hardware:
   - User specified dtype (if provided): Use exactly as specified
   - Apple Silicon (MPS): prefer torch.bfloat16
   - NVIDIA GPUs: prefer torch.float16 or torch.bfloat16
   - CPU only: use torch.float32
4. **Implements hardware-specific optimizations** (CUDA, MPS, CPU)
5. **Follows model-specific guidelines** (e.g., FLUX guidance_scale handling)

IMPORTANT GUIDELINES:
- Reference the OPTIMIZATION KNOWLEDGE BASE to select appropriate techniques
- Include all necessary imports
- Add brief comments explaining optimization choices
- Generate compact, production-ready code
- Inline values where possible for concise code
- Generate ONLY the Python code, no explanations before or after the code block
2025-05-30 12:04:40,573 - auto_diffusers - INFO - ================================================================================
2025-05-30 12:04:40,573 - auto_diffusers - INFO - Sending request to Gemini API
2025-05-30 12:04:58,859 - auto_diffusers - INFO - Successfully received response from Gemini API (no tools used)
2025-05-30 12:04:58,859 - auto_diffusers - DEBUG - Response length: 2133 characters
2025-05-30 12:06:14,204 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 12:06:14,204 - __main__ - DEBUG - API key found, length: 39
2025-05-30 12:06:14,204 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 12:06:14,204 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 12:06:14,204 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 12:06:14,204 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 12:06:14,204 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 12:06:14,204 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 12:06:14,204 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 12:06:14,204 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 12:06:14,207 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 12:06:14,207 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 12:06:14,678 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 12:06:14,678 -
hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 12:06:14,678 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 12:06:14,678 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 12:06:14,678 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 12:06:14,678 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 12:06:14,678 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 12:06:14,679 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 12:06:14,679 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 12:06:14,679 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 12:06:14,679 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 12:06:14,681 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 12:06:14,694 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 12:06:14,701 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 12:06:14,783 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 12:06:14,816 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 12:06:14,817 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:06:14,817 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:06:14,817 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:06:14,817 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:06:14,817 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:06:14,817 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:06:14,818 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:06:14 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 12:06:14,818 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 12:06:14,818 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:06:14,818 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:06:14,818 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:06:14,818 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:06:14,818 - httpcore.connection - DEBUG - close.started
2025-05-30 12:06:14,818 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:06:14,818 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 12:06:14,819 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:06:14,819 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:06:14,819 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:06:14,819 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:06:14,819 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:06:14,819 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:06:14,825 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:06:14 GMT'), (b'server', b'uvicorn'), (b'content-length', b'101440'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 12:06:14,826 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 12:06:14,826 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:06:14,826 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:06:14,826 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:06:14,826 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:06:14,826 - httpcore.connection - DEBUG - close.started
2025-05-30 12:06:14,826 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:06:14,837 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 12:06:14,859 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:06:14,859 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 12:06:14,975 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:06:14,975 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 12:06:14,999 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 12:06:15,180 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 12:06:15,180 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:06:15,180 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:06:15,180 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:06:15,181 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:06:15,181 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:06:15,255 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 12:06:15,256 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:06:15,256 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:06:15,256 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:06:15,256 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:06:15,256 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:06:15,318 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:06:15 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 12:06:15,319 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 12:06:15,319 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:06:15,319 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:06:15,319 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:06:15,319 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:06:15,320 - httpcore.connection - DEBUG - close.started
2025-05-30 12:06:15,320 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:06:15,394 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:06:15 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 12:06:15,394 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 12:06:15,395 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:06:15,395 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:06:15,395 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:06:15,395 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:06:15,395 - httpcore.connection - DEBUG - close.started
2025-05-30 12:06:15,395 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:06:15,482 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:06:15,482 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:06:15,483 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 12:06:15,483 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 12:06:15,483 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:06:15,483 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:06:15,483 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 12:06:15,483 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:06:15,483 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:06:16,066 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 12:06:16,145 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:06:16,145 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:06:16,146 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 12:06:16,146 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:06:16,146 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:06:16,146 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 12:06:16,146 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:06:16,146 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:06:16,146 - auto_diffusers - INFO - Starting code generation for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:06:16,146 - auto_diffusers - DEBUG - Parameters: prompt='A cat holding a sign that says hello world...', size=(768, 1360), steps=4
2025-05-30 12:06:16,146 - auto_diffusers - DEBUG - Manual specs: True, Memory analysis provided: True
2025-05-30 12:06:16,146 - auto_diffusers - INFO - Using manual hardware specifications
2025-05-30 12:06:16,146 - auto_diffusers - DEBUG - Manual specs: {'platform': 'Linux', 'architecture': 'manual_input', 'cpu_count': 8, 'python_version': '3.11', 'cuda_available': False, 'mps_available': False, 'torch_version': '2.0+', 'manual_input': True, 'ram_gb': 16, 'user_dtype': None, 'gpu_info': [{'name': 'Custom GPU', 'memory_mb': 8192}]}
2025-05-30 12:06:16,146 - auto_diffusers - DEBUG - GPU detected with 8.0 GB VRAM
2025-05-30 12:06:16,146 - auto_diffusers - INFO - Selected optimization profile: balanced
2025-05-30 12:06:16,146 - auto_diffusers - DEBUG - Creating generation prompt for Gemini API
2025-05-30 12:06:16,146 - auto_diffusers - DEBUG - Prompt length: 7598 characters
2025-05-30 12:06:16,146 - auto_diffusers - INFO - ================================================================================
2025-05-30 12:06:16,146 - auto_diffusers - INFO - PROMPT SENT TO GEMINI API:
2025-05-30 12:06:16,147 - auto_diffusers - INFO -
================================================================================ 2025-05-30 12:06:16,147 - auto_diffusers - INFO - You are an expert in optimizing diffusers library code for different hardware configurations. NOTE: This system includes curated optimization knowledge from HuggingFace documentation. TASK: Generate optimized Python code for running a diffusion model with the following specifications: - Model: black-forest-labs/FLUX.1-schnell - Prompt: "A cat holding a sign that says hello world" - Image size: 768x1360 - Inference steps: 4 HARDWARE SPECIFICATIONS: - Platform: Linux (manual_input) - CPU Cores: 8 - CUDA Available: False - MPS Available: False - Optimization Profile: balanced - GPU: Custom GPU (8.0 GB VRAM) MEMORY ANALYSIS: - Model Memory Requirements: 36.0 GB (FP16 inference) - Model Weights Size: 24.0 GB (FP16) - Memory Recommendation: 🔄 Requires sequential CPU offloading - Recommended Precision: float16 - Attention Slicing Recommended: True - VAE Slicing Recommended: True OPTIMIZATION KNOWLEDGE BASE: # DIFFUSERS OPTIMIZATION TECHNIQUES ## Memory Optimization Techniques ### 1. Model CPU Offloading Use `enable_model_cpu_offload()` to move models between GPU and CPU automatically: ```python pipe.enable_model_cpu_offload() ``` - Saves significant VRAM by keeping only active models on GPU - Automatic management, no manual intervention needed - Compatible with all pipelines ### 2. Sequential CPU Offloading Use `enable_sequential_cpu_offload()` for more aggressive memory saving: ```python pipe.enable_sequential_cpu_offload() ``` - More memory efficient than model offloading - Moves models to CPU after each forward pass - Best for very limited VRAM scenarios ### 3. 
Attention Slicing Use `enable_attention_slicing()` to reduce memory during attention computation: ```python pipe.enable_attention_slicing() # or specify slice size pipe.enable_attention_slicing("max") # maximum slicing pipe.enable_attention_slicing(1) # slice_size = 1 ``` - Trades compute time for memory - Most effective for high-resolution images - Can be combined with other techniques ### 4. VAE Slicing Use `enable_vae_slicing()` for large batch processing: ```python pipe.enable_vae_slicing() ``` - Decodes images one at a time instead of all at once - Essential for batch sizes > 4 - Minimal performance impact on single images ### 5. VAE Tiling Use `enable_vae_tiling()` for high-resolution image generation: ```python pipe.enable_vae_tiling() ``` - Enables 4K+ image generation on 8GB VRAM - Splits images into overlapping tiles - Automatically disabled for 512x512 or smaller images ### 6. Memory Efficient Attention (xFormers) Use `enable_xformers_memory_efficient_attention()` if xFormers is installed: ```python pipe.enable_xformers_memory_efficient_attention() ``` - Significantly reduces memory usage and improves speed - Requires xformers library installation - Compatible with most models ## Performance Optimization Techniques ### 1. Half Precision (FP16/BF16) Use lower precision for better memory and speed: ```python # FP16 (widely supported) pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16) # BF16 (better numerical stability, newer hardware) pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16) ``` - FP16: Halves memory usage, widely supported - BF16: Better numerical stability, requires newer GPUs - Essential for most optimization scenarios ### 2. 
Torch Compile (PyTorch 2.0+) Use `torch.compile()` for significant speed improvements: ```python pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True) # For some models, compile VAE too: pipe.vae.decode = torch.compile(pipe.vae.decode, mode="reduce-overhead", fullgraph=True) ``` - 5-50% speed improvement - Requires PyTorch 2.0+ - First run is slower due to compilation ### 3. Fast Schedulers Use faster schedulers for fewer steps: ```python from diffusers import LMSDiscreteScheduler, UniPCMultistepScheduler # LMS Scheduler (good quality, fast) pipe.scheduler = LMSDiscreteScheduler.from_config(pipe.scheduler.config) # UniPC Scheduler (fastest) pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config) ``` ## Hardware-Specific Optimizations ### NVIDIA GPU Optimizations ```python # Enable Tensor Cores torch.backends.cudnn.benchmark = True # Optimal data type for NVIDIA torch_dtype = torch.float16 # or torch.bfloat16 for RTX 30/40 series ``` ### Apple Silicon (MPS) Optimizations ```python # Use MPS device device = "mps" if torch.backends.mps.is_available() else "cpu" pipe = pipe.to(device) # Recommended dtype for Apple Silicon torch_dtype = torch.bfloat16 # Better than float16 on Apple Silicon # Attention slicing often helps on MPS pipe.enable_attention_slicing() ``` ### CPU Optimizations ```python # Use float32 for CPU torch_dtype = torch.float32 # Enable optimized attention pipe.enable_attention_slicing() ``` ## Model-Specific Guidelines ### FLUX Models - Do NOT use guidance_scale parameter (not needed for FLUX) - Use 4-8 inference steps maximum - BF16 dtype recommended - Enable attention slicing for memory optimization ### Stable Diffusion XL - Enable attention slicing for high resolutions - Use refiner model sparingly to save memory - Consider VAE tiling for >1024px images ### Stable Diffusion 1.5/2.1 - Very memory efficient base models - Can often run without optimizations on 8GB+ VRAM - Enable VAE slicing for batch processing 
## Memory Usage Estimation

- FLUX.1: ~24GB for FP16, ~48GB for full precision
- SDXL: ~7GB for FP16, ~14GB for FP32
- SD 1.5: ~2GB for FP16, ~4GB for FP32

## Optimization Combinations by VRAM

### 24GB+ VRAM (High-end)

```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
pipe = pipe.to("cuda")
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
```

### 12-24GB VRAM (Mid-range)

```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
# Do not call pipe.to("cuda") here: enable_model_cpu_offload() manages device placement itself
pipe.enable_model_cpu_offload()
pipe.enable_xformers_memory_efficient_attention()
```

### 8-12GB VRAM (Entry-level)

```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing()
pipe.enable_vae_slicing()
pipe.enable_xformers_memory_efficient_attention()
```

### <8GB VRAM (Low-end)

```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing("max")
pipe.enable_vae_slicing()
pipe.enable_vae_tiling()
```

IMPORTANT: For FLUX.1-schnell models, do NOT include the `guidance_scale` parameter, as it is not needed.

Using the OPTIMIZATION KNOWLEDGE BASE above, generate Python code that:

1. **Selects the best optimization techniques** for the specific hardware profile
2. **Applies appropriate memory optimizations** based on available VRAM
3. **Uses optimal data types** for the target hardware:
   - User-specified dtype (if provided): use exactly as specified
   - Apple Silicon (MPS): prefer torch.bfloat16
   - NVIDIA GPUs: prefer torch.float16 or torch.bfloat16
   - CPU only: use torch.float32
4. **Implements hardware-specific optimizations** (CUDA, MPS, CPU)
5.
**Follows model-specific guidelines** (e.g., FLUX `guidance_scale` handling)

IMPORTANT GUIDELINES:
- Reference the OPTIMIZATION KNOWLEDGE BASE to select appropriate techniques
- Include all necessary imports
- Add brief comments explaining each optimization choice
- Generate compact, production-ready code
- Inline values where possible for concise code
- Generate ONLY the Python code, with no explanations before or after the code block

2025-05-30 12:06:16,147 - auto_diffusers - INFO - ================================================================================ 2025-05-30 12:06:16,147 - auto_diffusers - INFO - Sending request to Gemini API 2025-05-30 12:06:16,279 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 12:06:50,443 - auto_diffusers - INFO - Successfully received response from Gemini API (no tools used) 2025-05-30 12:06:50,444 - auto_diffusers - DEBUG - Response length: 2440 characters 2025-05-30 12:07:59,163 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 12:07:59,163 - __main__ - DEBUG - API key found, length: 39 2025-05-30 12:07:59,163 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 12:07:59,163 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 12:07:59,163 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 12:07:59,163 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 12:07:59,163 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 12:07:59,163 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 12:07:59,163 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 12:07:59,163 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 12:07:59,167 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 12:07:59,167 - hardware_detector - DEBUG -
Checking PyTorch availability 2025-05-30 12:07:59,633 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 12:07:59,634 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 12:07:59,634 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 12:07:59,634 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 12:07:59,634 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 12:07:59,634 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 12:07:59,634 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 12:07:59,634 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 12:07:59,634 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 12:07:59,634 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 12:07:59,634 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 12:07:59,636 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:07:59,648 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 12:07:59,656 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:07:59,741 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:07:59,788 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 12:07:59,788 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:07:59,788 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:07:59,789 - 
httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:07:59,789 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:07:59,789 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:07:59,789 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:07:59,789 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:07:59 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 12:07:59,790 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 12:07:59,790 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:07:59,790 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:07:59,790 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:07:59,790 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:07:59,790 - httpcore.connection - DEBUG - close.started 2025-05-30 12:07:59,790 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:07:59,790 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 12:07:59,791 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:07:59,791 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:07:59,791 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:07:59,791 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:07:59,791 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:07:59,791 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:07:59,798 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 
2025 03:07:59 GMT'), (b'server', b'uvicorn'), (b'content-length', b'101454'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 12:07:59,798 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 12:07:59,798 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:07:59,798 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:07:59,798 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:07:59,799 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:07:59,799 - httpcore.connection - DEBUG - close.started 2025-05-30 12:07:59,799 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:07:59,810 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 12:07:59,813 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:07:59,814 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 12:07:59,949 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:07:59,949 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 12:07:59,967 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 12:08:00,140 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:08:00,141 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:08:00,141 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:08:00,141 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:08:00,141 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:08:00,141 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:08:00,258 - 
httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:08:00,258 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:08:00,258 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:08:00,258 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:08:00,258 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:08:00,258 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:08:00,279 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:08:00 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 12:08:00,279 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 12:08:00,280 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:08:00,280 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:08:00,280 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:08:00,280 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:08:00,280 - httpcore.connection - DEBUG - close.started 2025-05-30 12:08:00,280 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:08:00,400 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:08:00 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 12:08:00,401 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 12:08:00,401 - 
httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:08:00,401 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:08:00,401 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:08:00,401 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:08:00,401 - httpcore.connection - DEBUG - close.started 2025-05-30 12:08:00,401 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:08:00,986 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:08:01,202 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 12:08:08,120 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:08:08,121 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:08:08,121 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 12:08:08,121 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 12:08:08,121 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:08:08,121 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:08:08,122 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 12:08:08,122 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:08:08,122 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:10:20,211 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 12:10:20,211 - __main__ - DEBUG - API key found, length: 
39 2025-05-30 12:10:20,211 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 12:10:20,211 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 12:10:20,211 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 12:10:20,211 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 12:10:20,211 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 12:10:20,211 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 12:10:20,211 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 12:10:20,211 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 12:10:20,216 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 12:10:20,216 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 12:10:20,680 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 12:10:20,680 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 12:10:20,680 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 12:10:20,680 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 12:10:20,680 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 12:10:20,680 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 12:10:20,680 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 12:10:20,680 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 12:10:20,680 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 12:10:20,680 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 12:10:20,680 
- __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 12:10:20,682 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:10:20,695 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 12:10:20,703 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:10:20,786 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:10:20,820 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 12:10:20,820 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:10:20,820 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:10:20,821 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:10:20,821 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:10:20,821 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:10:20,821 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:10:20,821 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:10:20 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 12:10:20,821 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 12:10:20,821 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:10:20,821 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:10:20,821 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:10:20,821 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:10:20,821 - httpcore.connection - DEBUG - close.started 2025-05-30 12:10:20,822 - 
httpcore.connection - DEBUG - close.complete 2025-05-30 12:10:20,822 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 12:10:20,822 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:10:20,822 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:10:20,822 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:10:20,822 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:10:20,822 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:10:20,822 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:10:20,829 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:10:20 GMT'), (b'server', b'uvicorn'), (b'content-length', b'101445'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 12:10:20,829 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 12:10:20,829 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:10:20,829 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:10:20,829 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:10:20,829 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:10:20,829 - httpcore.connection - DEBUG - close.started 2025-05-30 12:10:20,829 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:10:20,841 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 12:10:20,960 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:10:20,960 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 12:10:20,982 - urllib3.connectionpool - 
DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 12:10:20,985 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:10:20,985 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 12:10:21,238 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:10:21,239 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:10:21,239 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:10:21,239 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:10:21,239 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:10:21,239 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:10:21,273 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:10:21,273 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:10:21,273 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:10:21,273 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:10:21,273 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:10:21,273 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:10:21,379 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:10:21 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 12:10:21,379 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 12:10:21,379 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:10:21,379 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 
12:10:21,379 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:10:21,379 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:10:21,380 - httpcore.connection - DEBUG - close.started 2025-05-30 12:10:21,380 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:10:21,419 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:10:21 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 12:10:21,419 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 12:10:21,420 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:10:21,420 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:10:21,420 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:10:21,420 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:10:21,420 - httpcore.connection - DEBUG - close.started 2025-05-30 12:10:21,421 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:10:22,043 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:10:22,264 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 12:10:22,493 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:10:22,493 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:10:22,493 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 12:10:22,493 - simple_memory_calculator - 
INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 12:10:22,493 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:10:22,493 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:10:22,493 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 12:10:22,494 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:10:22,494 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:11:53,624 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 12:11:53,624 - __main__ - DEBUG - API key found, length: 39 2025-05-30 12:11:53,624 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 12:11:53,624 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 12:11:53,624 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 12:11:53,624 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 12:11:53,624 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 12:11:53,624 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 12:11:53,624 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 12:11:53,624 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 12:11:53,628 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 12:11:53,628 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 12:11:54,097 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 12:11:54,097 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 12:11:54,097 - hardware_detector - INFO - Hardware 
detection completed successfully 2025-05-30 12:11:54,097 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 12:11:54,097 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 12:11:54,097 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 12:11:54,097 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 12:11:54,097 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 12:11:54,097 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 12:11:54,097 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 12:11:54,097 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 12:11:54,099 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:11:54,118 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 12:11:54,119 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:11:54,199 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:11:54,235 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 12:11:54,235 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:11:54,236 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:11:54,236 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:11:54,236 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:11:54,236 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:11:54,236 - httpcore.http11 - 
DEBUG - receive_response_headers.started request= 2025-05-30 12:11:54,236 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:11:54 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 12:11:54,237 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 12:11:54,237 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:11:54,237 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:11:54,237 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:11:54,237 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:11:54,237 - httpcore.connection - DEBUG - close.started 2025-05-30 12:11:54,237 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:11:54,237 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 12:11:54,238 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:11:54,238 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:11:54,238 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:11:54,238 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:11:54,238 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:11:54,238 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:11:54,245 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:11:54 GMT'), (b'server', b'uvicorn'), (b'content-length', b'101552'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 12:11:54,245 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 12:11:54,245 - 
httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:11:54,245 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:11:54,245 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:11:54,245 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:11:54,245 - httpcore.connection - DEBUG - close.started
2025-05-30 12:11:54,245 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:11:54,258 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 12:11:54,283 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:11:54,286 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 12:11:54,397 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:11:54,397 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 12:11:54,403 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 12:11:54,569 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 12:11:54,570 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:11:54,570 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:11:54,570 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:11:54,570 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:11:54,570 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:11:54,676 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 12:11:54,676 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:11:54,676 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:11:54,676 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:11:54,676 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:11:54,676 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:11:54,714 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:11:54 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 12:11:54,714 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 12:11:54,714 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:11:54,714 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:11:54,714 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:11:54,714 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:11:54,714 - httpcore.connection - DEBUG - close.started
2025-05-30 12:11:54,715 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:11:54,816 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:11:54 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 12:11:54,816 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 12:11:54,816 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:11:54,817 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:11:54,817 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:11:54,817 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:11:54,817 - httpcore.connection - DEBUG - close.started
2025-05-30 12:11:54,818 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:11:55,439 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 12:11:55,660 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 12:11:57,078 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:11:57,078 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:11:57,079 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 12:11:57,079 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 12:11:57,079 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:11:57,079 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:11:57,079 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 12:11:57,079 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:11:57,079 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:13:22,848 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 12:13:22,848 - __main__ - DEBUG - API key found, length: 39
2025-05-30 12:13:22,848 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 12:13:22,848 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 12:13:22,848 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 12:13:22,848 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 12:13:22,848 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 12:13:22,848 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 12:13:22,848 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 12:13:22,848 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 12:13:22,852 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 12:13:22,852 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 12:13:23,326 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 12:13:23,326 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 12:13:23,326 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 12:13:23,326 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 12:13:23,326 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 12:13:23,326 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 12:13:23,326 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 12:13:23,326 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 12:13:23,326 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 12:13:23,326 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 12:13:23,326 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 12:13:23,328 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 12:13:23,342 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 12:13:23,349 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 12:13:23,434 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 12:13:23,467 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 12:13:23,468 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:13:23,468 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:13:23,468 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:13:23,468 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:13:23,468 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:13:23,468 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:13:23,468 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:13:23 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 12:13:23,469 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 12:13:23,469 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:13:23,469 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:13:23,469 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:13:23,469 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:13:23,469 - httpcore.connection - DEBUG - close.started
2025-05-30 12:13:23,469 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:13:23,469 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 12:13:23,470 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:13:23,470 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:13:23,470 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:13:23,470 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:13:23,470 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:13:23,470 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:13:23,477 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:13:23 GMT'), (b'server', b'uvicorn'), (b'content-length', b'101618'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 12:13:23,477 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 12:13:23,477 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:13:23,477 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:13:23,477 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:13:23,477 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:13:23,477 - httpcore.connection - DEBUG - close.started
2025-05-30 12:13:23,477 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:13:23,489 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 12:13:23,523 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:13:23,523 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 12:13:23,621 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 12:13:23,638 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:13:23,638 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 12:13:23,834 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 12:13:23,835 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:13:23,835 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:13:23,835 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:13:23,835 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:13:23,835 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:13:23,940 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 12:13:23,941 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:13:23,941 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:13:23,941 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:13:23,941 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:13:23,941 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:13:23,992 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:13:23 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 12:13:23,992 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 12:13:23,992 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:13:23,992 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:13:23,992 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:13:23,992 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:13:23,993 - httpcore.connection - DEBUG - close.started
2025-05-30 12:13:23,993 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:13:24,092 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:13:24 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 12:13:24,093 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 12:13:24,093 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:13:24,093 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:13:24,093 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:13:24,093 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:13:24,093 - httpcore.connection - DEBUG - close.started
2025-05-30 12:13:24,093 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:13:24,177 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:13:24,177 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:13:24,178 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 12:13:24,178 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 12:13:24,178 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:13:24,178 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:13:24,178 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 12:13:24,178 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:13:24,178 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:13:24,707 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 12:13:24,928 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 12:14:05,021 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:14:05,021 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:14:05,021 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 12:14:05,022 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:14:05,022 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:14:05,022 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 12:14:05,022 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:14:05,022 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:14:05,022 - auto_diffusers - INFO - Starting code generation for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:14:05,022 - auto_diffusers - DEBUG - Parameters: prompt='A cat holding a sign that says hello world...', size=(768, 1360), steps=4
2025-05-30 12:14:05,022 - auto_diffusers - DEBUG - Manual specs: True, Memory analysis provided: True
2025-05-30 12:14:05,022 - auto_diffusers - INFO - Using manual hardware specifications
2025-05-30 12:14:05,022 - auto_diffusers - DEBUG - Manual specs: {'platform': 'Linux', 'architecture': 'manual_input', 'cpu_count': 8, 'python_version': '3.11', 'cuda_available': False, 'mps_available': False, 'torch_version': '2.0+', 'manual_input': True, 'ram_gb': 16, 'user_dtype': None, 'gpu_info': [{'name': 'Custom GPU', 'memory_mb': 8192}]}
2025-05-30 12:14:05,022 - auto_diffusers - DEBUG - GPU detected with 8.0 GB VRAM
2025-05-30 12:14:05,022 - auto_diffusers - INFO - Selected optimization profile: balanced
2025-05-30 12:14:05,022 - auto_diffusers - DEBUG - Creating generation prompt for Gemini API
2025-05-30 12:14:05,022 - auto_diffusers - DEBUG - Prompt length: 7598 characters
2025-05-30 12:14:05,022 - auto_diffusers - INFO - ================================================================================
2025-05-30 12:14:05,022 - auto_diffusers - INFO - PROMPT SENT TO GEMINI API:
2025-05-30 12:14:05,022 - auto_diffusers - INFO - ================================================================================
2025-05-30 12:14:05,022 - auto_diffusers - INFO - You are an expert in optimizing diffusers library code for different hardware configurations. NOTE: This system includes curated optimization knowledge from HuggingFace documentation.

TASK: Generate optimized Python code for running a diffusion model with the following specifications:
- Model: black-forest-labs/FLUX.1-schnell
- Prompt: "A cat holding a sign that says hello world"
- Image size: 768x1360
- Inference steps: 4

HARDWARE SPECIFICATIONS:
- Platform: Linux (manual_input)
- CPU Cores: 8
- CUDA Available: False
- MPS Available: False
- Optimization Profile: balanced
- GPU: Custom GPU (8.0 GB VRAM)

MEMORY ANALYSIS:
- Model Memory Requirements: 36.0 GB (FP16 inference)
- Model Weights Size: 24.0 GB (FP16)
- Memory Recommendation: 🔄 Requires sequential CPU offloading
- Recommended Precision: float16
- Attention Slicing Recommended: True
- VAE Slicing Recommended: True

OPTIMIZATION KNOWLEDGE BASE:

# DIFFUSERS OPTIMIZATION TECHNIQUES

## Memory Optimization Techniques

### 1. Model CPU Offloading
Use `enable_model_cpu_offload()` to move models between GPU and CPU automatically:
```python
pipe.enable_model_cpu_offload()
```
- Saves significant VRAM by keeping only active models on GPU
- Automatic management, no manual intervention needed
- Compatible with all pipelines

### 2. Sequential CPU Offloading
Use `enable_sequential_cpu_offload()` for more aggressive memory saving:
```python
pipe.enable_sequential_cpu_offload()
```
- More memory efficient than model offloading
- Moves models to CPU after each forward pass
- Best for very limited VRAM scenarios

### 3. Attention Slicing
Use `enable_attention_slicing()` to reduce memory during attention computation:
```python
pipe.enable_attention_slicing()  # or specify slice size
pipe.enable_attention_slicing("max")  # maximum slicing
pipe.enable_attention_slicing(1)  # slice_size = 1
```
- Trades compute time for memory
- Most effective for high-resolution images
- Can be combined with other techniques

### 4. VAE Slicing
Use `enable_vae_slicing()` for large batch processing:
```python
pipe.enable_vae_slicing()
```
- Decodes images one at a time instead of all at once
- Essential for batch sizes > 4
- Minimal performance impact on single images

### 5. VAE Tiling
Use `enable_vae_tiling()` for high-resolution image generation:
```python
pipe.enable_vae_tiling()
```
- Enables 4K+ image generation on 8GB VRAM
- Splits images into overlapping tiles
- Automatically disabled for 512x512 or smaller images

### 6. Memory Efficient Attention (xFormers)
Use `enable_xformers_memory_efficient_attention()` if xFormers is installed:
```python
pipe.enable_xformers_memory_efficient_attention()
```
- Significantly reduces memory usage and improves speed
- Requires xformers library installation
- Compatible with most models

## Performance Optimization Techniques

### 1. Half Precision (FP16/BF16)
Use lower precision for better memory and speed:
```python
# FP16 (widely supported)
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
# BF16 (better numerical stability, newer hardware)
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
```
- FP16: Halves memory usage, widely supported
- BF16: Better numerical stability, requires newer GPUs
- Essential for most optimization scenarios

### 2. Torch Compile (PyTorch 2.0+)
Use `torch.compile()` for significant speed improvements:
```python
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
# For some models, compile VAE too:
pipe.vae.decode = torch.compile(pipe.vae.decode, mode="reduce-overhead", fullgraph=True)
```
- 5-50% speed improvement
- Requires PyTorch 2.0+
- First run is slower due to compilation

### 3. Fast Schedulers
Use faster schedulers for fewer steps:
```python
from diffusers import LMSDiscreteScheduler, UniPCMultistepScheduler
# LMS Scheduler (good quality, fast)
pipe.scheduler = LMSDiscreteScheduler.from_config(pipe.scheduler.config)
# UniPC Scheduler (fastest)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
```

## Hardware-Specific Optimizations

### NVIDIA GPU Optimizations
```python
# Enable Tensor Cores
torch.backends.cudnn.benchmark = True
# Optimal data type for NVIDIA
torch_dtype = torch.float16  # or torch.bfloat16 for RTX 30/40 series
```

### Apple Silicon (MPS) Optimizations
```python
# Use MPS device
device = "mps" if torch.backends.mps.is_available() else "cpu"
pipe = pipe.to(device)
# Recommended dtype for Apple Silicon
torch_dtype = torch.bfloat16  # Better than float16 on Apple Silicon
# Attention slicing often helps on MPS
pipe.enable_attention_slicing()
```

### CPU Optimizations
```python
# Use float32 for CPU
torch_dtype = torch.float32
# Enable optimized attention
pipe.enable_attention_slicing()
```

## Model-Specific Guidelines

### FLUX Models
- Do NOT use guidance_scale parameter (not needed for FLUX)
- Use 4-8 inference steps maximum
- BF16 dtype recommended
- Enable attention slicing for memory optimization

### Stable Diffusion XL
- Enable attention slicing for high resolutions
- Use refiner model sparingly to save memory
- Consider VAE tiling for >1024px images

### Stable Diffusion 1.5/2.1
- Very memory efficient base models
- Can often run without optimizations on 8GB+ VRAM
- Enable VAE slicing for batch processing

## Memory Usage Estimation
- FLUX.1: ~24GB for full precision, ~12GB for FP16
- SDXL: ~7GB for FP16, ~14GB for FP32
- SD 1.5: ~2GB for FP16, ~4GB for FP32

## Optimization Combinations by VRAM

### 24GB+ VRAM (High-end)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
pipe = pipe.to("cuda")
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
```

### 12-24GB VRAM (Mid-range)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to("cuda")
pipe.enable_model_cpu_offload()
pipe.enable_xformers_memory_efficient_attention()
```

### 8-12GB VRAM (Entry-level)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing()
pipe.enable_vae_slicing()
pipe.enable_xformers_memory_efficient_attention()
```

### <8GB VRAM (Low-end)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing("max")
pipe.enable_vae_slicing()
pipe.enable_vae_tiling()
```

IMPORTANT: For FLUX.1-schnell models, do NOT include guidance_scale parameter as it's not needed.

Using the OPTIMIZATION KNOWLEDGE BASE above, generate Python code that:

1. **Selects the best optimization techniques** for the specific hardware profile
2. **Applies appropriate memory optimizations** based on available VRAM
3. **Uses optimal data types** for the target hardware:
   - User specified dtype (if provided): Use exactly as specified
   - Apple Silicon (MPS): prefer torch.bfloat16
   - NVIDIA GPUs: prefer torch.float16 or torch.bfloat16
   - CPU only: use torch.float32
4. **Implements hardware-specific optimizations** (CUDA, MPS, CPU)
5. **Follows model-specific guidelines** (e.g., FLUX guidance_scale handling)

IMPORTANT GUIDELINES:
- Reference the OPTIMIZATION KNOWLEDGE BASE to select appropriate techniques
- Include all necessary imports
- Add brief comments explaining optimization choices
- Generate compact, production-ready code
- Inline values where possible for concise code
- Generate ONLY the Python code, no explanations before or after the code block
2025-05-30 12:14:05,023 - auto_diffusers - INFO - ================================================================================
2025-05-30 12:14:05,023 - auto_diffusers - INFO - Sending request to Gemini API
2025-05-30 12:14:10,979 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 12:14:10,979 - __main__ - DEBUG - API key found, length: 39
2025-05-30 12:14:10,979 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 12:14:10,979 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 12:14:10,979 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 12:14:10,979 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 12:14:10,979 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 12:14:10,979 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 12:14:10,979 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 12:14:10,979 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 12:14:10,983 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 12:14:10,983 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 12:14:11,437 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 12:14:11,437 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 12:14:11,437 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 12:14:11,437 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 12:14:11,437 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 12:14:11,437 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 12:14:11,437 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 12:14:11,437 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 12:14:11,437 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 12:14:11,437 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 12:14:11,437 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 12:14:11,439 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 12:14:11,452 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 12:14:11,459 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 12:14:11,537 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 12:14:11,571 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 12:14:11,571 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:14:11,571 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:14:11,572 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:14:11,572 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:14:11,572 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:14:11,572 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:14:11,572 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:14:11 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 12:14:11,572 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 12:14:11,572 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:14:11,572 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:14:11,572 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:14:11,572 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:14:11,572 - httpcore.connection - DEBUG - close.started
2025-05-30 12:14:11,572 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:14:11,573 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 12:14:11,573 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:14:11,573 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:14:11,574 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:14:11,574 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:14:11,574 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:14:11,574 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:14:11,580 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:14:11 GMT'), (b'server', b'uvicorn'), (b'content-length', b'101701'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 12:14:11,580 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 12:14:11,580 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:14:11,580 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:14:11,580 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:14:11,580 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:14:11,580 - httpcore.connection - DEBUG - close.started
2025-05-30 12:14:11,580 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:14:11,592 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 12:14:11,602 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:14:11,602 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 12:14:11,721 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 12:14:11,734 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:14:11,734 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 12:14:11,887 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 12:14:11,887 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:14:11,887 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:14:11,887 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:14:11,888 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:14:11,888 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:14:12,022 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 12:14:12,022 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:14:12,022 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:14:12,023 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:14:12,023 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:14:12,023 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:14:12,047 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:14:11 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 12:14:12,048 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 12:14:12,048 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:14:12,048 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:14:12,048 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:14:12,048 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:14:12,048 - httpcore.connection - DEBUG - close.started
2025-05-30 12:14:12,049 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:14:12,168 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:14:12 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 12:14:12,168 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 12:14:12,168 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:14:12,168 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:14:12,168 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:14:12,168 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:14:12,168 - httpcore.connection - DEBUG - close.started
2025-05-30 12:14:12,169 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:14:12,334 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:14:12,334 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:14:12,335 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 12:14:12,335 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 12:14:12,335 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:14:12,335 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:14:12,335 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 12:14:12,335 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:14:12,335 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:14:12,733 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 12:14:12,951 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 12:15:12,436 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 12:15:12,436 - __main__ - DEBUG - API key found, length: 39
2025-05-30 12:15:12,436 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 12:15:12,436 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 12:15:12,437 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 12:15:12,437 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 12:15:12,437 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 12:15:12,437 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 12:15:12,437 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 12:15:12,437 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 12:15:12,440 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 12:15:12,441 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 12:15:12,913 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 12:15:12,913 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 12:15:12,913 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 12:15:12,913 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 12:15:12,913 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 12:15:12,913 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 12:15:12,913 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 12:15:12,913 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 12:15:12,913 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 12:15:12,913 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 12:15:12,913 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 12:15:12,915 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 12:15:12,929 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 12:15:12,935
- urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:15:13,014 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:15:13,048 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 12:15:13,048 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:15:13,048 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:15:13,049 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:15:13,049 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:15:13,049 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:15:13,049 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:15:13,049 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:15:13 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 12:15:13,049 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 12:15:13,049 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:15:13,049 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:15:13,050 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:15:13,050 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:15:13,050 - httpcore.connection - DEBUG - close.started 2025-05-30 12:15:13,050 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:15:13,050 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 12:15:13,050 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:15:13,051 - httpcore.http11 - DEBUG - 
send_request_headers.started request= 2025-05-30 12:15:13,051 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:15:13,051 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:15:13,051 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:15:13,051 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:15:13,057 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:15:13 GMT'), (b'server', b'uvicorn'), (b'content-length', b'101553'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 12:15:13,057 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 12:15:13,057 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:15:13,057 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:15:13,057 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:15:13,057 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:15:13,057 - httpcore.connection - DEBUG - close.started 2025-05-30 12:15:13,057 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:15:13,069 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 12:15:13,100 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:15:13,100 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 12:15:13,206 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:15:13,206 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 12:15:13,294 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 12:15:13,391 - 
httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:15:13,392 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:15:13,392 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:15:13,392 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:15:13,392 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:15:13,392 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:15:13,518 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:15:13,518 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:15:13,519 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:15:13,519 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:15:13,519 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:15:13,519 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:15:13,537 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:15:13 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 12:15:13,538 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 12:15:13,538 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:15:13,538 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:15:13,538 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:15:13,538 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:15:13,538 - httpcore.connection - DEBUG - close.started 2025-05-30 12:15:13,539 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:15:13,657 - httpcore.http11 - DEBUG - 
receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:15:13 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 12:15:13,657 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 12:15:13,657 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:15:13,658 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:15:13,658 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:15:13,658 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:15:13,658 - httpcore.connection - DEBUG - close.started 2025-05-30 12:15:13,659 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:15:13,910 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:15:13,910 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:15:13,910 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 12:15:13,910 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 12:15:13,910 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:15:13,910 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:15:13,911 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 12:15:13,911 - simple_memory_calculator - INFO - Getting memory requirements for model: 
black-forest-labs/FLUX.1-schnell 2025-05-30 12:15:13,911 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:15:14,276 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:15:14,494 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 12:20:05,265 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 12:20:05,265 - __main__ - DEBUG - API key found, length: 39 2025-05-30 12:20:05,265 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 12:20:05,265 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 12:20:05,265 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 12:20:05,266 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 12:20:05,266 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 12:20:05,266 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 12:20:05,266 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 12:20:05,266 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 12:20:05,269 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 12:20:05,269 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 12:20:05,785 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 12:20:05,785 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 12:20:05,785 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 12:20:05,785 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 
12:20:05,785 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 12:20:05,785 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 12:20:05,785 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 12:20:05,785 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 12:20:05,785 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 12:20:05,785 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 12:20:05,785 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 12:20:05,787 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:20:05,800 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 12:20:05,808 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:20:05,895 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:20:05,927 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 12:20:05,927 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:20:05,928 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:20:05,928 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:20:05,928 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:20:05,928 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:20:05,928 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:20:05,929 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:20:05 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', 
b'application/json')]) 2025-05-30 12:20:05,929 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 12:20:05,929 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:20:05,929 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:20:05,929 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:20:05,929 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:20:05,929 - httpcore.connection - DEBUG - close.started 2025-05-30 12:20:05,929 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:20:05,930 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 12:20:05,930 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:20:05,930 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:20:05,930 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:20:05,930 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:20:05,930 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:20:05,930 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:20:05,936 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:20:05 GMT'), (b'server', b'uvicorn'), (b'content-length', b'101540'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 12:20:05,937 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 12:20:05,937 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:20:05,937 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:20:05,937 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:20:05,937 - httpcore.http11 - DEBUG - response_closed.complete 
2025-05-30 12:20:05,937 - httpcore.connection - DEBUG - close.started 2025-05-30 12:20:05,937 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:20:05,948 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 12:20:05,975 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:20:05,976 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 12:20:06,094 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:20:06,094 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 12:20:06,136 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 12:20:06,264 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:20:06,265 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:20:06,265 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:20:06,265 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:20:06,265 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:20:06,265 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:20:06,387 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:20:06,387 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:20:06,387 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:20:06,387 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:20:06,387 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:20:06,388 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:20:06,412 - httpcore.http11 - DEBUG - 
receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:20:06 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 12:20:06,413 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 12:20:06,413 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:20:06,413 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:20:06,413 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:20:06,413 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:20:06,413 - httpcore.connection - DEBUG - close.started 2025-05-30 12:20:06,413 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:20:06,535 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:20:06 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 12:20:06,535 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 12:20:06,535 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:20:06,535 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:20:06,536 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:20:06,536 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:20:06,536 - httpcore.connection - DEBUG - close.started 2025-05-30 12:20:06,536 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:20:07,136 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 
2025-05-30 12:20:07,376 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 12:20:09,530 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:20:09,530 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:20:09,530 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 12:20:09,530 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 12:20:09,530 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:20:09,530 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:20:09,530 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 12:20:09,530 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:20:09,530 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:20:52,779 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 12:20:52,779 - __main__ - DEBUG - API key found, length: 39 2025-05-30 12:20:52,779 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 12:20:52,779 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 12:20:52,779 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 12:20:52,779 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 12:20:52,779 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 12:20:52,779 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 
12:20:52,779 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 12:20:52,779 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 12:20:52,783 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 12:20:52,783 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 12:20:53,249 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 12:20:53,250 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 12:20:53,250 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 12:20:53,250 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 12:20:53,250 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 12:20:53,250 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 12:20:53,250 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 12:20:53,250 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 12:20:53,250 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 12:20:53,250 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 12:20:53,250 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 12:20:53,252 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:20:53,264 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 12:20:53,272 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:20:53,359 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:20:53,390 - 
httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 12:20:53,391 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:20:53,391 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:20:53,391 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:20:53,391 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:20:53,391 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:20:53,391 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:20:53,392 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:20:53 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 12:20:53,392 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 12:20:53,392 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:20:53,392 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:20:53,392 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:20:53,392 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:20:53,392 - httpcore.connection - DEBUG - close.started 2025-05-30 12:20:53,392 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:20:53,392 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 12:20:53,393 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:20:53,393 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:20:53,393 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:20:53,393 - httpcore.http11 - DEBUG - 
send_request_body.started request= 2025-05-30 12:20:53,393 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:20:53,393 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:20:53,399 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:20:53 GMT'), (b'server', b'uvicorn'), (b'content-length', b'101307'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 12:20:53,400 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 12:20:53,400 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:20:53,400 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:20:53,400 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:20:53,400 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:20:53,400 - httpcore.connection - DEBUG - close.started 2025-05-30 12:20:53,400 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:20:53,411 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 12:20:53,417 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:20:53,417 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 12:20:53,535 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 12:20:53,546 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:20:53,546 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 12:20:53,697 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:20:53,698 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:20:53,698 - 
httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:20:53,698 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:20:53,698 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:20:53,699 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:20:53,821 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:20:53,822 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:20:53,823 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:20:53,823 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:20:53,823 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:20:53,823 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:20:53,841 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:20:53 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 12:20:53,841 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 12:20:53,841 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:20:53,842 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:20:53,842 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:20:53,842 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:20:53,842 - httpcore.connection - DEBUG - close.started 2025-05-30 12:20:53,842 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:20:53,960 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:20:53 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), 
(b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 12:20:53,961 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 12:20:53,961 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:20:53,961 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:20:53,961 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:20:53,962 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:20:53,962 - httpcore.connection - DEBUG - close.started 2025-05-30 12:20:53,962 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:20:54,524 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:20:54,744 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 12:20:56,441 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:20:56,441 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:20:56,441 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 12:20:56,441 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 12:20:56,441 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:20:56,441 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:20:56,441 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 12:20:56,441 - 
2025-05-30 12:20:56,441 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:20:56,442 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:21:35,484 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 12:21:35,484 - __main__ - DEBUG - API key found, length: 39
2025-05-30 12:21:35,484 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 12:21:35,484 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 12:21:35,484 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 12:21:35,484 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 12:21:35,484 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 12:21:35,484 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 12:21:35,484 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 12:21:35,484 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 12:21:35,488 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 12:21:35,488 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 12:21:35,941 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 12:21:35,941 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 12:21:35,941 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 12:21:35,941 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 12:21:35,941 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 12:21:35,941 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 12:21:35,941 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 12:21:35,941 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 12:21:35,941 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 12:21:35,941 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 12:21:35,941 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 12:21:35,943 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 12:21:35,957 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 12:21:35,964 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 12:21:36,045 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 12:21:36,077 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 12:21:36,078 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:21:36,078 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:21:36,078 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:21:36,078 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:21:36,078 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:21:36,079 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:21:36,079 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:21:36 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 12:21:36,079 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
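The hardware_detector records trace a three-step probe: read platform info, try `nvidia-smi`, then fall back to PyTorch's CUDA/MPS checks. A hedged stand-library sketch of that flow is below; the function name and dict layout mirror the logged "Detected specs" but are illustrative, not the actual detector source.

```python
import platform
import shutil
import subprocess

def detect_hardware() -> dict:
    """Probe the host roughly the way the logged detector does."""
    specs = {
        "platform": platform.system(),
        "architecture": platform.machine(),
        "python_version": platform.python_version(),
        "gpu_info": None,
        "cuda_available": False,
        "mps_available": False,
        "torch_version": None,
    }
    # nvidia-smi probe: its absence (as on this Apple Silicon host)
    # simply means no NVIDIA GPU, not an error.
    if shutil.which("nvidia-smi"):
        out = subprocess.run(["nvidia-smi", "-L"],
                             capture_output=True, text=True)
        specs["gpu_info"] = out.stdout.strip() or None
    try:
        import torch
        specs["torch_version"] = torch.__version__
        specs["cuda_available"] = torch.cuda.is_available()
        mps = getattr(torch.backends, "mps", None)
        specs["mps_available"] = bool(mps and mps.is_available())
    except ImportError:
        pass  # PyTorch not installed; report CPU-only specs
    return specs
```

On the machine in this log the sketch would report Darwin/arm64 with `cuda_available=False` and `mps_available=True`, matching the detector's output.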
2025-05-30 12:21:36,079 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:21:36,079 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:21:36,079 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:21:36,079 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:21:36,079 - httpcore.connection - DEBUG - close.started
2025-05-30 12:21:36,079 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:21:36,080 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 12:21:36,080 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:21:36,080 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:21:36,080 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:21:36,080 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:21:36,080 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:21:36,080 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:21:36,087 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:21:36 GMT'), (b'server', b'uvicorn'), (b'content-length', b'100397'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 12:21:36,087 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 12:21:36,087 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:21:36,087 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:21:36,087 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:21:36,087 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:21:36,087 - httpcore.connection - DEBUG - close.started
2025-05-30 12:21:36,087 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:21:36,099 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 12:21:36,255 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 12:21:36,514 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:21:36,514 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:21:36,514 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 12:21:36,514 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 12:21:36,800 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 12:21:36,800 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:21:36,800 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:21:36,800 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:21:36,800 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:21:36,800 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:21:36,808 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 12:21:36,808 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:21:36,808 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:21:36,808 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:21:36,809 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:21:36,809 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:21:36,946 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:21:36 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 12:21:36,947 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 12:21:36,947 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:21:36,948 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:21:36,948 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:21:36,948 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:21:36,948 - httpcore.connection - DEBUG - close.started
2025-05-30 12:21:36,949 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:21:36,957 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:21:36 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 12:21:36,958 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 12:21:36,959 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:21:36,959 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:21:36,959 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:21:36,959 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:21:36,959 - httpcore.connection - DEBUG - close.started
2025-05-30 12:21:36,959 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:21:37,538 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 12:21:37,654 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:21:37,654 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:21:37,654 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 12:21:37,654 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 12:21:37,655 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:21:37,655 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:21:37,655 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 12:21:37,656 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:21:37,656 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:21:37,753 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 12:22:32,882 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 12:22:32,883 - __main__ - DEBUG - API key found, length: 39
2025-05-30 12:22:32,883 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 12:22:32,883 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 12:22:32,883 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 12:22:32,883 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 12:22:32,883 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 12:22:32,883 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 12:22:32,883 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 12:22:32,883 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
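"Generating memory recommendations ... with 8.0GB VRAM" against a 24 GB model implies a comparison of available VRAM to the weight and inference footprints. The branch thresholds and wording below are assumptions; only `enable_sequential_cpu_offload` is a real diffusers pipeline method, and even its use here is a plausible reconstruction, not the logged calculator's actual output.

```python
def recommend(vram_gb: float, model_gb: float, inference_gb: float) -> list:
    """Sketch of VRAM-based recommendations (thresholds assumed)."""
    recs = []
    if vram_gb >= inference_gb:
        recs.append("full fp16 inference fits in VRAM")
    elif vram_gb >= model_gb:
        # Weights fit, activations do not: trim activation memory.
        recs.append("enable attention/VAE slicing to reduce peak memory")
    else:
        # The logged case: 8 GB VRAM vs 24 GB of weights alone.
        recs.append("use sequential CPU offload (enable_sequential_cpu_offload)")
        recs.append("consider quantized weights (e.g. 8-bit)")
    return recs
```

For the logged scenario, `recommend(8.0, 24.0, 36.0)` lands in the offload branch, which is the standard escape hatch when weights alone exceed VRAM.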
2025-05-30 12:22:32,887 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 12:22:32,887 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 12:22:33,354 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 12:22:33,354 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 12:22:33,354 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 12:22:33,354 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 12:22:33,354 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 12:22:33,354 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 12:22:33,354 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 12:22:33,354 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 12:22:33,354 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 12:22:33,354 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 12:22:33,354 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 12:22:33,356 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 12:22:33,369 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 12:22:33,376 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 12:22:33,454 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 12:22:33,486 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 12:22:33,486 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:22:33,487 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:22:33,487 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:22:33,487 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:22:33,487 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:22:33,487 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:22:33,487 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:22:33 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 12:22:33,487 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 12:22:33,488 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:22:33,488 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:22:33,488 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:22:33,488 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:22:33,488 - httpcore.connection - DEBUG - close.started
2025-05-30 12:22:33,488 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:22:33,488 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 12:22:33,489 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:22:33,489 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:22:33,489 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:22:33,489 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:22:33,489 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:22:33,489 - httpcore.http11 - DEBUG - receive_response_headers.started request=
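The localhost:7860 requests in these records are Gradio verifying its own server is serving (GET /gradio_api/startup-events, then HEAD /) before reporting the app as launched. A minimal standard-library sketch of such a readiness check follows; the URL path is taken from the log, while the function name and retry policy are assumptions.

```python
import time
import urllib.error
import urllib.request

def wait_until_ready(base_url: str, attempts: int = 10,
                     delay: float = 0.2) -> bool:
    """Poll the startup endpoint until it answers 200 or we give up."""
    for _ in range(attempts):
        try:
            url = f"{base_url}/gradio_api/startup-events"
            with urllib.request.urlopen(url, timeout=3) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            time.sleep(delay)  # server not accepting connections yet
    return False
```

A caller would invoke e.g. `wait_until_ready("http://localhost:7860")` right after starting the server process, exactly the point in the log where the startup-events GET appears.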
2025-05-30 12:22:33,495 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:22:33 GMT'), (b'server', b'uvicorn'), (b'content-length', b'100681'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 12:22:33,495 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 12:22:33,495 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:22:33,495 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:22:33,495 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:22:33,495 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:22:33,495 - httpcore.connection - DEBUG - close.started
2025-05-30 12:22:33,495 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:22:33,507 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 12:22:33,513 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:22:33,514 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 12:22:33,628 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 12:22:33,654 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:22:33,654 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 12:22:33,784 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 12:22:33,784 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:22:33,785 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:22:33,785 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:22:33,785 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:22:33,785 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:22:33,923 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:22:33 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 12:22:33,924 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 12:22:33,924 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:22:33,924 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:22:33,924 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:22:33,924 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:22:33,924 - httpcore.connection - DEBUG - close.started
2025-05-30 12:22:33,924 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:22:33,950 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 12:22:33,950 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:22:33,950 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:22:33,950 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:22:33,951 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:22:33,951 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:22:34,099 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:22:34 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 12:22:34,100 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 12:22:34,100 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:22:34,100 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:22:34,100 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:22:34,100 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:22:34,100 - httpcore.connection - DEBUG - close.started
2025-05-30 12:22:34,101 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:22:34,251 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:22:34,251 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:22:34,252 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 12:22:34,252 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 12:22:34,252 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:22:34,252 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:22:34,252 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 12:22:34,252 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:22:34,252 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:22:34,683 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 12:22:34,905 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 12:23:24,138 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 12:23:24,138 - __main__ - DEBUG - API key found, length: 39
2025-05-30 12:23:24,138 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 12:23:24,138 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 12:23:24,138 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 12:23:24,138 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 12:23:24,138 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 12:23:24,138 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 12:23:24,138 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 12:23:24,138 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 12:23:24,142 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 12:23:24,142 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 12:23:24,609 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 12:23:24,609 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 12:23:24,609 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 12:23:24,609 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 12:23:24,609 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 12:23:24,609 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 12:23:24,609 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 12:23:24,609 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 12:23:24,609 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 12:23:24,609 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 12:23:24,609 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 12:23:24,612 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 12:23:24,625 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 12:23:24,634 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 12:23:24,729 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 12:23:24,761 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 12:23:24,762 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:23:24,762 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:23:24,762 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:23:24,762 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:23:24,762 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:23:24,762 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:23:24,763 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:23:24 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 12:23:24,763 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 12:23:24,763 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:23:24,763 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:23:24,763 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:23:24,763 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:23:24,763 - httpcore.connection - DEBUG - close.started
2025-05-30 12:23:24,763 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:23:24,763 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 12:23:24,764 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:23:24,764 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:23:24,764 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:23:24,764 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:23:24,764 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:23:24,764 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:23:24,770 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:23:24 GMT'), (b'server', b'uvicorn'), (b'content-length', b'100672'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 12:23:24,770 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 12:23:24,770 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:23:24,770 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:23:24,771 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:23:24,771 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:23:24,771 - httpcore.connection - DEBUG - close.started
2025-05-30 12:23:24,771 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:23:24,782 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 12:23:24,807 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:23:24,807 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 12:23:24,921 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 12:23:24,925 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:23:24,925 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 12:23:25,097 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 12:23:25,098 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:23:25,098 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:23:25,098 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:23:25,099 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:23:25,099 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:23:25,210 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 12:23:25,210 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:23:25,211 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:23:25,211 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:23:25,211 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:23:25,211 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:23:25,247 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:23:25 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 12:23:25,247 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 12:23:25,248 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:23:25,248 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:23:25,248 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:23:25,248 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:23:25,248 - httpcore.connection - DEBUG - close.started
2025-05-30 12:23:25,248 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:23:25,357 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:23:25 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 12:23:25,358 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 12:23:25,358 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:23:25,358 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:23:25,358 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:23:25,359 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:23:25,359 - httpcore.connection - DEBUG - close.started
2025-05-30 12:23:25,359 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:23:25,986 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 12:23:26,205 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 12:23:26,698 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:23:26,699 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:23:26,699 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 12:23:26,699 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 12:23:26,699 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:23:26,699 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:23:26,699 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 12:23:26,699 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:23:26,699 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:23:54,576 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 12:23:54,576 - __main__ - DEBUG - API key found, length: 39
2025-05-30 12:23:54,576 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 12:23:54,576 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 12:23:54,576 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 12:23:54,576 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 12:23:54,576 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 12:23:54,576 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 12:23:54,576 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 12:23:54,576 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 12:23:54,580 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 12:23:54,580 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 12:23:55,041 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 12:23:55,041 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 12:23:55,041 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 12:23:55,041 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 12:23:55,041 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 12:23:55,041 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 12:23:55,041 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 12:23:55,041 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 12:23:55,041 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 12:23:55,041 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 12:23:55,041 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 12:23:55,043 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 12:23:55,057 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 12:23:55,064 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 12:23:55,145 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 12:23:55,177 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 12:23:55,178 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:23:55,178 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:23:55,178 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:23:55,178 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:23:55,178 -
httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:23:55,178 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:23:55,179 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:23:55 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 12:23:55,179 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 12:23:55,179 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:23:55,179 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:23:55,179 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:23:55,179 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:23:55,179 - httpcore.connection - DEBUG - close.started 2025-05-30 12:23:55,179 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:23:55,180 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 12:23:55,180 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:23:55,180 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:23:55,180 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:23:55,180 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:23:55,180 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:23:55,180 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:23:55,187 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:23:55 GMT'), (b'server', b'uvicorn'), (b'content-length', b'100632'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 12:23:55,187 - httpx - INFO - 
HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 12:23:55,187 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:23:55,187 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:23:55,187 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:23:55,187 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:23:55,187 - httpcore.connection - DEBUG - close.started 2025-05-30 12:23:55,187 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:23:55,198 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 12:23:55,224 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:23:55,224 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 12:23:55,331 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 12:23:55,342 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:23:55,342 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 12:23:55,504 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:23:55,505 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:23:55,505 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:23:55,505 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:23:55,506 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:23:55,506 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:23:55,636 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:23:55,636 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:23:55,637 - 
httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:23:55,637 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:23:55,637 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:23:55,637 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:23:55,646 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:23:55 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 12:23:55,646 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 12:23:55,647 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:23:55,647 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:23:55,647 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:23:55,647 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:23:55,648 - httpcore.connection - DEBUG - close.started 2025-05-30 12:23:55,648 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:23:55,783 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:23:55 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 12:23:55,783 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 12:23:55,783 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:23:55,783 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:23:55,783 - 
httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:23:55,783 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:23:55,783 - httpcore.connection - DEBUG - close.started 2025-05-30 12:23:55,783 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:23:56,352 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:23:56,404 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:23:56,404 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:23:56,404 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 12:23:56,404 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 12:23:56,404 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:23:56,404 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:23:56,404 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 12:23:56,404 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:23:56,404 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:23:56,575 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 12:25:34,396 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 12:25:34,396 - __main__ - DEBUG - API key found, length: 39 2025-05-30 12:25:34,396 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 12:25:34,396 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 
12:25:34,396 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 12:25:34,396 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 12:25:34,396 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 12:25:34,396 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 12:25:34,396 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 12:25:34,396 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 12:25:34,399 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 12:25:34,400 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 12:25:34,861 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 12:25:34,861 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 12:25:34,861 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 12:25:34,861 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 12:25:34,861 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 12:25:34,861 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 12:25:34,861 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 12:25:34,861 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 12:25:34,861 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 12:25:34,861 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 12:25:34,861 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 12:25:34,863 - asyncio - DEBUG - Using selector: KqueueSelector 
2025-05-30 12:25:34,877 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 12:25:34,884 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:25:34,964 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:25:34,998 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 12:25:34,998 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:25:34,998 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:25:34,999 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:25:34,999 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:25:34,999 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:25:34,999 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:25:34,999 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:25:34 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 12:25:34,999 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 12:25:34,999 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:25:34,999 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:25:35,000 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:25:35,000 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:25:35,000 - httpcore.connection - DEBUG - close.started 2025-05-30 12:25:35,000 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:25:35,000 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 
local_address=None timeout=3 socket_options=None 2025-05-30 12:25:35,000 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:25:35,000 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:25:35,001 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:25:35,001 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:25:35,001 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:25:35,001 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:25:35,007 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:25:34 GMT'), (b'server', b'uvicorn'), (b'content-length', b'100633'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 12:25:35,007 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 12:25:35,007 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:25:35,007 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:25:35,007 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:25:35,007 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:25:35,007 - httpcore.connection - DEBUG - close.started 2025-05-30 12:25:35,007 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:25:35,019 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 12:25:35,048 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:25:35,048 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 12:25:35,162 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 12:25:35,168 - httpcore.connection - DEBUG - 
connect_tcp.complete return_value= 2025-05-30 12:25:35,168 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 12:25:35,343 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:25:35,344 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:25:35,344 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:25:35,344 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:25:35,344 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:25:35,344 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:25:35,468 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:25:35,468 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:25:35,469 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:25:35,469 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:25:35,469 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:25:35,469 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:25:35,491 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:25:35 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 12:25:35,491 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 12:25:35,491 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:25:35,491 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:25:35,491 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:25:35,491 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 
12:25:35,491 - httpcore.connection - DEBUG - close.started 2025-05-30 12:25:35,491 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:25:35,608 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:25:35,609 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:25:35,609 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 12:25:35,609 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 12:25:35,609 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:25:35,609 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:25:35,609 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 12:25:35,610 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:25:35,610 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:25:35,623 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:25:35 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 12:25:35,623 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 12:25:35,623 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:25:35,624 - httpcore.http11 - DEBUG - 
receive_response_body.complete 2025-05-30 12:25:35,624 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:25:35,624 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:25:35,624 - httpcore.connection - DEBUG - close.started 2025-05-30 12:25:35,624 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:25:36,264 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:25:36,494 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 12:35:49,296 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 12:35:49,296 - __main__ - DEBUG - API key found, length: 39 2025-05-30 12:35:49,296 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 12:35:49,296 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 12:35:49,296 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 12:35:49,296 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 12:35:49,296 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 12:35:49,296 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 12:35:49,296 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 12:35:49,296 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 12:35:49,300 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 12:35:49,300 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 12:35:49,751 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 12:35:49,752 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 12:35:49,752 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 12:35:49,752 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 
'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 12:35:49,752 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 12:35:49,752 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 12:35:49,752 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 12:35:49,752 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 12:35:49,752 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 12:35:49,752 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 12:35:49,752 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 12:35:49,754 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:35:49,766 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 12:35:49,774 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:35:49,853 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:35:49,885 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 12:35:49,885 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:35:49,886 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:35:49,886 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:35:49,886 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:35:49,886 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:35:49,886 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:35:49,886 - httpcore.http11 - DEBUG - receive_response_headers.complete 
return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:35:49 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 12:35:49,887 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 12:35:49,887 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:35:49,887 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:35:49,887 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:35:49,887 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:35:49,887 - httpcore.connection - DEBUG - close.started 2025-05-30 12:35:49,887 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:35:49,887 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 12:35:49,888 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:35:49,888 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:35:49,888 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:35:49,888 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:35:49,888 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:35:49,888 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:35:49,894 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:35:49 GMT'), (b'server', b'uvicorn'), (b'content-length', b'100617'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 12:35:49,894 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 12:35:49,894 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:35:49,894 - httpcore.http11 - DEBUG - receive_response_body.complete 
2025-05-30 12:35:49,894 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:35:49,894 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:35:49,894 - httpcore.connection - DEBUG - close.started 2025-05-30 12:35:49,894 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:35:49,906 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 12:35:49,959 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:35:49,959 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 12:35:50,041 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:35:50,042 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 12:35:50,058 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 12:35:50,290 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:35:50,290 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:35:50,290 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:35:50,290 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:35:50,290 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:35:50,290 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:35:50,315 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:35:50,316 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:35:50,316 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:35:50,316 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:35:50,316 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 
12:35:50,316 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:35:50,454 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:35:50 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 12:35:50,454 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 12:35:50,455 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:35:50,455 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:35:50,456 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:35:50 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 12:35:50,456 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:35:50,456 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 12:35:50,456 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:35:50,456 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:35:50,457 - httpcore.connection - DEBUG - close.started 2025-05-30 12:35:50,457 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:35:50,457 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:35:50,457 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:35:50,457 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:35:50,458 - httpcore.connection - DEBUG - close.started 2025-05-30 12:35:50,458 - httpcore.connection - 
DEBUG - close.complete
2025-05-30 12:35:51,059 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 12:35:51,278 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 12:40:50,489 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 12:40:50,489 - __main__ - DEBUG - API key found, length: 39
2025-05-30 12:40:50,489 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 12:40:50,489 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 12:40:50,489 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 12:40:50,489 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 12:40:50,489 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 12:40:50,489 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 12:40:50,489 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 12:40:50,489 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 12:40:50,492 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 12:40:50,492 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 12:40:50,960 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 12:40:50,960 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 12:40:50,960 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 12:40:50,960 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 12:40:50,960 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 12:40:50,960 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 12:40:50,960 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 12:40:50,960 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 12:40:50,960 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 12:40:50,960 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 12:40:50,960 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 12:40:50,962 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 12:40:50,976 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 12:40:50,981 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 12:40:51,061 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 12:40:51,094 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 12:40:51,095 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:40:51,095 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:40:51,095 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:40:51,095 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:40:51,095 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:40:51,095 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:40:51,096 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:40:51 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 12:40:51,096 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 12:40:51,096 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:40:51,096 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:40:51,096 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:40:51,096 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:40:51,096 - httpcore.connection - DEBUG - close.started
2025-05-30 12:40:51,096 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:40:51,097 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 12:40:51,097 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:40:51,097 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:40:51,097 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:40:51,097 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:40:51,097 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:40:51,097 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:40:51,103 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:40:51 GMT'), (b'server', b'uvicorn'), (b'content-length', b'100551'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 12:40:51,104 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 12:40:51,104 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:40:51,104 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:40:51,104 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:40:51,104 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:40:51,104 - httpcore.connection - DEBUG - close.started
2025-05-30 12:40:51,104 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:40:51,116 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 12:40:51,227 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:40:51,227 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 12:40:51,249 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:40:51,250 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 12:40:51,267 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 12:40:51,515 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 12:40:51,515 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:40:51,515 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:40:51,515 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:40:51,515 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:40:51,515 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:40:51,571 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 12:40:51,571 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:40:51,572 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:40:51,572 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:40:51,572 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:40:51,572 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:40:51,659 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:40:51 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 12:40:51,660 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 12:40:51,660 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:40:51,660 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:40:51,660 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:40:51,661 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:40:51,661 - httpcore.connection - DEBUG - close.started
2025-05-30 12:40:51,661 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:40:51,708 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:40:51 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 12:40:51,708 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 12:40:51,708 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:40:51,709 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:40:51,709 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:40:51,709 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:40:51,710 - httpcore.connection - DEBUG - close.started
2025-05-30 12:40:51,710 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:40:52,331 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 12:40:52,553 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
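The startup sequence above (platform probe, `nvidia-smi` lookup, then PyTorch CUDA/MPS flags, producing the logged `Detected specs` dict) can be sketched roughly as follows. This is a reconstruction from the log output only; the function name and the `nvidia-smi` query flags are assumptions, not the actual HardwareDetector source:

```python
import os
import platform
import shutil
import subprocess

def detect_hardware() -> dict:
    """Rough sketch of the HardwareDetector flow visible in the logs (hypothetical API)."""
    specs = {
        "platform": platform.system(),          # e.g. 'Darwin'
        "architecture": platform.machine(),     # e.g. 'arm64'
        "cpu_count": os.cpu_count(),
        "python_version": platform.python_version(),
        "gpu_info": None,
    }
    # Try nvidia-smi first; on Apple Silicon it is absent, so gpu_info stays None.
    if shutil.which("nvidia-smi"):
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
            capture_output=True, text=True,
        )
        specs["gpu_info"] = result.stdout.strip() or None
    # Fall back to PyTorch's own backend flags (CUDA on NVIDIA, MPS on Apple Silicon).
    try:
        import torch
        specs["torch_version"] = torch.__version__
        specs["cuda_available"] = torch.cuda.is_available()
        mps = getattr(torch.backends, "mps", None)
        specs["mps_available"] = bool(mps and mps.is_available())
    except ImportError:
        specs["cuda_available"] = specs["mps_available"] = False
    return specs
```

On the machine in these logs this would yield `cuda_available=False, mps_available=True`, matching the Darwin/arm64 entries above.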
2025-05-30 12:40:54,350 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:40:54,351 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:40:54,351 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 12:40:54,351 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 12:40:54,351 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:40:54,351 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:40:54,351 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 12:40:54,351 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:40:54,351 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:40:56,169 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnells
2025-05-30 12:40:56,387 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "GET /api/models/black-forest-labs/FLUX.1-schnells HTTP/1.1" 404 32
2025-05-30 12:40:56,388 - simple_memory_calculator - ERROR - API Error for model black-forest-labs/FLUX.1-schnells: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-683928c8-0aaabd7b090fb8ab544ad942;119072ca-3d15-4507-9fd9-cd5b7d1a6dd1) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnells. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
2025-05-30 12:40:56,388 - simple_memory_calculator - WARNING - Using generic estimation for black-forest-labs/FLUX.1-schnells due to: HuggingFace API Error: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-683928c8-0aaabd7b090fb8ab544ad942;119072ca-3d15-4507-9fd9-cd5b7d1a6dd1) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnells. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
2025-05-30 12:40:56,388 - simple_memory_calculator - DEBUG - Generic estimation parameters: 3.0B params, 6.0GB FP16
2025-05-30 12:40:56,388 - simple_memory_calculator - INFO - Generic estimation completed for black-forest-labs/FLUX.1-schnells
2025-05-30 12:40:56,388 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnells with 8.0GB VRAM
2025-05-30 12:40:56,388 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnells
2025-05-30 12:40:56,603 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "GET /api/models/black-forest-labs/FLUX.1-schnells HTTP/1.1" 404 32
2025-05-30 12:40:56,604 - simple_memory_calculator - ERROR - API Error for model black-forest-labs/FLUX.1-schnells: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-683928c8-2a18d1162232d2236c70bd4c;9cb12c6a-cd51-46f1-9735-e99948188cb1) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnells. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
2025-05-30 12:40:56,604 - simple_memory_calculator - WARNING - Using generic estimation for black-forest-labs/FLUX.1-schnells due to: HuggingFace API Error: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-683928c8-2a18d1162232d2236c70bd4c;9cb12c6a-cd51-46f1-9735-e99948188cb1) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnells. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
2025-05-30 12:40:56,604 - simple_memory_calculator - DEBUG - Generic estimation parameters: 3.0B params, 6.0GB FP16
2025-05-30 12:40:56,604 - simple_memory_calculator - INFO - Generic estimation completed for black-forest-labs/FLUX.1-schnells
2025-05-30 12:40:56,604 - simple_memory_calculator - DEBUG - Model memory: 6.0GB, Inference memory: 9.0GB
2025-05-30 12:40:56,605 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnells
2025-05-30 12:40:57,219 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "GET /api/models/black-forest-labs/FLUX.1-schnells HTTP/1.1" 404 32
2025-05-30 12:40:57,219 - simple_memory_calculator - ERROR - API Error for model black-forest-labs/FLUX.1-schnells: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-683928c8-31558ca50d62c79922202ea7;078e0e53-2d18-4b7a-b59a-bf7ef16d16a6) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnells. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
2025-05-30 12:40:57,219 - simple_memory_calculator - WARNING - Using generic estimation for black-forest-labs/FLUX.1-schnells due to: HuggingFace API Error: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-683928c8-31558ca50d62c79922202ea7;078e0e53-2d18-4b7a-b59a-bf7ef16d16a6) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnells. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
2025-05-30 12:40:57,220 - simple_memory_calculator - DEBUG - Generic estimation parameters: 3.0B params, 6.0GB FP16
2025-05-30 12:40:57,220 - simple_memory_calculator - INFO - Generic estimation completed for black-forest-labs/FLUX.1-schnells
2025-05-30 12:41:02,720 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnellsfddxcv
2025-05-30 12:41:02,941 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "GET /api/models/black-forest-labs/FLUX.1-schnellsfddxcv HTTP/1.1" 404 32
2025-05-30 12:41:02,941 - simple_memory_calculator - ERROR - API Error for model black-forest-labs/FLUX.1-schnellsfddxcv: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-683928ce-6bc10f950c617f50323bab05;e15b6937-2a48-40ee-a0cc-d7891fbc5099) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnellsfddxcv. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
2025-05-30 12:41:02,941 - simple_memory_calculator - WARNING - Using generic estimation for black-forest-labs/FLUX.1-schnellsfddxcv due to: HuggingFace API Error: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-683928ce-6bc10f950c617f50323bab05;e15b6937-2a48-40ee-a0cc-d7891fbc5099) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnellsfddxcv. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
2025-05-30 12:41:02,941 - simple_memory_calculator - DEBUG - Generic estimation parameters: 3.0B params, 6.0GB FP16
2025-05-30 12:41:02,941 - simple_memory_calculator - INFO - Generic estimation completed for black-forest-labs/FLUX.1-schnellsfddxcv
2025-05-30 12:41:02,941 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnellsfddxcv with 8.0GB VRAM
2025-05-30 12:41:02,941 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnellsfddxcv
2025-05-30 12:41:03,157 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "GET /api/models/black-forest-labs/FLUX.1-schnellsfddxcv HTTP/1.1" 404 32
2025-05-30 12:41:03,158 - simple_memory_calculator - ERROR - API Error for model black-forest-labs/FLUX.1-schnellsfddxcv: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-683928cf-4d3c1a245dbb80321c0c9713;a18c732a-3397-4cfe-a773-eb2b97b2b218) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnellsfddxcv. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
2025-05-30 12:41:03,158 - simple_memory_calculator - WARNING - Using generic estimation for black-forest-labs/FLUX.1-schnellsfddxcv due to: HuggingFace API Error: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-683928cf-4d3c1a245dbb80321c0c9713;a18c732a-3397-4cfe-a773-eb2b97b2b218) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnellsfddxcv. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
2025-05-30 12:41:03,158 - simple_memory_calculator - DEBUG - Generic estimation parameters: 3.0B params, 6.0GB FP16
2025-05-30 12:41:03,158 - simple_memory_calculator - INFO - Generic estimation completed for black-forest-labs/FLUX.1-schnellsfddxcv
2025-05-30 12:41:03,158 - simple_memory_calculator - DEBUG - Model memory: 6.0GB, Inference memory: 9.0GB
2025-05-30 12:41:03,158 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnellsfddxcv
2025-05-30 12:41:03,711 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "GET /api/models/black-forest-labs/FLUX.1-schnellsfddxcv HTTP/1.1" 404 32
2025-05-30 12:41:03,712 - simple_memory_calculator - ERROR - API Error for model black-forest-labs/FLUX.1-schnellsfddxcv: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-683928cf-329640bd3ed026a47c89c979;4df15827-4846-450a-8d8c-1f058d95d647) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnellsfddxcv. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
2025-05-30 12:41:03,712 - simple_memory_calculator - WARNING - Using generic estimation for black-forest-labs/FLUX.1-schnellsfddxcv due to: HuggingFace API Error: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-683928cf-329640bd3ed026a47c89c979;4df15827-4846-450a-8d8c-1f058d95d647) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnellsfddxcv. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
2025-05-30 12:41:03,712 - simple_memory_calculator - DEBUG - Generic estimation parameters: 3.0B params, 6.0GB FP16
2025-05-30 12:41:03,713 - simple_memory_calculator - INFO - Generic estimation completed for black-forest-labs/FLUX.1-schnellsfddxcv
2025-05-30 12:41:07,547 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnellsfddxcv
2025-05-30 12:41:07,767 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "GET /api/models/black-forest-labs/FLUX.1-schnellsfddxcv HTTP/1.1" 404 32
2025-05-30 12:41:07,767 - simple_memory_calculator - ERROR - API Error for model black-forest-labs/FLUX.1-schnellsfddxcv: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-683928d3-379ee12d653da24507b48ddf;1ed7d97f-8890-4512-ae6e-264623326f56) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnellsfddxcv. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
2025-05-30 12:41:07,767 - simple_memory_calculator - WARNING - Using generic estimation for black-forest-labs/FLUX.1-schnellsfddxcv due to: HuggingFace API Error: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-683928d3-379ee12d653da24507b48ddf;1ed7d97f-8890-4512-ae6e-264623326f56) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnellsfddxcv. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
2025-05-30 12:41:07,767 - simple_memory_calculator - DEBUG - Generic estimation parameters: 3.0B params, 6.0GB FP16
2025-05-30 12:41:07,767 - simple_memory_calculator - INFO - Generic estimation completed for black-forest-labs/FLUX.1-schnellsfddxcv
2025-05-30 12:41:07,767 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnellsfddxcv with 8.0GB VRAM
2025-05-30 12:41:07,767 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnellsfddxcv
2025-05-30 12:41:07,985 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "GET /api/models/black-forest-labs/FLUX.1-schnellsfddxcv HTTP/1.1" 404 32
2025-05-30 12:41:07,986 - simple_memory_calculator - ERROR - API Error for model black-forest-labs/FLUX.1-schnellsfddxcv: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-683928d3-32b4a9f33024023b320a85a4;ba4aac7d-c3e8-4c72-a1da-1e1537f34656) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnellsfddxcv. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
2025-05-30 12:41:07,986 - simple_memory_calculator - WARNING - Using generic estimation for black-forest-labs/FLUX.1-schnellsfddxcv due to: HuggingFace API Error: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-683928d3-32b4a9f33024023b320a85a4;ba4aac7d-c3e8-4c72-a1da-1e1537f34656) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnellsfddxcv. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
2025-05-30 12:41:07,986 - simple_memory_calculator - DEBUG - Generic estimation parameters: 3.0B params, 6.0GB FP16
2025-05-30 12:41:07,986 - simple_memory_calculator - INFO - Generic estimation completed for black-forest-labs/FLUX.1-schnellsfddxcv
2025-05-30 12:41:07,986 - simple_memory_calculator - DEBUG - Model memory: 6.0GB, Inference memory: 9.0GB
2025-05-30 12:41:07,986 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnellsfddxcv
2025-05-30 12:41:08,278 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "GET /api/models/black-forest-labs/FLUX.1-schnellsfddxcv HTTP/1.1" 404 32
2025-05-30 12:41:08,278 - simple_memory_calculator - ERROR - API Error for model black-forest-labs/FLUX.1-schnellsfddxcv: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-683928d4-6ec300a378e96671193bad57;967826f7-13e6-4552-8bb7-2d14e66613e9) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnellsfddxcv. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
2025-05-30 12:41:08,278 - simple_memory_calculator - WARNING - Using generic estimation for black-forest-labs/FLUX.1-schnellsfddxcv due to: HuggingFace API Error: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-683928d4-6ec300a378e96671193bad57;967826f7-13e6-4552-8bb7-2d14e66613e9) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnellsfddxcv. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
2025-05-30 12:41:08,279 - simple_memory_calculator - DEBUG - Generic estimation parameters: 3.0B params, 6.0GB FP16
2025-05-30 12:41:08,279 - simple_memory_calculator - INFO - Generic estimation completed for black-forest-labs/FLUX.1-schnellsfddxcv
2025-05-30 12:42:44,837 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 12:42:44,837 - __main__ - DEBUG - API key found, length: 39
2025-05-30 12:42:44,837 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 12:42:44,837 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 12:42:44,837 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 12:42:44,837 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 12:42:44,837 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 12:42:44,838 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 12:42:44,838 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 12:42:44,838 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 12:42:44,842 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 12:42:44,842 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 12:42:45,305 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 12:42:45,305 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 12:42:45,305 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 12:42:45,305 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 12:42:45,305 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 12:42:45,305 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 12:42:45,305 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 12:42:45,305 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 12:42:45,305 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 12:42:45,305 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 12:42:45,305 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 12:42:45,307 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 12:42:45,319 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 12:42:45,327 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 12:42:45,406 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 12:42:45,439 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 12:42:45,439 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:42:45,439 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:42:45,439 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:42:45,439 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:42:45,440 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:42:45,440 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:42:45,440 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:42:45 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 12:42:45,440 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 12:42:45,440 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:42:45,440 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:42:45,440 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:42:45,440 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:42:45,440 - httpcore.connection - DEBUG - close.started
2025-05-30 12:42:45,440 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:42:45,441 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 12:42:45,441 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:42:45,441 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:42:45,441 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:42:45,441 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:42:45,441 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:42:45,442 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:42:45,448 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:42:45 GMT'), (b'server', b'uvicorn'), (b'content-length', b'100551'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 12:42:45,448 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 12:42:45,448 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:42:45,448 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:42:45,448 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:42:45,448 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:42:45,448 - httpcore.connection - DEBUG - close.started
2025-05-30 12:42:45,448 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:42:45,459 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 12:42:45,487 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:42:45,487 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 12:42:45,604 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:42:45,604 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 12:42:45,612 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 12:42:45,765 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 12:42:45,766 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:42:45,766 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:42:45,766 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:42:45,767 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:42:45,767 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:42:45,893 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 12:42:45,893 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:42:45,894 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:42:45,894 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:42:45,894 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:42:45,894 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:42:45,905 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:42:45 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 12:42:45,906 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 12:42:45,906 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:42:45,906 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:42:45,906 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:42:45,906 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:42:45,906 - httpcore.connection - DEBUG - close.started
2025-05-30 12:42:45,907 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:42:46,041 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:42:45 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 12:42:46,041 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 12:42:46,041 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:42:46,041 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:42:46,042 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:42:46,042 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:42:46,042 - httpcore.connection - DEBUG - close.started
2025-05-30 12:42:46,042 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:42:46,611 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:42:46,611 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:42:46,611 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 12:42:46,611 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 12:42:46,611 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:42:46,611 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:42:46,611 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 12:42:46,611 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:42:46,612 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:42:46,670 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 12:42:46,892 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 12:42:49,704 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnelldd
2025-05-30
12:42:49,706 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:42:49,941 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "GET /api/models/black-forest-labs/FLUX.1-schnelldd HTTP/1.1" 404 32 2025-05-30 12:42:49,942 - simple_memory_calculator - ERROR - API Error for model black-forest-labs/FLUX.1-schnelldd: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-68392939-0005b45d61830d803048ce93;deadb8a4-9a08-46f3-8c34-69d166a424ef) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnelldd. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication 2025-05-30 12:42:49,942 - simple_memory_calculator - WARNING - Using generic estimation for black-forest-labs/FLUX.1-schnelldd due to: HuggingFace API Error: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-68392939-0005b45d61830d803048ce93;deadb8a4-9a08-46f3-8c34-69d166a424ef) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnelldd. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. 
For more details, see https://huggingface.co/docs/huggingface_hub/authentication 2025-05-30 12:42:49,942 - simple_memory_calculator - DEBUG - Generic estimation parameters: 3.0B params, 6.0GB FP16 2025-05-30 12:42:49,942 - simple_memory_calculator - INFO - Generic estimation completed for black-forest-labs/FLUX.1-schnelldd 2025-05-30 12:42:49,942 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnelldd with 8.0GB VRAM 2025-05-30 12:42:49,942 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnelldd 2025-05-30 12:42:50,157 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "GET /api/models/black-forest-labs/FLUX.1-schnelldd HTTP/1.1" 404 32 2025-05-30 12:42:50,158 - simple_memory_calculator - ERROR - API Error for model black-forest-labs/FLUX.1-schnelldd: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-6839293a-415221553bb2f68d533bfded;99b52697-6d05-4582-8f1b-c6c58ebe11fa) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnelldd. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication 2025-05-30 12:42:50,158 - simple_memory_calculator - WARNING - Using generic estimation for black-forest-labs/FLUX.1-schnelldd due to: HuggingFace API Error: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-6839293a-415221553bb2f68d533bfded;99b52697-6d05-4582-8f1b-c6c58ebe11fa) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnelldd. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. 
For more details, see https://huggingface.co/docs/huggingface_hub/authentication 2025-05-30 12:42:50,158 - simple_memory_calculator - DEBUG - Generic estimation parameters: 3.0B params, 6.0GB FP16 2025-05-30 12:42:50,158 - simple_memory_calculator - INFO - Generic estimation completed for black-forest-labs/FLUX.1-schnelldd 2025-05-30 12:42:50,158 - simple_memory_calculator - DEBUG - Model memory: 6.0GB, Inference memory: 9.0GB 2025-05-30 12:42:50,158 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnelldd 2025-05-30 12:42:50,409 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "GET /api/models/black-forest-labs/FLUX.1-schnelldd HTTP/1.1" 404 32 2025-05-30 12:42:50,410 - simple_memory_calculator - ERROR - API Error for model black-forest-labs/FLUX.1-schnelldd: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-6839293a-424b7c1279f4673a7a262bde;24b28a19-f909-426a-95df-82f52d74581f) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnelldd. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication 2025-05-30 12:42:50,410 - simple_memory_calculator - WARNING - Using generic estimation for black-forest-labs/FLUX.1-schnelldd due to: HuggingFace API Error: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-6839293a-424b7c1279f4673a7a262bde;24b28a19-f909-426a-95df-82f52d74581f) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnelldd. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. 
For more details, see https://huggingface.co/docs/huggingface_hub/authentication 2025-05-30 12:42:50,410 - simple_memory_calculator - DEBUG - Generic estimation parameters: 3.0B params, 6.0GB FP16 2025-05-30 12:42:50,410 - simple_memory_calculator - INFO - Generic estimation completed for black-forest-labs/FLUX.1-schnelldd 2025-05-30 12:42:55,241 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnelldd 2025-05-30 12:42:55,461 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "GET /api/models/black-forest-labs/FLUX.1-schnelldd HTTP/1.1" 404 32 2025-05-30 12:42:55,462 - simple_memory_calculator - ERROR - API Error for model black-forest-labs/FLUX.1-schnelldd: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-6839293f-45a69887498716fe2290453d;5f056895-3da7-402c-a0b2-8779f6a300f2) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnelldd. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication 2025-05-30 12:42:55,462 - simple_memory_calculator - WARNING - Using generic estimation for black-forest-labs/FLUX.1-schnelldd due to: HuggingFace API Error: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-6839293f-45a69887498716fe2290453d;5f056895-3da7-402c-a0b2-8779f6a300f2) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnelldd. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. 
For more details, see https://huggingface.co/docs/huggingface_hub/authentication 2025-05-30 12:42:55,462 - simple_memory_calculator - DEBUG - Generic estimation parameters: 3.0B params, 6.0GB FP16 2025-05-30 12:42:55,462 - simple_memory_calculator - INFO - Generic estimation completed for black-forest-labs/FLUX.1-schnelldd 2025-05-30 12:42:55,462 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnelldd with 8.0GB VRAM 2025-05-30 12:42:55,462 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnelldd 2025-05-30 12:42:55,696 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "GET /api/models/black-forest-labs/FLUX.1-schnelldd HTTP/1.1" 404 32 2025-05-30 12:42:55,697 - simple_memory_calculator - ERROR - API Error for model black-forest-labs/FLUX.1-schnelldd: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-6839293f-021ad2693618a43e583dd961;29f9ee24-d491-43e9-826e-b61e5da13e14) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnelldd. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication 2025-05-30 12:42:55,697 - simple_memory_calculator - WARNING - Using generic estimation for black-forest-labs/FLUX.1-schnelldd due to: HuggingFace API Error: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-6839293f-021ad2693618a43e583dd961;29f9ee24-d491-43e9-826e-b61e5da13e14) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnelldd. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. 
For more details, see https://huggingface.co/docs/huggingface_hub/authentication 2025-05-30 12:42:55,697 - simple_memory_calculator - DEBUG - Generic estimation parameters: 3.0B params, 6.0GB FP16 2025-05-30 12:42:55,697 - simple_memory_calculator - INFO - Generic estimation completed for black-forest-labs/FLUX.1-schnelldd 2025-05-30 12:42:55,698 - simple_memory_calculator - DEBUG - Model memory: 6.0GB, Inference memory: 9.0GB 2025-05-30 12:42:55,698 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnelldd 2025-05-30 12:42:55,912 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "GET /api/models/black-forest-labs/FLUX.1-schnelldd HTTP/1.1" 404 32 2025-05-30 12:42:55,912 - simple_memory_calculator - ERROR - API Error for model black-forest-labs/FLUX.1-schnelldd: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-6839293f-750279ec4331878e6674dd97;27ce0e25-8488-4859-b492-d361b3e93eaa) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnelldd. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication 2025-05-30 12:42:55,912 - simple_memory_calculator - WARNING - Using generic estimation for black-forest-labs/FLUX.1-schnelldd due to: HuggingFace API Error: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-6839293f-750279ec4331878e6674dd97;27ce0e25-8488-4859-b492-d361b3e93eaa) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnelldd. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. 
For more details, see https://huggingface.co/docs/huggingface_hub/authentication 2025-05-30 12:42:55,912 - simple_memory_calculator - DEBUG - Generic estimation parameters: 3.0B params, 6.0GB FP16 2025-05-30 12:42:55,913 - simple_memory_calculator - INFO - Generic estimation completed for black-forest-labs/FLUX.1-schnelldd 2025-05-30 12:44:05,771 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 12:44:05,771 - __main__ - DEBUG - API key found, length: 39 2025-05-30 12:44:05,771 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 12:44:05,771 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 12:44:05,771 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 12:44:05,771 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 12:44:05,771 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 12:44:05,771 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 12:44:05,771 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 12:44:05,771 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 12:44:05,774 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 12:44:05,774 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 12:44:06,257 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 12:44:06,257 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 12:44:06,257 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 12:44:06,257 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 12:44:06,257 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 
12:44:06,257 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 12:44:06,257 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 12:44:06,257 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 12:44:06,257 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 12:44:06,257 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 12:44:06,257 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 12:44:06,260 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:44:06,273 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 12:44:06,279 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:44:06,358 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:44:06,390 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 12:44:06,391 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:44:06,391 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:44:06,391 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:44:06,391 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:44:06,391 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:44:06,391 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:44:06,392 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:44:06 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 12:44:06,392 - httpx - INFO - HTTP Request: GET 
http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 12:44:06,392 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:44:06,392 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:44:06,392 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:44:06,392 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:44:06,392 - httpcore.connection - DEBUG - close.started 2025-05-30 12:44:06,392 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:44:06,393 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 12:44:06,393 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:44:06,393 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:44:06,393 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:44:06,393 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:44:06,394 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:44:06,394 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:44:06,399 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:44:06 GMT'), (b'server', b'uvicorn'), (b'content-length', b'100541'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 12:44:06,400 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 12:44:06,400 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:44:06,400 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:44:06,400 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:44:06,400 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:44:06,400 - httpcore.connection - DEBUG - close.started 2025-05-30 
12:44:06,400 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:44:06,411 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 12:44:06,459 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:44:06,459 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 12:44:06,553 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:44:06,553 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 12:44:06,556 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 12:44:06,802 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:44:06,802 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:44:06,803 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:44:06,803 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:44:06,803 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:44:06,803 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:44:06,832 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:44:06,832 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:44:06,832 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:44:06,832 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:44:06,832 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:44:06,832 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:44:06,967 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 
03:44:06 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 12:44:06,967 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 12:44:06,968 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:44:06,968 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:44:06,968 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:44:06,968 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:44:06,968 - httpcore.connection - DEBUG - close.started 2025-05-30 12:44:06,968 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:44:06,974 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:44:06 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 12:44:06,975 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 12:44:06,975 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:44:06,975 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:44:06,976 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:44:06,976 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:44:06,976 - httpcore.connection - DEBUG - close.started 2025-05-30 12:44:06,976 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:44:07,304 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:44:07,305 - simple_memory_calculator - INFO - Using known memory data 
for black-forest-labs/FLUX.1-schnell 2025-05-30 12:44:07,305 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 12:44:07,306 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 12:44:07,306 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:44:07,306 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:44:07,306 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 12:44:07,306 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:44:07,306 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:44:07,566 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:44:07,785 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 12:44:09,696 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnella 2025-05-30 12:44:09,697 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:44:10,142 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "GET /api/models/black-forest-labs/FLUX.1-schnella HTTP/1.1" 404 32 2025-05-30 12:44:10,142 - simple_memory_calculator - ERROR - API Error for model black-forest-labs/FLUX.1-schnella: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-68392989-25945bd33b836bfd304e6410;59b8eecc-69af-4bd4-a300-491f715f14bc) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnella. 
Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication 2025-05-30 12:44:10,142 - simple_memory_calculator - WARNING - Using generic estimation for black-forest-labs/FLUX.1-schnella due to: HuggingFace API Error: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-68392989-25945bd33b836bfd304e6410;59b8eecc-69af-4bd4-a300-491f715f14bc) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnella. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication 2025-05-30 12:44:10,143 - simple_memory_calculator - DEBUG - Generic estimation parameters: 3.0B params, 6.0GB FP16 2025-05-30 12:44:10,143 - simple_memory_calculator - INFO - Generic estimation completed for black-forest-labs/FLUX.1-schnella 2025-05-30 12:44:10,143 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnella with 8.0GB VRAM 2025-05-30 12:44:10,143 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnella 2025-05-30 12:44:10,411 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "GET /api/models/black-forest-labs/FLUX.1-schnella HTTP/1.1" 404 32 2025-05-30 12:44:10,411 - simple_memory_calculator - ERROR - API Error for model black-forest-labs/FLUX.1-schnella: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-6839298a-657eafab13f860d04a7fd405;84e5cfa7-02dc-451c-bb4a-5c3a159d6c96) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnella. Please make sure you specified the correct `repo_id` and `repo_type`. 
If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication 2025-05-30 12:44:10,411 - simple_memory_calculator - WARNING - Using generic estimation for black-forest-labs/FLUX.1-schnella due to: HuggingFace API Error: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-6839298a-657eafab13f860d04a7fd405;84e5cfa7-02dc-451c-bb4a-5c3a159d6c96) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnella. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication 2025-05-30 12:44:10,411 - simple_memory_calculator - DEBUG - Generic estimation parameters: 3.0B params, 6.0GB FP16 2025-05-30 12:44:10,412 - simple_memory_calculator - INFO - Generic estimation completed for black-forest-labs/FLUX.1-schnella 2025-05-30 12:44:10,412 - simple_memory_calculator - DEBUG - Model memory: 6.0GB, Inference memory: 9.0GB 2025-05-30 12:44:10,412 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnella 2025-05-30 12:44:10,624 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "GET /api/models/black-forest-labs/FLUX.1-schnella HTTP/1.1" 404 32 2025-05-30 12:44:10,624 - simple_memory_calculator - ERROR - API Error for model black-forest-labs/FLUX.1-schnella: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-6839298a-43c2472b3b0815ab6fd04f20;18b51d12-51ef-468d-9578-65597ac9d467) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnella. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. 
For more details, see https://huggingface.co/docs/huggingface_hub/authentication
2025-05-30 12:44:10,625 - simple_memory_calculator - WARNING - Using generic estimation for black-forest-labs/FLUX.1-schnella due to: HuggingFace API Error: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-6839298a-43c2472b3b0815ab6fd04f20;18b51d12-51ef-468d-9578-65597ac9d467) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnella. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
2025-05-30 12:44:10,625 - simple_memory_calculator - DEBUG - Generic estimation parameters: 3.0B params, 6.0GB FP16
2025-05-30 12:44:10,625 - simple_memory_calculator - INFO - Generic estimation completed for black-forest-labs/FLUX.1-schnella
2025-05-30 12:44:19,136 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:44:19,137 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:44:19,137 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 12:44:19,137 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:44:19,137 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:44:19,137 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 12:44:19,137 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:44:19,137 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:45:12,787 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 12:45:12,787 - __main__ - DEBUG - API key found, length: 39
2025-05-30 12:45:12,788 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 12:45:12,788 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 12:45:12,788 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 12:45:12,788 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 12:45:12,788 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 12:45:12,788 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 12:45:12,788 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 12:45:12,788 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 12:45:12,791 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 12:45:12,791 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 12:45:13,280 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 12:45:13,280 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 12:45:13,280 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 12:45:13,280 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 12:45:13,280 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 12:45:13,280 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 12:45:13,280 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 12:45:13,280 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 12:45:13,280 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 12:45:13,280 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 12:45:13,280 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 12:45:13,283 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 12:45:13,296 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 12:45:13,296 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 12:45:13,383 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 12:45:13,414 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 12:45:13,415 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:45:13,415 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:45:13,415 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:45:13,415 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:45:13,416 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:45:13,416 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:45:13,416 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:45:13 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 12:45:13,416 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 12:45:13,416 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:45:13,416 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:45:13,416 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:45:13,416 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:45:13,416 - httpcore.connection - DEBUG - close.started
2025-05-30 12:45:13,416 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:45:13,416 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 12:45:13,417 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:45:13,417 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:45:13,417 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:45:13,417 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:45:13,417 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:45:13,417 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:45:13,424 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:45:13 GMT'), (b'server', b'uvicorn'), (b'content-length', b'104209'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 12:45:13,424 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 12:45:13,424 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:45:13,424 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:45:13,424 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:45:13,424 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:45:13,424 - httpcore.connection - DEBUG - close.started
2025-05-30 12:45:13,424 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:45:13,435 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 12:45:13,441 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:45:13,441 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 12:45:13,560 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 12:45:13,587 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:45:13,588 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 12:45:13,722 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 12:45:13,723 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:45:13,723 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:45:13,723 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:45:13,723 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:45:13,723 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:45:13,864 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:45:13 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 12:45:13,864 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 12:45:13,864 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:45:13,865 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:45:13,865 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:45:13,865 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:45:13,865 - httpcore.connection - DEBUG - close.started
2025-05-30 12:45:13,866 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:45:13,902 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 12:45:13,903 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:45:13,903 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:45:13,903 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:45:13,903 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:45:13,903 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:45:14,058 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:45:13 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 12:45:14,059 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 12:45:14,059 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:45:14,059 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:45:14,059 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:45:14,059 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:45:14,060 - httpcore.connection - DEBUG - close.started
2025-05-30 12:45:14,060 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:45:14,579 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:45:14,579 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:45:14,579 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 12:45:14,579 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 12:45:14,579 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:45:14,579 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:45:14,579 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 12:45:14,580 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:45:14,580 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:45:14,631 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 12:45:14,854 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 12:45:16,354 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnella
2025-05-30 12:45:16,357 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 12:45:16,648 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "GET /api/models/black-forest-labs/FLUX.1-schnella HTTP/1.1" 404 32
2025-05-30 12:45:16,649 - simple_memory_calculator - ERROR - API Error for model black-forest-labs/FLUX.1-schnella: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-683929cc-5b5a696d7511781e28797aad;a02be0e1-ef06-4150-9816-8364453f9621) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnella. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
2025-05-30 12:45:16,649 - simple_memory_calculator - WARNING - Using generic estimation for black-forest-labs/FLUX.1-schnella due to: HuggingFace API Error: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-683929cc-5b5a696d7511781e28797aad;a02be0e1-ef06-4150-9816-8364453f9621) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnella. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
2025-05-30 12:45:16,649 - simple_memory_calculator - DEBUG - Generic estimation parameters: 3.0B params, 6.0GB FP16
2025-05-30 12:45:16,649 - simple_memory_calculator - INFO - Generic estimation completed for black-forest-labs/FLUX.1-schnella
2025-05-30 12:45:16,649 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnella with 8.0GB VRAM
2025-05-30 12:45:16,649 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnella
2025-05-30 12:45:16,879 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "GET /api/models/black-forest-labs/FLUX.1-schnella HTTP/1.1" 404 32
2025-05-30 12:45:16,879 - simple_memory_calculator - ERROR - API Error for model black-forest-labs/FLUX.1-schnella: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-683929cc-39faec6c0bfe23a579a0c519;127bc48f-4df2-4923-9354-3b367c7a5405) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnella. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
2025-05-30 12:45:16,879 - simple_memory_calculator - WARNING - Using generic estimation for black-forest-labs/FLUX.1-schnella due to: HuggingFace API Error: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-683929cc-39faec6c0bfe23a579a0c519;127bc48f-4df2-4923-9354-3b367c7a5405) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnella. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
2025-05-30 12:45:16,879 - simple_memory_calculator - DEBUG - Generic estimation parameters: 3.0B params, 6.0GB FP16
2025-05-30 12:45:16,879 - simple_memory_calculator - INFO - Generic estimation completed for black-forest-labs/FLUX.1-schnella
2025-05-30 12:45:16,879 - simple_memory_calculator - DEBUG - Model memory: 6.0GB, Inference memory: 9.0GB
2025-05-30 12:45:16,879 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnella
2025-05-30 12:45:17,192 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "GET /api/models/black-forest-labs/FLUX.1-schnella HTTP/1.1" 404 32
2025-05-30 12:45:17,193 - simple_memory_calculator - ERROR - API Error for model black-forest-labs/FLUX.1-schnella: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-683929cd-4f792d2b77d9c5ad147576c4;29848fab-cd4a-49c9-8b4d-0861236e0ca4) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnella. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
2025-05-30 12:45:17,193 - simple_memory_calculator - WARNING - Using generic estimation for black-forest-labs/FLUX.1-schnella due to: HuggingFace API Error: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-683929cd-4f792d2b77d9c5ad147576c4;29848fab-cd4a-49c9-8b4d-0861236e0ca4) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnella. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
2025-05-30 12:45:17,193 - simple_memory_calculator - DEBUG - Generic estimation parameters: 3.0B params, 6.0GB FP16
2025-05-30 12:45:17,193 - simple_memory_calculator - INFO - Generic estimation completed for black-forest-labs/FLUX.1-schnella
2025-05-30 12:45:18,284 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:45:18,284 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:45:18,284 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 12:45:18,284 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:45:18,285 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:45:18,285 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 12:45:18,285 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:45:18,285 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:45:19,093 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:45:19,094 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:45:19,094 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 12:45:19,095 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:45:19,095 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:45:19,095 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 12:45:19,095 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:45:19,095 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:45:22,953 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:45:22,953 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:45:22,953 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 12:45:22,953 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:45:22,953 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:45:22,954 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 12:45:22,954 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:45:22,954 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:47:21,624 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 12:47:21,624 - __main__ - DEBUG - API key found, length: 39
2025-05-30 12:47:21,624 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 12:47:21,625 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 12:47:21,625 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 12:47:21,625 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 12:47:21,625 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 12:47:21,625 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 12:47:21,625 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 12:47:21,625 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 12:47:21,629 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 12:47:21,629 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 12:47:22,095 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 12:47:22,095 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 12:47:22,095 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 12:47:22,095 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 12:47:22,095 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 12:47:22,095 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 12:47:22,095 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 12:47:22,095 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 12:47:22,095 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 12:47:22,095 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 12:47:22,095 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 12:47:22,097 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 12:47:22,109 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 12:47:22,118 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 12:47:22,200 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 12:47:22,234 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 12:47:22,234 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:47:22,234 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:47:22,235 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:47:22,235 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:47:22,235 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:47:22,235 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:47:22,235 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:47:22 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 12:47:22,235 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 12:47:22,235 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:47:22,235 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:47:22,235 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:47:22,236 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:47:22,236 - httpcore.connection - DEBUG - close.started
2025-05-30 12:47:22,236 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:47:22,236 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 12:47:22,236 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:47:22,236 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:47:22,236 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:47:22,236 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:47:22,237 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:47:22,237 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:47:22,243 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:47:22 GMT'), (b'server', b'uvicorn'), (b'content-length', b'104591'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 12:47:22,243 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 12:47:22,243 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:47:22,243 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:47:22,243 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:47:22,243 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:47:22,243 - httpcore.connection - DEBUG - close.started
2025-05-30 12:47:22,243 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:47:22,255 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 12:47:22,286 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:47:22,286 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 12:47:22,386 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 12:47:22,389 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:47:22,389 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 12:47:22,576 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 12:47:22,576 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:47:22,577 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:47:22,577 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:47:22,577 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:47:22,577 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:47:22,710 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 12:47:22,710 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:47:22,710 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:47:22,710 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:47:22,711 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:47:22,711 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:47:22,726 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:47:22 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 12:47:22,726 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 12:47:22,726 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:47:22,726 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:47:22,726 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:47:22,726 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:47:22,726 - httpcore.connection - DEBUG - close.started
2025-05-30 12:47:22,727 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:47:22,849 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:47:22 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 12:47:22,850 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 12:47:22,850 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:47:22,850 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:47:22,850 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:47:22,850 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:47:22,850 - httpcore.connection - DEBUG - close.started
2025-05-30 12:47:22,850 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:47:23,200 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:47:23,200 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:47:23,201 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 12:47:23,201 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 12:47:23,201 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:47:23,201 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:47:23,201 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 12:47:23,201 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 12:47:23,201 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 12:47:23,480 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 12:47:23,699 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 12:49:07,811 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 12:49:07,811 - __main__ - DEBUG - API key found, length: 39
2025-05-30 12:49:07,811 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 12:49:07,811 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 12:49:07,811 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 12:49:07,811 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 12:49:07,811 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 12:49:07,811 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 12:49:07,811 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 12:49:07,811 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 12:49:07,815 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 12:49:07,815 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 12:49:08,273 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 12:49:08,273 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 12:49:08,273 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 12:49:08,273 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 12:49:08,273 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 12:49:08,273 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 12:49:08,273 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 12:49:08,273 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 12:49:08,273 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 12:49:08,273 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 12:49:08,273 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 12:49:08,276 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 12:49:08,290 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 12:49:08,291 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 12:49:08,382 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 12:49:08,413 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 12:49:08,414 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:49:08,414 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:49:08,414 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:49:08,414 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:49:08,415 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:49:08,415 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:49:08,415 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:49:08 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 12:49:08,415 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 12:49:08,415 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:49:08,415 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:49:08,415 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:49:08,415 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:49:08,415 - httpcore.connection - DEBUG - close.started
2025-05-30 12:49:08,415 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:49:08,416 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 12:49:08,416 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:49:08,416 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:49:08,416 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:49:08,416 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:49:08,416 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:49:08,417 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:49:08,423 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:49:08 GMT'), (b'server', b'uvicorn'), (b'content-length', b'104593'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 12:49:08,423 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 12:49:08,423 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 12:49:08,423 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 12:49:08,423 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 12:49:08,423 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 12:49:08,423 - httpcore.connection - DEBUG - close.started
2025-05-30 12:49:08,423 - httpcore.connection - DEBUG - close.complete
2025-05-30 12:49:08,434 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 12:49:08,467 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:49:08,468 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 12:49:08,579 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 12:49:08,580 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 12:49:08,581 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 12:49:08,766 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 12:49:08,766 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 12:49:08,767 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 12:49:08,767 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 12:49:08,767 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 12:49:08,767 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 12:49:08,866 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 12:49:08,866 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:49:08,866 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:49:08,867 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:49:08,867 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:49:08,867 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:49:08,917 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:49:08 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 12:49:08,918 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 12:49:08,918 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:49:08,918 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:49:08,918 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:49:08,919 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:49:08,919 - httpcore.connection - DEBUG - close.started 2025-05-30 12:49:08,919 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:49:09,011 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:49:08 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 12:49:09,011 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 12:49:09,012 - httpcore.http11 - DEBUG - receive_response_body.started request= 
2025-05-30 12:49:09,012 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:49:09,012 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:49:09,012 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:49:09,012 - httpcore.connection - DEBUG - close.started 2025-05-30 12:49:09,013 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:49:09,596 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:49:09,855 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 12:50:34,222 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 12:50:34,222 - __main__ - DEBUG - API key found, length: 39 2025-05-30 12:50:34,222 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 12:50:34,222 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 12:50:34,222 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 12:50:34,222 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 12:50:34,222 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 12:50:34,222 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 12:50:34,222 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 12:50:34,222 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 12:50:34,226 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 12:50:34,226 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 12:50:34,690 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 12:50:34,690 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 12:50:34,690 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 12:50:34,690 - hardware_detector - DEBUG - 
Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 12:50:34,690 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 12:50:34,690 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 12:50:34,690 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 12:50:34,690 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 12:50:34,690 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 12:50:34,690 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 12:50:34,690 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 12:50:34,692 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:50:34,713 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:50:34,717 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 12:50:34,797 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:50:34,829 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 12:50:34,830 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:50:34,830 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:50:34,830 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:50:34,830 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:50:34,830 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:50:34,831 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:50:34,831 - 
httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:50:34 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 12:50:34,831 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 12:50:34,831 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:50:34,831 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:50:34,831 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:50:34,831 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:50:34,831 - httpcore.connection - DEBUG - close.started 2025-05-30 12:50:34,831 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:50:34,832 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 12:50:34,832 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:50:34,832 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:50:34,832 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:50:34,832 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:50:34,832 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:50:34,832 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:50:34,839 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:50:34 GMT'), (b'server', b'uvicorn'), (b'content-length', b'104587'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 12:50:34,839 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 12:50:34,839 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:50:34,839 
- httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:50:34,839 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:50:34,839 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:50:34,839 - httpcore.connection - DEBUG - close.started 2025-05-30 12:50:34,839 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:50:34,851 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 12:50:34,874 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:50:34,874 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 12:50:34,986 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 12:50:34,988 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:50:34,988 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 12:50:35,150 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:50:35,150 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:50:35,150 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:50:35,150 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:50:35,150 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:50:35,150 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:50:35,261 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:50:35,261 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:50:35,262 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:50:35,262 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:50:35,262 - 
httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:50:35,262 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:50:35,290 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:50:35 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 12:50:35,290 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 12:50:35,291 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:50:35,291 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:50:35,291 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:50:35,291 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:50:35,291 - httpcore.connection - DEBUG - close.started 2025-05-30 12:50:35,291 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:50:35,404 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:50:35 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 12:50:35,404 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 12:50:35,404 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:50:35,405 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:50:35,405 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:50:35,405 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:50:35,405 - httpcore.connection - DEBUG - 
close.started 2025-05-30 12:50:35,406 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:50:35,694 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:50:35,694 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:50:35,694 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 12:50:35,694 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 12:50:35,694 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:50:35,694 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:50:35,694 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 12:50:35,695 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:50:35,695 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:50:36,018 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:50:36,234 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 12:50:46,315 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnelld 2025-05-30 12:50:46,317 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:50:46,561 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "GET /api/models/black-forest-labs/FLUX.1-schnelld HTTP/1.1" 404 32 2025-05-30 12:50:46,562 - simple_memory_calculator - ERROR - API Error for model 
black-forest-labs/FLUX.1-schnelld: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-68392b16-53da4d9b310af7f46bd1b444;26e1c9e2-f3ed-4f5d-b0e4-859cd9e62f4f) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnelld. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication 2025-05-30 12:50:46,562 - simple_memory_calculator - WARNING - Using generic estimation for black-forest-labs/FLUX.1-schnelld due to: HuggingFace API Error: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-68392b16-53da4d9b310af7f46bd1b444;26e1c9e2-f3ed-4f5d-b0e4-859cd9e62f4f) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnelld. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication 2025-05-30 12:50:46,562 - simple_memory_calculator - DEBUG - Generic estimation parameters: 3.0B params, 6.0GB FP16 2025-05-30 12:50:46,562 - simple_memory_calculator - INFO - Generic estimation completed for black-forest-labs/FLUX.1-schnelld 2025-05-30 12:50:46,562 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnelld with 8.0GB VRAM 2025-05-30 12:50:46,562 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnelld 2025-05-30 12:50:46,778 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "GET /api/models/black-forest-labs/FLUX.1-schnelld HTTP/1.1" 404 32 2025-05-30 12:50:46,778 - simple_memory_calculator - ERROR - API Error for model black-forest-labs/FLUX.1-schnelld: RepositoryNotFoundError: 404 Client Error. 
(Request ID: Root=1-68392b16-5274c80028c2b5c65cd331cb;c592b746-0bf5-46b7-8d59-f3f665327491) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnelld. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication 2025-05-30 12:50:46,778 - simple_memory_calculator - WARNING - Using generic estimation for black-forest-labs/FLUX.1-schnelld due to: HuggingFace API Error: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-68392b16-5274c80028c2b5c65cd331cb;c592b746-0bf5-46b7-8d59-f3f665327491) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnelld. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication 2025-05-30 12:50:46,778 - simple_memory_calculator - DEBUG - Generic estimation parameters: 3.0B params, 6.0GB FP16 2025-05-30 12:50:46,778 - simple_memory_calculator - INFO - Generic estimation completed for black-forest-labs/FLUX.1-schnelld 2025-05-30 12:50:46,778 - simple_memory_calculator - DEBUG - Model memory: 6.0GB, Inference memory: 9.0GB 2025-05-30 12:50:46,778 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnelld 2025-05-30 12:50:47,001 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "GET /api/models/black-forest-labs/FLUX.1-schnelld HTTP/1.1" 404 32 2025-05-30 12:50:47,002 - simple_memory_calculator - ERROR - API Error for model black-forest-labs/FLUX.1-schnelld: RepositoryNotFoundError: 404 Client Error. 
(Request ID: Root=1-68392b16-6b527dda50c8f0b95804c8c2;72737c7a-e45e-4f6d-a393-687a0c0af8a0) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnelld. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication 2025-05-30 12:50:47,002 - simple_memory_calculator - WARNING - Using generic estimation for black-forest-labs/FLUX.1-schnelld due to: HuggingFace API Error: RepositoryNotFoundError: 404 Client Error. (Request ID: Root=1-68392b16-6b527dda50c8f0b95804c8c2;72737c7a-e45e-4f6d-a393-687a0c0af8a0) Repository Not Found for url: https://huggingface.co/api/models/black-forest-labs/FLUX.1-schnelld. Please make sure you specified the correct `repo_id` and `repo_type`. If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication 2025-05-30 12:50:47,002 - simple_memory_calculator - DEBUG - Generic estimation parameters: 3.0B params, 6.0GB FP16 2025-05-30 12:50:47,002 - simple_memory_calculator - INFO - Generic estimation completed for black-forest-labs/FLUX.1-schnelld 2025-05-30 12:50:48,348 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:50:48,348 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:50:48,349 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 12:50:48,350 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:50:48,350 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:50:48,350 - 
simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 12:50:48,351 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:50:48,351 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:54:53,641 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 12:54:53,641 - __main__ - DEBUG - API key found, length: 39 2025-05-30 12:54:53,641 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 12:54:53,641 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 12:54:53,641 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 12:54:53,641 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 12:54:53,641 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 12:54:53,641 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 12:54:53,641 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 12:54:53,641 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 12:54:53,645 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 12:54:53,645 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 12:54:54,110 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 12:54:54,110 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 12:54:54,110 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 12:54:54,110 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 12:54:54,110 - auto_diffusers - INFO - Hardware detector initialized successfully 
2025-05-30 12:54:54,110 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 12:54:54,110 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 12:54:54,110 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 12:54:54,110 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 12:54:54,110 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 12:54:54,110 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 12:54:54,112 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:54:54,125 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 12:54:54,133 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:54:54,212 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:54:54,245 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 12:54:54,246 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:54:54,246 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:54:54,246 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:54:54,246 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:54:54,247 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:54:54,247 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:54:54,247 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:54:54 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 12:54:54,247 - httpx - INFO - HTTP Request: GET 
http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 12:54:54,247 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:54:54,247 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:54:54,247 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:54:54,247 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:54:54,247 - httpcore.connection - DEBUG - close.started 2025-05-30 12:54:54,247 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:54:54,248 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 12:54:54,248 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:54:54,248 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:54:54,248 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:54:54,248 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:54:54,249 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:54:54,249 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:54:54,255 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:54:54 GMT'), (b'server', b'uvicorn'), (b'content-length', b'104594'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 12:54:54,255 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 12:54:54,255 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:54:54,255 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:54:54,255 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:54:54,255 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:54:54,256 - httpcore.connection - DEBUG - close.started 2025-05-30 
12:54:54,256 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:54:54,267 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 12:54:54,288 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:54:54,288 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 12:54:54,404 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:54:54,404 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 12:54:54,408 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 12:54:54,561 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:54:54,561 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:54:54,561 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:54:54,561 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:54:54,561 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:54:54,561 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:54:54,682 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:54:54,682 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:54:54,682 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:54:54,682 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:54:54,682 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:54:54,683 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:54:54,699 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 
03:54:54 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 12:54:54,699 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 12:54:54,699 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:54:54,699 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:54:54,700 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:54:54,700 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:54:54,700 - httpcore.connection - DEBUG - close.started 2025-05-30 12:54:54,700 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:54:54,822 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:54:54 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 12:54:54,822 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 12:54:54,823 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:54:54,823 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:54:54,823 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:54:54,823 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:54:54,824 - httpcore.connection - DEBUG - close.started 2025-05-30 12:54:54,824 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:54:55,469 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:54:55,694 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD 
/api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 12:54:57,037 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:54:57,038 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:54:57,038 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 12:54:57,038 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 12:54:57,038 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:54:57,038 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:54:57,038 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 12:54:57,039 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:54:57,039 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:54:59,172 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:54:59,172 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:54:59,172 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 12:54:59,172 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:54:59,172 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:54:59,173 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 12:54:59,173 - 
simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:54:59,173 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:54:59,173 - auto_diffusers - INFO - Starting code generation for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:54:59,173 - auto_diffusers - DEBUG - Parameters: prompt='A cat holding a sign that says hello world...', size=(768, 1360), steps=4 2025-05-30 12:54:59,173 - auto_diffusers - DEBUG - Manual specs: True, Memory analysis provided: True 2025-05-30 12:54:59,173 - auto_diffusers - INFO - Using manual hardware specifications 2025-05-30 12:54:59,173 - auto_diffusers - DEBUG - Manual specs: {'platform': 'Linux', 'architecture': 'manual_input', 'cpu_count': 8, 'python_version': '3.11', 'cuda_available': False, 'mps_available': False, 'torch_version': '2.0+', 'manual_input': True, 'ram_gb': 16, 'user_dtype': None, 'gpu_info': [{'name': 'Custom GPU', 'memory_mb': 8192}]} 2025-05-30 12:54:59,173 - auto_diffusers - DEBUG - GPU detected with 8.0 GB VRAM 2025-05-30 12:54:59,173 - auto_diffusers - INFO - Selected optimization profile: balanced 2025-05-30 12:54:59,173 - auto_diffusers - DEBUG - Creating generation prompt for Gemini API 2025-05-30 12:54:59,173 - auto_diffusers - DEBUG - Prompt length: 7598 characters 2025-05-30 12:54:59,173 - auto_diffusers - INFO - ================================================================================ 2025-05-30 12:54:59,173 - auto_diffusers - INFO - PROMPT SENT TO GEMINI API: 2025-05-30 12:54:59,173 - auto_diffusers - INFO - ================================================================================ 2025-05-30 12:54:59,174 - auto_diffusers - INFO - You are an expert in optimizing diffusers library code for different hardware configurations. NOTE: This system includes curated optimization knowledge from HuggingFace documentation. 
TASK: Generate optimized Python code for running a diffusion model with the following specifications: - Model: black-forest-labs/FLUX.1-schnell - Prompt: "A cat holding a sign that says hello world" - Image size: 768x1360 - Inference steps: 4 HARDWARE SPECIFICATIONS: - Platform: Linux (manual_input) - CPU Cores: 8 - CUDA Available: False - MPS Available: False - Optimization Profile: balanced - GPU: Custom GPU (8.0 GB VRAM) MEMORY ANALYSIS: - Model Memory Requirements: 36.0 GB (FP16 inference) - Model Weights Size: 24.0 GB (FP16) - Memory Recommendation: 🔄 Requires sequential CPU offloading - Recommended Precision: float16 - Attention Slicing Recommended: True - VAE Slicing Recommended: True OPTIMIZATION KNOWLEDGE BASE: # DIFFUSERS OPTIMIZATION TECHNIQUES ## Memory Optimization Techniques ### 1. Model CPU Offloading Use `enable_model_cpu_offload()` to move models between GPU and CPU automatically: ```python pipe.enable_model_cpu_offload() ``` - Saves significant VRAM by keeping only active models on GPU - Automatic management, no manual intervention needed - Compatible with all pipelines ### 2. Sequential CPU Offloading Use `enable_sequential_cpu_offload()` for more aggressive memory saving: ```python pipe.enable_sequential_cpu_offload() ``` - More memory efficient than model offloading - Moves models to CPU after each forward pass - Best for very limited VRAM scenarios ### 3. Attention Slicing Use `enable_attention_slicing()` to reduce memory during attention computation: ```python pipe.enable_attention_slicing() # or specify slice size pipe.enable_attention_slicing("max") # maximum slicing pipe.enable_attention_slicing(1) # slice_size = 1 ``` - Trades compute time for memory - Most effective for high-resolution images - Can be combined with other techniques ### 4. 
VAE Slicing Use `enable_vae_slicing()` for large batch processing: ```python pipe.enable_vae_slicing() ``` - Decodes images one at a time instead of all at once - Essential for batch sizes > 4 - Minimal performance impact on single images ### 5. VAE Tiling Use `enable_vae_tiling()` for high-resolution image generation: ```python pipe.enable_vae_tiling() ``` - Enables 4K+ image generation on 8GB VRAM - Splits images into overlapping tiles - Automatically disabled for 512x512 or smaller images ### 6. Memory Efficient Attention (xFormers) Use `enable_xformers_memory_efficient_attention()` if xFormers is installed: ```python pipe.enable_xformers_memory_efficient_attention() ``` - Significantly reduces memory usage and improves speed - Requires xformers library installation - Compatible with most models ## Performance Optimization Techniques ### 1. Half Precision (FP16/BF16) Use lower precision for better memory and speed: ```python # FP16 (widely supported) pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16) # BF16 (better numerical stability, newer hardware) pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16) ``` - FP16: Halves memory usage, widely supported - BF16: Better numerical stability, requires newer GPUs - Essential for most optimization scenarios ### 2. Torch Compile (PyTorch 2.0+) Use `torch.compile()` for significant speed improvements: ```python pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True) # For some models, compile VAE too: pipe.vae.decode = torch.compile(pipe.vae.decode, mode="reduce-overhead", fullgraph=True) ``` - 5-50% speed improvement - Requires PyTorch 2.0+ - First run is slower due to compilation ### 3. 
Fast Schedulers Use faster schedulers for fewer steps: ```python from diffusers import LMSDiscreteScheduler, UniPCMultistepScheduler # LMS Scheduler (good quality, fast) pipe.scheduler = LMSDiscreteScheduler.from_config(pipe.scheduler.config) # UniPC Scheduler (fastest) pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config) ``` ## Hardware-Specific Optimizations ### NVIDIA GPU Optimizations ```python # Enable Tensor Cores torch.backends.cudnn.benchmark = True # Optimal data type for NVIDIA torch_dtype = torch.float16 # or torch.bfloat16 for RTX 30/40 series ``` ### Apple Silicon (MPS) Optimizations ```python # Use MPS device device = "mps" if torch.backends.mps.is_available() else "cpu" pipe = pipe.to(device) # Recommended dtype for Apple Silicon torch_dtype = torch.bfloat16 # Better than float16 on Apple Silicon # Attention slicing often helps on MPS pipe.enable_attention_slicing() ``` ### CPU Optimizations ```python # Use float32 for CPU torch_dtype = torch.float32 # Enable optimized attention pipe.enable_attention_slicing() ``` ## Model-Specific Guidelines ### FLUX Models - Do NOT use guidance_scale parameter (not needed for FLUX) - Use 4-8 inference steps maximum - BF16 dtype recommended - Enable attention slicing for memory optimization ### Stable Diffusion XL - Enable attention slicing for high resolutions - Use refiner model sparingly to save memory - Consider VAE tiling for >1024px images ### Stable Diffusion 1.5/2.1 - Very memory efficient base models - Can often run without optimizations on 8GB+ VRAM - Enable VAE slicing for batch processing ## Memory Usage Estimation - FLUX.1: ~24GB for full precision, ~12GB for FP16 - SDXL: ~7GB for FP16, ~14GB for FP32 - SD 1.5: ~2GB for FP16, ~4GB for FP32 ## Optimization Combinations by VRAM ### 24GB+ VRAM (High-end) ```python pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16) pipe = pipe.to("cuda") pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", 
fullgraph=True) ``` ### 12-24GB VRAM (Mid-range) ```python pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16) pipe = pipe.to("cuda") pipe.enable_model_cpu_offload() pipe.enable_xformers_memory_efficient_attention() ``` ### 8-12GB VRAM (Entry-level) ```python pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16) pipe.enable_sequential_cpu_offload() pipe.enable_attention_slicing() pipe.enable_vae_slicing() pipe.enable_xformers_memory_efficient_attention() ``` ### <8GB VRAM (Low-end) ```python pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16) pipe.enable_sequential_cpu_offload() pipe.enable_attention_slicing("max") pipe.enable_vae_slicing() pipe.enable_vae_tiling() ``` IMPORTANT: For FLUX.1-schnell models, do NOT include guidance_scale parameter as it's not needed. Using the OPTIMIZATION KNOWLEDGE BASE above, generate Python code that: 1. **Selects the best optimization techniques** for the specific hardware profile 2. **Applies appropriate memory optimizations** based on available VRAM 3. **Uses optimal data types** for the target hardware: - User specified dtype (if provided): Use exactly as specified - Apple Silicon (MPS): prefer torch.bfloat16 - NVIDIA GPUs: prefer torch.float16 or torch.bfloat16 - CPU only: use torch.float32 4. **Implements hardware-specific optimizations** (CUDA, MPS, CPU) 5. 
**Follows model-specific guidelines** (e.g., FLUX guidance_scale handling) IMPORTANT GUIDELINES: - Reference the OPTIMIZATION KNOWLEDGE BASE to select appropriate techniques - Include all necessary imports - Add brief comments explaining optimization choices - Generate compact, production-ready code - Inline values where possible for concise code - Generate ONLY the Python code, no explanations before or after the code block 2025-05-30 12:54:59,174 - auto_diffusers - INFO - ================================================================================ 2025-05-30 12:54:59,174 - auto_diffusers - INFO - Sending request to Gemini API 2025-05-30 12:55:22,590 - auto_diffusers - INFO - Successfully received response from Gemini API (no tools used) 2025-05-30 12:55:22,590 - auto_diffusers - DEBUG - Response length: 2545 characters 2025-05-30 12:56:22,011 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:56:22,011 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:56:22,011 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 12:56:22,012 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:56:22,012 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:56:22,012 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 12:56:22,012 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:56:22,012 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:56:22,012 - auto_diffusers - INFO - Starting code generation for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:56:22,012 
- auto_diffusers - DEBUG - Parameters: prompt='A cat holding a sign that says hello world...', size=(768, 1360), steps=4 2025-05-30 12:56:22,012 - auto_diffusers - DEBUG - Manual specs: True, Memory analysis provided: True 2025-05-30 12:56:22,012 - auto_diffusers - INFO - Using manual hardware specifications 2025-05-30 12:56:22,012 - auto_diffusers - DEBUG - Manual specs: {'platform': 'Darwin', 'architecture': 'manual_input', 'cpu_count': 8, 'python_version': '3.11', 'cuda_available': False, 'mps_available': False, 'torch_version': '2.0+', 'manual_input': True, 'ram_gb': 16, 'user_dtype': None, 'gpu_info': [{'name': 'Custom GPU', 'memory_mb': 8192}]} 2025-05-30 12:56:22,012 - auto_diffusers - DEBUG - GPU detected with 8.0 GB VRAM 2025-05-30 12:56:22,012 - auto_diffusers - INFO - Selected optimization profile: balanced 2025-05-30 12:56:22,012 - auto_diffusers - DEBUG - Creating generation prompt for Gemini API 2025-05-30 12:56:22,012 - auto_diffusers - DEBUG - Prompt length: 7599 characters 2025-05-30 12:56:22,013 - auto_diffusers - INFO - ================================================================================ 2025-05-30 12:56:22,013 - auto_diffusers - INFO - PROMPT SENT TO GEMINI API: 2025-05-30 12:56:22,013 - auto_diffusers - INFO - ================================================================================ 2025-05-30 12:56:22,013 - auto_diffusers - INFO - You are an expert in optimizing diffusers library code for different hardware configurations. NOTE: This system includes curated optimization knowledge from HuggingFace documentation. 
TASK: Generate optimized Python code for running a diffusion model with the following specifications: - Model: black-forest-labs/FLUX.1-schnell - Prompt: "A cat holding a sign that says hello world" - Image size: 768x1360 - Inference steps: 4 HARDWARE SPECIFICATIONS: - Platform: Darwin (manual_input) - CPU Cores: 8 - CUDA Available: False - MPS Available: False - Optimization Profile: balanced - GPU: Custom GPU (8.0 GB VRAM) MEMORY ANALYSIS: - Model Memory Requirements: 36.0 GB (FP16 inference) - Model Weights Size: 24.0 GB (FP16) - Memory Recommendation: 🔄 Requires sequential CPU offloading - Recommended Precision: float16 - Attention Slicing Recommended: True - VAE Slicing Recommended: True OPTIMIZATION KNOWLEDGE BASE: [... remainder of prompt identical to the prompt logged at 12:54:59 above ...] 2025-05-30 12:56:22,013 - auto_diffusers - INFO - ================================================================================ 2025-05-30 12:56:22,013 - auto_diffusers - INFO - Sending request to Gemini API 2025-05-30 12:56:53,133 - auto_diffusers - INFO - Successfully received response from Gemini API (no tools used) 2025-05-30 12:56:53,133 - auto_diffusers - DEBUG - Response length: 3054 characters 2025-05-30 12:59:28,018 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 12:59:28,018 - __main__ - DEBUG - API key found, length: 39 2025-05-30 12:59:28,018 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 12:59:28,018 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 12:59:28,018 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 12:59:28,018 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 12:59:28,018 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 12:59:28,018 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 12:59:28,018 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 12:59:28,018 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 12:59:28,022 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 12:59:28,022 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 12:59:28,486 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 12:59:28,487 - 
hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 12:59:28,487 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 12:59:28,487 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 12:59:28,487 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 12:59:28,487 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 12:59:28,487 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 12:59:28,487 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 12:59:28,487 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 12:59:28,487 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 12:59:28,487 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 12:59:28,489 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:59:28,502 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 12:59:28,510 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:59:28,608 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 12:59:28,639 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 12:59:28,640 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:59:28,640 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:59:28,640 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:59:28,640 - httpcore.http11 - DEBUG - send_request_body.started 
request= 2025-05-30 12:59:28,640 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:59:28,641 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:59:28,641 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:59:28 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 12:59:28,641 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 12:59:28,641 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:59:28,641 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:59:28,641 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:59:28,641 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:59:28,641 - httpcore.connection - DEBUG - close.started 2025-05-30 12:59:28,641 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:59:28,642 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 12:59:28,642 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:59:28,642 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:59:28,642 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:59:28,642 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:59:28,642 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:59:28,642 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:59:28,649 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 03:59:28 GMT'), (b'server', b'uvicorn'), (b'content-length', b'105022'), (b'content-type', b'text/html; charset=utf-8')]) 
2025-05-30 12:59:28,649 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 12:59:28,649 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:59:28,649 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:59:28,649 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:59:28,649 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:59:28,649 - httpcore.connection - DEBUG - close.started 2025-05-30 12:59:28,649 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:59:28,660 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 12:59:28,674 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:59:28,674 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 12:59:28,799 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 12:59:28,799 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 12:59:28,858 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 12:59:28,960 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:59:28,960 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 12:59:28,960 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:59:28,960 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:59:28,960 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:59:28,960 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:59:29,071 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 12:59:29,071 - httpcore.http11 - DEBUG - send_request_headers.started 
request= 2025-05-30 12:59:29,072 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 12:59:29,072 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 12:59:29,072 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 12:59:29,072 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 12:59:29,108 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:59:29 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 12:59:29,108 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 12:59:29,108 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:59:29,108 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 12:59:29,108 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:59:29,109 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:59:29,109 - httpcore.connection - DEBUG - close.started 2025-05-30 12:59:29,109 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:59:29,211 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 03:59:29 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 12:59:29,212 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 12:59:29,212 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 12:59:29,212 - httpcore.http11 - DEBUG - receive_response_body.complete 
2025-05-30 12:59:29,213 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 12:59:29,213 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 12:59:29,213 - httpcore.connection - DEBUG - close.started 2025-05-30 12:59:29,213 - httpcore.connection - DEBUG - close.complete 2025-05-30 12:59:29,858 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 12:59:30,074 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 12:59:30,112 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:59:30,112 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:59:30,112 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 12:59:30,112 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 12:59:30,112 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:59:30,112 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:59:30,112 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 12:59:30,112 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:59:30,112 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:59:31,998 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:59:31,998 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:59:31,998 - 
simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 12:59:31,999 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:59:31,999 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:59:31,999 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 12:59:31,999 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:59:31,999 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 12:59:31,999 - auto_diffusers - INFO - Starting code generation for model: black-forest-labs/FLUX.1-schnell 2025-05-30 12:59:31,999 - auto_diffusers - DEBUG - Parameters: prompt='A cat holding a sign that says hello world...', size=(768, 1360), steps=4 2025-05-30 12:59:31,999 - auto_diffusers - DEBUG - Manual specs: True, Memory analysis provided: True 2025-05-30 12:59:31,999 - auto_diffusers - INFO - Using manual hardware specifications 2025-05-30 12:59:31,999 - auto_diffusers - DEBUG - Manual specs: {'platform': 'Linux', 'architecture': 'manual_input', 'cpu_count': 8, 'python_version': '3.11', 'cuda_available': False, 'mps_available': False, 'torch_version': '2.0+', 'manual_input': True, 'ram_gb': 16, 'user_dtype': None, 'gpu_info': [{'name': 'Custom GPU', 'memory_mb': 8192}]} 2025-05-30 12:59:31,999 - auto_diffusers - DEBUG - GPU detected with 8.0 GB VRAM 2025-05-30 12:59:31,999 - auto_diffusers - INFO - Selected optimization profile: balanced 2025-05-30 12:59:31,999 - auto_diffusers - DEBUG - Creating generation prompt for Gemini API 2025-05-30 12:59:31,999 - auto_diffusers - DEBUG - Prompt length: 7598 characters 2025-05-30 12:59:31,999 - auto_diffusers - INFO - 
================================================================================ 2025-05-30 12:59:32,000 - auto_diffusers - INFO - PROMPT SENT TO GEMINI API: 2025-05-30 12:59:32,000 - auto_diffusers - INFO - ================================================================================ 2025-05-30 12:59:32,000 - auto_diffusers - INFO - You are an expert in optimizing diffusers library code for different hardware configurations. NOTE: This system includes curated optimization knowledge from HuggingFace documentation. TASK: Generate optimized Python code for running a diffusion model with the following specifications: - Model: black-forest-labs/FLUX.1-schnell - Prompt: "A cat holding a sign that says hello world" - Image size: 768x1360 - Inference steps: 4 HARDWARE SPECIFICATIONS: - Platform: Linux (manual_input) - CPU Cores: 8 - CUDA Available: False - MPS Available: False - Optimization Profile: balanced - GPU: Custom GPU (8.0 GB VRAM) MEMORY ANALYSIS: - Model Memory Requirements: 36.0 GB (FP16 inference) - Model Weights Size: 24.0 GB (FP16) - Memory Recommendation: 🔄 Requires sequential CPU offloading - Recommended Precision: float16 - Attention Slicing Recommended: True - VAE Slicing Recommended: True OPTIMIZATION KNOWLEDGE BASE: # DIFFUSERS OPTIMIZATION TECHNIQUES ## Memory Optimization Techniques ### 1. Model CPU Offloading Use `enable_model_cpu_offload()` to move models between GPU and CPU automatically: ```python pipe.enable_model_cpu_offload() ``` - Saves significant VRAM by keeping only active models on GPU - Automatic management, no manual intervention needed - Compatible with all pipelines ### 2. Sequential CPU Offloading Use `enable_sequential_cpu_offload()` for more aggressive memory saving: ```python pipe.enable_sequential_cpu_offload() ``` - More memory efficient than model offloading - Moves models to CPU after each forward pass - Best for very limited VRAM scenarios ### 3. 
Attention Slicing Use `enable_attention_slicing()` to reduce memory during attention computation: ```python pipe.enable_attention_slicing() # or specify slice size pipe.enable_attention_slicing("max") # maximum slicing pipe.enable_attention_slicing(1) # slice_size = 1 ``` - Trades compute time for memory - Most effective for high-resolution images - Can be combined with other techniques ### 4. VAE Slicing Use `enable_vae_slicing()` for large batch processing: ```python pipe.enable_vae_slicing() ``` - Decodes images one at a time instead of all at once - Essential for batch sizes > 4 - Minimal performance impact on single images ### 5. VAE Tiling Use `enable_vae_tiling()` for high-resolution image generation: ```python pipe.enable_vae_tiling() ``` - Enables 4K+ image generation on 8GB VRAM - Splits images into overlapping tiles - Automatically disabled for 512x512 or smaller images ### 6. Memory Efficient Attention (xFormers) Use `enable_xformers_memory_efficient_attention()` if xFormers is installed: ```python pipe.enable_xformers_memory_efficient_attention() ``` - Significantly reduces memory usage and improves speed - Requires xformers library installation - Compatible with most models ## Performance Optimization Techniques ### 1. Half Precision (FP16/BF16) Use lower precision for better memory and speed: ```python # FP16 (widely supported) pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16) # BF16 (better numerical stability, newer hardware) pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16) ``` - FP16: Halves memory usage, widely supported - BF16: Better numerical stability, requires newer GPUs - Essential for most optimization scenarios ### 2. 
Torch Compile (PyTorch 2.0+) Use `torch.compile()` for significant speed improvements: ```python pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True) # For some models, compile VAE too: pipe.vae.decode = torch.compile(pipe.vae.decode, mode="reduce-overhead", fullgraph=True) ``` - 5-50% speed improvement - Requires PyTorch 2.0+ - First run is slower due to compilation ### 3. Fast Schedulers Use faster schedulers for fewer steps: ```python from diffusers import LMSDiscreteScheduler, UniPCMultistepScheduler # LMS Scheduler (good quality, fast) pipe.scheduler = LMSDiscreteScheduler.from_config(pipe.scheduler.config) # UniPC Scheduler (fastest) pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config) ``` ## Hardware-Specific Optimizations ### NVIDIA GPU Optimizations ```python # Enable Tensor Cores torch.backends.cudnn.benchmark = True # Optimal data type for NVIDIA torch_dtype = torch.float16 # or torch.bfloat16 for RTX 30/40 series ``` ### Apple Silicon (MPS) Optimizations ```python # Use MPS device device = "mps" if torch.backends.mps.is_available() else "cpu" pipe = pipe.to(device) # Recommended dtype for Apple Silicon torch_dtype = torch.bfloat16 # Better than float16 on Apple Silicon # Attention slicing often helps on MPS pipe.enable_attention_slicing() ``` ### CPU Optimizations ```python # Use float32 for CPU torch_dtype = torch.float32 # Enable optimized attention pipe.enable_attention_slicing() ``` ## Model-Specific Guidelines ### FLUX Models - Do NOT use guidance_scale parameter (not needed for FLUX) - Use 4-8 inference steps maximum - BF16 dtype recommended - Enable attention slicing for memory optimization ### Stable Diffusion XL - Enable attention slicing for high resolutions - Use refiner model sparingly to save memory - Consider VAE tiling for >1024px images ### Stable Diffusion 1.5/2.1 - Very memory efficient base models - Can often run without optimizations on 8GB+ VRAM - Enable VAE slicing for batch processing 
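The dtype rules the quoted prompt lays out (user choice first, then CUDA, MPS, CPU) can be condensed into one selector. A minimal pure-Python sketch; the function name and string return values are illustrative stand-ins, not a diffusers API:

```python
def pick_device_and_dtype(cuda_available, mps_available, cuda_cc_major=0,
                          user_dtype=None):
    """Apply the knowledge-base dtype rules: a user-specified dtype wins,
    then CUDA (bfloat16 on compute capability 8.0+, else float16),
    then Apple MPS (bfloat16), then CPU (float32).
    Returns (device, dtype) as plain strings."""
    if user_dtype is not None:
        # Honor the user's dtype exactly, on the best available device.
        device = "cuda" if cuda_available else ("mps" if mps_available else "cpu")
        return device, user_dtype
    if cuda_available:
        # RTX 30/40-series (compute capability >= 8.0) prefer bfloat16.
        return "cuda", "bfloat16" if cuda_cc_major >= 8 else "float16"
    if mps_available:
        return "mps", "bfloat16"  # more stable than float16 on Apple Silicon
    return "cpu", "float32"

# The manual-spec run in this log (CUDA: False, MPS: False, no user dtype):
print(pick_device_and_dtype(False, False))
```

Given the manual hardware specs logged above, this resolves to a CPU/float32 fallback, which is why the memory analysis then pushes toward sequential CPU offloading.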
## Memory Usage Estimation - FLUX.1: ~48GB for FP32, ~24GB for FP16 - SDXL: ~7GB for FP16, ~14GB for FP32 - SD 1.5: ~2GB for FP16, ~4GB for FP32 ## Optimization Combinations by VRAM ### 24GB+ VRAM (High-end) ```python pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16) pipe = pipe.to("cuda") pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True) ``` ### 12-24GB VRAM (Mid-range) ```python pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16) pipe.enable_model_cpu_offload()  # manages device placement; do not call .to("cuda") first pipe.enable_xformers_memory_efficient_attention() ``` ### 8-12GB VRAM (Entry-level) ```python pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16) pipe.enable_sequential_cpu_offload() pipe.enable_attention_slicing() pipe.enable_vae_slicing() pipe.enable_xformers_memory_efficient_attention() ``` ### <8GB VRAM (Low-end) ```python pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16) pipe.enable_sequential_cpu_offload() pipe.enable_attention_slicing("max") pipe.enable_vae_slicing() pipe.enable_vae_tiling() ``` IMPORTANT: For FLUX.1-schnell models, do NOT include guidance_scale parameter as it's not needed. Using the OPTIMIZATION KNOWLEDGE BASE above, generate Python code that: 1. **Selects the best optimization techniques** for the specific hardware profile 2. **Applies appropriate memory optimizations** based on available VRAM 3. **Uses optimal data types** for the target hardware: - User specified dtype (if provided): Use exactly as specified - Apple Silicon (MPS): prefer torch.bfloat16 - NVIDIA GPUs: prefer torch.float16 or torch.bfloat16 - CPU only: use torch.float32 4. **Implements hardware-specific optimizations** (CUDA, MPS, CPU) 5. 
**Follows model-specific guidelines** (e.g., FLUX guidance_scale handling) IMPORTANT GUIDELINES: - Reference the OPTIMIZATION KNOWLEDGE BASE to select appropriate techniques - Include all necessary imports - Add brief comments explaining optimization choices - Generate compact, production-ready code - Inline values where possible for concise code - Generate ONLY the Python code, no explanations before or after the code block 2025-05-30 12:59:32,000 - auto_diffusers - INFO - ================================================================================ 2025-05-30 12:59:32,000 - auto_diffusers - INFO - Sending request to Gemini API 2025-05-30 12:59:47,609 - auto_diffusers - INFO - Successfully received response from Gemini API (no tools used) 2025-05-30 12:59:47,609 - auto_diffusers - DEBUG - Response length: 2336 characters 2025-05-30 13:01:46,192 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 13:01:46,192 - __main__ - DEBUG - API key found, length: 39 2025-05-30 13:01:46,192 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 13:01:46,192 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 13:01:46,192 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 13:01:46,192 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 13:01:46,192 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 13:01:46,193 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 13:01:46,193 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 13:01:46,193 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 13:01:46,197 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 13:01:46,197 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 13:01:46,686 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 13:01:46,686 - 
hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 13:01:46,686 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 13:01:46,686 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 13:01:46,686 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 13:01:46,686 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 13:01:46,687 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 13:01:46,687 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 13:01:46,687 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 13:01:46,687 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 13:01:46,687 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 13:01:46,689 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 13:01:46,702 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 13:01:46,710 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 13:01:46,789 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 13:01:46,822 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 13:01:46,823 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 13:01:46,823 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 13:01:46,823 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 13:01:46,823 - httpcore.http11 - DEBUG - send_request_body.started 
request= 2025-05-30 13:01:46,824 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 13:01:46,824 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 13:01:46,824 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 04:01:46 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 13:01:46,824 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 13:01:46,824 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 13:01:46,824 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 13:01:46,824 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 13:01:46,824 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 13:01:46,824 - httpcore.connection - DEBUG - close.started 2025-05-30 13:01:46,824 - httpcore.connection - DEBUG - close.complete 2025-05-30 13:01:46,825 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 13:01:46,825 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 13:01:46,825 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 13:01:46,825 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 13:01:46,825 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 13:01:46,825 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 13:01:46,825 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 13:01:46,832 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 04:01:46 GMT'), (b'server', b'uvicorn'), (b'content-length', b'105165'), (b'content-type', b'text/html; charset=utf-8')]) 
2025-05-30 13:01:46,832 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 13:01:46,832 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 13:01:46,832 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 13:01:46,832 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 13:01:46,832 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 13:01:46,832 - httpcore.connection - DEBUG - close.started 2025-05-30 13:01:46,832 - httpcore.connection - DEBUG - close.complete 2025-05-30 13:01:46,843 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 13:01:47,914 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 13:01:47,914 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 13:01:47,914 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 13:01:47,914 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 13:01:48,744 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 13:01:48,881 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 13:01:48,882 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 13:01:48,882 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 13:01:48,883 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 13:01:48,883 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 13:01:48,883 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 13:01:48,883 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 13:01:48,883 - httpcore.http11 - DEBUG - send_request_body.complete 
2025-05-30 13:01:48,884 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 13:01:48,884 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 13:01:48,884 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 13:01:48,884 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 13:01:49,012 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 13:01:49,012 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 13:01:49,012 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 13:01:49,012 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 13:01:49,012 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 13:01:49,013 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 13:01:49,013 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 13:01:49,013 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 13:01:49,013 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 13:01:49,026 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 04:01:48 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 13:01:49,026 - httpcore.http11 - DEBUG - 
receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 04:01:48 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 13:01:49,026 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 13:01:49,026 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 13:01:49,026 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 13:01:49,026 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 13:01:49,027 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 13:01:49,027 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 13:01:49,027 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 13:01:49,027 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 13:01:49,027 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 13:01:49,027 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 13:01:49,027 - httpcore.connection - DEBUG - close.started 2025-05-30 13:01:49,027 - httpcore.connection - DEBUG - close.started 2025-05-30 13:01:49,027 - httpcore.connection - DEBUG - close.complete 2025-05-30 13:01:49,027 - httpcore.connection - DEBUG - close.complete 2025-05-30 13:01:49,637 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 13:01:49,851 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 13:03:36,653 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 13:03:36,653 - __main__ - DEBUG - API key found, length: 39 2025-05-30 13:03:36,653 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 13:03:36,653 - auto_diffusers - DEBUG - API 
key length: 39 2025-05-30 13:03:36,653 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 13:03:36,653 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 13:03:36,653 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 13:03:36,653 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 13:03:36,653 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 13:03:36,653 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 13:03:36,657 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 13:03:36,657 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 13:03:37,119 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 13:03:37,119 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 13:03:37,119 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 13:03:37,119 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 13:03:37,119 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 13:03:37,119 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 13:03:37,119 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 13:03:37,119 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 13:03:37,119 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 13:03:37,119 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 13:03:37,119 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 13:03:37,121 - asyncio - DEBUG - Using 
selector: KqueueSelector 2025-05-30 13:03:37,135 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 13:03:37,135 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 13:03:37,221 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 13:03:37,253 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 13:03:37,253 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 13:03:37,254 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 13:03:37,254 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 13:03:37,254 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 13:03:37,254 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 13:03:37,254 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 13:03:37,254 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 04:03:37 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 13:03:37,255 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 13:03:37,255 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 13:03:37,255 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 13:03:37,255 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 13:03:37,255 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 13:03:37,255 - httpcore.connection - DEBUG - close.started 2025-05-30 13:03:37,255 - httpcore.connection - DEBUG - close.complete 2025-05-30 13:03:37,255 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' 
port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 13:03:37,256 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 13:03:37,256 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 13:03:37,256 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 13:03:37,256 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 13:03:37,256 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 13:03:37,256 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 13:03:37,263 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 04:03:37 GMT'), (b'server', b'uvicorn'), (b'content-length', b'105762'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 13:03:37,263 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 13:03:37,263 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 13:03:37,263 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 13:03:37,263 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 13:03:37,263 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 13:03:37,263 - httpcore.connection - DEBUG - close.started 2025-05-30 13:03:37,263 - httpcore.connection - DEBUG - close.complete 2025-05-30 13:03:37,274 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 13:03:37,299 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 13:03:37,299 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 13:03:37,422 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 13:03:37,422 - httpcore.connection - DEBUG - start_tls.started ssl_context= 
server_hostname='api.gradio.app' timeout=30 2025-05-30 13:03:37,427 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 13:03:37,577 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 13:03:37,577 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 13:03:37,577 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 13:03:37,577 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 13:03:37,577 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 13:03:37,577 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 13:03:37,717 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 04:03:37 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 13:03:37,718 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 13:03:37,718 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 13:03:37,718 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 13:03:37,718 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 13:03:37,718 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 13:03:37,719 - httpcore.connection - DEBUG - close.started 2025-05-30 13:03:37,719 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 13:03:37,719 - httpcore.connection - DEBUG - close.complete 2025-05-30 13:03:37,719 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 13:03:37,720 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 13:03:37,720 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 13:03:37,720 
- httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 13:03:37,720 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 13:03:37,873 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 04:03:37 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 13:03:37,873 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 13:03:37,874 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 13:03:37,874 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 13:03:37,874 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 13:03:37,875 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 13:03:37,875 - httpcore.connection - DEBUG - close.started 2025-05-30 13:03:37,875 - httpcore.connection - DEBUG - close.complete 2025-05-30 13:03:38,014 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 13:03:38,014 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 13:03:38,014 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 13:03:38,014 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 13:03:38,014 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 13:03:38,014 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 13:03:38,015 - 
simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 13:03:38,015 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 13:03:38,015 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 13:03:38,475 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 13:03:38,704 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 13:03:44,362 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 13:03:44,362 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 13:03:44,362 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 13:03:44,362 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 13:03:44,362 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 13:03:44,362 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 13:03:44,362 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 13:03:44,362 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 13:03:44,365 - auto_diffusers - INFO - Starting code generation for model: black-forest-labs/FLUX.1-schnell 2025-05-30 13:03:44,366 - auto_diffusers - DEBUG - Parameters: prompt='A cat holding a sign that says hello world...', size=(768, 1360), steps=4 2025-05-30 13:03:44,366 - auto_diffusers - DEBUG - Manual specs: True, Memory analysis provided: True 2025-05-30 13:03:44,366 
- auto_diffusers - INFO - Using manual hardware specifications 2025-05-30 13:03:44,366 - auto_diffusers - DEBUG - Manual specs: {'platform': 'Linux', 'architecture': 'manual_input', 'cpu_count': 8, 'python_version': '3.11', 'cuda_available': False, 'mps_available': False, 'torch_version': '2.0+', 'manual_input': True, 'ram_gb': 16, 'user_dtype': None, 'gpu_info': [{'name': 'Custom GPU', 'memory_mb': 8192}]} 2025-05-30 13:03:44,367 - auto_diffusers - DEBUG - GPU detected with 8.0 GB VRAM 2025-05-30 13:03:44,367 - auto_diffusers - INFO - Selected optimization profile: balanced 2025-05-30 13:03:44,367 - auto_diffusers - DEBUG - Creating generation prompt for Gemini API 2025-05-30 13:03:44,367 - auto_diffusers - DEBUG - Prompt length: 7598 characters 2025-05-30 13:03:44,367 - auto_diffusers - INFO - ================================================================================ 2025-05-30 13:03:44,367 - auto_diffusers - INFO - PROMPT SENT TO GEMINI API: 2025-05-30 13:03:44,368 - auto_diffusers - INFO - ================================================================================ 2025-05-30 13:03:44,368 - auto_diffusers - INFO - You are an expert in optimizing diffusers library code for different hardware configurations. NOTE: This system includes curated optimization knowledge from HuggingFace documentation. 
TASK: Generate optimized Python code for running a diffusion model with the following specifications:
- Model: black-forest-labs/FLUX.1-schnell
- Prompt: "A cat holding a sign that says hello world"
- Image size: 768x1360
- Inference steps: 4

HARDWARE SPECIFICATIONS:
- Platform: Linux (manual_input)
- CPU Cores: 8
- CUDA Available: False
- MPS Available: False
- Optimization Profile: balanced
- GPU: Custom GPU (8.0 GB VRAM)

MEMORY ANALYSIS:
- Model Memory Requirements: 36.0 GB (FP16 inference)
- Model Weights Size: 24.0 GB (FP16)
- Memory Recommendation: 🔄 Requires sequential CPU offloading
- Recommended Precision: float16
- Attention Slicing Recommended: True
- VAE Slicing Recommended: True

OPTIMIZATION KNOWLEDGE BASE:

# DIFFUSERS OPTIMIZATION TECHNIQUES

## Memory Optimization Techniques

### 1. Model CPU Offloading
Use `enable_model_cpu_offload()` to move models between GPU and CPU automatically:
```python
pipe.enable_model_cpu_offload()
```
- Saves significant VRAM by keeping only active models on GPU
- Automatic management, no manual intervention needed
- Compatible with all pipelines

### 2. Sequential CPU Offloading
Use `enable_sequential_cpu_offload()` for more aggressive memory saving:
```python
pipe.enable_sequential_cpu_offload()
```
- More memory efficient than model offloading
- Moves models to CPU after each forward pass
- Best for very limited VRAM scenarios

### 3. Attention Slicing
Use `enable_attention_slicing()` to reduce memory during attention computation:
```python
pipe.enable_attention_slicing()
# or specify slice size
pipe.enable_attention_slicing("max")  # maximum slicing
pipe.enable_attention_slicing(1)      # slice_size = 1
```
- Trades compute time for memory
- Most effective for high-resolution images
- Can be combined with other techniques

### 4. VAE Slicing
Use `enable_vae_slicing()` for large batch processing:
```python
pipe.enable_vae_slicing()
```
- Decodes images one at a time instead of all at once
- Essential for batch sizes > 4
- Minimal performance impact on single images

### 5. VAE Tiling
Use `enable_vae_tiling()` for high-resolution image generation:
```python
pipe.enable_vae_tiling()
```
- Enables 4K+ image generation on 8GB VRAM
- Splits images into overlapping tiles
- Automatically disabled for 512x512 or smaller images

### 6. Memory Efficient Attention (xFormers)
Use `enable_xformers_memory_efficient_attention()` if xFormers is installed:
```python
pipe.enable_xformers_memory_efficient_attention()
```
- Significantly reduces memory usage and improves speed
- Requires xformers library installation
- Compatible with most models

## Performance Optimization Techniques

### 1. Half Precision (FP16/BF16)
Use lower precision for better memory and speed:
```python
# FP16 (widely supported)
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
# BF16 (better numerical stability, newer hardware)
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
```
- FP16: Halves memory usage, widely supported
- BF16: Better numerical stability, requires newer GPUs
- Essential for most optimization scenarios

### 2. Torch Compile (PyTorch 2.0+)
Use `torch.compile()` for significant speed improvements:
```python
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
# For some models, compile VAE too:
pipe.vae.decode = torch.compile(pipe.vae.decode, mode="reduce-overhead", fullgraph=True)
```
- 5-50% speed improvement
- Requires PyTorch 2.0+
- First run is slower due to compilation

### 3. Fast Schedulers
Use faster schedulers for fewer steps:
```python
from diffusers import LMSDiscreteScheduler, UniPCMultistepScheduler
# LMS Scheduler (good quality, fast)
pipe.scheduler = LMSDiscreteScheduler.from_config(pipe.scheduler.config)
# UniPC Scheduler (fastest)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
```

## Hardware-Specific Optimizations

### NVIDIA GPU Optimizations
```python
# Enable Tensor Cores
torch.backends.cudnn.benchmark = True
# Optimal data type for NVIDIA
torch_dtype = torch.float16  # or torch.bfloat16 for RTX 30/40 series
```

### Apple Silicon (MPS) Optimizations
```python
# Use MPS device
device = "mps" if torch.backends.mps.is_available() else "cpu"
pipe = pipe.to(device)
# Recommended dtype for Apple Silicon
torch_dtype = torch.bfloat16  # Better than float16 on Apple Silicon
# Attention slicing often helps on MPS
pipe.enable_attention_slicing()
```

### CPU Optimizations
```python
# Use float32 for CPU
torch_dtype = torch.float32
# Enable optimized attention
pipe.enable_attention_slicing()
```

## Model-Specific Guidelines

### FLUX Models
- Do NOT use guidance_scale parameter (not needed for FLUX)
- Use 4-8 inference steps maximum
- BF16 dtype recommended
- Enable attention slicing for memory optimization

### Stable Diffusion XL
- Enable attention slicing for high resolutions
- Use refiner model sparingly to save memory
- Consider VAE tiling for >1024px images

### Stable Diffusion 1.5/2.1
- Very memory efficient base models
- Can often run without optimizations on 8GB+ VRAM
- Enable VAE slicing for batch processing

## Memory Usage Estimation
- FLUX.1: ~24GB for full precision, ~12GB for FP16
- SDXL: ~7GB for FP16, ~14GB for FP32
- SD 1.5: ~2GB for FP16, ~4GB for FP32

## Optimization Combinations by VRAM

### 24GB+ VRAM (High-end)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
pipe = pipe.to("cuda")
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
```

### 12-24GB VRAM (Mid-range)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to("cuda")
pipe.enable_model_cpu_offload()
pipe.enable_xformers_memory_efficient_attention()
```

### 8-12GB VRAM (Entry-level)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing()
pipe.enable_vae_slicing()
pipe.enable_xformers_memory_efficient_attention()
```

### <8GB VRAM (Low-end)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing("max")
pipe.enable_vae_slicing()
pipe.enable_vae_tiling()
```

IMPORTANT: For FLUX.1-schnell models, do NOT include guidance_scale parameter as it's not needed.

Using the OPTIMIZATION KNOWLEDGE BASE above, generate Python code that:
1. **Selects the best optimization techniques** for the specific hardware profile
2. **Applies appropriate memory optimizations** based on available VRAM
3. **Uses optimal data types** for the target hardware:
   - User specified dtype (if provided): Use exactly as specified
   - Apple Silicon (MPS): prefer torch.bfloat16
   - NVIDIA GPUs: prefer torch.float16 or torch.bfloat16
   - CPU only: use torch.float32
4. **Implements hardware-specific optimizations** (CUDA, MPS, CPU)
5. **Follows model-specific guidelines** (e.g., FLUX guidance_scale handling)

IMPORTANT GUIDELINES:
- Reference the OPTIMIZATION KNOWLEDGE BASE to select appropriate techniques
- Include all necessary imports
- Add brief comments explaining optimization choices
- Generate compact, production-ready code
- Inline values where possible for concise code
- Generate ONLY the Python code, no explanations before or after the code block
2025-05-30 13:03:44,368 - auto_diffusers - INFO - ================================================================================ 2025-05-30 13:03:44,368 - auto_diffusers - INFO - Sending request to Gemini API 2025-05-30 13:04:03,397 - auto_diffusers - INFO - Successfully received response from Gemini API (no tools used) 2025-05-30 13:04:03,398 - auto_diffusers - DEBUG - Response length: 2233 characters 2025-05-30 13:05:29,939 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 13:05:29,939 - __main__ - DEBUG - API key found, length: 39 2025-05-30 13:05:29,939 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 13:05:29,939 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 13:05:29,939 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 13:05:29,939 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 13:05:29,939 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 13:05:29,939 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 13:05:29,939 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 13:05:29,939 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 13:05:29,943 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 13:05:29,943 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 13:05:30,408 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 13:05:30,408 -
hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 13:05:30,408 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 13:05:30,408 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 13:05:30,409 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 13:05:30,409 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 13:05:30,409 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 13:05:30,409 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 13:05:30,409 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 13:05:30,409 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 13:05:30,409 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 13:05:30,411 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 13:05:30,412 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 13:05:30,429 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 13:05:30,516 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 13:05:30,550 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 13:05:30,551 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 13:05:30,551 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 13:05:30,551 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 13:05:30,551 - httpcore.http11 - DEBUG - send_request_body.started 
request= 2025-05-30 13:05:30,551 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 13:05:30,551 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 13:05:30,552 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 04:05:30 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 13:05:30,552 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 13:05:30,552 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 13:05:30,552 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 13:05:30,552 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 13:05:30,552 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 13:05:30,552 - httpcore.connection - DEBUG - close.started 2025-05-30 13:05:30,552 - httpcore.connection - DEBUG - close.complete 2025-05-30 13:05:30,553 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 13:05:30,553 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 13:05:30,553 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 13:05:30,553 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 13:05:30,553 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 13:05:30,553 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 13:05:30,553 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 13:05:30,561 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 04:05:30 GMT'), (b'server', b'uvicorn'), (b'content-length', b'106851'), (b'content-type', b'text/html; charset=utf-8')]) 
2025-05-30 13:05:30,561 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 13:05:30,561 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 13:05:30,561 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 13:05:30,561 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 13:05:30,561 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 13:05:30,561 - httpcore.connection - DEBUG - close.started 2025-05-30 13:05:30,561 - httpcore.connection - DEBUG - close.complete 2025-05-30 13:05:30,573 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 13:05:30,598 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 13:05:30,599 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 13:05:30,688 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 13:05:30,711 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 13:05:30,711 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 13:05:30,891 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 13:05:30,892 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 13:05:30,892 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 13:05:30,892 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 13:05:30,892 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 13:05:30,893 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 13:05:30,991 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 13:05:30,991 - httpcore.http11 - DEBUG - send_request_headers.started 
request= 2025-05-30 13:05:30,991 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 13:05:30,991 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 13:05:30,992 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 13:05:30,992 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 13:05:31,039 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 04:05:30 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 13:05:31,040 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 13:05:31,040 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 13:05:31,040 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 13:05:31,040 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 13:05:31,040 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 13:05:31,041 - httpcore.connection - DEBUG - close.started 2025-05-30 13:05:31,041 - httpcore.connection - DEBUG - close.complete 2025-05-30 13:05:31,135 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 04:05:31 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 13:05:31,135 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 13:05:31,135 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 13:05:31,136 - httpcore.http11 - DEBUG - receive_response_body.complete 
2025-05-30 13:05:31,136 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 13:05:31,136 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 13:05:31,136 - httpcore.connection - DEBUG - close.started 2025-05-30 13:05:31,136 - httpcore.connection - DEBUG - close.complete 2025-05-30 13:05:31,772 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 13:05:31,821 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 13:05:31,821 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 13:05:31,821 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 13:05:31,821 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 13:05:31,821 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 13:05:31,821 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 13:05:31,821 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 13:05:31,822 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 13:05:31,822 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 13:05:32,345 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 13:08:55,159 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 13:08:55,159 - __main__ - DEBUG - API key found, length: 39 2025-05-30 13:08:55,159 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 13:08:55,159 - auto_diffusers - DEBUG - API key 
length: 39 2025-05-30 13:08:55,159 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 13:08:55,159 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 13:08:55,159 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 13:08:55,159 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 13:08:55,159 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 13:08:55,159 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 13:08:55,163 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 13:08:55,163 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 13:08:55,637 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 13:08:55,637 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 13:08:55,637 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 13:08:55,637 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 13:08:55,637 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 13:08:55,637 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 13:08:55,637 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 13:08:55,637 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 13:08:55,637 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 13:08:55,637 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 13:08:55,638 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 13:08:55,640 - asyncio - DEBUG - Using selector: 
KqueueSelector 2025-05-30 13:08:55,647 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 13:08:55,654 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 13:08:55,739 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 13:08:55,771 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 13:08:55,772 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 13:08:55,772 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 13:08:55,772 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 13:08:55,772 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 13:08:55,772 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 13:08:55,773 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 13:08:55,773 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 04:08:55 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 13:08:55,773 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 13:08:55,773 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 13:08:55,773 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 13:08:55,773 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 13:08:55,773 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 13:08:55,773 - httpcore.connection - DEBUG - close.started 2025-05-30 13:08:55,773 - httpcore.connection - DEBUG - close.complete 2025-05-30 13:08:55,774 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 
local_address=None timeout=3 socket_options=None 2025-05-30 13:08:55,774 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 13:08:55,774 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 13:08:55,774 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 13:08:55,774 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 13:08:55,774 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 13:08:55,774 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 13:08:55,781 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 04:08:55 GMT'), (b'server', b'uvicorn'), (b'content-length', b'106859'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 13:08:55,781 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 13:08:55,781 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 13:08:55,781 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 13:08:55,781 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 13:08:55,781 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 13:08:55,781 - httpcore.connection - DEBUG - close.started 2025-05-30 13:08:55,781 - httpcore.connection - DEBUG - close.complete 2025-05-30 13:08:55,793 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 13:08:55,816 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 13:08:55,816 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 13:08:55,943 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 13:08:55,949 - httpcore.connection - DEBUG - 
connect_tcp.complete return_value= 2025-05-30 13:08:55,949 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 13:08:56,092 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 13:08:56,093 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 13:08:56,093 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 13:08:56,093 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 13:08:56,093 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 13:08:56,093 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 13:08:56,234 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 04:08:56 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 13:08:56,235 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 13:08:56,235 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 13:08:56,236 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 13:08:56,236 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 13:08:56,236 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 13:08:56,237 - httpcore.connection - DEBUG - close.started 2025-05-30 13:08:56,237 - httpcore.connection - DEBUG - close.complete 2025-05-30 13:08:56,263 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 13:08:56,263 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 13:08:56,264 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 13:08:56,264 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 13:08:56,264 - httpcore.http11 
- DEBUG - send_request_body.complete 2025-05-30 13:08:56,264 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 13:08:56,421 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 04:08:56 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 13:08:56,422 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 13:08:56,422 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 13:08:56,422 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 13:08:56,422 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 13:08:56,422 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 13:08:56,422 - httpcore.connection - DEBUG - close.started 2025-05-30 13:08:56,423 - httpcore.connection - DEBUG - close.complete 2025-05-30 13:08:57,087 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 13:08:57,301 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 13:09:01,464 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 13:09:01,464 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 13:09:01,465 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 13:09:01,465 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 13:09:01,465 - simple_memory_calculator - INFO 
- Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:09:01,465 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:09:01,465 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 13:09:01,465 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:09:01,465 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:10:26,389 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 13:10:26,389 - __main__ - DEBUG - API key found, length: 39
2025-05-30 13:10:26,389 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 13:10:26,389 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 13:10:26,389 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 13:10:26,389 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 13:10:26,389 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 13:10:26,389 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 13:10:26,389 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 13:10:26,389 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 13:10:26,393 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 13:10:26,393 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 13:10:26,846 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 13:10:26,846 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 13:10:26,846 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 13:10:26,846 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 13:10:26,846 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 13:10:26,846 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 13:10:26,846 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 13:10:26,846 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 13:10:26,846 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 13:10:26,847 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 13:10:26,847 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 13:10:26,849 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 13:10:26,861 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 13:10:26,869 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 13:10:26,948 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 13:10:26,990 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 13:10:26,991 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:10:26,991 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:10:26,991 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:10:26,991 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:10:26,992 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:10:26,992 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:10:26,992 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 04:10:26 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 13:10:26,992 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 13:10:26,992 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:10:26,992 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:10:26,992 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:10:26,992 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:10:26,992 - httpcore.connection - DEBUG - close.started
2025-05-30 13:10:26,992 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:10:26,993 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 13:10:26,994 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:10:26,994 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:10:26,995 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:10:26,995 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:10:26,996 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:10:26,996 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:10:27,001 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 04:10:26 GMT'), (b'server', b'uvicorn'), (b'content-length', b'108000'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 13:10:27,002 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 13:10:27,002 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:10:27,003 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:10:27,003 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:10:27,003 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:10:27,004 - httpcore.connection - DEBUG - close.started
2025-05-30 13:10:27,004 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:10:27,017 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 13:10:27,221 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:10:27,222 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 13:10:27,245 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:10:27,245 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:10:27,245 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 13:10:27,245 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 13:10:27,245 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:10:27,245 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:10:27,245 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 13:10:27,245 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:10:27,245 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:10:27,252 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:10:27,252 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 13:10:27,259 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 13:10:27,511 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 13:10:27,511 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:10:27,511 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:10:27,511 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:10:27,511 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:10:27,511 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:10:27,588 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 13:10:27,588 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:10:27,589 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:10:27,589 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:10:27,589 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:10:27,589 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:10:27,657 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 04:10:27 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 13:10:27,657 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 13:10:27,658 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:10:27,658 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:10:27,658 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:10:27,659 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:10:27,659 - httpcore.connection - DEBUG - close.started
2025-05-30 13:10:27,659 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:10:27,757 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 04:10:27 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 13:10:27,758 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 13:10:27,758 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:10:27,758 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:10:27,758 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:10:27,758 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:10:27,758 - httpcore.connection - DEBUG - close.started
2025-05-30 13:10:27,758 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:10:28,324 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 13:10:28,548 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 13:35:20,468 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 13:35:20,468 - __main__ - DEBUG - API key found, length: 39
2025-05-30 13:35:20,468 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 13:35:20,468 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 13:35:20,468 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 13:35:20,468 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 13:35:20,468 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 13:35:20,468 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 13:35:20,468 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 13:35:20,468 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 13:35:20,472 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 13:35:20,472 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 13:35:20,941 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 13:35:20,941 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 13:35:20,941 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 13:35:20,941 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 13:35:20,941 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 13:35:20,941 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 13:35:20,941 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 13:35:20,941 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 13:35:20,941 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 13:35:20,941 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 13:35:20,941 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 13:35:20,944 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 13:35:20,957 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 13:35:20,964 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 13:35:21,047 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 13:35:21,079 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 13:35:21,080 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:35:21,080 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:35:21,080 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:35:21,081 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:35:21,081 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:35:21,081 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:35:21,081 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 04:35:21 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 13:35:21,081 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 13:35:21,081 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:35:21,081 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:35:21,081 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:35:21,081 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:35:21,081 - httpcore.connection - DEBUG - close.started
2025-05-30 13:35:21,081 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:35:21,082 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 13:35:21,082 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:35:21,082 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:35:21,082 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:35:21,082 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:35:21,082 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:35:21,082 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:35:21,089 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 04:35:21 GMT'), (b'server', b'uvicorn'), (b'content-length', b'108175'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 13:35:21,089 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 13:35:21,090 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:35:21,090 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:35:21,090 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:35:21,090 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:35:21,090 - httpcore.connection - DEBUG - close.started
2025-05-30 13:35:21,090 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:35:21,101 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 13:35:21,120 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:35:21,120 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 13:35:21,229 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 13:35:21,241 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:35:21,241 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 13:35:21,392 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 13:35:21,393 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:35:21,393 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:35:21,393 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:35:21,393 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:35:21,393 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:35:21,527 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 13:35:21,527 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:35:21,527 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:35:21,527 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:35:21,527 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:35:21,527 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:35:21,529 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 04:35:21 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 13:35:21,529 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 13:35:21,529 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:35:21,529 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:35:21,529 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:35:21,529 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:35:21,529 - httpcore.connection - DEBUG - close.started
2025-05-30 13:35:21,529 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:35:21,672 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 04:35:21 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 13:35:21,672 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 13:35:21,672 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:35:21,673 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:35:21,673 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:35:21,673 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:35:21,673 - httpcore.connection - DEBUG - close.started
2025-05-30 13:35:21,673 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:35:22,261 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 13:35:22,493 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 13:35:26,502 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:35:26,502 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:35:26,503 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 13:35:26,503 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 13:35:26,503 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:35:26,503 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:35:26,503 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 13:35:26,503 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:35:26,503 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:36:39,069 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 13:36:39,069 - __main__ - DEBUG - API key found, length: 39
2025-05-30 13:36:39,069 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 13:36:39,069 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 13:36:39,069 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 13:36:39,069 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 13:36:39,069 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 13:36:39,069 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 13:36:39,069 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 13:36:39,069 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 13:36:39,072 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 13:36:39,072 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 13:36:39,545 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 13:36:39,545 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 13:36:39,545 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 13:36:39,545 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 13:36:39,545 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 13:36:39,545 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 13:36:39,545 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 13:36:39,545 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 13:36:39,545 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 13:36:39,545 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 13:36:39,545 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 13:36:39,548 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 13:36:39,561 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 13:36:39,568 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 13:36:39,663 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 13:36:39,693 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 13:36:39,693 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:36:39,693 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:36:39,694 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:36:39,694 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:36:39,694 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:36:39,694 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:36:39,694 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 04:36:39 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 13:36:39,695 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 13:36:39,695 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:36:39,695 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:36:39,695 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:36:39,695 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:36:39,695 - httpcore.connection - DEBUG - close.started
2025-05-30 13:36:39,695 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:36:39,695 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 13:36:39,696 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:36:39,696 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:36:39,696 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:36:39,696 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:36:39,696 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:36:39,696 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:36:39,702 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 04:36:39 GMT'), (b'server', b'uvicorn'), (b'content-length', b'109390'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 13:36:39,702 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 13:36:39,702 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:36:39,702 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:36:39,703 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:36:39,703 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:36:39,703 - httpcore.connection - DEBUG - close.started
2025-05-30 13:36:39,703 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:36:39,714 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 13:36:39,763 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:36:39,763 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 13:36:39,853 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:36:39,853 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 13:36:39,907 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 13:36:40,107 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 13:36:40,107 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:36:40,108 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:36:40,108 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:36:40,108 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:36:40,109 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:36:40,128 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 13:36:40,128 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:36:40,128 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:36:40,128 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:36:40,128 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:36:40,128 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:36:40,268 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 04:36:40 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 13:36:40,268 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 13:36:40,269 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:36:40,269 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:36:40,269 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:36:40,270 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:36:40,270 - httpcore.connection - DEBUG - close.started
2025-05-30 13:36:40,270 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:36:40,279 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 04:36:40 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 13:36:40,280 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 13:36:40,280 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:36:40,280 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:36:40,280 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:36:40,280 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:36:40,280 - httpcore.connection - DEBUG - close.started
2025-05-30 13:36:40,280 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:36:40,866 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 13:36:41,086 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 13:36:41,216 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:36:41,216 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:36:41,216 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 13:36:41,216 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 13:36:41,216 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:36:41,217 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:36:41,217 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 13:36:41,217 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:36:41,217 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:38:21,760 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 13:38:21,760 - __main__ - DEBUG - API key found, length: 39
2025-05-30 13:38:21,760 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 13:38:21,760 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 13:38:21,760 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 13:38:21,760 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 13:38:21,760 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 13:38:21,760 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 13:38:21,760 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 13:38:21,760 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 13:38:21,764 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 13:38:21,764 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 13:38:22,234 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 13:38:22,234 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 13:38:22,234 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 13:38:22,234 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 13:38:22,234 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 13:38:22,234 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 13:38:22,234 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 13:38:22,234 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 13:38:22,234 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 13:38:22,234 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 13:38:22,234 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 13:38:22,236 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 13:38:22,250 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 13:38:22,256 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 13:38:22,335 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 13:38:22,369 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 13:38:22,370 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:38:22,370 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:38:22,370 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:38:22,370 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:38:22,370 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:38:22,371 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:38:22,371 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 04:38:22 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 13:38:22,371 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 13:38:22,371 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:38:22,371 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:38:22,371 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:38:22,371 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:38:22,371 - httpcore.connection - DEBUG - close.started
2025-05-30 13:38:22,371 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:38:22,372 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 13:38:22,372 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:38:22,372 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:38:22,372 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:38:22,372 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:38:22,372 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:38:22,372 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:38:22,379 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 04:38:22 GMT'), (b'server', b'uvicorn'), (b'content-length', b'109348'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 13:38:22,379 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 13:38:22,379 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:38:22,379 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:38:22,379 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:38:22,379 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:38:22,379 - httpcore.connection - DEBUG - close.started
2025-05-30 13:38:22,379 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:38:22,391 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 13:38:22,418 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:38:22,418 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 13:38:22,525 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:38:22,525 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 13:38:22,528 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 13:38:22,700 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 13:38:22,700 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:38:22,700 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:38:22,700 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:38:22,701 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:38:22,701 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:38:22,798 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 13:38:22,799 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:38:22,799 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:38:22,799 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:38:22,799 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:38:22,799 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:38:22,887 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 04:38:22 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 13:38:22,888 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 13:38:22,888 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:38:22,889 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:38:22,889 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:38:22,889 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:38:22,890 - httpcore.connection - DEBUG - close.started
2025-05-30 13:38:22,890 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:38:22,937 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 04:38:22 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 13:38:22,938 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 13:38:22,938 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:38:22,939 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:38:22,939 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:38:22,939 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:38:22,939 - httpcore.connection - DEBUG - close.started
2025-05-30 13:38:22,940 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:38:23,236 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:38:23,236 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:38:23,236 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 13:38:23,236 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 13:38:23,237 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:38:23,237 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:38:23,237 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 13:38:23,237 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:38:23,237 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:38:23,534 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 13:38:23,781 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 13:39:41,812 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 13:39:41,812 - __main__ - DEBUG - API key found, length: 39
2025-05-30 13:39:41,812 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 13:39:41,812 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 13:39:41,812 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 13:39:41,812 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 13:39:41,812 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 13:39:41,812 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 13:39:41,812 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 13:39:41,812 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 13:39:41,817 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 13:39:41,817 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 13:39:42,291 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 13:39:42,291 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 13:39:42,292 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 13:39:42,292 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 13:39:42,292 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 13:39:42,292 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 13:39:42,292 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 13:39:42,292 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 13:39:42,292 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 13:39:42,292 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 13:39:42,292 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 13:39:42,294 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 13:39:42,307 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 13:39:42,315 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 13:39:42,405 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 13:39:42,447 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 13:39:42,448 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:39:42,448 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:39:42,448 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:39:42,449 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:39:42,449 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:39:42,449 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:39:42,449 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 04:39:42 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 13:39:42,449 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 13:39:42,449 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:39:42,449 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:39:42,449 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:39:42,449 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:39:42,449 - httpcore.connection - DEBUG - close.started
2025-05-30 13:39:42,449 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:39:42,450 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 13:39:42,450 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:39:42,451 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:39:42,451 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:39:42,451 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:39:42,451 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:39:42,451 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:39:42,457 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 04:39:42 GMT'), (b'server', b'uvicorn'), (b'content-length', b'109673'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 13:39:42,458 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 13:39:42,458 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:39:42,458 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:39:42,458 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:39:42,458 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:39:42,458 - httpcore.connection - DEBUG - close.started
2025-05-30 13:39:42,458 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:39:42,470 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 13:39:42,577 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:39:42,577 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 13:39:42,601 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 13:39:42,627 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:39:42,627 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 13:39:42,895 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 13:39:42,895 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:39:42,896 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:39:42,896 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:39:42,896 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:39:42,896 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:39:42,947 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 13:39:42,947 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:39:42,947 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:39:42,947 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:39:42,947 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:39:42,947 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:39:43,058 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 04:39:42 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 13:39:43,058 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 13:39:43,059 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:39:43,059 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:39:43,059 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:39:43,059 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:39:43,059 - httpcore.connection - DEBUG - close.started
2025-05-30 13:39:43,060 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:39:43,107 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 04:39:43 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 13:39:43,107 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 13:39:43,108 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:39:43,108 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:39:43,109 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:39:43,109 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:39:43,109 - httpcore.connection - DEBUG - close.started
2025-05-30 13:39:43,109 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:39:43,815 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 13:39:43,971 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:39:43,971 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:39:43,971 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 13:39:43,971 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 13:39:43,971 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:39:43,971 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:39:43,972 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 13:39:43,972 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:39:43,972 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:39:44,023 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 13:44:55,206 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 13:44:55,206 - __main__ - DEBUG - API key found, length: 39
2025-05-30 13:44:55,206 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 13:44:55,206 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 13:44:55,206 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 13:44:55,206 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 13:44:55,206 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 13:44:55,206 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 13:44:55,206 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 13:44:55,206 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 13:44:55,209 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 13:44:55,210 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 13:44:55,716 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 13:44:55,716 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 13:44:55,716 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 13:44:55,716 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 13:44:55,716 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 13:44:55,716 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 13:44:55,716 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 13:44:55,716 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 13:44:55,716 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 13:44:55,716 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 13:44:55,716 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 13:44:55,719 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 13:44:55,732 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 13:44:55,740 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 13:44:55,827 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 13:44:55,858 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 13:44:55,858 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:44:55,859 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:44:55,859 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:44:55,859 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:44:55,859 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:44:55,859 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:44:55,859 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 04:44:55 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 13:44:55,860 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 13:44:55,860 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:44:55,860 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:44:55,860 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:44:55,860 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:44:55,860 - httpcore.connection - DEBUG - close.started
2025-05-30 13:44:55,860 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:44:55,860 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 13:44:55,861 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:44:55,861 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:44:55,861 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:44:55,861 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:44:55,861 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:44:55,861 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:44:55,877 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 04:44:55 GMT'), (b'server', b'uvicorn'), (b'content-length', b'109695'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 13:44:55,877 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 13:44:55,877 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:44:55,877 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:44:55,877 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:44:55,877 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:44:55,877 - httpcore.connection - DEBUG - close.started
2025-05-30 13:44:55,877 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:44:55,890 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 13:44:55,972 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:44:55,972 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 13:44:56,041 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:44:56,041 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 13:44:56,151 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 13:44:56,269 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 13:44:56,270 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:44:56,270 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:44:56,270 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:44:56,270 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:44:56,270 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:44:56,339 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 13:44:56,340 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:44:56,340 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:44:56,340 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:44:56,340 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:44:56,340 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:44:56,407 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 04:44:56 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 13:44:56,408 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 13:44:56,408 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:44:56,409 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:44:56,409 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:44:56,409 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:44:56,409 - httpcore.connection - DEBUG - close.started
2025-05-30 13:44:56,410 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:44:56,494 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 04:44:56 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 13:44:56,495 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 13:44:56,495 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:44:56,496 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:44:56,496 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:44:56,496 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:44:56,496 - httpcore.connection - DEBUG - close.started
2025-05-30 13:44:56,497 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:44:56,673 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:44:56,673 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:44:56,673 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 13:44:56,674 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 13:44:56,674 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:44:56,674 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:44:56,674 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 13:44:56,674 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:44:56,674 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:44:57,143 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 13:44:57,362 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 13:44:59,518 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:44:59,518 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:44:59,518 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 13:44:59,518 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:44:59,518 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:44:59,519 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 13:44:59,519 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:44:59,519 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:45:01,606 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:45:01,606 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:45:01,606 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 13:45:01,606 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:45:01,606 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:45:01,606 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 13:45:01,606 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:45:01,607 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:45:56,869 - __main__ - INFO - Initializing GradioAutodiffusers
2025-05-30 13:45:56,869 - __main__ - DEBUG - API key found, length: 39
2025-05-30 13:45:56,869 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator
2025-05-30 13:45:56,869 - auto_diffusers - DEBUG - API key length: 39
2025-05-30 13:45:56,869 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools
2025-05-30 13:45:56,869 - hardware_detector - INFO - Initializing HardwareDetector
2025-05-30 13:45:56,869 - hardware_detector - DEBUG - Starting system hardware detection
2025-05-30 13:45:56,869 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64
2025-05-30 13:45:56,869 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11
2025-05-30 13:45:56,869 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi
2025-05-30 13:45:56,874 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected
2025-05-30 13:45:56,878 - hardware_detector - DEBUG - Checking PyTorch availability
2025-05-30 13:45:57,334 - hardware_detector - INFO - PyTorch 2.7.0 detected
2025-05-30 13:45:57,334 - hardware_detector - DEBUG - CUDA available: False, MPS available: True
2025-05-30 13:45:57,334 - hardware_detector - INFO - Hardware detection completed successfully
2025-05-30 13:45:57,334 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'}
2025-05-30 13:45:57,334 - auto_diffusers - INFO - Hardware detector initialized successfully
2025-05-30 13:45:57,334 - __main__ - INFO - AutoDiffusersGenerator initialized successfully
2025-05-30 13:45:57,334 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator
2025-05-30 13:45:57,334 - simple_memory_calculator - DEBUG - HuggingFace API initialized
2025-05-30 13:45:57,335 - simple_memory_calculator - DEBUG - Known models in database: 4
2025-05-30 13:45:57,335 - __main__ - INFO - SimpleMemoryCalculator initialized successfully
2025-05-30 13:45:57,335 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7
2025-05-30 13:45:57,337 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 13:45:57,350 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2025-05-30 13:45:57,357 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 13:45:57,438 - asyncio - DEBUG - Using selector: KqueueSelector
2025-05-30 13:45:57,471 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None
2025-05-30 13:45:57,471 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:45:57,471 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:45:57,471 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:45:57,472 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:45:57,472 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:45:57,472 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:45:57,472 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 04:45:57 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 13:45:57,472 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 13:45:57,472 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:45:57,472 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:45:57,472 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:45:57,472 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:45:57,472 - httpcore.connection - DEBUG - close.started
2025-05-30 13:45:57,472 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:45:57,473 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 13:45:57,473 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:45:57,473 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:45:57,473 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:45:57,473 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:45:57,473 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:45:57,473 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:45:57,480 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 04:45:57 GMT'), (b'server', b'uvicorn'), (b'content-length', b'109702'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 13:45:57,480 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 13:45:57,480 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:45:57,480 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:45:57,480 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:45:57,480 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:45:57,480 - httpcore.connection - DEBUG - close.started
2025-05-30 13:45:57,480 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:45:57,492 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 13:45:57,508 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:45:57,508 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 13:45:57,627 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:45:57,627 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 13:45:57,628 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 13:45:57,809 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 13:45:57,809 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:45:57,811 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:45:57,811 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:45:57,811 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:45:57,811 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:45:57,899 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 13:45:57,899 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:45:57,900 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:45:57,900 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:45:57,900 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:45:57,900 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:45:57,960 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 04:45:57 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 13:45:57,961 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 13:45:57,961 -
httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 13:45:57,961 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 13:45:57,961 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 13:45:57,961 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 13:45:57,961 - httpcore.connection - DEBUG - close.started 2025-05-30 13:45:57,961 - httpcore.connection - DEBUG - close.complete 2025-05-30 13:45:58,039 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 04:45:57 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 13:45:58,040 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 13:45:58,040 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 13:45:58,041 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 13:45:58,041 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 13:45:58,041 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 13:45:58,041 - httpcore.connection - DEBUG - close.started 2025-05-30 13:45:58,041 - httpcore.connection - DEBUG - close.complete 2025-05-30 13:45:58,234 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 13:45:58,234 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 13:45:58,234 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 13:45:58,234 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 
2025-05-30 13:45:58,234 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 13:45:58,234 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 13:45:58,234 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 13:45:58,234 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 13:45:58,234 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 13:45:58,620 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 13:45:58,836 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 13:47:21,071 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 13:47:21,071 - __main__ - DEBUG - API key found, length: 39 2025-05-30 13:47:21,071 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 13:47:21,071 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 13:47:21,071 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 13:47:21,071 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 13:47:21,071 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 13:47:21,071 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 13:47:21,071 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 13:47:21,071 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 13:47:21,075 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 13:47:21,075 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 13:47:21,534 - hardware_detector - INFO - PyTorch 2.7.0 detected 
2025-05-30 13:47:21,534 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 13:47:21,534 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 13:47:21,534 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 13:47:21,534 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 13:47:21,534 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 13:47:21,534 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 13:47:21,534 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 13:47:21,534 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 13:47:21,534 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 13:47:21,534 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 13:47:21,536 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 13:47:21,549 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 13:47:21,557 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 13:47:21,648 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 13:47:21,681 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 13:47:21,682 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 13:47:21,682 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 13:47:21,682 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 13:47:21,682 - httpcore.http11 - DEBUG - 
send_request_body.started request= 2025-05-30 13:47:21,682 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 13:47:21,682 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 13:47:21,683 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 04:47:21 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 13:47:21,683 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 13:47:21,683 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 13:47:21,683 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 13:47:21,683 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 13:47:21,683 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 13:47:21,683 - httpcore.connection - DEBUG - close.started 2025-05-30 13:47:21,683 - httpcore.connection - DEBUG - close.complete 2025-05-30 13:47:21,683 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 13:47:21,684 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 13:47:21,684 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 13:47:21,684 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 13:47:21,684 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 13:47:21,684 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 13:47:21,685 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 13:47:21,691 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 04:47:21 GMT'), (b'server', b'uvicorn'), (b'content-length', b'109853'), (b'content-type', b'text/html; 
charset=utf-8')]) 2025-05-30 13:47:21,691 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 13:47:21,691 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 13:47:21,691 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 13:47:21,691 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 13:47:21,691 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 13:47:21,691 - httpcore.connection - DEBUG - close.started 2025-05-30 13:47:21,691 - httpcore.connection - DEBUG - close.complete 2025-05-30 13:47:21,703 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 13:47:21,806 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 13:47:21,807 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 13:47:21,839 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 13:47:21,839 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 13:47:21,847 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 13:47:22,099 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 13:47:22,100 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 13:47:22,100 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 13:47:22,100 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 13:47:22,100 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 13:47:22,100 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 13:47:22,111 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 13:47:22,112 - httpcore.http11 - DEBUG - 
send_request_headers.started request= 2025-05-30 13:47:22,112 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 13:47:22,112 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 13:47:22,112 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 13:47:22,112 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 13:47:22,249 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 04:47:22 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 13:47:22,250 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 13:47:22,250 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 13:47:22,250 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 13:47:22,251 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 04:47:22 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 13:47:22,251 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 13:47:22,251 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 13:47:22,251 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 13:47:22,252 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 13:47:22,252 - httpcore.connection - DEBUG - close.started 2025-05-30 13:47:22,252 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 13:47:22,252 - httpcore.connection - 
DEBUG - close.complete 2025-05-30 13:47:22,253 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 13:47:22,253 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 13:47:22,253 - httpcore.connection - DEBUG - close.started 2025-05-30 13:47:22,253 - httpcore.connection - DEBUG - close.complete 2025-05-30 13:47:22,861 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 13:47:23,077 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 13:47:23,099 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 13:47:23,100 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 13:47:23,100 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 13:47:23,100 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 13:47:23,100 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 13:47:23,100 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 13:47:23,100 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 13:47:23,100 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 13:47:23,100 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 13:48:19,454 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 13:48:19,454 - __main__ - DEBUG - API key found, length: 39 2025-05-30 13:48:19,454 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 13:48:19,454 - 
auto_diffusers - DEBUG - API key length: 39 2025-05-30 13:48:19,454 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 13:48:19,454 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 13:48:19,454 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 13:48:19,454 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 13:48:19,454 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 13:48:19,454 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 13:48:19,457 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 13:48:19,457 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 13:48:19,918 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 13:48:19,918 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 13:48:19,918 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 13:48:19,918 - hardware_detector - DEBUG - Detected specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 13:48:19,918 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 13:48:19,918 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 13:48:19,918 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 13:48:19,918 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 13:48:19,918 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 13:48:19,918 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 13:48:19,918 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 13:48:19,921 - 
asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 13:48:19,934 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 13:48:19,942 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 13:48:20,025 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 13:48:20,056 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 13:48:20,057 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 13:48:20,057 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 13:48:20,057 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 13:48:20,057 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 13:48:20,057 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 13:48:20,058 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 13:48:20,058 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 04:48:20 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')]) 2025-05-30 13:48:20,058 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK" 2025-05-30 13:48:20,058 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 13:48:20,058 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 13:48:20,058 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 13:48:20,058 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 13:48:20,058 - httpcore.connection - DEBUG - close.started 2025-05-30 13:48:20,058 - httpcore.connection - DEBUG - close.complete 2025-05-30 13:48:20,059 - httpcore.connection - DEBUG - 
connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None 2025-05-30 13:48:20,059 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 13:48:20,059 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 13:48:20,059 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 13:48:20,059 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 13:48:20,059 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 13:48:20,059 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 13:48:20,065 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 04:48:20 GMT'), (b'server', b'uvicorn'), (b'content-length', b'109835'), (b'content-type', b'text/html; charset=utf-8')]) 2025-05-30 13:48:20,066 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK" 2025-05-30 13:48:20,066 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 13:48:20,066 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 13:48:20,066 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 13:48:20,066 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 13:48:20,066 - httpcore.connection - DEBUG - close.started 2025-05-30 13:48:20,066 - httpcore.connection - DEBUG - close.complete 2025-05-30 13:48:20,077 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None 2025-05-30 13:48:20,081 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 13:48:20,081 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3 2025-05-30 13:48:20,201 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0 2025-05-30 
13:48:20,234 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 13:48:20,234 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30 2025-05-30 13:48:20,357 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 13:48:20,357 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 13:48:20,358 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 13:48:20,358 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 13:48:20,358 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 13:48:20,359 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 13:48:20,496 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 04:48:20 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')]) 2025-05-30 13:48:20,497 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK" 2025-05-30 13:48:20,497 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 13:48:20,497 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 13:48:20,497 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 13:48:20,497 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 13:48:20,497 - httpcore.connection - DEBUG - close.started 2025-05-30 13:48:20,498 - httpcore.connection - DEBUG - close.complete 2025-05-30 13:48:20,552 - httpcore.connection - DEBUG - start_tls.complete return_value= 2025-05-30 13:48:20,552 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 13:48:20,552 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 13:48:20,553 - httpcore.http11 - DEBUG - send_request_body.started 
request= 2025-05-30 13:48:20,553 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 13:48:20,553 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 13:48:20,665 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 13:48:20,665 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 13:48:20,665 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0} 2025-05-30 13:48:20,665 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM 2025-05-30 13:48:20,665 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 13:48:20,665 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 13:48:20,665 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB 2025-05-30 13:48:20,665 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell 2025-05-30 13:48:20,665 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell 2025-05-30 13:48:20,736 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 04:48:20 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')]) 2025-05-30 13:48:20,736 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK" 2025-05-30 13:48:20,737 - httpcore.http11 - DEBUG - receive_response_body.started request= 2025-05-30 
13:48:20,737 - httpcore.http11 - DEBUG - receive_response_body.complete 2025-05-30 13:48:20,737 - httpcore.http11 - DEBUG - response_closed.started 2025-05-30 13:48:20,737 - httpcore.http11 - DEBUG - response_closed.complete 2025-05-30 13:48:20,737 - httpcore.connection - DEBUG - close.started 2025-05-30 13:48:20,737 - httpcore.connection - DEBUG - close.complete 2025-05-30 13:48:21,392 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 13:48:21,613 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0 2025-05-30 13:50:31,681 - __main__ - INFO - Initializing GradioAutodiffusers 2025-05-30 13:50:31,681 - __main__ - DEBUG - API key found, length: 39 2025-05-30 13:50:31,681 - auto_diffusers - INFO - Initializing AutoDiffusersGenerator 2025-05-30 13:50:31,681 - auto_diffusers - DEBUG - API key length: 39 2025-05-30 13:50:31,681 - auto_diffusers - WARNING - Tool calling dependencies not available, running without tools 2025-05-30 13:50:31,681 - hardware_detector - INFO - Initializing HardwareDetector 2025-05-30 13:50:31,681 - hardware_detector - DEBUG - Starting system hardware detection 2025-05-30 13:50:31,681 - hardware_detector - DEBUG - Platform: Darwin, Architecture: arm64 2025-05-30 13:50:31,681 - hardware_detector - DEBUG - CPU cores: 16, Python: 3.11.11 2025-05-30 13:50:31,681 - hardware_detector - DEBUG - Attempting GPU detection via nvidia-smi 2025-05-30 13:50:31,685 - hardware_detector - DEBUG - nvidia-smi not found, no NVIDIA GPU detected 2025-05-30 13:50:31,685 - hardware_detector - DEBUG - Checking PyTorch availability 2025-05-30 13:50:32,182 - hardware_detector - INFO - PyTorch 2.7.0 detected 2025-05-30 13:50:32,182 - hardware_detector - DEBUG - CUDA available: False, MPS available: True 2025-05-30 13:50:32,182 - hardware_detector - INFO - Hardware detection completed successfully 2025-05-30 13:50:32,182 - hardware_detector - DEBUG - Detected 
specs: {'platform': 'Darwin', 'architecture': 'arm64', 'cpu_count': 16, 'python_version': '3.11.11', 'gpu_info': None, 'cuda_available': False, 'mps_available': True, 'torch_version': '2.7.0'} 2025-05-30 13:50:32,182 - auto_diffusers - INFO - Hardware detector initialized successfully 2025-05-30 13:50:32,182 - __main__ - INFO - AutoDiffusersGenerator initialized successfully 2025-05-30 13:50:32,182 - simple_memory_calculator - INFO - Initializing SimpleMemoryCalculator 2025-05-30 13:50:32,182 - simple_memory_calculator - DEBUG - HuggingFace API initialized 2025-05-30 13:50:32,182 - simple_memory_calculator - DEBUG - Known models in database: 4 2025-05-30 13:50:32,182 - __main__ - INFO - SimpleMemoryCalculator initialized successfully 2025-05-30 13:50:32,182 - __main__ - DEBUG - Default model settings: gemini-2.5-flash-preview-05-20, temp=0.7 2025-05-30 13:50:32,184 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 13:50:32,204 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None 2025-05-30 13:50:32,205 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443 2025-05-30 13:50:32,292 - asyncio - DEBUG - Using selector: KqueueSelector 2025-05-30 13:50:32,324 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=None socket_options=None 2025-05-30 13:50:32,325 - httpcore.connection - DEBUG - connect_tcp.complete return_value= 2025-05-30 13:50:32,325 - httpcore.http11 - DEBUG - send_request_headers.started request= 2025-05-30 13:50:32,325 - httpcore.http11 - DEBUG - send_request_headers.complete 2025-05-30 13:50:32,325 - httpcore.http11 - DEBUG - send_request_body.started request= 2025-05-30 13:50:32,325 - httpcore.http11 - DEBUG - send_request_body.complete 2025-05-30 13:50:32,325 - httpcore.http11 - DEBUG - receive_response_headers.started request= 2025-05-30 13:50:32,325 - httpcore.http11 - 
DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 04:50:32 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2025-05-30 13:50:32,326 - httpx - INFO - HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2025-05-30 13:50:32,326 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:50:32,326 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:50:32,326 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:50:32,326 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:50:32,326 - httpcore.connection - DEBUG - close.started
2025-05-30 13:50:32,326 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:50:32,327 - httpcore.connection - DEBUG - connect_tcp.started host='localhost' port=7860 local_address=None timeout=3 socket_options=None
2025-05-30 13:50:32,327 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:50:32,327 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:50:32,327 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:50:32,327 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:50:32,327 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:50:32,328 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:50:32,334 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Fri, 30 May 2025 04:50:32 GMT'), (b'server', b'uvicorn'), (b'content-length', b'109813'), (b'content-type', b'text/html; charset=utf-8')])
2025-05-30 13:50:32,334 - httpx - INFO - HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2025-05-30 13:50:32,334 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:50:32,334 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:50:32,334 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:50:32,334 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:50:32,334 - httpcore.connection - DEBUG - close.started
2025-05-30 13:50:32,334 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:50:32,346 - httpcore.connection - DEBUG - connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=30 socket_options=None
2025-05-30 13:50:32,367 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:50:32,367 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=3
2025-05-30 13:50:32,488 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2025-05-30 13:50:32,489 - httpcore.connection - DEBUG - connect_tcp.complete return_value=
2025-05-30 13:50:32,489 - httpcore.connection - DEBUG - start_tls.started ssl_context= server_hostname='api.gradio.app' timeout=30
2025-05-30 13:50:32,652 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 13:50:32,653 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:50:32,653 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:50:32,653 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:50:32,653 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:50:32,653 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:50:32,780 - httpcore.connection - DEBUG - start_tls.complete return_value=
2025-05-30 13:50:32,780 - httpcore.http11 - DEBUG - send_request_headers.started request=
2025-05-30 13:50:32,780 - httpcore.http11 - DEBUG - send_request_headers.complete
2025-05-30 13:50:32,780 - httpcore.http11 - DEBUG - send_request_body.started request=
2025-05-30 13:50:32,780 - httpcore.http11 - DEBUG - send_request_body.complete
2025-05-30 13:50:32,780 - httpcore.http11 - DEBUG - receive_response_headers.started request=
2025-05-30 13:50:32,796 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 04:50:32 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2025-05-30 13:50:32,796 - httpx - INFO - HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2025-05-30 13:50:32,796 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:50:32,796 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:50:32,796 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:50:32,796 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:50:32,796 - httpcore.connection - DEBUG - close.started
2025-05-30 13:50:32,796 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:50:32,929 - httpcore.http11 - DEBUG - receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Fri, 30 May 2025 04:50:32 GMT'), (b'Content-Type', b'text/html; charset=utf-8'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'ContentType', b'application/json'), (b'Access-Control-Allow-Origin', b'*'), (b'Content-Encoding', b'gzip')])
2025-05-30 13:50:32,930 - httpx - INFO - HTTP Request: GET https://api.gradio.app/v3/tunnel-request "HTTP/1.1 200 OK"
2025-05-30 13:50:32,930 - httpcore.http11 - DEBUG - receive_response_body.started request=
2025-05-30 13:50:32,930 - httpcore.http11 - DEBUG - receive_response_body.complete
2025-05-30 13:50:32,930 - httpcore.http11 - DEBUG - response_closed.started
2025-05-30 13:50:32,930 - httpcore.http11 - DEBUG - response_closed.complete
2025-05-30 13:50:32,931 - httpcore.connection - DEBUG - close.started
2025-05-30
13:50:32,931 - httpcore.connection - DEBUG - close.complete
2025-05-30 13:50:33,501 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:50:33,501 - simple_memory_calculator - INFO - Using known memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:50:33,501 - simple_memory_calculator - DEBUG - Known data: {'params_billions': 12.0, 'fp16_gb': 24.0, 'inference_fp16_gb': 36.0}
2025-05-30 13:50:33,501 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 13:50:33,501 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:50:33,501 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:50:33,501 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 13:50:33,501 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:50:33,501 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:50:33,596 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): huggingface.co:443
2025-05-30 13:50:33,813 - urllib3.connectionpool - DEBUG - https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2025-05-30 13:50:37,564 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:50:37,564 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:50:37,564 - simple_memory_calculator - INFO - Generating memory recommendations for black-forest-labs/FLUX.1-schnell with 8.0GB VRAM
2025-05-30 13:50:37,564 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:50:37,565 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:50:37,565 - simple_memory_calculator - DEBUG - Model memory: 24.0GB, Inference memory: 36.0GB
2025-05-30 13:50:37,565 - simple_memory_calculator - INFO - Getting memory requirements for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:50:37,565 - simple_memory_calculator - DEBUG - Using cached memory data for black-forest-labs/FLUX.1-schnell
2025-05-30 13:50:37,565 - auto_diffusers - INFO - Starting code generation for model: black-forest-labs/FLUX.1-schnell
2025-05-30 13:50:37,565 - auto_diffusers - DEBUG - Parameters: prompt='A cat holding a sign that says hello world...', size=(768, 1360), steps=4
2025-05-30 13:50:37,565 - auto_diffusers - DEBUG - Manual specs: True, Memory analysis provided: True
2025-05-30 13:50:37,565 - auto_diffusers - INFO - Using manual hardware specifications
2025-05-30 13:50:37,566 - auto_diffusers - DEBUG - Manual specs: {'platform': 'Linux', 'architecture': 'manual_input', 'cpu_count': 8, 'python_version': '3.11', 'cuda_available': False, 'mps_available': False, 'torch_version': '2.0+', 'manual_input': True, 'ram_gb': 16, 'user_dtype': None, 'gpu_info': [{'name': 'Custom GPU', 'memory_mb': 8192}]}
2025-05-30 13:50:37,566 - auto_diffusers - DEBUG - GPU detected with 8.0 GB VRAM
2025-05-30 13:50:37,566 - auto_diffusers - INFO - Selected optimization profile: balanced
2025-05-30 13:50:37,566 - auto_diffusers - DEBUG - Creating generation prompt for Gemini API
2025-05-30 13:50:37,566 - auto_diffusers - DEBUG - Prompt length: 7598 characters
2025-05-30 13:50:37,566 - auto_diffusers - INFO - ================================================================================
2025-05-30 13:50:37,566 - auto_diffusers - INFO - PROMPT SENT TO GEMINI API:
2025-05-30 13:50:37,566 - auto_diffusers - INFO -
================================================================================
2025-05-30 13:50:37,566 - auto_diffusers - INFO - You are an expert in optimizing diffusers library code for different hardware configurations.

NOTE: This system includes curated optimization knowledge from HuggingFace documentation.

TASK: Generate optimized Python code for running a diffusion model with the following specifications:
- Model: black-forest-labs/FLUX.1-schnell
- Prompt: "A cat holding a sign that says hello world"
- Image size: 768x1360
- Inference steps: 4

HARDWARE SPECIFICATIONS:
- Platform: Linux (manual_input)
- CPU Cores: 8
- CUDA Available: False
- MPS Available: False
- Optimization Profile: balanced
- GPU: Custom GPU (8.0 GB VRAM)

MEMORY ANALYSIS:
- Model Memory Requirements: 36.0 GB (FP16 inference)
- Model Weights Size: 24.0 GB (FP16)
- Memory Recommendation: 🔄 Requires sequential CPU offloading
- Recommended Precision: float16
- Attention Slicing Recommended: True
- VAE Slicing Recommended: True

OPTIMIZATION KNOWLEDGE BASE:

# DIFFUSERS OPTIMIZATION TECHNIQUES

## Memory Optimization Techniques

### 1. Model CPU Offloading
Use `enable_model_cpu_offload()` to move models between GPU and CPU automatically:
```python
pipe.enable_model_cpu_offload()
```
- Saves significant VRAM by keeping only active models on GPU
- Automatic management, no manual intervention needed
- Compatible with all pipelines

### 2. Sequential CPU Offloading
Use `enable_sequential_cpu_offload()` for more aggressive memory saving:
```python
pipe.enable_sequential_cpu_offload()
```
- More memory efficient than model offloading
- Moves models to CPU after each forward pass
- Best for very limited VRAM scenarios

### 3. Attention Slicing
Use `enable_attention_slicing()` to reduce memory during attention computation:
```python
pipe.enable_attention_slicing()
# or specify slice size
pipe.enable_attention_slicing("max")  # maximum slicing
pipe.enable_attention_slicing(1)      # slice_size = 1
```
- Trades compute time for memory
- Most effective for high-resolution images
- Can be combined with other techniques

### 4. VAE Slicing
Use `enable_vae_slicing()` for large batch processing:
```python
pipe.enable_vae_slicing()
```
- Decodes images one at a time instead of all at once
- Essential for batch sizes > 4
- Minimal performance impact on single images

### 5. VAE Tiling
Use `enable_vae_tiling()` for high-resolution image generation:
```python
pipe.enable_vae_tiling()
```
- Enables 4K+ image generation on 8GB VRAM
- Splits images into overlapping tiles
- Automatically disabled for 512x512 or smaller images

### 6. Memory Efficient Attention (xFormers)
Use `enable_xformers_memory_efficient_attention()` if xFormers is installed:
```python
pipe.enable_xformers_memory_efficient_attention()
```
- Significantly reduces memory usage and improves speed
- Requires xformers library installation
- Compatible with most models

## Performance Optimization Techniques

### 1. Half Precision (FP16/BF16)
Use lower precision for better memory and speed:
```python
# FP16 (widely supported)
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
# BF16 (better numerical stability, newer hardware)
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
```
- FP16: Halves memory usage, widely supported
- BF16: Better numerical stability, requires newer GPUs
- Essential for most optimization scenarios

### 2. Torch Compile (PyTorch 2.0+)
Use `torch.compile()` for significant speed improvements:
```python
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
# For some models, compile VAE too:
pipe.vae.decode = torch.compile(pipe.vae.decode, mode="reduce-overhead", fullgraph=True)
```
- 5-50% speed improvement
- Requires PyTorch 2.0+
- First run is slower due to compilation

### 3. Fast Schedulers
Use faster schedulers for fewer steps:
```python
from diffusers import LMSDiscreteScheduler, UniPCMultistepScheduler
# LMS Scheduler (good quality, fast)
pipe.scheduler = LMSDiscreteScheduler.from_config(pipe.scheduler.config)
# UniPC Scheduler (fastest)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
```

## Hardware-Specific Optimizations

### NVIDIA GPU Optimizations
```python
# Enable Tensor Cores
torch.backends.cudnn.benchmark = True
# Optimal data type for NVIDIA
torch_dtype = torch.float16  # or torch.bfloat16 for RTX 30/40 series
```

### Apple Silicon (MPS) Optimizations
```python
# Use MPS device
device = "mps" if torch.backends.mps.is_available() else "cpu"
pipe = pipe.to(device)
# Recommended dtype for Apple Silicon
torch_dtype = torch.bfloat16  # Better than float16 on Apple Silicon
# Attention slicing often helps on MPS
pipe.enable_attention_slicing()
```

### CPU Optimizations
```python
# Use float32 for CPU
torch_dtype = torch.float32
# Enable optimized attention
pipe.enable_attention_slicing()
```

## Model-Specific Guidelines

### FLUX Models
- Do NOT use guidance_scale parameter (not needed for FLUX)
- Use 4-8 inference steps maximum
- BF16 dtype recommended
- Enable attention slicing for memory optimization

### Stable Diffusion XL
- Enable attention slicing for high resolutions
- Use refiner model sparingly to save memory
- Consider VAE tiling for >1024px images

### Stable Diffusion 1.5/2.1
- Very memory efficient base models
- Can often run without optimizations on 8GB+ VRAM
- Enable VAE slicing for batch processing
## Memory Usage Estimation
- FLUX.1: ~24GB for full precision, ~12GB for FP16
- SDXL: ~7GB for FP16, ~14GB for FP32
- SD 1.5: ~2GB for FP16, ~4GB for FP32

## Optimization Combinations by VRAM

### 24GB+ VRAM (High-end)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
pipe = pipe.to("cuda")
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
```

### 12-24GB VRAM (Mid-range)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to("cuda")
pipe.enable_model_cpu_offload()
pipe.enable_xformers_memory_efficient_attention()
```

### 8-12GB VRAM (Entry-level)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing()
pipe.enable_vae_slicing()
pipe.enable_xformers_memory_efficient_attention()
```

### <8GB VRAM (Low-end)
```python
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.enable_sequential_cpu_offload()
pipe.enable_attention_slicing("max")
pipe.enable_vae_slicing()
pipe.enable_vae_tiling()
```

IMPORTANT: For FLUX.1-schnell models, do NOT include guidance_scale parameter as it's not needed.

Using the OPTIMIZATION KNOWLEDGE BASE above, generate Python code that:
1. **Selects the best optimization techniques** for the specific hardware profile
2. **Applies appropriate memory optimizations** based on available VRAM
3. **Uses optimal data types** for the target hardware:
   - User specified dtype (if provided): Use exactly as specified
   - Apple Silicon (MPS): prefer torch.bfloat16
   - NVIDIA GPUs: prefer torch.float16 or torch.bfloat16
   - CPU only: use torch.float32
4. **Implements hardware-specific optimizations** (CUDA, MPS, CPU)
5. **Follows model-specific guidelines** (e.g., FLUX guidance_scale handling)

IMPORTANT GUIDELINES:
- Reference the OPTIMIZATION KNOWLEDGE BASE to select appropriate techniques
- Include all necessary imports
- Add brief comments explaining optimization choices
- Generate compact, production-ready code
- Inline values where possible for concise code
- Generate ONLY the Python code, no explanations before or after the code block

2025-05-30 13:50:37,567 - auto_diffusers - INFO - ================================================================================
2025-05-30 13:50:37,567 - auto_diffusers - INFO - Sending request to Gemini API
2025-05-30 13:50:52,223 - auto_diffusers - INFO - Successfully received response from Gemini API (no tools used)
2025-05-30 13:50:52,224 - auto_diffusers - DEBUG - Response length: 3046 characters
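By way of illustration, the selection logic the logged prompt asks Gemini to apply — estimate FP16 weight size from parameter count, then map available VRAM to one of the four optimization tiers — can be sketched in a few lines. This is a hypothetical reconstruction, not code from the application; the function names are invented, and the FP16 estimate (2 bytes per parameter) matches the calculator's figures for the 12B-parameter FLUX.1-schnell (24.0 GB).

```python
def fp16_weights_gb(params_billions: float) -> float:
    # FP16 stores 2 bytes per parameter, so GB ≈ params (in billions) × 2
    return params_billions * 2.0

def optimization_recipe(vram_gb: float) -> list[str]:
    """Map available VRAM to the optimization calls from the four tiers above."""
    if vram_gb >= 24:
        return ["to('cuda')", "torch.compile(unet)"]
    if vram_gb >= 12:
        return ["to('cuda')", "enable_model_cpu_offload",
                "enable_xformers_memory_efficient_attention"]
    if vram_gb >= 8:
        return ["enable_sequential_cpu_offload", "enable_attention_slicing",
                "enable_vae_slicing", "enable_xformers_memory_efficient_attention"]
    return ["enable_sequential_cpu_offload", "enable_attention_slicing('max')",
            "enable_vae_slicing", "enable_vae_tiling"]

print(fp16_weights_gb(12.0))        # FLUX.1 (12B params) → 24.0
print(optimization_recipe(8.0)[0])  # → enable_sequential_cpu_offload
```

For the 8.0 GB "Custom GPU" in this session, the entry-level tier applies, which is consistent with the memory recommendation ("requires sequential CPU offloading") the calculator logged.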