title: Marketing Image Generator with AI Review
emoji: 🎨
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.39.0
app_file: app.py
pinned: false
license: mit
short_description: AI marketing image generator with GCP Imagen4 + Gemini 2.5
Marketing Image Generator with Agent Review
An AI image generation system that creates marketing images from text prompts, then automatically reviews and refines them. It uses Google's Imagen4 for generation and Gemini 2.5 Pro for review, coordinated by a custom two-agent orchestrator.
Features
- AI-Powered Image Generation: Create stunning marketing images from text prompts using Google's Imagen4 via MCP server
- Automated Quality Review: Intelligent Gemini agent automatically reviews and refines generated images
- Marketing-Focused: Optimized for marketing materials, social media, and promotional content
- Real-time Feedback: Get instant quality scores and improvement suggestions
- Professional Workflow: Streamlined process from concept to final image
- Download & Share: Easy export of generated images in multiple formats
Quick Start
- Clone the repository: git clone <repository-url>, then cd MarketingImageGenerator
- Install dependencies: pip install -r requirements.txt
- Set up Google Cloud authentication: for Hugging Face deployment, set GOOGLE_API_KEY_1 through GOOGLE_API_KEY_6 as secrets; for local development, use a .env file
- Run the Gradio app: python app.py
- Access the web interface: http://localhost:7860
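As a minimal sketch of how the GOOGLE_API_KEY_1 through GOOGLE_API_KEY_6 values from the authentication step could be read at startup, the helper below collects whichever of the six variables are set. The function name `collect_google_api_keys` is hypothetical and not taken from app.py; how the app actually uses multiple keys is not described in this README.

```python
import os
from typing import List, Mapping, Optional


def collect_google_api_keys(
    env: Optional[Mapping[str, str]] = None, count: int = 6
) -> List[str]:
    """Return the non-empty values of GOOGLE_API_KEY_1..GOOGLE_API_KEY_<count>.

    Hypothetical helper: the variable names come from this README,
    everything else is an assumption for illustration.
    """
    env = os.environ if env is None else env
    keys = []
    for index in range(1, count + 1):
        value = env.get(f"GOOGLE_API_KEY_{index}", "").strip()
        if value:  # skip unset or blank entries
            keys.append(value)
    return keys


if __name__ == "__main__":
    found = collect_google_api_keys()
    print(f"Found {len(found)} of 6 Google API keys")
```

Running the file directly before starting the app gives a quick check that the secrets or the .env values are visible to the process.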
System Architecture
Core Components
- Agent 1 (Image Generator): Creates images using Google's Imagen4 via MCP server integration
- Agent 2 (Marketing Reviewer): Analyzes image quality and provides marketing-focused feedback using Gemini Vision
- Orchestrator: Manages workflow between agents and handles handover
- Web Interface: Gradio-based user interface optimized for Hugging Face
- MCP Server Integration: Model Context Protocol for seamless Imagen4 access
System Architecture and Workflow
┌─────────────┐    ┌─────────────┐    ┌─────────────────────────────┐
│    User     │    │  Gradio UI  │    │      AI Agents & Models     │
│             │    │             │    │                             │
│ Image Prompt│───▶│             │───▶│  Agent 1 (Gemini) Drafter   │
│             │    │             │    │                             │
│Reviewer     │───▶│             │───▶│  Agent 2 (Gemini) Marketing │
│Prompt       │    │             │    │  Reviewer                   │
│             │    │             │    │                             │
│             │    │             │    │ ┌─────────────────────────┐ │
│             │    │             │    │ │   Imagen4 (via MCP)     │ │
│             │    │             │    │ │                         │ │
│             │    │             │    │ │  Draft Image Creation   │ │
│             │    │             │    │ └─────────────────────────┘ │
│             │    │             │    │                             │
│             │    │             │    │ ┌─────────────────────────┐ │
│             │    │             │    │ │  Draft Image Reviewed   │ │
│             │    │             │    │ │  & Changes Suggested    │ │
│             │    │             │    │ └─────────────────────────┘ │
│             │    │             │    │                             │
│ Image       │◀───│             │◀───│  Final Image Response       │
│ Response    │    │             │    │                             │
└─────────────┘    └─────────────┘    └─────────────────────────────┘
Detailed Workflow:
- User Interaction (Left):
  - User sends Image Prompt (textual description of the desired marketing image)
  - User sends Reviewer Prompt (instructions/criteria for marketing review)
  - User receives final Image Response (generated and reviewed image)
- Gradio UI (Center):
  - Acts as the central interface receiving prompts from the user
  - Forwards Image Prompt to Agent 1 (Gemini) Drafter
  - Forwards Reviewer Prompt to Agent 2 (Gemini) Marketing Reviewer
  - Receives final Image Response from Agent 2 and presents it to the user
- Image Generation and Drafting (Top Right):
  - Agent 1 (Gemini) Drafter: Receives Image Prompt, orchestrates image generation
  - Imagen4 (via MCP): Agent 1 interacts with Imagen4 through the MCP server to create the initial image draft
- Marketing Review and Refinement (Bottom Right):
  - Agent 2 (Gemini) Marketing Reviewer: Receives Reviewer Prompt, evaluates the generated image against marketing criteria
  - Draft Image Reviewed and Changes Suggested: Output of Agent 2's review process
  - Iterative Refinement Loop: Bidirectional feedback between Agent 2 and Imagen4 (via Agent 1) to refine the image until it meets marketing standards
  - Final Image Response is sent back to the Gradio UI
Summary of Flow:
User provides prompts → Gradio UI → Agent 1 drafts image with Imagen4 → Agent 2 reviews and suggests refinements → Iterative refinement loop → Final reviewed image → User receives result
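The draft, review, and refine flow can be sketched as a small loop. This is an illustration under stated assumptions, not the orchestrator in app.py: the names `Review` and `generate_with_review`, the 0.0 to 1.0 score scale, the default threshold of 0.8, and the way feedback is appended to the prompt are all hypothetical. The two agents are passed in as plain callables so the sketch runs without any Google API access.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Review:
    quality_score: float  # assumed scale: 0.0 (worst) to 1.0 (best)
    feedback: str         # reviewer's suggested changes


def generate_with_review(
    image_prompt: str,
    reviewer_prompt: str,
    draft: Callable[[str], bytes],             # stands in for Agent 1 + Imagen4
    review: Callable[[bytes, str], Review],    # stands in for Agent 2
    quality_threshold: float = 0.8,
    max_iterations: int = 3,
) -> Tuple[bytes, Review, int]:
    """Draft an image, review it, and re-draft until the score clears the threshold.

    Returns the best-scoring image, its review, and the number of iterations used.
    """
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")

    prompt = image_prompt
    best_image: Optional[bytes] = None
    best_review: Optional[Review] = None

    for iteration in range(1, max_iterations + 1):
        image = draft(prompt)
        result = review(image, reviewer_prompt)

        # Keep the best draft seen so far in case no draft passes.
        if best_review is None or result.quality_score > best_review.quality_score:
            best_image, best_review = image, result

        if result.quality_score >= quality_threshold:
            break  # auto-approve

        # Hand the reviewer's feedback back to the drafter for the next attempt.
        prompt = f"{image_prompt}\nReviewer feedback to address: {result.feedback}"

    return best_image, best_review, iteration
```

Passing the agents in as callables keeps the handover a direct function call, which matches the custom handover approach described in this README, and makes the loop easy to test with stubs.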
Technology Stack
- AI Models: Google Imagen4 (via MCP), Gemini 2.5 Pro Vision
- Framework: Gradio (Web Interface)
- Orchestration: Custom agent handover system
- Deployment: Hugging Face Spaces
- Authentication: Google Cloud API Keys
- Protocol: MCP (Model Context Protocol) for Imagen4 integration
Why A2A Was Not Applied
The system was designed with a custom handover mechanism instead of the A2A (Agent-to-Agent) protocol for the following reasons:
- Simplified Architecture: The current two-agent system (generator + reviewer) doesn't require the complexity of full A2A orchestration
- Direct Integration: MCP server provides direct access to Imagen4 without needing agent-to-agent communication protocols
- Performance Optimization: Direct handover between agents reduces latency and eliminates protocol overhead
- Deployment Simplicity: Hugging Face Spaces deployment is more straightforward without A2A dependencies
- Resource Efficiency: Fewer moving parts means better resource utilization in the cloud environment
The system maintains the benefits of multi-agent collaboration while using a more efficient, purpose-built handover system.
Usage
Web Interface (Gradio)
- Access the app on Hugging Face Spaces
- Enter your marketing image description in the prompt field
- Select your preferred art style (realistic, artistic, etc.)
- Configure quality threshold and advanced settings
- Click "Generate & Review Marketing Image"
- View the generated image with AI quality analysis and download
API Usage
import requests

# Generate an image (assumes the API server is running locally on port 8000)
response = requests.post(
    "http://localhost:8000/generate",
    json={
        "prompt": "A modern office space with natural lighting",
        "style": "realistic",
        "enable_review": True,
    },
    timeout=120,  # generation plus review can take a while
)
response.raise_for_status()  # fail early on HTTP errors

# Get the generated image and review results
result = response.json()
image_data = result["data"]["image"]["data"]
quality_score = result["data"]["review"]["quality_score"]
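To write the returned image to disk, something like the following could work. It is a sketch that assumes `result["data"]["image"]["data"]` is a base64-encoded string, optionally wrapped in a data URL; this README does not state the encoding, so check the actual response before relying on it. The helper name `save_image_from_result` is hypothetical.

```python
import base64
from pathlib import Path


def save_image_from_result(result: dict, path: str) -> int:
    """Decode the image payload and write it to `path`. Returns bytes written.

    Assumption: the payload is base64 text, possibly prefixed with a
    data URL header such as "data:image/png;base64,".
    """
    encoded = result["data"]["image"]["data"]
    if encoded.startswith("data:") and "," in encoded:
        encoded = encoded.split(",", 1)[1]  # drop the data URL header
    raw = base64.b64decode(encoded, validate=True)  # raises on non-base64 input
    Path(path).write_bytes(raw)
    return len(raw)
```

With the response from the example above, usage would be `save_image_from_result(result, "office.png")`.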
Configuration
Environment Variables
- GOOGLE_API_KEY_1 through GOOGLE_API_KEY_6: Your Google AI API keys (set as Hugging Face secrets)
- LOG_LEVEL: Logging level (DEBUG, INFO, WARNING, ERROR)
- PORT: Web server port (default: 8000, as used in the API Usage example; the Gradio interface in Quick Start is served on 7860)
- STREAMLIT_PORT: Streamlit port (default: 8501)
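A small sketch of reading the port variables with the defaults listed above. The helper name `read_port` and the range check are assumptions for illustration; only the variable names and default values come from this README.

```python
import os
from typing import Mapping, Optional

# Defaults as documented in this README.
DEFAULT_PORTS = {"PORT": 8000, "STREAMLIT_PORT": 8501}


def read_port(name: str, env: Optional[Mapping[str, str]] = None) -> int:
    """Return the port configured in `name`, or its documented default."""
    env = os.environ if env is None else env
    raw = env.get(name, "").strip()
    if not raw:
        return DEFAULT_PORTS[name]
    port = int(raw)  # raises ValueError for non-numeric values
    if not 1 <= port <= 65535:
        raise ValueError(f"{name}={port} is outside the valid range 1-65535")
    return port
```

Failing at startup on a malformed value is usually easier to diagnose than a bind error later on.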
Advanced Settings
- Quality Threshold: Minimum quality score for auto-approval
- Max Iterations: Maximum refinement attempts
- Review Settings: Customize review criteria
- MCP Configuration: Imagen4 server settings
Development
Project Structure
MarketingImageGenerator/
├── README.md              # Project documentation
├── app.py                 # Main Gradio application
├── requirements.txt       # Python dependencies
├── agents/                # AI agents (if needed for local development)
├── tools/                 # Utility tools (if needed)
├── tests/                 # Test suite (if needed)
└── docs/                  # Documentation (if needed)
Note: The Hugging Face Spaces deployment uses a simplified structure with just the essential files (README.md, app.py, requirements.txt) for optimal deployment performance.
Running Tests
# Run all tests
pytest
# Run specific test suite
pytest tests/test_image_generator.py
pytest tests/test_mcp_integration.py
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests for new functionality
- Submit a pull request
Deployment
Hugging Face Spaces
The application is deployed on Hugging Face Spaces with the following configuration:
- SDK: Gradio 5.39.0 (matching sdk_version in the Space metadata)
- Python Version: 3.9+
- Secrets: Google API keys configured as HF secrets
- Auto-deploy: Enabled for main branch
Docker
# Build the image
docker build -t marketing-image-generator .
# Run the container
docker run -p 7860:7860 marketing-image-generator
Kubernetes
# Deploy to Kubernetes
kubectl apply -f k8s/
# Check deployment status
kubectl get pods -n marketing-image-generator
Monitoring
The system includes comprehensive monitoring:
- Health Checks: Automatic service health monitoring
- Metrics: Performance and usage metrics via Prometheus
- Logging: Structured logging for debugging
- Alerts: Automated alerting for issues
Access monitoring dashboards:
- Prometheus: http://localhost:9090
- Grafana: http://localhost:3000
Troubleshooting
Common Issues
- API Key Errors: Ensure your Google API keys are valid and configured as HF secrets
- Image Generation Fails: Check your internet connection and API quotas
- Review Not Working: Verify the Gemini agent is running and configured correctly
- MCP Connection Issues: Check Imagen4 server connectivity and configuration
Content Policy & Brand Restrictions
Google's AI models have built-in safety guardrails that may cause timeouts or rejections for certain content types:
🚫 Highly Restricted Content (Likely to cause stalls/timeouts):
- Political Figures: Named world leaders, politicians (e.g., "Putin", "Zelensky", "Biden")
- Political Buildings: Government buildings like "10 Downing Street", "White House"
- Geopolitical Content: War, conflict, or sensitive international relations
- Financial Institution Brands: Major banks like "HSBC", "Bank of America", "JPMorgan"
⚠️ Moderately Restricted Content (May cause delays):
- Regulated Industries: Healthcare, pharmaceutical, financial services
- Some Corporate Brands: Varies by sector and brand sensitivity
✅ Generally Permitted Content:
- Technology Brands: "Cognizant", "Microsoft", "IBM", "Accenture"
- Generic Business: "Professional office", "corporate environment"
- Non-branded Content: Generic descriptions without specific brand names
🔧 Workarounds for Restricted Content:
Instead of: "Professional boardroom with HSBC signage"
Use: "Professional boardroom with international banking corporation signage in red and white colors"
Instead of: "Meeting with political leaders"
Use: "Meeting with business executives in government-style building"
Strategy: Move brand-specific requirements to Review Guidelines instead of the main prompt:
- Main Prompt: "Professional corporate environment"
- Review Guidelines: "Ensure branding reflects HSBC corporate colors (red and white)"
This approach keeps brand names out of the generation prompt, so the request is less likely to stall on content filters, while the reviewer still receives brand guidance.
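A pre-flight check can flag likely problem terms before a request is sent, so the prompt can be rephrased or the brand detail moved to Review Guidelines. The sketch below is illustrative only: the term list is copied from the examples in this section and is not an official or complete list of what Google's models restrict, and the function name `find_restricted_terms` is hypothetical.

```python
import re
from typing import List, Tuple

# Illustrative terms taken from the examples in this README; not exhaustive.
RESTRICTED_TERMS = {
    "political figure": ["Putin", "Zelensky", "Biden"],
    "political building": ["10 Downing Street", "White House"],
    "financial brand": ["HSBC", "Bank of America", "JPMorgan"],
}


def find_restricted_terms(prompt: str) -> List[Tuple[str, str]]:
    """Return (category, term) pairs for each listed term found in the prompt."""
    hits = []
    for category, terms in RESTRICTED_TERMS.items():
        for term in terms:
            # Whole-word, case-insensitive match.
            if re.search(rf"\b{re.escape(term)}\b", prompt, flags=re.IGNORECASE):
                hits.append((category, term))
    return hits


if __name__ == "__main__":
    for category, term in find_restricted_terms(
        "Professional boardroom with HSBC signage"
    ):
        print(f"'{term}' ({category}): consider moving it to Review Guidelines")
```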
Debug Mode
Enable debug logging by setting LOG_LEVEL=DEBUG in your environment variables.
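One way LOG_LEVEL could be applied with the standard `logging` module is sketched below. The function name `configure_logging`, the INFO fallback, and the log format are assumptions; app.py may configure logging differently.

```python
import logging
import os
from typing import Optional

# The four levels documented under Environment Variables.
_LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
}


def configure_logging(level_name: Optional[str] = None) -> int:
    """Configure the root logger from LOG_LEVEL and return the numeric level.

    Falls back to INFO when LOG_LEVEL is unset (an assumed default).
    """
    name = (level_name or os.environ.get("LOG_LEVEL", "INFO")).strip().upper()
    if name not in _LEVELS:
        raise ValueError(f"Unsupported LOG_LEVEL: {name!r}")
    logging.basicConfig(
        level=_LEVELS[name],
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        force=True,  # replace any handlers installed earlier (Python 3.8+)
    )
    return _LEVELS[name]
```

Calling `configure_logging()` once at startup, before the agents are created, makes `LOG_LEVEL=DEBUG python app.py` take effect for every module that uses `logging.getLogger(__name__)`.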
Content Policy Testing
Use the included diagnostic scripts to test content restrictions:
- debug_hsbc_prompt.py: Test financial brand restrictions
- test_cognizant_brand.py: Test tech brand accessibility
- test_brand_workaround.py: Test workaround strategies
Support
For issues and questions:
- Check the documentation in /docs
- Review the troubleshooting guide
- Open an issue on GitHub
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Google AI for Imagen4 and Gemini 2.5 Pro technologies
- Hugging Face for the deployment platform
- Gradio for the web interface framework
- The open-source community for various dependencies

