title: Marketing Image Generator with AI Review
emoji: 🎨
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.39.0
app_file: app.py
pinned: false
license: mit
short_description: AI marketing image generator with GCP Imagen4 + Gemini 2.5
Marketing Image Generator with Agent Review
An AI-powered system that generates marketing images and runs each one through an automated review-and-refine loop. It uses Google's Imagen4 for image generation and Gemini 2.5 Pro for quality review, coordinated by a custom agent orchestrator.
Features
- AI-Powered Image Generation: Create stunning marketing images from text prompts using Google's Imagen4 via MCP server
- Automated Quality Review: Intelligent Gemini agent automatically reviews and refines generated images
- Marketing-Focused: Optimized for marketing materials, social media, and promotional content
- Real-time Feedback: Get instant quality scores and improvement suggestions
- Professional Workflow: Streamlined process from concept to final image
- Download & Share: Easy export of generated images in multiple formats
Quick Start
Clone the repository
git clone <repository-url>
cd MarketingImageGenerator
Install dependencies
pip install -r requirements.txt
Set up Google Cloud authentication
# For Hugging Face deployment, set these as secrets:
# GOOGLE_API_KEY_1 through GOOGLE_API_KEY_6
# For local development, use a .env file
Run the Gradio app
python app.py
Access the web interface
http://localhost:7860
System Architecture
Core Components
- Agent 1 (Image Generator): Creates images using Google's Imagen4 via MCP server integration
- Agent 2 (Marketing Reviewer): Analyzes image quality and provides marketing-focused feedback using Gemini Vision
- Orchestrator: Manages workflow between agents and handles handover
- Web Interface: Gradio-based user interface optimized for Hugging Face
- MCP Server Integration: Model Context Protocol for seamless Imagen4 access
System Architecture and Workflow
┌─────────────┐    ┌─────────────┐    ┌─────────────────────────────┐
│ User        │    │ Gradio UI   │    │ AI Agents & Models          │
│             │    │             │    │                             │
│ Image Prompt│───▶│             │───▶│ Agent 1 (Gemini) Drafter    │
│             │    │             │    │                             │
│ Reviewer    │───▶│             │───▶│ Agent 2 (Gemini) Marketing  │
│ Prompt      │    │             │    │ Reviewer                    │
│             │    │             │    │                             │
│             │    │             │    │ ┌─────────────────────────┐ │
│             │    │             │    │ │ Imagen4 (via MCP)       │ │
│             │    │             │    │ │                         │ │
│             │    │             │    │ │ Draft Image Creation    │ │
│             │    │             │    │ └─────────────────────────┘ │
│             │    │             │    │                             │
│             │    │             │    │ ┌─────────────────────────┐ │
│             │    │             │    │ │ Draft Image Reviewed    │ │
│             │    │             │    │ │ & Changes Suggested     │ │
│             │    │             │    │ └─────────────────────────┘ │
│             │    │             │    │                             │
│ Image       │◀───│             │◀───│ Final Image Response        │
│ Response    │    │             │    │                             │
└─────────────┘    └─────────────┘    └─────────────────────────────┘
Detailed Workflow:
User Interaction (Left):
- User sends Image Prompt (textual description for desired marketing image)
- User sends Reviewer Prompt (instructions/criteria for marketing review)
- User receives final Image Response (generated and reviewed image)
Gradio UI (Center):
- Acts as central interface receiving prompts from user
- Forwards Image Prompt to Agent 1 (Gemini) Drafter
- Forwards Reviewer Prompt to Agent 2 (Gemini) Marketing Reviewer
- Receives final Image Response from Agent 2 and presents to user
Image Generation and Drafting (Top Right):
- Agent 1 (Gemini) Drafter: Receives Image Prompt, orchestrates image generation
- Imagen4 (via MCP): Agent 1 interacts with Imagen4 through MCP server to create initial image draft
Marketing Review and Refinement (Bottom Right):
- Agent 2 (Gemini) Marketing Reviewer: Receives Reviewer Prompt, evaluates generated image against marketing criteria
- Draft Image Reviewed and Changes Suggested: Agent 2's review process output
- Iterative Refinement Loop: Bidirectional feedback between Agent 2 and Imagen4 (via Agent 1) to refine image until it meets marketing standards
- Final Image Response sent back to Gradio UI
Summary of Flow:
User provides prompts → Gradio UI → Agent 1 drafts image with Imagen4 → Agent 2 reviews and suggests refinements → Iterative refinement loop → Final reviewed image → User receives result
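The flow above can be sketched as a small loop. This is only an illustration under stated assumptions, not the project's actual orchestrator: `run_workflow`, `Review`, and the `generate` and `review` callables are hypothetical names, and the 0-to-1 score scale with a default threshold of 0.8 is assumed rather than taken from the source.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Review:
    """Reviewer verdict: a score (assumed to lie in [0, 1]) plus suggestions."""
    quality_score: float
    suggestions: str = ""


def run_workflow(
    image_prompt: str,
    reviewer_prompt: str,
    generate: Callable[[str], bytes],
    review: Callable[[bytes, str], Review],
    quality_threshold: float = 0.8,  # assumed default
    max_iterations: int = 3,         # assumed default
) -> tuple[bytes, Review, int]:
    """Draft, review, and refine until the score passes or attempts run out."""
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    prompt = image_prompt
    for attempt in range(1, max_iterations + 1):
        image = generate(prompt)                  # Agent 1 drafts via Imagen4
        verdict = review(image, reviewer_prompt)  # Agent 2 reviews the draft
        if verdict.quality_score >= quality_threshold:
            break
        # Hand the reviewer's suggestions back to the drafter for the next attempt.
        prompt = f"{image_prompt}\nRevise to address: {verdict.suggestions}"
    return image, verdict, attempt


if __name__ == "__main__":
    # Stand-in agents so the sketch runs without any API access.
    scores = iter([0.5, 0.9])
    image, verdict, attempts = run_workflow(
        "Modern office with natural lighting",
        "Is the image on-brand and uncluttered?",
        generate=lambda prompt: prompt.encode("utf-8"),
        review=lambda img, criteria: Review(next(scores), "add more contrast"),
    )
    print(attempts, verdict.quality_score)
```

Passing the agents in as plain callables keeps the handover a direct function call, which makes the loop easy to exercise with stubs before wiring in real model clients.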
Technology Stack
- AI Models: Google Imagen4 (via MCP), Gemini 2.5 Pro Vision
- Framework: Gradio (Web Interface)
- Orchestration: Custom agent handover system
- Deployment: Hugging Face Spaces
- Authentication: Google Cloud API Keys
- Protocol: MCP (Model Context Protocol) for Imagen4 integration
Why A2A Was Not Applied
The system was designed with a custom handover mechanism instead of the A2A (Agent-to-Agent) protocol for the following reasons:
- Simplified Architecture: The current two-agent system (generator + reviewer) doesn't require the complexity of full A2A orchestration
- Direct Integration: MCP server provides direct access to Imagen4 without needing agent-to-agent communication protocols
- Performance Optimization: Direct handover between agents reduces latency and eliminates protocol overhead
- Deployment Simplicity: Hugging Face Spaces deployment is more straightforward without A2A dependencies
- Resource Efficiency: Fewer moving parts means better resource utilization in the cloud environment
The system maintains the benefits of multi-agent collaboration while using a more efficient, purpose-built handover system.
Usage
Web Interface (Gradio)
- Access the app on Hugging Face Spaces
- Enter your marketing image description in the prompt field
- Select your preferred art style (realistic, artistic, etc.)
- Configure quality threshold and advanced settings
- Click "Generate & Review Marketing Image"
- View the generated image with AI quality analysis and download
API Usage
import requests

# Generate an image
response = requests.post(
    "http://localhost:8000/generate",
    json={
        "prompt": "A modern office space with natural lighting",
        "style": "realistic",
        "enable_review": True,
    },
    timeout=120,  # image generation can be slow; adjust as needed
)
response.raise_for_status()  # fail early on HTTP errors

# Get the generated image and review results
result = response.json()
image_data = result["data"]["image"]["data"]
quality_score = result["data"]["review"]["quality_score"]
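The response fields above still need to be turned into a file. The sketch below assumes that `image_data` is a base64-encoded string, optionally wrapped as a data URL; the source does not state the encoding, so check the actual response before relying on this. `save_image` is a hypothetical helper name.

```python
import base64
from pathlib import Path


def save_image(image_data: str, path: str) -> int:
    """Decode a base64 string and write it to `path`; returns bytes written."""
    # Assumption: the API returns base64 text, possibly prefixed as a data URL
    # such as "data:image/png;base64,....".
    if image_data.startswith("data:") and "," in image_data:
        image_data = image_data.split(",", 1)[1]
    raw = base64.b64decode(image_data, validate=True)
    return Path(path).write_bytes(raw)
```

With the variables from the request example, a call would look like `save_image(image_data, "marketing_image.png")`.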
Configuration
Environment Variables
- GOOGLE_API_KEY_1 through GOOGLE_API_KEY_6: Your Google AI API keys (set as Hugging Face secrets)
- LOG_LEVEL: Logging level (DEBUG, INFO, WARNING, ERROR)
- PORT: Web server port (default: 8000)
- STREAMLIT_PORT: Streamlit port (default: 8501)
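As an illustration of how the numbered key variables might be read, here is a minimal sketch. The round-robin rotation is an assumption about why six keys exist; the source does not describe how the app uses them. `load_api_keys` and `key_cycle` are hypothetical names.

```python
import itertools
import os
from typing import Iterator, Mapping


def load_api_keys(env: Mapping[str, str] = os.environ, count: int = 6) -> list[str]:
    """Collect GOOGLE_API_KEY_1..GOOGLE_API_KEY_<count>, skipping unset or blank ones."""
    keys = [env.get(f"GOOGLE_API_KEY_{i}", "").strip() for i in range(1, count + 1)]
    keys = [key for key in keys if key]
    if not keys:
        raise RuntimeError("No GOOGLE_API_KEY_n variables are set")
    return keys


def key_cycle(keys: list[str]) -> Iterator[str]:
    """Round-robin over the keys; one possible way to spread quota usage."""
    return itertools.cycle(keys)
```

Accepting the environment as a mapping lets the same function work with Hugging Face secrets, a loaded .env file, or a plain dictionary in tests.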
Advanced Settings
- Quality Threshold: Minimum quality score for auto-approval
- Max Iterations: Maximum refinement attempts
- Review Settings: Customize review criteria
- MCP Configuration: Imagen4 server settings
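The first two settings could be grouped and validated as shown below. This is a sketch only: `ReviewSettings` is a hypothetical name, and both the defaults and the 0-to-1 score scale are assumptions that the source does not confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewSettings:
    quality_threshold: float = 0.8  # assumed default on an assumed 0-1 scale
    max_iterations: int = 3         # assumed default

    def __post_init__(self) -> None:
        # Reject values that would make the refinement loop meaningless.
        if not 0.0 <= self.quality_threshold <= 1.0:
            raise ValueError("quality_threshold must be between 0 and 1")
        if self.max_iterations < 1:
            raise ValueError("max_iterations must be at least 1")

    def auto_approves(self, quality_score: float) -> bool:
        """True when a score meets the minimum for auto-approval."""
        return quality_score >= self.quality_threshold
```

Validating at construction time surfaces a bad slider value in the UI immediately instead of partway through a generation run.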
Development
Project Structure
MarketingImageGenerator/
├── README.md           # Project documentation
├── app.py              # Main Gradio application
├── requirements.txt    # Python dependencies
├── agents/             # AI agents (if needed for local development)
├── tools/              # Utility tools (if needed)
├── tests/              # Test suite (if needed)
└── docs/               # Documentation (if needed)
Note: The Hugging Face Spaces deployment uses a simplified structure with just the essential files (README.md, app.py, requirements.txt) for optimal deployment performance.
Running Tests
# Run all tests
pytest
# Run specific test suite
pytest tests/test_image_generator.py
pytest tests/test_mcp_integration.py
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests for new functionality
- Submit a pull request
Deployment
Hugging Face Spaces
The application is deployed on Hugging Face Spaces with the following configuration:
- SDK: Gradio 5.39.0
- Python Version: 3.9+
- Secrets: Google API keys configured as HF secrets
- Auto-deploy: Enabled for main branch
Docker
# Build the image
docker build -t marketing-image-generator .
# Run the container
docker run -p 7860:7860 marketing-image-generator
Kubernetes
# Deploy to Kubernetes
kubectl apply -f k8s/
# Check deployment status
kubectl get pods -n marketing-image-generator
Monitoring
The system includes comprehensive monitoring:
- Health Checks: Automatic service health monitoring
- Metrics: Performance and usage metrics via Prometheus
- Logging: Structured logging for debugging
- Alerts: Automated alerting for issues
Access monitoring dashboards:
- Prometheus: http://localhost:9090
- Grafana: http://localhost:3000
Troubleshooting
Common Issues
- API Key Errors: Ensure your Google API keys are valid and configured as HF secrets
- Image Generation Fails: Check your internet connection and API quotas
- Review Not Working: Verify the Gemini agent is running and configured correctly
- MCP Connection Issues: Check Imagen4 server connectivity and configuration
Content Policy & Brand Restrictions
Google's AI models have built-in safety guardrails that may cause timeouts or rejections for certain content types:
🚫 Highly Restricted Content (Likely to cause stalls/timeouts):
- Political Figures: Named world leaders, politicians (e.g., "Putin", "Zelensky", "Biden")
- Political Buildings: Government buildings like "10 Downing Street", "White House"
- Geopolitical Content: War, conflict, or sensitive international relations
- Financial Institution Brands: Major banks like "HSBC", "Bank of America", "JPMorgan"
⚠️ Moderately Restricted Content (May cause delays):
- Regulated Industries: Healthcare, pharmaceutical, financial services
- Some Corporate Brands: Varies by sector and brand sensitivity
✅ Generally Permitted Content:
- Technology Brands: "Cognizant", "Microsoft", "IBM", "Accenture"
- Generic Business: "Professional office", "corporate environment"
- Non-branded Content: Generic descriptions without specific brand names
🔧 Workarounds for Restricted Content:
Instead of: "Professional boardroom with HSBC signage"
Use: "Professional boardroom with international banking corporation signage in red and white colors"
Instead of: "Meeting with political leaders"
Use: "Meeting with business executives in government-style building"
Strategy: Move brand-specific requirements to Review Guidelines instead of the main prompt:
- Main Prompt: "Professional corporate environment"
- Review Guidelines: "Ensure branding reflects HSBC corporate colors (red and white)"
This approach keeps brand names out of the generation prompt, where the content filters apply, while still giving the reviewer brand guidance.
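To catch likely stalls before a request is sent, a prompt can be checked against the example terms listed in this section. The sketch below is a hypothetical helper, not part of the project: the term list is copied from the examples above and is not an official or complete list of what Google's filters reject.

```python
import re

# Example terms from this section; extend with your own findings.
RESTRICTED_TERMS = {
    "political figure": ["Putin", "Zelensky", "Biden"],
    "political building": ["10 Downing Street", "White House"],
    "financial brand": ["HSBC", "Bank of America", "JPMorgan"],
}


def find_restricted_terms(prompt: str) -> list[tuple[str, str]]:
    """Return (category, term) pairs for listed terms that appear in the prompt."""
    hits = []
    for category, terms in RESTRICTED_TERMS.items():
        for term in terms:
            # Whole-word, case-insensitive match.
            if re.search(rf"\b{re.escape(term)}\b", prompt, re.IGNORECASE):
                hits.append((category, term))
    return hits
```

An empty result means none of the listed examples were found; it does not guarantee that the prompt will be accepted.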
Debug Mode
Enable debug logging by setting LOG_LEVEL=DEBUG in your environment variables.
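One way an application can honor this variable is sketched below using only the standard library. `configure_logging` is a hypothetical name, and the fallback to INFO for unrecognized values is an assumption rather than documented behavior of this app.

```python
import logging
import os


def configure_logging(default: str = "INFO") -> int:
    """Set the root log level from LOG_LEVEL, falling back to `default` if invalid."""
    name = os.environ.get("LOG_LEVEL", default).upper()
    level = getattr(logging, name, None)
    if not isinstance(level, int):
        # Unknown name such as "VERBOSE": use the default level instead.
        level = getattr(logging, default.upper())
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    # basicConfig does nothing if handlers already exist, so set the level explicitly.
    logging.getLogger().setLevel(level)
    return level
```

Calling `configure_logging()` once at startup, before any agents are created, ensures their loggers inherit the chosen level.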
Content Policy Testing
Use the included diagnostic scripts to test content restrictions:
- debug_hsbc_prompt.py: Test financial brand restrictions
- test_cognizant_brand.py: Test tech brand accessibility
- test_brand_workaround.py: Test workaround strategies
Support
For issues and questions:
- Check the documentation in /docs
- Review the troubleshooting guide
- Open an issue on GitHub
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Google AI for Imagen4 and Gemini 2.5 Pro technologies
- Hugging Face for the deployment platform
- Gradio for the web interface framework
- The open-source community for various dependencies