Space: CognizantAI/marketing-image-generator

Noo88ear committed 17 days ago

Commit

96e4f5d

verified ·

1 Parent(s): 22a36c5

Update README.md

Files changed (1)

README.md +108 -32

@@ -15,8 +15,8 @@ A sophisticated AI-powered image generation system that creates high-quality mar
 ## Features
-- **AI-Powered Image Generation**: Create stunning marketing images from text prompts using Google's Imagen3
-- **Automated Quality Review**: Intelligent agents automatically review and refine generated images
 - **Marketing-Focused**: Optimized for marketing materials, social media, and promotional content
 - **Real-time Feedback**: Get instant quality scores and improvement suggestions
 - **Professional Workflow**: Streamlined process from concept to final image
@@ -37,8 +37,9 @@ A sophisticated AI-powered image generation system that creates high-quality mar
 3. **Set up Google Cloud authentication**
    ```bash
-   export GOOGLE_SERVICE_ACCOUNT_JSON='{"type":"service_account",...}'
-   # Or set GOOGLE_API_KEY for Google AI Studio
    ```
 4. **Run the Gradio app**
@@ -55,19 +56,85 @@ A sophisticated AI-powered image generation system that creates high-quality mar
 ### Core Components
-- **Image Generator Agent**: Creates images using Google's Imagen3
-- **Review Agent**: Analyzes image quality and provides feedback
-- **Orchestrator**: Manages workflow between agents
 - **Web Interface**: Gradio-based user interface optimized for Hugging Face
-- **Agent Integration**: Direct A2A protocol communication between agents
 ### Technology Stack
-- **AI Models**: Google Imagen3, Gemini Vision
 - **Framework**: Gradio (Web Interface)
-- **Orchestration**: Integrated A2A agent protocol
 - **Deployment**: Hugging Face Spaces
-- **Authentication**: Google Cloud Service Account
 ## Usage
@@ -102,8 +169,7 @@ quality_score = result["data"]["review"]["quality_score"]
 ### Environment Variables
-- `GOOGLE_API_KEY`: Your Google AI API key
-- `IMAGEN3_API_KEY`: Imagen3 API key (if different)
 - `LOG_LEVEL`: Logging level (DEBUG, INFO, WARNING, ERROR)
 - `PORT`: Web server port (default: 8000)
 - `STREAMLIT_PORT`: Streamlit port (default: 8501)
@@ -113,6 +179,7 @@ quality_score = result["data"]["review"]["quality_score"]
 - **Quality Threshold**: Minimum quality score for auto-approval
 - **Max Iterations**: Maximum refinement attempts
 - **Review Settings**: Customize review criteria
 ## Development
@@ -120,18 +187,17 @@ quality_score = result["data"]["review"]["quality_score"]
 ```
 MarketingImageGenerator/
-├── agents/                 # AI agents
-│   ├── generator/         # Image generation agent
-│   ├── reviewer/          # Quality review agent
-│   └── orchestrator/      # Workflow orchestration
-├── api/                   # FastAPI endpoints
-├── web/                   # Streamlit interface
-├── tools/                 # Utility tools
-├── tests/                 # Test suite
-├── docs/                  # Documentation
-└── deployment/            # Docker & K8s configs
 ```
 ### Running Tests
 ```bash
@@ -139,8 +205,8 @@ MarketingImageGenerator/
 pytest
 # Run specific test suite
-pytest tests/test_generation.py
-pytest tests/test_review.py
 ```
 ### Contributing
@@ -153,6 +219,15 @@ pytest tests/test_review.py
 ## Deployment
 ### Docker
 ```bash
@@ -160,7 +235,7 @@ pytest tests/test_review.py
 docker build -t marketing-image-generator .
 # Run the container
-docker run -p 8000:8000 -p 8501:8501 marketing-image-generator
 ```
 ### Kubernetes
@@ -178,7 +253,7 @@ kubectl get pods -n marketing-image-generator
 The system includes comprehensive monitoring:
 - **Health Checks**: Automatic service health monitoring
-- **Metrics**: Performance and usage metrics
 - **Logging**: Structured logging for debugging
 - **Alerts**: Automated alerting for issues
@@ -190,9 +265,10 @@ Access monitoring dashboards:
 ### Common Issues
-1. **API Key Errors**: Ensure your Google API key is valid and has the necessary permissions
 2. **Image Generation Fails**: Check your internet connection and API quotas
-3. **Review Not Working**: Verify the review agent is running and configured correctly
 ### Debug Mode
@@ -212,6 +288,6 @@ This project is licensed under the MIT License - see the LICENSE file for detail
 ## Acknowledgments
 - Google AI for Imagen3 and Gemini technologies
-- Streamlit for the web interface framework
-- FastAPI for the API framework
-- The open-source community for various dependencies

 ## Features
+- **AI-Powered Image Generation**: Create stunning marketing images from text prompts using Google's Imagen3 via MCP server
+- **Automated Quality Review**: Intelligent Gemini agent automatically reviews and refines generated images
 - **Marketing-Focused**: Optimized for marketing materials, social media, and promotional content
 - **Real-time Feedback**: Get instant quality scores and improvement suggestions
 - **Professional Workflow**: Streamlined process from concept to final image
 3. **Set up Google Cloud authentication**
    ```bash
+   # For Hugging Face deployment, set these as secrets:
+   # GOOGLE_API_KEY_1 through GOOGLE_API_KEY_6
+   # For local development, use .env file
    ```
 4. **Run the Gradio app**
 ### Core Components
+- **Agent 1 (Image Generator)**: Creates images using Google's Imagen3 via MCP server integration
+- **Agent 2 (Marketing Reviewer)**: Analyzes image quality and provides marketing-focused feedback using Gemini Vision
+- **Orchestrator**: Manages workflow between agents and handles handover
 - **Web Interface**: Gradio-based user interface optimized for Hugging Face
+- **MCP Server Integration**: Model Context Protocol for seamless Imagen3 access
+### System Architecture and Workflow
+```
+┌─────────────┐    ┌─────────────┐    ┌─────────────────────────────┐
+│    User     │    │  Gradio UI  │    │      AI Agents & Models     │
+│             │    │             │    │                             │
+│ Image Prompt│───▶│             │───▶│  Agent 1 (Gemini) Drafter   │
+│             │    │             │    │                             │
+│Reviewer     │───▶│             │───▶│  Agent 2 (Gemini) Marketing │
+│Prompt       │    │             │    │  Reviewer                   │
+│             │    │             │    │                             │
+│             │    │             │    │  ┌─────────────────────────┐│
+│             │    │             │    │  │ Ag1: Imagen3 (via MCP)  ││
+│             │    │             │    │  │                         ││
+│             │    │             │    │  │  Draft Image Creation   ││
+│             │    │             │    │  └─────────────────────────┘│
+│             │    │             │    │                             │
+│             │    │             │    │  ┌─────────────────────────┐│
+│             │    │             │    │  │Ag2: Draft Image Reviewed││
+│             │    │             │    │  │  & Changes Suggested    ││
+│             │    │             │    │  └─────────────────────────┘│
+│             │    │             │    │                             │
+│ Image       │◀───│             │◀───│  Final Image Response       │
+│ Response    │    │             │    │                             │
+└─────────────┘    └─────────────┘    └─────────────────────────────┘
+```
+### Detailed Workflow
+1. **User Interaction (Left)**:
+   - User sends **Image Prompt** (textual description for desired marketing image)
+   - User sends **Reviewer Prompt** (instructions/criteria for marketing review)
+   - User receives final **Image Response** (generated and reviewed image)
+2. **Gradio UI (Center)**:
+   - Acts as the central interface that receives both prompts from the user
+   - Forwards the **Image Prompt** to **Agent 1 (Gemini) Drafter**
+   - Forwards the **Reviewer Prompt** to **Agent 2 (Gemini) Marketing Reviewer**
+   - Receives the final **Image Response** from Agent 2 and presents it to the user
+3. **Image Generation and Drafting (Top Right)**:
+   - **Agent 1 (Gemini) Drafter**: Receives Image Prompt, orchestrates image generation
+   - **Imagen3 (via MCP)**: Agent 1 interacts with Imagen3 through MCP server to create initial image draft
+4. **Marketing Review and Refinement (Bottom Right)**:
+   - **Agent 2 (Gemini) Marketing Reviewer**: Receives the Reviewer Prompt and evaluates the generated image against the marketing criteria
+   - **Draft Image Reviewed and Changes Suggested**: The output of Agent 2's review, consisting of the reviewed draft and the suggested changes
+   - **Iterative Refinement Loop**: Agent 2's feedback goes back to Imagen3 (via Agent 1), and the image is regenerated until it meets the marketing standards or the Max Iterations limit is reached
+   - The final **Image Response** is sent back to the Gradio UI
+### Summary of Flow
+User provides prompts → Gradio UI → Agent 1 drafts image with Imagen3 → Agent 2 reviews and suggests refinements → Iterative refinement loop → Final reviewed image → User receives result
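The flow summarized above can be sketched in Python as a small loop. This is only an illustration under stated assumptions, not the code in `app.py`: the names `run_workflow`, `Review`, and `WorkflowResult`, the 0.0 to 1.0 score scale, and the default threshold and iteration values are all hypothetical. The `generate` and `review` callables stand in for the Imagen3 (via MCP) and Gemini calls, which are not shown in this README.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Review:
    quality_score: float  # assumed scale: 0.0 (poor) to 1.0 (excellent)
    feedback: str         # reviewer's suggested changes


@dataclass
class WorkflowResult:
    image: bytes
    review: Review
    iterations: int
    approved: bool


def run_workflow(
    image_prompt: str,
    reviewer_prompt: str,
    generate: Callable[[str], bytes],            # stand-in for Agent 1 + Imagen3
    review: Callable[[bytes, str], Review],      # stand-in for Agent 2 (Gemini)
    quality_threshold: float = 0.8,              # hypothetical default
    max_iterations: int = 3,                     # hypothetical default
) -> WorkflowResult:
    """Generate, review, and regenerate until approved or out of attempts."""
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")

    prompt = image_prompt
    for attempt in range(1, max_iterations + 1):
        image = generate(prompt)
        verdict = review(image, reviewer_prompt)
        if verdict.quality_score >= quality_threshold:
            return WorkflowResult(image, verdict, attempt, approved=True)
        # Handover: feed the reviewer's suggestions into the next draft.
        prompt = f"{image_prompt}\n\nReviewer feedback to address: {verdict.feedback}"

    # Attempts exhausted: return the last draft, flagged as not approved.
    return WorkflowResult(image, verdict, max_iterations, approved=False)


# Toy stand-ins so the sketch runs without any external service.
def fake_generate(prompt: str) -> bytes:
    return prompt.encode("utf-8")


def fake_review(image: bytes, criteria: str) -> Review:
    addressed = b"Reviewer feedback" in image
    return Review(0.9 if addressed else 0.5, "increase contrast on the logo")


result = run_workflow("Summer sale banner", "Check brand fit", fake_generate, fake_review)
print(result.approved, result.iterations)
```

Passing the agents in as plain callables keeps the handover in-process, which matches the direct, protocol-free handover the README describes; the real agents would replace the two toy functions.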
 ### Technology Stack
+- **AI Models**: Google Imagen3 (via MCP), Gemini Vision
 - **Framework**: Gradio (Web Interface)
+- **Orchestration**: Custom agent handover system
 - **Deployment**: Hugging Face Spaces
+- **Authentication**: Google Cloud API Keys
+- **Protocol**: MCP (Model Context Protocol) for Imagen3 integration
+### Why A2A Was Not Applied
+The system was designed with a **custom handover mechanism** instead of the A2A (Agent-to-Agent) protocol for the following reasons:
+1. **Simplified Architecture**: The current two-agent system (generator + reviewer) doesn't require the complexity of full A2A orchestration
+2. **Direct Integration**: MCP server provides direct access to Imagen3 without needing agent-to-agent communication protocols
+3. **Performance Optimization**: Direct handover between agents reduces latency and eliminates protocol overhead
+4. **Deployment Simplicity**: Hugging Face Spaces deployment is more straightforward without A2A dependencies
+5. **Resource Efficiency**: Fewer moving parts means better resource utilization in the cloud environment
+The system maintains the benefits of multi-agent collaboration while using a more efficient, purpose-built handover system.
 ## Usage
 ### Environment Variables
+- `GOOGLE_API_KEY_1` through `GOOGLE_API_KEY_6`: Your Google AI API keys (set as Hugging Face secrets)
 - `LOG_LEVEL`: Logging level (DEBUG, INFO, WARNING, ERROR)
 - `PORT`: Web server port (default: 8000)
 - `STREAMLIT_PORT`: Streamlit port (default: 8501)
 - **Quality Threshold**: Minimum quality score for auto-approval
 - **Max Iterations**: Maximum refinement attempts
 - **Review Settings**: Customize review criteria
+- **MCP Configuration**: Imagen3 server settings
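As a rough illustration of reading the environment variables listed above, the sketch below collects whichever of `GOOGLE_API_KEY_1` through `GOOGLE_API_KEY_6` are set and resolves `LOG_LEVEL`. The helper names `load_api_keys` and `resolve_log_level` are hypothetical, and the README does not say how the application uses the six keys, so the sketch only gathers them and makes no assumption about rotation or fallback.

```python
import logging
import os
from typing import List, Mapping, Optional

# Variable names taken from the README; everything else here is illustrative.
KEY_NAMES = [f"GOOGLE_API_KEY_{i}" for i in range(1, 7)]


def load_api_keys(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return all non-empty Google API keys found in the environment."""
    env = os.environ if env is None else env
    keys = [env[name].strip() for name in KEY_NAMES if env.get(name, "").strip()]
    if not keys:
        raise RuntimeError(
            "No Google API key found; set GOOGLE_API_KEY_1 (up to GOOGLE_API_KEY_6) "
            "as Hugging Face secrets or in a local .env file."
        )
    return keys


def resolve_log_level(env: Optional[Mapping[str, str]] = None) -> int:
    """Map LOG_LEVEL (DEBUG, INFO, WARNING, ERROR) to a logging constant."""
    env = os.environ if env is None else env
    name = env.get("LOG_LEVEL", "INFO").upper()
    level = getattr(logging, name, None)
    # Fall back to INFO when the value is not a known level name.
    return level if isinstance(level, int) else logging.INFO


# Example with an explicit mapping instead of the real environment.
example_env = {"GOOGLE_API_KEY_2": " abc ", "LOG_LEVEL": "debug"}
print(len(load_api_keys(example_env)), resolve_log_level(example_env))
```

Accepting an explicit mapping makes the helpers easy to exercise in tests without touching real secrets.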
 ## Development
 ```
 MarketingImageGenerator/
+├── README.md              # Project documentation
+├── app.py                 # Main Gradio application
+├── requirements.txt       # Python dependencies
+├── agents/                # AI agents (if needed for local development)
+├── tools/                 # Utility tools (if needed)
+├── tests/                 # Test suite (if needed)
+└── docs/                  # Documentation (if needed)
 ```
+**Note**: The Hugging Face Spaces deployment ships only the essential files (`README.md`, `app.py`, `requirements.txt`); the remaining directories are optional and intended for local development.
 ### Running Tests
 ```bash
 pytest
 # Run specific test suite
+pytest tests/test_image_generator.py
+pytest tests/test_mcp_integration.py
 ```
 ### Contributing
 ## Deployment
+### Hugging Face Spaces
+The application is deployed on Hugging Face Spaces with the following configuration:
+- **SDK**: Gradio 5.38.2
+- **Python Version**: 3.10+ (required by Gradio 5.x)
+- **Secrets**: Google API keys configured as HF secrets
+- **Auto-deploy**: Enabled for main branch
 ### Docker
 ```bash
 docker build -t marketing-image-generator .
 # Run the container
+docker run -p 7860:7860 marketing-image-generator
 ```
 ### Kubernetes
 The system includes comprehensive monitoring:
 - **Health Checks**: Automatic service health monitoring
+- **Metrics**: Performance and usage metrics via Prometheus
 - **Logging**: Structured logging for debugging
 - **Alerts**: Automated alerting for issues
 ### Common Issues
+1. **API Key Errors**: Ensure your Google API keys are valid and configured as HF secrets
 2. **Image Generation Fails**: Check your internet connection and API quotas
+3. **Review Not Working**: Verify the Gemini agent is running and configured correctly
+4. **MCP Connection Issues**: Check Imagen3 server connectivity and configuration
 ### Debug Mode
 ## Acknowledgments
 - Google AI for Imagen3 and Gemini technologies
+- Hugging Face for the deployment platform
+- Gradio for the web interface framework
+- The open-source community for various dependencies