Update README.md
README.md
CHANGED
@@ -15,8 +15,8 @@ A sophisticated AI-powered image generation system that creates high-quality mar

 ## Features

-- **AI-Powered Image Generation**: Create stunning marketing images from text prompts using Google's Imagen3
-- **Automated Quality Review**: Intelligent
 - **Marketing-Focused**: Optimized for marketing materials, social media, and promotional content
 - **Real-time Feedback**: Get instant quality scores and improvement suggestions
 - **Professional Workflow**: Streamlined process from concept to final image
@@ -37,8 +37,9 @@ A sophisticated AI-powered image generation system that creates high-quality mar

 3. **Set up Google Cloud authentication**
 ```bash
-
-#
 ```

 4. **Run the Gradio app**
@@ -55,19 +56,85 @@ A sophisticated AI-powered image generation system that creates high-quality mar

 ### Core Components

-- **Image Generator
-- **
-- **Orchestrator**: Manages workflow between agents
 - **Web Interface**: Gradio-based user interface optimized for Hugging Face
-- **

 ### Technology Stack

-- **AI Models**: Google Imagen3, Gemini Vision
 - **Framework**: Gradio (Web Interface)
-- **Orchestration**:
 - **Deployment**: Hugging Face Spaces
-- **Authentication**: Google Cloud

 ## Usage

@@ -102,8 +169,7 @@ quality_score = result["data"]["review"]["quality_score"]

 ### Environment Variables

-- `
-- `IMAGEN3_API_KEY`: Imagen3 API key (if different)
 - `LOG_LEVEL`: Logging level (DEBUG, INFO, WARNING, ERROR)
 - `PORT`: Web server port (default: 8000)
 - `STREAMLIT_PORT`: Streamlit port (default: 8501)
@@ -113,6 +179,7 @@ quality_score = result["data"]["review"]["quality_score"]
 - **Quality Threshold**: Minimum quality score for auto-approval
 - **Max Iterations**: Maximum refinement attempts
 - **Review Settings**: Customize review criteria

 ## Development

@@ -120,18 +187,17 @@ quality_score = result["data"]["review"]["quality_score"]

 ```
 MarketingImageGenerator/
-├──
-
-
-
-├──
-├──
-
-├── tests/         # Test suite
-├── docs/          # Documentation
-└── deployment/    # Docker & K8s configs
 ```

 ### Running Tests

 ```bash
@@ -139,8 +205,8 @@ MarketingImageGenerator/
 pytest

 # Run specific test suite
-pytest tests/
-pytest tests/
 ```

 ### Contributing
@@ -153,6 +219,15 @@ pytest tests/test_review.py

 ## Deployment

 ### Docker

 ```bash
@@ -160,7 +235,7 @@ pytest tests/test_review.py
 docker build -t marketing-image-generator .

 # Run the container
-docker run -p
 ```

 ### Kubernetes
@@ -178,7 +253,7 @@ kubectl get pods -n marketing-image-generator
 The system includes comprehensive monitoring:

 - **Health Checks**: Automatic service health monitoring
-- **Metrics**: Performance and usage metrics
 - **Logging**: Structured logging for debugging
 - **Alerts**: Automated alerting for issues

@@ -190,9 +265,10 @@ Access monitoring dashboards:

 ### Common Issues

-1. **API Key Errors**: Ensure your Google API
 2. **Image Generation Fails**: Check your internet connection and API quotas
-3. **Review Not Working**: Verify the

 ### Debug Mode

@@ -212,6 +288,6 @@ This project is licensed under the MIT License - see the LICENSE file for detail
 ## Acknowledgments

 - Google AI for Imagen3 and Gemini technologies
--
--
-- The open-source community for various dependencies
@@ -15,8 +15,8 @@ A sophisticated AI-powered image generation system that creates high-quality mar

 ## Features

+- **AI-Powered Image Generation**: Create stunning marketing images from text prompts using Google's Imagen3 via MCP server
+- **Automated Quality Review**: Intelligent Gemini agent automatically reviews and refines generated images
 - **Marketing-Focused**: Optimized for marketing materials, social media, and promotional content
 - **Real-time Feedback**: Get instant quality scores and improvement suggestions
 - **Professional Workflow**: Streamlined process from concept to final image
@@ -37,8 +37,9 @@ A sophisticated AI-powered image generation system that creates high-quality mar

 3. **Set up Google Cloud authentication**
 ```bash
+# For Hugging Face deployment, set these as secrets:
+# GOOGLE_API_KEY_1 through GOOGLE_API_KEY_6
+# For local development, use .env file
 ```

 4. **Run the Gradio app**
@@ -55,19 +56,85 @@ A sophisticated AI-powered image generation system that creates high-quality mar

 ### Core Components

+- **Agent 1 (Image Generator)**: Creates images using Google's Imagen3 via MCP server integration
+- **Agent 2 (Marketing Reviewer)**: Analyzes image quality and provides marketing-focused feedback using Gemini Vision
+- **Orchestrator**: Manages workflow between agents and handles handover
 - **Web Interface**: Gradio-based user interface optimized for Hugging Face
+- **MCP Server Integration**: Model Context Protocol for seamless Imagen3 access
+
+### System Architecture and Workflow
+
+```
+┌─────────────┐    ┌─────────────┐    ┌─────────────────────────────┐
+│    User     │    │  Gradio UI  │    │     AI Agents & Models      │
+│             │    │             │    │                             │
+│ Image Prompt│───▶│             │───▶│ Agent 1 (Gemini) Drafter    │
+│             │    │             │    │                             │
+│Reviewer     │───▶│             │───▶│ Agent 2 (Gemini) Marketing  │
+│Prompt       │    │             │    │ Reviewer                    │
+│             │    │             │    │                             │
+│             │    │             │    │ ┌─────────────────────────┐ │
+│             │    │             │    │ │ Ag1: Imagen3 (via MCP)  │ │
+│             │    │             │    │ │                         │ │
+│             │    │             │    │ │ Draft Image Creation    │ │
+│             │    │             │    │ └─────────────────────────┘ │
+│             │    │             │    │                             │
+│             │    │             │    │ ┌─────────────────────────┐ │
+│             │    │             │    │ │Ag2;Draft Image Reviewed │ │
+│             │    │             │    │ │ & Changes Suggested     │ │
+│             │    │             │    │ └─────────────────────────┘ │
+│             │    │             │    │                             │
+│ Image       │◀───│             │◀───│ Final Image Response        │
+│ Response    │    │             │    │                             │
+└─────────────┘    └─────────────┘    └─────────────────────────────┘
+```
+
+### Detailed Workflow:
+
+1. **User Interaction (Left)**:
+   - User sends **Image Prompt** (textual description for desired marketing image)
+   - User sends **Reviewer Prompt** (instructions/criteria for marketing review)
+   - User receives final **Image Response** (generated and reviewed image)
+
+2. **Gradio UI (Center)**:
+   - Acts as central interface receiving prompts from user
+   - Forwards **Image Prompt** to **Agent 1 (Gemini) Drafter**
+   - Forwards **Reviewer Prompt** to **Agent 2 (Gemini) Marketing Reviewer**
+   - Receives final **Image Response** from Agent 2 and presents to user
+
+3. **Image Generation and Drafting (Top Right)**:
+   - **Agent 1 (Gemini) Drafter**: Receives Image Prompt, orchestrates image generation
+   - **Imagen3 (via MCP)**: Agent 1 interacts with Imagen3 through MCP server to create initial image draft
+
+4. **Marketing Review and Refinement (Bottom Right)**:
+   - **Agent 2 (Gemini) Marketing Reviewer**: Receives Reviewer Prompt, evaluates generated image against marketing criteria
+   - **Draft Image Reviewed and Changes Suggested**: Agent 2's review process output
+   - **Iterative Refinement Loop**: Bidirectional feedback between Agent 2 and Imagen3 (via Agent 1) to refine image until it meets marketing standards
+   - Final **Image Response** sent back to Gradio UI
+
+### Summary of Flow:
+User provides prompts → Gradio UI → Agent 1 drafts image with Imagen3 → Agent 2 reviews and suggests refinements → Iterative refinement loop → Final reviewed image → User receives result

 ### Technology Stack

+- **AI Models**: Google Imagen3 (via MCP), Gemini Vision
 - **Framework**: Gradio (Web Interface)
+- **Orchestration**: Custom agent handover system
 - **Deployment**: Hugging Face Spaces
+- **Authentication**: Google Cloud API Keys
+- **Protocol**: MCP (Model Context Protocol) for Imagen3 integration
+
+### Why A2A Was Not Applied
+
+The system was designed with a **custom handover mechanism** instead of the A2A (Agent-to-Agent) protocol for the following reasons:
+
+1. **Simplified Architecture**: The current two-agent system (generator + reviewer) doesn't require the complexity of full A2A orchestration
+2. **Direct Integration**: MCP server provides direct access to Imagen3 without needing agent-to-agent communication protocols
+3. **Performance Optimization**: Direct handover between agents reduces latency and eliminates protocol overhead
+4. **Deployment Simplicity**: Hugging Face Spaces deployment is more straightforward without A2A dependencies
+5. **Resource Efficiency**: Fewer moving parts means better resource utilization in the cloud environment
+
+The system maintains the benefits of multi-agent collaboration while using a more efficient, purpose-built handover system.

 ## Usage

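The review loop this hunk describes (Agent 1 drafts, Agent 2 reviews, repeat until the image meets the marketing standard) could look roughly like the Python sketch below. It is an illustration under stated assumptions, not the project's orchestrator: the `Review` type, `refine_until_approved`, the `generate` and `review` callables, the 0.0 to 1.0 score scale, and both default values are hypothetical. The README only names a quality threshold and a maximum number of refinement attempts.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Review:
    """Hypothetical result returned by the reviewer agent."""
    quality_score: float      # assumed scale: 0.0 (worst) to 1.0 (best)
    suggestions: List[str]    # changes the reviewer wants applied


def refine_until_approved(
    image_prompt: str,
    reviewer_prompt: str,
    generate: Callable[[str], bytes],
    review: Callable[[bytes, str], Review],
    quality_threshold: float = 0.8,   # assumed default, not from the README
    max_iterations: int = 3,          # assumed default, not from the README
) -> Tuple[bytes, Review, int]:
    """Draft an image, review it, and retry with the reviewer's suggestions."""
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    prompt = image_prompt
    for attempt in range(1, max_iterations + 1):
        image = generate(prompt)                  # Agent 1: draft the image
        result = review(image, reviewer_prompt)   # Agent 2: marketing review
        if result.quality_score >= quality_threshold:
            break
        # Direct handover: feed the suggestions back into the next draft prompt.
        prompt = image_prompt + "\nRevise: " + "; ".join(result.suggestions)
    return image, result, attempt
```

Passing the two agents in as plain callables keeps the handover a direct function call, which is one way to read the "custom handover mechanism" described above. The loop returns the last draft even when no attempt reaches the threshold, so a caller would still need to check `result.quality_score`.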
@@ -102,8 +169,7 @@ quality_score = result["data"]["review"]["quality_score"]

 ### Environment Variables

+- `GOOGLE_API_KEY_1` through `GOOGLE_API_KEY_6`: Your Google AI API keys (set as Hugging Face secrets)
 - `LOG_LEVEL`: Logging level (DEBUG, INFO, WARNING, ERROR)
 - `PORT`: Web server port (default: 8000)
 - `STREAMLIT_PORT`: Streamlit port (default: 8501)
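For the numbered `GOOGLE_API_KEY_1` through `GOOGLE_API_KEY_6` variables, one plausible way to read them at startup is sketched below. The helper names `load_google_api_keys` and `key_rotation` are hypothetical, and the round-robin rotation is an assumption: the README does not say how the app chooses among the six keys.

```python
import os
from itertools import cycle
from typing import Iterator, List, Mapping


def load_google_api_keys(env: Mapping[str, str] = os.environ, count: int = 6) -> List[str]:
    """Collect GOOGLE_API_KEY_1 .. GOOGLE_API_KEY_<count>, skipping unset or empty ones."""
    keys = []
    for index in range(1, count + 1):
        value = env.get(f"GOOGLE_API_KEY_{index}", "").strip()
        if value:
            keys.append(value)
    if not keys:
        raise RuntimeError(f"No GOOGLE_API_KEY_1..GOOGLE_API_KEY_{count} variables are set")
    return keys


def key_rotation(keys: List[str]) -> Iterator[str]:
    """Round-robin over the keys (an assumed strategy, not stated in the README)."""
    return cycle(keys)
```

Failing fast when no key is set surfaces the "API Key Errors" case at startup rather than on the first image request.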
@@ -113,6 +179,7 @@ quality_score = result["data"]["review"]["quality_score"]
 - **Quality Threshold**: Minimum quality score for auto-approval
 - **Max Iterations**: Maximum refinement attempts
 - **Review Settings**: Customize review criteria
+- **MCP Configuration**: Imagen3 server settings

 ## Development

@@ -120,18 +187,17 @@ quality_score = result["data"]["review"]["quality_score"]

 ```
 MarketingImageGenerator/
+├── README.md           # Project documentation
+├── app.py              # Main Gradio application
+├── requirements.txt    # Python dependencies
+├── agents/             # AI agents (if needed for local development)
+├── tools/              # Utility tools (if needed)
+├── tests/              # Test suite (if needed)
+└── docs/               # Documentation (if needed)
 ```

+**Note**: The Hugging Face Spaces deployment uses a simplified structure with just the essential files (`README.md`, `app.py`, `requirements.txt`) for optimal deployment performance.
+
 ### Running Tests

 ```bash
@@ -139,8 +205,8 @@ MarketingImageGenerator/
 pytest

 # Run specific test suite
+pytest tests/test_image_generator.py
+pytest tests/test_mcp_integration.py
 ```

 ### Contributing
@@ -153,6 +219,15 @@ pytest tests/test_review.py

 ## Deployment

+### Hugging Face Spaces
+
+The application is deployed on Hugging Face Spaces with the following configuration:
+
+- **SDK**: Gradio 5.38.2
+- **Python Version**: 3.9+
+- **Secrets**: Google API keys configured as HF secrets
+- **Auto-deploy**: Enabled for main branch
+
 ### Docker

 ```bash
@@ -160,7 +235,7 @@ pytest tests/test_review.py
 docker build -t marketing-image-generator .

 # Run the container
+docker run -p 7860:7860 marketing-image-generator
 ```

 ### Kubernetes
@@ -178,7 +253,7 @@ kubectl get pods -n marketing-image-generator
 The system includes comprehensive monitoring:

 - **Health Checks**: Automatic service health monitoring
+- **Metrics**: Performance and usage metrics via Prometheus
 - **Logging**: Structured logging for debugging
 - **Alerts**: Automated alerting for issues

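The "structured logging" item can be illustrated with Python's standard `logging` module alone. This sketch is not taken from the repository: `JsonFormatter`, `configure_logging`, and the logger name are hypothetical, and JSON output is an assumed format. The only detail drawn from the README is the `LOG_LEVEL` environment variable.

```python
import json
import logging
import os


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def configure_logging(name: str = "marketing_image_generator") -> logging.Logger:
    """Create a logger whose level comes from LOG_LEVEL (falls back to INFO)."""
    level_name = os.environ.get("LOG_LEVEL", "INFO").upper()
    level = getattr(logging, level_name, logging.INFO)
    if not isinstance(level, int):   # guard against names that are not levels
        level = logging.INFO
    logger = logging.getLogger(name)
    logger.setLevel(level)
    if not logger.handlers:          # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    return logger
```

One JSON object per line keeps the output easy to filter by level or logger name in whatever log viewer the deployment provides.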
@@ -190,9 +265,10 @@ Access monitoring dashboards:

 ### Common Issues

+1. **API Key Errors**: Ensure your Google API keys are valid and configured as HF secrets
 2. **Image Generation Fails**: Check your internet connection and API quotas
+3. **Review Not Working**: Verify the Gemini agent is running and configured correctly
+4. **MCP Connection Issues**: Check Imagen3 server connectivity and configuration

 ### Debug Mode

@@ -212,6 +288,6 @@ This project is licensed under the MIT License - see the LICENSE file for detail
 ## Acknowledgments

 - Google AI for Imagen3 and Gemini technologies
+- Hugging Face for the deployment platform
+- Gradio for the web interface framework
+- The open-source community for various dependencies