---
title: Agentic HF Analyzer
emoji: π
colorFrom: yellow
colorTo: green
sdk: gradio
sdk_version: 5.32.1
app_file: app.py
pinned: false
short_description: Recommends users which Repos/Spaces to look at
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# HF Repo Analyzer
An AI-powered Hugging Face repository discovery and analysis tool that helps you find, evaluate, and explore the best repositories for your specific needs.



## Features
- **AI Assistant**: Intelligent conversation-based repository discovery
- **Smart Search**: Auto-detection of repository IDs vs. keywords
- **Automated Analysis**: LLM-powered repository evaluation and ranking
- **Top 3 Selection**: AI-curated most relevant repositories
- **Repository Explorer**: Interactive chat with repository contents
- **Requirements Extraction**: Automatic keyword extraction from conversations
- **Comprehensive Results**: Detailed analysis with strengths, weaknesses, and specialities
## Quick Start
### Prerequisites
- Python 3.8+
- OpenAI API key (for LLM analysis)
- Hugging Face access (for repository downloads)
### Installation
1. **Clone the repository**
```bash
git clone <repository-url>
cd Agentic_HF_Analyzer
```
2. **Install dependencies**
```bash
pip install -r requirements.txt
```
3. **Set up environment variables**
```bash
export modal_api="your_openai_api_key"
export base_url="your_openai_base_url"
```
4. **Run the application**
```bash
python app.py
```
5. **Open your browser** to `http://localhost:7860`
## User Guide
### Using the AI Assistant (Recommended)
1. **Start a Conversation**
   - Navigate to the "AI Assistant" tab
   - Describe your project: "I'm building a chatbot for customer service"
   - The AI will ask clarifying questions about your needs
2. **Automatic Discovery**
   - When the AI has enough information, it will automatically:
     - Extract relevant keywords from your conversation
     - Search for matching repositories
     - Analyze and rank them by relevance
3. **Review Results**
   - The interface automatically switches to "Analysis & Results"
   - View the top 3 most relevant repositories
   - Browse all analyzed repositories with detailed insights
### Using Smart Search (Direct Input)
1. **Repository IDs**
```
microsoft/DialoGPT-medium
openai/whisper
huggingface/transformers
```
2. **Keywords**
```
text generation
image classification
sentiment analysis
```
3. **Mixed Input**
   - The system automatically detects the input type
   - Repository IDs (containing `/`) are processed directly
   - Keywords trigger automatic repository search
### Analyzing Results
- **Top 3 Repositories**: AI-selected most relevant based on your requirements
- **Detailed Analysis**: Strengths, weaknesses, specialities, and relevance ratings
- **Quick Actions**: Click repository names to visit or explore them
- **Repository Explorer**: Deep dive into individual repositories with AI chat
### Repository Explorer
1. **Access Methods**:
   - Click "Open in Repo Explorer" from repository actions
   - Manually enter repository ID in the Repo Explorer tab
2. **Features**:
   - Automatic repository loading and analysis
   - Interactive chat about repository contents
   - File structure exploration
   - Code analysis and explanations
## Technical Architecture
### Core Components
```
app.py                  # Main Gradio interface and orchestration
├── analyzer.py         # Repository analysis and LLM processing
├── hf_utils.py         # Hugging Face API interactions
├── chatbot_page.py     # AI assistant conversation logic
└── repo_explorer.py    # Repository exploration interface
```
### Key Features Implementation
#### AI Assistant
- **System Prompt**: Focused on requirements gathering, not recommendations
- **Auto-Extraction**: Detects conversation readiness for keyword extraction
- **Smart Processing**: Converts natural language to actionable search queries
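#### Smart Input Detection
```python
import re

def is_repo_id_format(text: str) -> bool:
    """Detect whether the input is mostly repository IDs (with /) or keywords."""
    # Split on newlines and commas, dropping empty entries
    lines = [line.strip() for line in re.split(r'[\n,]+', text) if line.strip()]
    # Count entries that contain a slash, as in "owner/name"
    slash_count = sum(1 for line in lines if '/' in line)
    # Treat the input as repository IDs when at least half of the entries have a slash
    return slash_count >= len(lines) * 0.5
```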
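For mixed input, the same slash heuristic can also be applied entry by entry. The sketch below is only an illustration: `partition_entries` is a hypothetical helper, not a function from this project, and it assumes entries are separated by newlines or commas as in the detection logic above.

```python
import re
from typing import List, Tuple

def partition_entries(text: str) -> Tuple[List[str], List[str]]:
    """Split raw input into (repo_ids, keywords) using the '/' heuristic.

    Hypothetical helper for illustration; not part of the project code.
    """
    # Same tokenisation as the detection step: newlines and commas separate entries
    entries = [entry.strip() for entry in re.split(r'[\n,]+', text) if entry.strip()]
    repo_ids = [entry for entry in entries if '/' in entry]
    keywords = [entry for entry in entries if '/' not in entry]
    return repo_ids, keywords

repo_ids, keywords = partition_entries(
    "microsoft/DialoGPT-medium, sentiment analysis\ntext generation"
)
print(repo_ids)   # entries that contain a slash
print(keywords)   # everything else
```

Splitting per entry avoids the all-or-nothing decision, at the cost of misclassifying any keyword that happens to contain a slash.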
#### LLM-Powered Repository Ranking
- **Model**: `Orion-zhen/Qwen2.5-Coder-7B-Instruct-AWQ`
- **Criteria**: Requirements matching, strengths, relevance rating, speciality alignment
- **Output**: JSON-formatted repository rankings
#### Analysis Pipeline
1. **Download**: Repository files (`.py`, `.md`, `.txt`)
2. **Combine**: Merge files into single analyzable document
3. **Analyze**: LLM evaluation for strengths, weaknesses, specialities
4. **Rank**: User requirement-based relevance scoring
5. **Select**: Top 3 most relevant repositories
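
The combine step (step 2) could look roughly like the sketch below. It assumes the files have already been downloaded into a local directory; `combine_repo_files` and the `--- path ---` separator format are illustrative choices, not the project's actual implementation.

```python
from pathlib import Path

def combine_repo_files(local_dir, extensions=(".py", ".md", ".txt")):
    """Concatenate all matching files under local_dir into one string.

    Hypothetical sketch: each file is prefixed with a header line so the
    LLM can tell where one file ends and the next begins.
    """
    parts = []
    for path in sorted(Path(local_dir).rglob("*")):
        if path.is_file() and path.suffix.lower() in extensions:
            # errors="replace" keeps the pipeline going on badly encoded files
            text = path.read_text(encoding="utf-8", errors="replace")
            header = f"--- {path.relative_to(local_dir).as_posix()} ---"
            parts.append(f"{header}\n{text}")
    return "\n\n".join(parts)
```

A call such as `combine_repo_files("repo_files")` would then produce the single document that is passed on to the analysis step.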
### Data Flow
```mermaid
graph TD
A[User Input] --> B{Input Type?}
B -->|Keywords| C[Repository Search]
B -->|Repo IDs| D[Direct Processing]
C --> E[Repository List]
D --> E
E --> F[Download & Analyze]
F --> G[LLM Evaluation]
G --> H[Ranking & Selection]
H --> I[Results Display]
I --> J[Repository Explorer]
```
### File Structure
```
Agentic_HF_Analyzer/
├── app.py              # Main application
├── analyzer.py         # Repository analysis logic
├── hf_utils.py         # Hugging Face utilities
├── chatbot_page.py     # AI assistant functionality
├── repo_explorer.py    # Repository exploration
├── requirements.txt    # Python dependencies
├── README.md           # Documentation
├── repo_ids.csv        # Analysis results storage
└── repo_files/         # Temporary repository downloads
```
### Dependencies
```
gradio>=4.0.0 # Web interface framework
pandas>=1.5.0 # Data manipulation
regex>=2022.0.0 # Advanced regex operations
openai>=1.0.0 # LLM API access
huggingface_hub>=0.16.0 # HF repository access
requests>=2.28.0 # HTTP requests
```
### Environment Variables
| Variable | Description | Required |
|----------|-------------|----------|
| `modal_api` | OpenAI API key for LLM analysis | ✅ |
| `base_url` | OpenAI API base URL | ✅ |
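
Since both variables are required, it can help to fail early with a clear message when one is missing. The following is a minimal sketch under that assumption; `load_llm_config` is a hypothetical helper and not taken from the project code.

```python
import os

def load_llm_config(env=os.environ):
    """Read the two required variables and fail early if either is missing.

    Hypothetical helper for illustration; the variable names come from the
    table above.
    """
    missing = [name for name in ("modal_api", "base_url") if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    # These keys match the api_key/base_url keyword arguments of the
    # openai>=1.0 client, so the dict could be passed as OpenAI(**config)
    return {"api_key": env["modal_api"], "base_url": env["base_url"]}
```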
### LLM Integration
#### Analysis Prompt Structure
```python
ANALYSIS_PROMPT = """
Analyze this repository and provide:
1. Strengths and capabilities
2. Potential weaknesses or limitations
3. Primary speciality/use case
4. Relevance rating for: {user_requirements}
Return valid JSON with: strength, weaknesses, speciality, relevance rating
"""
```
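LLM replies often wrap the requested JSON in extra prose or a code fence, so some defensive parsing is usually needed. The sketch below shows one possible approach; `parse_analysis` is a hypothetical helper, and the empty-string fallback is an assumption rather than the project's actual behaviour.

```python
import json
import re

# Keys requested by the analysis prompt
ANALYSIS_KEYS = ("strength", "weaknesses", "speciality", "relevance rating")

def parse_analysis(reply: str) -> dict:
    """Extract the JSON object from an LLM reply, with an empty fallback.

    Hypothetical sketch: takes everything from the first '{' to the last '}'
    and returns empty strings for all keys if that span is not valid JSON.
    """
    fallback = {key: "" for key in ANALYSIS_KEYS}
    match = re.search(r"\{.*\}", reply, re.DOTALL)
    if not match:
        return fallback
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return fallback
    if not isinstance(data, dict):
        return fallback
    # Keep only the expected keys; missing ones default to ""
    return {key: data.get(key, "") for key in ANALYSIS_KEYS}
```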
#### Repository Ranking System
- **Input**: User requirements + repository analysis data
- **Processing**: LLM evaluates relevance and ranks repositories
- **Output**: Top 3 most relevant repositories in order
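
Because the ranking comes from an LLM, the returned list may contain unknown or duplicate repository IDs. The sketch below shows one way to validate it. It assumes the model returns a JSON object with a `repos` list; that key name, the `select_top_repos` helper, and the padding fallback are all hypothetical and not confirmed by the project code.

```python
import json
from typing import List

def select_top_repos(llm_output: str, candidates: List[str], k: int = 3) -> List[str]:
    """Return up to k valid, de-duplicated repo IDs in the LLM's order.

    Hypothetical sketch: if the output cannot be parsed or lists too few
    valid IDs, the result is padded with candidates in their original order.
    """
    try:
        ranked = json.loads(llm_output).get("repos", [])
    except (json.JSONDecodeError, AttributeError):
        ranked = []
    if not isinstance(ranked, list):
        ranked = []
    top = []
    for repo_id in ranked:
        # Ignore IDs the model invented and IDs it repeated
        if repo_id in candidates and repo_id not in top:
            top.append(repo_id)
    for repo_id in candidates:
        if len(top) >= k:
            break
        if repo_id not in top:
            top.append(repo_id)
    return top[:k]
```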
### UI Components
#### Modern Design Features
- **Gradient Backgrounds**: Linear gradients for visual appeal
- **Glassmorphism**: Backdrop blur effects for modern look
- **Responsive Layout**: Adaptive to different screen sizes
- **Interactive Elements**: Hover effects and smooth transitions
- **Modal System**: Repository action selection popups
#### Tab Organization
1. **AI Assistant**: Conversation-based discovery
2. **Smart Search**: Direct input processing
3. **Analysis & Results**: Comprehensive analysis display
4. **Repo Explorer**: Interactive repository exploration
### Advanced Features
#### Auto-Navigation
- Automatic tab switching based on workflow state
- Smooth scrolling to top on tab changes
- Progressive disclosure of information
#### Error Handling
- Graceful fallbacks for LLM failures
- CSV update retry mechanisms
- User-friendly error messages
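
A retry mechanism for CSV updates might be structured as in the sketch below. The helper names, the number of attempts, and the linear back-off are assumptions made for illustration; the project's own retry logic may differ.

```python
import csv
import time

def with_retries(operation, attempts=3, delay=0.5, exceptions=(OSError,)):
    """Call operation() until it succeeds or the attempts are used up.

    Hypothetical helper: waits a little longer after each failure and
    re-raises the last error if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except exceptions:
            if attempt == attempts:
                raise
            time.sleep(delay * attempt)

def append_row(csv_path, row):
    """Append one row to a CSV file (illustrative write operation)."""
    with open(csv_path, "a", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerow(row)
```

A write could then be wrapped as `with_retries(lambda: append_row("repo_ids.csv", row))`, so that a temporarily locked file does not abort the whole analysis run.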
#### Performance Optimizations
- Parallel processing for multiple repositories
- Progress tracking for long operations
- Efficient file caching and cleanup
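
Parallel processing with progress tracking can be sketched with a thread pool, as below. This assumes the per-repository work is I/O-bound (downloads and API calls); `analyze_all`, `analyze_one`, and `on_progress` are hypothetical names, not the project's actual functions.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def analyze_all(repo_ids, analyze_one, max_workers=4, on_progress=None):
    """Run analyze_one(repo_id) for every repository in a thread pool.

    Hypothetical sketch: a failure in one repository is recorded under an
    "error" key instead of stopping the remaining analyses.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(analyze_one, repo_id): repo_id for repo_id in repo_ids}
        for done, future in enumerate(as_completed(futures), start=1):
            repo_id = futures[future]
            try:
                results[repo_id] = future.result()
            except Exception as error:
                results[repo_id] = {"error": str(error)}
            if on_progress:
                # Report how many repositories have finished so far
                on_progress(done, len(futures))
    return results
```

The `on_progress` callback is where a UI progress bar could be updated as each repository completes.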
## Configuration
### Customizing Analysis
- Modify `CHATBOT_SYSTEM_PROMPT` for different assistant behavior
- Adjust repository search limits in `search_top_spaces()`
- Configure analysis criteria in `get_top_relevant_repos()`
### Adding File Types
```python
# In analyzer.py
download_filtered_space_files(
    repo_id,
    local_dir="repo_files",
    file_extensions=['.py', '.md', '.txt', '.js', '.ts']  # Add more
)
```
## Contributing
1. Fork the repository
2. Create a feature branch
3. Implement your changes
4. Add tests if applicable
5. Submit a pull request
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments
- **Gradio**: For the amazing web interface framework
- **Hugging Face**: For the incredible repository ecosystem
- **OpenAI**: For powerful language model capabilities
---
<div align="center">
<p>Built with ❤️ for the open source community</p>
<p>Happy repository hunting!</p>
</div>