metadata
title: Agentic HF Analyzer
emoji: π
colorFrom: yellow
colorTo: green
sdk: gradio
sdk_version: 5.32.1
app_file: app.py
pinned: false
short_description: Recommends users which Repos/Spaces to look at
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
HF Repo Analyzer
An AI-powered Hugging Face repository discovery and analysis tool that helps you find, evaluate, and explore the best repositories for your specific needs.
Features
- AI Assistant: Intelligent conversation-based repository discovery
- Smart Search: Auto-detection of repository IDs vs. keywords
- Automated Analysis: LLM-powered repository evaluation and ranking
- Top 3 Selection: AI-curated most relevant repositories
- Repository Explorer: Interactive chat with repository contents
- Requirements Extraction: Automatic keyword extraction from conversations
- Comprehensive Results: Detailed analysis with strengths, weaknesses, and specialities
Quick Start
Prerequisites
- Python 3.8+
- OpenAI API key (for LLM analysis)
- Hugging Face access (for repository downloads)
Installation
Clone the repository
git clone <repository-url>
cd Agentic_HF_Analyzer
Install dependencies
pip install -r requirements.txt
Set up environment variables
export modal_api="your_openai_api_key"
export base_url="your_openai_base_url"
Run the application
python app.py
Open your browser to http://localhost:7860
User Guide
Using the AI Assistant (Recommended)
Start a Conversation
- Navigate to the "AI Assistant" tab
- Describe your project: "I'm building a chatbot for customer service"
- The AI will ask clarifying questions about your needs
Automatic Discovery
- When the AI has enough information, it will automatically:
- Extract relevant keywords from your conversation
- Search for matching repositories
- Analyze and rank them by relevance
Review Results
- The interface automatically switches to the "Analysis & Results" tab
- View the top 3 most relevant repositories
- Browse all analyzed repositories with detailed insights
Using Smart Search (Direct Input)
Repository IDs
microsoft/DialoGPT-medium
openai/whisper
huggingface/transformers
Keywords
text generation
image classification
sentiment analysis
Mixed Input
- The system automatically detects the input type
- Repository IDs (containing /) are processed directly
- Keywords trigger automatic repository search
Analyzing Results
- Top 3 Repositories: AI-selected most relevant based on your requirements
- Detailed Analysis: Strengths, weaknesses, specialities, and relevance ratings
- Quick Actions: Click repository names to visit or explore them
- Repository Explorer: Deep dive into individual repositories with AI chat
Repository Explorer
Access Methods:
- Click "Open in Repo Explorer" from repository actions
- Manually enter repository ID in the Repo Explorer tab
Features:
- Automatic repository loading and analysis
- Interactive chat about repository contents
- File structure exploration
- Code analysis and explanations
Technical Architecture
Core Components
app.py                  # Main Gradio interface and orchestration
├── analyzer.py         # Repository analysis and LLM processing
├── hf_utils.py         # Hugging Face API interactions
├── chatbot_page.py     # AI assistant conversation logic
└── repo_explorer.py    # Repository exploration interface
Key Features Implementation
AI Assistant
- System Prompt: Focused on requirements gathering, not recommendations
- Auto-Extraction: Detects conversation readiness for keyword extraction
- Smart Processing: Converts natural language to actionable search queries
Smart Input Detection
import re

def is_repo_id_format(text: str) -> bool:
    # Treat the input as repository IDs when at least half of the
    # newline- or comma-separated entries contain a "/"; otherwise as keywords
    lines = [line.strip() for line in re.split(r'[\n,]+', text) if line.strip()]
    slash_count = sum(1 for line in lines if '/' in line)
    return slash_count >= len(lines) * 0.5
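To see how the 50% threshold behaves, the following self-contained sketch repeats the function and runs it on three sample inputs. The sample strings are illustrative only. Note that an empty string is classified as repository IDs, because zero of zero entries trivially meets the threshold, so a caller may want to reject empty input before calling it.

```python
import re

def is_repo_id_format(text: str) -> bool:
    # Same logic as the README snippet: at least half of the entries must contain "/"
    lines = [line.strip() for line in re.split(r'[\n,]+', text) if line.strip()]
    slash_count = sum(1 for line in lines if '/' in line)
    return slash_count >= len(lines) * 0.5

# Illustrative inputs (not taken from the application's tests)
samples = {
    "repo_ids": "microsoft/DialoGPT-medium\nopenai/whisper",
    "keywords": "text generation, image classification",
    "mixed": "openai/whisper, sentiment analysis",
}
results = {name: is_repo_id_format(text) for name, text in samples.items()}
print(results)
```

With one repository ID and one keyword, the mixed input lands exactly on the threshold and is treated as repository IDs.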
LLM-Powered Repository Ranking
- Model: Orion-zhen/Qwen2.5-Coder-7B-Instruct-AWQ
- Criteria: Requirements matching, strengths, relevance rating, speciality alignment
- Output: JSON-formatted repository rankings
Analysis Pipeline
- Download: Repository files (.py, .md, .txt)
- Combine: Merge files into a single analyzable document
- Analyze: LLM evaluation for strengths, weaknesses, specialities
- Rank: User requirement-based relevance scoring
- Select: Top 3 most relevant repositories
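The Combine step could look roughly like the sketch below. It is not the code from analyzer.py: the function name `combine_repo_files` and the `=====` header format are assumptions made for illustration. It walks a download directory, keeps files with the listed extensions, and joins them into one string with a header per file.

```python
from pathlib import Path
from typing import Iterable

def combine_repo_files(local_dir: str,
                       extensions: Iterable[str] = (".py", ".md", ".txt")) -> str:
    """Merge all matching files under local_dir into one text document.

    Hypothetical helper; the real project may structure this differently.
    """
    allowed = tuple(extensions)
    parts = []
    for path in sorted(Path(local_dir).rglob("*")):
        if path.is_file() and path.suffix in allowed:
            # errors="replace" keeps going if a file is not valid UTF-8
            text = path.read_text(encoding="utf-8", errors="replace")
            parts.append(f"===== {path.relative_to(local_dir)} =====\n{text}")
    return "\n\n".join(parts)
```

Sorting the paths keeps the combined document stable between runs, which makes LLM results easier to compare.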
Data Flow
graph TD
A[User Input] --> B{Input Type?}
B -->|Keywords| C[Repository Search]
B -->|Repo IDs| D[Direct Processing]
C --> E[Repository List]
D --> E
E --> F[Download & Analyze]
F --> G[LLM Evaluation]
G --> H[Ranking & Selection]
H --> I[Results Display]
I --> J[Repository Explorer]
File Structure
Agentic_HF_Analyzer/
├── app.py              # Main application
├── analyzer.py         # Repository analysis logic
├── hf_utils.py         # Hugging Face utilities
├── chatbot_page.py     # AI assistant functionality
├── repo_explorer.py    # Repository exploration
├── requirements.txt    # Python dependencies
├── README.md           # Documentation
├── repo_ids.csv        # Analysis results storage
└── repo_files/         # Temporary repository downloads
Dependencies
gradio>=4.0.0 # Web interface framework
pandas>=1.5.0 # Data manipulation
regex>=2022.0.0 # Advanced regex operations
openai>=1.0.0 # LLM API access
huggingface_hub>=0.16.0 # HF repository access
requests>=2.28.0 # HTTP requests
Environment Variables
Variable | Description | Required |
---|---|---|
modal_api | OpenAI API key for LLM analysis | Yes |
base_url | OpenAI API base URL | Yes |
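A minimal sketch of reading these two variables at startup and failing early when one is missing. The helper name `load_llm_config` is hypothetical and not part of the project; only the variable names `modal_api` and `base_url` come from the table above.

```python
import os
from typing import Mapping, Optional

def load_llm_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return the API key and base URL, raising if either variable is unset.

    Hypothetical helper for illustration; pass a dict as `env` to test it
    without touching the real environment.
    """
    env = os.environ if env is None else env
    missing = [name for name in ("modal_api", "base_url") if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {"api_key": env["modal_api"], "base_url": env["base_url"]}
```

With openai>=1.0.0, the returned dict has the shape accepted by the client constructor, for example `OpenAI(**load_llm_config())`.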
LLM Integration
Analysis Prompt Structure
ANALYSIS_PROMPT = """
Analyze this repository and provide:
1. Strengths and capabilities
2. Potential weaknesses or limitations
3. Primary speciality/use case
4. Relevance rating for: {user_requirements}
Return valid JSON with: strength, weaknesses, speciality, relevance rating
"""
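LLM replies often wrap the requested JSON in extra prose or a code fence, so the reply usually needs defensive parsing. The sketch below is one way to do that; `parse_analysis_reply` is a hypothetical name, and the project's actual parsing may differ.

```python
import json
import re
from typing import Optional

def parse_analysis_reply(reply: str) -> Optional[dict]:
    """Extract the JSON object from an LLM reply, or return None on failure.

    Hypothetical helper: takes the span from the first "{" to the last "}"
    and tries to decode it.
    """
    match = re.search(r"\{.*\}", reply, flags=re.DOTALL)
    if match is None:
        return None
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    return data if isinstance(data, dict) else None
```

Because the prompt contains no braces other than `{user_requirements}`, it can be filled with `ANALYSIS_PROMPT.format(user_requirements=...)` before being sent to the model.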
Repository Ranking System
- Input: User requirements + repository analysis data
- Processing: LLM evaluates relevance and ranks repositories
- Output: Top 3 most relevant repositories in order
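The source does not show the format of the ranking reply, so the following sketch assumes the LLM returns a JSON array of repository IDs in ranked order. Under that assumption, it drops IDs that were never analyzed, removes duplicates, and pads the result with unranked candidates so that a malformed reply still yields a top 3. The name `select_top_repos` is hypothetical.

```python
import json
from typing import List

def select_top_repos(llm_reply: str, candidates: List[str], k: int = 3) -> List[str]:
    """Pick the top-k repositories from an LLM ranking, with a safe fallback.

    Assumes (not confirmed by the source) that llm_reply is a JSON array of IDs.
    """
    try:
        ranked = json.loads(llm_reply)
    except json.JSONDecodeError:
        ranked = []
    if not isinstance(ranked, list):
        ranked = []
    top: List[str] = []
    for repo_id in ranked:
        # Ignore IDs the model invented, and duplicates
        if repo_id in candidates and repo_id not in top:
            top.append(repo_id)
    # Pad with unranked candidates in their original order
    for repo_id in candidates:
        if repo_id not in top:
            top.append(repo_id)
    return top[:k]
```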
UI Components
Modern Design Features
- Gradient Backgrounds: Linear gradients for visual appeal
- Glassmorphism: Backdrop blur effects for modern look
- Responsive Layout: Adaptive to different screen sizes
- Interactive Elements: Hover effects and smooth transitions
- Modal System: Repository action selection popups
Tab Organization
- AI Assistant: Conversation-based discovery
- Smart Search: Direct input processing
- Analysis & Results: Comprehensive analysis display
- Repo Explorer: Interactive repository exploration
Advanced Features
Auto-Navigation
- Automatic tab switching based on workflow state
- Smooth scrolling to top on tab changes
- Progressive disclosure of information
Error Handling
- Graceful fallbacks for LLM failures
- CSV update retry mechanisms
- User-friendly error messages
Performance Optimizations
- Parallel processing for multiple repositories
- Progress tracking for long operations
- Efficient file caching and cleanup
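Parallel processing with progress tracking can be sketched with the standard library's thread pool. This is an illustration under assumptions, not the project's code: `analyze_all` is a hypothetical name, and `analyze_one` stands in for whatever function downloads and analyzes a single repository.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List

def analyze_all(repo_ids: List[str],
                analyze_one: Callable[[str], dict],
                max_workers: int = 4) -> Dict[str, dict]:
    """Run analyze_one over all repositories concurrently, reporting progress."""
    results: Dict[str, dict] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(analyze_one, rid): rid for rid in repo_ids}
        for done, future in enumerate(as_completed(futures), start=1):
            rid = futures[future]
            try:
                results[rid] = future.result()
            except Exception as exc:
                # One failing repository should not abort the whole batch
                results[rid] = {"error": str(exc)}
            print(f"[{done}/{len(futures)}] finished {rid}")
    return results
```

Threads suit this workload because downloads and LLM calls are I/O-bound; recording failures per repository keeps the remaining results usable.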
Configuration
Customizing Analysis
- Modify CHATBOT_SYSTEM_PROMPT for different assistant behavior
- Adjust repository search limits in search_top_spaces()
- Configure analysis criteria in get_top_relevant_repos()
Adding File Types
# In analyzer.py
download_filtered_space_files(
    repo_id,
    local_dir="repo_files",
    file_extensions=['.py', '.md', '.txt', '.js', '.ts']  # Add more
)
Contributing
- Fork the repository
- Create a feature branch
- Implement your changes
- Add tests if applicable
- Submit a pull request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Gradio: For the amazing web interface framework
- Hugging Face: For the incredible repository ecosystem
- OpenAI: For powerful language model capabilities
Built with ❤️ for the open source community
Happy repository hunting!