	---
	title: HF RepoSense
	emoji: 🚀
	colorFrom: blue
	colorTo: purple
	sdk: gradio
	sdk_version: 5.32.1
	app_file: app.py
	pinned: false
	short_description: AI-powered HuggingFace repository intelligence
	tags:
	- agent-demo-track
	---

	Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

	# 🚀 HF RepoSense: [Video demo](https://youtu.be/UqSRQy_8t-E)

	HF accounts for all contributors:

	- Naman Gupta: naman1102
	- Surya Boddu: MLconArtist
	- Lakshmi Girija Dhulipati: dlgirija
	- Mohamed Ifreen Seyed Ibrahim: Mohamed-Ifreen

	AI-powered HuggingFace repository intelligence

	An AI system for discovering, analyzing, and evaluating Hugging Face repositories. HF RepoSense uses an LLM to gather your requirements through conversation, search for matching repositories, and analyze and rank them against those requirements.

	![HF RepoSense](https://img.shields.io/badge/Powered%20by-Gradio-orange)
	![Python](https://img.shields.io/badge/Python-3.8+-blue)
	![Hugging Face](https://img.shields.io/badge/Hugging%20Face-Spaces-yellow)

	## ✨ Features

	- 🤖 AI Assistant: Intelligent conversation-based repository discovery
	- 🔍 Smart Search: Auto-detection of repository IDs vs. keywords
	- 📊 Automated Analysis: LLM-powered repository evaluation and ranking
	- 🏆 Top 3 Selection: AI-curated most relevant repositories
	- 💬 Repository Explorer: Interactive chat with repository contents
	- 🎯 Requirements Extraction: Automatic keyword extraction from conversations
	- 📋 Comprehensive Results: Detailed analysis with strengths, weaknesses, and specialities


	## 🚦 Quick Start

	### Prerequisites

	- Python 3.8+
	- OpenAI API key (for LLM analysis)
	- Hugging Face access (for repository downloads)

	### Installation

	1. Clone the repository
	```bash
	git clone <repository-url>
	cd HF-RepoSense
	```

	2. Install dependencies
	```bash
	pip install -r requirements.txt
	```

	3. Set up environment variables
	```bash
	export modal_api="your_openai_api_key"
	export base_url="your_openai_base_url"
	```

	4. Run the application
	```bash
	python app.py
	```

	5. Open your browser to `http://localhost:7860`

	## 📖 User Guide

	### 🤖 Using the AI Assistant (Recommended)

	1. Start a Conversation
	- Navigate to the "🤖 AI Assistant" tab
	- Describe your project: "I'm building a chatbot for customer service"
	- The AI will ask clarifying questions about your needs

	2. Automatic Discovery
	- When the AI has enough information, it will automatically:
	  - Extract relevant keywords from your conversation
	  - Search for matching repositories
	  - Analyze and rank them by relevance

	3. Review Results
	- The interface automatically switches to "🔬 Analysis & Results"
	- View the top 3 most relevant repositories
	- Browse all analyzed repositories with detailed insights

	### 📝 Using Smart Search (Direct Input)

	1. Repository IDs
	```
	microsoft/DialoGPT-medium
	openai/whisper
	huggingface/transformers
	```

	2. Keywords
	```
	text generation
	image classification
	sentiment analysis
	```

	3. Mixed Input
	- The system automatically detects the input type
	- Repository IDs (containing `/`) are processed directly
	- Keywords trigger automatic repository search

	### 🔬 Analyzing Results

	- Top 3 Repositories: AI-selected most relevant based on your requirements
	- Detailed Analysis: Strengths, weaknesses, specialities, and relevance ratings
	- Quick Actions: Click repository names to visit or explore them
	- Repository Explorer: Deep dive into individual repositories with AI chat

	### 🔍 Repository Explorer

	1. Access Methods:
	- Click "🔍 Open in Repo Explorer" from repository actions
	- Manually enter repository ID in the Repo Explorer tab

	2. Features:
	- Automatic repository loading and analysis
	- Interactive chat about repository contents
	- File structure exploration
	- Code analysis and explanations

	## 🛠️ Technical Architecture

	### Core Components

	```
	app.py # Main Gradio interface and orchestration
	├── analyzer.py # Repository analysis and LLM processing
	├── hf_utils.py # Hugging Face API interactions
	├── chatbot_page.py # AI assistant conversation logic
	└── repo_explorer.py # Repository exploration interface
	```

	### Key Features Implementation

	#### 🤖 AI Assistant
	- System Prompt: Focused on requirements gathering, not recommendations
	- Auto-Extraction: Detects conversation readiness for keyword extraction
	- Smart Processing: Converts natural language to actionable search queries

	#### 🔍 Smart Input Detection
	```python
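	import re

	def is_repo_id_format(text: str) -> bool:
	    """Return True if at least half of the entries look like repository IDs."""
	    # Entries are separated by newlines or commas; blank entries are dropped.
	    lines = [line.strip() for line in re.split(r'[\n,]+', text) if line.strip()]
	    # A repository ID has the form "owner/name", so it contains a slash.
	    slash_count = sum(1 for line in lines if '/' in line)
	    return slash_count >= len(lines) * 0.5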
	```
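
A quick usage sketch of the detection above. The function is repeated so the snippet runs on its own, and `route_input` is a hypothetical helper name used only for illustration. Note that the decision is made for the input as a whole: a mixed input counts as repository IDs once at least half of its entries contain a `/`.

```python
import re

def is_repo_id_format(text: str) -> bool:
    lines = [line.strip() for line in re.split(r'[\n,]+', text) if line.strip()]
    slash_count = sum(1 for line in lines if '/' in line)
    return slash_count >= len(lines) * 0.5

def route_input(text: str) -> str:
    """Hypothetical helper: name the branch an input would take."""
    return "direct" if is_repo_id_format(text) else "search"

print(route_input("microsoft/DialoGPT-medium, openai/whisper"))  # direct
print(route_input("text generation\nimage classification"))      # search
print(route_input("openai/whisper, sentiment analysis"))         # direct (1 of 2 entries has "/")
```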

	#### 🏆 LLM-Powered Repository Ranking
	- Model: `Orion-zhen/Qwen2.5-Coder-7B-Instruct-AWQ`
	- Criteria: Requirements matching, strengths, relevance rating, speciality alignment
	- Output: JSON-formatted repository rankings

	#### 📊 Analysis Pipeline
	1. Download: Repository files (`.py`, `.md`, `.txt`)
	2. Combine: Merge files into single analyzable document
	3. Analyze: LLM evaluation for strengths, weaknesses, specialities
	4. Rank: User requirement-based relevance scoring
	5. Select: Top 3 most relevant repositories
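
The "Combine" step can be sketched roughly as follows. This is a minimal illustration, not the code in `analyzer.py`: the function name `combine_files` and the `=====` header format are assumptions.

```python
from pathlib import Path

def combine_files(root: str, extensions=(".py", ".md", ".txt")) -> str:
    """Hypothetical helper: merge matching files under `root` into one string."""
    parts = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in extensions:
            rel = path.relative_to(root).as_posix()
            text = path.read_text(encoding="utf-8", errors="replace")
            # A header line records which file each chunk came from.
            parts.append(f"===== {rel} =====\n{text}")
    return "\n\n".join(parts)
```

Keeping a per-file header in the combined document lets the LLM (and a human reader) attribute a finding to a specific file.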

	### Data Flow

	```mermaid
	graph TD
	A[User Input] --> B{Input Type?}
	B -->|Keywords| C[Repository Search]
	B -->|Repo IDs| D[Direct Processing]
	C --> E[Repository List]
	D --> E
	E --> F[Download & Analyze]
	F --> G[LLM Evaluation]
	G --> H[Ranking & Selection]
	H --> I[Results Display]
	I --> J[Repository Explorer]
	```

	### File Structure

	```
	📦 HF-RepoSense/
	├── 📄 app.py # Main application
	├── 📄 analyzer.py # Repository analysis logic
	├── 📄 hf_utils.py # Hugging Face utilities
	├── 📄 chatbot_page.py # AI assistant functionality
	├── 📄 repo_explorer.py # Repository exploration
	├── 📄 requirements.txt # Python dependencies
	├── 📄 README.md # Documentation
	├── 📄 repo_ids.csv # Analysis results storage
	└── 📁 repo_files/ # Temporary repository downloads
	```

	### Dependencies

	```
	gradio>=4.0.0 # Web interface framework
	pandas>=1.5.0 # Data manipulation
	regex>=2022.0.0 # Advanced regex operations
	openai>=1.0.0 # LLM API access
	huggingface_hub>=0.16.0 # HF repository access
	requests>=2.28.0 # HTTP requests
	```

	### Environment Variables

	| Variable | Description | Required |
	|----------|-------------|----------|
	| `modal_api` | OpenAI API key for LLM analysis | ✅ |
	| `base_url` | OpenAI API base URL | ✅ |
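
Since both variables are required, it can help to fail early when one is missing. The sketch below shows one way to do that; `load_llm_settings` is a hypothetical name, and how `app.py` actually reads these variables is not shown in this README.

```python
import os

def load_llm_settings(env=os.environ) -> dict:
    """Read the two required variables, failing early if either is missing."""
    missing = [name for name in ("modal_api", "base_url") if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {"api_key": env["modal_api"], "base_url": env["base_url"]}
```

The returned keys match the `api_key` and `base_url` keyword arguments of the `openai.OpenAI` client, so the dictionary could be unpacked straight into it.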

	### LLM Integration

	#### Analysis Prompt Structure
	```python
	ANALYSIS_PROMPT = """
	Analyze this repository and provide:
	1. Strengths and capabilities
	2. Potential weaknesses or limitations
	3. Primary speciality/use case
	4. Relevance rating for: {user_requirements}

	Return valid JSON with: strength, weaknesses, speciality, relevance rating
	"""
	```
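
LLM replies do not always contain clean JSON, so the response to this prompt usually needs defensive parsing. The following is a sketch under assumptions: `parse_analysis` and `FALLBACK` are hypothetical names, and the key names are taken from the last line of the prompt above.

```python
import json
import re

FALLBACK = {"strength": "", "weaknesses": "", "speciality": "", "relevance rating": ""}

def parse_analysis(raw: str) -> dict:
    """Hypothetical helper: parse the JSON object embedded in an LLM reply."""
    # Grab everything between the first "{" and the last "}" in the reply.
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if not match:
        return dict(FALLBACK)
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return dict(FALLBACK)
    # Keep only the expected keys, filling any the model left out.
    return {key: data.get(key, default) for key, default in FALLBACK.items()}
```

Returning a fully populated dictionary in every case means downstream code, such as a results table, never has to check for missing keys.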

	#### Repository Ranking System
	- Input: User requirements + repository analysis data
	- Processing: LLM evaluates relevance and ranks repositories
	- Output: Top 3 most relevant repositories in order
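
The selection step can be made robust against malformed or incomplete rankings. The sketch below assumes the LLM returns a JSON array of repository IDs, which this README does not specify; `select_top_repos` is a hypothetical helper name.

```python
import json

def select_top_repos(llm_output: str, candidates: list, k: int = 3) -> list:
    """Hypothetical helper: keep the first k valid, unique repo IDs."""
    try:
        ranked = json.loads(llm_output)
    except json.JSONDecodeError:
        ranked = []
    if not isinstance(ranked, list):
        ranked = []
    known = set(candidates)
    top = []
    for repo_id in ranked:
        # Ignore IDs the model invented and any duplicates.
        if isinstance(repo_id, str) and repo_id in known and repo_id not in top:
            top.append(repo_id)
    # Fall back to the original candidate order if the model returned too few.
    for repo_id in candidates:
        if len(top) >= k:
            break
        if repo_id not in top:
            top.append(repo_id)
    return top[:k]
```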

	### UI Components

	#### Modern Design Features
	- Gradient Backgrounds: Linear gradients for visual appeal
	- Glassmorphism: Backdrop blur effects for modern look
	- Responsive Layout: Adaptive to different screen sizes
	- Interactive Elements: Hover effects and smooth transitions
	- Modal System: Repository action selection popups

	#### Tab Organization
	1. 🤖 AI Assistant: Conversation-based discovery
	2. 📝 Smart Search: Direct input processing
	3. 🔬 Analysis & Results: Comprehensive analysis display
	4. 🔍 Repo Explorer: Interactive repository exploration

	### Advanced Features

	#### Auto-Navigation
	- Automatic tab switching based on workflow state
	- Smooth scrolling to top on tab changes
	- Progressive disclosure of information

	#### Error Handling
	- Graceful fallbacks for LLM failures
	- CSV update retry mechanisms
	- User-friendly error messages
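
A CSV update retry could look roughly like this. It is an illustrative sketch using only the standard library; `write_rows_with_retry` and its parameters are assumptions, not the project's actual implementation.

```python
import csv
import time

def write_rows_with_retry(path, rows, attempts=3, delay=0.5):
    """Hypothetical helper: rewrite a CSV file, retrying on OS-level errors."""
    for attempt in range(1, attempts + 1):
        try:
            with open(path, "w", newline="", encoding="utf-8") as handle:
                csv.writer(handle).writerows(rows)
            return True
        except OSError:
            if attempt == attempts:
                return False  # Caller can surface a user-friendly message.
            time.sleep(delay)
    return False
```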

	#### Performance Optimizations
	- Parallel processing for multiple repositories
	- Progress tracking for long operations
	- Efficient file caching and cleanup
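
Parallel analysis with progress reporting can be sketched with a thread pool, as below. The names `analyze_all` and `analyze` are hypothetical; the callable passed in stands in for whatever per-repository analysis the app performs.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def analyze_all(repo_ids, analyze, max_workers=4):
    """Run `analyze(repo_id)` for each ID in parallel, collecting errors per repo."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(analyze, repo_id): repo_id for repo_id in repo_ids}
        for done, future in enumerate(as_completed(futures), start=1):
            repo_id = futures[future]
            try:
                results[repo_id] = future.result()
            except Exception as exc:  # One failure should not stop the batch.
                results[repo_id] = {"error": str(exc)}
            print(f"[{done}/{len(futures)}] finished {repo_id}")
    return results
```

Threads suit this workload because downloads and LLM calls spend most of their time waiting on the network.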

	## 🔧 Configuration

	### Customizing Analysis
	- Modify `CHATBOT_SYSTEM_PROMPT` for different assistant behavior
	- Adjust repository search limits in `search_top_spaces()`
	- Configure analysis criteria in `get_top_relevant_repos()`

	### Adding File Types
	```python
	# In analyzer.py
	download_filtered_space_files(
	    repo_id,
	    local_dir="repo_files",
	    file_extensions=['.py', '.md', '.txt', '.js', '.ts'],  # Add more
	)
	```
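
To show what an extension filter amounts to, here is a small standalone sketch. The helper names are hypothetical, and the internals of `download_filtered_space_files` are not shown in this README. Glob patterns of this form are also what `huggingface_hub.snapshot_download` accepts through its `allow_patterns` argument.

```python
from fnmatch import fnmatch

def extensions_to_patterns(file_extensions):
    """Turn ['.py', '.md'] into glob patterns ['*.py', '*.md']."""
    return [f"*{ext}" for ext in file_extensions]

def filter_files(filenames, file_extensions):
    """Keep only the filenames that match one of the extension patterns."""
    patterns = extensions_to_patterns(file_extensions)
    return [name for name in filenames if any(fnmatch(name, p) for p in patterns)]

files = ["app.py", "README.md", "static/main.js", "model.bin"]
print(filter_files(files, [".py", ".md", ".js"]))
# ['app.py', 'README.md', 'static/main.js']
```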

	## 🤝 Contributing

	1. Fork the repository
	2. Create a feature branch
	3. Implement your changes
	4. Add tests if applicable
	5. Submit a pull request

	## 📄 License

	This project is licensed under the MIT License - see the LICENSE file for details.

	## 🙏 Acknowledgments

	- Gradio: For the amazing web interface framework
	- Hugging Face: For the incredible repository ecosystem
	- OpenAI: For powerful language model capabilities

	---

	<div align="center">
	<p>Built with ❤️ for the open source community</p>
	<p>🚀 Happy repository hunting! 🚀</p>
	</div>