---
title: AI Realizability Index
emoji: 📚
colorFrom: blue
colorTo: purple
sdk: docker
sdk_version: "latest"
app_file: app.py
pinned: false
---
# AI Realizability Index - AI Paper Evaluation System
A system for evaluating AI research papers with Claude Sonnet. Evaluations run asynchronously, so several papers can be processed concurrently.
## Features
- **Daily Paper Crawling**: Automatically fetches papers from Hugging Face daily
- **AI Evaluation**: Uses Claude Sonnet to evaluate papers across multiple dimensions
- **Concurrent Processing**: True asynchronous evaluation with multiple papers processed simultaneously
- **Re-evaluation**: Re-run the evaluation of a paper and update its stored results
- **Batch Evaluation**: "Evaluate All" feature to process multiple papers at once
- **Interactive Dashboard**: Beautiful web interface for browsing and evaluating papers
- **Asynchronous Database**: High-performance SQLite with WAL mode for concurrent operations
- **Smart Navigation**: Intelligent date navigation with fallback mechanisms
- **Real-time Status Updates**: Live progress tracking and notifications
## Recent Updates
### v0.1.0 - Asynchronous & Concurrent Features
- **Asynchronous Database**: Migrated from `sqlite3` to `aiosqlite` for better performance
- **Concurrent Evaluation**: Multiple papers can be evaluated simultaneously
- **Re-evaluation**: Added "Re-evaluate" button for papers to update evaluation results
- **Batch Processing**: "Evaluate All" button to process all un-evaluated papers
- **Enhanced UI**: Improved progress indicators and real-time notifications
- **Database Optimization**: WAL mode and performance pragmas for better concurrency
## Hugging Face Spaces Deployment
This application is configured for deployment on Hugging Face Spaces.
### Configuration
- **Port**: 7860 (Hugging Face Spaces standard)
- **Health Check**: `/api/health` endpoint
- **Docker**: Optimized Dockerfile for containerized deployment
### Deployment Steps
1. **Fork/Clone** this repository to your Hugging Face account
2. **Create a new Space** on Hugging Face
3. **Select Docker** as the SDK
4. **Set Environment Variables**:
   - `ANTHROPIC_API_KEY`: Your Anthropic API key for Claude access
5. **Deploy**: The Space will automatically build and deploy
### Environment Variables
```bash
ANTHROPIC_API_KEY=your_api_key_here
PORT=7860 # Optional, defaults to 7860
```
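As a rough illustration of how an application might read these two variables at startup, the sketch below uses only the standard library. The `load_settings` helper and its return shape are assumptions made for this example and are not taken from `app.py`.

```python
import os

def load_settings(env=None):
    """Read ANTHROPIC_API_KEY and PORT from a mapping (os.environ by default).

    Hypothetical helper: the real application may load its settings differently.
    """
    env = os.environ if env is None else env
    api_key = env.get("ANTHROPIC_API_KEY")
    if not api_key:
        # Fail early instead of failing on the first evaluation request.
        raise RuntimeError("ANTHROPIC_API_KEY is not set")
    # PORT is optional and falls back to the Spaces default of 7860.
    port = int(env.get("PORT", "7860"))
    return {"api_key": api_key, "port": port}
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to exercise in tests.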
## Local Development
### Prerequisites
- Python 3.9+
- Anthropic API key
### Installation
1. **Clone the repository**:
```bash
git clone <repository-url>
cd paperindex
```
2. **Install dependencies**:
```bash
pip install -r requirements.txt
```
3. **Set environment variables**:
```bash
export ANTHROPIC_API_KEY=your_api_key_here
```
4. **Run the application**:
```bash
python app.py
```
5. **Access the application**:
   - Main interface: http://localhost:7860
   - API documentation: http://localhost:7860/docs
## API Endpoints
### Core Endpoints
- `GET /api/daily` - Get daily papers with smart navigation
- `GET /api/paper/{paper_id}` - Get paper details
- `GET /api/eval/{paper_id}` - Get paper evaluation
- `GET /api/health` - Health check endpoint
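The core endpoints can be called from a script with the standard library alone. This is a minimal sketch: the helper names are hypothetical, the base URL assumes a local run on port 7860, and the responses are assumed to be JSON, which this README does not state explicitly.

```python
import json
import urllib.request

# Assumed base URL for a local run; a deployed Space would use its own URL.
BASE_URL = "http://localhost:7860"

def api_url(path, base_url=BASE_URL):
    """Join the base URL and an endpoint path without doubling slashes."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")

def get_json(path, base_url=BASE_URL, timeout=30):
    """Send a GET request and decode the body as JSON (assumed response format)."""
    with urllib.request.urlopen(api_url(path, base_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Example calls, left commented out because they need a running server:
# health = get_json("/api/health")
# daily = get_json("/api/daily")
```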
### Evaluation Endpoints
- `POST /api/papers/evaluate/{arxiv_id}` - Start paper evaluation
- `POST /api/papers/reevaluate/{arxiv_id}` - Re-evaluate a paper
- `GET /api/papers/evaluate/{arxiv_id}/status` - Get evaluation status
- `GET /api/papers/evaluate/active-tasks` - Get currently running evaluations
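A client that starts an evaluation usually has to wait for it by polling the status endpoint. The loop below sketches that pattern under stated assumptions: `fetch_status` is a hypothetical callable that returns the current status string, and the terminal values `"completed"` and `"failed"` are placeholders, since this README only documents `not_started`.

```python
import time

# Assumed terminal states; adjust to whatever the status endpoint returns.
TERMINAL_STATES = ("completed", "failed")

def poll_until_done(fetch_status, interval=2.0, timeout=600.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() every `interval` seconds until a terminal state.

    Raises TimeoutError if no terminal state is seen within `timeout` seconds.
    `sleep` and `clock` are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError("evaluation did not finish in time")
        sleep(interval)
```

In practice `fetch_status` would wrap a `GET /api/papers/evaluate/{arxiv_id}/status` request and extract the status field from the response.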
### Cache Management
- `GET /api/cache/status` - Get cache statistics
- `POST /api/cache/clear` - Clear all cached data
- `POST /api/cache/refresh/{date}` - Refresh cache for specific date
## Architecture
### Frontend
- **HTML/CSS/JavaScript**: Modern, responsive interface
- **Real-time Updates**: Dynamic content loading with polling
- **Theme Support**: Light/dark mode toggle
- **Progress Indicators**: Visual feedback for evaluation status
- **Batch Operations**: "Evaluate All" functionality with sequential processing
### Backend
- **FastAPI**: High-performance web framework
- **Async SQLite**: `aiosqlite` with WAL mode for concurrent operations
- **Async Processing**: Background evaluation tasks with task tracking
- **Concurrent Evaluation**: Multiple papers evaluated simultaneously
- **Caching**: Intelligent caching system for performance
### AI Integration
- **Async Anthropic**: Non-blocking API calls with `AsyncAnthropic`
- **Multi-dimensional Analysis**: Comprehensive evaluation criteria
- **Structured Output**: JSON-based evaluation results
- **Error Handling**: Robust error handling and retry mechanisms
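One common way to combine non-blocking calls with retries is a small wrapper that awaits a coroutine factory and backs off exponentially between attempts. The sketch below is illustrative only: `with_retries` and its parameters are hypothetical, and the retry policy actually used by this project is not described in this README.

```python
import asyncio
import random

async def with_retries(make_call, attempts=3, base_delay=1.0, retry_on=(Exception,)):
    """Await make_call() up to `attempts` times with exponential backoff.

    make_call must be a zero-argument callable returning a fresh awaitable,
    because a coroutine object cannot be awaited twice.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return await make_call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles on each attempt, plus a little jitter.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay * 0.1)
            await asyncio.sleep(delay)
```

With `AsyncAnthropic`, `make_call` would typically be a lambda wrapping `client.messages.create(...)`, and `retry_on` would be narrowed to the transient error types worth retrying.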
## Database Schema
### Papers Table
```sql
CREATE TABLE papers (
    arxiv_id TEXT PRIMARY KEY,
    title TEXT NOT NULL,
    authors TEXT NOT NULL,
    abstract TEXT,
    categories TEXT,
    published_date TEXT,
    evaluation_content TEXT,
    evaluation_score REAL,
    overall_score REAL,
    evaluation_tags TEXT,
    evaluation_status TEXT DEFAULT 'not_started',
    is_evaluated BOOLEAN DEFAULT FALSE,
    evaluation_date TIMESTAMP,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
```
```
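To show how the columns fit together, here is a small sketch that inserts a paper and later records its evaluation. It uses the synchronous `sqlite3` module for brevity, whereas the application uses `aiosqlite`. The helper names and the `'completed'` status value are assumptions; only `'not_started'` appears in the schema.

```python
import sqlite3

# Same columns as the schema above, with IF NOT EXISTS so it can run repeatedly.
SCHEMA = """
CREATE TABLE IF NOT EXISTS papers (
    arxiv_id TEXT PRIMARY KEY,
    title TEXT NOT NULL,
    authors TEXT NOT NULL,
    abstract TEXT,
    categories TEXT,
    published_date TEXT,
    evaluation_content TEXT,
    evaluation_score REAL,
    overall_score REAL,
    evaluation_tags TEXT,
    evaluation_status TEXT DEFAULT 'not_started',
    is_evaluated BOOLEAN DEFAULT FALSE,
    evaluation_date TIMESTAMP,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)
"""

def add_paper(conn, arxiv_id, title, authors):
    """Insert a paper; an existing row with the same arxiv_id is left untouched."""
    conn.execute(
        "INSERT OR IGNORE INTO papers (arxiv_id, title, authors) VALUES (?, ?, ?)",
        (arxiv_id, title, authors),
    )
    conn.commit()

def save_evaluation(conn, arxiv_id, content, overall_score, tags):
    """Store an evaluation result; returns True if a matching paper was updated."""
    cur = conn.execute(
        """
        UPDATE papers
        SET evaluation_content = ?, overall_score = ?, evaluation_tags = ?,
            evaluation_status = 'completed',  -- assumed status value
            is_evaluated = 1,
            evaluation_date = CURRENT_TIMESTAMP,
            updated_at = CURRENT_TIMESTAMP
        WHERE arxiv_id = ?
        """,
        (content, overall_score, tags, arxiv_id),
    )
    conn.commit()
    return cur.rowcount == 1
```

Because re-evaluation overwrites the same row, calling `save_evaluation` a second time for the same `arxiv_id` replaces the earlier result and refreshes both timestamps.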
### Database Optimizations
- **WAL Mode**: `PRAGMA journal_mode=WAL` for better concurrency
- **Performance Pragmas**: Optimized settings for concurrent access
- **Asynchronous Operations**: All database calls use `async`/`await`
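The pragmas can be applied once, right after a connection is opened. The sketch below uses the standard `sqlite3` module so it runs without extra dependencies; the same `PRAGMA` statements can be issued through an `aiosqlite` connection with `await db.execute(...)`. Apart from `journal_mode=WAL`, the specific pragmas and values shown are assumptions, since this README does not list them.

```python
import sqlite3

def open_with_wal(path):
    """Open a SQLite database file and switch it to write-ahead logging.

    Returns the connection and the journal mode that SQLite reports back.
    WAL needs a real file; an in-memory database stays in 'memory' mode.
    """
    conn = sqlite3.connect(path)
    # The pragma returns the journal mode now in effect, e.g. 'wal'.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    # Assumed tuning values, commonly paired with WAL:
    conn.execute("PRAGMA synchronous=NORMAL")   # fewer fsyncs than FULL
    conn.execute("PRAGMA busy_timeout=5000")    # wait up to 5 s on a locked database
    return conn, mode
```

WAL mode lets readers proceed while a single writer appends to the log, which is what allows several evaluations to read and update the `papers` table without blocking each other for long.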
## Evaluation Dimensions
The system evaluates papers across 12 key dimensions:
1. **Task Formalization** - Clarity of problem definition
2. **Data & Resource Availability** - Access to required data
3. **Input-Output Complexity** - Complexity of inputs/outputs
4. **Real-World Interaction** - Practical applicability
5. **Existing AI Coverage** - Current AI capabilities
6. **Automation Barriers** - Technical challenges
7. **Human Originality** - Creative contribution
8. **Safety & Ethics** - Responsible AI considerations
9. **Societal/Economic Impact** - Broader implications
10. **Technical Maturity Needed** - Development requirements
11. **3-Year Feasibility** - Short-term potential
12. **Overall Automatability** - Comprehensive assessment
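Since evaluation results are stored as JSON, it can help to check that a response covers all 12 dimensions before saving it. The following is a sketch under assumptions: the snake_case keys are hypothetical names derived from the list above, and the actual JSON layout used by the application may differ.

```python
import json

# Hypothetical keys, one per dimension listed above.
DIMENSIONS = [
    "task_formalization",
    "data_resource_availability",
    "input_output_complexity",
    "real_world_interaction",
    "existing_ai_coverage",
    "automation_barriers",
    "human_originality",
    "safety_ethics",
    "societal_economic_impact",
    "technical_maturity_needed",
    "three_year_feasibility",
    "overall_automatability",
]

def parse_evaluation(raw):
    """Parse a JSON string and return only the 12 dimension entries.

    Raises ValueError if the payload is not an object or a dimension is missing.
    """
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("evaluation must be a JSON object")
    missing = [name for name in DIMENSIONS if name not in data]
    if missing:
        raise ValueError("missing dimensions: " + ", ".join(missing))
    return {name: data[name] for name in DIMENSIONS}
```

Rejecting incomplete payloads at this point keeps partially filled evaluations out of the database.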
## Key Features
### Concurrent Evaluation
- Multiple papers can be evaluated simultaneously
- Global task tracking prevents duplicate evaluations
- Real-time status updates via polling
- Automatic error handling and recovery
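The task-tracking idea can be sketched with a dictionary of running `asyncio` tasks keyed by paper ID. This is an illustration rather than the project's code: `active_tasks`, `start_evaluation`, and the placeholder `evaluate_paper` coroutine are all hypothetical names.

```python
import asyncio

# Global registry of in-flight evaluations, keyed by arXiv ID.
active_tasks = {}

async def evaluate_paper(arxiv_id):
    """Placeholder for the real evaluation, which would call the model."""
    await asyncio.sleep(0.01)
    return f"evaluated {arxiv_id}"

def start_evaluation(arxiv_id):
    """Start an evaluation unless one is already running for this paper.

    Must be called from inside a running event loop.
    Returns (task, started), where started is False for a duplicate request.
    """
    existing = active_tasks.get(arxiv_id)
    if existing is not None and not existing.done():
        return existing, False
    task = asyncio.create_task(evaluate_paper(arxiv_id))
    active_tasks[arxiv_id] = task

    def _forget(finished):
        # Remove the entry only if it still points at this task.
        if active_tasks.get(arxiv_id) is finished:
            del active_tasks[arxiv_id]

    task.add_done_callback(_forget)
    return task, True
```

A duplicate request receives the task that is already running, so both callers can await the same result while the paper is evaluated only once.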
### Re-evaluation System
- "Re-evaluate" button appears after initial evaluation
- Updates existing evaluation results in database
- Maintains evaluation history and timestamps
- Same comprehensive evaluation criteria
### Batch Processing
- "Evaluate All" button processes all un-evaluated papers
- Sequential processing with delays to prevent API overload
- Progress tracking and real-time notifications
- Automatic button state management
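Sequential processing with a pause between papers can be sketched as a simple loop. The README places this logic behind the "Evaluate All" button, so the Python version below is only an analogy of the pattern; `evaluate_all`, `evaluate_one`, `on_progress`, and the one-second default delay are assumptions.

```python
import asyncio

async def evaluate_all(arxiv_ids, evaluate_one, delay_seconds=1.0, on_progress=None):
    """Evaluate papers one at a time, pausing between them.

    evaluate_one is an async callable taking an arXiv ID. A failure is
    recorded as the exception object so one bad paper does not stop the batch.
    """
    results = {}
    total = len(arxiv_ids)
    for index, arxiv_id in enumerate(arxiv_ids, start=1):
        try:
            results[arxiv_id] = await evaluate_one(arxiv_id)
        except Exception as exc:
            results[arxiv_id] = exc
        if on_progress is not None:
            on_progress(index, total, arxiv_id)  # e.g. update a progress indicator
        if index < total:
            # Space out requests to avoid overloading the API.
            await asyncio.sleep(delay_seconds)
    return results
```

The delay is skipped after the last paper, so a batch of one finishes without waiting.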
### Enhanced UI/UX
- Progress circles with proper layering
- Bottom-right notification system
- Dynamic button states and text updates
- Responsive design with modern styling
## Performance Optimizations
### Database
- Asynchronous operations with `aiosqlite`
- WAL mode for better concurrency
- Optimized SQLite pragmas
- Connection pooling and management
### API Calls
- Non-blocking Anthropic API calls
- Concurrent evaluation processing
- Task tracking and management
- Error handling and retry logic
### Frontend
- Efficient DOM manipulation
- Polling with appropriate intervals
- Memory management for log entries
- Optimized event handling
## Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests if applicable
5. Submit a pull request
## License
This project is licensed under the MIT License - see the LICENSE file for details.