---
title: AI Realizability Index
emoji: π
colorFrom: blue
colorTo: purple
sdk: docker
sdk_version: "latest"
app_file: app.py
pinned: false
---
# AI Realizability Index - AI Paper Evaluation System
A system that evaluates AI research papers with Claude Sonnet, using asynchronous processing so that several papers can be evaluated concurrently.
## Features
- **Daily Paper Crawling**: Automatically fetches papers from Hugging Face daily
- **AI Evaluation**: Uses Claude Sonnet to evaluate papers across multiple dimensions
- **Concurrent Processing**: True asynchronous evaluation with multiple papers processed simultaneously
- **Re-evaluation**: Ability to re-run evaluations for papers with updated results
- **Batch Evaluation**: "Evaluate All" feature to process multiple papers at once
- **Interactive Dashboard**: Web interface for browsing and evaluating papers
- **Asynchronous Database**: High-performance SQLite with WAL mode for concurrent operations
- **Smart Navigation**: Intelligent date navigation with fallback mechanisms
- **Real-time Status Updates**: Live progress tracking and notifications
## Recent Updates
### v0.1.0 - Asynchronous & Concurrent Features
- **Asynchronous Database**: Migrated from `sqlite3` to `aiosqlite` for better performance
- **Concurrent Evaluation**: Multiple papers can be evaluated simultaneously
- **Re-evaluation**: Added "Re-evaluate" button for papers to update evaluation results
- **Batch Processing**: "Evaluate All" button to process all un-evaluated papers
- **Enhanced UI**: Improved progress indicators and real-time notifications
- **Database Optimization**: WAL mode and performance pragmas for better concurrency
## Hugging Face Spaces Deployment
This application is configured for deployment on Hugging Face Spaces.
### Configuration
- **Port**: 7860 (Hugging Face Spaces standard)
- **Health Check**: `/api/health` endpoint
- **Docker**: Optimized Dockerfile for containerized deployment
### Deployment Steps
1. **Fork/Clone** this repository to your Hugging Face account
2. **Create a new Space** on Hugging Face
3. **Select Docker** as the SDK
4. **Set Environment Variables**:
   - `ANTHROPIC_API_KEY`: Your Anthropic API key for Claude access
5. **Deploy**: The Space will automatically build and deploy
### Environment Variables
```bash
ANTHROPIC_API_KEY=your_api_key_here
PORT=7860 # Optional, defaults to 7860
```
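As a minimal sketch of how these two variables might be read at startup: `load_settings` and `DEFAULT_PORT` are hypothetical names used for illustration, not identifiers taken from `app.py`.

```python
import os

DEFAULT_PORT = 7860  # Hugging Face Spaces standard port, per this README


def load_settings(env=None):
    """Return (api_key, port) from an environment mapping.

    Hypothetical helper: the real app.py may read its settings differently.
    """
    env = os.environ if env is None else env
    api_key = env.get("ANTHROPIC_API_KEY")
    if not api_key:
        # Fail early instead of at the first evaluation request.
        raise RuntimeError("ANTHROPIC_API_KEY is not set")
    # PORT is optional and falls back to the default.
    port = int(env.get("PORT", DEFAULT_PORT))
    return api_key, port
```

Accepting the mapping as a parameter keeps the helper easy to test without touching the real environment.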
## Local Development
### Prerequisites
- Python 3.9+
- Anthropic API key
### Installation
1. **Clone the repository**:
   ```bash
   git clone <repository-url>
   cd paperindex
   ```
2. **Install dependencies**:
   ```bash
   pip install -r requirements.txt
   ```
3. **Set environment variables**:
   ```bash
   export ANTHROPIC_API_KEY=your_api_key_here
   ```
4. **Run the application**:
   ```bash
   python app.py
   ```
5. **Access the application**:
   - Main interface: http://localhost:7860
   - API documentation: http://localhost:7860/docs
## API Endpoints
### Core Endpoints
- `GET /api/daily` - Get daily papers with smart navigation
- `GET /api/paper/{paper_id}` - Get paper details
- `GET /api/eval/{paper_id}` - Get paper evaluation
- `GET /api/health` - Health check endpoint
### Evaluation Endpoints
- `POST /api/papers/evaluate/{arxiv_id}` - Start paper evaluation
- `POST /api/papers/reevaluate/{arxiv_id}` - Re-evaluate a paper
- `GET /api/papers/evaluate/{arxiv_id}/status` - Get evaluation status
- `GET /api/papers/evaluate/active-tasks` - Get currently running evaluations
### Cache Management
- `GET /api/cache/status` - Get cache statistics
- `POST /api/cache/clear` - Clear all cached data
- `POST /api/cache/refresh/{date}` - Refresh cache for specific date
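
A client for the evaluation endpoints above could look roughly like the following sketch, which uses only the standard library. The response format is not documented here, so the caller supplies an `is_done` predicate rather than the sketch guessing field names. `start_and_wait`, `evaluate_url`, and `status_url` are hypothetical helpers.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:7860"  # local default from this README


def evaluate_url(arxiv_id, base_url=BASE_URL):
    """URL of the endpoint that starts an evaluation."""
    return f"{base_url}/api/papers/evaluate/{arxiv_id}"


def status_url(arxiv_id, base_url=BASE_URL):
    """URL of the endpoint that reports evaluation status."""
    return f"{evaluate_url(arxiv_id, base_url)}/status"


def start_and_wait(arxiv_id, is_done, base_url=BASE_URL, interval=2.0, max_polls=30):
    """Start an evaluation, then poll until is_done(payload) returns True.

    Assumption: the status endpoint returns a JSON body. Its fields are
    not specified in this README, so is_done decides when to stop.
    """
    request = urllib.request.Request(evaluate_url(arxiv_id, base_url), method="POST")
    with urllib.request.urlopen(request) as response:
        response.read()
    for _ in range(max_polls):
        with urllib.request.urlopen(status_url(arxiv_id, base_url)) as response:
            payload = json.load(response)
        if is_done(payload):
            return payload
        time.sleep(interval)
    raise TimeoutError(f"evaluation of {arxiv_id} did not finish in time")
```

With the server running locally, a call might be `start_and_wait("<arxiv_id>", is_done=my_predicate)`, where `my_predicate` inspects whatever fields the status response actually contains.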
## Architecture
### Frontend
- **HTML/CSS/JavaScript**: Modern, responsive interface
- **Real-time Updates**: Dynamic content loading with polling
- **Theme Support**: Light/dark mode toggle
- **Progress Indicators**: Visual feedback for evaluation status
- **Batch Operations**: "Evaluate All" functionality with sequential processing
### Backend
- **FastAPI**: High-performance web framework
- **Async SQLite**: `aiosqlite` with WAL mode for concurrent operations
- **Async Processing**: Background evaluation tasks with task tracking
- **Concurrent Evaluation**: Multiple papers evaluated simultaneously
- **Caching**: Intelligent caching system for performance
### AI Integration
- **Async Anthropic**: Non-blocking API calls with `AsyncAnthropic`
- **Multi-dimensional Analysis**: Comprehensive evaluation criteria
- **Structured Output**: JSON-based evaluation results
- **Error Handling**: Robust error handling and retry mechanisms
## Database Schema
### Papers Table
```sql
CREATE TABLE papers (
    arxiv_id TEXT PRIMARY KEY,
    title TEXT NOT NULL,
    authors TEXT NOT NULL,
    abstract TEXT,
    categories TEXT,
    published_date TEXT,
    evaluation_content TEXT,
    evaluation_score REAL,
    overall_score REAL,
    evaluation_tags TEXT,
    evaluation_status TEXT DEFAULT 'not_started',
    is_evaluated BOOLEAN DEFAULT FALSE,
    evaluation_date TIMESTAMP,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
```
### Database Optimizations
- **WAL Mode**: `PRAGMA journal_mode=WAL` for better concurrency
- **Performance Pragmas**: Optimized settings for concurrent access
- **Asynchronous Operations**: All database calls use `async`/`await`
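
The sketch below shows one way WAL mode can be enabled. To stay self-contained it uses the standard-library `sqlite3` module rather than `aiosqlite`; the same `PRAGMA` statements could be issued over an `aiosqlite` connection. The `synchronous` and `busy_timeout` settings are assumptions, since the specific performance pragmas are not listed here, and `open_wal_connection` is a hypothetical helper.

```python
import os
import sqlite3
import tempfile


def open_wal_connection(path):
    """Open a SQLite database and switch it to write-ahead logging."""
    conn = sqlite3.connect(path)
    # The pragma returns the journal mode actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    # Assumed extra pragmas; the project may use different values.
    conn.execute("PRAGMA synchronous=NORMAL")
    conn.execute("PRAGMA busy_timeout=5000")
    return conn, mode


# WAL needs a file-backed database; an in-memory one would not switch modes.
with tempfile.TemporaryDirectory() as tmp:
    conn, mode = open_wal_connection(os.path.join(tmp, "papers.db"))
    print(mode)
    conn.close()
```

Checking the returned mode is useful because SQLite reports the mode it kept rather than raising an error when WAL cannot be enabled.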
## Evaluation Dimensions
The system evaluates papers across 12 key dimensions:
1. **Task Formalization** - Clarity of problem definition
2. **Data & Resource Availability** - Access to required data
3. **Input-Output Complexity** - Complexity of inputs/outputs
4. **Real-World Interaction** - Practical applicability
5. **Existing AI Coverage** - Current AI capabilities
6. **Automation Barriers** - Technical challenges
7. **Human Originality** - Creative contribution
8. **Safety & Ethics** - Responsible AI considerations
9. **Societal/Economic Impact** - Broader implications
10. **Technical Maturity Needed** - Development requirements
11. **3-Year Feasibility** - Short-term potential
12. **Overall Automatability** - Comprehensive assessment
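
Because results are described as JSON-based, a completeness check over the twelve dimensions might be sketched as follows. This assumes the result is a JSON object keyed by the dimension names exactly as listed above; the real key names and value format may differ. `missing_dimensions` is a hypothetical helper.

```python
import json

# Dimension names copied from the list above; using them as JSON keys
# is an assumption made for this sketch.
DIMENSIONS = (
    "Task Formalization",
    "Data & Resource Availability",
    "Input-Output Complexity",
    "Real-World Interaction",
    "Existing AI Coverage",
    "Automation Barriers",
    "Human Originality",
    "Safety & Ethics",
    "Societal/Economic Impact",
    "Technical Maturity Needed",
    "3-Year Feasibility",
    "Overall Automatability",
)


def missing_dimensions(raw_json):
    """Return the dimension names absent from a JSON evaluation result."""
    result = json.loads(raw_json)
    return [name for name in DIMENSIONS if name not in result]
```

An empty list means every dimension is present; a non-empty list could be used to trigger a retry of the evaluation.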
## Key Features
### Concurrent Evaluation
- Multiple papers can be evaluated simultaneously
- Global task tracking prevents duplicate evaluations
- Real-time status updates via polling
- Automatic error handling and recovery
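
Global task tracking of this kind can be sketched with a dictionary of `asyncio` tasks keyed by arXiv ID. This illustrates the general technique and is not the project's actual code: `active_tasks`, `start_evaluation`, and the stand-in `evaluate_paper` are hypothetical.

```python
import asyncio

active_tasks = {}  # arxiv_id -> asyncio.Task (hypothetical global registry)


async def evaluate_paper(arxiv_id):
    """Stand-in for the real model call."""
    await asyncio.sleep(0.01)
    return f"evaluated {arxiv_id}"


def start_evaluation(arxiv_id):
    """Start an evaluation unless one is already running for this paper.

    Returns (task, started), where started is False for a duplicate request.
    Must be called from inside a running event loop.
    """
    existing = active_tasks.get(arxiv_id)
    if existing is not None and not existing.done():
        return existing, False
    task = asyncio.create_task(evaluate_paper(arxiv_id))
    active_tasks[arxiv_id] = task

    def _forget(finished):
        # Remove the entry only if it still refers to this task.
        if active_tasks.get(arxiv_id) is finished:
            del active_tasks[arxiv_id]

    task.add_done_callback(_forget)
    return task, True


async def demo():
    first, started_first = start_evaluation("2401.00001")
    second, started_second = start_evaluation("2401.00001")  # duplicate request
    other, _ = start_evaluation("2401.00002")  # runs concurrently
    results = await asyncio.gather(first, other)
    return started_first, started_second, first is second, results
```

Running `asyncio.run(demo())` shows that the duplicate request reuses the first task while a different paper is evaluated concurrently.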
### Re-evaluation System
- "Re-evaluate" button appears after initial evaluation
- Updates existing evaluation results in database
- Maintains evaluation history and timestamps
- Same comprehensive evaluation criteria
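
Updating an existing row on re-evaluation might look like the sketch below. It uses the standard-library `sqlite3` module and a trimmed copy of the `papers` table so that it runs on its own. The `'completed'` status value is an assumption (only `'not_started'` appears in the schema), and `save_evaluation` and `make_demo_db` are hypothetical helpers.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a trimmed papers table and one row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE papers (
            arxiv_id TEXT PRIMARY KEY,
            title TEXT NOT NULL,
            evaluation_content TEXT,
            overall_score REAL,
            evaluation_status TEXT DEFAULT 'not_started',
            is_evaluated BOOLEAN DEFAULT 0,
            evaluation_date TIMESTAMP,
            updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    conn.execute("INSERT INTO papers (arxiv_id, title) VALUES (?, ?)",
                 ("2401.00001", "Example paper"))
    conn.commit()
    return conn


def save_evaluation(conn, arxiv_id, content, score):
    """Overwrite the stored evaluation for one paper and refresh timestamps.

    Returns True if a row was updated, False if the paper is unknown.
    """
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            """
            UPDATE papers
               SET evaluation_content = ?,
                   overall_score = ?,
                   evaluation_status = 'completed',  -- assumed status value
                   is_evaluated = 1,
                   evaluation_date = CURRENT_TIMESTAMP,
                   updated_at = CURRENT_TIMESTAMP
             WHERE arxiv_id = ?
            """,
            (content, score, arxiv_id),
        )
    return cursor.rowcount == 1
```

Calling `save_evaluation` a second time for the same paper replaces the earlier content and score in place, which matches the behaviour described for the "Re-evaluate" button.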
### Batch Processing
- "Evaluate All" button processes all un-evaluated papers
- Sequential processing with delays to prevent API overload
- Progress tracking and real-time notifications
- Automatic button state management
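
The sequential loop with delays can be sketched as follows. The README places "Evaluate All" in the frontend, so this Python version only illustrates the same pattern. `evaluate_all`, its parameters, and the one-second default delay are assumptions.

```python
import asyncio


async def evaluate_all(arxiv_ids, evaluate, delay_seconds=1.0, on_progress=None):
    """Evaluate papers one at a time, pausing between requests.

    `evaluate` is any coroutine function taking an arXiv ID. A failure is
    recorded in the result and does not stop the rest of the batch.
    """
    results = {}
    total = len(arxiv_ids)
    for index, arxiv_id in enumerate(arxiv_ids, start=1):
        try:
            results[arxiv_id] = await evaluate(arxiv_id)
        except Exception as error:
            results[arxiv_id] = error
        if on_progress is not None:
            on_progress(index, total)  # e.g. update a progress indicator
        if index < total:
            # Space out requests to avoid overloading the API.
            await asyncio.sleep(delay_seconds)
    return results
```

Processing one paper at a time trades throughput for predictable load on the API, which is the stated goal of the delays.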
### Enhanced UI/UX
- Progress circles with proper layering
- Bottom-right notification system
- Dynamic button states and text updates
- Responsive design with modern styling
## Performance Optimizations
### Database
- Asynchronous operations with `aiosqlite`
- WAL mode for better concurrency
- Optimized SQLite pragmas
- Connection pooling and management
### API Calls
- Non-blocking Anthropic API calls
- Concurrent evaluation processing
- Task tracking and management
- Error handling and retry logic
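
The retry policy is not described here, so the following is only a generic sketch of retrying a non-blocking call with exponential backoff. `call_with_retry`, the attempt count, and the delays are assumptions rather than the project's settings.

```python
import asyncio


async def call_with_retry(make_call, attempts=3, base_delay=1.0):
    """Await make_call(), retrying with exponential backoff on failure.

    `make_call` is a zero-argument coroutine function, for example a
    wrapper around one model request. The last error is re-raised.
    """
    for attempt in range(attempts):
        try:
            return await make_call()
        except Exception:
            if attempt == attempts - 1:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            await asyncio.sleep(base_delay * 2 ** attempt)
```

In practice the `except` clause would usually be narrowed to transient errors such as timeouts or rate limits, so that permanent failures surface immediately.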
### Frontend
- Efficient DOM manipulation
- Polling with appropriate intervals
- Memory management for log entries
- Optimized event handling
## Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests if applicable
5. Submit a pull request
## License
This project is licensed under the MIT License - see the LICENSE file for details.