title: AI Realizability Index
emoji: π
colorFrom: blue
colorTo: purple
sdk: docker
sdk_version: latest
app_file: app.py
pinned: false
AI Realizability Index - AI Paper Evaluation System
A system for evaluating AI research papers with Claude, built on asynchronous processing so that several papers can be evaluated concurrently.
Features
- Daily Paper Crawling: Automatically fetches papers from Hugging Face daily
- AI Evaluation: Uses Claude Sonnet to evaluate papers across multiple dimensions
- Concurrent Processing: True asynchronous evaluation with multiple papers processed simultaneously
- Re-evaluation: Re-run the evaluation for a paper and update its stored results
- Batch Evaluation: "Evaluate All" feature to process multiple papers at once
- Interactive Dashboard: Beautiful web interface for browsing and evaluating papers
- Asynchronous Database: High-performance SQLite with WAL mode for concurrent operations
- Smart Navigation: Intelligent date navigation with fallback mechanisms
- Real-time Status Updates: Live progress tracking and notifications
Recent Updates
v0.1.0 - Asynchronous & Concurrent Features
- Asynchronous Database: Migrated from sqlite3 to aiosqlite for better performance
- Concurrent Evaluation: Multiple papers can be evaluated simultaneously
- Re-evaluation: Added "Re-evaluate" button for papers to update evaluation results
- Batch Processing: "Evaluate All" button to process all un-evaluated papers
- Enhanced UI: Improved progress indicators and real-time notifications
- Database Optimization: WAL mode and performance pragmas for better concurrency
Hugging Face Spaces Deployment
This application is configured for deployment on Hugging Face Spaces.
Configuration
- Port: 7860 (Hugging Face Spaces standard)
- Health Check: /api/health endpoint
- Docker: Optimized Dockerfile for containerized deployment
Deployment Steps
- Fork/Clone this repository to your Hugging Face account
- Create a new Space on Hugging Face
- Select Docker as the SDK
- Set Environment Variables: ANTHROPIC_API_KEY (your Anthropic API key for Claude access)
- Deploy: The Space will automatically build and deploy
Environment Variables
ANTHROPIC_API_KEY=your_api_key_here
PORT=7860 # Optional, defaults to 7860
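As an illustration of how these two variables might be read at startup, here is a minimal sketch. The function name `load_settings` and the returned dictionary are hypothetical; the actual `app.py` may handle configuration differently.

```python
import os


def load_settings(env=os.environ):
    """Read the two documented variables from a mapping (hypothetical helper)."""
    api_key = env.get("ANTHROPIC_API_KEY")
    if not api_key:
        # The key is required, so fail early with a clear message.
        raise RuntimeError("ANTHROPIC_API_KEY is not set")
    # PORT is optional and falls back to the Spaces default of 7860.
    port = int(env.get("PORT", "7860"))
    return {"api_key": api_key, "port": port}
```

Passing the mapping as a parameter keeps the helper easy to exercise without touching the real environment.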
Local Development
Prerequisites
- Python 3.9+
- Anthropic API key
Installation
Clone the repository:
git clone <repository-url>
cd paperindex
Install dependencies:
pip install -r requirements.txt
Set environment variables:
export ANTHROPIC_API_KEY=your_api_key_here
Run the application:
python app.py
Access the application:
- Main interface: http://localhost:7860
- API documentation: http://localhost:7860/docs
API Endpoints
Core Endpoints
- GET /api/daily - Get daily papers with smart navigation
- GET /api/paper/{paper_id} - Get paper details
- GET /api/eval/{paper_id} - Get paper evaluation
- GET /api/health - Health check endpoint
Evaluation Endpoints
- POST /api/papers/evaluate/{arxiv_id} - Start paper evaluation
- POST /api/papers/reevaluate/{arxiv_id} - Re-evaluate a paper
- GET /api/papers/evaluate/{arxiv_id}/status - Get evaluation status
- GET /api/papers/evaluate/active-tasks - Get currently running evaluations
Cache Management
- GET /api/cache/status - Get cache statistics
- POST /api/cache/clear - Clear all cached data
- POST /api/cache/refresh/{date} - Refresh cache for specific date
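The evaluation endpoints are meant to be used together: start an evaluation, then poll its status. A client for that flow could look roughly like the sketch below, which uses only the standard library. The helper names, the `"status"` field, and the values `"completed"` and `"failed"` are assumptions for illustration; check http://localhost:7860/docs for the real response shape.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:7860"  # local development address from this README


def http_json(method, path):
    """Send a request to the API and decode the JSON body."""
    req = urllib.request.Request(BASE_URL + path, method=method)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def evaluate_and_wait(arxiv_id, call=http_json, interval=2.0, max_polls=30):
    """Start an evaluation, then poll the status endpoint until it finishes."""
    call("POST", f"/api/papers/evaluate/{arxiv_id}")
    for _ in range(max_polls):
        status = call("GET", f"/api/papers/evaluate/{arxiv_id}/status")
        # ASSUMPTION: the response carries a "status" field with these values.
        if status.get("status") in ("completed", "failed"):
            return status
        time.sleep(interval)
    raise TimeoutError(f"evaluation of {arxiv_id} did not finish in time")
```

The `call` parameter exists so the polling logic can be tried against a stub before pointing it at a running server.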
Architecture
Frontend
- HTML/CSS/JavaScript: Modern, responsive interface
- Real-time Updates: Dynamic content loading with polling
- Theme Support: Light/dark mode toggle
- Progress Indicators: Visual feedback for evaluation status
- Batch Operations: "Evaluate All" functionality with sequential processing
Backend
- FastAPI: High-performance web framework
- Async SQLite: aiosqlite with WAL mode for concurrent operations
- Async Processing: Background evaluation tasks with task tracking
- Concurrent Evaluation: Multiple papers evaluated simultaneously
- Caching: Intelligent caching system for performance
AI Integration
- Async Anthropic: Non-blocking API calls with AsyncAnthropic
- Multi-dimensional Analysis: Comprehensive evaluation criteria
- Structured Output: JSON-based evaluation results
- Error Handling: Robust error handling and retry mechanisms
Database Schema
Papers Table
CREATE TABLE papers (
arxiv_id TEXT PRIMARY KEY,
title TEXT NOT NULL,
authors TEXT NOT NULL,
abstract TEXT,
categories TEXT,
published_date TEXT,
evaluation_content TEXT,
evaluation_score REAL,
overall_score REAL,
evaluation_tags TEXT,
evaluation_status TEXT DEFAULT 'not_started',
is_evaluated BOOLEAN DEFAULT FALSE,
evaluation_date TIMESTAMP,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
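To see how the column defaults behave, the schema can be tried against an in-memory database. This sketch uses the standard-library `sqlite3` module as a synchronous stand-in for `aiosqlite`; the inserted paper (ID, title, author) is placeholder data.

```python
import sqlite3

SCHEMA = """
CREATE TABLE papers (
    arxiv_id TEXT PRIMARY KEY,
    title TEXT NOT NULL,
    authors TEXT NOT NULL,
    abstract TEXT,
    categories TEXT,
    published_date TEXT,
    evaluation_content TEXT,
    evaluation_score REAL,
    overall_score REAL,
    evaluation_tags TEXT,
    evaluation_status TEXT DEFAULT 'not_started',
    is_evaluated BOOLEAN DEFAULT FALSE,
    evaluation_date TIMESTAMP,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
"""

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allow access to columns by name
conn.execute(SCHEMA)

# Only the NOT NULL columns are supplied; everything else uses its default.
conn.execute(
    "INSERT INTO papers (arxiv_id, title, authors) VALUES (?, ?, ?)",
    ("0000.00000", "Placeholder title", "A. Author"),
)

row = conn.execute(
    "SELECT evaluation_status, is_evaluated, created_at FROM papers WHERE arxiv_id = ?",
    ("0000.00000",),
).fetchone()
```

A freshly inserted paper starts with `evaluation_status` set to `'not_started'` and `is_evaluated` set to false, which is how un-evaluated papers can be told apart from evaluated ones.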
Database Optimizations
- WAL Mode: PRAGMA journal_mode=WAL for better concurrency
- Performance Pragmas: Optimized settings for concurrent access
- Asynchronous Operations: All database calls are async/await
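For readers unfamiliar with these settings, the sketch below shows how WAL mode is typically enabled on a connection. It uses the synchronous `sqlite3` module rather than `aiosqlite`, and the `synchronous` and `busy_timeout` values are illustrative assumptions, since the project's actual pragma list is not given here.

```python
import os
import sqlite3
import tempfile


def open_wal_connection(path):
    """Open a file-backed database and switch it to write-ahead logging."""
    conn = sqlite3.connect(path)
    # The pragma returns the journal mode now in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    # ASSUMPTION: example tuning values, not the project's real settings.
    conn.execute("PRAGMA synchronous=NORMAL")
    conn.execute("PRAGMA busy_timeout=5000")
    return conn, mode


# WAL needs a real file; an in-memory database cannot use it.
with tempfile.TemporaryDirectory() as tmp:
    conn, mode = open_wal_connection(os.path.join(tmp, "papers.db"))
    print(mode)
    conn.close()
```

In WAL mode readers do not block the writer and the writer does not block readers, which is what makes it useful when several evaluations write results while the dashboard is reading.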
Evaluation Dimensions
The system evaluates papers across 12 key dimensions:
- Task Formalization - Clarity of problem definition
- Data & Resource Availability - Access to required data
- Input-Output Complexity - Complexity of inputs/outputs
- Real-World Interaction - Practical applicability
- Existing AI Coverage - Current AI capabilities
- Automation Barriers - Technical challenges
- Human Originality - Creative contribution
- Safety & Ethics - Responsible AI considerations
- Societal/Economic Impact - Broader implications
- Technical Maturity Needed - Development requirements
- 3-Year Feasibility - Short-term potential
- Overall Automatability - Comprehensive assessment
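Because results are stored as JSON, a validation step that checks all 12 dimensions are present can catch malformed model output early. The sketch below is one way to do that; the key names are hypothetical snake_case versions of the dimension names above, and the real JSON layout may differ.

```python
import json

# ASSUMPTION: hypothetical JSON keys derived from the dimension names.
DIMENSIONS = [
    "task_formalization",
    "data_resource_availability",
    "input_output_complexity",
    "real_world_interaction",
    "existing_ai_coverage",
    "automation_barriers",
    "human_originality",
    "safety_ethics",
    "societal_economic_impact",
    "technical_maturity_needed",
    "three_year_feasibility",
    "overall_automatability",
]


def parse_evaluation(raw):
    """Parse a JSON string and require a numeric score for every dimension."""
    data = json.loads(raw)
    missing = [name for name in DIMENSIONS if name not in data]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    return {name: float(data[name]) for name in DIMENSIONS}
```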
Key Features
Concurrent Evaluation
- Multiple papers can be evaluated simultaneously
- Global task tracking prevents duplicate evaluations
- Real-time status updates via polling
- Automatic error handling and recovery
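The idea behind global task tracking can be sketched with a dictionary of running `asyncio` tasks keyed by paper ID. This is an illustration under assumptions, not the project's code: `active_tasks`, `start_evaluation`, and the placeholder `evaluate_paper` are hypothetical names.

```python
import asyncio

active_tasks = {}  # arxiv_id -> asyncio.Task currently evaluating that paper


async def evaluate_paper(arxiv_id):
    """Placeholder for the real model call."""
    await asyncio.sleep(0.01)
    return f"evaluated {arxiv_id}"


def start_evaluation(arxiv_id):
    """Start an evaluation unless one is already running for this paper.

    Returns (task, started); started is False when a running task was reused.
    Must be called from inside a running event loop.
    """
    existing = active_tasks.get(arxiv_id)
    if existing is not None and not existing.done():
        return existing, False
    task = asyncio.create_task(evaluate_paper(arxiv_id))
    active_tasks[arxiv_id] = task

    def _forget(done_task, key=arxiv_id):
        # Remove the entry only if it still points at this task.
        if active_tasks.get(key) is done_task:
            del active_tasks[key]

    task.add_done_callback(_forget)
    return task, True


async def main():
    first, started_first = start_evaluation("paper-a")
    second, started_second = start_evaluation("paper-a")  # duplicate request
    other, _ = start_evaluation("paper-b")  # runs concurrently with paper-a
    results = await asyncio.gather(first, other)
    return started_first, started_second, first is second, results
```

Running `asyncio.run(main())` shows that the second request for the same paper reuses the running task, while a different paper is evaluated alongside it.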
Re-evaluation System
- "Re-evaluate" button appears after initial evaluation
- Updates existing evaluation results in database
- Maintains evaluation history and timestamps
- Same comprehensive evaluation criteria
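Overwriting a stored evaluation comes down to an `UPDATE` that replaces the result columns and refreshes the timestamps. The sketch below uses a reduced version of the papers table and the synchronous `sqlite3` module; the function name `save_evaluation` and the status value `'completed'` are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced subset of the papers table, enough to show the update.
conn.execute(
    """
    CREATE TABLE papers (
        arxiv_id TEXT PRIMARY KEY,
        evaluation_content TEXT,
        overall_score REAL,
        evaluation_status TEXT DEFAULT 'not_started',
        is_evaluated BOOLEAN DEFAULT FALSE,
        evaluation_date TIMESTAMP,
        updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
    """
)


def save_evaluation(conn, arxiv_id, content, overall_score):
    """Store or overwrite the evaluation for one paper; True if a row changed."""
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            """
            UPDATE papers
               SET evaluation_content = ?,
                   overall_score = ?,
                   evaluation_status = 'completed',  -- ASSUMPTION: status name
                   is_evaluated = 1,
                   evaluation_date = CURRENT_TIMESTAMP,
                   updated_at = CURRENT_TIMESTAMP
             WHERE arxiv_id = ?
            """,
            (content, overall_score, arxiv_id),
        )
    return cur.rowcount == 1
```

The same function serves both the first evaluation and a re-evaluation, since each call replaces whatever result was stored before.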
Batch Processing
- "Evaluate All" button processes all un-evaluated papers
- Sequential processing with delays to prevent API overload
- Progress tracking and real-time notifications
- Automatic button state management
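Sequential processing with a pause between papers can be expressed as a simple loop. In the application this loop is described as part of the "Evaluate All" feature; the Python sketch below only illustrates the pattern, and `evaluate_all`, its parameters, and the one-second default delay are assumptions.

```python
import asyncio


async def evaluate_all(arxiv_ids, evaluate, delay_seconds=1.0, on_progress=None):
    """Evaluate papers one at a time, pausing between calls.

    `evaluate` is any coroutine function taking an arXiv ID. A failure for one
    paper is recorded and does not stop the rest of the batch.
    """
    results = {}
    total = len(arxiv_ids)
    for index, arxiv_id in enumerate(arxiv_ids, start=1):
        try:
            results[arxiv_id] = await evaluate(arxiv_id)
        except Exception as exc:
            results[arxiv_id] = exc
        if on_progress is not None:
            on_progress(index, total)  # e.g. update a progress indicator
        if index < total:
            # Pause between papers to avoid overloading the API.
            await asyncio.sleep(delay_seconds)
    return results
```

Recording exceptions per paper, rather than raising, lets the caller report which papers failed once the whole batch has run.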
Enhanced UI/UX
- Progress circles with proper layering
- Bottom-right notification system
- Dynamic button states and text updates
- Responsive design with modern styling
Performance Optimizations
Database
- Asynchronous operations with aiosqlite
- WAL mode for better concurrency
- Optimized SQLite pragmas
- Connection pooling and management
API Calls
- Non-blocking Anthropic API calls
- Concurrent evaluation processing
- Task tracking and management
- Error handling and retry logic
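Retry logic for a non-blocking API call is commonly written as a loop with exponential backoff. The sketch below shows that general pattern; it is not taken from the project, and `with_retries`, the attempt count, and the delays are illustrative assumptions.

```python
import asyncio
import random


async def with_retries(make_call, attempts=3, base_delay=1.0, retry_on=(Exception,)):
    """Await make_call(), retrying with exponential backoff on failure.

    `make_call` is a zero-argument coroutine function, so each attempt
    creates a fresh coroutine.
    """
    for attempt in range(1, attempts + 1):
        try:
            return await make_call()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            delay = base_delay * 2 ** (attempt - 1)
            # A little random jitter keeps concurrent retries from lining up.
            await asyncio.sleep(delay + random.uniform(0, delay / 10))
```

In practice `retry_on` would be narrowed to transient errors such as timeouts or rate limits, so that permanent failures are reported immediately.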
Frontend
- Efficient DOM manipulation
- Polling with appropriate intervals
- Memory management for log entries
- Optimized event handling
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
License
This project is licensed under the MIT License - see the LICENSE file for details.