DVampire committed
Commit 310e884 · 1 Parent(s): 43dec7b

update website

Files changed (4)
  1. .dockerignore +68 -0
  2. Dockerfile +22 -2
  3. README.md +142 -27
  4. app.py +9 -1
.dockerignore ADDED
@@ -0,0 +1,68 @@
+ # Git
+ .git
+ .gitignore
+
+ # Python
+ __pycache__
+ *.pyc
+ *.pyo
+ *.pyd
+ .Python
+ env
+ pip-log.txt
+ pip-delete-this-directory.txt
+ .tox
+ .coverage
+ .coverage.*
+ .cache
+ nosetests.xml
+ coverage.xml
+ *.cover
+ *.log
+ .git
+ .mypy_cache
+ .pytest_cache
+ .hypothesis
+
+ # Virtual environments
+ .env
+ .venv
+ env/
+ venv/
+ ENV/
+ env.bak/
+ venv.bak/
+
+ # IDE
+ .vscode/
+ .idea/
+ *.swp
+ *.swo
+ *~
+
+ # OS
+ .DS_Store
+ .DS_Store?
+ ._*
+ .Spotlight-V100
+ .Trashes
+ ehthumbs.db
+ Thumbs.db
+
+ # Development files
+ *.log
+ *.tmp
+ *.temp
+ workdir/
+ test_*.py
+ *_test.py
+
+ # Documentation
+ README.md
+ *.md
+ docs/
+
+ # Docker
+ Dockerfile
+ .dockerignore
+ docker-compose*.yml
Dockerfile CHANGED
@@ -1,13 +1,33 @@
- FROM python:3.9
+ FROM python:3.9-slim

+ # Set environment variables
+ ENV PYTHONUNBUFFERED=1
+ ENV PYTHONDONTWRITEBYTECODE=1
+
+ # Create user
  RUN useradd -m -u 1000 user
  USER user
  ENV PATH="/home/user/.local/bin:$PATH"

+ # Set working directory
  WORKDIR /app

+ # Copy requirements first for better caching
  COPY --chown=user ./requirements.txt requirements.txt
- RUN pip install --no-cache-dir --upgrade -r requirements.txt

+ # Install dependencies
+ RUN pip install --no-cache-dir --upgrade pip && \
+     pip install --no-cache-dir --user -r requirements.txt
+
+ # Copy application code
  COPY --chown=user . /app
+
+ # Expose port (Hugging Face Spaces uses 7860)
+ EXPOSE 7860
+
+ # Health check
+ HEALTHCHECK --interval=30s --timeout=30s --start-period=5s --retries=3 \
+     CMD curl -f http://localhost:7860/api/health || exit 1
+
+ # Run the application
  CMD ["python", "app.py"]
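One caveat on the HEALTHCHECK in this Dockerfile: it shells out to `curl`, and the slim variants of the official Python image generally do not ship `curl`, so the probe may fail even while the app is serving requests. Below is a minimal sketch of a curl-free probe using only the standard library. The function names `probe` and `main` are hypothetical and not part of this commit; the URL is the one from the HEALTHCHECK line.

```python
import http.client
import urllib.error
import urllib.request

# URL taken from the HEALTHCHECK line in the Dockerfile diff.
DEFAULT_URL = "http://localhost:7860/api/health"


def probe(url: str = DEFAULT_URL, timeout: float = 5.0) -> bool:
    """Return True if the endpoint answers with a 2xx status, False otherwise."""
    # An empty ProxyHandler bypasses any proxy configured in the environment,
    # which is what a probe against localhost wants.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or a malformed URL.
        return False


def main() -> int:
    """Map the probe result to a process exit code (0 healthy, 1 unhealthy)."""
    return 0 if probe() else 1
```

If this were saved as a script that ends with `sys.exit(main())`, the HEALTHCHECK could invoke it with `python` instead of `curl`. Installing `curl` with `apt-get` before the `USER user` line would be the other option.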
README.md CHANGED
@@ -9,51 +9,166 @@ app_file: app.py
  pinned: false
  ---

- # PaperIndex
+ # Paper Index - AI Paper Evaluation System

- A beautiful web application for browsing and evaluating daily papers from Hugging Face, featuring a modern UI inspired by Hugging Face's design.
+ A comprehensive system for evaluating AI research papers using advanced language models.

  ## Features

- - 📰 **Daily Papers**: Browse the latest papers from Hugging Face
- - 🎨 **Beautiful UI**: Modern design with day/night theme switching
- - 📊 **Paper Evaluation**: Detailed evaluation pages with radar charts
- - 🔄 **Smart Caching**: Intelligent caching system for better performance
- - 📱 **Responsive**: Works perfectly on desktop and mobile devices
+ - **Daily Paper Crawling**: Automatically fetches papers from Hugging Face daily
+ - **AI Evaluation**: Uses Claude Sonnet to evaluate papers across multiple dimensions
+ - **Interactive Dashboard**: Beautiful web interface for browsing and evaluating papers
+ - **Database Storage**: Persistent storage of papers and evaluations
+ - **Smart Navigation**: Intelligent date navigation with fallback mechanisms

- ## Setup
+ ## Hugging Face Spaces Deployment

- 1. **Configure API Key**: Add `ANTHROPIC_API_KEY` to your Space secrets
- 2. **Wait for Build**: The app will automatically build and deploy
- 3. **Access**: Your app will be available at the Space URL
+ This application is configured for deployment on Hugging Face Spaces.

- ## Local Development
+ ### Configuration

- ```bash
- # Install dependencies
- pip install -r requirements.txt
+ - **Port**: 7860 (Hugging Face Spaces standard)
+ - **Health Check**: `/api/health` endpoint
+ - **Docker**: Optimized Dockerfile for containerized deployment
+
+ ### Deployment Steps

- # Run development server
- python -m uvicorn server:app --reload --host 0.0.0.0 --port 8000
+ 1. **Fork/Clone** this repository to your Hugging Face account
+ 2. **Create a new Space** on Hugging Face
+ 3. **Select Docker** as the SDK
+ 4. **Set Environment Variables**:
+    - `ANTHROPIC_API_KEY`: Your Anthropic API key for Claude access
+ 5. **Deploy**: The Space will automatically build and deploy
+
+ ### Environment Variables
+
+ ```bash
+ ANTHROPIC_API_KEY=your_api_key_here
+ PORT=7860 # Optional, defaults to 7860
  ```

+ ## Local Development
+
+ ### Prerequisites
+
+ - Python 3.9+
+ - Anthropic API key
+
+ ### Installation
+
+ 1. **Clone the repository**:
+    ```bash
+    git clone <repository-url>
+    cd paperindex
+    ```
+
+ 2. **Install dependencies**:
+    ```bash
+    pip install -r requirements.txt
+    ```
+
+ 3. **Set environment variables**:
+    ```bash
+    export ANTHROPIC_API_KEY=your_api_key_here
+    ```
+
+ 4. **Run the application**:
+    ```bash
+    python app.py
+    ```
+
+ 5. **Access the application**:
+    - Main interface: http://localhost:7860
+    - API documentation: http://localhost:7860/docs
+
  ## API Endpoints

- - `GET /` - Main page
- - `GET /paper.html?id={paper_id}` - Paper evaluation page
- - `GET /api/daily?date_str={date}` - Get daily papers
+ ### Core Endpoints
+
+ - `GET /api/daily` - Get daily papers with smart navigation
+ - `GET /api/paper/{paper_id}` - Get paper details
  - `GET /api/eval/{paper_id}` - Get paper evaluation
- - `GET /api/cache/status` - Get cache status
- - `POST /api/cache/clear` - Clear cache
+ - `GET /api/health` - Health check endpoint
+
+ ### Evaluation Endpoints
+
+ - `POST /api/papers/evaluate/{arxiv_id}` - Start paper evaluation
+ - `GET /api/papers/evaluate/{arxiv_id}/status` - Get evaluation status
+
+ ### Cache Management
+
+ - `GET /api/cache/status` - Get cache statistics
+ - `POST /api/cache/clear` - Clear all cached data
+ - `POST /api/cache/refresh/{date}` - Refresh cache for specific date
+
+ ## Architecture
+
+ ### Frontend
+ - **HTML/CSS/JavaScript**: Modern, responsive interface
+ - **Real-time Updates**: Dynamic content loading
+ - **Theme Support**: Light/dark mode toggle
+
+ ### Backend
+ - **FastAPI**: High-performance web framework
+ - **SQLite**: Lightweight database for paper storage
+ - **Async Processing**: Background evaluation tasks
+ - **Caching**: Intelligent caching system for performance
+
+ ### AI Integration
+ - **Claude Sonnet**: Advanced paper evaluation
+ - **Multi-dimensional Analysis**: Comprehensive evaluation criteria
+ - **Structured Output**: JSON-based evaluation results
+
+ ## Database Schema
+
+ ### Papers Table
+ ```sql
+ CREATE TABLE papers (
+     arxiv_id TEXT PRIMARY KEY,
+     title TEXT NOT NULL,
+     authors TEXT NOT NULL,
+     abstract TEXT,
+     categories TEXT,
+     published_date TEXT,
+     evaluation_content TEXT,
+     evaluation_score REAL,
+     overall_score REAL,
+     evaluation_tags TEXT,
+     evaluation_status TEXT DEFAULT 'not_started',
+     is_evaluated BOOLEAN DEFAULT FALSE,
+     evaluation_date TIMESTAMP,
+     created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
+     updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
+ );
+ ```
+
+ ## Evaluation Dimensions
+
+ The system evaluates papers across 12 key dimensions:
+
+ 1. **Task Formalization** - Clarity of problem definition
+ 2. **Data & Resource Availability** - Access to required data
+ 3. **Input-Output Complexity** - Complexity of inputs/outputs
+ 4. **Real-World Interaction** - Practical applicability
+ 5. **Existing AI Coverage** - Current AI capabilities
+ 6. **Automation Barriers** - Technical challenges
+ 7. **Human Originality** - Creative contribution
+ 8. **Safety & Ethics** - Responsible AI considerations
+ 9. **Societal/Economic Impact** - Broader implications
+ 10. **Technical Maturity Needed** - Development requirements
+ 11. **3-Year Feasibility** - Short-term potential
+ 12. **Overall Automatability** - Comprehensive assessment

- ## Security
+ ## Contributing

- - API keys are stored securely in HF Spaces secrets
- - Never hardcode sensitive information in your code
- - Use environment variables for local development
+ 1. Fork the repository
+ 2. Create a feature branch
+ 3. Make your changes
+ 4. Add tests if applicable
+ 5. Submit a pull request

  ## License

- MIT License
+ This project is licensed under the MIT License - see the LICENSE file for details.
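The Papers table added to the README can be exercised with Python's built-in `sqlite3` module. The following is a sketch under stated assumptions, not the code in app.py: the helper names (`create_schema`, `add_paper`, `mark_evaluated`), the `'completed'` status value, and the sample data are all hypothetical. Only the DDL is copied from the README diff.

```python
import sqlite3

# DDL copied from the "Papers Table" section of the README diff.
PAPERS_DDL = """
CREATE TABLE papers (
    arxiv_id TEXT PRIMARY KEY,
    title TEXT NOT NULL,
    authors TEXT NOT NULL,
    abstract TEXT,
    categories TEXT,
    published_date TEXT,
    evaluation_content TEXT,
    evaluation_score REAL,
    overall_score REAL,
    evaluation_tags TEXT,
    evaluation_status TEXT DEFAULT 'not_started',
    is_evaluated BOOLEAN DEFAULT FALSE,
    evaluation_date TIMESTAMP,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
"""


def create_schema(conn: sqlite3.Connection) -> None:
    """Create the papers table on the given connection."""
    conn.execute(PAPERS_DDL)


def add_paper(conn: sqlite3.Connection, arxiv_id: str, title: str, authors: str) -> None:
    """Insert a paper; evaluation columns fall back to their declared defaults."""
    conn.execute(
        "INSERT INTO papers (arxiv_id, title, authors) VALUES (?, ?, ?)",
        (arxiv_id, title, authors),
    )


def mark_evaluated(conn: sqlite3.Connection, arxiv_id: str, overall_score: float) -> None:
    """Record an evaluation result. The 'completed' status name is an assumption."""
    conn.execute(
        """
        UPDATE papers
        SET overall_score = ?,
            evaluation_status = 'completed',
            is_evaluated = 1,
            evaluation_date = CURRENT_TIMESTAMP,
            updated_at = CURRENT_TIMESTAMP
        WHERE arxiv_id = ?
        """,
        (overall_score, arxiv_id),
    )
```

Note that `DEFAULT FALSE` relies on the `TRUE`/`FALSE` keywords, which SQLite accepts from version 3.23.0 onward. The flag is stored as the integer 0 or 1.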
app.py CHANGED
@@ -667,6 +667,12 @@ def clear_cache() -> Dict[str, str]:
      return {"message": "Cache cleared successfully"}


+ @app.get("/api/health")
+ def health_check() -> Dict[str, str]:
+     """Health check endpoint for Hugging Face Spaces"""
+     return {"status": "healthy", "message": "Paper Index API is running"}
+
+
  @app.post("/api/cache/refresh/{date_str}")
  async def refresh_cache(date_str: str) -> Dict[str, Any]:
      """Force refresh cache for a specific date"""
@@ -719,4 +725,6 @@ if __name__ == "__main__":
      app.mount("/", StaticFiles(directory=config.frontend_path, html=True), name="static")
      logger.info(f"| Frontend initialized at: {config.frontend_path}")

-     uvicorn.run(app, host="0.0.0.0", port=8000)
+     # Use port 7860 for Hugging Face Spaces, fallback to 8000 for local development
+     port = int(os.environ.get("PORT", 7860))
+     uvicorn.run(app, host="0.0.0.0", port=port)
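A note on the port selection in this hunk: the added comment mentions a fallback to 8000 for local development, but the code as committed defaults to 7860 whenever `PORT` is unset, so nothing selects 8000 automatically. The diff also does not show whether app.py already imports `os`; if it does not, `import os` is needed. Below is a sketch of the same lookup as a standalone function. The name `resolve_port` is hypothetical, and the empty-value handling and range check are additions that the commit does not contain.

```python
import os
from typing import Mapping, Optional

DEFAULT_PORT = 7860  # Default used in the diff (the Hugging Face Spaces port).


def resolve_port(env: Optional[Mapping[str, str]] = None) -> int:
    """Return the port from PORT, or DEFAULT_PORT when PORT is unset or empty."""
    env = os.environ if env is None else env
    raw = env.get("PORT", "").strip()
    if not raw:
        # Addition: the committed int(os.environ.get(...)) would raise on an empty string.
        return DEFAULT_PORT
    port = int(raw)  # Raises ValueError for non-numeric values, as the committed code does.
    if not 0 < port < 65536:
        # Addition: reject values outside the valid TCP port range.
        raise ValueError(f"PORT out of range: {port}")
    return port
```

To serve on port 8000 locally, as the old code did, the variable has to be set explicitly, for example `PORT=8000 python app.py`.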