Spaces: DVampire committed · Commit 78f6650 · 1 Parent(s): 6c5cf21

update website

Files changed:
- DATABASE_MIGRATION_SUMMARY.md +0 -147
- DATABASE_USAGE.md +0 -182
- PROJECT_STRUCTURE.md +0 -87
- app.py +173 -75
- frontend/index.html +10 -4
- frontend/main.js +250 -15
- frontend/paper.js +22 -2
- frontend/styles.css +150 -2
- requirements.txt +1 -0
- src/agents/evaluator.py +5 -5
- src/database/db.py +101 -92
- debug_comparison.py → test/debug_comparison.py +0 -0
- test/test_async_db.py +138 -0
- test/test_concurrent_eval.py +97 -0
- test_evaluation.py → test/test_evaluation.py +7 -7
DATABASE_MIGRATION_SUMMARY.md
DELETED
@@ -1,147 +0,0 @@

# Database Migration Summary

## Overview

The system has been migrated from JSON file storage to SQLite database storage. Each arXiv paper's evaluation content is now stored in the database, enabling better data management and querying.

## Main Changes

### 1. Database schema (`src/database/db.py`)

**New `papers` table:**

- `arxiv_id`: unique paper identifier
- `title`, `authors`, `abstract`: basic paper information
- `evaluation_content`: evaluation content (JSON format)
- `evaluation_score`: overall automatability score
- `evaluation_tags`: evaluation tags
- `is_evaluated`: evaluation status flag
- `evaluation_date`: evaluation timestamp
- `created_at`, `updated_at`: record timestamps

**New database methods:**

- `insert_paper()`: insert a new paper
- `get_paper()`: fetch a single paper
- `update_paper_evaluation()`: update evaluation content
- `get_evaluated_papers()`: fetch evaluated papers
- `get_unevaluated_papers()`: fetch unevaluated papers
- `search_papers()`: search papers
- `get_papers_count()`: fetch count statistics
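The table and methods listed above can be sketched with the standard-library `sqlite3` module. This is a minimal illustration only: the column definitions follow the field list in this summary, and the `get_papers_count()` body and its return shape are assumptions, not the repository's actual `db.py`.

```python
import sqlite3

# Hypothetical DDL for the papers table described above; types follow
# the field list in this document, not the repository's actual schema.
SCHEMA = """
CREATE TABLE IF NOT EXISTS papers (
    arxiv_id           TEXT PRIMARY KEY,
    title              TEXT NOT NULL,
    authors            TEXT NOT NULL,
    abstract           TEXT,
    categories         TEXT,
    published_date     TEXT,
    evaluation_content TEXT,
    evaluation_score   REAL,
    evaluation_tags    TEXT,
    is_evaluated       BOOLEAN DEFAULT 0,
    evaluation_date    TIMESTAMP,
    created_at         TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at         TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)
"""

def get_papers_count(conn: sqlite3.Connection) -> dict:
    # Sketch of the statistics method named above (assumed return shape).
    cur = conn.execute(
        "SELECT COUNT(*), SUM(CASE WHEN is_evaluated THEN 1 ELSE 0 END) "
        "FROM papers"
    )
    total, evaluated = cur.fetchone()
    evaluated = evaluated or 0
    return {"total": total, "evaluated": evaluated, "unevaluated": total - evaluated}

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.execute(
        "INSERT INTO papers (arxiv_id, title, authors) VALUES (?, ?, ?)",
        ("2508.05629", "Example Title", "Author 1, Author 2"),
    )
    print(get_papers_count(conn))  # {'total': 1, 'evaluated': 0, 'unevaluated': 1}
```

An in-memory connection is used here so the sketch runs standalone; the real system configures a file path in `configs/paper_agent.py`.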
### 2. Evaluator changes (`src/agents/evaluator.py`)

**ConversationState class:**

- Added an `arxiv_id` field

**save_node function:**

- Now saves to the database instead of a JSON file
- Automatically extracts score and tag information
- Supports structured data storage

**run_evaluation function:**

- Added an `arxiv_id` parameter

### 3. API changes (`app.py`)

**Modified endpoints:**

- `/api/evals`: fetch the evaluation list from the database
- `/api/has-eval/{paper_id}`: check evaluation status in the database
- `/api/eval/{paper_id}`: fetch evaluation content from the database

**New endpoints:**

- `/api/papers/status`: fetch paper statistics
- `/api/papers/insert`: insert a new paper
- `/api/papers/evaluate/{arxiv_id}`: evaluate a paper

### 4. CLI changes (`src/cli/cli.py`)

**New argument:**

- `--arxiv-id`: specify the paper's arXiv ID

**Enhancements:**

- Evaluation results can be saved to the database
- Backward compatible (results can still be saved to files)

## Usage Examples

### 1. Evaluate a paper with the CLI and save to the database

```bash
# Evaluate a paper and save to the database
python cli.py https://arxiv.org/pdf/2508.05629 --arxiv-id 2508.05629

# Save to both a file and the database
python cli.py https://arxiv.org/pdf/2508.05629 --arxiv-id 2508.05629 -o /path/to/output
```

### 2. Insert a paper via the API

```bash
curl -X POST "http://localhost:8000/api/papers/insert" \
    -H "Content-Type: application/json" \
    -d '{
        "arxiv_id": "2508.05629",
        "title": "Your Paper Title",
        "authors": "Author 1, Author 2",
        "abstract": "Paper abstract...",
        "categories": "cs.AI, cs.LG",
        "published_date": "2024-08-01"
    }'
```

### 3. Get evaluation statistics

```bash
curl "http://localhost:8000/api/papers/status"
```

## Database Advantages

1. **Structured storage**: paper information and evaluation content are separated, making the data easier to manage
2. **Status tracking**: the `is_evaluated` field tracks evaluation status
3. **Tag system**: evaluations can be tagged for classification and filtering
4. **Search**: papers can be searched by title, author, or abstract
5. **Statistics**: paper statistics are easy to obtain
6. **API support**: a complete RESTful API is provided
7. **Data integrity**: SQLite provides ACID guarantees

## Migration Notes

1. **Existing JSON files**: a script can import existing JSON files into the database
2. **Database backups**: back up the database file regularly
3. **Backward compatibility**: the CLI still supports saving to files
4. **Configured path**: the database file path is configured in `configs/paper_agent.py`

## Test Verification

Test scripts were created and run to verify all database features:

- ✅ Paper insertion
- ✅ Paper queries
- ✅ Evaluation updates
- ✅ Status checks
- ✅ Statistics
- ✅ Search

## Next Steps

1. **Data migration**: write a script to import existing JSON files into the database
2. **Frontend updates**: update the frontend to support the new database features
3. **Batch operations**: add batch paper insertion and evaluation
4. **Data export**: add data export functionality
5. **Performance**: add indexes to optimize for large datasets

## File Inventory

**Modified files:**

- `src/database/db.py` - database schema and operations
- `src/agents/evaluator.py` - evaluator changes
- `app.py` - API changes
- `src/cli/cli.py` - CLI changes

**New files:**

- `DATABASE_USAGE.md` - usage documentation
- `DATABASE_MIGRATION_SUMMARY.md` - this summary

**Configuration:**

- `configs/paper_agent.py` - database path configuration

The system now fully supports database storage and can manage paper evaluation data much better!
DATABASE_USAGE.md
DELETED
@@ -1,182 +0,0 @@
# Papers Database Usage Guide

## Overview

The system now stores arXiv papers and their evaluation content in a SQLite database instead of JSON files. This enables better management of paper data, with support for queries, statistics, and tag management.

## Database Schema

### `papers` table

| Field | Type | Description |
|-------|------|-------------|
| arxiv_id | TEXT PRIMARY KEY | arXiv paper ID |
| title | TEXT NOT NULL | paper title |
| authors | TEXT NOT NULL | author list |
| abstract | TEXT | paper abstract |
| categories | TEXT | paper categories |
| published_date | TEXT | publication date |
| evaluation_content | TEXT | evaluation content (JSON format) |
| evaluation_score | REAL | overall automatability score |
| evaluation_tags | TEXT | evaluation tags |
| is_evaluated | BOOLEAN | whether the paper has been evaluated |
| evaluation_date | TIMESTAMP | evaluation date |
| created_at | TIMESTAMP | creation time |
| updated_at | TIMESTAMP | last update time |

## Usage

### 1. Insert a paper

```python
from src.database.db import db

# Insert a new paper
db.insert_paper(
    arxiv_id="2508.05629",
    title="Your Paper Title",
    authors="Author 1, Author 2",
    abstract="Paper abstract...",
    categories="cs.AI, cs.LG",
    published_date="2024-08-01"
)
```

### 2. Update an evaluation

```python
# Update a paper's evaluation
db.update_paper_evaluation(
    arxiv_id="2508.05629",
    evaluation_content='{"overall_automatability": 3, "three_year_feasibility": 75}',
    evaluation_score=3.0,
    evaluation_tags="3yr_feasibility:75%,overall_automatability:3/4"
)
```

### 3. Query papers

```python
# Fetch a single paper
paper = db.get_paper("2508.05629")

# Fetch all evaluated papers
evaluated_papers = db.get_evaluated_papers()

# Fetch all unevaluated papers
unevaluated_papers = db.get_unevaluated_papers()

# Search papers
search_results = db.search_papers("AI")
```

### 4. Statistics

```python
# Get paper statistics
count = db.get_papers_count()
print(f"Total papers: {count['total']}")
print(f"Evaluated: {count['evaluated']}")
print(f"Unevaluated: {count['unevaluated']}")
```

## API Endpoints

### List evaluations
```
GET /api/evals
```

### Check whether a paper has been evaluated
```
GET /api/has-eval/{paper_id}
```

### Get a paper's evaluation
```
GET /api/eval/{paper_id}
```

### Get paper statistics
```
GET /api/papers/status
```

### Insert a new paper
```
POST /api/papers/insert
Content-Type: application/json

{
    "arxiv_id": "2508.05629",
    "title": "Paper Title",
    "authors": "Author 1, Author 2",
    "abstract": "Abstract...",
    "categories": "cs.AI",
    "published_date": "2024-08-01"
}
```

### Evaluate a paper
```
POST /api/papers/evaluate/{arxiv_id}
```

## CLI Usage

### Evaluate a paper and save to the database

```bash
# Use the arxiv-id argument to save the evaluation to the database
python cli.py https://arxiv.org/pdf/2508.05629 --arxiv-id 2508.05629

# Save to both a file and the database
python cli.py https://arxiv.org/pdf/2508.05629 --arxiv-id 2508.05629 -o /path/to/output
```

## Migrating Existing Data

If you have existing JSON evaluation files, you can write a script to import them into the database:

```python
import json
import os
from src.database.db import db

def migrate_json_to_db(json_dir="workdir"):
    """Migrate JSON evaluation files into the database."""
    for filename in os.listdir(json_dir):
        if filename.endswith('.json'):
            filepath = os.path.join(json_dir, filename)
            with open(filepath, 'r') as f:
                data = json.load(f)

            # Extract the arxiv_id (assuming the filename contains it)
            arxiv_id = filename.split('_')[0]  # adjust to the actual filename format

            # Update the evaluation in the database
            if 'response' in data:
                db.update_paper_evaluation(
                    arxiv_id=arxiv_id,
                    evaluation_content=data['response'],
                    evaluation_score=None,  # needs to be parsed from the content
                    evaluation_tags=None
                )
                print(f"Migrated evaluation for paper {arxiv_id}")
```

## Advantages

1. **Structured storage**: paper information and evaluation content are stored separately, making queries easier
2. **Tag system**: evaluations can be tagged for classification and filtering
3. **Statistics**: paper statistics are easy to obtain
4. **Search**: papers can be searched by title, author, or abstract
5. **Status management**: the `is_evaluated` field tracks evaluation status
6. **API support**: a complete RESTful API is provided

## Notes

1. Insert a paper's basic information before evaluating it
2. Use JSON for evaluation content so it is easy to parse and display
3. Back up the database file regularly
4. The `evaluation_tags` field can store key score information for quick filtering
PROJECT_STRUCTURE.md
DELETED
@@ -1,87 +0,0 @@
# PaperIndex Project Structure

## Directory Layout

```
paperindex/
├── app.py                 # main application entry point
├── cli.py                 # command-line tool entry point
├── src/                   # source code
│   ├── __init__.py
│   ├── app.py             # internal app entry point (deprecated)
│   ├── agents/            # AI agent module
│   │   ├── __init__.py
│   │   ├── evaluator.py   # paper evaluator
│   │   └── prompt.py      # evaluation prompts
│   ├── database/          # database module
│   │   ├── __init__.py
│   │   ├── models.py      # database models and classes
│   │   └── papers_cache.db
│   ├── server/            # server module
│   │   ├── __init__.py
│   │   └── server.py      # FastAPI server
│   └── cli/               # command-line tool module
│       ├── __init__.py
│       └── cli.py         # CLI implementation
├── frontend/              # frontend files
│   ├── index.html
│   ├── paper.html
│   ├── main.js
│   ├── paper.js
│   └── styles.css
├── data/                  # data directory
│   └── pdfs/
├── workdir/               # working directory
├── requirements.txt       # Python dependencies
├── Dockerfile             # Docker configuration
└── README.md              # project README
```

## Modules

### `src/agents/`
AI agent module responsible for paper evaluation:
- `evaluator.py`: evaluates papers using LangGraph and the Claude API
- `prompt.py`: evaluation prompts and tool definitions

### `src/database/`
Database management module:
- `models.py`: the PapersDatabase class and database operations
- Contains the SQLite database file
- Handles paper caching and state management

### `src/server/`
FastAPI server module:
- `server.py`: the main web server implementation
- Provides the RESTful API
- Handles frontend requests

### `src/cli/`
Command-line tool module:
- `cli.py`: a standalone paper-evaluation command-line tool
- Supports evaluating local PDFs and online URLs

## Usage

### Start the web application
```bash
python app.py
```

### Use the command-line tool
```bash
python cli.py <pdf_path_or_url> [options]
```

### Development mode
```bash
# Run from inside the src directory
cd src
python -m uvicorn server.server:app --reload --host 0.0.0.0 --port 8000
```

## Import Paths

- From the repository root: `from src.agents.evaluator import Evaluator`
- From inside the src directory: `from agents.evaluator import Evaluator`
- Cross-module imports use relative or absolute paths
app.py
CHANGED

@@ -25,7 +25,6 @@ from src.database import db
 from src.logger import logger
 from src.config import config
 from src.crawl import HuggingFaceDailyPapers
-from src.utils import assemble_project_path
 from src.agents.evaluator import run_evaluation
 
 app = FastAPI(title="PaperAgent")
@@ -67,8 +66,8 @@ async def get_daily(date_str: Optional[str] = None, direction: Optional[str] = N
     hf_daily = HuggingFaceDailyPapers()
 
     # First, check if we have fresh cache for the requested date
-    cached_data = db.get_cached_papers(target_date)
-    if cached_data and db.is_cache_fresh(target_date):
+    cached_data = await db.get_cached_papers(target_date)
+    if cached_data and await db.is_cache_fresh(target_date):
         print(f"Using cached data for {target_date}")
         return {
             "date": target_date,
@@ -91,8 +90,8 @@ async def get_daily(date_str: Optional[str] = None, direction: Optional[str] = N
         print(f"Redirected from {target_date} to {actual_date}")
 
         # Check if the redirected date has fresh cache
-        cached_data = db.get_cached_papers(actual_date)
-        if cached_data and db.is_cache_fresh(actual_date):
+        cached_data = await db.get_cached_papers(actual_date)
+        if cached_data and await db.is_cache_fresh(actual_date):
             print(f"Using cached data for redirected date {actual_date}")
             return {
                 "date": actual_date,
@@ -108,7 +107,7 @@ async def get_daily(date_str: Optional[str] = None, direction: Optional[str] = N
         enriched_cards = await enrich_cards(cards)
 
         # Cache the results for the redirected date
-        db.cache_papers(actual_date, html, enriched_cards)
+        await db.cache_papers(actual_date, html, enriched_cards)
 
         return {
             "date": actual_date,
@@ -121,7 +120,7 @@ async def get_daily(date_str: Optional[str] = None, direction: Optional[str] = N
     # If we got the exact date we requested, process normally
     cards = hf_daily.parse_daily_cards(html)
     enriched_cards = await enrich_cards(cards)
-    db.cache_papers(actual_date, html, enriched_cards)
+    await db.cache_papers(actual_date, html, enriched_cards)
 
     return {
         "date": actual_date,
@@ -134,7 +133,7 @@ async def get_daily(date_str: Optional[str] = None, direction: Optional[str] = N
     except Exception as e:
         print(f"Failed to fetch {target_date} for previous navigation: {e}")
         # Fallback to cached data if available
-        cached_data = db.get_cached_papers(target_date)
+        cached_data = await db.get_cached_papers(target_date)
         if cached_data:
             return {
                 "date": target_date,
@@ -157,7 +156,7 @@ async def get_daily(date_str: Optional[str] = None, direction: Optional[str] = N
     if actual_date == target_date:
         cards = hf_daily.parse_daily_cards(html)
         enriched_cards = await enrich_cards(cards)
-        db.cache_papers(actual_date, html, enriched_cards)
+        await db.cache_papers(actual_date, html, enriched_cards)
 
         return {
             "date": actual_date,
@@ -174,8 +173,8 @@ async def get_daily(date_str: Optional[str] = None, direction: Optional[str] = N
     # Try to find the next available date by incrementing
     next_date = await find_next_available_date_forward(target_date)
     if next_date:
-        cached_data = db.get_cached_papers(next_date)
-        if cached_data and db.is_cache_fresh(next_date):
+        cached_data = await db.get_cached_papers(next_date)
+        if cached_data and await db.is_cache_fresh(next_date):
             print(f"Using cached data for next available date {next_date}")
             return {
                 "date": next_date,
@@ -190,7 +189,7 @@ async def get_daily(date_str: Optional[str] = None, direction: Optional[str] = N
         actual_date, html = await hf_daily.fetch_daily_html(next_date)
         cards = hf_daily.parse_daily_cards(html)
         enriched_cards = await enrich_cards(cards)
-        db.cache_papers(actual_date, html, enriched_cards)
+        await db.cache_papers(actual_date, html, enriched_cards)
 
         return {
             "date": actual_date,
@@ -214,7 +213,7 @@ async def get_daily(date_str: Optional[str] = None, direction: Optional[str] = N
     # Try to find next available date
     next_date = await find_next_available_date_forward(target_date)
     if next_date:
-        cached_data = db.get_cached_papers(next_date)
+        cached_data = await db.get_cached_papers(next_date)
         if cached_data:
             return {
                 "date": next_date,
@@ -239,8 +238,8 @@ async def get_daily(date_str: Optional[str] = None, direction: Optional[str] = N
         print(f"Redirected from {target_date} to {actual_date}")
 
         # Check if the redirected date has fresh cache
-        cached_data = db.get_cached_papers(actual_date)
-        if cached_data and db.is_cache_fresh(actual_date):
+        cached_data = await db.get_cached_papers(actual_date)
+        if cached_data and await db.is_cache_fresh(actual_date):
             print(f"Using cached data for redirected date {actual_date}")
             return {
                 "date": actual_date,
@@ -256,7 +255,7 @@ async def get_daily(date_str: Optional[str] = None, direction: Optional[str] = N
         enriched_cards = await enrich_cards(cards)
 
         # Cache the results for the redirected date
-        db.cache_papers(actual_date, html, enriched_cards)
+        await db.cache_papers(actual_date, html, enriched_cards)
 
         return {
             "date": actual_date,
@@ -269,7 +268,7 @@ async def get_daily(date_str: Optional[str] = None, direction: Optional[str] = N
     # If we got the exact date we requested, process normally
     cards = hf_daily.parse_daily_cards(html)
    enriched_cards = await enrich_cards(cards)
-    db.cache_papers(actual_date, html, enriched_cards)
+    await db.cache_papers(actual_date, html, enriched_cards)
 
     return {
         "date": actual_date,
@@ -283,7 +282,7 @@ async def get_daily(date_str: Optional[str] = None, direction: Optional[str] = N
     print(f"Failed to fetch {target_date}: {e}")
 
     # If everything fails, return cached data if available
-    cached_data = db.get_cached_papers(target_date)
+    cached_data = await db.get_cached_papers(target_date)
     if cached_data:
         return {
             "date": target_date,
@@ -309,7 +308,7 @@ async def find_next_available_date_forward(start_date: str, max_attempts: int =
         date_str = current_date.strftime("%Y-%m-%d")
 
         # Check if we have cache for this date
-        cached_data = db.get_cached_papers(date_str)
+        cached_data = await db.get_cached_papers(date_str)
         if cached_data:
             return date_str
 
@@ -338,7 +337,7 @@ async def enrich_cards(cards):
     for c in cards:
         arxiv_id = c.get("arxiv_id")
         if arxiv_id:
-            paper = db.get_paper(arxiv_id)
+            paper = await db.get_paper(arxiv_id)
             if paper:
                 # Add evaluation status
                 c["has_eval"] = paper.get('is_evaluated', False)
@@ -369,9 +368,9 @@ async def enrich_cards(cards):
 
 
 @app.get("/api/evals")
-def list_evals() -> Dict[str, Any]:
+async def list_evals() -> Dict[str, Any]:
     # Get evaluated papers from database
-    evaluated_papers = db.get_evaluated_papers()
+    evaluated_papers = await db.get_evaluated_papers()
     items: List[Dict[str, Any]] = []
 
     for paper in evaluated_papers:
@@ -388,16 +387,16 @@ def list_evals() -> Dict[str, Any]:
 
 
 @app.get("/api/has-eval/{paper_id}")
-def has_eval(paper_id: str) -> Dict[str, bool]:
-    paper = db.get_paper(paper_id)
+async def has_eval(paper_id: str) -> Dict[str, bool]:
+    paper = await db.get_paper(paper_id)
     exists = paper is not None and paper.get('is_evaluated', False)
     return {"exists": exists}
 
 
 @app.get("/api/paper/{paper_id}")
-def get_paper_details(paper_id: str) -> Dict[str, Any]:
     """Get detailed paper information from database"""
-    paper = db.get_paper(paper_id)
     if not paper:
         raise HTTPException(status_code=404, detail="Paper not found")
 
@@ -416,8 +415,8 @@ def get_paper_details(paper_id: str) -> Dict[str, Any]:
 
 
 @app.get("/api/paper-score/{paper_id}")
-def get_paper_score(paper_id: str) -> Dict[str, Any]:
-    paper = db.get_paper(paper_id)
     print(f"Paper data for {paper_id}:", paper)
 
     if not paper or not paper.get('is_evaluated', False):
@@ -468,8 +467,8 @@ def get_paper_score(paper_id: str) -> Dict[str, Any]:
 
 
 @app.get("/api/eval/{paper_id}")
-def get_eval(paper_id: str) -> Any:
-    paper = db.get_paper(paper_id)
     if not paper or not paper.get('is_evaluated', False):
         raise HTTPException(status_code=404, detail="Evaluation not found")
 
@@ -491,12 +490,13 @@ def get_eval(paper_id: str) -> Any:
 
 
 @app.get("/api/available-dates")
-def get_available_dates() -> Dict[str, Any]:
     """Get list of available dates in the cache"""
-    with db.get_connection() as conn:
-        cursor = conn.cursor()
-        cursor.execute('SELECT date_str FROM papers_cache ORDER BY date_str DESC LIMIT 30')
 
     return {
         "available_dates": dates,
@@ -505,21 +505,21 @@ def get_available_dates() -> Dict[str, Any]:
 
 
 @app.get("/api/cache/status")
-def get_cache_status() -> Dict[str, Any]:
     """Get cache status and statistics"""
-    with db.get_connection() as conn:
-        cursor = conn.cursor()
 
         # Get total cached dates
-        cursor.execute('SELECT COUNT(*) as count FROM papers_cache')
-        total_cached = cursor.fetchone()['count']
 
         # Get latest cached date
-        cursor.execute('SELECT date_str, updated_at FROM latest_date WHERE id = 1')
-        latest_info = cursor.fetchone()
 
         # Get cache age distribution
-        cursor.execute('''
             SELECT
                 CASE
                     WHEN updated_at > datetime('now', '-1 hour') THEN '1 hour'
@@ -531,7 +531,8 @@ def get_cache_status() -> Dict[str, Any]:
             FROM papers_cache
             GROUP BY age_group
         ''')
 
     return {
         "total_cached_dates": total_cached,
@@ -542,12 +543,12 @@ def get_cache_status() -> Dict[str, Any]:
 
 
 @app.get("/api/papers/status")
-def get_papers_status() -> Dict[str, Any]:
     """Get papers database status and statistics"""
-    papers_count = db.get_papers_count()
 
     # Get recent evaluations
-    recent_papers = db.get_evaluated_papers()
     recent_evaluations = []
     for paper in recent_papers[:10]:  # Get last 10 evaluations
         recent_evaluations.append({
@@ -564,7 +565,7 @@ def get_papers_status() -> Dict[str, Any]:
 
 
 @app.post("/api/papers/insert")
-def insert_paper(paper_data: Dict[str, Any]) -> Dict[str, Any]:
     """Insert a new paper into the database"""
     try:
         required_fields = ['arxiv_id', 'title', 'authors']
@@ -572,7 +573,7 @@ def insert_paper(paper_data: Dict[str, Any]) -> Dict[str, Any]:
         if field not in paper_data:
             raise HTTPException(status_code=400, detail=f"Missing required field: {field}")
 
-        db.insert_paper(
             arxiv_id=paper_data['arxiv_id'],
             title=paper_data['title'],
             authors=paper_data['authors'],
@@ -586,19 +587,26 @@ def insert_paper(paper_data: Dict[str, Any]) -> Dict[str, Any]:
         raise HTTPException(status_code=500, detail=f"Failed to insert paper: {str(e)}")
 
 
 @app.post("/api/papers/evaluate/{arxiv_id}")
-async def evaluate_paper(arxiv_id: str) -> Dict[str, Any]:
     """Evaluate a paper by its arxiv_id"""
     try:
         # Check if paper exists in database
-        paper = db.get_paper(arxiv_id)
         if not paper:
             raise HTTPException(status_code=404, detail="Paper not found in database")
 
-        # Check if already evaluated
-        if paper.get('is_evaluated', False):
             return {"message": f"Paper {arxiv_id} already evaluated", "status": "already_evaluated"}
 
         # Create PDF URL from arxiv_id
         pdf_url = f"https://arxiv.org/pdf/{arxiv_id}.pdf"
 
@@ -606,8 +614,8 @@ async def evaluate_paper(arxiv_id: str) -> Dict[str, Any]:
         async def run_eval():
             try:
                 # Update paper status to "evaluating"
-                db.update_paper_status(arxiv_id, "evaluating")
-                logger.info(f"
 
                 result = await run_evaluation(
                     pdf_path=pdf_url,
@@ -616,40 +624,51 @@ async def evaluate_paper(arxiv_id: str) -> Dict[str, Any]:
                 )
 
                 # Update paper status to "completed"
-                db.update_paper_status(arxiv_id, "completed")
-                logger.info(f"
             except Exception as e:
                 # Update paper status to "failed"
-                db.update_paper_status(arxiv_id, "failed")
-                logger.error(f"
 
-        # Start evaluation in background
-        asyncio.create_task(run_eval())
 
         return {
-            "message": f"
             "status": "started",
-            "pdf_url": pdf_url
         }
     except Exception as e:
         raise HTTPException(status_code=500, detail=f"Failed to evaluate paper: {str(e)}")
 
 
 @app.get("/api/papers/evaluate/{arxiv_id}/status")
-def get_evaluation_status(arxiv_id: str) -> Dict[str, Any]:
     """Get evaluation status for a paper"""
     try:
-        paper = db.get_paper(arxiv_id)
         if not paper:
             raise HTTPException(status_code=404, detail="Paper not found")
 
         status = paper.get('evaluation_status', 'not_started')
         is_evaluated = paper.get('is_evaluated', False)
 
         return {
             "arxiv_id": arxiv_id,
             "status": status,
             "is_evaluated": is_evaluated,
             "evaluation_date": paper.get('evaluation_date'),
             "evaluation_score": paper.get('evaluation_score')
         }
@@ -657,13 +676,88 @@ def get_evaluation_status(arxiv_id: str) -> Dict[str, Any]:
         raise HTTPException(status_code=500, detail=f"Failed to get evaluation status: {str(e)}")
 
 
 @app.post("/api/cache/clear")
-def clear_cache() -> Dict[str, str]:
     """Clear all cached data"""
-    with db.get_connection() as conn:
-        cursor = conn.cursor()
-        cursor.execute('DELETE FROM papers_cache')
-        conn.commit()
     return {"message": "Cache cleared successfully"}
 
 
@@ -679,7 +773,7 @@ async def refresh_cache(date_str: str) -> Dict[str, Any]:
     cards = hf_daily.parse_daily_cards(html)
 
     # Cache the results
-    db.cache_papers(actual_date, html, cards)
 
     return {
         "message": f"Cache refreshed for {actual_date}",
@@ -711,7 +805,7 @@ async def get_styles():
     response.headers["Expires"] = "0"
     return response
 
-
 # Parse command line arguments
 args = parse_args()
 
@@ -724,7 +818,7 @@ if __name__ == "__main__":
     logger.info(f"| Config:\n{config.pretty_text}")
 
     # Initialize the database
-    db.init_db(config=config)
     logger.info(f"| Database initialized at: {config.db_path}")
 
     # Load Frontend
@@ -733,5 +827,9 @@ if __name__ == "__main__":
     logger.info(f"| Frontend initialized at: {config.frontend_path}")
 
     # Use port 7860 for Hugging Face Spaces, fallback to 7860 for local development
-
-    uvicorn.
|
394 |
|
395 |
|
396 |
@app.get("/api/paper/{paper_id}")
|
397 |
+
async def get_paper_details(paper_id: str) -> Dict[str, Any]:
|
398 |
"""Get detailed paper information from database"""
|
399 |
+
paper = await db.get_paper(paper_id)
|
400 |
if not paper:
|
401 |
raise HTTPException(status_code=404, detail="Paper not found")
|
402 |
|
|
|
415 |
|
416 |
|
417 |
@app.get("/api/paper-score/{paper_id}")
|
418 |
+
async def get_paper_score(paper_id: str) -> Dict[str, Any]:
|
419 |
+
paper = await db.get_paper(paper_id)
|
420 |
print(f"Paper data for {paper_id}:", paper)
|
421 |
|
422 |
if not paper or not paper.get('is_evaluated', False):
|
|
|
467 |
|
468 |
|
469 |
@app.get("/api/eval/{paper_id}")
|
470 |
+
async def get_eval(paper_id: str) -> Any:
|
471 |
+
paper = await db.get_paper(paper_id)
|
472 |
if not paper or not paper.get('is_evaluated', False):
|
473 |
raise HTTPException(status_code=404, detail="Evaluation not found")
|
474 |
|
|
|
490 |
|
491 |
|
492 |
@app.get("/api/available-dates")
|
493 |
+
async def get_available_dates() -> Dict[str, Any]:
|
494 |
"""Get list of available dates in the cache"""
|
495 |
+
async with db.get_connection() as conn:
|
496 |
+
cursor = await conn.cursor()
|
497 |
+
await cursor.execute('SELECT date_str FROM papers_cache ORDER BY date_str DESC LIMIT 30')
|
498 |
+
rows = await cursor.fetchall()
|
499 |
+
dates = [row['date_str'] for row in rows]
|
500 |
|
501 |
return {
|
502 |
"available_dates": dates,
|
|
|
505 |
|
506 |
|
507 |
@app.get("/api/cache/status")
|
508 |
+
async def get_cache_status() -> Dict[str, Any]:
|
509 |
"""Get cache status and statistics"""
|
510 |
+
async with db.get_connection() as conn:
|
511 |
+
cursor = await conn.cursor()
|
512 |
|
513 |
# Get total cached dates
|
514 |
+
await cursor.execute('SELECT COUNT(*) as count FROM papers_cache')
|
515 |
+
total_cached = (await cursor.fetchone())['count']
|
516 |
|
517 |
# Get latest cached date
|
518 |
+
await cursor.execute('SELECT date_str, updated_at FROM latest_date WHERE id = 1')
|
519 |
+
latest_info = await cursor.fetchone()
|
520 |
|
521 |
# Get cache age distribution
|
522 |
+
await cursor.execute('''
|
523 |
SELECT
|
524 |
CASE
|
525 |
WHEN updated_at > datetime('now', '-1 hour') THEN '1 hour'
|
|
|
531 |
FROM papers_cache
|
532 |
GROUP BY age_group
|
533 |
''')
|
534 |
+
rows = await cursor.fetchall()
|
535 |
+
age_distribution = {row['age_group']: row['count'] for row in rows}
|
536 |
|
537 |
return {
|
538 |
"total_cached_dates": total_cached,
|
|
|
543 |
|
544 |
|
545 |
@app.get("/api/papers/status")
|
546 |
+
async def get_papers_status() -> Dict[str, Any]:
|
547 |
"""Get papers database status and statistics"""
|
548 |
+
papers_count = await db.get_papers_count()
|
549 |
|
550 |
# Get recent evaluations
|
551 |
+
recent_papers = await db.get_evaluated_papers()
|
552 |
recent_evaluations = []
|
553 |
for paper in recent_papers[:10]: # Get last 10 evaluations
|
554 |
recent_evaluations.append({
|
|
|
565 |
|
566 |
|
567 |
@app.post("/api/papers/insert")
|
568 |
+
async def insert_paper(paper_data: Dict[str, Any]) -> Dict[str, Any]:
|
569 |
"""Insert a new paper into the database"""
|
570 |
try:
|
571 |
required_fields = ['arxiv_id', 'title', 'authors']
|
|
|
573 |
if field not in paper_data:
|
574 |
raise HTTPException(status_code=400, detail=f"Missing required field: {field}")
|
575 |
|
576 |
+
await db.insert_paper(
|
577 |
arxiv_id=paper_data['arxiv_id'],
|
578 |
title=paper_data['title'],
|
579 |
authors=paper_data['authors'],
|
|
|
587 |
raise HTTPException(status_code=500, detail=f"Failed to insert paper: {str(e)}")
|
588 |
|
589 |
|
590 |
+
# Global task tracker for concurrent evaluations
|
591 |
+
evaluation_tasks = {}
|
592 |
+
|
593 |
@app.post("/api/papers/evaluate/{arxiv_id}")
|
594 |
+
async def evaluate_paper(arxiv_id: str, force_reevaluate: bool = False) -> Dict[str, Any]:
|
595 |
"""Evaluate a paper by its arxiv_id"""
|
596 |
try:
|
597 |
# Check if paper exists in database
|
598 |
+
paper = await db.get_paper(arxiv_id)
|
599 |
if not paper:
|
600 |
raise HTTPException(status_code=404, detail="Paper not found in database")
|
601 |
|
602 |
+
# Check if already evaluated (unless force_reevaluate is True)
|
603 |
+
if not force_reevaluate and paper.get('is_evaluated', False):
|
604 |
return {"message": f"Paper {arxiv_id} already evaluated", "status": "already_evaluated"}
|
605 |
|
606 |
+
# Check if evaluation is already running
|
607 |
+
if arxiv_id in evaluation_tasks and not evaluation_tasks[arxiv_id].done():
|
608 |
+
return {"message": f"Evaluation already running for {arxiv_id}", "status": "already_running"}
|
609 |
+
|
610 |
# Create PDF URL from arxiv_id
|
611 |
pdf_url = f"https://arxiv.org/pdf/{arxiv_id}.pdf"
|
612 |
|
|
|
614 |
async def run_eval():
|
615 |
try:
|
616 |
# Update paper status to "evaluating"
|
617 |
+
await db.update_paper_status(arxiv_id, "evaluating")
|
618 |
+
logger.info(f"Started {'re-' if force_reevaluate else ''}evaluation for {arxiv_id}")
|
619 |
|
620 |
result = await run_evaluation(
|
621 |
pdf_path=pdf_url,
|
|
|
624 |
)
|
625 |
|
626 |
# Update paper status to "completed"
|
627 |
+
await db.update_paper_status(arxiv_id, "completed")
|
628 |
+
logger.info(f"{'Re-' if force_reevaluate else ''}evaluation completed for {arxiv_id}")
|
629 |
except Exception as e:
|
630 |
# Update paper status to "failed"
|
631 |
+
await db.update_paper_status(arxiv_id, "failed")
|
632 |
+
logger.error(f"{'Re-' if force_reevaluate else ''}evaluation failed for {arxiv_id}: {str(e)}")
|
633 |
+
finally:
|
634 |
+
# Clean up task from tracker
|
635 |
+
if arxiv_id in evaluation_tasks:
|
636 |
+
del evaluation_tasks[arxiv_id]
|
637 |
|
638 |
+
# Start evaluation in background and track it
|
639 |
+
task = asyncio.create_task(run_eval())
|
640 |
+
evaluation_tasks[arxiv_id] = task
|
641 |
|
642 |
return {
|
643 |
+
"message": f"{'Re-' if force_reevaluate else ''}evaluation started for paper {arxiv_id}",
|
644 |
"status": "started",
|
645 |
+
"pdf_url": pdf_url,
|
646 |
+
"concurrent_tasks": len(evaluation_tasks),
|
647 |
+
"is_reevaluate": force_reevaluate
|
648 |
}
|
649 |
except Exception as e:
|
650 |
raise HTTPException(status_code=500, detail=f"Failed to evaluate paper: {str(e)}")
|
651 |
|
652 |
|
653 |
@app.get("/api/papers/evaluate/{arxiv_id}/status")
|
654 |
+
async def get_evaluation_status(arxiv_id: str) -> Dict[str, Any]:
|
655 |
"""Get evaluation status for a paper"""
|
656 |
try:
|
657 |
+
paper = await db.get_paper(arxiv_id)
|
658 |
if not paper:
|
659 |
raise HTTPException(status_code=404, detail="Paper not found")
|
660 |
|
661 |
status = paper.get('evaluation_status', 'not_started')
|
662 |
is_evaluated = paper.get('is_evaluated', False)
|
663 |
|
664 |
+
# Check if task is currently running
|
665 |
+
is_running = arxiv_id in evaluation_tasks and not evaluation_tasks[arxiv_id].done()
|
666 |
+
|
667 |
return {
|
668 |
"arxiv_id": arxiv_id,
|
669 |
"status": status,
|
670 |
"is_evaluated": is_evaluated,
|
671 |
+
"is_running": is_running,
|
672 |
"evaluation_date": paper.get('evaluation_date'),
|
673 |
"evaluation_score": paper.get('evaluation_score')
|
674 |
}
|
|
|
676 |
raise HTTPException(status_code=500, detail=f"Failed to get evaluation status: {str(e)}")
|
677 |
|
678 |
|
679 |
+
@app.post("/api/papers/reevaluate/{arxiv_id}")
|
680 |
+
async def reevaluate_paper(arxiv_id: str) -> Dict[str, Any]:
|
681 |
+
"""Re-evaluate a paper by its arxiv_id"""
|
682 |
+
try:
|
683 |
+
# Check if paper exists in database
|
684 |
+
paper = await db.get_paper(arxiv_id)
|
685 |
+
if not paper:
|
686 |
+
raise HTTPException(status_code=404, detail="Paper not found in database")
|
687 |
+
|
688 |
+
# Check if evaluation is already running
|
689 |
+
if arxiv_id in evaluation_tasks and not evaluation_tasks[arxiv_id].done():
|
690 |
+
return {"message": f"Evaluation already running for {arxiv_id}", "status": "already_running"}
|
691 |
+
|
692 |
+
# Create PDF URL from arxiv_id
|
693 |
+
pdf_url = f"https://arxiv.org/pdf/{arxiv_id}.pdf"
|
694 |
+
|
695 |
+
# Run re-evaluation in background task
|
696 |
+
async def run_reeval():
|
697 |
+
try:
|
698 |
+
# Update paper status to "evaluating"
|
699 |
+
await db.update_paper_status(arxiv_id, "evaluating")
|
700 |
+
logger.info(f"Started re-evaluation for {arxiv_id}")
|
701 |
+
|
702 |
+
result = await run_evaluation(
|
703 |
+
pdf_path=pdf_url,
|
704 |
+
arxiv_id=arxiv_id,
|
705 |
+
api_key=os.getenv("ANTHROPIC_API_KEY")
|
706 |
+
)
|
707 |
+
|
708 |
+
# Update paper status to "completed"
|
709 |
+
await db.update_paper_status(arxiv_id, "completed")
|
710 |
+
logger.info(f"Re-evaluation completed for {arxiv_id}")
|
711 |
+
except Exception as e:
|
712 |
+
# Update paper status to "failed"
|
713 |
+
await db.update_paper_status(arxiv_id, "failed")
|
714 |
+
logger.error(f"Re-evaluation failed for {arxiv_id}: {str(e)}")
|
715 |
+
finally:
|
716 |
+
# Clean up task from tracker
|
717 |
+
if arxiv_id in evaluation_tasks:
|
718 |
+
del evaluation_tasks[arxiv_id]
|
719 |
+
|
720 |
+
# Start re-evaluation in background and track it
|
721 |
+
task = asyncio.create_task(run_reeval())
|
722 |
+
evaluation_tasks[arxiv_id] = task
|
723 |
+
|
724 |
+
return {
|
725 |
+
"message": f"Re-evaluation started for paper {arxiv_id}",
|
726 |
+
"status": "started",
|
727 |
+
"pdf_url": pdf_url,
|
728 |
+
"concurrent_tasks": len(evaluation_tasks),
|
729 |
+
"is_reevaluate": True
|
730 |
+
}
|
731 |
+
except Exception as e:
|
732 |
+
raise HTTPException(status_code=500, detail=f"Failed to re-evaluate paper: {str(e)}")
|
733 |
+
|
734 |
+
|
735 |
+
@app.get("/api/papers/evaluate/active-tasks")
|
736 |
+
async def get_active_evaluation_tasks() -> Dict[str, Any]:
|
737 |
+
"""Get list of currently running evaluation tasks"""
|
738 |
+
active_tasks = {}
|
739 |
+
for arxiv_id, task in evaluation_tasks.items():
|
740 |
+
if not task.done():
|
741 |
+
active_tasks[arxiv_id] = {
|
742 |
+
"status": "running",
|
743 |
+
"done": task.done(),
|
744 |
+
"cancelled": task.cancelled()
|
745 |
+
}
|
746 |
+
|
747 |
+
return {
|
748 |
+
"active_tasks": active_tasks,
|
749 |
+
"total_active": len(active_tasks),
|
750 |
+
"total_tracked": len(evaluation_tasks)
|
751 |
+
}
|
752 |
+
|
753 |
+
|
754 |
@app.post("/api/cache/clear")
|
755 |
+
async def clear_cache() -> Dict[str, str]:
|
756 |
"""Clear all cached data"""
|
757 |
+
async with db.get_connection() as conn:
|
758 |
+
cursor = await conn.cursor()
|
759 |
+
await cursor.execute('DELETE FROM papers_cache')
|
760 |
+
await conn.commit()
|
761 |
return {"message": "Cache cleared successfully"}
|
762 |
|
763 |
|
|
|
773 |
cards = hf_daily.parse_daily_cards(html)
|
774 |
|
775 |
# Cache the results
|
776 |
+
await db.cache_papers(actual_date, html, cards)
|
777 |
|
778 |
return {
|
779 |
"message": f"Cache refreshed for {actual_date}",
|
|
|
805 |
response.headers["Expires"] = "0"
|
806 |
return response
|
807 |
|
808 |
+
async def main():
|
809 |
# Parse command line arguments
|
810 |
args = parse_args()
|
811 |
|
|
|
818 |
logger.info(f"| Config:\n{config.pretty_text}")
|
819 |
|
820 |
# Initialize the database
|
821 |
+
await db.init_db(config=config)
|
822 |
logger.info(f"| Database initialized at: {config.db_path}")
|
823 |
|
824 |
# Load Frontend
|
|
|
827 |
logger.info(f"| Frontend initialized at: {config.frontend_path}")
|
828 |
|
829 |
# Use port 7860 for Hugging Face Spaces, fallback to 7860 for local development
|
830 |
+
config_uvicorn = uvicorn.Config(app, host="0.0.0.0", port=7860)
|
831 |
+
server = uvicorn.Server(config_uvicorn)
|
832 |
+
await server.serve()
|
833 |
+
|
834 |
+
if __name__ == "__main__":
|
835 |
+
asyncio.run(main())
|
frontend/index.html CHANGED
@@ -48,10 +48,16 @@
     </div>

     <div class="header-center">
-        <div class="
-        <
-
-
+        <div class="search-batch-container">
+            <div class="ai-search-container">
+                <i class="fas fa-sparkles"></i>
+                <input type="text" placeholder="Search any paper with AI..." class="ai-search-input">
+                <i class="fas fa-cube"></i>
+            </div>
+            <button class="batch-evaluate-btn" id="batchEvaluateBtn">
+                <i class="fas fa-rocket"></i>
+                <span>Evaluate All</span>
+            </button>
+        </div>
     </div>
 </div>

frontend/main.js
CHANGED
@@ -416,6 +416,9 @@ class PaperCardRenderer {
|
|
416 |
button.onclick = () => {
|
417 |
window.location.href = `/paper.html?id=${encodeURIComponent(arxivId)}`;
|
418 |
};
|
|
|
|
|
|
|
419 |
} else {
|
420 |
// Paper doesn't have evaluation - show evaluate button
|
421 |
evalIcon.className = 'fas fa-play eval-icon';
|
@@ -433,6 +436,145 @@ class PaperCardRenderer {
|
|
433 |
}
|
434 |
}
|
435 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
436 |
async checkPaperScore(card, arxivId) {
|
437 |
try {
|
438 |
// First check if the card already has score data from the API response
|
@@ -500,17 +642,17 @@ class PaperCardRenderer {
|
|
500 |
}, 100);
|
501 |
}
|
502 |
|
503 |
-
async evaluatePaper(button, arxivId) {
|
504 |
const spinner = button.querySelector('.fa-spinner');
|
505 |
const evalIcon = button.querySelector('.eval-icon');
|
506 |
const evalText = button.querySelector('.eval-text');
|
507 |
const paperTitle = button.getAttribute('data-paper-title');
|
508 |
|
509 |
-
//
|
|
|
510 |
spinner.style.display = 'inline-block';
|
511 |
evalIcon.style.display = 'none';
|
512 |
-
evalText.textContent = '
|
513 |
-
button.className = 'eval-button evaluating-state';
|
514 |
button.disabled = true;
|
515 |
|
516 |
try {
|
@@ -534,23 +676,27 @@ class PaperCardRenderer {
|
|
534 |
});
|
535 |
|
536 |
// Start evaluation
|
537 |
-
const
|
|
|
|
|
|
|
|
|
538 |
method: 'POST'
|
539 |
});
|
540 |
|
541 |
if (response.ok) {
|
542 |
const result = await response.json();
|
543 |
|
544 |
-
if (result.status === 'already_evaluated') {
|
545 |
// Paper was already evaluated, redirect to evaluation page
|
546 |
window.location.href = `/paper.html?id=${encodeURIComponent(arxivId)}`;
|
547 |
} else {
|
548 |
// Evaluation started, show progress and poll for status
|
549 |
-
evalText.textContent = 'Started...';
|
550 |
button.className = 'eval-button started-state';
|
551 |
|
552 |
// Start polling for status
|
553 |
-
this.pollEvaluationStatus(button, arxivId);
|
554 |
}
|
555 |
} else {
|
556 |
throw new Error('Failed to start evaluation');
|
@@ -567,14 +713,15 @@ class PaperCardRenderer {
|
|
567 |
}
|
568 |
}
|
569 |
|
570 |
-
async pollEvaluationStatus(button, arxivId) {
|
571 |
const evalIcon = button.querySelector('.eval-icon');
|
572 |
const evalText = button.querySelector('.eval-text');
|
573 |
let pollCount = 0;
|
574 |
const maxPolls = 60; // Poll for up to 5 minutes (5s intervals)
|
575 |
|
576 |
// Show log message
|
577 |
-
|
|
|
578 |
|
579 |
const poll = async () => {
|
580 |
try {
|
@@ -584,24 +731,31 @@ class PaperCardRenderer {
|
|
584 |
|
585 |
switch (status.status) {
|
586 |
case 'evaluating':
|
587 |
-
evalText.textContent = `Evaluating... (${pollCount * 5}s)`;
|
588 |
evalIcon.className = 'fas fa-spinner fa-spin eval-icon';
|
589 |
button.className = 'eval-button evaluating-state';
|
590 |
-
|
|
|
591 |
break;
|
592 |
|
593 |
case 'completed':
|
594 |
evalIcon.className = 'fas fa-check eval-icon';
|
595 |
-
evalText.textContent = 'Completed';
|
596 |
button.className = 'eval-button evaluation-state';
|
597 |
button.onclick = () => {
|
598 |
window.location.href = `/paper.html?id=${encodeURIComponent(arxivId)}`;
|
599 |
};
|
600 |
-
|
|
|
601 |
|
602 |
// Add score badge after completion
|
603 |
this.checkPaperScore(button.closest('.hf-paper-card'), arxivId);
|
604 |
|
|
|
|
|
|
|
|
|
|
|
605 |
return; // Stop polling
|
606 |
|
607 |
case 'failed':
|
@@ -749,6 +903,19 @@ class PaperIndexApp {
|
|
749 |
e.target.classList.add('active');
|
750 |
});
|
751 |
});
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
752 |
}
|
753 |
|
754 |
async loadDaily(direction = null) {
|
@@ -822,7 +989,75 @@ class PaperIndexApp {
|
|
822 |
}
|
823 |
}
|
824 |
|
825 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
826 |
|
827 |
// Unified notification system
|
828 |
showNotification(options) {
|
|
|
416 |
button.onclick = () => {
|
417 |
window.location.href = `/paper.html?id=${encodeURIComponent(arxivId)}`;
|
418 |
};
|
419 |
+
|
420 |
+
// Add re-evaluate button for already evaluated papers
|
421 |
+
this.addReevaluateButton(card, arxivId);
|
422 |
} else {
|
423 |
// Paper doesn't have evaluation - show evaluate button
|
424 |
evalIcon.className = 'fas fa-play eval-icon';
|
|
|
436 |
}
|
437 |
}
|
438 |
|
439 |
+
addReevaluateButton(card, arxivId) {
|
440 |
+
// Check if re-evaluate button already exists
|
441 |
+
if (card.querySelector('.reevaluate-button')) {
|
442 |
+
return;
|
443 |
+
}
|
444 |
+
|
445 |
+
const cardActions = card.querySelector('.card-actions');
|
446 |
+
if (cardActions) {
|
447 |
+
const reevaluateButton = document.createElement('button');
|
448 |
+
reevaluateButton.className = 'reevaluate-button';
|
449 |
+
reevaluateButton.innerHTML = `
|
450 |
+
<i class="fas fa-redo"></i>
|
451 |
+
<span>Re-evaluate</span>
|
452 |
+
`;
|
453 |
+
reevaluateButton.onclick = () => {
|
454 |
+
this.reevaluatePaper(reevaluateButton, arxivId);
|
455 |
+
};
|
456 |
+
|
457 |
+
cardActions.appendChild(reevaluateButton);
|
458 |
+
}
|
459 |
+
}
|
460 |
+
|
461 |
+
async reevaluatePaper(button, arxivId) {
|
462 |
+
const icon = button.querySelector('i');
|
463 |
+
const text = button.querySelector('span');
|
464 |
+
const originalText = text.textContent;
|
465 |
+
const originalIcon = icon.className;
|
466 |
+
|
467 |
+
// Show loading state
|
468 |
+
icon.className = 'fas fa-spinner fa-spin';
|
469 |
+
text.textContent = 'Re-evaluating...';
|
470 |
+
button.disabled = true;
|
471 |
+
|
472 |
+
// Show log message
|
473 |
+
this.showLogMessage(`Started re-evaluation for paper ${arxivId}`, 'info');
|
474 |
+
|
475 |
+
try {
|
476 |
+
const response = await fetch(`/api/papers/reevaluate/${encodeURIComponent(arxivId)}`, {
|
477 |
+
method: 'POST'
|
478 |
+
});
|
479 |
+
|
480 |
+
if (response.ok) {
|
481 |
+
const result = await response.json();
|
482 |
+
|
483 |
+
if (result.status === 'already_running') {
|
484 |
+
text.textContent = 'Already running';
|
485 |
+
this.showLogMessage(`Re-evaluation already running for paper ${arxivId}`, 'warning');
|
486 |
+
setTimeout(() => {
|
487 |
+
icon.className = originalIcon;
|
488 |
+
text.textContent = originalText;
|
489 |
+
button.disabled = false;
|
490 |
+
}, 2000);
|
491 |
+
} else {
|
492 |
+
// Start polling for status
|
493 |
+
this.pollReevaluationStatus(button, arxivId, originalText, originalIcon);
|
494 |
+
}
|
495 |
+
} else {
|
496 |
+
throw new Error('Failed to start re-evaluation');
|
497 |
+
}
|
498 |
+
} catch (error) {
|
499 |
+
console.error('Error re-evaluating paper:', error);
|
500 |
+
icon.className = 'fas fa-exclamation-triangle';
|
501 |
+
text.textContent = 'Error';
|
502 |
+
this.showLogMessage(`Re-evaluation failed for paper ${arxivId}: ${error.message}`, 'error');
|
503 |
+
setTimeout(() => {
|
504 |
+
icon.className = originalIcon;
|
505 |
+
text.textContent = originalText;
|
506 |
+
button.disabled = false;
|
507 |
+
}, 2000);
|
508 |
+
}
|
509 |
+
}
|
510 |
+
|
511 |
+
async pollReevaluationStatus(button, arxivId, originalText, originalIcon) {
|
512 |
+
const icon = button.querySelector('i');
|
513 |
+
const text = button.querySelector('span');
|
514 |
+
let pollCount = 0;
|
515 |
+
const maxPolls = 60; // Poll for up to 5 minutes (5s intervals)
|
516 |
+
|
517 |
+
const poll = async () => {
|
518 |
+
try {
|
519 |
+
const response = await fetch(`/api/papers/evaluate/${encodeURIComponent(arxivId)}/status`);
|
520 |
+
if (response.ok) {
|
521 |
+
const status = await response.json();
|
522 |
+
|
523 |
+
switch (status.status) {
|
524 |
+
case 'evaluating':
|
525 |
+
text.textContent = `Re-evaluating... (${pollCount * 5}s)`;
|
526 |
+
icon.className = 'fas fa-spinner fa-spin';
|
527 |
+
this.showLogMessage(`Re-evaluating paper ${arxivId}... (${pollCount * 5}s)`, 'info');
|
528 |
+
break;
|
529 |
+
|
530 |
+
case 'completed':
|
531 |
+
icon.className = 'fas fa-check';
|
532 |
+
text.textContent = 'Re-evaluated';
|
533 |
+
button.disabled = false;
|
534 |
+
this.showLogMessage(`Re-evaluation completed for paper ${arxivId}`, 'success');
|
535 |
+
|
536 |
+
// Refresh the page to show updated results
|
537 |
+
setTimeout(() => {
|
538 |
+
window.location.reload();
|
539 |
+
}, 1000);
|
540 |
+
return;
|
541 |
+
|
542 |
+
case 'failed':
|
543 |
+
icon.className = 'fas fa-exclamation-triangle';
|
544 |
+
text.textContent = 'Failed';
|
545 |
+
button.disabled = false;
|
546 |
+
this.showLogMessage(`Re-evaluation failed for paper ${arxivId}`, 'error');
|
547 |
+
return;
|
548 |
+
|
549 |
+
default:
|
550 |
+
text.textContent = `Status: ${status.status}`;
|
551 |
+
}
|
552 |
+
|
553 |
+
pollCount++;
|
554 |
+
if (pollCount < maxPolls) {
|
555 |
+
setTimeout(poll, 5000);
|
556 |
+
} else {
|
557 |
+
icon.className = 'fas fa-clock';
|
558 |
+
text.textContent = 'Timeout';
|
559 |
+
button.disabled = false;
|
560 |
+
this.showLogMessage(`Re-evaluation timeout for paper ${arxivId}`, 'warning');
|
561 |
+
}
|
562 |
+
} else {
|
563 |
+
throw new Error('Failed to get status');
|
564 |
+
}
|
565 |
+
} catch (error) {
|
566 |
+
console.error('Error polling re-evaluation status:', error);
|
567 |
+
icon.className = 'fas fa-exclamation-triangle';
|
568 |
+
text.textContent = 'Error';
|
569 |
+
button.disabled = false;
|
570 |
+
}
|
571 |
+
};
|
572 |
+
|
573 |
+
poll();
|
574 |
+
}
|
575 |
+
|
576 |
+
|
577 |
+
|
578 |
async checkPaperScore(card, arxivId) {
|
579 |
try {
|
580 |
// First check if the card already has score data from the API response
|
|
|
642 |
}, 100);
|
643 |
}
|
644 |
|
645 |
+
async evaluatePaper(button, arxivId, isReevaluate = false) {
|
646 |
const spinner = button.querySelector('.fa-spinner');
|
647 |
const evalIcon = button.querySelector('.eval-icon');
|
648 |
const evalText = button.querySelector('.eval-text');
|
649 |
const paperTitle = button.getAttribute('data-paper-title');
|
650 |
|
651 |
+
// Clear any existing state classes and show loading state
|
652 |
+
button.className = 'eval-button started-state';
|
653 |
spinner.style.display = 'inline-block';
|
654 |
evalIcon.style.display = 'none';
|
655 |
+
evalText.textContent = isReevaluate ? 'Re-starting...' : 'Starting...';
|
|
|
656 |
button.disabled = true;
|
657 |
|
658 |
try {
|
|
|
676 |
});
|
677 |
|
678 |
// Start evaluation
|
679 |
+
const url = isReevaluate ?
|
680 |
+
`/api/papers/reevaluate/${encodeURIComponent(arxivId)}` :
|
681 |
+
`/api/papers/evaluate/${encodeURIComponent(arxivId)}`;
|
682 |
+
|
683 |
+
const response = await fetch(url, {
|
684 |
method: 'POST'
|
685 |
});
|
686 |
|
687 |
if (response.ok) {
|
688 |
const result = await response.json();
|
689 |
|
690 |
+
if (result.status === 'already_evaluated' && !isReevaluate) {
|
691 |
// Paper was already evaluated, redirect to evaluation page
|
692 |
window.location.href = `/paper.html?id=${encodeURIComponent(arxivId)}`;
|
693 |
} else {
|
694 |
// Evaluation started, show progress and poll for status
|
695 |
+
evalText.textContent = isReevaluate ? 'Re-started...' : 'Started...';
|
696 |
button.className = 'eval-button started-state';
|
697 |
|
698 |
// Start polling for status
|
699 |
+
this.pollEvaluationStatus(button, arxivId, isReevaluate);
|
700 |
}
|
701 |
} else {
|
702 |
throw new Error('Failed to start evaluation');
|
|
|
713 |
}
|
714 |
}
|
715 |
|
716 |
+
async pollEvaluationStatus(button, arxivId, isReevaluate = false) {
|
717 |
const evalIcon = button.querySelector('.eval-icon');
|
718 |
const evalText = button.querySelector('.eval-text');
|
719 |
let pollCount = 0;
|
720 |
const maxPolls = 60; // Poll for up to 5 minutes (5s intervals)
|
721 |
|
722 |
// Show log message
|
723 |
+
const action = isReevaluate ? 're-evaluation' : 'evaluation';
|
724 |
+
this.showLogMessage(`Started ${action} for paper ${arxivId}`, 'info');
|
725 |
|
726 |
const poll = async () => {
|
727 |
try {
|
|
|
731 |
|
732 |
switch (status.status) {
|
733 |
case 'evaluating':
|
734 |
+
evalText.textContent = isReevaluate ? `Re-evaluating... (${pollCount * 5}s)` : `Evaluating... (${pollCount * 5}s)`;
|
735 |
evalIcon.className = 'fas fa-spinner fa-spin eval-icon';
|
736 |
button.className = 'eval-button evaluating-state';
|
737 |
+
const evaluatingAction = isReevaluate ? 'Re-evaluating' : 'Evaluating';
|
738 |
+
this.showLogMessage(`${evaluatingAction} paper ${arxivId}... (${pollCount * 5}s)`, 'info');
|
739 |
break;
|
740 |
|
741 |
case 'completed':
|
742 |
evalIcon.className = 'fas fa-check eval-icon';
|
743 |
+
evalText.textContent = isReevaluate ? 'Re-evaluated' : 'Completed';
|
744 |
button.className = 'eval-button evaluation-state';
|
745 |
button.onclick = () => {
|
746 |
window.location.href = `/paper.html?id=${encodeURIComponent(arxivId)}`;
|
747 |
};
|
748 |
+
const completedAction = isReevaluate ? 'Re-evaluation' : 'Evaluation';
|
749 |
+
this.showLogMessage(`${completedAction} completed for paper ${arxivId}`, 'success');
|
750 |
|
751 |
// Add score badge after completion
|
752 |
this.checkPaperScore(button.closest('.hf-paper-card'), arxivId);
|
753 |
|
754 |
+
// Add re-evaluate button if not already re-evaluating
|
755 |
+
if (!isReevaluate) {
|
756 |
+
this.addReevaluateButton(button.closest('.hf-paper-card'), arxivId);
|
757 |
+
}
|
758 |
+
|
759 |
return; // Stop polling
|
760 |
|
761 |
case 'failed':
|
|
|
903 |
e.target.classList.add('active');
|
904 |
});
|
905 |
});
|
906 |
+
|
907 |
+
// Batch evaluate button
|
908 |
+
const batchEvaluateBtn = document.getElementById('batchEvaluateBtn');
|
909 |
+
console.log('Looking for batchEvaluateBtn:', batchEvaluateBtn);
|
910 |
+
if (batchEvaluateBtn) {
|
911 |
+
console.log('Adding click listener to batchEvaluateBtn');
|
912 |
+
batchEvaluateBtn.addEventListener('click', () => {
|
913 |
+
console.log('Batch evaluate button clicked');
|
914 |
+
this.startBatchEvaluation();
|
915 |
+
});
|
916 |
+
} else {
|
917 |
+
console.error('batchEvaluateBtn not found during initialization');
|
918 |
+
}
|
919 |
}
|
920 |
|
921 |
async loadDaily(direction = null) {
|
|
|
989 |
}
|
990 |
}
|
991 |
|
992 |
+
async startBatchEvaluation() {
|
993 |
+
console.log('startBatchEvaluation called');
|
994 |
+
|
995 |
+
const button = document.getElementById('batchEvaluateBtn');
|
996 |
+
if (!button) {
|
997 |
+
console.error('batchEvaluateBtn not found');
|
998 |
+
return;
|
999 |
+
}
|
1000 |
+
|
1001 |
+
console.log('Found batchEvaluateBtn:', button);
|
1002 |
+
|
1003 |
+
// Disable button and show loading state
|
1004 |
+
button.disabled = true;
|
1005 |
+
const originalContent = button.innerHTML;
|
1006 |
+
button.innerHTML = '<i class="fas fa-spinner fa-spin"></i><span>Starting...</span>';
|
1007 |
+
|
1008 |
+
try {
|
1009 |
+
// Find all unevaluated evaluate buttons
|
1010 |
+
const unevaluatedButtons = document.querySelectorAll('.eval-button');
|
1011 |
+
+            console.log('Found eval buttons:', unevaluatedButtons.length);
+
+            const buttonsToClick = [];
+
+            unevaluatedButtons.forEach((evalButton, index) => {
+                const evalText = evalButton.querySelector('.eval-text');
+                console.log(`Button ${index}:`, evalText ? evalText.textContent : 'no text');
+                if (evalText && (evalText.textContent === 'Evaluate' || evalText.textContent === 'Check')) {
+                    buttonsToClick.push(evalButton);
+                }
+            });
+
+            console.log('Buttons to click:', buttonsToClick.length);
+
+            if (buttonsToClick.length === 0) {
+                console.log('No buttons to click');
+                this.cardRenderer.showLogMessage('All papers have already been evaluated.', 'info');
+                return;
+            }
+
+            this.cardRenderer.showLogMessage(`Starting batch evaluation of ${buttonsToClick.length} papers...`, 'info');
+
+            // Click each evaluate button with delay
+            for (let i = 0; i < buttonsToClick.length; i++) {
+                const evalButton = buttonsToClick[i];
+
+                // Update button text to show progress
+                button.innerHTML = `<i class="fas fa-spinner fa-spin"></i><span>Starting ${i + 1} of ${buttonsToClick.length}</span>`;
+
+                console.log(`Clicking button ${i + 1}:`, evalButton);
+                // Simulate click on the evaluate button
+                evalButton.click();
+
+                // Add delay between clicks to avoid API overload
+                await new Promise(resolve => setTimeout(resolve, 1000));
+            }
+
+            this.cardRenderer.showLogMessage(`Started evaluation for ${buttonsToClick.length} papers. They will complete in the background.`, 'success');
+
+        } catch (error) {
+            console.error('Batch evaluation error:', error);
+            this.cardRenderer.showLogMessage(`Batch evaluation failed: ${error.message}`, 'error');
+        } finally {
+            // Restore button state
+            button.disabled = false;
+            button.innerHTML = originalContent;
+        }
+    }
+
+
 
     // Unified notification system
     showNotification(options) {
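The batch-evaluation handler above starts each paper's evaluation and sleeps one second between clicks so the API is not hit with a burst, while the evaluations themselves complete in the background. The same staggered-start pattern can be sketched in Python with asyncio; `evaluate` and `batch_evaluate` here are hypothetical stand-ins, not functions from this repository:

```python
import asyncio

async def evaluate(paper_id: str) -> str:
    # Hypothetical stand-in for a long-running evaluation request.
    await asyncio.sleep(0.01)  # simulates API latency
    return f"{paper_id}: started"

async def batch_evaluate(paper_ids, stagger: float = 0.005):
    # Launch each evaluation as a background task, spacing out the
    # starts (like the 1-second delay between button clicks), then
    # wait for all of them to finish.
    tasks = []
    for pid in paper_ids:
        tasks.append(asyncio.create_task(evaluate(pid)))
        await asyncio.sleep(stagger)
    return await asyncio.gather(*tasks)

results = asyncio.run(batch_evaluate(["2401.00001", "2401.00002", "2401.00003"]))
print(results)
```

`asyncio.gather` preserves the submission order, so results line up with `paper_ids` even though the tasks overlap in time.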
frontend/paper.js
CHANGED
@@ -252,7 +252,24 @@ class PaperEvaluationRenderer {
             </section>
         `;
 
-        contentEl.innerHTML = execSummary +
+        // Add action buttons at the top
+        const actionButtons = `
+            <section class="evaluation-section">
+                <div class="section-header">
+                    <div style="display: flex; justify-content: space-between; align-items: center;">
+                        <h2><i class="fas fa-chart-line"></i> Evaluation Actions</h2>
+                        <div class="action-buttons">
+                            <a href="/" class="action-btn primary">
+                                <i class="fas fa-arrow-left"></i>
+                                Back to Daily Papers
+                            </a>
+                        </div>
+                    </div>
+                </div>
+            </section>
+        `;
+
+        contentEl.innerHTML = actionButtons + execSummary +
         `<section class="evaluation-section">
             <div class="section-header">
                 <h2><i class="fas fa-chart-bar"></i> Detailed Dimensional Analysis</h2>
@@ -524,9 +541,12 @@ class PaperEvaluationApp {
 class PaperEvaluationApp {
     constructor() {
         this.renderer = new PaperEvaluationRenderer();
+        this.paperId = getParam('id');
        this.init();
     }
 
+
+
     async init() {
         const id = getParam('id');
         console.log('PaperEvaluationApp init with ID:', id);
@@ -592,7 +612,7 @@ class PaperEvaluationApp {
 
 // Initialize the application when DOM is loaded
 document.addEventListener('DOMContentLoaded', () => {
-    new PaperEvaluationApp();
+    window.paperApp = new PaperEvaluationApp();
 });
frontend/styles.css
CHANGED
@@ -188,7 +188,7 @@ body {
     margin: 0 auto;
     padding: 0 24px;
     display: grid;
-    grid-template-columns: 1fr
+    grid-template-columns: 1fr 1fr 1fr;
     gap: 32px;
     align-items: center;
 }
@@ -205,9 +205,18 @@ body {
     font-size: 16px;
 }
 
+.search-batch-container {
+    display: flex;
+    align-items: center;
+    gap: 16px;
+    width: 100%;
+    justify-content: center;
+}
+
 .ai-search-container {
     position: relative;
-
+    flex: 1;
+    max-width: 800px;
 }
 
 .ai-search-input {
@@ -245,6 +254,41 @@ body {
     font-size: 16px;
 }
 
+.batch-evaluate-btn {
+    display: flex;
+    align-items: center;
+    gap: 8px;
+    padding: 12px 20px;
+    background: linear-gradient(135deg, var(--accent-primary), var(--accent-secondary));
+    color: white;
+    border: none;
+    border-radius: 12px;
+    font-size: 14px;
+    font-weight: 600;
+    cursor: pointer;
+    transition: all 0.2s ease;
+    box-shadow: 0 2px 8px rgba(59, 130, 246, 0.3);
+}
+
+.batch-evaluate-btn:hover {
+    transform: translateY(-1px);
+    box-shadow: 0 4px 12px rgba(59, 130, 246, 0.4);
+}
+
+.batch-evaluate-btn:active {
+    transform: translateY(0);
+}
+
+.batch-evaluate-btn:disabled {
+    opacity: 0.6;
+    cursor: not-allowed;
+    transform: none;
+}
+
+.batch-evaluate-btn i {
+    font-size: 16px;
+}
+
 .header-right {
     display: flex;
     flex-direction: column;
@@ -737,6 +781,7 @@ body {
     border-radius: 50%;
     transform: translateY(-50%);
     animation: spin 1s linear infinite;
+    z-index: 1;
 }
 
 @keyframes spin {
@@ -762,6 +807,7 @@ body {
     border-radius: 50%;
     transform: translateY(-50%);
     animation: pulse 1.5s ease-in-out infinite;
+    z-index: 1;
 }
 
 @keyframes pulse {
@@ -823,11 +869,113 @@ body {
     border-color: var(--text-muted);
 }
 
+/* Re-evaluate button */
+.reevaluate-button {
+    display: inline-flex;
+    align-items: center;
+    gap: 6px;
+    padding: 8px 16px;
+    border: 1px solid var(--accent-secondary);
+    border-radius: 8px;
+    background-color: var(--bg-secondary);
+    color: var(--accent-secondary);
+    font-size: 12px;
+    font-weight: 500;
+    text-decoration: none;
+    cursor: pointer;
+    transition: all 0.2s ease;
+    min-width: 100px;
+    justify-content: center;
+    margin-left: 8px;
+}
+
+.reevaluate-button:hover {
+    background-color: var(--accent-secondary);
+    color: white;
+    border-color: var(--accent-secondary);
+}
+
+.reevaluate-button:disabled {
+    opacity: 0.6;
+    cursor: not-allowed;
+}
+
+.reevaluate-button i {
+    font-size: 12px;
+}
+
+/* Action buttons for paper detail page */
+.action-buttons {
+    display: flex;
+    gap: 12px;
+    align-items: center;
+}
+
+.action-btn {
+    display: inline-flex;
+    align-items: center;
+    gap: 8px;
+    padding: 10px 16px;
+    border: 1px solid var(--border-medium);
+    border-radius: 8px;
+    background-color: var(--bg-secondary);
+    color: var(--text-secondary);
+    font-size: 14px;
+    font-weight: 500;
+    text-decoration: none;
+    cursor: pointer;
+    transition: all 0.2s ease;
+}
+
+.action-btn:hover {
+    background-color: var(--bg-tertiary);
+    color: var(--text-primary);
+    border-color: var(--border-medium);
+}
+
+.action-btn.primary {
+    background-color: var(--accent-primary);
+    color: white;
+    border-color: var(--accent-primary);
+}
+
+.action-btn.primary:hover {
+    background-color: var(--accent-primary);
+    opacity: 0.9;
+}
+
+.action-btn.secondary {
+    background-color: var(--accent-secondary);
+    color: white;
+    border-color: var(--accent-secondary);
+}
+
+.action-btn.secondary:hover {
+    background-color: var(--accent-secondary);
+    opacity: 0.9;
+}
+
+.action-btn:disabled {
+    opacity: 0.6;
+    cursor: not-allowed;
+}
+
 /* Spinner animation */
 .eval-button .fa-spinner {
     animation: spin 1s linear infinite;
 }
 
+/* Ensure only one ::after pseudo-element is visible at a time */
+.eval-button::after {
+    content: none;
+}
+
+.eval-button.evaluating-state::after,
+.eval-button.started-state::after,
+.eval-button.processing-state::after {
+    content: '';
+}
+
 @keyframes spin {
     from { transform: rotate(0deg); }
     to { transform: rotate(360deg); }
requirements.txt
CHANGED
@@ -9,4 +9,5 @@ httpx>=0.27.0
 beautifulsoup4>=4.12.3
 lxml>=5.2.2
 mmengine>=0.10.7
+aiosqlite>=0.20.0
 
src/agents/evaluator.py
CHANGED
@@ -9,7 +9,7 @@ from typing import Any, Dict, List, Optional
 from pathlib import Path
 from datetime import datetime
 
-from anthropic import Anthropic
+from anthropic import AsyncAnthropic
 from anthropic.types import ToolUseBlock
 from langgraph.graph import END, StateGraph
 from pydantic import BaseModel, Field
@@ -59,7 +59,7 @@ class Evaluator:
         api_key = api_key or os.getenv("ANTHROPIC_API_KEY")
         if not api_key:
             raise ValueError("Anthropic API key is required. Please set HF_SECRET_ANTHROPIC_API_KEY in Hugging Face Spaces secrets or ANTHROPIC_API_KEY environment variable.")
-        self.client = Anthropic(api_key=api_key)
+        self.client = AsyncAnthropic(api_key=api_key)
         self.system_prompt = REVIEWER_SYSTEM_PROMPT
         self.eval_template = EVALUATION_PROMPT_TEMPLATE
 
@@ -91,8 +91,8 @@ class Evaluator:
         })
 
         try:
-            # Call Anthropic API with tools
-            response = self.client.messages.create(
+            # Call Anthropic API with tools (async)
+            response = await self.client.messages.create(
                 model=config.model_id,
                 max_tokens=4000,
                 system=self.system_prompt,
@@ -210,7 +210,7 @@ async def save_node(state: ConversationState) -> ConversationState:
         logger.warning(f"Warning: Could not parse evaluation_content as JSON: {e}")
 
     # Save to database
-    db.update_paper_evaluation(
+    await db.update_paper_evaluation(
         arxiv_id=state.arxiv_id,
         evaluation_content=evaluation_content,
        evaluation_score=evaluation_score,
src/database/db.py
CHANGED
@@ -1,9 +1,9 @@
 import os
 import json
-import sqlite3
+import aiosqlite
 from datetime import date, datetime, timedelta
 from typing import Any, Dict, List, Optional
-from contextlib import contextmanager
+from contextlib import asynccontextmanager
 
 
 class PapersDatabase():
@@ -11,16 +11,16 @@ class PapersDatabase():
         super().__init__(**kwargs)
         self.db_path = None
 
-    def init_db(self, config):
+    async def init_db(self, config):
         """Initialize the database with required tables"""
 
         self.db_path = config.db_path
 
-        with self.get_connection() as conn:
-            cursor = conn.cursor()
+        async with self.get_connection() as conn:
+            cursor = await conn.cursor()
 
             # Create papers cache table
-            cursor.execute('''
+            await cursor.execute('''
                 CREATE TABLE IF NOT EXISTS papers_cache (
                     date_str TEXT PRIMARY KEY,
                     html_content TEXT NOT NULL,
@@ -31,7 +31,7 @@ class PapersDatabase():
             ''')
 
             # Create papers table for individual arXiv papers
-            cursor.execute('''
+            await cursor.execute('''
                 CREATE TABLE IF NOT EXISTS papers (
                     arxiv_id TEXT PRIMARY KEY,
                     title TEXT NOT NULL,
@@ -52,7 +52,7 @@ class PapersDatabase():
             ''')
 
             # Create latest_date table to track the most recent available date
-            cursor.execute('''
+            await cursor.execute('''
                 CREATE TABLE IF NOT EXISTS latest_date (
                     id INTEGER PRIMARY KEY CHECK (id = 1),
                     date_str TEXT NOT NULL,
@@ -61,34 +61,39 @@ class PapersDatabase():
             ''')
 
             # Insert default latest_date record if it doesn't exist
-            cursor.execute('''
+            await cursor.execute('''
                 INSERT OR IGNORE INTO latest_date (id, date_str)
                 VALUES (1, ?)
             ''', (date.today().isoformat(),))
 
-            conn.commit()
+            await conn.commit()
 
-    @contextmanager
-    def get_connection(self):
+    @asynccontextmanager
+    async def get_connection(self):
         """Context manager for database connections"""
-        conn = sqlite3.connect(self.db_path)
-        conn.row_factory = sqlite3.Row
+        conn = await aiosqlite.connect(self.db_path)
+        conn.row_factory = aiosqlite.Row  # Enable dict-like access
+        # Enable WAL mode for better concurrency
+        await conn.execute("PRAGMA journal_mode=WAL")
+        await conn.execute("PRAGMA synchronous=NORMAL")
+        await conn.execute("PRAGMA cache_size=10000")
+        await conn.execute("PRAGMA temp_store=MEMORY")
         try:
             yield conn
         finally:
-            conn.close()
+            await conn.close()
 
-    def get_cached_papers(self, date_str: str) -> Optional[Dict[str, Any]]:
+    async def get_cached_papers(self, date_str: str) -> Optional[Dict[str, Any]]:
         """Get cached papers for a specific date"""
-        with self.get_connection() as conn:
-            cursor = conn.cursor()
-            cursor.execute('''
+        async with self.get_connection() as conn:
+            cursor = await conn.cursor()
+            await cursor.execute('''
                 SELECT parsed_cards, created_at
                 FROM papers_cache
                 WHERE date_str = ?
            ''', (date_str,))
 
-            row = cursor.fetchone()
+            row = await cursor.fetchone()
            if row:
                return {
                    'cards': json.loads(row['parsed_cards']),
@@ -96,47 +101,47 @@ class PapersDatabase():
                }
        return None
 
-    def cache_papers(self, date_str: str, html_content: str, parsed_cards: List[Dict[str, Any]]):
+    async def cache_papers(self, date_str: str, html_content: str, parsed_cards: List[Dict[str, Any]]):
         """Cache papers for a specific date"""
-        with self.get_connection() as conn:
-            cursor = conn.cursor()
-            cursor.execute('''
+        async with self.get_connection() as conn:
+            cursor = await conn.cursor()
+            await cursor.execute('''
                 INSERT OR REPLACE INTO papers_cache
                 (date_str, html_content, parsed_cards, updated_at)
                 VALUES (?, ?, ?, CURRENT_TIMESTAMP)
             ''', (date_str, html_content, json.dumps(parsed_cards)))
-            conn.commit()
+            await conn.commit()
 
-    def get_latest_cached_date(self) -> Optional[str]:
+    async def get_latest_cached_date(self) -> Optional[str]:
         """Get the latest cached date"""
-        with self.get_connection() as conn:
-            cursor = conn.cursor()
-            cursor.execute('SELECT date_str FROM latest_date WHERE id = 1')
-            row = cursor.fetchone()
+        async with self.get_connection() as conn:
+            cursor = await conn.cursor()
+            await cursor.execute('SELECT date_str FROM latest_date WHERE id = 1')
+            row = await cursor.fetchone()
            return row['date_str'] if row else None
 
-    def update_latest_date(self, date_str: str):
+    async def update_latest_date(self, date_str: str):
         """Update the latest available date"""
-        with self.get_connection() as conn:
-            cursor = conn.cursor()
-            cursor.execute('''
+        async with self.get_connection() as conn:
+            cursor = await conn.cursor()
+            await cursor.execute('''
                 UPDATE latest_date
                 SET date_str = ?, updated_at = CURRENT_TIMESTAMP
                 WHERE id = 1
             ''', (date_str,))
-            conn.commit()
+            await conn.commit()
 
-    def is_cache_fresh(self, date_str: str, max_age_hours: int = 24) -> bool:
+    async def is_cache_fresh(self, date_str: str, max_age_hours: int = 24) -> bool:
         """Check if cache is fresh (within max_age_hours)"""
-        with self.get_connection() as conn:
-            cursor = conn.cursor()
-            cursor.execute('''
+        async with self.get_connection() as conn:
+            cursor = await conn.cursor()
+            await cursor.execute('''
                 SELECT updated_at
                 FROM papers_cache
                 WHERE date_str = ?
             ''', (date_str,))
 
-            row = cursor.fetchone()
+            row = await cursor.fetchone()
            if not row:
                return False
 
@@ -144,64 +149,65 @@ class PapersDatabase():
            age = datetime.now(cached_time.tzinfo) - cached_time
            return age.total_seconds() < max_age_hours * 3600
 
-    def cleanup_old_cache(self, days_to_keep: int = 7):
+    async def cleanup_old_cache(self, days_to_keep: int = 7):
         """Clean up old cache entries"""
         cutoff_date = (datetime.now() - timedelta(days=days_to_keep)).isoformat()
-        with self.get_connection() as conn:
-            cursor = conn.cursor()
-            cursor.execute('''
+        async with self.get_connection() as conn:
+            cursor = await conn.cursor()
+            await cursor.execute('''
                 DELETE FROM papers_cache
                 WHERE updated_at < ?
             ''', (cutoff_date,))
-            conn.commit()
+            await conn.commit()
 
     # Papers table methods
-    def insert_paper(self, arxiv_id: str, title: str, authors: str, abstract: str = None,
+    async def insert_paper(self, arxiv_id: str, title: str, authors: str, abstract: str = None,
                      categories: str = None, published_date: str = None):
         """Insert a new paper into the papers table"""
-        with self.get_connection() as conn:
-            cursor = conn.cursor()
-            cursor.execute('''
+        async with self.get_connection() as conn:
+            cursor = await conn.cursor()
+            await cursor.execute('''
                 INSERT OR REPLACE INTO papers
                 (arxiv_id, title, authors, abstract, categories, published_date, updated_at)
                 VALUES (?, ?, ?, ?, ?, ?, CURRENT_TIMESTAMP)
             ''', (arxiv_id, title, authors, abstract, categories, published_date))
-            conn.commit()
+            await conn.commit()
 
-    def get_paper(self, arxiv_id: str) -> Optional[Dict[str, Any]]:
+    async def get_paper(self, arxiv_id: str) -> Optional[Dict[str, Any]]:
         """Get a paper by arxiv_id"""
-        with self.get_connection() as conn:
-            cursor = conn.cursor()
-            cursor.execute('''
+        async with self.get_connection() as conn:
+            cursor = await conn.cursor()
+            await cursor.execute('''
                 SELECT * FROM papers WHERE arxiv_id = ?
             ''', (arxiv_id,))
 
-            row = cursor.fetchone()
+            row = await cursor.fetchone()
            if row:
                return dict(row)
        return None
 
-    def get_papers_by_evaluation_status(self, is_evaluated: bool = None) -> List[Dict[str, Any]]:
+    async def get_papers_by_evaluation_status(self, is_evaluated: bool = None) -> List[Dict[str, Any]]:
         """Get papers by evaluation status"""
-        with self.get_connection() as conn:
-            cursor = conn.cursor()
+        async with self.get_connection() as conn:
+            cursor = await conn.cursor()
            if is_evaluated is None:
-                cursor.execute('SELECT * FROM papers ORDER BY created_at DESC')
+                await cursor.execute('SELECT * FROM papers ORDER BY created_at DESC')
            else:
-                cursor.execute('''
+                await cursor.execute('''
                     SELECT * FROM papers
                     WHERE is_evaluated = ?
                     ORDER BY created_at DESC
                 ''', (is_evaluated,))
 
-
+            rows = await cursor.fetchall()
+            return [dict(row) for row in rows]
 
-    def update_paper_evaluation(self, arxiv_id: str, evaluation_content: str,
+    async def update_paper_evaluation(self, arxiv_id: str, evaluation_content: str,
                                 evaluation_score: float = None, overall_score: float = None, evaluation_tags: str = None):
         """Update paper with evaluation content"""
-        with self.get_connection() as conn:
-            cursor = conn.cursor()
-            cursor.execute('''
+        async with self.get_connection() as conn:
+            cursor = await conn.cursor()
+            await cursor.execute('''
                 UPDATE papers
                 SET evaluation_content = ?,
                     evaluation_score = ?,
@@ -213,57 +219,60 @@ class PapersDatabase():
                    updated_at = CURRENT_TIMESTAMP
                WHERE arxiv_id = ?
            ''', (evaluation_content, evaluation_score, overall_score, evaluation_tags, arxiv_id))
-            conn.commit()
+            await conn.commit()
 
-    def update_paper_status(self, arxiv_id: str, status: str):
+    async def update_paper_status(self, arxiv_id: str, status: str):
         """Update paper evaluation status"""
-        with self.get_connection() as conn:
-            cursor = conn.cursor()
-            cursor.execute('''
+        async with self.get_connection() as conn:
+            cursor = await conn.cursor()
+            await cursor.execute('''
                 UPDATE papers
                 SET evaluation_status = ?,
                     updated_at = CURRENT_TIMESTAMP
                 WHERE arxiv_id = ?
             ''', (status, arxiv_id))
-            conn.commit()
+            await conn.commit()
 
-    def get_unevaluated_papers(self) -> List[Dict[str, Any]]:
+    async def get_unevaluated_papers(self) -> List[Dict[str, Any]]:
         """Get all papers that haven't been evaluated yet"""
-        return self.get_papers_by_evaluation_status(is_evaluated=False)
+        return await self.get_papers_by_evaluation_status(is_evaluated=False)
 
-    def get_evaluated_papers(self) -> List[Dict[str, Any]]:
+    async def get_evaluated_papers(self) -> List[Dict[str, Any]]:
         """Get all papers that have been evaluated"""
-        return self.get_papers_by_evaluation_status(is_evaluated=True)
+        return await self.get_papers_by_evaluation_status(is_evaluated=True)
 
-    def search_papers(self, query: str) -> List[Dict[str, Any]]:
+    async def search_papers(self, query: str) -> List[Dict[str, Any]]:
         """Search papers by title, authors, or abstract"""
-        with self.get_connection() as conn:
-            cursor = conn.cursor()
+        async with self.get_connection() as conn:
+            cursor = await conn.cursor()
            search_pattern = f'%{query}%'
-            cursor.execute('''
+            await cursor.execute('''
                 SELECT * FROM papers
                 WHERE title LIKE ? OR authors LIKE ? OR abstract LIKE ?
                 ORDER BY created_at DESC
             ''', (search_pattern, search_pattern, search_pattern))
 
-
+            rows = await cursor.fetchall()
+            return [dict(row) for row in rows]
 
-    def delete_paper(self, arxiv_id: str):
+    async def delete_paper(self, arxiv_id: str):
         """Delete a paper from the database"""
-        with self.get_connection() as conn:
-            cursor = conn.cursor()
-            cursor.execute('DELETE FROM papers WHERE arxiv_id = ?', (arxiv_id,))
-            conn.commit()
+        async with self.get_connection() as conn:
+            cursor = await conn.cursor()
+            await cursor.execute('DELETE FROM papers WHERE arxiv_id = ?', (arxiv_id,))
+            await conn.commit()
 
-    def get_papers_count(self) -> Dict[str, int]:
+    async def get_papers_count(self) -> Dict[str, int]:
         """Get count of papers by evaluation status"""
-        with self.get_connection() as conn:
-            cursor = conn.cursor()
-            cursor.execute('SELECT COUNT(*) as total FROM papers')
-
+        async with self.get_connection() as conn:
+            cursor = await conn.cursor()
+            await cursor.execute('SELECT COUNT(*) as total FROM papers')
+            total_row = await cursor.fetchone()
+            total = total_row['total']
 
-            cursor.execute('SELECT COUNT(*) as evaluated FROM papers WHERE is_evaluated = TRUE')
-
+            await cursor.execute('SELECT COUNT(*) as evaluated FROM papers WHERE is_evaluated = TRUE')
+            evaluated_row = await cursor.fetchone()
+            evaluated = evaluated_row['evaluated']
 
            return {
                'total': total,
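The db.py change above swaps `sqlite3` + `contextmanager` for `aiosqlite` + `asynccontextmanager` and enables WAL mode in `get_connection`. The same connection-manager shape can be sketched with only the standard library (plain `sqlite3` wrapped in an async context manager; with `aiosqlite` each of these calls would additionally be awaited). The table schema and helper names below are simplified illustrations, not the repository's actual `PapersDatabase` API:

```python
import asyncio
import os
import sqlite3
import tempfile
from contextlib import asynccontextmanager

DB_PATH = os.path.join(tempfile.mkdtemp(), "papers.db")

@asynccontextmanager
async def get_connection():
    # Same shape as PapersDatabase.get_connection in the diff.
    conn = sqlite3.connect(DB_PATH)
    conn.row_factory = sqlite3.Row            # dict-like row access
    conn.execute("PRAGMA journal_mode=WAL")   # better read/write concurrency
    try:
        yield conn
    finally:
        conn.close()

async def insert_paper(arxiv_id: str, title: str):
    async with get_connection() as conn:
        conn.execute("CREATE TABLE IF NOT EXISTS papers "
                     "(arxiv_id TEXT PRIMARY KEY, title TEXT)")
        conn.execute("INSERT OR REPLACE INTO papers (arxiv_id, title) "
                     "VALUES (?, ?)", (arxiv_id, title))
        conn.commit()

async def get_paper(arxiv_id: str):
    async with get_connection() as conn:
        cur = conn.execute("SELECT * FROM papers WHERE arxiv_id = ?", (arxiv_id,))
        row = cur.fetchone()
        return dict(row) if row else None

async def main():
    await insert_paper("2401.00001", "Test Paper")
    return await get_paper("2401.00001")

paper = asyncio.run(main())
print(paper)
```

`INSERT OR REPLACE` makes the insert idempotent on `arxiv_id`, which is why re-crawling the same paper simply refreshes its row rather than failing on the primary-key constraint.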
debug_comparison.py → test/debug_comparison.py
RENAMED
File without changes
test/test_async_db.py
ADDED
@@ -0,0 +1,138 @@
#!/usr/bin/env python3
"""
Test script for async database operations
"""

import asyncio
import argparse
import os
import sys
from pathlib import Path
from mmengine.config import DictAction

# Add the project root to the path
root = str(Path(__file__).resolve().parents[1])
sys.path.append(root)

from src.database import db
from src.config import config
from src.logger import logger

def parse_args():
    parser = argparse.ArgumentParser(description='main')
    parser.add_argument("--config", default=os.path.join(root, "configs", "paper_agent.py"), help="config file path")

    parser.add_argument(
        '--cfg-options',
        nargs='+',
        action=DictAction,
        help='override some settings in the used config, the key-value pair '
             'in xxx=yyy format will be merged into config file. If the value to '
             'be overwritten is a list, it should be like key="[a,b]" or key=a,b '
             'It also allows nested list/tuple values, e.g. key="[(a,b),(c,d)]" '
             'Note that the quotation marks are necessary and that no white space '
             'is allowed.')
    args = parser.parse_args()
    return args


async def test_async_database():
    """Test async database operations"""
    print("🧪 Testing Async Database Operations")

    try:
        # Initialize database
        await db.init_db(config=config)
        print("✅ Database initialized successfully")

        # Test inserting a paper
        test_arxiv_id = "2401.00001"
        await db.insert_paper(
            arxiv_id=test_arxiv_id,
            title="Test Async Paper",
            authors="Test Author",
            abstract="This is a test paper for async database operations.",
            categories="cs.AI",
            published_date="2024-01-01"
        )
        print("✅ Paper inserted successfully")

        # Test getting the paper
        paper = await db.get_paper(test_arxiv_id)
        if paper:
            print(f"✅ Paper retrieved: {paper['title']}")
        else:
            print("❌ Paper not found")
            return False

        # Test updating paper evaluation
        await db.update_paper_evaluation(
            arxiv_id=test_arxiv_id,
            evaluation_content="Test evaluation content",
            evaluation_score=3.5,
            overall_score=3.2,
            evaluation_tags="test_tag"
        )
        print("✅ Paper evaluation updated successfully")

        # Test getting evaluated papers
        evaluated_papers = await db.get_evaluated_papers()
        print(f"✅ Found {len(evaluated_papers)} evaluated papers")

        # Test getting paper count
        count = await db.get_papers_count()
        print(f"✅ Paper count: {count}")

        # Test searching papers
        search_results = await db.search_papers("Test")
        print(f"✅ Search results: {len(search_results)} papers found")

        # Test cache operations
        await db.cache_papers("2024-01-01", "<html>test</html>", [{"test": "data"}])
        print("✅ Cache operation successful")

        cached_data = await db.get_cached_papers("2024-01-01")
        if cached_data:
            print("✅ Cache retrieval successful")
        else:
            print("❌ Cache retrieval failed")

        # Test cache freshness
        is_fresh = await db.is_cache_fresh("2024-01-01")
        print(f"✅ Cache freshness check: {is_fresh}")

        print("\n🎉 All async database tests passed!")
        return True

    except Exception as e:
        print(f"❌ Error during async database test: {str(e)}")
        import traceback
        traceback.print_exc()
        return False


async def main():
    """Main function"""
    print("🚀 Starting Async Database Test")
    # Parse command line arguments
    args = parse_args()

    # Initialize the configuration
    config.init_config(args.config, args)

    # Initialize logger
    logger.init_logger(config=config)

    # Run the test
    success = await test_async_database()

    if success:
        print("\n✅ All tests completed successfully!")
        sys.exit(0)
    else:
        print("\n❌ Tests failed!")
        sys.exit(1)


if __name__ == "__main__":
    asyncio.run(main())
test/test_concurrent_eval.py
ADDED
@@ -0,0 +1,97 @@
#!/usr/bin/env python3
"""
Test script for concurrent evaluation operations
"""

import asyncio
import aiohttp
import json
import sys
from pathlib import Path

# Add the project root to the path
root = str(Path(__file__).resolve().parents[1])
sys.path.append(root)

# Test papers (these should exist in your database)
TEST_PAPERS = [
    "2401.00001",
    "2401.00002",
    "2401.00003"
]

BASE_URL = "http://localhost:7860"

async def test_concurrent_evaluations():
    """Test concurrent evaluation of multiple papers"""
    print("🧪 Testing Concurrent Evaluations")

    async with aiohttp.ClientSession() as session:
        # Start multiple evaluations concurrently
        tasks = []
        for arxiv_id in TEST_PAPERS:
            print(f"Starting evaluation for {arxiv_id}")
            task = asyncio.create_task(start_evaluation(session, arxiv_id))
            tasks.append(task)

        # Wait for all evaluations to start
        results = await asyncio.gather(*tasks, return_exceptions=True)

        print("\n=== Evaluation Start Results ===")
        for i, result in enumerate(results):
            if isinstance(result, Exception):
                print(f"❌ Error starting evaluation for {TEST_PAPERS[i]}: {result}")
            else:
                print(f"✅ Started evaluation for {TEST_PAPERS[i]}: {result.get('status')}")

        # Check active tasks
        print("\n=== Checking Active Tasks ===")
        async with session.get(f"{BASE_URL}/api/papers/evaluate/active-tasks") as response:
            if response.status == 200:
                active_tasks = await response.json()
                print(f"Active tasks: {active_tasks['total_active']}")
                print(f"Tracked tasks: {active_tasks['total_tracked']}")
                for arxiv_id, task_info in active_tasks['active_tasks'].items():
                    print(f"  - {arxiv_id}: {task_info['status']}")
            else:
                print(f"❌ Failed to get active tasks: {response.status}")

        # Monitor status for a few seconds
        print("\n=== Monitoring Status ===")
        for _ in range(5):
            await asyncio.sleep(2)
            for arxiv_id in TEST_PAPERS:
                async with session.get(f"{BASE_URL}/api/papers/evaluate/{arxiv_id}/status") as response:
                    if response.status == 200:
                        status = await response.json()
                        print(f"{arxiv_id}: {status['status']} (running: {status.get('is_running', False)})")
                    else:
                        print(f"❌ Failed to get status for {arxiv_id}")


async def start_evaluation(session, arxiv_id):
    """Start evaluation for a specific paper"""
    async with session.post(f"{BASE_URL}/api/papers/evaluate/{arxiv_id}") as response:
        if response.status == 200:
            return await response.json()
        else:
            error_text = await response.text()
            raise Exception(f"HTTP {response.status}: {error_text}")


async def main():
    """Main function"""
    print("🚀 Starting Concurrent Evaluation Test")

    try:
        await test_concurrent_evaluations()
        print("\n✅ Concurrent evaluation test completed!")
    except Exception as e:
        print(f"\n❌ Test failed: {str(e)}")
        import traceback
        traceback.print_exc()
        sys.exit(1)


if __name__ == "__main__":
    asyncio.run(main())
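The fan-out pattern in the test above (one `asyncio.create_task` per paper, then `asyncio.gather(..., return_exceptions=True)` so a single failure does not cancel the rest) can be reduced to a self-contained sketch without the HTTP layer; `fake_start` is a hypothetical stand-in for the POST call:

```python
import asyncio

async def fake_start(arxiv_id: str) -> dict:
    # Stand-in for the HTTP POST to /api/papers/evaluate/{arxiv_id}
    await asyncio.sleep(0)
    if arxiv_id.endswith("3"):
        raise RuntimeError(f"HTTP 500 for {arxiv_id}")
    return {"arxiv_id": arxiv_id, "status": "started"}

async def main() -> list:
    papers = ["2401.00001", "2401.00002", "2401.00003"]
    tasks = [asyncio.create_task(fake_start(p)) for p in papers]
    # return_exceptions=True delivers exceptions as list items
    # instead of propagating the first one and cancelling siblings
    return await asyncio.gather(*tasks, return_exceptions=True)

results = asyncio.run(main())
print([r["status"] if isinstance(r, dict) else "error" for r in results])
# → ['started', 'started', 'error']
```

This is why the test checks each result with `isinstance(result, Exception)` rather than wrapping the gather in a try/except.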
test_evaluation.py → test/test_evaluation.py
RENAMED
@@ -15,7 +15,7 @@ from mmengine import DictAction
 load_dotenv(verbose=True)
 
 # Set the project root path
-root = str(Path(__file__).
+root = str(Path(__file__).resolve().parents[1])
 sys.path.append(root)
 
 from src.database import db
@@ -64,13 +64,13 @@ async def test_evaluation():
 
     try:
         # Check if paper exists in database
-        paper = db.get_paper(test_arxiv_id)
+        paper = await db.get_paper(test_arxiv_id)
         if paper:
             print(f"✅ Paper found in database: {paper['title']}")
         else:
             print(f"⚠️ Paper not in database, creating new record")
             # Insert test paper
-            db.insert_paper(
+            await db.insert_paper(
                 arxiv_id=test_arxiv_id,
                 title="Test Paper for Evaluation",
                 authors="Test Author",
@@ -100,7 +100,7 @@ async def test_evaluation():
             print("⚠️ Evaluation result may be incomplete")
 
         # Check evaluation status in database
-        updated_paper = db.get_paper(test_arxiv_id)
+        updated_paper = await db.get_paper(test_arxiv_id)
         if updated_paper and updated_paper.get('is_evaluated'):
             print("✅ Evaluation saved to database")
             print(f"Evaluation score: {updated_paper.get('evaluation_score')}")
@@ -123,14 +123,14 @@ async def test_database_operations():
 
     try:
         # Test getting paper
-        paper = db.get_paper("2508.09889")
+        paper = await db.get_paper("2508.09889")
         if paper:
             print(f"✅ Database connection OK, found paper: {paper['title']}")
         else:
             print("⚠️ Test paper not found in database")
 
         # Test getting paper statistics
-        stats = db.get_papers_count()
+        stats = await db.get_papers_count()
         print(f"✅ Paper statistics: Total={stats['total']}, Evaluated={stats['evaluated']}, Unevaluated={stats['unevaluated']}")
 
         return True
@@ -156,7 +156,7 @@ async def main():
     logger.info(f"| Config:\n{config.pretty_text}")
 
     # Initialize database
-    db.init_db(config=config)
+    await db.init_db(config=config)
     logger.info(f"| Database initialized at: {config.db_path}")
 
     print(f"✅ Database initialized: {config.db_path}")