---
title: P2P Paper-to-Poster Generator
emoji: 🎓
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: mit
---

# P2P: Automated Paper-to-Poster Generation and Fine-Grained Benchmark

[![](https://img.shields.io/badge/arXiv-2505.17104-b31b1b.svg?style=for-the-badge)](https://arxiv.org/abs/2505.17104)

[![Dataset - P2PInstruct](https://img.shields.io/badge/Dataset-P2PInstruct-blue)](https://huggingface.co/datasets/ASC8384/P2PInstruct)
[![Dataset - P2PEval](https://img.shields.io/badge/Dataset-P2PEval-blue)](https://huggingface.co/datasets/ASC8384/P2PEval)

## 🚀 Try it on Hugging Face Spaces

This application is deployed on Hugging Face Spaces with the Docker SDK, which allows system-level dependencies such as Playwright. You can try it directly in your browser without installing anything:

**🎓 [Launch P2P Paper-to-Poster Generator](https://huggingface.co/spaces/ASC8384/P2P)**

### Docker Deployment on HF Spaces:
1. **SDK**: Uses `docker` instead of `gradio` to support system-level dependencies
2. **Playwright Support**: Automatically installs and configures Playwright browsers
3. **Pre-built Environment**: No manual setup required for complex dependencies

### Quick Start on Spaces:
1. Upload your PDF research paper
2. Enter your OpenAI API key and base URL (if using a proxy)
3. Enter the AI model name (e.g., gpt-4o-mini, claude-3-sonnet)
4. Configure the figure detection service URL
5. Click "Generate Poster" and wait for processing
6. Preview the generated poster and download JSON/HTML files

⚠️ **Requirements**:
- Valid OpenAI API key with sufficient balance
- Figure detection service URL for extracting images from PDFs
- Compatible AI model (OpenAI, Claude, Gemini, etc.)

💡 **Features**:
- Real-time HTML poster preview
- Direct JSON structure display
- Support for multiple AI models
- Flexible API configuration
- Advanced layout optimization with Playwright

## Overview

P2P is an AI-powered tool that automatically converts academic research papers into professional conference posters. This repository contains the code for generating and evaluating these posters, leveraging large language models to extract key information and create visually appealing presentations.

The full research paper is available on [arXiv](https://arxiv.org/abs/2505.17104).

**Note:** Because the evaluation and training datasets are large, this repository includes only a few sample items. The complete datasets are available on Hugging Face:
- [P2PInstruct](https://huggingface.co/datasets/ASC8384/P2PInstruct) - Training dataset
- [P2PEval](https://huggingface.co/datasets/ASC8384/P2PEval) - Benchmark dataset

## Repository Structure

### Core Files
- `main.py`: Main entry point for generating a poster from a single paper
- `start.py`: Batch processing script for generating posters from multiple papers 
- `end.py`: Evaluation coordinator that processes generated posters
- `evalv2.py`: Core evaluation logic with metrics and comparison methods
- `figure_detection.py`: Utility for detecting and extracting figures from PDFs

### Directories
- `poster/`: Core poster generation logic
  - `poster.py`: Main poster generation implementation
  - `figures.py`: Figure extraction and processing utilities
  - `compress.py`: Image compression utilities
  - `loader.py`: PDF loading utilities

- `eval/`: Evaluation tools and resources
  - `eval_checklist.py`: Checklist-based evaluation implementation
  - `predict_with_xgboost.py`: ML-based poster quality prediction
  - `common.yaml`: Common evaluation parameters
  - `xgboost_model.joblib`: Pre-trained evaluation model

## Requirements

- Python 3.10+
- Dependencies listed in `requirements.txt`

## Setup

Install dependencies:
```bash
pip install -r requirements.txt
playwright install
```

## Usage

### Generating a Single Poster

To generate a poster from a single paper:

```bash
# Start the figure detection service (figure_detection.py) first, then pass its URL via --url
python main.py --url="FIGURE_DETECTION_SERVICE_URL" --pdf="path/to/paper.pdf" --model="gpt-4o-mini" --output="output/poster.json"
```

#### Parameters:
- `--url`: URL of the PDF processing service that detects and extracts figures
- `--pdf`: Path to the local PDF file
- `--model`: LLM to use (default: gpt-4o-mini)
- `--output`: Output file path (default: poster.json)

#### Output Files:
- `poster.json`: JSON representation of the poster
- `poster.html`: HTML version of the poster
- `poster.png`: PNG image of the poster

### Batch Generating Posters

To generate posters for multiple papers:

1. Organize your papers in a directory structure:
```
eval/data/
  └─ paper_id_1/
     └─ paper.pdf
  └─ paper_id_2/
     └─ paper.pdf
  ...
```

2. Edit `start.py` to configure:
   - `url`: URL for PDF processing service
   - `input_dir`: Directory containing papers (default: "eval/data")
   - `models`: List of AI models to use for generation

3. Run the batch generation script:
```bash
python start.py
```

Generated posters will be saved to:
```
eval/temp-v2/{model_name}/{paper_id}/
  └─ poster.json
  └─ poster.html
  └─ poster.png
```
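
The input and output layouts above can be checked with a short script, for example to see which papers still lack a generated poster. This is a sketch under the directory conventions described in this section only; the function names `find_papers`, `output_paths`, and `pending` are hypothetical and do not come from `start.py`.

```python
from pathlib import Path


def find_papers(input_dir="eval/data"):
    """Return the sorted IDs of sub-directories that contain a paper.pdf."""
    root = Path(input_dir)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if (p / "paper.pdf").is_file())


def output_paths(model_name, paper_id, out_root="eval/temp-v2"):
    """Map each output type to its expected path for one model and paper."""
    base = Path(out_root) / model_name / paper_id
    return {ext: base / f"poster.{ext}" for ext in ("json", "html", "png")}


def pending(models, input_dir="eval/data", out_root="eval/temp-v2"):
    """List (model, paper_id) pairs whose poster.json does not exist yet."""
    return [
        (model, paper_id)
        for model in models
        for paper_id in find_papers(input_dir)
        if not output_paths(model, paper_id, out_root)["json"].is_file()
    ]


for model, paper_id in pending(["gpt-4o-mini"]):
    print(f"missing poster for {paper_id} with {model}")
```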

### Evaluating Posters

To evaluate generated posters:

1. Ensure reference materials exist:
```
eval/data/{paper_id}/
  └─ poster.png (reference poster)
  └─ checklist.yaml (evaluation checklist)
```

2. Run the evaluation script:
```bash
python end.py
```

Evaluation results will be saved to `eval/temp-v2/results.jsonl`.
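
Before running the evaluation, it may be worth confirming that every paper directory has its reference files, and afterwards loading the results for inspection. The sketch below assumes only what this section states: the reference file names and the JSON Lines results path. The record schema is not documented here, so each line is parsed as a generic JSON value; both function names are hypothetical.

```python
import json
from pathlib import Path

# Reference files expected in each eval/data/{paper_id}/ directory.
REFERENCE_FILES = ("poster.png", "checklist.yaml")


def missing_references(data_dir="eval/data"):
    """Return {paper_id: [missing file names]} for incomplete directories."""
    root = Path(data_dir)
    report = {}
    if not root.is_dir():
        return report
    for paper_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        missing = [n for n in REFERENCE_FILES if not (paper_dir / n).is_file()]
        if missing:
            report[paper_dir.name] = missing
    return report


def load_results(path="eval/temp-v2/results.jsonl"):
    """Parse a JSON Lines file into a list, skipping blank lines."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


for paper_id, names in missing_references().items():
    print(f"{paper_id}: missing {', '.join(names)}")
```

After `python end.py` finishes, `len(load_results())` gives the number of records written.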

## Citation

If you find our work useful, please consider citing P2P:

```bibtex
@misc{sun2025p2pautomatedpapertopostergeneration,
      title={P2P: Automated Paper-to-Poster Generation and Fine-Grained Benchmark}, 
      author={Tao Sun and Enhao Pan and Zhengkai Yang and Kaixin Sui and Jiajun Shi and Xianfu Cheng and Tongliang Li and Wenhao Huang and Ge Zhang and Jian Yang and Zhoujun Li},
      year={2025},
      eprint={2505.17104},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.17104}, 
}
```