Upload 4 files
- Dockerfile +29 -36
- README.md +19 -110
- server.js +14 -35
Dockerfile
CHANGED
@@ -1,37 +1,30 @@
-#
-FROM ghcr.io/puppeteer/puppeteer:21.5.2
-
-#
-
-
-#
-
-
-
-
-
-#
-
-
-#
-COPY
-
-#
-
-
-#
-
-
-
-
-
-
-
-ENV PORT=7860
-ENV PUPPETEER_EXECUTABLE_PATH=/usr/bin/google-chrome-stable
-
-# Expose the port
-EXPOSE 7860
-
-# Start command
+# Minimal HF Spaces Dockerfile - fixed version
+FROM ghcr.io/puppeteer/puppeteer:21.5.2
+
+# Set the working directory directly
+WORKDIR /usr/src/app
+
+# Copy package files
+COPY package*.json ./
+
+# Switch to root to install dependencies
+USER root
+
+# Install Node.js dependencies
+RUN npm ci --only=production && npm cache clean --force
+
+# Copy application code
+COPY . .
+
+# Switch back to the non-root user
+USER pptruser
+
+# Set environment variables
+ENV NODE_ENV=production
+ENV PORT=7860
+
+# Expose the port
+EXPOSE 7860
+
+# Start command
 CMD ["npm", "start"]
README.md
CHANGED
@@ -10,138 +10,47 @@ license: mit
 
 # Page Screenshot API
 
-A web service that captures screenshots of web pages using Puppeteer.
+A web service that captures screenshots of web pages using Puppeteer, optimized for Hugging Face Spaces.
 
 ## Features
 - Web page screenshot capture
-- Customizable dimensions (width/height)
+- Customizable dimensions (width/height)
 - Adjustable image quality
 - Rate limiting for API protection
 - CORS enabled for cross-origin requests
+- Interactive demo interface
+
+## Live Demo
+Visit `/demo` to try the interactive screenshot tool!
 
 ## API Usage
 
 ### POST /screenshot
-
-Capture a screenshot of a web page.
-
-**Request Body:**
 ```json
 {
   "url": "https://example.com",
-  "width":
-  "height":
-  "quality":
+  "width": 1280,
+  "height": 720,
+  "quality": 75
 }
 ```
 
-**
--
--
--
--
-
-**Response:**
-Returns the screenshot as a JPEG image.
+**HF Spaces Limits:**
+- Width: 100-1600px
+- Height: 100-1200px
+- Timeout: 15 seconds
+- Rate limit: 30 requests/15min
 
 ### GET /
+Health check endpoint
 
-
+### GET /demo
+Interactive demo interface
 
 ## Example Usage
-
 ```bash
-curl -X POST
+curl -X POST /screenshot \
   -H "Content-Type: application/json" \
-  -d '{"url": "https://
+  -d '{"url": "https://google.com", "width": 1280, "height": 720}' \
   --output screenshot.jpg
-```
-
-## Rate Limiting
-
-- 100 requests per 15 minutes per IP address
-
-## Deployment
-
-This application can be deployed on various platforms:
-- Hugging Face Spaces (Docker)
-- Railway
-- Render.com
-- Vercel
-
-For detailed deployment instructions, see `DEPLOYMENT_GUIDE.md`.
-
-## Railway Deployment Guide
-
-### 1. Prepare for deployment
-Make sure your project contains the following files:
-- `Dockerfile` - container configuration
-- `railway.toml` - Railway deployment configuration
-- `package.json` - dependencies and start script
-
-### 2. Deploy to Railway
-There are two ways to deploy to Railway:
-
-#### Option 1: Connect via GitHub (recommended)
-1. Push the code to a GitHub repository
-2. Visit [Railway.app](https://railway.app)
-3. Log in and click "New Project"
-4. Select "Deploy from GitHub repo"
-5. Choose your repository
-6. Railway automatically detects the Dockerfile and starts the deployment
-
-#### Option 2: Use the Railway CLI
-```bash
-# Install the Railway CLI
-npm install -g @railway/cli
-
-# Log in to Railway
-railway login
-
-# Initialize the project
-railway init
-
-# Deploy
-railway up
-```
-
-### 3. Configure environment variables
-In the Variables tab of the Railway dashboard, add:
-- `NODE_ENV=production`
-- `PORT` (set automatically by Railway; no manual configuration needed)
-
-### 4. Resource configuration
-Recommended configuration:
-- CPU: 1 vCPU
-- Memory: 1GB RAM
-
-These settings are preset in `railway.toml`.
-
-### 5. Custom domain (optional)
-Under Settings > Domains in the Railway dashboard, you can:
-- Use the free subdomain provided by Railway
-- Bind your own domain
-
-### 6. Monitoring and logs
-- Check deployment status in the Deployments tab of the Railway dashboard
-- Monitor resource usage in the Metrics tab
-- Manage environment variables in the Variables tab
-
-### Troubleshooting
-If the deployment fails, check:
-1. Whether the Dockerfile syntax is correct
-2. Whether the start script in package.json is correct
-3. Whether all dependencies are installed
-4. Whether memory usage exceeds the limit
-
-### Post-deployment testing
-```bash
-# Health check
-curl https://your-app.railway.app/
-
-# Screenshot test
-curl -X POST https://your-app.railway.app/screenshot \
-  -H "Content-Type: application/json" \
-  -d '{"url": "https://google.com"}' \
-  --output test-screenshot.jpg
 ```
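The updated README lists request limits (width 100-1600px, height 100-1200px) that a client can check before calling `POST /screenshot`. The following is a minimal sketch of such a client-side check, not code from this commit: `LIMITS` and `buildScreenshotRequest` are hypothetical names, and the defaults simply mirror the README's example body.

```javascript
// Hypothetical client-side helper; not part of the commit.
// The limits are copied from the "HF Spaces Limits" list in the README.
const LIMITS = { width: [100, 1600], height: [100, 1200] };

function buildScreenshotRequest(url, { width = 1280, height = 720, quality = 75 } = {}) {
  // new URL() throws a TypeError on malformed input.
  const parsed = new URL(url);
  if (!['http:', 'https:'].includes(parsed.protocol)) {
    throw new RangeError('Only HTTP and HTTPS URLs are supported');
  }

  // Reject out-of-range or non-integer dimensions before sending anything.
  for (const [name, value] of [['width', width], ['height', height]]) {
    const [min, max] = LIMITS[name];
    if (!Number.isInteger(value) || value < min || value > max) {
      throw new RangeError(`${name} must be an integer in ${min}-${max}px`);
    }
  }

  // Options object in the shape accepted by fetch().
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ url, width, height, quality }),
  };
}
```

The returned object could be passed as the second argument to `fetch`, with the first argument being the deployed service's `/screenshot` URL. That base address is not given in the README, so it would have to be supplied by the caller.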
server.js
CHANGED
@@ -9,15 +9,15 @@ const PORT = process.env.PORT || 7860;
 
 // Middleware configuration - optimized for HF Spaces
 app.use(helmet({
-  contentSecurityPolicy: false
+  contentSecurityPolicy: false
 }));
 app.use(cors());
 app.use(express.json({ limit: '10mb' }));
 
-// Rate limiting
+// Rate limiting
 const limiter = rateLimit({
   windowMs: 15 * 60 * 1000,
-  max: 30,
+  max: 30,
   message: {
     error: 'Too many requests, please try again later.'
   }
@@ -39,11 +39,10 @@ app.get('/', (req, res) => {
   });
 });
 
-// Screenshot API endpoint
+// Screenshot API endpoint
 app.post('/screenshot', async (req, res) => {
   const { url, width = 1280, height = 720, quality = 75 } = req.body;
 
-  // Parameter validation
   if (!url) {
     return res.status(400).json({
       error: 'URL is required',
@@ -51,10 +50,8 @@ app.post('/screenshot', async (req, res) => {
     });
   }
 
-  // URL format validation
   try {
     const urlObj = new URL(url);
-    // Check the protocol
     if (!['http:', 'https:'].includes(urlObj.protocol)) {
       return res.status(400).json({
         error: 'Only HTTP and HTTPS URLs are supported'
@@ -66,7 +63,6 @@ app.post('/screenshot', async (req, res) => {
     });
   }
 
-  // Resolution validation - stricter limits for HF Spaces
   if (width < 100 || width > 1600 || height < 100 || height > 1200) {
     return res.status(400).json({
       error: 'Width must be 100-1600px, height must be 100-1200px for HF Spaces'
@@ -75,8 +71,10 @@ app.post('/screenshot', async (req, res) => {
 
   let browser;
   try {
-
-
+    console.log('Launching browser...');
+
+    // HF Spaces optimized configuration - use the default Chrome from the Puppeteer image
+    browser = await puppeteer.launch({
       headless: 'new',
       args: [
         '--no-sandbox',
@@ -89,35 +87,21 @@ app.post('/screenshot', async (req, res) => {
         '--disable-extensions',
         '--disable-background-timer-throttling',
         '--disable-backgrounding-occluded-windows',
-        '--disable-renderer-backgrounding'
-        '--disable-features=TranslateUI',
-        '--disable-default-apps',
-        '--no-default-browser-check',
-        '--disable-background-networking'
+        '--disable-renderer-backgrounding'
       ]
-    };
-
-    // Use the system Chrome on HF Spaces
-    if (process.env.PUPPETEER_EXECUTABLE_PATH) {
-      browserOptions.executablePath = process.env.PUPPETEER_EXECUTABLE_PATH;
-    }
-
-    console.log('Launching browser...');
-    browser = await puppeteer.launch(browserOptions);
+    });
 
     const page = await browser.newPage();
 
-    // Set the viewport size
     await page.setViewport({
       width: parseInt(width),
       height: parseInt(height),
       deviceScaleFactor: 1
     });
 
-    // Set the user agent and other page options
     await page.setUserAgent('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36');
 
-    //
+    // Intercept resources to improve performance
     await page.setRequestInterception(true);
     page.on('request', (req) => {
       const resourceType = req.resourceType();
@@ -130,18 +114,15 @@ app.post('/screenshot', async (req, res) => {
 
     console.log(`Navigating to: ${url}`);
 
-    // Visit the page - shorter timeout for HF Spaces
     await page.goto(url, {
-      waitUntil: 'domcontentloaded',
-      timeout: 15000
+      waitUntil: 'domcontentloaded',
+      timeout: 15000
     });
 
-    // Wait for the page to settle
     await page.waitForTimeout(1000);
 
     console.log('Taking screenshot...');
 
-    // Take the screenshot
     const screenshot = await page.screenshot({
       type: 'jpeg',
       quality: Math.max(10, Math.min(100, parseInt(quality))),
@@ -150,7 +131,6 @@ app.post('/screenshot', async (req, res) => {
 
     console.log(`Screenshot taken: ${screenshot.length} bytes`);
 
-    // Set response headers
     res.set({
       'Content-Type': 'image/jpeg',
       'Content-Length': screenshot.length,
@@ -167,7 +147,6 @@ app.post('/screenshot', async (req, res) => {
       message: error.message
     };
 
-    // Provide a more helpful error message based on the error type
     if (error.message.includes('timeout')) {
       errorResponse.suggestion = 'Try a simpler webpage or reduce timeout';
     } else if (error.message.includes('net::')) {
@@ -187,7 +166,7 @@ app.post('/screenshot', async (req, res) => {
   }
 });
 
-// HF Spaces demo interface
+// HF Spaces demo interface
 app.get('/demo', (req, res) => {
   res.send(`
     <!DOCTYPE html>
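Two details of the handler in this diff may deserve a second look. The range check `width < 100 || width > 1600 || ...` evaluates to false for a non-numeric string such as `"abc"`, so such a value passes validation. Likewise, `Math.max(10, Math.min(100, parseInt(quality)))` evaluates to `NaN` when `quality` is not numeric. The sketch below shows one possible way to normalize these inputs first. `clampQuality` and `parseDimension` are hypothetical helpers, not part of the commit, and the fallback of 75 is assumed from the handler's default.

```javascript
// Hypothetical helpers; not part of the commit.

// Clamp JPEG quality to 10-100, falling back when the input is not numeric.
function clampQuality(raw, fallback = 75) {
  const n = Number.parseInt(raw, 10);
  if (Number.isNaN(n)) return fallback;
  return Math.max(10, Math.min(100, n));
}

// Parse a pixel dimension; return null when it is non-numeric or out of range,
// so the caller can respond with HTTP 400 instead of passing NaN to Puppeteer.
function parseDimension(raw, min, max) {
  const n = Number.parseInt(raw, 10);
  return Number.isNaN(n) || n < min || n > max ? null : n;
}
```

In the handler, `parseDimension(width, 100, 1600)` and `parseDimension(height, 100, 1200)` could stand in for the inline range check, returning the 400 response whenever either result is `null`.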