CatPtain committed on
Commit bf310a8 · verified · 1 Parent(s): 80b4048

Upload 4 files

Files changed (3)
  1. Dockerfile +29 -36
  2. README.md +19 -110
  3. server.js +14 -35
Dockerfile CHANGED
@@ -1,37 +1,30 @@
- # Lightweight HF Spaces Dockerfile
- FROM ghcr.io/puppeteer/puppeteer:21.5.2
-
- # Switch to the root user for installation
- USER root
-
- # Install additional fonts and dependencies
- RUN apt-get update && apt-get install -y \
-     fonts-liberation \
-     fonts-dejavu-core \
-     && rm -rf /var/lib/apt/lists/*
-
- # Set the working directory
- WORKDIR /usr/src/app
-
- # Copy package files
- COPY package*.json ./
-
- # Install dependencies
- RUN npm ci --only=production && npm cache clean --force
-
- # Copy application code
- COPY . .
-
- # Switch back to the non-root user
- USER pptruser
-
- # Set environment variables
- ENV NODE_ENV=production
- ENV PORT=7860
- ENV PUPPETEER_EXECUTABLE_PATH=/usr/bin/google-chrome-stable
-
- # Expose the port
- EXPOSE 7860
-
- # Start command
+ # Minimal HF Spaces Dockerfile - fixed version
+ FROM ghcr.io/puppeteer/puppeteer:21.5.2
+
+ # Set the working directory directly
+ WORKDIR /usr/src/app
+
+ # Copy package files
+ COPY package*.json ./
+
+ # Switch to the root user to install dependencies
+ USER root
+
+ # Install Node.js dependencies
+ RUN npm ci --only=production && npm cache clean --force
+
+ # Copy application code
+ COPY . .
+
+ # Switch back to the non-root user
+ USER pptruser
+
+ # Set environment variables
+ ENV NODE_ENV=production
+ ENV PORT=7860
+
+ # Expose the port
+ EXPOSE 7860
+
+ # Start command
  CMD ["npm", "start"]
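The Dockerfile no longer sets `PUPPETEER_EXECUTABLE_PATH`, and the `server.js` change in this commit stops reading it. The following is a minimal sketch of the conditional pattern the old `server.js` used, written as a pure function so it can be checked in isolation. `buildLaunchOptions` and the shortened `args` list are hypothetical and not part of the commit.

```javascript
// Hypothetical helper: builds the options object that would be passed to
// puppeteer.launch(). Only sets executablePath when the variable is present,
// so an image without the ENV line falls back to Puppeteer's default browser.
function buildLaunchOptions(env = process.env) {
  const options = {
    headless: 'new',
    // Shortened flag list for illustration; the commit uses a longer one.
    args: ['--no-sandbox', '--disable-renderer-backgrounding'],
  };
  if (env.PUPPETEER_EXECUTABLE_PATH) {
    options.executablePath = env.PUPPETEER_EXECUTABLE_PATH;
  }
  return options;
}
```

With the `ENV PUPPETEER_EXECUTABLE_PATH` line removed, a helper like this would return options without `executablePath`, which is equivalent to the inline `puppeteer.launch({ ... })` call in the new `server.js`.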
README.md CHANGED
@@ -10,138 +10,47 @@ license: mit

  # Page Screenshot API

- A web service that captures screenshots of web pages using Puppeteer.
+ A web service that captures screenshots of web pages using Puppeteer, optimized for Hugging Face Spaces.

  ## Features
  - Web page screenshot capture
- - Customizable dimensions (width/height)
+ - Customizable dimensions (width/height)
  - Adjustable image quality
  - Rate limiting for API protection
  - CORS enabled for cross-origin requests
+ - Interactive demo interface
+
+ ## Live Demo
+ Visit `/demo` to try the interactive screenshot tool!

  ## API Usage

  ### POST /screenshot
-
- Capture a screenshot of a web page.
-
- **Request Body:**
  ```json
  {
    "url": "https://example.com",
-   "width": 1920,
-   "height": 1080,
-   "quality": 80
+   "width": 1280,
+   "height": 720,
+   "quality": 75
  }
  ```

- **Parameters:**
- - `url` (required): The URL of the webpage to capture
- - `width` (optional): Screenshot width in pixels (default: 1920, range: 100-4000)
- - `height` (optional): Screenshot height in pixels (default: 1080, range: 100-4000)
- - `quality` (optional): JPEG quality (default: 80, range: 1-100)
-
- **Response:**
- Returns the screenshot as a JPEG image.
+ **HF Spaces Limits:**
+ - Width: 100-1600px
+ - Height: 100-1200px
+ - Timeout: 15 seconds
+ - Rate limit: 30 requests/15min

  ### GET /
+ Health check endpoint

- Health check endpoint that returns API status.
+ ### GET /demo
+ Interactive demo interface

  ## Example Usage
-
  ```bash
- curl -X POST https://your-app.railway.app/screenshot \
+ curl -X POST /screenshot \
    -H "Content-Type: application/json" \
-   -d '{"url": "https://example.com", "width": 1280, "height": 720}' \
+   -d '{"url": "https://google.com", "width": 1280, "height": 720}' \
    --output screenshot.jpg
- ```
-
- ## Rate Limiting
-
- - 100 requests per 15 minutes per IP address
-
- ## Deployment
-
- This application can be deployed on various platforms:
- - Hugging Face Spaces (Docker)
- - Railway
- - Render.com
- - Vercel
-
- For detailed deployment instructions, see `DEPLOYMENT_GUIDE.md`.
-
- ## Railway Deployment Guide
-
- ### 1. Prepare for deployment
- Make sure your project contains the following files:
- - `Dockerfile` - container configuration
- - `railway.toml` - Railway deployment configuration
- - `package.json` - dependencies and start script
-
- ### 2. Deploy to Railway
- There are two ways to deploy to Railway:
-
- #### Option 1: Connect via GitHub (recommended)
- 1. Push the code to a GitHub repository
- 2. Visit [Railway.app](https://railway.app)
- 3. Log in and click "New Project"
- 4. Select "Deploy from GitHub repo"
- 5. Select your repository
- 6. Railway automatically detects the Dockerfile and starts the deployment
-
- #### Option 2: Use the Railway CLI
- ```bash
- # Install the Railway CLI
- npm install -g @railway/cli
-
- # Log in to Railway
- railway login
-
- # Initialize the project
- railway init
-
- # Deploy
- railway up
- ```
-
- ### 3. Environment variables
- In the Variables tab of the Railway console, add:
- - `NODE_ENV=production`
- - `PORT` (set automatically by Railway; no manual configuration needed)
-
- ### 4. Resource configuration
- Recommended configuration:
- - CPU: 1 vCPU
- - Memory: 1GB RAM
-
- These settings are preset in `railway.toml`.
-
- ### 5. Custom domain (optional)
- Under Settings > Domains in the Railway console, you can:
- - Use the free subdomain provided by Railway
- - Bind your own domain
-
- ### 6. Monitoring and logs
- - View deployment status in the Deployments tab of the Railway console
- - Monitor resource usage in the Metrics tab
- - Manage environment variables in the Variables tab
-
- ### Troubleshooting
- If the deployment fails, check:
- 1. Whether the Dockerfile syntax is correct
- 2. Whether the start script in package.json is correct
- 3. Whether all dependencies are installed
- 4. Whether memory usage exceeds the limit
-
- ### Post-deployment testing
- ```bash
- # Health check
- curl https://your-app.railway.app/
-
- # Screenshot test
- curl -X POST https://your-app.railway.app/screenshot \
-   -H "Content-Type: application/json" \
-   -d '{"url": "https://google.com"}' \
-   --output test-screenshot.jpg
  ```
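The limits listed in the new README can be checked before a browser is launched. The sketch below is one way to do that. `validateDimensions` and `clampQuality` are hypothetical helper names, not functions from this commit. The integer check is an added assumption: `server.js` compares the raw request values, so a non-numeric string passes its range test. The fallback to 75 for an unparsable quality is also an assumption; the clamp itself mirrors the `Math.max(10, Math.min(100, ...))` expression in `server.js`.

```javascript
// Limits taken from the README's "HF Spaces Limits" list.
const LIMITS = { width: [100, 1600], height: [100, 1200] };

// Hypothetical helper: validates width/height against the documented ranges.
// Defaults (1280x720) match the destructuring defaults in server.js.
function validateDimensions(width = 1280, height = 720) {
  const w = Number(width);
  const h = Number(height);
  // Assumption: reject anything that is not an integer (e.g. "abc", 12.5).
  if (!Number.isInteger(w) || !Number.isInteger(h)) {
    return { ok: false, error: 'Width and height must be integers' };
  }
  const [minW, maxW] = LIMITS.width;
  const [minH, maxH] = LIMITS.height;
  if (w < minW || w > maxW || h < minH || h > maxH) {
    return {
      ok: false,
      error: 'Width must be 100-1600px, height must be 100-1200px for HF Spaces',
    };
  }
  return { ok: true, width: w, height: h };
}

// Hypothetical helper: clamps JPEG quality to 10-100.
// Assumption: fall back to the default of 75 when the value cannot be parsed.
function clampQuality(quality = 75) {
  const q = parseInt(quality, 10);
  return Number.isNaN(q) ? 75 : Math.max(10, Math.min(100, q));
}
```

A route handler could call `validateDimensions(req.body.width, req.body.height)` and return a 400 response with the `error` string whenever `ok` is false.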
server.js CHANGED
@@ -9,15 +9,15 @@ const PORT = process.env.PORT || 7860;

  // Middleware configuration - optimized for HF Spaces
  app.use(helmet({
-   contentSecurityPolicy: false // required for HF Spaces
+   contentSecurityPolicy: false
  }));
  app.use(cors());
  app.use(express.json({ limit: '10mb' }));

- // Rate limiting - adjusted for HF Spaces
+ // Rate limiting
  const limiter = rateLimit({
    windowMs: 15 * 60 * 1000,
-   max: 30, // lower the limit further
+   max: 30,
    message: {
      error: 'Too many requests, please try again later.'
    }
@@ -39,11 +39,10 @@ app.get('/', (req, res) => {
    });
  });

- // Screenshot API endpoint - enhanced error handling
+ // Screenshot API endpoint
  app.post('/screenshot', async (req, res) => {
    const { url, width = 1280, height = 720, quality = 75 } = req.body;

-   // Parameter validation
    if (!url) {
      return res.status(400).json({
        error: 'URL is required',
@@ -51,10 +50,8 @@ app.post('/screenshot', async (req, res) => {
      });
    }

-   // URL format validation
    try {
      const urlObj = new URL(url);
-     // Check the protocol
      if (!['http:', 'https:'].includes(urlObj.protocol)) {
        return res.status(400).json({
          error: 'Only HTTP and HTTPS URLs are supported'
@@ -66,7 +63,6 @@ app.post('/screenshot', async (req, res) => {
      });
    }

-   // Resolution validation - stricter limits for HF Spaces
    if (width < 100 || width > 1600 || height < 100 || height > 1200) {
      return res.status(400).json({
        error: 'Width must be 100-1600px, height must be 100-1200px for HF Spaces'
@@ -75,8 +71,10 @@ app.post('/screenshot', async (req, res) => {

    let browser;
    try {
-     // Launch the browser - HF Spaces specific configuration
-     const browserOptions = {
+     console.log('Launching browser...');
+
+     // HF Spaces optimized configuration - use the Puppeteer image's default Chrome
+     browser = await puppeteer.launch({
        headless: 'new',
        args: [
          '--no-sandbox',
@@ -89,35 +87,21 @@ app.post('/screenshot', async (req, res) => {
          '--disable-extensions',
          '--disable-background-timer-throttling',
          '--disable-backgrounding-occluded-windows',
-         '--disable-renderer-backgrounding',
-         '--disable-features=TranslateUI',
-         '--disable-default-apps',
-         '--no-default-browser-check',
-         '--disable-background-networking'
+         '--disable-renderer-backgrounding'
        ]
-     };
-
-     // Use the system Chrome in HF Spaces
-     if (process.env.PUPPETEER_EXECUTABLE_PATH) {
-       browserOptions.executablePath = process.env.PUPPETEER_EXECUTABLE_PATH;
-     }
-
-     console.log('Launching browser...');
-     browser = await puppeteer.launch(browserOptions);
+     });

      const page = await browser.newPage();

-     // Set the viewport size
      await page.setViewport({
        width: parseInt(width),
        height: parseInt(height),
        deviceScaleFactor: 1
      });

-     // Set the user agent and other page options
      await page.setUserAgent('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36');

-     // Intercept unnecessary resources to improve performance
+     // Intercept resources to improve performance
      await page.setRequestInterception(true);
      page.on('request', (req) => {
        const resourceType = req.resourceType();
@@ -130,18 +114,15 @@ app.post('/screenshot', async (req, res) => {

      console.log(`Navigating to: ${url}`);

-     // Navigate to the page - shorter timeout for HF Spaces
      await page.goto(url, {
-       waitUntil: 'domcontentloaded', // faster wait condition
-       timeout: 15000 // 15-second timeout
+       waitUntil: 'domcontentloaded',
+       timeout: 15000
      });

-     // Wait for the page to settle
      await page.waitForTimeout(1000);

      console.log('Taking screenshot...');

-     // Take the screenshot
      const screenshot = await page.screenshot({
        type: 'jpeg',
        quality: Math.max(10, Math.min(100, parseInt(quality))),
@@ -150,7 +131,6 @@ app.post('/screenshot', async (req, res) => {

      console.log(`Screenshot taken: ${screenshot.length} bytes`);

-     // Set response headers
      res.set({
        'Content-Type': 'image/jpeg',
        'Content-Length': screenshot.length,
@@ -167,7 +147,6 @@ app.post('/screenshot', async (req, res) => {
        message: error.message
      };

-     // Provide better error messages based on the error type
      if (error.message.includes('timeout')) {
        errorResponse.suggestion = 'Try a simpler webpage or reduce timeout';
      } else if (error.message.includes('net::')) {
@@ -187,7 +166,7 @@ app.post('/screenshot', async (req, res) => {
    }
  });

- // HF Spaces demo interface - improved version
+ // HF Spaces demo interface
  app.get('/demo', (req, res) => {
    res.send(`
      <!DOCTYPE html>