Spaces:
Build error
Build error
Zenith Wang
commited on
Commit
·
139c357
1
Parent(s):
d2001c1
支持CoT推理展示,优化界面布局,简化说明文档
Browse files
README.md
CHANGED
@@ -1,5 +1,5 @@
|
|
1 |
---
|
2 |
-
title: Step-3
|
3 |
emoji: 🤖
|
4 |
colorFrom: purple
|
5 |
colorTo: blue
|
@@ -10,65 +10,29 @@ pinned: false
|
|
10 |
license: mit
|
11 |
---
|
12 |
|
13 |
-
# Step-3
|
14 |
|
15 |
-
|
16 |
|
17 |
-
##
|
18 |
|
19 |
-
-
|
20 |
-
-
|
21 |
-
-
|
22 |
-
-
|
23 |
|
24 |
-
##
|
25 |
|
26 |
-
|
|
|
|
|
27 |
|
28 |
-
|
29 |
-
- 进入 Space 的 Settings 页面
|
30 |
-
- 在 "Repository secrets" 部分添加:
|
31 |
-
- Name: `STEP_API_KEY`
|
32 |
-
- Value: 你的阶跃星辰 API 密钥
|
33 |
|
34 |
-
|
35 |
-
- 上传一张图片
|
36 |
-
- 输入提示词(例如:"这是什么?请详细描述")
|
37 |
-
- 点击"开始分析"
|
38 |
-
- 等待 AI 返回结果
|
39 |
-
|
40 |
-
### 获取 API 密钥
|
41 |
-
|
42 |
-
1. 访问 [阶跃星辰官网](https://www.stepfun.com/)
|
43 |
-
2. 注册/登录账号
|
44 |
-
3. 在控制台创建 API 密钥
|
45 |
-
|
46 |
-
## 示例提示词
|
47 |
-
|
48 |
-
- "这张图片中有什么内容?请详细描述。"
|
49 |
-
- "帮我看看这是什么菜,如何制作?"
|
50 |
-
- "分析这张图片的构图和色彩运用。"
|
51 |
-
- "这张图片可能是在什么地方拍摄的?"
|
52 |
-
- "图片中的人物在做什么?他们的表情如何?"
|
53 |
|
54 |
## 技术栈
|
55 |
|
56 |
- **模型**: Step-3 / Step-r1-v-mini
|
57 |
- **框架**: Gradio 4.19.2
|
58 |
-
- **API**: OpenAI Python SDK (兼容
|
59 |
-
|
60 |
-
## 注意事项
|
61 |
-
|
62 |
-
- 请确保图片清晰度足够
|
63 |
-
- 提示词越具体,分析结果越准确
|
64 |
-
- API 密钥请妥善保管,不要公开分享
|
65 |
-
|
66 |
-
## 许可证
|
67 |
-
|
68 |
-
MIT License
|
69 |
-
|
70 |
-
## 致谢
|
71 |
-
|
72 |
-
- [阶跃星辰](https://www.stepfun.com/) - 提供强大的 AI 模型
|
73 |
-
- [Gradio](https://gradio.app/) - 提供优秀的 Web UI 框架
|
74 |
-
- [Hugging Face](https://huggingface.co/) - 提供免费的部署平台
|
|
|
1 |
---
|
2 |
+
title: Step-3
|
3 |
emoji: 🤖
|
4 |
colorFrom: purple
|
5 |
colorTo: blue
|
|
|
10 |
license: mit
|
11 |
---
|
12 |
|
13 |
+
# Step-3 🤖
|
14 |
|
15 |
+
智能图像理解和分析工具,支持 Chain of Thought (CoT) 推理展示。
|
16 |
|
17 |
+
## 主要特性
|
18 |
|
19 |
+
- 🧠 **CoT 推理展示**:实时显示模型的思考过程
|
20 |
+
- 🔄 **流式输出**:推理过程和最终答案分开展示
|
21 |
+
- 🖼️ **图像分析**:支持多种图片格式
|
22 |
+
- 📝 **双模型支持**:Step-3 和 Step-r1-v-mini
|
23 |
|
24 |
+
## 如何配置
|
25 |
|
26 |
+
在 Hugging Face Space 的 Settings → Repository secrets 中添加:
|
27 |
+
- **Name**: `STEP_API_KEY`
|
28 |
+
- **Value**: 你的 Step API 密钥
|
29 |
|
30 |
+
## 获取 API 密钥
|
|
|
|
|
|
|
|
|
31 |
|
32 |
+
访问 [阶跃星辰官网](https://www.stepfun.com/) 注册并获取 API 密钥。
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
33 |
|
34 |
## 技术栈
|
35 |
|
36 |
- **模型**: Step-3 / Step-r1-v-mini
|
37 |
- **框架**: Gradio 4.19.2
|
38 |
+
- **API**: OpenAI Python SDK (兼容)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
app.py
CHANGED
@@ -8,7 +8,7 @@ from PIL import Image
|
|
8 |
|
9 |
# 配置
|
10 |
BASE_URL = "https://api.stepfun.com/v1"
|
11 |
-
# 从环境变量获取API
|
12 |
STEP_API_KEY = os.environ.get("STEP_API_KEY", "")
|
13 |
|
14 |
# 可选模型
|
@@ -19,7 +19,6 @@ def image_to_base64(image):
|
|
19 |
if image is None:
|
20 |
return None
|
21 |
|
22 |
-
# 如果是PIL图像,直接处理
|
23 |
if isinstance(image, Image.Image):
|
24 |
buffered = BytesIO()
|
25 |
image.save(buffered, format="PNG")
|
@@ -28,25 +27,30 @@ def image_to_base64(image):
|
|
28 |
|
29 |
return None
|
30 |
|
31 |
-
def call_step_api(image, prompt, model, temperature=0.7, max_tokens=2000
|
32 |
-
"""调用Step API
|
33 |
|
34 |
if image is None:
|
35 |
-
|
|
|
36 |
|
37 |
if not prompt:
|
38 |
-
|
|
|
39 |
|
40 |
if not STEP_API_KEY:
|
41 |
-
|
|
|
42 |
|
43 |
# 转换图像为base64
|
44 |
try:
|
45 |
base64_image = image_to_base64(image)
|
46 |
if base64_image is None:
|
47 |
-
|
|
|
48 |
except Exception as e:
|
49 |
-
|
|
|
50 |
|
51 |
# 构造消息
|
52 |
messages = [
|
@@ -72,94 +76,90 @@ def call_step_api(image, prompt, model, temperature=0.7, max_tokens=2000, stream
|
|
72 |
try:
|
73 |
client = OpenAI(api_key=STEP_API_KEY, base_url=BASE_URL)
|
74 |
except Exception as e:
|
75 |
-
|
|
|
76 |
|
77 |
try:
|
78 |
# 记录开始时间
|
79 |
start_time = time.time()
|
80 |
|
81 |
-
|
82 |
-
|
83 |
-
|
84 |
-
|
85 |
-
|
86 |
-
|
87 |
-
|
88 |
-
|
89 |
-
|
90 |
-
|
91 |
-
|
92 |
-
|
93 |
-
|
94 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
95 |
|
96 |
-
#
|
97 |
-
if
|
98 |
-
|
99 |
-
|
100 |
-
|
101 |
-
|
102 |
-
|
103 |
-
|
104 |
-
|
105 |
-
|
106 |
-
|
107 |
-
|
108 |
-
|
109 |
-
|
110 |
-
|
111 |
-
|
112 |
-
|
113 |
-
|
114 |
-
|
115 |
-
|
116 |
-
|
117 |
-
|
118 |
-
|
119 |
-
|
120 |
-
|
121 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
122 |
|
123 |
except Exception as e:
|
124 |
error_msg = str(e)
|
125 |
if "api_key" in error_msg.lower():
|
126 |
-
yield "❌ API密钥错误:请检查密钥是否有效"
|
127 |
elif "network" in error_msg.lower() or "connection" in error_msg.lower():
|
128 |
-
yield "❌ 网络连接错误:请检查网络连接"
|
129 |
else:
|
130 |
-
yield f"❌ API调用错误: {error_msg[:200]}"
|
131 |
-
|
132 |
-
def process_image_and_prompt(image, prompt, model, temperature, max_tokens, stream_output):
|
133 |
-
"""处理图像和提示词的主函数"""
|
134 |
-
output = ""
|
135 |
-
for chunk in call_step_api(image, prompt, model, temperature, max_tokens, stream_output):
|
136 |
-
output = chunk
|
137 |
-
yield output
|
138 |
|
139 |
# 创建Gradio界面
|
140 |
-
with gr.Blocks(title="Step-3
|
141 |
gr.Markdown("""
|
142 |
-
# 🤖 Step-3
|
143 |
-
|
144 |
-
基于阶跃星辰 Step-3 模型的图像理解和分析工具。上传图片并输入提示词,让AI帮你分析图像内容。
|
145 |
|
146 |
-
|
147 |
-
- 🖼️ 支持多种图片格式上传
|
148 |
-
- 💬 自然语言交互
|
149 |
-
- 🔄 实时流式输出
|
150 |
-
- 🧠 深度推理能力
|
151 |
""")
|
152 |
|
153 |
-
# API密钥状态提示
|
154 |
-
if not STEP_API_KEY:
|
155 |
-
gr.Markdown("""
|
156 |
-
⚠️ **注意:API密钥未配置**
|
157 |
-
|
158 |
-
请在 Hugging Face Space 的 Settings 中添加 Secret:
|
159 |
-
- Name: `STEP_API_KEY`
|
160 |
-
- Value: 你的阶跃星辰 API 密钥
|
161 |
-
""")
|
162 |
-
|
163 |
with gr.Row():
|
164 |
with gr.Column(scale=1):
|
165 |
# 输入区域
|
@@ -171,9 +171,9 @@ with gr.Blocks(title="Step-3 图像理解助手", theme=gr.themes.Soft()) as dem
|
|
171 |
|
172 |
prompt_input = gr.Textbox(
|
173 |
label="提示词",
|
174 |
-
placeholder="
|
175 |
lines=3,
|
176 |
-
value="
|
177 |
)
|
178 |
|
179 |
with gr.Accordion("高级设置", open=False):
|
@@ -188,7 +188,7 @@ with gr.Blocks(title="Step-3 图像理解助手", theme=gr.themes.Soft()) as dem
|
|
188 |
maximum=1,
|
189 |
value=0.7,
|
190 |
step=0.1,
|
191 |
-
label="Temperature
|
192 |
)
|
193 |
|
194 |
max_tokens_slider = gr.Slider(
|
@@ -198,76 +198,68 @@ with gr.Blocks(title="Step-3 图像理解助手", theme=gr.themes.Soft()) as dem
|
|
198 |
step=100,
|
199 |
label="最大输出长度"
|
200 |
)
|
201 |
-
|
202 |
-
stream_checkbox = gr.Checkbox(
|
203 |
-
value=True,
|
204 |
-
label="流式输出"
|
205 |
-
)
|
206 |
|
207 |
submit_btn = gr.Button("🚀 开始分析", variant="primary")
|
208 |
clear_btn = gr.Button("🗑️ 清空", variant="secondary")
|
209 |
|
210 |
with gr.Column(scale=1):
|
211 |
-
#
|
212 |
-
|
213 |
-
|
214 |
-
|
215 |
-
|
216 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
217 |
)
|
218 |
|
219 |
-
#
|
220 |
gr.Examples(
|
221 |
examples=[
|
222 |
-
["
|
223 |
-
["
|
224 |
-
["
|
225 |
-
["
|
226 |
-
["图片中的人物在做什么?他们的表情如何?", "step-3"],
|
227 |
-
["这个产品的设计有什么特点?", "step-3"],
|
228 |
],
|
229 |
inputs=[prompt_input, model_select],
|
230 |
-
label="
|
231 |
)
|
232 |
|
233 |
-
# 事件处理
|
234 |
submit_btn.click(
|
235 |
-
fn=
|
236 |
inputs=[
|
237 |
image_input,
|
238 |
prompt_input,
|
239 |
model_select,
|
240 |
temperature_slider,
|
241 |
-
max_tokens_slider
|
242 |
-
stream_checkbox
|
243 |
],
|
244 |
-
outputs=
|
245 |
show_progress=True
|
246 |
)
|
247 |
|
248 |
clear_btn.click(
|
249 |
-
fn=lambda: (None, "", ""),
|
250 |
inputs=[],
|
251 |
-
outputs=[image_input, prompt_input,
|
252 |
)
|
253 |
|
254 |
# 页脚
|
255 |
gr.Markdown("""
|
256 |
---
|
257 |
-
|
258 |
-
1. 上传一张图片(支持 JPG、PNG 等格式)
|
259 |
-
2. 输入你的问题或分析需求
|
260 |
-
3. 点击"开始分析"按钮
|
261 |
-
4. 等待AI返回分析结果
|
262 |
-
|
263 |
-
### 注意事项:
|
264 |
-
- 请确保图片清晰度足够
|
265 |
-
- 提示词越具体,分析结果越准确
|
266 |
-
- 可以在高级设置中调整模型参数
|
267 |
-
|
268 |
-
Powered by [阶跃星辰 Step-3](https://www.stepfun.com/)
|
269 |
""")
|
270 |
|
271 |
-
# 启动应用
|
272 |
if __name__ == "__main__":
|
273 |
demo.launch()
|
|
|
8 |
|
9 |
# 配置
|
10 |
BASE_URL = "https://api.stepfun.com/v1"
|
11 |
+
# 从环境变量获取API密钥
|
12 |
STEP_API_KEY = os.environ.get("STEP_API_KEY", "")
|
13 |
|
14 |
# 可选模型
|
|
|
19 |
if image is None:
|
20 |
return None
|
21 |
|
|
|
22 |
if isinstance(image, Image.Image):
|
23 |
buffered = BytesIO()
|
24 |
image.save(buffered, format="PNG")
|
|
|
27 |
|
28 |
return None
|
29 |
|
30 |
+
def call_step_api(image, prompt, model, temperature=0.7, max_tokens=2000):
|
31 |
+
"""调用Step API进行图像分析和文本生成,支持CoT推理展示"""
|
32 |
|
33 |
if image is None:
|
34 |
+
yield "❌ 请先上传一张图片", ""
|
35 |
+
return
|
36 |
|
37 |
if not prompt:
|
38 |
+
yield "❌ 请输入提示词", ""
|
39 |
+
return
|
40 |
|
41 |
if not STEP_API_KEY:
|
42 |
+
yield "❌ API密钥未配置。请在 Hugging Face Space 的 Settings 中添加 STEP_API_KEY 环境变量。", ""
|
43 |
+
return
|
44 |
|
45 |
# 转换图像为base64
|
46 |
try:
|
47 |
base64_image = image_to_base64(image)
|
48 |
if base64_image is None:
|
49 |
+
yield "❌ 图片处理失败", ""
|
50 |
+
return
|
51 |
except Exception as e:
|
52 |
+
yield f"❌ 图片处理错误: {str(e)}", ""
|
53 |
+
return
|
54 |
|
55 |
# 构造消息
|
56 |
messages = [
|
|
|
76 |
try:
|
77 |
client = OpenAI(api_key=STEP_API_KEY, base_url=BASE_URL)
|
78 |
except Exception as e:
|
79 |
+
yield f"❌ 客户端初始化失败: {str(e)}", ""
|
80 |
+
return
|
81 |
|
82 |
try:
|
83 |
# 记录开始时间
|
84 |
start_time = time.time()
|
85 |
|
86 |
+
# 流式输出
|
87 |
+
response = client.chat.completions.create(
|
88 |
+
model=model,
|
89 |
+
messages=messages,
|
90 |
+
temperature=temperature,
|
91 |
+
max_tokens=max_tokens,
|
92 |
+
stream=True
|
93 |
+
)
|
94 |
+
|
95 |
+
full_response = ""
|
96 |
+
reasoning_content = ""
|
97 |
+
final_answer = ""
|
98 |
+
is_reasoning = False
|
99 |
+
reasoning_started = False
|
100 |
+
|
101 |
+
for chunk in response:
|
102 |
+
if chunk.choices and chunk.choices[0].delta:
|
103 |
+
delta = chunk.choices[0].delta
|
104 |
+
|
105 |
+
if hasattr(delta, 'content') and delta.content:
|
106 |
+
content = delta.content
|
107 |
+
full_response += content
|
108 |
|
109 |
+
# 检测reasoning标记
|
110 |
+
if "<reasoning>" in content:
|
111 |
+
is_reasoning = True
|
112 |
+
reasoning_started = True
|
113 |
+
# 提取<reasoning>之前的内容添加到final_answer
|
114 |
+
before_reasoning = content.split("<reasoning>")[0]
|
115 |
+
if before_reasoning:
|
116 |
+
final_answer += before_reasoning
|
117 |
+
# 提取<reasoning>之后的内容开始reasoning
|
118 |
+
after_tag = content.split("<reasoning>")[1] if len(content.split("<reasoning>")) > 1 else ""
|
119 |
+
reasoning_content += after_tag
|
120 |
+
elif "</reasoning>" in content:
|
121 |
+
# 提取</reasoning>之前的内容添加到reasoning
|
122 |
+
before_tag = content.split("</reasoning>")[0]
|
123 |
+
reasoning_content += before_tag
|
124 |
+
is_reasoning = False
|
125 |
+
# 提取</reasoning>之后的内容添加到final_answer
|
126 |
+
after_reasoning = content.split("</reasoning>")[1] if len(content.split("</reasoning>")) > 1 else ""
|
127 |
+
final_answer += after_reasoning
|
128 |
+
elif is_reasoning:
|
129 |
+
reasoning_content += content
|
130 |
+
else:
|
131 |
+
final_answer += content
|
132 |
+
|
133 |
+
# 实时输出
|
134 |
+
if reasoning_started:
|
135 |
+
yield reasoning_content, final_answer
|
136 |
+
else:
|
137 |
+
yield "", final_answer
|
138 |
+
|
139 |
+
# 添加生成时间
|
140 |
+
elapsed_time = time.time() - start_time
|
141 |
+
time_info = f"\n\n⏱️ 生成用时: {elapsed_time:.2f}秒"
|
142 |
+
final_answer += time_info
|
143 |
+
|
144 |
+
yield reasoning_content, final_answer
|
145 |
|
146 |
except Exception as e:
|
147 |
error_msg = str(e)
|
148 |
if "api_key" in error_msg.lower():
|
149 |
+
yield "", "❌ API密钥错误:请检查密钥是否有效"
|
150 |
elif "network" in error_msg.lower() or "connection" in error_msg.lower():
|
151 |
+
yield "", "❌ 网络连接错误:请检查网络连接"
|
152 |
else:
|
153 |
+
yield "", f"❌ API调用错误: {error_msg[:200]}"
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
154 |
|
155 |
# 创建Gradio界面
|
156 |
+
with gr.Blocks(title="Step-3", theme=gr.themes.Soft()) as demo:
|
157 |
gr.Markdown("""
|
158 |
+
# 🤖 Step-3
|
|
|
|
|
159 |
|
160 |
+
上传图片并输入提示词,让 Step-3 分析图像内容。
|
|
|
|
|
|
|
|
|
161 |
""")
|
162 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
163 |
with gr.Row():
|
164 |
with gr.Column(scale=1):
|
165 |
# 输入区域
|
|
|
171 |
|
172 |
prompt_input = gr.Textbox(
|
173 |
label="提示词",
|
174 |
+
placeholder="例如:这是什么?请详细描述",
|
175 |
lines=3,
|
176 |
+
value="请详细描述这张图片的内容。"
|
177 |
)
|
178 |
|
179 |
with gr.Accordion("高级设置", open=False):
|
|
|
188 |
maximum=1,
|
189 |
value=0.7,
|
190 |
step=0.1,
|
191 |
+
label="Temperature"
|
192 |
)
|
193 |
|
194 |
max_tokens_slider = gr.Slider(
|
|
|
198 |
step=100,
|
199 |
label="最大输出长度"
|
200 |
)
|
|
|
|
|
|
|
|
|
|
|
201 |
|
202 |
submit_btn = gr.Button("🚀 开始分析", variant="primary")
|
203 |
clear_btn = gr.Button("🗑️ 清空", variant="secondary")
|
204 |
|
205 |
with gr.Column(scale=1):
|
206 |
+
# 推理过程展示
|
207 |
+
with gr.Accordion("💭 推理过程 (CoT)", open=True):
|
208 |
+
reasoning_output = gr.Textbox(
|
209 |
+
label="思考过程",
|
210 |
+
lines=10,
|
211 |
+
max_lines=15,
|
212 |
+
show_copy_button=True,
|
213 |
+
interactive=False
|
214 |
+
)
|
215 |
+
|
216 |
+
# 最终答案展示
|
217 |
+
answer_output = gr.Textbox(
|
218 |
+
label="📝 分析结果",
|
219 |
+
lines=15,
|
220 |
+
max_lines=25,
|
221 |
+
show_copy_button=True,
|
222 |
+
interactive=False
|
223 |
)
|
224 |
|
225 |
+
# 示例
|
226 |
gr.Examples(
|
227 |
examples=[
|
228 |
+
["这张图片中有什么?", "step-3"],
|
229 |
+
["详细描述图片内容", "step-3"],
|
230 |
+
["这是什么物体?有什么特征?", "step-3"],
|
231 |
+
["分析图片的主要元素", "step-3"],
|
|
|
|
|
232 |
],
|
233 |
inputs=[prompt_input, model_select],
|
234 |
+
label="示例提示词"
|
235 |
)
|
236 |
|
237 |
+
# 事件处理 - 流式输出到两个文本框
|
238 |
submit_btn.click(
|
239 |
+
fn=call_step_api,
|
240 |
inputs=[
|
241 |
image_input,
|
242 |
prompt_input,
|
243 |
model_select,
|
244 |
temperature_slider,
|
245 |
+
max_tokens_slider
|
|
|
246 |
],
|
247 |
+
outputs=[reasoning_output, answer_output],
|
248 |
show_progress=True
|
249 |
)
|
250 |
|
251 |
clear_btn.click(
|
252 |
+
fn=lambda: (None, "", "", ""),
|
253 |
inputs=[],
|
254 |
+
outputs=[image_input, prompt_input, reasoning_output, answer_output]
|
255 |
)
|
256 |
|
257 |
# 页脚
|
258 |
gr.Markdown("""
|
259 |
---
|
260 |
+
Powered by [Step-3](https://www.stepfun.com/)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
261 |
""")
|
262 |
|
263 |
+
# 启动应用
|
264 |
if __name__ == "__main__":
|
265 |
demo.launch()
|