v1.1.0
Files changed:

- .gitignore +4 -0
- README.md +52 -47
- modules/llm/BaseLLM.py +32 -0
- modules/llm/Claude.py +55 -0
- modules/llm/DeepSeek.py +49 -0
- modules/llm/Doubao.py +44 -0
- modules/llm/Gemini.py +45 -0
- modules/llm/LangChainGPT.py +46 -0
- modules/llm/LocalModel.py +66 -0
- modules/llm/OpenRouter.py +42 -0
- modules/llm/Qwen.py +48 -0
- utils.py +92 -4
.gitignore
ADDED
@@ -0,0 +1,4 @@
+**/__pycache__
+run.bat
+openai_api_test.py
+assets/
README.md
CHANGED
@@ -10,53 +10,8 @@ pinned: false
 license: apache-2.0
 ---
 
-# EasyTranslator v1.0
+# EasyTranslator v1.1.0
 A gradio-based helper tool for Chinese localization
-
-## v1.0.6 update notes
-1. Updated the file-merge feature to ease multi-person collaboration. On the file-merge page, two JSON files can be merged by following the instructions, syncing manual-translation progress. Small JSON files can be exported for easy transfer.
-
-## v1.0.5 update notes
-1. Keyboard shortcuts supported<br>
-shift+w: ↑<br>
-shift+x: ↓<br>
-shift+s: save json<br>
-shift+r: replace<br>
-shift+g: gpt translate<br>
-shift+b: baidu translate<br>
-
-## v1.0.4 update notes
-1. Added moyu (slack-off) mode, which packs the essential widgets into half a screen. Set `moyu_mode` to 1 in `config.json` to enable it, 0 to disable it.
-2. Added timeout detection for GPT translation. The time limit is set (in seconds) at `time_limit` under `openai_api_settings` in `config.json`. If a request times out, a timeout notice is printed instead of raising an error.
-3. GPT translation no longer returns duplicate results.
-
-## v1.0.3 update notes
-1. Translations can now be edited directly on the preview page; saving the JSON first is recommended.
-2. Immediate updating of the last-edited id is now optional.
-
-Set the `"if_save_id_immediately"` parameter in `config.json`: with 1 the behavior is as before, writing to `config.json` on every id switch; with 0 a `SAVE last edited position` button is shown and the id is stored only when it is clicked.
-
-## v1.0.2 update notes
-1. Batch machine translation supported.
-
-## v1.0.1 update notes
-1. Improved the file-reading logic.
-2. Added error messages and warnings; successfully saving the JSON now reports how many translations were updated.
-3. The prompt sent to GPT, and the source and target languages for Baidu translation, can now be customized.
-4. Added context preview, with a configurable number of preview lines and numbering. The selected id is marked with double asterisks; edited translations are prefixed with an asterisk.
-5. Improved button feel.
-
-## Features
-1. One-click machine-translation interface, with a copy-to-clipboard button.
-2. Convenient previous/next-line switching and direct jumps.
-3. Remembers the last edited position.
-4. Name-translation memory: one edit propagates everywhere. The name dictionary is read at startup and written when the JSON file is saved. While the program runs you can edit `name_cn` directly; after it closes you can edit the name dictionary. On the next start the dictionary's contents override `name_cn` in the JSON file.
-5. Text-translation memory: once a line is machine-translated or edited, switching lines or refreshing the page loses nothing as long as the program stays open.
-6. Translation caching. The source text, by contrast, is not cached, so an accidental edit or deletion is undone by switching lines or refreshing. You can therefore edit the source and re-run machine translation to check how a particular word translates, without affecting the original text.
-7. One-click replacement, for mistranslated proper nouns. It replaces the target everywhere in both machine- and hand-translated text. The replacement dictionary can be changed while the program runs, no restart needed.
-8. Convenient API-key management, prompt editing, and more.
-9. JSON-to-CSV and CSV-to-JSON conversion.
-10. Context preview.
-
-<br><br>
 
 ## Usage
 At least Python 3 is required (the author uses 3.10; other versions are untested).

@@ -102,10 +57,60 @@ The required JSON file format is:
 ```
 python EasyTranslator.py
 ```
-(for the HF version, app.py)
 Then open the URL the program prints (e.g. http://127.0.0.1:7860 ) in a browser.
 <br><br>
 
+## v1.1.0 update notes
+1. Regular and batch translation now support freely selecting and calling gemini, claude, qwen, deepseek, etc. Enter the API keys in `config.json` or on the API page (note: when an OpenRouter API key is set, the OpenRouter endpoint takes priority). Using the official gemini, claude, or doubao endpoints requires installing the corresponding dependency packages. You can edit the MODEL_LIST in `utils.py` to trim the selectable models and avoid clutter.
+2. Changed the keyboard shortcuts<br>
+alt+w: ↑<br>
+alt+x: ↓<br>
+alt+s: save json<br>
+alt+r: replace<br>
+alt+q: model1 translate<br>
+alt+e: model2 translate<br>
+
+## v1.0.6 update notes
+1. Updated the file-merge feature to ease multi-person collaboration. On the file-merge page, two JSON files can be merged by following the instructions, syncing manual-translation progress. Small JSON files can be exported for easy transfer.
+
+## v1.0.5 update notes
+1. Keyboard shortcuts supported
+
+## v1.0.4 update notes
+1. Added moyu (slack-off) mode, which packs the essential widgets into half a screen. Set `moyu_mode` to 1 in `config.json` to enable it, 0 to disable it.
+2. Added timeout detection for GPT translation. The time limit is set (in seconds) at `time_limit` under `openai_api_settings` in `config.json`. If a request times out, a timeout notice is printed instead of raising an error.
+3. GPT translation no longer returns duplicate results.
+
+## v1.0.3 update notes
+1. Translations can now be edited directly on the preview page; saving the JSON first is recommended.
+2. Immediate updating of the last-edited id is now optional.
+
+Set the `"if_save_id_immediately"` parameter in `config.json`: with 1 the behavior is as before, writing to `config.json` on every id switch; with 0 a `SAVE last edited position` button is shown and the id is stored only when it is clicked.
+
+## v1.0.2 update notes
+1. Batch machine translation supported.
+
+## v1.0.1 update notes
+1. Improved the file-reading logic.
+2. Added error messages and warnings; successfully saving the JSON now reports how many translations were updated.
+3. The prompt sent to GPT, and the source and target languages for Baidu translation, can now be customized.
+4. Added context preview, with a configurable number of preview lines and numbering. The selected id is marked with double asterisks; edited translations are prefixed with an asterisk.
+5. Improved button feel.
+
+## Features
+1. One-click machine-translation interface, with a copy-to-clipboard button.
+2. Convenient previous/next-line switching and direct jumps.
+3. Remembers the last edited position.
+4. Name-translation memory: one edit propagates everywhere. The name dictionary is read at startup and written when the JSON file is saved. While the program runs you can edit `name_cn` directly; after it closes you can edit the name dictionary. On the next start the dictionary's contents override `name_cn` in the JSON file.
+5. Text-translation memory: once a line is machine-translated or edited, switching lines or refreshing the page loses nothing as long as the program stays open.
+6. Translation caching. The source text, by contrast, is not cached, so an accidental edit or deletion is undone by switching lines or refreshing. You can therefore edit the source and re-run machine translation to check how a particular word translates, without affecting the original text.
+7. One-click replacement, for mistranslated proper nouns. It replaces the target everywhere in both machine- and hand-translated text. The replacement dictionary can be changed while the program runs, no restart needed.
+8. Convenient API-key management, prompt editing, and more.
+9. JSON-to-CSV and CSV-to-JSON conversion.
+10. Context preview.
+
+<br><br>
+
+
 ## Demo
 Moyu (slack-off) mode \
  \
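The v1.1.0 note above points at `config.json` for API keys and timeouts. As a reference, here is a minimal sketch of the settings this README and `utils.py` actually mention — only `moyu_mode`, `if_save_id_immediately`, and the `openai_api_settings` keys are confirmed by the sources shown in this commit; the shipped file likely contains more (per-vendor keys, prompts, replacement dictionaries), and the values below are placeholders:

```json
{
  "moyu_mode": 0,
  "if_save_id_immediately": 1,
  "openai_api_settings": {
    "openai_api_key": "YOUR_OPENAI_API_KEY",
    "time_limit": 30
  }
}
```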
modules/llm/BaseLLM.py
ADDED
@@ -0,0 +1,32 @@
+from abc import ABC, abstractmethod
+
+class BaseLLM(ABC):
+
+    def __init__(self):
+        pass
+
+    @abstractmethod
+    def initialize_message(self):
+        pass
+
+    @abstractmethod
+    def ai_message(self, payload):
+        pass
+
+    @abstractmethod
+    def system_message(self, payload):
+        pass
+
+    @abstractmethod
+    def user_message(self, payload):
+        pass
+
+    @abstractmethod
+    def get_response(self):
+        pass
+
+    @abstractmethod
+    def print_prompt(self):
+        pass
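Every concrete backend below follows the same pattern: keep a `messages` list, append role-tagged dicts, and implement `get_response()` against the vendor SDK. A minimal toy subclass (hypothetical, for illustration only — not part of this commit) shows the contract:

```python
from modules.llm.BaseLLM import BaseLLM

class EchoLLM(BaseLLM):
    """Toy backend: "responds" with the last user message."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def initialize_message(self):
        self.messages = []

    def ai_message(self, payload):
        self.messages.append({"role": "ai", "content": payload})

    def system_message(self, payload):
        self.messages.append({"role": "system", "content": payload})

    def user_message(self, payload):
        self.messages.append({"role": "user", "content": payload})

    def get_response(self):
        # a real backend would call the vendor API with self.messages here
        return self.messages[-1]["content"]

    def print_prompt(self):
        for message in self.messages:
            print(message)
```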
modules/llm/Claude.py
ADDED
@@ -0,0 +1,55 @@
+import anthropic
+import os
+from .BaseLLM import BaseLLM
+
+class Claude(BaseLLM):
+
+    def __init__(self, model="claude-3-5-sonnet-20240620"):
+        super(Claude, self).__init__()
+        self.model_name = model
+        self.client = anthropic.Anthropic(
+            api_key=os.environ.get("ANTHROPIC_API_KEY")
+        )
+        # add api_base
+        self.messages = []
+
+    def initialize_message(self):
+        self.messages = []
+
+    def ai_message(self, payload):
+        self.messages.append({"role": "ai", "content": payload})
+
+    def system_message(self, payload):
+        self.messages.append({"role": "system", "content": payload})
+
+    def user_message(self, payload):
+        self.messages.append({"role": "user", "content": payload})
+
+    def get_response(self, temperature=0.8):
+        message = self.client.messages.create(
+            model=self.model_name,
+            max_tokens=4096,
+            temperature=temperature,
+            messages=self.messages
+        )
+        # message.content is a list of content blocks; return the text of the first
+        return message.content[0].text
+
+    def chat(self, text, temperature=0.8):
+        self.initialize_message()
+        if isinstance(text, str):
+            self.user_message(text)
+        response = self.get_response(temperature=temperature)
+        return response
+
+    def print_prompt(self):
+        for message in self.messages:
+            print(message)
+
+if __name__ == '__main__':
+    llm = Claude()
+    print(llm.chat("Say it is a test."))
modules/llm/DeepSeek.py
ADDED
@@ -0,0 +1,49 @@
+from .BaseLLM import BaseLLM
+from openai import OpenAI
+import os
+
+class DeepSeek(BaseLLM):
+
+    def __init__(self, model="deepseek-chat"):
+        super(DeepSeek, self).__init__()
+        self.client = OpenAI(
+            api_key=os.getenv("DEEPSEEK_API_KEY"),
+            base_url="https://api.deepseek.com",
+        )
+        self.model_name = model
+        self.messages = []
+
+    def initialize_message(self):
+        self.messages = []
+
+    def ai_message(self, payload):
+        self.messages.append({"role": "ai", "content": payload})
+
+    def system_message(self, payload):
+        self.messages.append({"role": "system", "content": payload})
+
+    def user_message(self, payload):
+        self.messages.append({"role": "user", "content": payload})
+
+    def get_response(self, temperature=0.8):
+        # send the accumulated conversation to the configured model
+        response = self.client.chat.completions.create(
+            model=self.model_name,
+            messages=self.messages,
+            temperature=temperature,
+            stream=False
+        )
+        return response.choices[0].message.content
+
+    def chat(self, text, temperature=0.8):
+        self.initialize_message()
+        self.user_message(text)
+        response = self.get_response(temperature=temperature)
+        return response
+
+    def print_prompt(self):
+        for message in self.messages:
+            print(message)
modules/llm/Doubao.py
ADDED
@@ -0,0 +1,44 @@
+from .BaseLLM import BaseLLM
+from volcenginesdkarkruntime import Ark
+import os
+
+class Doubao(BaseLLM):
+
+    def __init__(self, model="ep-20241228220355-cqxcs"):
+        # the default is an Ark inference-endpoint id, not a public model name
+        super(Doubao, self).__init__()
+        self.client = Ark(api_key=os.environ.get("ARK_API_KEY"))
+        self.model_name = model
+        self.messages = []
+
+    def initialize_message(self):
+        self.messages = []
+
+    def ai_message(self, payload):
+        self.messages.append({"role": "ai", "content": payload})
+
+    def system_message(self, payload):
+        self.messages.append({"role": "system", "content": payload})
+
+    def user_message(self, payload):
+        self.messages.append({"role": "user", "content": payload})
+
+    def get_response(self, temperature=0.8):
+        completion = self.client.chat.completions.create(
+            model=self.model_name,
+            messages=self.messages,
+            temperature=temperature,
+            top_p=0.8
+        )
+        return completion.choices[0].message.content
+
+    def chat(self, text, temperature=0.8):
+        self.initialize_message()
+        self.user_message(text)
+        response = self.get_response(temperature=temperature)
+        return response
+
+    def print_prompt(self):
+        for message in self.messages:
+            print(message)
modules/llm/Gemini.py
ADDED
@@ -0,0 +1,45 @@
+from .BaseLLM import BaseLLM
+import google.generativeai as genai
+import os
+import time
+
+class Gemini(BaseLLM):
+    def __init__(self, model="gemini-1.5-flash"):
+        super(Gemini, self).__init__()
+        genai.configure(api_key=os.getenv("GEMINI_API_KEY"))
+        self.model_name = model
+        self.model = genai.GenerativeModel(model)
+        self.messages = []
+
+    def initialize_message(self):
+        self.messages = []
+
+    def ai_message(self, payload):
+        self.messages.append({"role": "model", "parts": payload})
+
+    def system_message(self, payload):
+        self.messages.append({"role": "system", "parts": payload})
+
+    def user_message(self, payload):
+        self.messages.append({"role": "user", "parts": payload})
+
+    def get_response(self, temperature=0.8):
+        time.sleep(3)  # brief pause to stay under the API rate limit
+        # everything before the last message is history; the last message is sent
+        chat = self.model.start_chat(
+            history=self.messages[:-1]
+        )
+        response = chat.send_message(
+            self.messages[-1]["parts"],
+            generation_config=genai.GenerationConfig(
+                temperature=temperature,
+            ),
+        )
+        return response.text
+
+    def chat(self, text, temperature=0.8):
+        chat = self.model.start_chat()
+        response = chat.send_message(
+            text,
+            generation_config=genai.GenerationConfig(temperature=temperature),
+        )
+        return response.text
+
+    def print_prompt(self):
+        for message in self.messages:
+            print(message)
modules/llm/LangChainGPT.py
ADDED
@@ -0,0 +1,46 @@
+from .BaseLLM import BaseLLM
+from openai import OpenAI
+import os
+
+class LangChainGPT(BaseLLM):
+
+    def __init__(self, model="gpt-4o-mini"):
+        super(LangChainGPT, self).__init__()
+        self.client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
+        self.model_name = model
+        # add api_base
+        self.messages = []
+
+    def initialize_message(self):
+        self.messages = []
+
+    def ai_message(self, payload):
+        self.messages.append({"role": "ai", "content": payload})
+
+    def system_message(self, payload):
+        self.messages.append({"role": "system", "content": payload})
+
+    def user_message(self, payload):
+        self.messages.append({"role": "user", "content": payload})
+
+    def get_response(self, temperature=0.8):
+        completion = self.client.chat.completions.create(
+            model=self.model_name,
+            messages=self.messages,
+            temperature=temperature,
+            top_p=0.8
+        )
+        return completion.choices[0].message.content
+
+    def chat(self, text, temperature=0.8):
+        self.initialize_message()
+        self.user_message(text)
+        response = self.get_response(temperature=temperature)
+        return response
+
+    def print_prompt(self):
+        for message in self.messages:
+            print(message)
modules/llm/LocalModel.py
ADDED
@@ -0,0 +1,66 @@
+from .BaseLLM import BaseLLM
+from peft import PeftModel
+import os
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+
+class LocalModel(BaseLLM):
+    def __init__(self, model, adapter_path=None):
+        super(LocalModel, self).__init__()
+        model_name = model
+        self.model = AutoModelForCausalLM.from_pretrained(
+            model_name,
+            torch_dtype="auto",
+            device_map="auto",
+        )
+        # optionally stack one or more PEFT adapters on the base model
+        if isinstance(adapter_path, str):
+            self.model = PeftModel.from_pretrained(self.model, adapter_path)
+        elif isinstance(adapter_path, list):
+            for path in adapter_path:
+                self.model = PeftModel.from_pretrained(self.model, path)
+
+        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
+        self.model_name = model
+        self.messages = []
+
+    def initialize_message(self):
+        self.messages = []
+
+    def ai_message(self, payload):
+        self.messages.append({"role": "ai", "content": payload})
+
+    def system_message(self, payload):
+        self.messages.append({"role": "system", "content": payload})
+
+    def user_message(self, payload):
+        self.messages.append({"role": "user", "content": payload})
+
+    def get_response(self, temperature=0.8):
+        # render the chat template, generate, then strip the prompt tokens
+        text = self.tokenizer.apply_chat_template(
+            self.messages,
+            tokenize=False,
+            add_generation_prompt=True
+        )
+        model_inputs = self.tokenizer([text], return_tensors="pt").to(self.model.device)
+        generated_ids = self.model.generate(
+            **model_inputs,
+            max_new_tokens=512
+        )
+        generated_ids = [
+            output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
+        ]
+        response = self.tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+        return response
+
+    def chat(self, text, temperature=0.8):
+        self.initialize_message()
+        self.user_message(text)
+        response = self.get_response(temperature=temperature)
+        return response
+
+    def print_prompt(self):
+        for message in self.messages:
+            print(message)
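A usage sketch for the local backend. The Hub id and adapter path are placeholder examples, not values pinned by this repo; any causal LM with a chat template should fit, with zero or more stacked PEFT adapters:

```python
from modules.llm.LocalModel import LocalModel

# "Qwen/Qwen2.5-7B-Instruct" is an example model id; adapter_path may be a
# single path, a list of paths (applied in order), or None.
llm = LocalModel("Qwen/Qwen2.5-7B-Instruct", adapter_path=None)
print(llm.chat("Translate to Chinese: Hello, world."))
```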
modules/llm/OpenRouter.py
ADDED
@@ -0,0 +1,42 @@
+from .BaseLLM import BaseLLM
+import os
+from openai import OpenAI
+
+class OpenRouter(BaseLLM):
+    def __init__(self, model="deepseek/deepseek-r1:free"):
+        super(OpenRouter, self).__init__()
+        self.client = OpenAI(
+            api_key=os.getenv("OPENROUTER_API_KEY"),
+            base_url="https://openrouter.ai/api/v1",
+        )
+        self.model_name = model
+        self.messages = []
+
+    def initialize_message(self):
+        self.messages = []
+
+    def ai_message(self, payload):
+        self.messages.append({"role": "ai", "content": payload})
+
+    def system_message(self, payload):
+        self.messages.append({"role": "system", "content": payload})
+
+    def user_message(self, payload):
+        self.messages.append({"role": "user", "content": payload})
+
+    def get_response(self, temperature=0.8):
+        completion = self.client.chat.completions.create(
+            model=self.model_name,
+            messages=self.messages
+        )
+        return completion.choices[0].message.content
+
+    def chat(self, text, temperature=0.8):
+        self.initialize_message()
+        self.user_message(text)
+        response = self.get_response(temperature=temperature)
+        return response
+
+    def print_prompt(self):
+        for message in self.messages:
+            print(message)
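Since OpenRouter exposes many vendors behind one OpenAI-compatible endpoint, this wrapper is the one `utils.get_models` prefers whenever `OPENROUTER_API_KEY` is set. A quick sketch (the model slug is an example; any slug from MODEL_NAME_DICT in `utils.py` works the same way):

```python
from modules.llm.OpenRouter import OpenRouter

# assumes OPENROUTER_API_KEY is set in the environment
llm = OpenRouter(model="qwen/qwen-max")
print(llm.chat("Say it is a test."))
```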
modules/llm/Qwen.py
ADDED
@@ -0,0 +1,48 @@
+from .BaseLLM import BaseLLM
+from openai import OpenAI
+import os
+
+class Qwen(BaseLLM):
+
+    def __init__(self, model="qwen-max"):
+        # qwen-max, qwen-plus, qwen-turbo
+        super(Qwen, self).__init__()
+        self.client = OpenAI(
+            api_key=os.getenv("DASHSCOPE_API_KEY"),
+            base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
+        )
+        self.model_name = model
+        # add api_base
+        self.messages = []
+
+    def initialize_message(self):
+        self.messages = []
+
+    def ai_message(self, payload):
+        self.messages.append({"role": "ai", "content": payload})
+
+    def system_message(self, payload):
+        self.messages.append({"role": "system", "content": payload})
+
+    def user_message(self, payload):
+        self.messages.append({"role": "user", "content": payload})
+
+    def get_response(self, temperature=0.8):
+        completion = self.client.chat.completions.create(
+            model=self.model_name,
+            messages=self.messages,
+            temperature=temperature,
+            top_p=0.8
+        )
+        return completion.choices[0].message.content
+
+    def chat(self, text, temperature=0.8):
+        self.initialize_message()
+        self.user_message(text)
+        response = self.get_response(temperature=temperature)
+        return response
+
+    def print_prompt(self):
+        for message in self.messages:
+            print(message)
utils.py
CHANGED
@@ -4,9 +4,71 @@ import random
 import json
 from hashlib import md5
 from os import path as osp
+import os
 import csv
 import threading
 
+MODEL_NAME_DICT = {
+    "gpt-4": "openai/gpt-4",
+    "gpt-4o": "openai/gpt-4o",
+    "gpt-4o-mini": "openai/gpt-4o-mini",
+    "gpt-3.5-turbo": "openai/gpt-3.5-turbo",
+    "deepseek-r1": "deepseek/deepseek-r1",
+    "deepseek-v3": "deepseek/deepseek-chat",
+    "gemini-2": "google/gemini-2.0-flash-001",
+    "gemini-1.5": "google/gemini-flash-1.5",
+    "llama3-70b": "meta-llama/llama-3.3-70b-instruct",
+    "qwen-turbo": "qwen/qwen-turbo",
+    "qwen-plus": "qwen/qwen-plus",
+    "qwen-max": "qwen/qwen-max",
+    "qwen-2.5-72b": "qwen/qwen-2.5-72b-instruct",
+    "claude-3.5-sonnet": "anthropic/claude-3.5-sonnet",
+    "phi-4": "microsoft/phi-4",
+}
+
+def get_models(model_name):
+    # return the llm wrapper for the requested model name;
+    # a configured OpenRouter key takes priority over vendor-specific endpoints
+    if os.getenv("OPENROUTER_API_KEY", default="") and "YOUR" not in os.getenv("OPENROUTER_API_KEY", default="") and model_name in MODEL_NAME_DICT:
+        from modules.llm.OpenRouter import OpenRouter
+        return OpenRouter(model=MODEL_NAME_DICT[model_name])
+    elif model_name == 'openai':
+        from modules.llm.LangChainGPT import LangChainGPT
+        return LangChainGPT()
+    elif model_name.startswith('gpt-3.5'):
+        from modules.llm.LangChainGPT import LangChainGPT
+        return LangChainGPT(model="gpt-3.5-turbo")
+    elif model_name == 'gpt-4':
+        from modules.llm.LangChainGPT import LangChainGPT
+        return LangChainGPT(model="gpt-4")
+    elif model_name == 'gpt-4o':
+        from modules.llm.LangChainGPT import LangChainGPT
+        return LangChainGPT(model="gpt-4o")
+    elif model_name == "gpt-4o-mini":
+        from modules.llm.LangChainGPT import LangChainGPT
+        return LangChainGPT(model="gpt-4o-mini")
+    elif model_name.startswith("claude-3-5"):
+        from modules.llm.Claude import Claude
+        return Claude(model="claude-3-5-sonnet-20241022")
+    elif model_name in ["qwen-turbo", "qwen-plus", "qwen-max"]:
+        from modules.llm.Qwen import Qwen
+        return Qwen(model=model_name)
+    elif model_name.startswith('doubao'):
+        from modules.llm.Doubao import Doubao
+        return Doubao()
+    elif model_name.startswith('gemini-2'):
+        from modules.llm.Gemini import Gemini
+        return Gemini("gemini-2.0-flash")
+    elif model_name.startswith('gemini-1.5'):
+        from modules.llm.Gemini import Gemini
+        return Gemini("gemini-1.5-flash")
+    elif model_name.startswith("deepseek"):
+        from modules.llm.DeepSeek import DeepSeek
+        return DeepSeek()
+    else:
+        print(f'Warning! undefined model {model_name}, use gpt-4o-mini instead.')
+        from modules.llm.LangChainGPT import LangChainGPT
+        return LangChainGPT()
+
 def load_config(filepath):
     with open(filepath, "r", encoding="utf-8") as file:
         args = json.load(file)

@@ -46,6 +108,7 @@ def get_baidu_completion(text,api_id,api_key,from_lang,to_lang):
 openai_api_key = args["openai_api_settings"]["openai_api_key"]
 time_limit = float(args["openai_api_settings"]["time_limit"])
 client = openai.OpenAI(api_key = openai_api_key)
+
 class GPTThread(threading.Thread):
     def __init__(self, model, messages, temperature):
         super().__init__()

@@ -63,19 +126,44 @@ class GPTThread(threading.Thread):
         )
         self.result = response.choices[0].message.content
 
-def get_gpt_completion(prompt, model="gpt-
+def get_gpt_completion(prompt, time_limit = 10, model="gpt-4o-mini"):
     messages = [{"role": "user", "content": prompt}]
     temperature = random.uniform(0,1)
     thread = GPTThread(model, messages, temperature)
     thread.start()
-    thread.join(
+    thread.join(time_limit)
     if thread.is_alive():
         thread.terminate()
         print("请求超时")  # "request timed out"
         return "TimeoutError", False
     else:
        return thread.result, True
-
+
+class LLMThread(threading.Thread):
+    def __init__(self, llm, prompt, temperature):
+        super().__init__()
+        self.llm = llm
+        self.prompt = prompt
+        self.temperature = temperature
+        self.result = ""
+
+    def terminate(self):
+        self._running = False
+
+    def run(self):
+        self.result = self.llm.chat(self.prompt, temperature=self.temperature)
+
+def get_llm_completion(prompt, time_limit = 10, model_name="gpt-4o-mini"):
+    llm = get_models(model_name)
+    temperature = 0.7
+    thread = LLMThread(llm, prompt, temperature)
+    thread.start()
+    thread.join(time_limit)
+    if thread.is_alive():
+        thread.terminate()
+        print("请求超时")  # "request timed out"
+        return "TimeoutError", False
+    else:
+        return thread.result, True
+
 def left_pad_zero(number, digit):
     number_str = str(number)
     padding_count = digit - len(number_str)

@@ -101,7 +189,7 @@ def convert_to_json(files, text_col, name_col, id_col):
     with open(path,"r",encoding="utf-8") as f:
         reader = csv.DictReader(f)
         line_num = sum(1 for _ in open(path,"r",encoding="utf-8"))
-        fieldnames = reader.fieldnames
+        fieldnames = reader.fieldnames if reader.fieldnames else []
         if id_col not in fieldnames:
             ids = generate_ids(line_num)
             i = 0
|