Spaces: retopara/ragflow

Zhen Wang committed on Jan 26

Commit

e7e30bf

2 Parent(s): 927d6b8 0c54322

Merge branch 'infiniflow:main' into main

Files changed (24)

README.md +2 -1
README_id.md +2 -1
README_ja.md +2 -1
README_ko.md +2 -1
README_pt_br.md +2 -1
README_tzh.md +1 -1
README_zh.md +2 -1
api/apps/kb_app.py +9 -1
docs/references/agent_component_reference/concentrator.mdx +1 -1
docs/references/supported_models.mdx +2 -1
graphrag/general/graph_extractor.py +1 -1
pyproject.toml +4 -4
rag/app/table.py +2 -2
rag/llm/__init__.py +2 -0
rag/llm/chat_model.py +38 -34
rag/llm/embedding_model.py +28 -16
rag/llm/rerank_model.py +13 -2
rag/nlp/search.py +2 -2
rag/raptor.py +3 -1
rag/utils/es_conn.py +1 -1
rag/utils/infinity_conn.py +74 -15
uv.lock +46 -46
web/src/pages/user-setting/components/setting-title/index.tsx +4 -1
web/src/pages/user-setting/setting-model/index.tsx +18 -14

README.md CHANGED

@@ -7,6 +7,7 @@
 <p align="center">
   <a href="./README.md">English</a> |
   <a href="./README_zh.md">简体中文</a> |
   <a href="./README_ja.md">日本語</a> |
   <a href="./README_ko.md">한국어</a> |
   <a href="./README_id.md">Bahasa Indonesia</a> |
@@ -77,12 +78,12 @@ Try our demo at [https://demo.ragflow.io](https://demo.ragflow.io).
 ## 🔥 Latest Updates
 - 2024-12-18 Upgrades Document Layout Analysis model in Deepdoc.
 - 2024-12-04 Adds support for pagerank score in knowledge base.
 - 2024-11-22 Adds more variables to Agent.
 - 2024-11-01 Adds keyword extraction and related question generation to the parsed chunks to improve the accuracy of retrieval.
 - 2024-08-22 Support text to SQL statements through RAG.
-- 2024-08-02 Supports GraphRAG inspired by [graphrag](https://github.com/microsoft/graphrag) and mind map.
 ## 🎉 Stay Tuned

 <p align="center">
   <a href="./README.md">English</a> |
   <a href="./README_zh.md">简体中文</a> |
+  <a href="./README_tzh.md">繁体中文</a> |
   <a href="./README_ja.md">日本語</a> |
   <a href="./README_ko.md">한국어</a> |
   <a href="./README_id.md">Bahasa Indonesia</a> |
 ## 🔥 Latest Updates
+- 2025-01-26 Optimizes knowledge graph extraction and application, offering various configuration options.
 - 2024-12-18 Upgrades Document Layout Analysis model in Deepdoc.
 - 2024-12-04 Adds support for pagerank score in knowledge base.
 - 2024-11-22 Adds more variables to Agent.
 - 2024-11-01 Adds keyword extraction and related question generation to the parsed chunks to improve the accuracy of retrieval.
 - 2024-08-22 Support text to SQL statements through RAG.
 ## 🎉 Stay Tuned

README_id.md CHANGED

@@ -7,6 +7,7 @@
 <p align="center">
   <a href="./README.md">English</a> |
   <a href="./README_zh.md">简体中文</a> |
   <a href="./README_ja.md">日本語</a> |
   <a href="./README_ko.md">한국어</a> |
   <a href="./README_id.md">Bahasa Indonesia</a> |
@@ -74,12 +75,12 @@ Coba demo kami di [https://demo.ragflow.io](https://demo.ragflow.io).
 ## 🔥 Pembaruan Terbaru
 - 2024-12-18 Meningkatkan model Analisis Tata Letak Dokumen di Deepdoc.
 - 2024-12-04 Mendukung skor pagerank ke basis pengetahuan.
 - 2024-11-22 Peningkatan definisi dan penggunaan variabel di Agen.
 - 2024-11-01 Penambahan ekstraksi kata kunci dan pembuatan pertanyaan terkait untuk meningkatkan akurasi pengambilan.
 - 2024-08-22 Dukungan untuk teks ke pernyataan SQL melalui RAG.
-- 2024-08-02 Dukungan GraphRAG yang terinspirasi oleh [graphrag](https://github.com/microsoft/graphrag) dan mind map.
 ## 🎉 Tetap Terkini

 <p align="center">
   <a href="./README.md">English</a> |
   <a href="./README_zh.md">简体中文</a> |
+  <a href="./README_tzh.md">繁体中文</a> |
   <a href="./README_ja.md">日本語</a> |
   <a href="./README_ko.md">한국어</a> |
   <a href="./README_id.md">Bahasa Indonesia</a> |
 ## 🔥 Pembaruan Terbaru
+- 2025-01-26 Optimalkan ekstraksi dan penerapan grafik pengetahuan dan sediakan berbagai opsi konfigurasi.
 - 2024-12-18 Meningkatkan model Analisis Tata Letak Dokumen di Deepdoc.
 - 2024-12-04 Mendukung skor pagerank ke basis pengetahuan.
 - 2024-11-22 Peningkatan definisi dan penggunaan variabel di Agen.
 - 2024-11-01 Penambahan ekstraksi kata kunci dan pembuatan pertanyaan terkait untuk meningkatkan akurasi pengambilan.
 - 2024-08-22 Dukungan untuk teks ke pernyataan SQL melalui RAG.
 ## 🎉 Tetap Terkini

README_ja.md CHANGED

@@ -7,6 +7,7 @@
 <p align="center">
   <a href="./README.md">English</a> |
   <a href="./README_zh.md">简体中文</a> |
   <a href="./README_ja.md">日本語</a> |
   <a href="./README_ko.md">한국어</a> |
   <a href="./README_id.md">Bahasa Indonesia</a> |
@@ -54,12 +55,12 @@
 ## 🔥 最新情報
 - 2024-12-18 Deepdoc のドキュメント レイアウト分析モデルをアップグレードします。
 - 2024-12-04 ナレッジ ベースへのページランク スコアをサポートしました。
 - 2024-11-22 エージェントでの変数の定義と使用法を改善しました。
 - 2024-11-01 再現の精度を向上させるために、解析されたチャンクにキーワード抽出と関連質問の生成を追加しました。
 - 2024-08-22 RAG を介して SQL ステートメントへのテキストをサポートします。
-- 2024-08-02 [graphrag](https://github.com/microsoft/graphrag) からインスピレーションを得た GraphRAG とマインド マップをサポートします。
 ## 🎉 続きを楽しみに

 <p align="center">
   <a href="./README.md">English</a> |
   <a href="./README_zh.md">简体中文</a> |
+  <a href="./README_tzh.md">繁体中文</a> |
   <a href="./README_ja.md">日本語</a> |
   <a href="./README_ko.md">한국어</a> |
   <a href="./README_id.md">Bahasa Indonesia</a> |
 ## 🔥 最新情報
+- 2025-01-26 ナレッジ グラフの抽出と適用を最適化し、さまざまな構成オプションを提供します。
 - 2024-12-18 Deepdoc のドキュメント レイアウト分析モデルをアップグレードします。
 - 2024-12-04 ナレッジ ベースへのページランク スコアをサポートしました。
 - 2024-11-22 エージェントでの変数の定義と使用法を改善しました。
 - 2024-11-01 再現の精度を向上させるために、解析されたチャンクにキーワード抽出と関連質問の生成を追加しました。
 - 2024-08-22 RAG を介して SQL ステートメントへのテキストをサポートします。
 ## 🎉 続きを楽しみに

README_ko.md CHANGED

@@ -7,6 +7,7 @@
 <p align="center">
   <a href="./README.md">English</a> |
   <a href="./README_zh.md">简体中文</a> |
   <a href="./README_ja.md">日本語</a> |
   <a href="./README_ko.md">한국어</a> |
   <a href="./README_id.md">Bahasa Indonesia</a> |
@@ -54,13 +55,13 @@
 ## 🔥 업데이트
 - 2024-12-18 Deepdoc의 문서 레이아웃 분석 모델 업그레이드.
 - 2024-12-04 지식베이스에 대한 페이지랭크 점수를 지원합니다.
 - 2024-11-22 에이전트의 변수 정의 및 사용을 개선했습니다.
 - 2024-11-01 파싱된 청크에 키워드 추출 및 관련 질문 생성을 추가하여 재현율을 향상시킵니다.
 - 2024-08-22 RAG를 통해 SQL 문에 텍스트를 지원합니다.
-- 2024-08-02: [graphrag](https://github.com/microsoft/graphrag)와 마인드맵에서 영감을 받은 GraphRAG를 지원합니다.
 ## 🎉 계속 지켜봐 주세요

 <p align="center">
   <a href="./README.md">English</a> |
   <a href="./README_zh.md">简体中文</a> |
+  <a href="./README_tzh.md">繁体中文</a> |
   <a href="./README_ja.md">日本語</a> |
   <a href="./README_ko.md">한국어</a> |
   <a href="./README_id.md">Bahasa Indonesia</a> |
 ## 🔥 업데이트
+- 2025-01-26 지식 그래프 추출 및 적용을 최적화하고 다양한 구성 옵션을 제공합니다.
 - 2024-12-18 Deepdoc의 문서 레이아웃 분석 모델 업그레이드.
 - 2024-12-04 지식베이스에 대한 페이지랭크 점수를 지원합니다.
 - 2024-11-22 에이전트의 변수 정의 및 사용을 개선했습니다.
 - 2024-11-01 파싱된 청크에 키워드 추출 및 관련 질문 생성을 추가하여 재현율을 향상시킵니다.
 - 2024-08-22 RAG를 통해 SQL 문에 텍스트를 지원합니다.
 ## 🎉 계속 지켜봐 주세요

README_pt_br.md CHANGED

@@ -7,6 +7,7 @@
 <p align="center">
   <a href="./README.md">English</a> |
   <a href="./README_zh.md">简体中文</a> |
   <a href="./README_ja.md">日本語</a> |
   <a href="./README_ko.md">한국어</a> |
   <a href="./README_id.md">Bahasa Indonesia</a> |
@@ -74,12 +75,12 @@ Experimente nossa demo em [https://demo.ragflow.io](https://demo.ragflow.io).
 ## 🔥 Últimas Atualizações
 - 18-12-2024 Atualiza o modelo de Análise de Layout de Documentos no Deepdoc.
 - 04-12-2024 Adiciona suporte para pontuação de pagerank na base de conhecimento.
 - 22-11-2024 Adiciona mais variáveis para o Agente.
 - 01-11-2024 Adiciona extração de palavras-chave e geração de perguntas relacionadas aos blocos analisados para melhorar a precisão da recuperação.
 - 22-08-2024 Suporta conversão de texto para comandos SQL via RAG.
-- 02-08-2024 Suporta GraphRAG inspirado pelo [graphrag](https://github.com/microsoft/graphrag) e mapa mental.
 ## 🎉 Fique Ligado

 <p align="center">
   <a href="./README.md">English</a> |
   <a href="./README_zh.md">简体中文</a> |
+  <a href="./README_tzh.md">繁体中文</a> |
   <a href="./README_ja.md">日本語</a> |
   <a href="./README_ko.md">한국어</a> |
   <a href="./README_id.md">Bahasa Indonesia</a> |
 ## 🔥 Últimas Atualizações
+- 26-01-2025 Otimize a extração e aplicação de gráficos de conhecimento e forneça uma variedade de opções de configuração.
 - 18-12-2024 Atualiza o modelo de Análise de Layout de Documentos no Deepdoc.
 - 04-12-2024 Adiciona suporte para pontuação de pagerank na base de conhecimento.
 - 22-11-2024 Adiciona mais variáveis para o Agente.
 - 01-11-2024 Adiciona extração de palavras-chave e geração de perguntas relacionadas aos blocos analisados para melhorar a precisão da recuperação.
 - 22-08-2024 Suporta conversão de texto para comandos SQL via RAG.
 ## 🎉 Fique Ligado

README_tzh.md CHANGED

@@ -54,12 +54,12 @@
 ## 🔥 近期更新
 - 2024-12-18 升級了 Deepdoc 的文檔佈局分析模型。
 - 2024-12-04 支援知識庫的 Pagerank 分數。
 - 2024-11-22 完善了 Agent 中的變數定義和使用。
 - 2024-11-01 對解析後的 chunk 加入關鍵字抽取和相關問題產生以提高回想的準確度。
 - 2024-08-22 支援用 RAG 技術實現從自然語言到 SQL 語句的轉換。
-- 2024-08-02 支持 GraphRAG 啟發於 [graphrag](https://github.com/microsoft/graphrag) 和心智圖。
 ## 🎉 關注項目

 ## 🔥 近期更新
+- 2025-01-26 最佳化知識圖譜的擷取與應用，提供了多種配置選擇。
 - 2024-12-18 升級了 Deepdoc 的文檔佈局分析模型。
 - 2024-12-04 支援知識庫的 Pagerank 分數。
 - 2024-11-22 完善了 Agent 中的變數定義和使用。
 - 2024-11-01 對解析後的 chunk 加入關鍵字抽取和相關問題產生以提高回想的準確度。
 - 2024-08-22 支援用 RAG 技術實現從自然語言到 SQL 語句的轉換。
 ## 🎉 關注項目

README_zh.md CHANGED

@@ -7,6 +7,7 @@
 <p align="center">
   <a href="./README.md">English</a> |
   <a href="./README_zh.md">简体中文</a> |
   <a href="./README_ja.md">日本語</a> |
   <a href="./README_ko.md">한국어</a> |
   <a href="./README_id.md">Bahasa Indonesia</a> |
@@ -54,12 +55,12 @@
 ## 🔥 近期更新
 - 2024-12-18 升级了 Deepdoc 的文档布局分析模型。
 - 2024-12-04 支持知识库的 Pagerank 分数。
 - 2024-11-22 完善了 Agent 中的变量定义和使用。
 - 2024-11-01 对解析后的 chunk 加入关键词抽取和相关问题生成以提高召回的准确度。
 - 2024-08-22 支持用 RAG 技术实现从自然语言到 SQL 语句的转换。
-- 2024-08-02 支持 GraphRAG 启发于 [graphrag](https://github.com/microsoft/graphrag) 和思维导图。
 ## 🎉 关注项目

 <p align="center">
   <a href="./README.md">English</a> |
   <a href="./README_zh.md">简体中文</a> |
+  <a href="./README_tzh.md">繁体中文</a> |
   <a href="./README_ja.md">日本語</a> |
   <a href="./README_ko.md">한국어</a> |
   <a href="./README_id.md">Bahasa Indonesia</a> |
 ## 🔥 近期更新
+- 2025-01-26 优化知识图谱的提取和应用，提供了多种配置选择。
 - 2024-12-18 升级了 Deepdoc 的文档布局分析模型。
 - 2024-12-04 支持知识库的 Pagerank 分数。
 - 2024-11-22 完善了 Agent 中的变量定义和使用。
 - 2024-11-01 对解析后的 chunk 加入关键词抽取和相关问题生成以提高召回的准确度。
 - 2024-08-22 支持用 RAG 技术实现从自然语言到 SQL 语句的转换。
 ## 🎉 关注项目

api/apps/kb_app.py CHANGED

@@ -24,6 +24,7 @@ from api.db.services.document_service import DocumentService
 from api.db.services.file2document_service import File2DocumentService
 from api.db.services.file_service import FileService
 from api.db.services.user_service import TenantService, UserTenantService
 from api.utils.api_utils import server_error_response, get_data_error_result, validate_request, not_allowed_parameters
 from api.utils import get_uuid
 from api.db import StatusEnum, FileSource
@@ -96,6 +97,13 @@ def update():
             return get_data_error_result(
                 message="Can't find this knowledgebase!")
         if req["name"].lower() != kb.name.lower() \
                 and len(
             KnowledgebaseService.query(name=req["name"], tenant_id=current_user.id, status=StatusEnum.VALID.value)) > 1:
@@ -112,7 +120,7 @@ def update():
                                          search.index_name(kb.tenant_id), kb.id)
             else:
                 # Elasticsearch requires PAGERANK_FLD be non-zero!
-                settings.docStoreConn.update({"exist": PAGERANK_FLD}, {"remove": PAGERANK_FLD},
                                          search.index_name(kb.tenant_id), kb.id)
         e, kb = KnowledgebaseService.get_by_id(kb.id)

 from api.db.services.file2document_service import File2DocumentService
 from api.db.services.file_service import FileService
 from api.db.services.user_service import TenantService, UserTenantService
+from api.settings import DOC_ENGINE
 from api.utils.api_utils import server_error_response, get_data_error_result, validate_request, not_allowed_parameters
 from api.utils import get_uuid
 from api.db import StatusEnum, FileSource
             return get_data_error_result(
                 message="Can't find this knowledgebase!")
+        if req.get("parser_id", "") == "tag" and DOC_ENGINE == "infinity":
+            return get_json_result(
+                data=False,
+                message='The chunk method Tag has not been supported by Infinity yet.',
+                code=settings.RetCode.OPERATING_ERROR
+            )
         if req["name"].lower() != kb.name.lower() \
                 and len(
             KnowledgebaseService.query(name=req["name"], tenant_id=current_user.id, status=StatusEnum.VALID.value)) > 1:
                                          search.index_name(kb.tenant_id), kb.id)
             else:
                 # Elasticsearch requires PAGERANK_FLD be non-zero!
+                settings.docStoreConn.update({"exists": PAGERANK_FLD}, {"remove": PAGERANK_FLD},
                                          search.index_name(kb.tenant_id), kb.id)
         e, kb = KnowledgebaseService.get_by_id(kb.id)
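
Review note on the kb_app.py hunk: the added guard rejects an update early when the requested chunk method is `tag` and the document engine is Infinity. The sketch below restates that check as a standalone function, for illustration only. The name `reject_tag_on_infinity` and the plain-dict return value are hypothetical; the code in the diff returns `get_json_result(...)` with `settings.RetCode.OPERATING_ERROR` instead.

```python
# Hypothetical standalone restatement of the guard added to update().
# The real handler returns get_json_result(...); a plain dict stands in here.
TAG_UNSUPPORTED_MSG = "The chunk method Tag has not been supported by Infinity yet."


def reject_tag_on_infinity(req, doc_engine):
    """Return an error payload if the request must be rejected, else None."""
    if req.get("parser_id", "") == "tag" and doc_engine == "infinity":
        return {"data": False, "message": TAG_UNSUPPORTED_MSG}
    return None


if __name__ == "__main__":
    # Rejected: tag chunking requested while Infinity is the doc engine.
    print(reject_tag_on_infinity({"parser_id": "tag"}, "infinity"))
    # Allowed: same request against another engine, or no parser_id at all.
    print(reject_tag_on_infinity({"parser_id": "tag"}, "elasticsearch"))
    print(reject_tag_on_infinity({"name": "kb"}, "infinity"))
```

Because the check uses `req.get("parser_id", "")`, requests that omit `parser_id` pass through unchanged.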

docs/references/agent_component_reference/concentrator.mdx CHANGED

@@ -18,7 +18,7 @@ A **Concentrator** component enhances the current UX design. For a component ori
 ## Examples
-Explore our general-purpose chatbot agent template, featuring a **Concentrator** component (component ID: **medical**) that relays an execution flow from category 2 of the **Categorize** component to the two translator components:
 1. Click the **Agent** tab at the top center of the page to access the **Agent** page.
 2. Click **+ Create agent** on the top right of the page to open the **agent template** page.

 ## Examples
+Explore our general-purpose chatbot agent template, featuring a **Concentrator** component (component ID: **medical**) that relays an execution flow from category 2 of the **Categorize** component to two translator components:
 1. Click the **Agent** tab at the top center of the page to access the **Agent** page.
 2. Click **+ Create agent** on the top right of the page to open the **agent template** page.

docs/references/supported_models.mdx CHANGED

@@ -12,7 +12,7 @@ A complete list of models supported by RAGFlow, which will continue to expand.
 <APITable>
 ```
-| Provider              | Chat               | Embedding          | Rerank             | Multimodal         | ASR                | TTS                |
 | --------------------- | ------------------ | ------------------ | ------------------ | ------------------ | ------------------ | ------------------ |
 | Anthropic             | :heavy_check_mark: |                    |                    |                    |                    |                    |
 | Azure-OpenAI          | :heavy_check_mark: | :heavy_check_mark: |                    | :heavy_check_mark: | :heavy_check_mark: |                    |
@@ -26,6 +26,7 @@ A complete list of models supported by RAGFlow, which will continue to expand.
 | Fish Audio            |                    |                    |                    |                    |                    | :heavy_check_mark: |
 | Gemini                | :heavy_check_mark: | :heavy_check_mark: |                    | :heavy_check_mark: |                    |                    |
 | Google Cloud          | :heavy_check_mark: |                    |                    |                    |                    |                    |
 | Groq                  | :heavy_check_mark: |                    |                    |                    |                    |                    |
 | HuggingFace           | :heavy_check_mark: | :heavy_check_mark: |                    |                    |                    |                    |
 | Jina                  |                    | :heavy_check_mark: | :heavy_check_mark: |                    |                    |                    |

 <APITable>
 ```
+| Provider              | Chat               | Embedding          | Rerank             | Img2txt            | Sequence2txt       | TTS                |
 | --------------------- | ------------------ | ------------------ | ------------------ | ------------------ | ------------------ | ------------------ |
 | Anthropic             | :heavy_check_mark: |                    |                    |                    |                    |                    |
 | Azure-OpenAI          | :heavy_check_mark: | :heavy_check_mark: |                    | :heavy_check_mark: | :heavy_check_mark: |                    |
 | Fish Audio            |                    |                    |                    |                    |                    | :heavy_check_mark: |
 | Gemini                | :heavy_check_mark: | :heavy_check_mark: |                    | :heavy_check_mark: |                    |                    |
 | Google Cloud          | :heavy_check_mark: |                    |                    |                    |                    |                    |
+| GPUStack              | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |                    | :heavy_check_mark: | :heavy_check_mark: |
 | Groq                  | :heavy_check_mark: |                    |                    |                    |                    |                    |
 | HuggingFace           | :heavy_check_mark: | :heavy_check_mark: |                    |                    |                    |                    |
 | Jina                  |                    | :heavy_check_mark: | :heavy_check_mark: |                    |                    |                    |

graphrag/general/graph_extractor.py CHANGED

@@ -135,7 +135,7 @@ class GraphExtractor(Extractor):
                     break
                 history.append({"role": "assistant", "content": response})
                 history.append({"role": "user", "content": LOOP_PROMPT})
-                continuation = self._chat("", history, self._loop_args)
                 token_count += num_tokens_from_string("\n".join([m["content"] for m in history]) + response)
                 if continuation != "YES":
                     break

                     break
                 history.append({"role": "assistant", "content": response})
                 history.append({"role": "user", "content": LOOP_PROMPT})
+                continuation = self._chat("", history, {"temperature": 0.8})
                 token_count += num_tokens_from_string("\n".join([m["content"] for m in history]) + response)
                 if continuation != "YES":
                     break

pyproject.toml CHANGED

@@ -59,8 +59,8 @@ dependencies = [
     "nltk==3.9.1",
     "numpy>=1.26.0,<2.0.0",
     "ollama==0.2.1",
-    "onnxruntime==1.19.2; sys_platform == 'darwin' or platform_machine == 'arm64'",
-    "onnxruntime-gpu==1.19.2; platform_machine == 'x86_64'",
     "openai==1.45.0",
     "opencv-python==4.10.0.84",
     "opencv-python-headless==4.10.0.84",
@@ -128,8 +128,8 @@ dependencies = [
 [project.optional-dependencies]
 full = [
     "bcembedding==0.1.5",
-    "fastembed>=0.3.6,<0.4.0; sys_platform == 'darwin' or platform_machine == 'arm64'",
-    "fastembed-gpu>=0.3.6,<0.4.0; platform_machine == 'x86_64'",
     "flagembedding==1.2.10",
     "torch>=2.5.0,<3.0.0",
     "transformers>=4.35.0,<5.0.0"

     "nltk==3.9.1",
     "numpy>=1.26.0,<2.0.0",
     "ollama==0.2.1",
+    "onnxruntime==1.19.2; sys_platform == 'darwin' or platform_machine != 'x86_64'",
+    "onnxruntime-gpu==1.19.2; sys_platform != 'darwin' and platform_machine == 'x86_64'",
     "openai==1.45.0",
     "opencv-python==4.10.0.84",
     "opencv-python-headless==4.10.0.84",
 [project.optional-dependencies]
 full = [
     "bcembedding==0.1.5",
+    "fastembed>=0.3.6,<0.4.0; sys_platform == 'darwin' or platform_machine != 'x86_64'",
+    "fastembed-gpu>=0.3.6,<0.4.0; sys_platform != 'darwin' and platform_machine == 'x86_64'",
     "flagembedding==1.2.10",
     "torch>=2.5.0,<3.0.0",
     "transformers>=4.35.0,<5.0.0"

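Review note on the pyproject.toml hunk: the old markers could select both the CPU and GPU wheels on an Intel Mac (`darwin` with `x86_64`) and neither on non-`arm64` machines such as Linux `aarch64`. The sketch below hand-emulates the old and new marker expressions for `onnxruntime` in plain Python to make the difference visible. It is an illustration, not how pip or uv evaluate markers; `old_selection` and `new_selection` are hypothetical names.

```python
# Hand-written emulation of the marker expressions from the diff.
# sys_platform / platform_machine are passed in explicitly for illustration.

def old_selection(sys_platform, platform_machine):
    """Packages selected by the markers before this commit."""
    pkgs = []
    if sys_platform == "darwin" or platform_machine == "arm64":
        pkgs.append("onnxruntime")
    if platform_machine == "x86_64":
        pkgs.append("onnxruntime-gpu")
    return pkgs


def new_selection(sys_platform, platform_machine):
    """Packages selected by the markers after this commit."""
    pkgs = []
    if sys_platform == "darwin" or platform_machine != "x86_64":
        pkgs.append("onnxruntime")
    if sys_platform != "darwin" and platform_machine == "x86_64":
        pkgs.append("onnxruntime-gpu")
    return pkgs


if __name__ == "__main__":
    environments = [
        ("darwin", "x86_64"),   # Intel Mac
        ("darwin", "arm64"),    # Apple Silicon Mac
        ("linux", "x86_64"),
        ("linux", "aarch64"),   # ARM Linux reports aarch64, not arm64
        ("win32", "AMD64"),     # 64-bit Windows typically reports AMD64
    ]
    for env in environments:
        print(env, "old:", old_selection(*env), "new:", new_selection(*env))
```

With the new markers the two conditions are mutually exclusive and exhaustive, so every environment gets exactly one of the two packages. The same reasoning applies to the `fastembed` / `fastembed-gpu` pair.
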
rag/app/table.py CHANGED

@@ -102,9 +102,9 @@ def column_data_type(arr):
     for a in arr:
         if a is None:
             continue
-        if re.match(r"[+-]?[0-9]+(\.0+)?$", str(a).replace("%%", "")):
             counts["int"] += 1
-        elif re.match(r"[+-]?[0-9.]+$", str(a).replace("%%", "")):
             counts["float"] += 1
         elif re.match(r"(true|yes|是|\*|✓|✔|☑|✅|√|false|no|否|⍻|×)$", str(a), flags=re.IGNORECASE):
             counts["bool"] += 1

     for a in arr:
         if a is None:
             continue
+        if re.match(r"[+-]?[0-9]{,19}(\.0+)?$", str(a).replace("%%", "")):
             counts["int"] += 1
+        elif re.match(r"[+-]?[0-9.]{,19}$", str(a).replace("%%", "")):
             counts["float"] += 1
         elif re.match(r"(true|yes|是|\*|✓|✔|☑|✅|√|false|no|否|⍻|×)$", str(a), flags=re.IGNORECASE):
             counts["bool"] += 1
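
Review note on the table.py hunk: replacing `+` with `{,19}` caps the digit run at 19 characters, so longer numeric strings fall through to the later branches. It also lowers the minimum from one digit to zero, so the integer pattern now matches an empty string. The sketch below reproduces only the two patterns from the diff; `classify` is a hypothetical helper, and whether empty cells actually reach this branch in `column_data_type` is not visible in the hunk.

```python
import re

# The two patterns as they appear on the "+" side of the diff.
INT_RE = re.compile(r"[+-]?[0-9]{,19}(\.0+)?$")
FLOAT_RE = re.compile(r"[+-]?[0-9.]{,19}$")


def classify(value):
    """Hypothetical helper mirroring the int/float branches only."""
    s = str(value).replace("%%", "")
    if INT_RE.match(s):
        return "int"
    if FLOAT_RE.match(s):
        return "float"
    return "other"


if __name__ == "__main__":
    for sample in ["123", "12.0", "12.50", "1" * 19, "1" * 20, "", "abc"]:
        print(repr(sample), "->", classify(sample))
```

If the empty-string match is unintended, `{1,19}` would keep the length cap while still requiring at least one digit.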

rag/llm/__init__.py CHANGED

@@ -13,6 +13,8 @@
 #  See the License for the specific language governing permissions and
 #  limitations under the License.
 #
 from .embedding_model import (
     OllamaEmbed,
     LocalAIEmbed,

 #  See the License for the specific language governing permissions and
 #  limitations under the License.
 #
+#  AFTER UPDATING THIS FILE, PLEASE ENSURE THAT docs/references/supported_models.mdx IS ALSO UPDATED for consistency!
+#
 from .embedding_model import (
     OllamaEmbed,
     LocalAIEmbed,

rag/llm/chat_model.py CHANGED

@@ -53,7 +53,7 @@ class Base(ABC):
                     ans += LENGTH_NOTIFICATION_CN
                 else:
                     ans += LENGTH_NOTIFICATION_EN
-            return ans, response.usage.total_tokens
         except openai.APIError as e:
             return "**ERROR**: " + str(e), 0
@@ -75,15 +75,11 @@ class Base(ABC):
                     resp.choices[0].delta.content = ""
                 ans += resp.choices[0].delta.content
-                if not hasattr(resp, "usage") or not resp.usage:
-                    total_tokens = (
-                                total_tokens
-                                + num_tokens_from_string(resp.choices[0].delta.content)
-                        )
-                elif isinstance(resp.usage, dict):
-                    total_tokens = resp.usage.get("total_tokens", total_tokens)
                 else:
-                    total_tokens = resp.usage.total_tokens
                 if resp.choices[0].finish_reason == "length":
                     if is_chinese(ans):
@@ -97,6 +93,17 @@ class Base(ABC):
         yield total_tokens
 class GptTurbo(Base):
     def __init__(self, key, model_name="gpt-3.5-turbo", base_url="https://api.openai.com/v1"):
@@ -182,7 +189,7 @@ class BaiChuanChat(Base):
                     ans += LENGTH_NOTIFICATION_CN
                 else:
                     ans += LENGTH_NOTIFICATION_EN
-            return ans, response.usage.total_tokens
         except openai.APIError as e:
             return "**ERROR**: " + str(e), 0
@@ -212,14 +219,11 @@ class BaiChuanChat(Base):
                 if not resp.choices[0].delta.content:
                     resp.choices[0].delta.content = ""
                 ans += resp.choices[0].delta.content
-                total_tokens = (
-                    (
-                            total_tokens
-                            + num_tokens_from_string(resp.choices[0].delta.content)
-                    )
-                    if not hasattr(resp, "usage")
-                    else resp.usage["total_tokens"]
-                )
                 if resp.choices[0].finish_reason == "length":
                     if is_chinese([ans]):
                         ans += LENGTH_NOTIFICATION_CN
@@ -256,7 +260,7 @@ class QWenChat(Base):
             tk_count = 0
             if response.status_code == HTTPStatus.OK:
                 ans += response.output.choices[0]['message']['content']
-                tk_count += response.usage.total_tokens
                 if response.output.choices[0].get("finish_reason", "") == "length":
                     if is_chinese([ans]):
                         ans += LENGTH_NOTIFICATION_CN
@@ -292,7 +296,7 @@ class QWenChat(Base):
             for resp in response:
                 if resp.status_code == HTTPStatus.OK:
                     ans = resp.output.choices[0]['message']['content']
-                    tk_count = resp.usage.total_tokens
                     if resp.output.choices[0].get("finish_reason", "") == "length":
                         if is_chinese(ans):
                             ans += LENGTH_NOTIFICATION_CN
@@ -334,7 +338,7 @@ class ZhipuChat(Base):
                     ans += LENGTH_NOTIFICATION_CN
                 else:
                     ans += LENGTH_NOTIFICATION_EN
-            return ans, response.usage.total_tokens
         except Exception as e:
             return "**ERROR**: " + str(e), 0
@@ -364,9 +368,9 @@ class ZhipuChat(Base):
                         ans += LENGTH_NOTIFICATION_CN
                     else:
                         ans += LENGTH_NOTIFICATION_EN
-                    tk_count = resp.usage.total_tokens
                 if resp.choices[0].finish_reason == "stop":
-                    tk_count = resp.usage.total_tokens
                 yield ans
         except Exception as e:
             yield ans + "\n**ERROR**: " + str(e)
@@ -569,7 +573,7 @@ class MiniMaxChat(Base):
                     ans += LENGTH_NOTIFICATION_CN
                 else:
                     ans += LENGTH_NOTIFICATION_EN
-            return ans, response["usage"]["total_tokens"]
         except Exception as e:
             return "**ERROR**: " + str(e), 0
@@ -603,11 +607,11 @@ class MiniMaxChat(Base):
                 if "choices" in resp and "delta" in resp["choices"][0]:
                     text = resp["choices"][0]["delta"]["content"]
                 ans += text
-                total_tokens = (
-                    total_tokens + num_tokens_from_string(text)
-                    if "usage" not in resp
-                    else resp["usage"]["total_tokens"]
-                )
                 yield ans
         except Exception as e:
@@ -640,7 +644,7 @@ class MistralChat(Base):
                     ans += LENGTH_NOTIFICATION_CN
                 else:
                     ans += LENGTH_NOTIFICATION_EN
-            return ans, response.usage.total_tokens
         except openai.APIError as e:
             return "**ERROR**: " + str(e), 0
@@ -838,7 +842,7 @@ class GeminiChat(Base):
         yield 0
-class GroqChat:
     def __init__(self, key, model_name, base_url=''):
         from groq import Groq
         self.client = Groq(api_key=key)
@@ -863,7 +867,7 @@ class GroqChat:
                     ans += LENGTH_NOTIFICATION_CN
                 else:
                     ans += LENGTH_NOTIFICATION_EN
-            return ans, response.usage.total_tokens
         except Exception as e:
             return ans + "\n**ERROR**: " + str(e), 0
@@ -1255,7 +1259,7 @@ class BaiduYiyanChat(Base):
                 **gen_conf
             ).body
             ans = response['result']
-            return ans, response["usage"]["total_tokens"]
         except Exception as e:
             return ans + "\n**ERROR**: " + str(e), 0
@@ -1283,7 +1287,7 @@ class BaiduYiyanChat(Base):
             for resp in response:
                 resp = resp.body
                 ans += resp['result']
-                total_tokens = resp["usage"]["total_tokens"]
                 yield ans

                     ans += LENGTH_NOTIFICATION_CN
                 else:
                     ans += LENGTH_NOTIFICATION_EN
+            return ans, self.total_token_count(response)
         except openai.APIError as e:
             return "**ERROR**: " + str(e), 0
                     resp.choices[0].delta.content = ""
                 ans += resp.choices[0].delta.content
+                tol = self.total_token_count(resp)
+                if not tol:
+                    total_tokens += num_tokens_from_string(resp.choices[0].delta.content)
                 else:
+                    total_tokens = tol
                 if resp.choices[0].finish_reason == "length":
                     if is_chinese(ans):
         yield total_tokens
+    def total_token_count(self, resp):
+        try:
+            return resp.usage.total_tokens
+        except Exception:
+            pass
+        try:
+            return resp["usage"]["total_tokens"]
+        except Exception:
+            pass
+        return 0
 class GptTurbo(Base):
     def __init__(self, key, model_name="gpt-3.5-turbo", base_url="https://api.openai.com/v1"):
                     ans += LENGTH_NOTIFICATION_CN
                 else:
                     ans += LENGTH_NOTIFICATION_EN
+            return ans, self.total_token_count(response)
         except openai.APIError as e:
             return "**ERROR**: " + str(e), 0
                 if not resp.choices[0].delta.content:
                     resp.choices[0].delta.content = ""
                 ans += resp.choices[0].delta.content
+                tol = self.total_token_count(resp)
+                if not tol:
+                    total_tokens += num_tokens_from_string(resp.choices[0].delta.content)
+                else:
+                    total_tokens = tol
                 if resp.choices[0].finish_reason == "length":
                     if is_chinese([ans]):
                         ans += LENGTH_NOTIFICATION_CN
             tk_count = 0
             if response.status_code == HTTPStatus.OK:
                 ans += response.output.choices[0]['message']['content']
+                tk_count += self.total_token_count(response)
                 if response.output.choices[0].get("finish_reason", "") == "length":
                     if is_chinese([ans]):
                         ans += LENGTH_NOTIFICATION_CN
             for resp in response:
                 if resp.status_code == HTTPStatus.OK:
                     ans = resp.output.choices[0]['message']['content']
+                    tk_count = self.total_token_count(resp)
                     if resp.output.choices[0].get("finish_reason", "") == "length":
                         if is_chinese(ans):
                             ans += LENGTH_NOTIFICATION_CN
                     ans += LENGTH_NOTIFICATION_CN
                 else:
                     ans += LENGTH_NOTIFICATION_EN
+            return ans, self.total_token_count(response)
         except Exception as e:
             return "**ERROR**: " + str(e), 0
                         ans += LENGTH_NOTIFICATION_CN
                     else:
                         ans += LENGTH_NOTIFICATION_EN
+                    tk_count = self.total_token_count(resp)
                 if resp.choices[0].finish_reason == "stop":
+                    tk_count = self.total_token_count(resp)
                 yield ans
         except Exception as e:
             yield ans + "\n**ERROR**: " + str(e)
                     ans += LENGTH_NOTIFICATION_CN
                 else:
                     ans += LENGTH_NOTIFICATION_EN
+            return ans, self.total_token_count(response)
         except Exception as e:
             return "**ERROR**: " + str(e), 0
                 if "choices" in resp and "delta" in resp["choices"][0]:
                     text = resp["choices"][0]["delta"]["content"]
                 ans += text
+                tol = self.total_token_count(resp)
+                if not tol:
+                    total_tokens += num_tokens_from_string(text)
+                else:
+                    total_tokens = tol
                 yield ans
         except Exception as e:
                     ans += LENGTH_NOTIFICATION_CN
                 else:
                     ans += LENGTH_NOTIFICATION_EN
+            return ans, self.total_token_count(response)
         except openai.APIError as e:
             return "**ERROR**: " + str(e), 0
         yield 0
+class GroqChat(Base):
     def __init__(self, key, model_name, base_url=''):
         from groq import Groq
         self.client = Groq(api_key=key)
                     ans += LENGTH_NOTIFICATION_CN
                 else:
                     ans += LENGTH_NOTIFICATION_EN
+            return ans, self.total_token_count(response)
         except Exception as e:
             return ans + "\n**ERROR**: " + str(e), 0
                 **gen_conf
             ).body
             ans = response['result']
+            return ans, self.total_token_count(response)
         except Exception as e:
             return ans + "\n**ERROR**: " + str(e), 0
             for resp in response:
                 resp = resp.body
                 ans += resp['result']
+                total_tokens = self.total_token_count(resp)
                 yield ans
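
The streaming hunks above prefer the provider-reported usage and fall back to a local estimate when a chunk carries none. A minimal sketch of that pattern follows. It assumes plain dict chunks with hypothetical `text` and `usage` keys, and a hypothetical word-counting `estimate_tokens` in place of `num_tokens_from_string`; it is not the repository's implementation.

```python
def estimate_tokens(text):
    # Hypothetical stand-in for num_tokens_from_string: counts whitespace-separated words.
    return len(text.split())


def stream_with_usage(chunks):
    """Yield the growing answer for each chunk, then the final token total."""
    ans = ""
    total_tokens = 0
    for resp in chunks:
        text = resp.get("text", "")
        ans += text
        # Provider-reported running total, or 0 when the chunk has no usage block.
        tol = resp.get("usage", {}).get("total_tokens", 0)
        if not tol:
            total_tokens += estimate_tokens(text)  # accumulate a local estimate
        else:
            total_tokens = tol  # trust the provider's figure and overwrite
        yield ans
    yield total_tokens
```

One consequence of this design: once any chunk reports usage, earlier local estimates are discarded in favour of the reported total.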

rag/llm/embedding_model.py CHANGED

@@ -44,11 +44,23 @@ class Base(ABC):
     def encode_queries(self, text: str):
         raise NotImplementedError("Please implement encode method!")
 class DefaultEmbedding(Base):
     _model = None
     _model_name = ""
     _model_lock = threading.Lock()
     def __init__(self, key, model_name, **kwargs):
         """
         If you have trouble downloading HuggingFace models, -_^ this might help!!
@@ -115,13 +127,13 @@ class OpenAIEmbed(Base):
             res = self.client.embeddings.create(input=texts[i:i + batch_size],
                                                 model=self.model_name)
             ress.extend([d.embedding for d in res.data])
-            total_tokens += res.usage.total_tokens
         return np.array(ress), total_tokens
     def encode_queries(self, text):
         res = self.client.embeddings.create(input=[truncate(text, 8191)],
                                             model=self.model_name)
-        return np.array(res.data[0].embedding), res.usage.total_tokens
 class LocalAIEmbed(Base):
@@ -188,7 +200,7 @@ class QWenEmbed(Base):
                 for e in resp["output"]["embeddings"]:
                     embds[e["text_index"]] = e["embedding"]
                 res.extend(embds)
-                token_count += resp["usage"]["total_tokens"]
             return np.array(res), token_count
         except Exception as e:
             raise Exception("Account abnormal. Please ensure it's on good standing to use QWen's "+self.model_name)
@@ -203,7 +215,7 @@ class QWenEmbed(Base):
                 text_type="query"
             )
             return np.array(resp["output"]["embeddings"][0]
-                            ["embedding"]), resp["usage"]["total_tokens"]
         except Exception:
             raise Exception("Account abnormal. Please ensure it's on good standing to use QWen's "+self.model_name)
         return np.array([]), 0
@@ -229,13 +241,13 @@ class ZhipuEmbed(Base):
             res = self.client.embeddings.create(input=txt,
                                                 model=self.model_name)
             arr.append(res.data[0].embedding)
-            tks_num += res.usage.total_tokens
         return np.array(arr), tks_num
     def encode_queries(self, text):
         res = self.client.embeddings.create(input=text,
                                             model=self.model_name)
-        return np.array(res.data[0].embedding), res.usage.total_tokens
 class OllamaEmbed(Base):
@@ -318,13 +330,13 @@ class XinferenceEmbed(Base):
         for i in range(0, len(texts), batch_size):
             res = self.client.embeddings.create(input=texts[i:i + batch_size], model=self.model_name)
             ress.extend([d.embedding for d in res.data])
-            total_tokens += res.usage.total_tokens
         return np.array(ress), total_tokens
     def encode_queries(self, text):
         res = self.client.embeddings.create(input=[text],
                                             model=self.model_name)
-        return np.array(res.data[0].embedding), res.usage.total_tokens
 class YoudaoEmbed(Base):
@@ -383,7 +395,7 @@ class JinaEmbed(Base):
             }
             res = requests.post(self.base_url, headers=self.headers, json=data).json()
             ress.extend([d["embedding"] for d in res["data"]])
-            token_count += res["usage"]["total_tokens"]
         return np.array(ress), token_count
     def encode_queries(self, text):
@@ -447,13 +459,13 @@ class MistralEmbed(Base):
             res = self.client.embeddings(input=texts[i:i + batch_size],
                                         model=self.model_name)
             ress.extend([d.embedding for d in res.data])
-            token_count += res.usage.total_tokens
         return np.array(ress), token_count
     def encode_queries(self, text):
         res = self.client.embeddings(input=[truncate(text, 8196)],
                                             model=self.model_name)
-        return np.array(res.data[0].embedding), res.usage.total_tokens
 class BedrockEmbed(Base):
@@ -565,7 +577,7 @@ class NvidiaEmbed(Base):
             }
             res = requests.post(self.base_url, headers=self.headers, json=payload).json()
             ress.extend([d["embedding"] for d in res["data"]])
-            token_count += res["usage"]["total_tokens"]
         return np.array(ress), token_count
     def encode_queries(self, text):
@@ -677,7 +689,7 @@ class SILICONFLOWEmbed(Base):
             if "data" not in res or not isinstance(res["data"], list) or len(res["data"]) != len(texts_batch):
                 raise ValueError(f"SILICONFLOWEmbed.encode got invalid response from {self.base_url}")
             ress.extend([d["embedding"] for d in res["data"]])
-            token_count += res["usage"]["total_tokens"]
         return np.array(ress), token_count
     def encode_queries(self, text):
@@ -689,7 +701,7 @@ class SILICONFLOWEmbed(Base):
         res = requests.post(self.base_url, json=payload, headers=self.headers).json()
         if "data" not in res or not isinstance(res["data"], list) or len(res["data"])!= 1:
             raise ValueError(f"SILICONFLOWEmbed.encode_queries got invalid response from {self.base_url}")
-        return np.array(res["data"][0]["embedding"]), res["usage"]["total_tokens"]
 class ReplicateEmbed(Base):
@@ -727,14 +739,14 @@ class BaiduYiyanEmbed(Base):
         res = self.client.do(model=self.model_name, texts=texts).body
         return (
             np.array([r["embedding"] for r in res["data"]]),
-            res["usage"]["total_tokens"],
         )
     def encode_queries(self, text):
         res = self.client.do(model=self.model_name, texts=[text]).body
         return (
             np.array([r["embedding"] for r in res["data"]]),
-            res["usage"]["total_tokens"],
         )

     def encode_queries(self, text: str):
         raise NotImplementedError("Please implement encode method!")
+    def total_token_count(self, resp):
+        try:
+            return resp.usage.total_tokens
+        except Exception:
+            pass
+        try:
+            return resp["usage"]["total_tokens"]
+        except Exception:
+            pass
+        return 0
 class DefaultEmbedding(Base):
     _model = None
     _model_name = ""
     _model_lock = threading.Lock()
     def __init__(self, key, model_name, **kwargs):
         """
         If you have trouble downloading HuggingFace models, -_^ this might help!!
             res = self.client.embeddings.create(input=texts[i:i + batch_size],
                                                 model=self.model_name)
             ress.extend([d.embedding for d in res.data])
+            total_tokens += self.total_token_count(res)
         return np.array(ress), total_tokens
     def encode_queries(self, text):
         res = self.client.embeddings.create(input=[truncate(text, 8191)],
                                             model=self.model_name)
+        return np.array(res.data[0].embedding), self.total_token_count(res)
 class LocalAIEmbed(Base):
                 for e in resp["output"]["embeddings"]:
                     embds[e["text_index"]] = e["embedding"]
                 res.extend(embds)
+                token_count += self.total_token_count(resp)
             return np.array(res), token_count
         except Exception as e:
             raise Exception("Account abnormal. Please ensure it's on good standing to use QWen's "+self.model_name)
                 text_type="query"
             )
             return np.array(resp["output"]["embeddings"][0]
+                            ["embedding"]), self.total_token_count(resp)
         except Exception:
             raise Exception("Account abnormal. Please ensure it's on good standing to use QWen's "+self.model_name)
         return np.array([]), 0
             res = self.client.embeddings.create(input=txt,
                                                 model=self.model_name)
             arr.append(res.data[0].embedding)
+            tks_num += self.total_token_count(res)
         return np.array(arr), tks_num
     def encode_queries(self, text):
         res = self.client.embeddings.create(input=text,
                                             model=self.model_name)
+        return np.array(res.data[0].embedding), self.total_token_count(res)
 class OllamaEmbed(Base):
         for i in range(0, len(texts), batch_size):
             res = self.client.embeddings.create(input=texts[i:i + batch_size], model=self.model_name)
             ress.extend([d.embedding for d in res.data])
+            total_tokens += self.total_token_count(res)
         return np.array(ress), total_tokens
     def encode_queries(self, text):
         res = self.client.embeddings.create(input=[text],
                                             model=self.model_name)
+        return np.array(res.data[0].embedding), self.total_token_count(res)
 class YoudaoEmbed(Base):
             }
             res = requests.post(self.base_url, headers=self.headers, json=data).json()
             ress.extend([d["embedding"] for d in res["data"]])
+            token_count += self.total_token_count(res)
         return np.array(ress), token_count
     def encode_queries(self, text):
             res = self.client.embeddings(input=texts[i:i + batch_size],
                                         model=self.model_name)
             ress.extend([d.embedding for d in res.data])
+            token_count += self.total_token_count(res)
         return np.array(ress), token_count
     def encode_queries(self, text):
         res = self.client.embeddings(input=[truncate(text, 8196)],
                                             model=self.model_name)
+        return np.array(res.data[0].embedding), self.total_token_count(res)
 class BedrockEmbed(Base):
             }
             res = requests.post(self.base_url, headers=self.headers, json=payload).json()
             ress.extend([d["embedding"] for d in res["data"]])
+            token_count += self.total_token_count(res)
         return np.array(ress), token_count
     def encode_queries(self, text):
             if "data" not in res or not isinstance(res["data"], list) or len(res["data"]) != len(texts_batch):
                 raise ValueError(f"SILICONFLOWEmbed.encode got invalid response from {self.base_url}")
             ress.extend([d["embedding"] for d in res["data"]])
+            token_count += self.total_token_count(res)
         return np.array(ress), token_count
     def encode_queries(self, text):
         res = requests.post(self.base_url, json=payload, headers=self.headers).json()
         if "data" not in res or not isinstance(res["data"], list) or len(res["data"])!= 1:
             raise ValueError(f"SILICONFLOWEmbed.encode_queries got invalid response from {self.base_url}")
+        return np.array(res["data"][0]["embedding"]), self.total_token_count(res)
 class ReplicateEmbed(Base):
         res = self.client.do(model=self.model_name, texts=texts).body
         return (
             np.array([r["embedding"] for r in res["data"]]),
+            self.total_token_count(res),
         )
     def encode_queries(self, text):
         res = self.client.do(model=self.model_name, texts=[text]).body
         return (
             np.array([r["embedding"] for r in res["data"]]),
+            self.total_token_count(res),
         )
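
The `total_token_count` helper added in this file accepts both SDK-style objects (`resp.usage.total_tokens`) and JSON-style dicts (`resp["usage"]["total_tokens"]`), and returns 0 when neither shape matches. The sketch below reproduces it as a free function so it can be run standalone; the three sample responses are made up for illustration.

```python
from types import SimpleNamespace


def total_token_count(resp):
    """Return the reported token total, or 0 if the response has no usage info."""
    try:
        return resp.usage.total_tokens  # object-style response
    except Exception:
        pass
    try:
        return resp["usage"]["total_tokens"]  # dict-style response
    except Exception:
        pass
    return 0


# Illustrative responses (not real provider payloads).
sdk_style = SimpleNamespace(usage=SimpleNamespace(total_tokens=12))
json_style = {"usage": {"total_tokens": 5}}
no_usage = {"data": []}

print(total_token_count(sdk_style), total_token_count(json_style), total_token_count(no_usage))
```

Because every failure is swallowed, a provider that omits usage silently contributes 0 tokens rather than raising, which is the behavioural change compared with the direct `res.usage.total_tokens` lookups on the removed lines.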

rag/llm/rerank_model.py CHANGED

@@ -42,6 +42,17 @@ class Base(ABC):
     def similarity(self, query: str, texts: list):
         raise NotImplementedError("Please implement encode method!")
 class DefaultRerank(Base):
     _model = None
@@ -115,7 +126,7 @@ class JinaRerank(Base):
         rank = np.zeros(len(texts), dtype=float)
         for d in res["results"]:
             rank[d["index"]] = d["relevance_score"]
-        return rank, res["usage"]["total_tokens"]
 class YoudaoRerank(DefaultRerank):
@@ -417,7 +428,7 @@ class BaiduYiyanRerank(Base):
         rank = np.zeros(len(texts), dtype=float)
         for d in res["results"]:
             rank[d["index"]] = d["relevance_score"]
-        return rank, res["usage"]["total_tokens"]
 class VoyageRerank(Base):

     def similarity(self, query: str, texts: list):
         raise NotImplementedError("Please implement encode method!")
+    def total_token_count(self, resp):
+        try:
+            return resp.usage.total_tokens
+        except Exception:
+            pass
+        try:
+            return resp["usage"]["total_tokens"]
+        except Exception:
+            pass
+        return 0
 class DefaultRerank(Base):
     _model = None
         rank = np.zeros(len(texts), dtype=float)
         for d in res["results"]:
             rank[d["index"]] = d["relevance_score"]
+        return rank, self.total_token_count(res)
 class YoudaoRerank(DefaultRerank):
         rank = np.zeros(len(texts), dtype=float)
         for d in res["results"]:
             rank[d["index"]] = d["relevance_score"]
+        return rank, self.total_token_count(res)
 class VoyageRerank(Base):

rag/nlp/search.py CHANGED

@@ -465,7 +465,7 @@ class Dealer:
         if not aggs:
             return False
         cnt = np.sum([c for _, c in aggs])
-        tag_fea = sorted([(a, round(0.1*(c + 1) / (cnt + S) / (all_tags.get(a, 0.0001)))) for a, c in aggs],
                          key=lambda x: x[1] * -1)[:topn_tags]
         doc[TAG_FLD] = {a: c for a, c in tag_fea if c > 0}
         return True
@@ -481,6 +481,6 @@ class Dealer:
         if not aggs:
             return {}
         cnt = np.sum([c for _, c in aggs])
-        tag_fea = sorted([(a, round(0.1*(c + 1) / (cnt + S) / (all_tags.get(a, 0.0001)))) for a, c in aggs],
                          key=lambda x: x[1] * -1)[:topn_tags]
         return {a: max(1, c) for a, c in tag_fea}

         if not aggs:
             return False
         cnt = np.sum([c for _, c in aggs])
+        tag_fea = sorted([(a, round(0.1*(c + 1) / (cnt + S) / max(1e-6, all_tags.get(a, 0.0001)))) for a, c in aggs],
                          key=lambda x: x[1] * -1)[:topn_tags]
         doc[TAG_FLD] = {a: c for a, c in tag_fea if c > 0}
         return True
         if not aggs:
             return {}
         cnt = np.sum([c for _, c in aggs])
+        tag_fea = sorted([(a, round(0.1*(c + 1) / (cnt + S) / max(1e-6, all_tags.get(a, 0.0001)))) for a, c in aggs],
                          key=lambda x: x[1] * -1)[:topn_tags]
         return {a: max(1, c) for a, c in tag_fea}
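
The change in this file wraps the divisor in `max(1e-6, ...)`, so a tag whose stored frequency is 0 no longer causes a division by zero. A minimal sketch of the scoring expression follows. The function name `tag_features`, the default `S=1000`, and the sample data are assumptions for illustration; the diff does not show where `S` comes from.

```python
def tag_features(aggs, all_tags, S=1000, topn_tags=3):
    """Score each (tag, count) pair and keep the top-scoring tags."""
    cnt = sum(c for _, c in aggs)
    scored = [
        # max(1e-6, ...) keeps the divisor positive even when all_tags[a] == 0.
        (a, round(0.1 * (c + 1) / (cnt + S) / max(1e-6, all_tags.get(a, 0.0001))))
        for a, c in aggs
    ]
    return sorted(scored, key=lambda x: x[1] * -1)[:topn_tags]


# Tag "a" has a stored frequency of 0.0, which the old expression would divide by.
result = tag_features([("a", 5), ("b", 1)], {"a": 0.0, "b": 0.001})
print(result)
```

With plain Python floats, the unguarded expression raises `ZeroDivisionError` for tag `"a"` in this example.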

rag/raptor.py CHANGED

@@ -71,7 +71,7 @@ class RecursiveAbstractiveProcessing4TreeOrganizedRetrieval:
         start, end = 0, len(chunks)
         if len(chunks) <= 1:
             return
-        chunks = [(s, a) for s, a in chunks if len(a) > 0]
         def summarize(ck_idx, lock):
             nonlocal chunks
@@ -125,6 +125,8 @@ class RecursiveAbstractiveProcessing4TreeOrganizedRetrieval:
                 threads = []
                 for c in range(n_clusters):
                     ck_idx = [i + start for i in range(len(lbls)) if lbls[i] == c]
                     threads.append(executor.submit(summarize, ck_idx, lock))
                 wait(threads, return_when=ALL_COMPLETED)
                 for th in threads:

         start, end = 0, len(chunks)
         if len(chunks) <= 1:
             return
+        chunks = [(s, a) for s, a in chunks if s and len(a) > 0]
         def summarize(ck_idx, lock):
             nonlocal chunks
                 threads = []
                 for c in range(n_clusters):
                     ck_idx = [i + start for i in range(len(lbls)) if lbls[i] == c]
+                    if not ck_idx:
+                        continue
                     threads.append(executor.submit(summarize, ck_idx, lock))
                 wait(threads, return_when=ALL_COMPLETED)
                 for th in threads:
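
The second hunk skips clusters that received no members, so `summarize` is never submitted with an empty index list. A sketch of just the grouping step, with a hypothetical helper name `cluster_indices` and made-up labels:

```python
def cluster_indices(lbls, n_clusters, start=0):
    """Group chunk indices by cluster label, dropping clusters with no members."""
    groups = []
    for c in range(n_clusters):
        ck_idx = [i + start for i in range(len(lbls)) if lbls[i] == c]
        if not ck_idx:
            continue  # nothing to summarize for this cluster
        groups.append(ck_idx)
    return groups


# Cluster 1 is empty here, so only two groups are produced.
print(cluster_indices([0, 2, 0, 2], n_clusters=3, start=10))
```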

rag/utils/es_conn.py CHANGED

@@ -336,7 +336,7 @@ class ESConnection(DocStoreConnection):
         for k, v in condition.items():
             if not isinstance(k, str) or not v:
                 continue
-            if k == "exist":
                 bqry.filter.append(Q("exists", field=v))
                 continue
             if isinstance(v, list):

         for k, v in condition.items():
             if not isinstance(k, str) or not v:
                 continue
+            if k == "exists":
                 bqry.filter.append(Q("exists", field=v))
                 continue
             if isinstance(v, list):

rag/utils/infinity_conn.py CHANGED

@@ -44,8 +44,23 @@ from rag.utils.doc_store_conn import (
 logger = logging.getLogger('ragflow.infinity_conn')
-def equivalent_condition_to_str(condition: dict) -> str | None:
     assert "_id" not in condition
     cond = list()
     for k, v in condition.items():
         if not isinstance(k, str) or k in ["kb_id"] or not v:
@@ -61,8 +76,15 @@ def equivalent_condition_to_str(condition: dict) -> str | None:
                 strInCond = ", ".join(inCond)
                 strInCond = f"{k} IN ({strInCond})"
                 cond.append(strInCond)
         elif isinstance(v, str):
             cond.append(f"{k}='{v}'")
         else:
             cond.append(f"{k}={str(v)}")
     return " AND ".join(cond) if cond else "1=1"
@@ -273,15 +295,32 @@ class InfinityConnection(DocStoreConnection):
         for essential_field in ["id"]:
             if essential_field not in selectFields:
                 selectFields.append(essential_field)
         if matchExprs:
-            for essential_field in ["score()", PAGERANK_FLD]:
-                selectFields.append(essential_field)
         # Prepare expressions common to all tables
         filter_cond = None
         filter_fulltext = ""
         if condition:
-            filter_cond = equivalent_condition_to_str(condition)
         for matchExpr in matchExprs:
             if isinstance(matchExpr, MatchTextExpr):
                 if filter_cond and "filter" not in matchExpr.extra_options:
@@ -364,7 +403,9 @@ class InfinityConnection(DocStoreConnection):
         self.connPool.release_conn(inf_conn)
         res = concat_dataframes(df_list, selectFields)
         if matchExprs:
-            res = res.sort(pl.col("SCORE") + pl.col(PAGERANK_FLD), descending=True, maintain_order=True)
         res = res.limit(limit)
         logger.debug(f"INFINITY search final result: {str(res)}")
         return res, total_hits_count
@@ -419,12 +460,21 @@ class InfinityConnection(DocStoreConnection):
             self.createIdx(indexName, knowledgebaseId, vector_size)
             table_instance = db_instance.get_table(table_name)
         docs = copy.deepcopy(documents)
         for d in docs:
             assert "_id" not in d
             assert "id" in d
             for k, v in d.items():
-                if k in ["important_kwd", "question_kwd", "entities_kwd", "tag_kwd"]:
                     assert isinstance(v, list)
                     d[k] = "###".join(v)
                 elif re.search(r"_feas$", k):
@@ -439,6 +489,11 @@ class InfinityConnection(DocStoreConnection):
                 elif k in ["page_num_int", "top_int"]:
                     assert isinstance(v, list)
                     d[k] = "_".join(f"{num:08x}" for num in v)
         ids = ["'{}'".format(d["id"]) for d in docs]
         str_ids = ", ".join(ids)
         str_filter = f"id IN ({str_ids})"
@@ -460,11 +515,11 @@ class InfinityConnection(DocStoreConnection):
         db_instance = inf_conn.get_database(self.dbName)
         table_name = f"{indexName}_{knowledgebaseId}"
         table_instance = db_instance.get_table(table_name)
-        if "exist" in condition:
-            del condition["exist"]
-        filter = equivalent_condition_to_str(condition)
         for k, v in list(newValue.items()):
-            if k in ["important_kwd", "question_kwd", "entities_kwd", "tag_kwd"]:
                 assert isinstance(v, list)
                 newValue[k] = "###".join(v)
             elif re.search(r"_feas$", k):
@@ -481,9 +536,11 @@ class InfinityConnection(DocStoreConnection):
             elif k in ["page_num_int", "top_int"]:
                 assert isinstance(v, list)
                 newValue[k] = "_".join(f"{num:08x}" for num in v)
-            elif k == "remove" and v in [PAGERANK_FLD]:
                 del newValue[k]
-                newValue[v] = 0
         logger.debug(f"INFINITY update table {table_name}, filter {filter}, newValue {newValue}.")
         table_instance.update(filter, newValue)
         self.connPool.release_conn(inf_conn)
@@ -493,14 +550,14 @@ class InfinityConnection(DocStoreConnection):
         inf_conn = self.connPool.get_conn()
         db_instance = inf_conn.get_database(self.dbName)
         table_name = f"{indexName}_{knowledgebaseId}"
-        filter = equivalent_condition_to_str(condition)
         try:
             table_instance = db_instance.get_table(table_name)
         except Exception:
             logger.warning(
-                f"Skipped deleting `{filter}` from table {table_name} since the table doesn't exist."
             )
             return 0
         logger.debug(f"INFINITY delete table {table_name}, filter {filter}.")
         res = table_instance.delete(filter)
         self.connPool.release_conn(inf_conn)
@@ -538,7 +595,7 @@ class InfinityConnection(DocStoreConnection):
                 v = res[fieldnm][i]
                 if isinstance(v, Series):
                     v = list(v)
-                elif fieldnm in ["important_kwd", "question_kwd", "entities_kwd", "tag_kwd"]:
                     assert isinstance(v, str)
                     v = [kwd for kwd in v.split("###") if kwd]
                 elif fieldnm == "position_int":
@@ -569,6 +626,8 @@ class InfinityConnection(DocStoreConnection):
         ans = {}
         num_rows = len(res)
         column_id = res["id"]
         for i in range(num_rows):
             id = column_id[i]
             txt = res[fieldnm][i]

 logger = logging.getLogger('ragflow.infinity_conn')
+def equivalent_condition_to_str(condition: dict, table_instance=None) -> str | None:
     assert "_id" not in condition
+    clmns = {}
+    if table_instance:
+        for n, ty, de, _ in table_instance.show_columns().rows():
+            clmns[n] = (ty, de)
+    def exists(cln):
+        nonlocal clmns
+        assert cln in clmns, f"'{cln}' should be in '{clmns}'."
+        ty, de = clmns[cln]
+        if ty.lower().find("cha"):
+            if not de:
+                de = ""
+            return f" {cln}!='{de}' "
+        return f"{cln}!={de}"
     cond = list()
     for k, v in condition.items():
         if not isinstance(k, str) or k in ["kb_id"] or not v:
                 strInCond = ", ".join(inCond)
                 strInCond = f"{k} IN ({strInCond})"
                 cond.append(strInCond)
+        elif k == "must_not":
+            if isinstance(v, dict):
+                for kk, vv in v.items():
+                    if kk == "exists":
+                        cond.append("NOT (%s)" % exists(vv))
         elif isinstance(v, str):
             cond.append(f"{k}='{v}'")
+        elif k == "exists":
+            cond.append(exists(v))
         else:
             cond.append(f"{k}={str(v)}")
     return " AND ".join(cond) if cond else "1=1"
         for essential_field in ["id"]:
             if essential_field not in selectFields:
                 selectFields.append(essential_field)
+        score_func = ""
+        score_column = ""
+        for matchExpr in matchExprs:
+            if isinstance(matchExpr, MatchTextExpr):
+                score_func = "score()"
+                score_column = "SCORE"
+                break
+        if not score_func:
+            for matchExpr in matchExprs:
+                if isinstance(matchExpr, MatchDenseExpr):
+                    score_func = "similarity()"
+                    score_column = "SIMILARITY"
+                    break
         if matchExprs:
+            selectFields.append(score_func)
+            selectFields.append(PAGERANK_FLD)
         # Prepare expressions common to all tables
         filter_cond = None
         filter_fulltext = ""
         if condition:
+            for indexName in indexNames:
+                table_name = f"{indexName}_{knowledgebaseIds[0]}"
+                filter_cond = equivalent_condition_to_str(condition, db_instance.get_table(table_name))
+                break
         for matchExpr in matchExprs:
             if isinstance(matchExpr, MatchTextExpr):
                 if filter_cond and "filter" not in matchExpr.extra_options:
         self.connPool.release_conn(inf_conn)
         res = concat_dataframes(df_list, selectFields)
         if matchExprs:
+            res = res.sort(pl.col(score_column) + pl.col(PAGERANK_FLD), descending=True, maintain_order=True)
+            if score_column and score_column != "SCORE":
+                res = res.rename({score_column: "SCORE"})
         res = res.limit(limit)
         logger.debug(f"INFINITY search final result: {str(res)}")
         return res, total_hits_count
             self.createIdx(indexName, knowledgebaseId, vector_size)
             table_instance = db_instance.get_table(table_name)
+        # embedding fields can't have a default value....
+        embedding_clmns = []
+        clmns = table_instance.show_columns().rows()
+        for n, ty, _, _ in clmns:
+            r = re.search(r"Embedding\([a-z]+,([0-9]+)\)", ty)
+            if not r:
+                continue
+            embedding_clmns.append((n, int(r.group(1))))
         docs = copy.deepcopy(documents)
         for d in docs:
             assert "_id" not in d
             assert "id" in d
             for k, v in d.items():
+                if k in ["important_kwd", "question_kwd", "entities_kwd", "tag_kwd", "source_id"]:
                     assert isinstance(v, list)
                     d[k] = "###".join(v)
                 elif re.search(r"_feas$", k):
                 elif k in ["page_num_int", "top_int"]:
                     assert isinstance(v, list)
                     d[k] = "_".join(f"{num:08x}" for num in v)
+            for n, vs in embedding_clmns:
+                if n in d:
+                    continue
+                d[n] = [0] * vs
         ids = ["'{}'".format(d["id"]) for d in docs]
         str_ids = ", ".join(ids)
         str_filter = f"id IN ({str_ids})"
         db_instance = inf_conn.get_database(self.dbName)
         table_name = f"{indexName}_{knowledgebaseId}"
         table_instance = db_instance.get_table(table_name)
+        #if "exists" in condition:
+        #    del condition["exists"]
+        filter = equivalent_condition_to_str(condition, table_instance)
         for k, v in list(newValue.items()):
+            if k in ["important_kwd", "question_kwd", "entities_kwd", "tag_kwd", "source_id"]:
                 assert isinstance(v, list)
                 newValue[k] = "###".join(v)
             elif re.search(r"_feas$", k):
             elif k in ["page_num_int", "top_int"]:
                 assert isinstance(v, list)
                 newValue[k] = "_".join(f"{num:08x}" for num in v)
+            elif k == "remove":
                 del newValue[k]
+                if v in [PAGERANK_FLD]:
+                    newValue[v] = 0
         logger.debug(f"INFINITY update table {table_name}, filter {filter}, newValue {newValue}.")
         table_instance.update(filter, newValue)
         self.connPool.release_conn(inf_conn)
         inf_conn = self.connPool.get_conn()
         db_instance = inf_conn.get_database(self.dbName)
         table_name = f"{indexName}_{knowledgebaseId}"
         try:
             table_instance = db_instance.get_table(table_name)
         except Exception:
             logger.warning(
+                f"Skipped deleting from table {table_name} since the table doesn't exist."
             )
             return 0
+        filter = equivalent_condition_to_str(condition, table_instance)
         logger.debug(f"INFINITY delete table {table_name}, filter {filter}.")
         res = table_instance.delete(filter)
         self.connPool.release_conn(inf_conn)
                 v = res[fieldnm][i]
                 if isinstance(v, Series):
                     v = list(v)
+                elif fieldnm in ["important_kwd", "question_kwd", "entities_kwd", "tag_kwd", "source_id"]:
                     assert isinstance(v, str)
                     v = [kwd for kwd in v.split("###") if kwd]
                 elif fieldnm == "position_int":
         ans = {}
         num_rows = len(res)
         column_id = res["id"]
+        if fieldnm not in res:
+            return {}
         for i in range(num_rows):
             id = column_id[i]
             txt = res[fieldnm][i]
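
One detail in the new `exists` helper is worth checking when reviewing: `str.find` returns -1 when the substring is absent and 0 when it sits at the start, so `if ty.lower().find("cha"):` is true for a type with no `cha` in it and false for a type that begins with it. The sketch below assumes the intent is "quote the default when the column is a character type" and uses the `in` operator for that. The function name `exists_condition` and the sample column types are hypothetical, not taken from the Infinity API.

```python
def exists_condition(column, col_type, default):
    """Build a filter fragment meaning 'column differs from its default value'."""
    if "cha" in col_type.lower():  # assumed intent: character types such as varchar
        return f"{column}!='{default or ''}'"
    return f"{column}!={default}"


# How str.find behaves as a truth value:
print("integer".find("cha"))  # absent: -1, which is truthy
print("char".find("cha"))     # at position 0: 0, which is falsy

print(exists_condition("tag_kwd", "varchar", None))
print(exists_condition("pagerank_fea", "integer", 0))
```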

uv.lock CHANGED

@@ -850,7 +850,7 @@ name = "coloredlogs"
 version = "15.0.1"
 source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
 dependencies = [
-    { name = "humanfriendly", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
 ]
 sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/cc/c7/eed8f27100517e8c0e6b923d5f0845d0cb99763da6fdee00478f91db7325/coloredlogs-15.0.1.tar.gz", hash = "sha256:7c991aa71a4577af2f82600d8f8f3a89f936baeaf9b50a9c197da014e5bf16b0", size = 278520 }
 wheels = [
@@ -1329,18 +1329,18 @@ name = "fastembed"
 version = "0.3.6"
 source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
 dependencies = [
-    { name = "huggingface-hub", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "loguru", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "mmh3", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "numpy", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "onnx", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "onnxruntime", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "pillow", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "pystemmer", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "requests", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "snowballstemmer", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "tokenizers", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "tqdm", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
 ]
 sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/ae/20/68a109c8def842ed47a2951873fb2d7d23ee296ef8c195aedbb735670fff/fastembed-0.3.6.tar.gz", hash = "sha256:c93c8ec99b8c008c2d192d6297866b8d70ec7ac8f5696b34eb5ea91f85efd15f", size = 35058 }
 wheels = [
@@ -1352,17 +1352,17 @@ name = "fastembed-gpu"
 version = "0.3.6"
 source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
 dependencies = [
-    { name = "huggingface-hub", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "loguru", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "mmh3", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "numpy", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "onnxruntime-gpu", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "pillow", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "pystemmer", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "requests", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "snowballstemmer", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "tokenizers", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "tqdm", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
 ]
 sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/da/07/7336c7f3d7ee47f33b407eeb50f5eeb152889de538a52a8f1cc637192816/fastembed_gpu-0.3.6.tar.gz", hash = "sha256:ee2de8918b142adbbf48caaffec0c492f864d73c073eea5a3dcd0e8c1041c50d", size = 35051 }
 wheels = [
@@ -3424,8 +3424,8 @@ name = "onnx"
 version = "1.17.0"
 source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
 dependencies = [
-    { name = "numpy", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "protobuf", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
 ]
 sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/9a/54/0e385c26bf230d223810a9c7d06628d954008a5e5e4b73ee26ef02327282/onnx-1.17.0.tar.gz", hash = "sha256:48ca1a91ff73c1d5e3ea2eef20ae5d0e709bb8a2355ed798ffc2169753013fd3", size = 12165120 }
 wheels = [
@@ -3451,12 +3451,12 @@ name = "onnxruntime"
 version = "1.19.2"
 source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
 dependencies = [
-    { name = "coloredlogs", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "flatbuffers", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "numpy", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "packaging", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "protobuf", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "sympy", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
 ]
 wheels = [
     { url = "https://pypi.tuna.tsinghua.edu.cn/packages/39/18/272d3d7406909141d3c9943796e3e97cafa53f4342d9231c0cfd8cb05702/onnxruntime-1.19.2-cp310-cp310-macosx_11_0_universal2.whl", hash = "sha256:84fa57369c06cadd3c2a538ae2a26d76d583e7c34bdecd5769d71ca5c0fc750e", size = 16776408 },
@@ -3481,12 +3481,12 @@ name = "onnxruntime-gpu"
 version = "1.19.2"
 source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
 dependencies = [
-    { name = "coloredlogs", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "flatbuffers", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "numpy", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "packaging", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "protobuf", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
-    { name = "sympy", marker = "platform_machine != 'aarch64' or sys_platform != 'linux'" },
 ]
 wheels = [
     { url = "https://pypi.tuna.tsinghua.edu.cn/packages/d0/9c/3fa310e0730643051eb88e884f19813a6c8b67d0fbafcda610d960e589db/onnxruntime_gpu-1.19.2-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:a49740e079e7c5215830d30cde3df792e903df007aa0b0fd7aa797937061b27a", size = 226178508 },
@@ -4768,8 +4768,8 @@ dependencies = [
     { name = "nltk" },
     { name = "numpy" },
     { name = "ollama" },
-    { name = "onnxruntime", marker = "platform_machine == 'arm64' or sys_platform == 'darwin'" },
-    { name = "onnxruntime-gpu", marker = "platform_machine == 'x86_64'" },
     { name = "openai" },
     { name = "opencv-python" },
     { name = "opencv-python-headless" },
@@ -4833,8 +4833,8 @@ dependencies = [
 [package.optional-dependencies]
 full = [
     { name = "bcembedding" },
-    { name = "fastembed", marker = "platform_machine == 'arm64' or sys_platform == 'darwin'" },
-    { name = "fastembed-gpu", marker = "platform_machine == 'x86_64'" },
     { name = "flagembedding" },
     { name = "torch" },
     { name = "transformers" },
@@ -4870,8 +4870,8 @@ requires-dist = [
     { name = "elastic-transport", specifier = "==8.12.0" },
     { name = "elasticsearch", specifier = "==8.12.1" },
     { name = "elasticsearch-dsl", specifier = "==8.12.0" },
-    { name = "fastembed", marker = "(platform_machine == 'arm64' and extra == 'full') or (sys_platform == 'darwin' and extra == 'full')", specifier = ">=0.3.6,<0.4.0" },
-    { name = "fastembed-gpu", marker = "platform_machine == 'x86_64' and extra == 'full'", specifier = ">=0.3.6,<0.4.0" },
     { name = "fasttext", specifier = "==0.9.3" },
     { name = "filelock", specifier = "==3.15.4" },
     { name = "flagembedding", marker = "extra == 'full'", specifier = "==1.2.10" },
@@ -4900,8 +4900,8 @@ requires-dist = [
     { name = "nltk", specifier = "==3.9.1" },
     { name = "numpy", specifier = ">=1.26.0,<2.0.0" },
     { name = "ollama", specifier = "==0.2.1" },
-    { name = "onnxruntime", marker = "platform_machine == 'arm64' or sys_platform == 'darwin'", specifier = "==1.19.2" },
-    { name = "onnxruntime-gpu", marker = "platform_machine == 'x86_64'", specifier = "==1.19.2" },
     { name = "openai", specifier = "==1.45.0" },
     { name = "opencv-python", specifier = "==4.10.0.84" },
     { name = "opencv-python-headless", specifier = "==4.10.0.84" },

 version = "15.0.1"
 source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
 dependencies = [
+    { name = "humanfriendly" },
 ]
 sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/cc/c7/eed8f27100517e8c0e6b923d5f0845d0cb99763da6fdee00478f91db7325/coloredlogs-15.0.1.tar.gz", hash = "sha256:7c991aa71a4577af2f82600d8f8f3a89f936baeaf9b50a9c197da014e5bf16b0", size = 278520 }
 wheels = [
 version = "0.3.6"
 source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
 dependencies = [
+    { name = "huggingface-hub" },
+    { name = "loguru" },
+    { name = "mmh3" },
+    { name = "numpy" },
+    { name = "onnx" },
+    { name = "onnxruntime" },
+    { name = "pillow" },
+    { name = "pystemmer" },
+    { name = "requests" },
+    { name = "snowballstemmer" },
+    { name = "tokenizers" },
+    { name = "tqdm" },
 ]
 sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/ae/20/68a109c8def842ed47a2951873fb2d7d23ee296ef8c195aedbb735670fff/fastembed-0.3.6.tar.gz", hash = "sha256:c93c8ec99b8c008c2d192d6297866b8d70ec7ac8f5696b34eb5ea91f85efd15f", size = 35058 }
 wheels = [
 version = "0.3.6"
 source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
 dependencies = [
+    { name = "huggingface-hub", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+    { name = "loguru", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+    { name = "mmh3", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+    { name = "numpy", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+    { name = "onnxruntime-gpu", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+    { name = "pillow", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+    { name = "pystemmer", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+    { name = "requests", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+    { name = "snowballstemmer", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+    { name = "tokenizers", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+    { name = "tqdm", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
 ]
 sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/da/07/7336c7f3d7ee47f33b407eeb50f5eeb152889de538a52a8f1cc637192816/fastembed_gpu-0.3.6.tar.gz", hash = "sha256:ee2de8918b142adbbf48caaffec0c492f864d73c073eea5a3dcd0e8c1041c50d", size = 35051 }
 wheels = [
 version = "1.17.0"
 source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
 dependencies = [
+    { name = "numpy" },
+    { name = "protobuf" },
 ]
 sdist = { url = "https://pypi.tuna.tsinghua.edu.cn/packages/9a/54/0e385c26bf230d223810a9c7d06628d954008a5e5e4b73ee26ef02327282/onnx-1.17.0.tar.gz", hash = "sha256:48ca1a91ff73c1d5e3ea2eef20ae5d0e709bb8a2355ed798ffc2169753013fd3", size = 12165120 }
 wheels = [
 version = "1.19.2"
 source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
 dependencies = [
+    { name = "coloredlogs" },
+    { name = "flatbuffers" },
+    { name = "numpy" },
+    { name = "packaging" },
+    { name = "protobuf" },
+    { name = "sympy" },
 ]
 wheels = [
     { url = "https://pypi.tuna.tsinghua.edu.cn/packages/39/18/272d3d7406909141d3c9943796e3e97cafa53f4342d9231c0cfd8cb05702/onnxruntime-1.19.2-cp310-cp310-macosx_11_0_universal2.whl", hash = "sha256:84fa57369c06cadd3c2a538ae2a26d76d583e7c34bdecd5769d71ca5c0fc750e", size = 16776408 },
 version = "1.19.2"
 source = { registry = "https://pypi.tuna.tsinghua.edu.cn/simple" }
 dependencies = [
+    { name = "coloredlogs", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+    { name = "flatbuffers", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+    { name = "numpy", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+    { name = "packaging", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+    { name = "protobuf", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
+    { name = "sympy", marker = "(platform_machine != 'aarch64' and sys_platform == 'linux') or (sys_platform != 'darwin' and sys_platform != 'linux')" },
 ]
 wheels = [
     { url = "https://pypi.tuna.tsinghua.edu.cn/packages/d0/9c/3fa310e0730643051eb88e884f19813a6c8b67d0fbafcda610d960e589db/onnxruntime_gpu-1.19.2-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:a49740e079e7c5215830d30cde3df792e903df007aa0b0fd7aa797937061b27a", size = 226178508 },
     { name = "nltk" },
     { name = "numpy" },
     { name = "ollama" },
+    { name = "onnxruntime", marker = "platform_machine != 'x86_64' or sys_platform == 'darwin'" },
+    { name = "onnxruntime-gpu", marker = "platform_machine == 'x86_64' and sys_platform != 'darwin'" },
     { name = "openai" },
     { name = "opencv-python" },
     { name = "opencv-python-headless" },
 [package.optional-dependencies]
 full = [
     { name = "bcembedding" },
+    { name = "fastembed", marker = "platform_machine != 'x86_64' or sys_platform == 'darwin'" },
+    { name = "fastembed-gpu", marker = "platform_machine == 'x86_64' and sys_platform != 'darwin'" },
     { name = "flagembedding" },
     { name = "torch" },
     { name = "transformers" },
     { name = "elastic-transport", specifier = "==8.12.0" },
     { name = "elasticsearch", specifier = "==8.12.1" },
     { name = "elasticsearch-dsl", specifier = "==8.12.0" },
+    { name = "fastembed", marker = "(platform_machine != 'x86_64' and extra == 'full') or (sys_platform == 'darwin' and extra == 'full')", specifier = ">=0.3.6,<0.4.0" },
+    { name = "fastembed-gpu", marker = "platform_machine == 'x86_64' and sys_platform != 'darwin' and extra == 'full'", specifier = ">=0.3.6,<0.4.0" },
     { name = "fasttext", specifier = "==0.9.3" },
     { name = "filelock", specifier = "==3.15.4" },
     { name = "flagembedding", marker = "extra == 'full'", specifier = "==1.2.10" },
     { name = "nltk", specifier = "==3.9.1" },
     { name = "numpy", specifier = ">=1.26.0,<2.0.0" },
     { name = "ollama", specifier = "==0.2.1" },
+    { name = "onnxruntime", marker = "platform_machine != 'x86_64' or sys_platform == 'darwin'", specifier = "==1.19.2" },
+    { name = "onnxruntime-gpu", marker = "platform_machine == 'x86_64' and sys_platform != 'darwin'", specifier = "==1.19.2" },
     { name = "openai", specifier = "==1.45.0" },
     { name = "opencv-python", specifier = "==4.10.0.84" },
     { name = "opencv-python-headless", specifier = "==4.10.0.84" },
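The marker changes to `onnxruntime` / `onnxruntime-gpu` above (the `fastembed` / `fastembed-gpu` pair uses the same conditions) can be checked with a small truth table. The following is only a sketch: the helper names and the sample machine/platform values are assumptions chosen for illustration, and the boolean logic is transcribed by hand from the removed and added marker strings rather than evaluated by a real marker parser. Under those assumptions, it suggests that the old markers selected neither package on some combinations and both on others, while the new markers select exactly one package per combination.

```python
from itertools import product

# Hypothetical helpers: each mirrors one marker string from the diff.
def old_cpu(machine, platform):
    # platform_machine == 'arm64' or sys_platform == 'darwin'
    return machine == "arm64" or platform == "darwin"

def old_gpu(machine, platform):
    # platform_machine == 'x86_64'
    return machine == "x86_64"

def new_cpu(machine, platform):
    # platform_machine != 'x86_64' or sys_platform == 'darwin'
    return machine != "x86_64" or platform == "darwin"

def new_gpu(machine, platform):
    # platform_machine == 'x86_64' and sys_platform != 'darwin'
    return machine == "x86_64" and platform != "darwin"

# Sample values only; real environments may report other strings.
MACHINES = ("x86_64", "aarch64", "arm64")
PLATFORMS = ("linux", "darwin", "win32")

def selected(cpu, gpu):
    """Map each (machine, platform) pair to the packages its markers select."""
    table = {}
    for machine, platform in product(MACHINES, PLATFORMS):
        picks = []
        if cpu(machine, platform):
            picks.append("onnxruntime")
        if gpu(machine, platform):
            picks.append("onnxruntime-gpu")
        table[(machine, platform)] = picks
    return table

old = selected(old_cpu, old_gpu)
new = selected(new_cpu, new_gpu)

for key in old:
    print(key, old[key], "->", new[key])
```

In this sketch the two new conditions are logical complements of each other, which is why every sampled combination ends up with exactly one of the two packages.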

web/src/pages/user-setting/components/setting-title/index.tsx CHANGED

@@ -27,7 +27,10 @@ const SettingTitle = ({
       </div>
       {showRightButton && (
         <Button type={'primary'} onClick={clickButton}>
-          <SettingOutlined></SettingOutlined> {t('systemModelSettings')}
         </Button>
       )}
     </Flex>

       </div>
       {showRightButton && (
         <Button type={'primary'} onClick={clickButton}>
+          <Flex align="center" gap={4}>
+            <SettingOutlined />
+            {t('systemModelSettings')}
+          </Flex>
         </Button>
       )}
     </Flex>

web/src/pages/user-setting/setting-model/index.tsx CHANGED

@@ -92,27 +92,31 @@ const ModelCard = ({ item, clickApiKey }: IModelCardProps) => {
           <Col span={12} className={styles.factoryOperationWrapper}>
             <Space size={'middle'}>
               <Button onClick={handleApiKeyClick}>
-                {isLocalLlmFactory(item.name) ||
-                item.name === 'VolcEngine' ||
-                item.name === 'Tencent Hunyuan' ||
-                item.name === 'XunFei Spark' ||
-                item.name === 'BaiduYiyan' ||
-                item.name === 'Fish Audio' ||
-                item.name === 'Tencent Cloud' ||
-                item.name === 'Google Cloud' ||
-                item.name === 'Azure OpenAI'
-                  ? t('addTheModel')
-                  : 'API-Key'}
-                <SettingOutlined />
               </Button>
               <Button onClick={handleShowMoreClick}>
-                <Flex gap={'small'}>
                   {t('showMoreModels')}
                   <MoreModelIcon />
                 </Flex>
               </Button>
               <Button type={'text'} onClick={handleDeleteFactory}>
-                <CloseCircleOutlined style={{ color: '#D92D20' }} />
               </Button>
             </Space>
           </Col>

           <Col span={12} className={styles.factoryOperationWrapper}>
             <Space size={'middle'}>
               <Button onClick={handleApiKeyClick}>
+                <Flex align="center" gap={4}>
+                  {isLocalLlmFactory(item.name) ||
+                  item.name === 'VolcEngine' ||
+                  item.name === 'Tencent Hunyuan' ||
+                  item.name === 'XunFei Spark' ||
+                  item.name === 'BaiduYiyan' ||
+                  item.name === 'Fish Audio' ||
+                  item.name === 'Tencent Cloud' ||
+                  item.name === 'Google Cloud' ||
+                  item.name === 'Azure OpenAI'
+                    ? t('addTheModel')
+                    : 'API-Key'}
+                  <SettingOutlined />
+                </Flex>
               </Button>
               <Button onClick={handleShowMoreClick}>
+                <Flex align="center" gap={4}>
                   {t('showMoreModels')}
                   <MoreModelIcon />
                 </Flex>
               </Button>
               <Button type={'text'} onClick={handleDeleteFactory}>
+                <Flex align="center">
+                  <CloseCircleOutlined style={{ color: '#D92D20' }} />
+                </Flex>
               </Button>
             </Space>
           </Col>