Upload folder using huggingface_hub
- README.md +2 -2
- README_AGENTS.md +139 -0
- analyze_unused.py +76 -0
- app.py +93 -89
- explore_metadata.py +134 -0
- faiss_index/index.faiss +2 -2
- faiss_index/index.pkl +2 -2
- loader.py +131 -68
- model.py +16 -0
- port_recomendations.py +147 -0
- port_recommendations_standalone.py +41 -0
- requirements.txt +2 -1
- retriever_tool.py +49 -0
- setup.py +131 -0
README.md
CHANGED
```diff
@@ -13,8 +13,8 @@ sdk_version: 5.30.0
 
 
 
-
-
+Queries network documentation with natural language.
+Recommends ports to users.
 
 ## Key Features
 
```
README_AGENTS.md
ADDED
@@ -0,0 +1,139 @@
# Network Infrastructure AI Assistant

This project implements an AI-powered network infrastructure assistant with specialized port recommendation capabilities using the OpenAI Agents SDK.

## Architecture Overview

The system follows a modular architecture based on the OpenAI Agents SDK:

### Core Components

1. **`retriever_tool.py`** - Network information retrieval tool
   - Uses a FAISS vector database for semantic search
   - Searches through network documentation and device configurations
   - Returns relevant network information with similarity scores

2. **`port_recomendations.py`** - Specialized port recommendations agent
   - Expert agent focused on port/interface recommendations
   - Understands MLAG configurations and redundancy requirements
   - Provides specific device names and port numbers

3. **`app.py`** - Main orchestrator application
   - Combines the retrieval tool and the port recommendations agent
   - Provides a Gradio web interface
   - Routes queries to the appropriate tools based on context

4. **`port_recommendations_standalone.py`** - Standalone port recommendations
   - Direct access to the port recommendations agent
   - Useful for testing and scripting

## Key Features

### Port Recommendations
- Automatic redundancy across MLAG pairs (leaf01/leaf02, leaf03/leaf04, etc.)
- Same port numbers across paired devices when possible
- Support for single-port requests (without redundancy)
- Detailed responses with device names and specific port identifiers

### Network Information Retrieval
- Semantic search through network documentation
- Device-specific configuration lookup
- Fabric-wide information queries

## Usage Examples

### Port Recommendations
```python
# Various ways to request ports
"I need an unused port"                           # Returns 2 ports with redundancy
"I need an unused port without redundancy"        # Returns 1 port
"I need to dual connect a server to the network"  # Returns MLAG pair
"What ports are available on leaf01?"             # Device-specific query
```

### General Network Queries
```python
"What is the BGP configuration?"
"Show me the VLAN settings"
"What's the loopback pool configuration?"
```

## Running the System

### Web Interface
```bash
python app.py
```
This launches a Gradio web interface where you can ask questions about the network infrastructure.

### Standalone Port Recommendations
```bash
python port_recommendations_standalone.py
```
This runs a test suite with various port recommendation queries.

### Testing Individual Components
```python
from port_recomendations import port_recommendations_agent
from retriever_tool import retrieve_network_information

# Test retrieval tool
result = retrieve_network_information("unused ports")

# Test port agent (requires async)
import asyncio
from agents import Runner

async def test():
    result = await Runner.run(port_recommendations_agent, "I need a port")
    print(result.final_output)

asyncio.run(test())
```

## File Structure

```
agent-sdk/
├── retriever_tool.py                    # Network information retrieval
├── port_recomendations.py               # Specialized port agent
├── app.py                               # Main orchestrator with Gradio UI
├── port_recommendations_standalone.py   # Standalone port recommendations
├── faiss_index/                         # Vector database
├── prompts.yaml                         # Prompt templates
└── README_AGENTS.md                     # This file
```

## Dependencies

- `openai-agents`: OpenAI Agents SDK
- `langchain-community`: FAISS and embeddings
- `sentence-transformers`: Text embeddings
- `gradio`: Web interface
- `PyYAML`: Configuration files

## Agent Design Principles

Based on the OpenAI Agents SDK documentation:

1. **Function Tools**: The retriever uses the `@function_tool` decorator for automatic tool setup
2. **Agents as Tools**: The port recommendations agent is used as a tool in the main orchestrator
3. **Specialized Instructions**: Each agent has domain-specific instructions and behaviors
4. **Tool Routing**: The main agent routes queries to the appropriate specialized tools

## Port Recommendation Rules

The port recommendations agent follows these key rules:

1. **Default Redundancy**: Always recommend two ports across different devices unless specifically requested otherwise
2. **MLAG Pairing**: Recommend ports across MLAG pairs (odd/even leaf switches)
3. **Port Alignment**: Try to use the same port number across paired devices
4. **Specific Responses**: Include device names and exact port identifiers
5. **Query Scope**: Return data from the leaf switches only

## Future Enhancements

- Add more specialized agents (security policies, VLAN management, etc.)
- Implement caching for frequently requested information
- Add support for configuration changes and validation
- Integrate with network management systems
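The default-redundancy and MLAG-pairing rules described above can be sketched in plain Python. This is a hypothetical helper for illustration only (`mlag_partner` and `redundant_port_pair` are not part of the repository); it assumes the odd/even leaf naming shown in the README (leaf01/leaf02, leaf03/leaf04, ...):

```python
def mlag_partner(leaf: str) -> str:
    """Return the MLAG partner of a leaf switch: odd leaves pair with
    the next even leaf (leaf01/leaf02, leaf03/leaf04, ...)."""
    num = int(leaf.replace("leaf", ""))
    partner = num + 1 if num % 2 == 1 else num - 1
    return f"leaf{partner:02d}"

def redundant_port_pair(leaf: str, port: str) -> list:
    """Recommend the same port number on both members of the MLAG pair
    (the 'Port Alignment' rule)."""
    return [(leaf, port), (mlag_partner(leaf), port)]

print(mlag_partner("leaf01"))                     # leaf02
print(redundant_port_pair("leaf03", "Ethernet10"))
```

A single-port request ("without redundancy") would simply skip the partner lookup and return one `(device, port)` tuple.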
analyze_unused.py
ADDED
@@ -0,0 +1,76 @@
```python
#!/usr/bin/env python3
"""
Analyze where UNUSED interfaces are actually located in the database.
"""

from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

def analyze_unused_locations():
    """Find where UNUSED interfaces are actually stored."""

    print("Analyzing where UNUSED interfaces are located...")
    print("=" * 80)

    # Load the FAISS database
    FAISS_INDEX_PATH = "faiss_index"
    embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
    db = FAISS.load_local(FAISS_INDEX_PATH, embeddings, allow_dangerous_deserialization=True)

    # Search for chunks that actually contain UNUSED
    query = "UNUSED"
    results_with_scores = db.similarity_search_with_score(query, k=15)

    print(f"Query: '{query}'")
    print(f"Found {len(results_with_scores)} results")
    print("=" * 80)

    for i, (doc, score) in enumerate(results_with_scores):
        device_name = doc.metadata.get('device_name', 'Unknown')
        header_path = doc.metadata.get('header_path', 'No header path')
        section_title = doc.metadata.get('section_title', 'No section')

        unused_count = doc.page_content.count('UNUSED')

        if unused_count > 0:  # Only show chunks with UNUSED
            print(f"\nResult {i+1} (Score: {score:.4f}) - {unused_count} UNUSED")
            print(f"  Device: {device_name}")
            print(f"  Header Path: {header_path}")
            print(f"  Section: {section_title}")

            # Show where UNUSED appears in content
            lines = doc.page_content.split('\n')
            unused_lines = [line for line in lines if 'UNUSED' in line]
            print("  UNUSED interfaces found:")
            for line in unused_lines[:3]:  # Show first 3
                print(f"    {line.strip()}")

            # Show broader context
            print(f"  Content preview: {doc.page_content[:200]}...")
            print("-" * 60)

    print("\n" + "=" * 80)
    print("Testing better queries for finding UNUSED interfaces...")

    # Test different queries
    test_queries = [
        "UNUSED interface Ethernet",
        "Ethernet Interfaces Device Configuration UNUSED",
        "interface description UNUSED",
        "switchport access vlan 50 UNUSED"
    ]

    for query in test_queries:
        print(f"\nTesting query: '{query}'")
        results = db.similarity_search_with_score(query, k=3)
        for i, (doc, score) in enumerate(results):
            unused_count = doc.page_content.count('UNUSED')
            if unused_count > 0:
                print(f"  ✅ Result {i+1}: {unused_count} UNUSED (score: {score:.4f})")
                print(f"     Device: {doc.metadata.get('device_name', 'Unknown')}")
                print(f"     Section: {doc.metadata.get('section_title', 'Unknown')}")
            else:
                print(f"  ❌ Result {i+1}: No UNUSED (score: {score:.4f})")

if __name__ == "__main__":
    analyze_unused_locations()
```
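The filtering step in `analyze_unused.py` is plain string processing: count the `UNUSED` markers per chunk and keep only the matching lines. On a small sample it reduces to this (the config snippet below is made up for illustration):

```python
# Hypothetical chunk content, in the style of a device configuration section
sample_chunk = """interface Ethernet5
   description UNUSED
   switchport access vlan 50
interface Ethernet6
   description UNUSED"""

# Same logic as the script: count markers, then keep matching lines
unused_count = sample_chunk.count('UNUSED')
unused_lines = [line.strip() for line in sample_chunk.split('\n') if 'UNUSED' in line]

print(unused_count)   # 2
print(unused_lines)   # ['description UNUSED', 'description UNUSED']
```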
app.py
CHANGED
@@ -5,107 +5,111 @@
Removed (previous smolagents-based implementation; several lines are elided in the diff view):

```python
# "langchain",             # Core Langchain
# "faiss-cpu",             # FAISS vector store
# "sentence-transformers", # For HuggingFaceEmbeddings
# "
# "gradio",
# "einops",
# "smolagents[litellm]",
# # "unstructured" # Required by loader.py, not directly by app.py but good for environment consistency
# ]
# ///

import yaml

# from opentelemetry import trace
# from opentelemetry.sdk.trace import TracerProvider
# from opentelemetry.sdk.trace.export import BatchSpanProcessor
# from openinference.instrumentation.smolagents import SmolagentsInstrumentor
# from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
# from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor
# # Endpoint
# endpoint = "http://0.0.0.0:6006/v1/traces"

# trace_provider = TracerProvider()
# trace_provider.add_span_processor(SimpleSpanProcessor(OTLPSpanExporter(endpoint)))
# SmolagentsInstrumentor().instrument(tracer_provider=trace_provider)

from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import HuggingFaceEmbeddings

FAISS_INDEX_PATH = "faiss_index"
EMBEDDING_MODEL_NAME = "sentence-transformers/all-MiniLM-L6-v2"  # Must match loader.py

description = "Provide information of our network using semantic search."
inputs = {
    "query": {
        "type": "string",
        "description": "The query to perform. This should be semantically close to your target documents. Use the affirmative form rather than a question.",
    }
}
output_type = "string"

def __init__(self, **kwargs):
    super().__init__(**kwargs)
    self.embeddings = HuggingFaceEmbeddings(model_name=EMBEDDING_MODEL_NAME)
    # allow_dangerous_deserialization is recommended for FAISS indexes saved by Langchain
    self.db = FAISS.load_local(
        FAISS_INDEX_PATH,
        self.embeddings,
        allow_dangerous_deserialization=True
    )

# ... (several removed lines elided in the diff view)

    verbosity_level=2,
    grammar=None,
    planning_interval=None,
    name="network_information_agent",
    description="Have access to the network information of our fabric.",
    add_base_tools=False)

# # Example usage
# response = agent.run(
#     "What is the loopback Pool address used by the fabric, how many ip addresses are in use?"
# )
# print(response)
```
Added (new OpenAI Agents SDK implementation):

```python
# "langchain",             # Core Langchain
# "faiss-cpu",             # FAISS vector store
# "sentence-transformers", # For HuggingFaceEmbeddings
# "openai-agents",         # OpenAI Agents SDK
# "gradio[mcp]",
# "gradio",
# # "unstructured" # Required by loader.py, not directly by app.py but good for environment consistency
# ]
# ///

import yaml
import gradio as gr
from agents import Agent, gen_trace_id, Runner, ModelSettings
import asyncio
from textwrap import dedent

# Import the retriever tool and port recommendations agent
from retriever_tool import retrieve_network_information
from port_recomendations import port_recommendations_agent

with open("prompts.yaml", 'r') as stream:
    prompt_templates = yaml.safe_load(stream)

# Create the main orchestrator agent with the port recommendations agent as a tool
main_agent = Agent(
    name="network_agent",
    instructions=dedent("""
        You are a network infrastructure assistant that helps users with various network-related queries.
        You have access to specialized tools and agents:

        1. retrieve_network_information: For general network documentation queries
        2. port_recommendations_tool: For port/interface recommendations and connectivity questions

        Use the appropriate tool based on the user's request:
        - For port recommendations, unused ports, interface questions, or device connectivity: use port_recommendations_tool
        - For general network information, configuration details, or documentation queries: use retrieve_network_information

        Always be helpful, precise, and provide detailed responses based on the tools' output.
    """),
    model="gpt-4o-mini",
    model_settings=ModelSettings(tool_choice="required", temperature=0.0),
    tools=[
        retrieve_network_information,
        port_recommendations_agent.as_tool(
            tool_name="port_recommendations_tool",
            tool_description="Get port and interface recommendations for connecting devices to the network. Use this for questions about unused ports, interface recommendations, or device connectivity."
        )
    ],
)

async def run(query: str):
    """Run the network query process and return the final result."""
    try:
        trace_id = gen_trace_id()
        print(f"View trace: https://platform.openai.com/traces/trace?trace_id={trace_id}")

        result = await Runner.run(
            main_agent,
            f"Query: {query}",
            max_turns=5,
        )
        return result.final_output
    except Exception as e:
        print(f"Error during query processing: {e}")
        return f"An error occurred during processing: {str(e)}"

async def main(query: str):
    result = await run(query)
    print(result)
    return result

def sync_run(query: str):
    """Synchronous wrapper for the async run function for Gradio."""
    return asyncio.run(run(query))

# Gradio Interface
with gr.Blocks(theme=gr.themes.Default(primary_hue="blue")) as ui:
    gr.Markdown("# Network Infrastructure Assistant")
    gr.Markdown("Ask questions about network infrastructure, port recommendations, or device connectivity.")

    with gr.Row():
        with gr.Column():
            query_textbox = gr.Textbox(
                label="Your Question",
                placeholder="e.g., 'I need an unused port for a new server' or 'What's the BGP configuration?'",
                lines=3
            )
            run_button = gr.Button("Ask", variant="primary")

        with gr.Column():
            response_textbox = gr.Textbox(
                label="Response",
                lines=10,
                interactive=False
            )

    # Event handlers
    run_button.click(fn=sync_run, inputs=query_textbox, outputs=response_textbox)
    query_textbox.submit(fn=sync_run, inputs=query_textbox, outputs=response_textbox)

    # Example queries
    gr.Markdown("### Example Queries:")
    gr.Markdown("- I need an unused port for a new server")
    gr.Markdown("- I need to dual connect a server to the network, what ports should I use?")
    gr.Markdown("- What are the BGP settings for the fabric?")
    gr.Markdown("- Show me the VLAN configuration")

if __name__ == "__main__":
    # Test query
    # test_result = asyncio.run(main("I need to dual connect a server to the network, what ports should I use?"))

    # Launch Gradio interface
    ui.launch(inbrowser=True, debug=True, mcp_server=True)
```
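The `sync_run` wrapper in `app.py` uses the standard `asyncio.run` bridge so Gradio's synchronous event handlers can drive the async agent runner. The pattern in isolation looks like this (the `answer` coroutine below is a toy stand-in for `Runner.run(main_agent, ...)`, not the real agent call):

```python
import asyncio

async def answer(query: str) -> str:
    # Stand-in for the agent call; real code awaits Runner.run(...) here
    await asyncio.sleep(0)
    return f"answer to: {query}"

def sync_answer(query: str) -> str:
    """Synchronous wrapper usable as a Gradio event handler."""
    return asyncio.run(answer(query))

print(sync_answer("ping"))  # answer to: ping
```

Note that `asyncio.run` creates a fresh event loop per call, which is why the wrapper must not itself be invoked from inside an already-running loop.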
explore_metadata.py
ADDED
@@ -0,0 +1,134 @@
```python
#!/usr/bin/env python3
"""
Test script to explore all metadata fields available in the FAISS database chunks.
"""

import os
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

# Configuration
FAISS_INDEX_PATH = "faiss_index"
EMBEDDINGS_MODEL_NAME = "sentence-transformers/all-MiniLM-L6-v2"

def explore_metadata():
    """Explore all metadata fields available in the database chunks."""
    print("EXPLORING METADATA IN FAISS DATABASE")
    print("=" * 60)

    if not os.path.exists(FAISS_INDEX_PATH):
        print(f"❌ Error: FAISS index not found at {FAISS_INDEX_PATH}")
        return False

    try:
        embeddings = HuggingFaceEmbeddings(model_name=EMBEDDINGS_MODEL_NAME)
        vector_db = FAISS.load_local(FAISS_INDEX_PATH, embeddings, allow_dangerous_deserialization=True)
        print(f"✅ Successfully loaded FAISS index from {FAISS_INDEX_PATH}")
    except Exception as e:
        print(f"❌ Error loading FAISS index: {e}")
        return False

    # Get a sample of documents to analyze metadata
    sample_queries = [
        "Ethernet Interfaces Summary",
        "UNUSED",
        "interface configuration",
        "device information",
        "fabric"
    ]

    all_metadata_keys = set()
    metadata_examples = {}

    print("\nSampling documents to analyze metadata...")
    print("-" * 40)

    for query in sample_queries:
        try:
            results = vector_db.similarity_search_with_score(query, k=3)

            for doc, score in results:
                if doc.metadata:
                    # Collect all metadata keys
                    all_metadata_keys.update(doc.metadata.keys())

                    # Store examples of each metadata field
                    for key, value in doc.metadata.items():
                        if key not in metadata_examples:
                            metadata_examples[key] = []
                        if value not in metadata_examples[key]:
                            metadata_examples[key].append(value)

        except Exception as e:
            print(f"Error with query '{query}': {e}")

    # Display metadata analysis
    print("\nMETADATA ANALYSIS")
    print("=" * 60)
    print(f"Total unique metadata keys found: {len(all_metadata_keys)}")
    print(f"Metadata keys: {sorted(all_metadata_keys)}")

    print("\nDETAILED METADATA FIELDS:")
    print("-" * 40)

    for key in sorted(all_metadata_keys):
        examples = metadata_examples.get(key, [])
        print(f"\nField: '{key}'")
        print(f"  Unique values found: {len(examples)}")
        print("  Example values:")
        for i, example in enumerate(examples[:5]):  # Show max 5 examples
            print(f"    {i+1}: {repr(example)}")
        if len(examples) > 5:
            print(f"    ... and {len(examples) - 5} more")

    # Show some detailed examples
    print("\nSAMPLE DOCUMENTS WITH FULL METADATA:")
    print("-" * 40)

    # Get a few documents to show complete metadata
    sample_results = vector_db.similarity_search_with_score("Ethernet", k=3)

    for i, (doc, score) in enumerate(sample_results):
        print(f"\n[SAMPLE {i+1}]")
        print(f"Score: {score:.4f}")
        print(f"Content Length: {len(doc.page_content)} characters")
        print(f"Content Preview: {doc.page_content[:100].replace(chr(10), ' ')}...")
        print("Complete Metadata:")
        if doc.metadata:
            for key, value in sorted(doc.metadata.items()):
                print(f"  {key}: {repr(value)}")
        else:
            print("  No metadata found")
        print("-" * 30)

    # Analysis summary
    print("\nSUMMARY:")
    print("=" * 60)

    device_docs = len([ex for ex in metadata_examples.get('device_name', []) if ex])
    source_files = len(metadata_examples.get('source', []))

    print(f"• Device documents found: {device_docs}")
    print(f"• Source files found: {source_files}")

    if 'device_name' in all_metadata_keys:
        print(f"• Device names: {metadata_examples.get('device_name', [])}")

    if 'source' in all_metadata_keys:
        print(f"• Source file types: {set(f.split('.')[-1] if '.' in f else 'unknown' for f in metadata_examples.get('source', []))}")

    return True

def main():
    """Run the metadata exploration."""
    success = explore_metadata()

    if success:
        print("\n✅ Metadata exploration completed successfully!")
        return 0
    else:
        print("\n❌ Metadata exploration failed")
        return 1

if __name__ == "__main__":
    exit(main())
```
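The key-collection loop in `explore_metadata.py` is independent of FAISS: it is set/dict bookkeeping over each document's `metadata` mapping. With stub documents it reduces to the following (the `Doc` dataclass and the sample metadata values are hypothetical stand-ins for LangChain `Document` objects):

```python
from dataclasses import dataclass, field

@dataclass
class Doc:
    # Minimal stand-in for a LangChain Document's metadata attribute
    metadata: dict = field(default_factory=dict)

docs = [
    Doc({'source': 'DCX-leaf01.md', 'device_name': 'DCX-leaf01'}),
    Doc({'source': 'fabric.md', 'header_path': 'Fabric > BGP'}),
]

all_metadata_keys = set()
metadata_examples = {}
for doc in docs:
    all_metadata_keys.update(doc.metadata.keys())
    for key, value in doc.metadata.items():
        metadata_examples.setdefault(key, [])
        if value not in metadata_examples[key]:
            metadata_examples[key].append(value)

print(sorted(all_metadata_keys))
# ['device_name', 'header_path', 'source']
```

The deduplication check (`if value not in metadata_examples[key]`) keeps one example list per field, preserving first-seen order.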
faiss_index/index.faiss
CHANGED
@@ -1,3 +1,3 @@
```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:e2bb0a47cc3c04d9b19379506d84d0d35ba2bdfbdae110574ea79aca0f01ce5f
+size 612909
```
faiss_index/index.pkl
CHANGED
@@ -1,3 +1,3 @@
```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:d712088b22f569606162167cde236f005479c49932000ae661f4d6c0a70da9e2
+size 317433
```
loader.py
CHANGED
@@ -1,49 +1,139 @@
Removed/changed (old version; many lines are elided in the diff view):

```python
"""
"""

import os
from langchain_community.document_loaders import UnstructuredMarkdownLoader
from langchain.text_splitter import MarkdownHeaderTextSplitter, RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

# Define the paths to your documentation folders
DOCS_DIR = "documentation"
DEVICE_DOCS_PATH = os.path.join(DOCS_DIR, "devices")
FABRIC_DOCS_PATH = os.path.join(DOCS_DIR, "fabric")
FAISS_INDEX_PATH = "faiss_index"

def  # ... (old loader function, signature elided in the diff view)
    """
    Device name is stored in metadata if applicable.
    """
    for file_path in file_paths:
        # ... (body elided)

def create_vector_db():
    """
    Scans documentation folders, loads MD files
    and saves a FAISS vector database.
    """
    markdown_files = []
    for root, _, files in os.walk(DEVICE_DOCS_PATH):
        for file in files:
            if file.endswith(".md"):
```

@@ -60,64 +150,37 @@ def create_vector_db():

```python
    print(f"Found {len(markdown_files)} markdown files to process.")

    # Load documents
    documents =  # ... (elided)
    print(f"  # ... (elided)

    # Define headers to split on
    headers_to_split_on = [
        ("#", "header1"),
        ("##", "header2"),
        ("###", "header3"),
    ]

    # ... (header splitting loop, partially elided)
                # Copy metadata from original document
                split_doc.metadata.update(doc.metadata)
                header_split_docs.extend(header_split)
        except Exception as e:
            print(f"Warning: Could not split by headers: {e}")
            # If header splitting fails, keep the original document
            header_split_docs.append(doc)

    # Then do recursive character splitting with smaller chunks and larger overlap
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=200)
    texts = text_splitter.split_documents(header_split_docs)
    print(f"Split documents into {len(texts)} chunks.")

    # Add device context to each chunk's page_content if it's from a device file
    for text_chunk in texts:
        if 'device_name' in text_chunk.metadata:
            device_name = text_chunk.metadata['device_name']
            # Prepend device name to the content of the chunk
            # Ensure it's not already prepended (e.g. if a header itself was the device name)
            if not text_chunk.page_content.strip().startswith(f"Device: {device_name}"):
                text_chunk.page_content = f"Device: {device_name}\n\n{text_chunk.page_content}"

    embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
    print("Embeddings model loaded.")

    # Create FAISS vector store
    if not  # ... (condition elided)
        print("No  # ... (elided)
        return

    print("Creating FAISS index...")
    vector_db = FAISS.from_documents(  # ... (arguments elided)
    print("FAISS index created.")

    # Save FAISS index
    vector_db.save_local(FAISS_INDEX_PATH)
    print(f"FAISS index saved to {FAISS_INDEX_PATH}")

if __name__ == "__main__":
    create_vector_db()
```
121 |
|
122 |
if __name__ == "__main__":
|
123 |
create_vector_db()
|
|
|
1 |
+
# /// script
|
2 |
+
# dependencies = [
|
3 |
+
# "langchain_community",
|
4 |
+
# "langchain_core",
|
5 |
+
# ]
|
6 |
+
# ///
|
7 |
"""
|
8 |
+
Enhanced loader script for creating FAISS vector database from Markdown documentation
|
9 |
+
with improved header metadata extraction.
|
10 |
"""
|
11 |
|
12 |
import os
|
13 |
+
import re
|
14 |
from langchain_community.document_loaders import UnstructuredMarkdownLoader
|
15 |
from langchain.text_splitter import MarkdownHeaderTextSplitter, RecursiveCharacterTextSplitter
|
16 |
from langchain_community.embeddings import HuggingFaceEmbeddings
|
17 |
from langchain_community.vectorstores import FAISS
|
18 |
+
from langchain_core.documents import Document
|
19 |
|
|
|
20 |
DOCS_DIR = "documentation"
|
21 |
DEVICE_DOCS_PATH = os.path.join(DOCS_DIR, "devices")
|
22 |
FABRIC_DOCS_PATH = os.path.join(DOCS_DIR, "fabric")
|
23 |
FAISS_INDEX_PATH = "faiss_index"
|
24 |
|
25 |
+
def extract_header_context(content, chunk_start_pos):
|
26 |
"""
|
27 |
+
Extract the header hierarchy for a given position in the markdown content.
|
28 |
+
Returns a dict with header levels and creates header_path and section_title.
|
|
|
29 |
"""
|
30 |
+
lines = content[:chunk_start_pos].split('\n')
|
31 |
+
headers = {}
|
32 |
+
|
33 |
+
# Track the current header hierarchy
|
34 |
+
for line in lines:
|
35 |
+
line = line.strip()
|
36 |
+
if line.startswith('#') and not line.startswith('#!'): # Exclude shebang
|
37 |
+
# Count the number of # to determine header level
|
38 |
+
level = len(line) - len(line.lstrip('#'))
|
39 |
+
if 1 <= level <= 5: # Only process header levels 1-5
|
40 |
+
header_text = line.lstrip('#').strip()
|
41 |
+
headers[f'header{level}'] = header_text
|
42 |
+
# Clear lower level headers when we encounter a higher level
|
43 |
+
for i in range(level + 1, 6):
|
44 |
+
if f'header{i}' in headers:
|
45 |
+
del headers[f'header{i}']
|
46 |
+
|
47 |
+
return headers
|
48 |
+
|
49 |
+
def enhance_chunk_metadata(chunk, original_content, chunk_position, file_metadata):
|
50 |
+
"""
|
51 |
+
Enhance a chunk with header metadata and other contextual information.
|
52 |
+
"""
|
53 |
+
# Start with file-level metadata
|
54 |
+
enhanced_metadata = file_metadata.copy()
|
55 |
+
|
56 |
+
# Extract header context for this chunk position
|
57 |
+
header_context = extract_header_context(original_content, chunk_position)
|
58 |
+
enhanced_metadata.update(header_context)
|
59 |
+
|
60 |
+
# Create header path from all header levels
|
61 |
+
header_path_parts = []
|
62 |
+
for i in range(1, 6): # header1 through header5
|
63 |
+
if f'header{i}' in enhanced_metadata:
|
64 |
+
header_path_parts.append(enhanced_metadata[f'header{i}'])
|
65 |
+
|
66 |
+
if header_path_parts:
|
67 |
+
enhanced_metadata['header_path'] = " > ".join(header_path_parts)
|
68 |
+
enhanced_metadata['section_title'] = header_path_parts[-1] # Most specific header
|
69 |
+
|
70 |
+
return enhanced_metadata
|
71 |
+
|
72 |
+
def load_markdown_documents_with_headers(file_paths):
|
73 |
+
"""
|
74 |
+
Loads markdown documents and creates chunks with enhanced header metadata.
|
75 |
+
"""
|
76 |
+
all_documents = []
|
77 |
+
|
78 |
for file_path in file_paths:
|
79 |
+
print(f"Processing: {os.path.basename(file_path)}")
|
80 |
+
|
81 |
+
# Read the raw markdown content
|
82 |
+
with open(file_path, 'r', encoding='utf-8') as f:
|
83 |
+
content = f.read()
|
84 |
+
|
85 |
+
# Create base metadata for this file
|
86 |
+
file_metadata = {
|
87 |
+
'source': os.path.basename(file_path)
|
88 |
+
}
|
89 |
+
|
90 |
+
# Add device_name if it's a device file
|
91 |
+
if 'DCX-' in os.path.basename(file_path):
|
92 |
+
file_metadata['device_name'] = os.path.basename(file_path).replace('.md', '')
|
93 |
+
|
94 |
+
# Split content into chunks using RecursiveCharacterTextSplitter
|
95 |
+
text_splitter = RecursiveCharacterTextSplitter(
|
96 |
+
chunk_size=800,
|
97 |
+
chunk_overlap=200,
|
98 |
+
separators=["\n## ", "\n### ", "\n#### ", "\n##### ", "\n\n", "\n", " ", ""]
|
99 |
+
)
|
100 |
+
|
101 |
+
chunks = text_splitter.split_text(content)
|
102 |
+
|
103 |
+
for chunk in chunks:
|
104 |
+
# Find the position of this chunk in the original content
|
105 |
+
chunk_position = content.find(chunk)
|
106 |
+
if chunk_position == -1:
|
107 |
+
# If exact match not found, try finding a shorter prefix
|
108 |
+
chunk_start = chunk[:min(100, len(chunk))]
|
109 |
+
chunk_position = content.find(chunk_start)
|
110 |
+
if chunk_position == -1:
|
111 |
+
chunk_position = 0 # Fallback to beginning
|
112 |
+
|
113 |
+
# Enhance metadata with header context
|
114 |
+
enhanced_metadata = enhance_chunk_metadata(chunk, content, chunk_position, file_metadata)
|
115 |
+
|
116 |
+
# Add device context to content if it's a device file
|
117 |
+
final_content = chunk
|
118 |
+
if 'device_name' in enhanced_metadata:
|
119 |
+
device_name = enhanced_metadata['device_name']
|
120 |
+
if not chunk.strip().startswith(f"Device: {device_name}"):
|
121 |
+
final_content = f"Device: {device_name}\\n\\n{chunk}"
|
122 |
+
|
123 |
+
# Create document with enhanced metadata
|
124 |
+
doc = Document(page_content=final_content, metadata=enhanced_metadata)
|
125 |
+
all_documents.append(doc)
|
126 |
+
|
127 |
+
return all_documents
|
128 |
|
129 |
def create_vector_db():
|
130 |
"""
|
131 |
+
Scans documentation folders, loads MD files with enhanced header metadata,
|
132 |
+
creates embeddings, and saves a FAISS vector database.
|
133 |
"""
|
134 |
markdown_files = []
|
135 |
+
|
136 |
+
# Collect all markdown files
|
137 |
for root, _, files in os.walk(DEVICE_DOCS_PATH):
|
138 |
for file in files:
|
139 |
if file.endswith(".md"):
|
|
|
150 |
|
151 |
print(f"Found {len(markdown_files)} markdown files to process.")
|
152 |
|
153 |
+
# Load documents with enhanced header metadata
|
154 |
+
documents = load_markdown_documents_with_headers(markdown_files)
|
155 |
+
print(f"Created {len(documents)} document chunks with header metadata.")
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
156 |
|
157 |
+
# Debug: Print sample metadata from first few chunks
|
158 |
+
print("\\nSample metadata from first 3 chunks:")
|
159 |
+
for i, doc in enumerate(documents[:3]):
|
160 |
+
print(f"\\nChunk {i+1}:")
|
161 |
+
print(f" Source: {doc.metadata.get('source', 'Unknown')}")
|
162 |
+
print(f" Device: {doc.metadata.get('device_name', 'N/A')}")
|
163 |
+
print(f" Header Path: {doc.metadata.get('header_path', 'No headers')}")
|
164 |
+
print(f" Section Title: {doc.metadata.get('section_title', 'No section')}")
|
165 |
+
print(f" Content Preview: {doc.page_content[:100]}...")
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
166 |
|
167 |
+
print("\\nCreating FAISS vector database...")
|
168 |
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
|
169 |
print("Embeddings model loaded.")
|
170 |
|
171 |
# Create FAISS vector store
|
172 |
+
if not documents:
|
173 |
+
print("No documents to process for FAISS index.")
|
174 |
return
|
175 |
|
176 |
print("Creating FAISS index...")
|
177 |
+
vector_db = FAISS.from_documents(documents, embeddings)
|
178 |
print("FAISS index created.")
|
179 |
|
180 |
# Save FAISS index
|
181 |
vector_db.save_local(FAISS_INDEX_PATH)
|
182 |
print(f"FAISS index saved to {FAISS_INDEX_PATH}")
|
183 |
+
print(f"Total chunks in database: {len(documents)}")
|
184 |
|
185 |
if __name__ == "__main__":
|
186 |
create_vector_db()
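As a quick sanity check, the header-tracking logic in loader.py can be exercised on a toy document. This is a standalone copy of `extract_header_context` so it runs without FAISS or langchain; the sample markdown is made up for illustration:

```python
# Standalone copy of loader.py's header-tracking logic, for a quick sanity check.
def extract_header_context(content, chunk_start_pos):
    """Return the markdown header hierarchy active at chunk_start_pos."""
    headers = {}
    for line in content[:chunk_start_pos].split('\n'):
        line = line.strip()
        if line.startswith('#') and not line.startswith('#!'):  # exclude shebang
            level = len(line) - len(line.lstrip('#'))
            if 1 <= level <= 5:
                headers[f'header{level}'] = line.lstrip('#').strip()
                # A new header invalidates any deeper headers seen earlier
                for i in range(level + 1, 6):
                    headers.pop(f'header{i}', None)
    return headers

sample = "# Device\n## Ethernet Interfaces Summary\ntext\n## BGP\nmore"
pos = sample.find("text")
print(extract_header_context(sample, pos))
```

A chunk starting at "text" should see `header1 = Device` and `header2 = Ethernet Interfaces Summary`, while a chunk starting at "more" sees `header2 = BGP` with the deeper levels cleared.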
|
model.py
ADDED
@@ -0,0 +1,16 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
from openai import AsyncOpenAI
|
2 |
+
from dotenv import load_dotenv
|
3 |
+
import os
|
4 |
+
from agents import OpenAIChatCompletionsModel
|
5 |
+
|
6 |
+
load_dotenv(override=True)
|
7 |
+
|
8 |
+
google_api_key = os.getenv('GOOGLE_API_KEY')
|
9 |
+
GEMINI_BASE_URL = "https://generativelanguage.googleapis.com/v1beta/openai/"
|
10 |
+
gemini_client = AsyncOpenAI(base_url=GEMINI_BASE_URL, api_key=google_api_key)
|
11 |
+
gemini_model = OpenAIChatCompletionsModel(model="gemini-2.0-flash", openai_client=gemini_client)
|
12 |
+
|
13 |
+
qwen_api_key = "blah"
|
14 |
+
qwen_base_url = "http://localhost:11434/v1"
|
15 |
+
qwen_client = AsyncOpenAI(base_url=qwen_base_url, api_key=qwen_api_key)
|
16 |
+
qwen_model = OpenAIChatCompletionsModel(model="qwen3:14b", openai_client=qwen_client)
|
port_recomendations.py
ADDED
@@ -0,0 +1,147 @@
from agents import Agent, Tool, function_tool
from retriever_tool import db
from textwrap import dedent
from model import gemini_model, qwen_model

@function_tool
def unused_ports() -> str:
    """Get information about unused Ethernet interfaces across all network devices.

    This tool queries for unused/available ports in the network infrastructure
    by filtering documents with 'Ethernet Interfaces Summary' headers and UNUSED interfaces.
    Only returns results from leaf switches (devices with 'LEAF' in their name).
    """
    # Use metadata-based filtering instead of similarity search:
    # fetch every document in the vectorstore
    all_docs_with_scores = db.similarity_search_with_score("", k=db.index.ntotal)

    # Filter documents by header metadata and device type
    target_header = "Ethernet Interfaces Summary"
    matching_docs = []

    for doc, score in all_docs_with_scores:
        section_title = doc.metadata.get('section_title', '')
        header_path = doc.metadata.get('header_path', '')
        device_name = doc.metadata.get('device_name', '')

        # Keep only:
        # 1. Documents with "Ethernet Interfaces Summary" in the header path
        # 2. Documents from LEAF devices
        # 3. Documents that contain UNUSED interfaces
        is_ethernet_summary = (target_header == section_title or target_header in header_path)
        is_leaf_device = 'LEAF' in device_name.upper()
        has_unused = 'UNUSED' in doc.page_content

        if is_ethernet_summary and is_leaf_device and has_unused:
            matching_docs.append((doc, score))

    if not matching_docs:
        return "No unused interface information found in Ethernet Interfaces Summary sections of leaf devices."

    # Track devices and their unused interfaces
    device_unused = {}

    for doc, score in matching_docs:
        device_name = doc.metadata.get('device_name')
        source = doc.metadata.get('source', 'Unknown source')
        header_path = doc.metadata.get('header_path', 'No header path')
        section_title = doc.metadata.get('section_title', 'No section title')

        if device_name:
            # Count UNUSED interfaces in this chunk
            unused_count = doc.page_content.count('UNUSED')

            if unused_count > 0:
                if device_name not in device_unused:
                    device_unused[device_name] = {
                        'total_unused': 0,
                        'sections': [],
                        'source': source
                    }

                device_unused[device_name]['total_unused'] += unused_count
                device_unused[device_name]['sections'].append({
                    'section': section_title,
                    'header_path': header_path,
                    'count': unused_count,
                    'score': score
                })

    # Format the response with enhanced metadata
    response = ""
    for doc, score in matching_docs:
        device_name = doc.metadata.get('device_name', 'Unknown device')
        source = doc.metadata.get('source', 'Unknown source')
        section_title = doc.metadata.get('section_title', 'No section title')
        header_path = doc.metadata.get('header_path', 'No header path')

        response += "---\n"
        response += f"Device: {device_name} (Source: {source})\n"
        response += f"Section: {section_title}\n"
        response += f"Path: {header_path}\n\n"

        # Attach only the lines that actually contain UNUSED interfaces
        for line in doc.page_content.splitlines():
            if 'UNUSED' in line:
                response += line + "\n"

        response += "\n"

    print(f"Retrieved {len(matching_docs)} filtered results from Ethernet Interfaces Summary sections on leaf devices")
    return response

# Port-recommendation-specific instructions
port_recommendation_instructions = dedent("""
You are an expert network assistant specialized in port and interface recommendations.
Your role is to use the available tools to answer questions about port/interface recommendations for connecting new devices to the network infrastructure.

Key responsibilities:
- Port and interface are synonymous terms (users may use them interchangeably)
- Always use the unused_ports tool to find unused interface ports before making recommendations
- Provide specific device names and port numbers in your recommendations
- Be detailed and precise in your responses

Port Recommendation Rules:
1. If not specified otherwise, always recommend TWO ports across different devices for redundancy
2. Recommend ports across devices that form an MLAG or LACP group
3. Leaf switches are in MLAG pairs: odd-numbered leaf (leaf01, leaf03, etc.) paired with even-numbered leaf (leaf02, leaf04, etc.)
4. Try to select the same interface port number across paired devices (e.g., if recommending port 25 on leaf01, also recommend port 25 on leaf02)
5. Include device names and specific port identifiers in your response
6. If the user specifically requests a single port or "without redundancy", recommend only one port
7. Only recommend ports that have the description "UNUSED".

Response Format:
- Always query for unused ports first using unused_ports
- Provide clear, actionable recommendations with device names and port numbers
- Explain the reasoning behind your recommendations when relevant

Examples:
User: "I need an unused port"
Response: After checking available ports, I recommend using Ethernet1/25 on leaf01 and Ethernet1/25 on leaf02 for redundancy.

User: "I need an unused port without redundancy"
Response: After checking available ports, I recommend using Ethernet1/26 on leaf01.

User: "I need to dual connect a server to the network, what ports should I use?"
Response: For dual-connecting a server, I recommend using Ethernet1/27 on leaf01 and Ethernet1/27 on leaf02, which will provide MLAG redundancy.

User: "I need to connect two servers to the network, what ports should I use?"
Response: For connecting two servers, I recommend using Ethernet1/28-29 on leaf01 and Ethernet1/28-29 on leaf02 for redundancy.
""")

# Create the specialized port recommendations agent
port_recommendations_agent = Agent(
    name="port_recommendations_agent",
    instructions=port_recommendation_instructions,
    model=qwen_model,
    tools=[unused_ports],
)
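The MLAG pairing rule from the agent instructions (odd-numbered leaf pairs with the next even-numbered leaf, and vice versa) can be sketched as a small helper. `mlag_peer` is a hypothetical name used only for illustration; the agent itself derives the pairing from its prompt:

```python
import re

def mlag_peer(device_name: str) -> str:
    """Return the MLAG partner for a leaf switch name like 'leaf01' or 'DCX-LEAF03'.

    Rule sketch: odd-numbered leaves pair with the next even number (leaf01 -> leaf02),
    even-numbered leaves pair with the previous odd number (leaf04 -> leaf03).
    """
    match = re.fullmatch(r'(?i)(.*leaf)(\d+)', device_name)
    if not match:
        raise ValueError(f"not a leaf switch name: {device_name}")
    prefix, digits = match.group(1), match.group(2)
    num = int(digits)
    peer = num + 1 if num % 2 == 1 else num - 1
    # Preserve zero-padding width of the original name
    return f"{prefix}{peer:0{len(digits)}d}"
```

With this sketch, a recommendation for port 25 on `leaf01` would pair with port 25 on `mlag_peer("leaf01")`, i.e. `leaf02`.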
port_recommendations_standalone.py
ADDED
@@ -0,0 +1,41 @@
import asyncio
from agents import Runner, trace, gen_trace_id
from port_recomendations import port_recommendations_agent

async def get_port_recommendation(query: str):
    """Get port recommendations using the specialized agent"""
    trace_id = gen_trace_id()
    with trace("Port Recommendation", trace_id=trace_id):
        print(f"View trace: https://platform.openai.com/traces/trace?trace_id={trace_id}")
        try:
            result = await Runner.run(
                port_recommendations_agent,
                f"Query: {query}",
                max_turns=5,
            )
            return result.final_output
        except Exception as e:
            print(f"Error during port recommendation: {e}")
            return f"An error occurred: {str(e)}"

async def main():
    """Test the port recommendations agent with various queries"""
    test_queries = [
        "I need an unused port",
        "I need an unused port without redundancy",
        "I need to dual connect a server to the network, what ports should I use?",
        "What ports are available on leaf01?",
        "I need 4 ports for a new switch connection"
    ]

    for query in test_queries:
        print(f"\n{'='*60}")
        print(f"Query: {query}")
        print(f"{'='*60}")
        result = await get_port_recommendation(query)
        print(f"Recommendation: {result}")
        print()

if __name__ == "__main__":
    asyncio.run(main())
requirements.txt
CHANGED
@@ -8,4 +8,5 @@ einops
 langchain-community
 langchain
 faiss-cpu
-unstructured
+unstructured
+gradio[mcp]
retriever_tool.py
ADDED
@@ -0,0 +1,49 @@
from langchain_community.vectorstores import FAISS
try:
    from langchain_huggingface import HuggingFaceEmbeddings
except ImportError:
    # Fallback to deprecated import if langchain-huggingface is not installed
    from langchain_community.embeddings import HuggingFaceEmbeddings
from agents import function_tool

FAISS_INDEX_PATH = "faiss_index"
EMBEDDING_MODEL_NAME = "sentence-transformers/all-MiniLM-L6-v2"  # Must match loader.py

# Initialize embeddings and vector store
embeddings = HuggingFaceEmbeddings(model_name=EMBEDDING_MODEL_NAME)
db = FAISS.load_local(
    FAISS_INDEX_PATH,
    embeddings,
    allow_dangerous_deserialization=True
)

@function_tool
def retrieve_network_information(query: str) -> str:
    """Provide information about our network using semantic search.

    Args:
        query: The query to search for in the network documentation.
               This should be semantically close to your target documents.
               Use the affirmative form rather than a question.
    """
    results_with_scores = db.similarity_search_with_score(query, k=10)

    response = ""
    if not results_with_scores:
        return "No relevant information found in the documentation for your query."

    for doc, score in results_with_scores:
        device_name = doc.metadata.get('device_name')
        source = doc.metadata.get('source', 'Unknown source')

        if device_name:
            response += f"Device: {device_name} (Source: {source}, Score: {score:.4f})\n"
        else:
            # If there is no device_name, assume it's global/fabric information
            response += f"Global/Fabric Info (Source: {source}, Score: {score:.4f})\n"
        response += f"Result: {doc.page_content}\n\n"

    print(f"Retrieved {len(results_with_scores)} results for query: '{query}'")
    return response
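The formatting loop in `retrieve_network_information` can be exercised in isolation with a stand-in `Document` class, so it runs without langchain or a FAISS index. `format_results` is a hypothetical helper extracted for this sketch, and the sample document is invented:

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    """Minimal stand-in for langchain's Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def format_results(results_with_scores):
    """Mirror of the response-building loop in retrieve_network_information."""
    if not results_with_scores:
        return "No relevant information found in the documentation for your query."
    response = ""
    for doc, score in results_with_scores:
        device_name = doc.metadata.get('device_name')
        source = doc.metadata.get('source', 'Unknown source')
        if device_name:
            response += f"Device: {device_name} (Source: {source}, Score: {score:.4f})\n"
        else:
            response += f"Global/Fabric Info (Source: {source}, Score: {score:.4f})\n"
        response += f"Result: {doc.page_content}\n\n"
    return response

docs = [(Document("Ethernet1/25 UNUSED",
                  {"device_name": "DCX-LEAF01", "source": "DCX-LEAF01.md"}), 0.42)]
print(format_results(docs))
```

Documents without a `device_name` fall into the Global/Fabric branch, which is how fabric-level chunks from loader.py are surfaced.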
setup.py
ADDED
@@ -0,0 +1,131 @@
#!/usr/bin/env python3
"""
Setup script for the AI Agent System for Port Recommendations.
Helps configure the environment and run the system.
"""

import os
import subprocess
import sys

# Map pip package names to their importable module names where they differ
IMPORT_NAMES = {
    "openai-agents": "agents",
    "faiss-cpu": "faiss",
    "PyYAML": "yaml",
}

def check_dependencies():
    """Check if required dependencies are installed"""
    print("🔍 Checking dependencies...")

    required_packages = [
        "gradio",
        "openai-agents",
        "faiss-cpu",
        "langchain",
        "langchain-community",
        "sentence-transformers",
        "PyYAML"
    ]

    missing_packages = []

    for package in required_packages:
        module = IMPORT_NAMES.get(package, package.replace("-", "_"))
        try:
            __import__(module)
            print(f"✅ {package}")
        except ImportError:
            print(f"❌ {package} (missing)")
            missing_packages.append(package)

    if missing_packages:
        print(f"\n📦 Missing packages: {', '.join(missing_packages)}")
        print("Install with: pip install " + " ".join(missing_packages))
        return False

    print("✅ All dependencies installed!")
    return True

def check_environment():
    """Check environment configuration"""
    print("\n🔧 Checking environment...")

    # Check for OpenAI API key
    if os.getenv("OPENAI_API_KEY"):
        print("✅ OPENAI_API_KEY is set")
        api_key_status = True
    else:
        print("❌ OPENAI_API_KEY is not set")
        print("   Set it with: export OPENAI_API_KEY='your-api-key-here'")
        api_key_status = False

    # Check for FAISS index
    if os.path.exists("faiss_index"):
        print("✅ FAISS index exists")
        faiss_status = True
    else:
        print("❌ FAISS index not found")
        print("   Run the loader script first to create the index")
        faiss_status = False

    return api_key_status and faiss_status

def show_usage_examples():
    """Show usage examples"""
    print("\n📚 Usage Examples:")
    print("=" * 50)

    examples = [
        {
            "query": "I need an unused port for a new server",
            "description": "Single port recommendation"
        },
        {
            "query": "I need to dual connect a server to the network, what ports should I use?",
            "description": "MLAG dual connection recommendation"
        },
        {
            "query": "What are the BGP settings for the fabric?",
            "description": "General network information query"
        },
        {
            "query": "Show me available interfaces on switch01",
            "description": "Device-specific port query"
        }
    ]

    for i, example in enumerate(examples, 1):
        print(f"{i}. {example['description']}")
        print(f"   Query: \"{example['query']}\"")
        print()

def main():
    """Main setup function"""
    print("🚀 AI Agent System Setup")
    print("=" * 50)

    deps_ok = check_dependencies()
    env_ok = check_environment()

    print("\n📋 Setup Summary:")
    print("=" * 30)

    if deps_ok and env_ok:
        print("✅ System ready to run!")
        print("\n🎯 To start the system:")
        print("   python app.py")
        print("\n🌐 Web interface will be available at:")
        print("   http://127.0.0.1:7862")
    else:
        print("⚠️ Setup incomplete")
        if not deps_ok:
            print("   - Install missing dependencies")
        if not env_ok:
            print("   - Configure environment variables")
            print("   - Create FAISS index if needed")

    show_usage_examples()

    print("\n📚 Additional Resources:")
    print("- README_AGENTS.md - Architecture documentation")
    print("- IMPLEMENTATION_SUMMARY.md - Implementation details")
    print("- validate_system.py - System validation tests")
    print("- demo.py - Interactive demonstration")

if __name__ == "__main__":
    main()