---
title: iLearn
emoji: 🧠
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 5.33.1
app_file: app.py
pinned: true
short_description: AI that learns over time. Watch it learn.
tag: agent-demo-track
---

# Documentation: iLearn - An Autonomous AI Research Agent

## 1. Overview

**iLearn** is a sophisticated, self-improving AI research agent designed to perform complex tasks, manage its own knowledge, and learn from its interactions. Built with a modular architecture, it combines a powerful chat interface with long-term memory, dynamic rule-based guidance, and external tools like web search.

The core principle of iLearn is its "learning loop": after every interaction, it reflects on its performance, generates insights, and updates its internal "rules" to improve future responses. This allows the agent to adapt, refine its identity, and build a persistent, evolving knowledge base.

---

## 2. Key Features

*   **Self-Improving Knowledge Base**: The agent analyzes its conversations to generate and refine its guiding principles ("rules"), enabling it to learn from successes and failures.
*   **Long-Term Memory**: Utilizes a semantic memory system to recall past interactions, ensuring context and continuity across conversations.
*   **Persistent & Pluggable Memory**: Supports multiple backends for storing memories and rules, including volatile RAM, a local SQLite database, or a persistent Hugging Face Dataset repository.
*   **Intelligent Tool Use**: Employs an LLM to decide the best course of action for a given query, choosing between a direct response, a web search, or retrieving information from its memory.
*   **Multi-Provider LLM Support**: Integrates with a wide range of LLM providers (Groq, OpenRouter, OpenAI, Google, etc.) through a unified API handler, configured via `models.json`.
*   **Web Research Capabilities**: Can perform web searches (via DuckDuckGo and Google) and scrape page content to answer questions with up-to-date information, citing its sources.
*   **Comprehensive UI**: A Gradio interface provides a seamless chat experience and includes a "Knowledge Base" tab for directly managing the AI's rules and memories.
*   **Extensive Configuration**: Highly configurable through environment variables and in-app settings, allowing control over everything from the default system prompt to the models used for decision-making.

---

## 3. How It Works: The Interaction & Learning Loop

The application follows a distinct, multi-stage process for each user interaction.


*(A conceptual flow diagram)*

1.  **Initial Rule Retrieval (RAG)**: When a user sends a message, the system first performs a semantic search on its existing **Rules** to find the most relevant guiding principles for the current context. This forms the initial "guidelines" for the AI's response.

2.  **Tool Decision**: The agent uses a fast, lightweight LLM (e.g., Llama 3 8B via Groq) to decide on the best action. The LLM is given the user's query, recent chat history, and the retrieved rules, and is asked to choose one of the following actions:
    *   `quick_respond`: Answer directly using the provided context.
    *   `search_duckduckgo_and_report`: Perform a web search to gather information.
    *   `scrape_url_and_report`: Scrape content from a specific URL provided by the user.
    *   `answer_using_conversation_memory`: Perform a semantic search on past conversations to find relevant information.

3.  **Action Execution & Prompt Assembly**: Based on the chosen action, the system executes the task:
    *   If searching/scraping, `websearch_logic.py` is used to get external content.
    *   If using memory, `memory_logic.py` retrieves relevant past interactions.
    A final, comprehensive prompt is then assembled, including the system prompt, chat history, retrieved rules, and any new context from tools.

4.  **Final Response Generation**: This final prompt is sent to the main, user-selected LLM (e.g., Llama 3 70B). The response is streamed back to the user through the chat interface.

5.  **Post-Interaction Learning (The Core Loop)**: This is the most critical step. After the response is complete, a background task is initiated:
    *   **Metrics Generation**: An LLM call generates key metrics about the interaction: a short `takeaway`, a `response_success_score`, and a `future_confidence_score`.
    *   **Memory Storage**: The full interaction (user query, AI response, and metrics) is saved as a new **Memory**.
    *   **Insight Reflection**: A powerful LLM (like GPT-4o or Claude 3.5 Sonnet) is prompted to act as a "knowledge base curator". It receives:
        *   A summary of the just-completed interaction.
        *   A list of potentially relevant existing Rules.
        *   The specific Rules that were used to generate the response.
    *   **Rule Generation (XML)**: The LLM's task is to output an XML structure containing a list of operations (`<operations_list>`). Each operation is either an `add` for a new rule or an `update` to refine an existing one. This strict XML format ensures reliable parsing.
    *   **Knowledge Base Update**: The system parses the XML and applies the changes to the **Rules** database, adding, updating, and consolidating the AI's guiding principles. This new knowledge will be available for all future interactions.
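As a sketch, step 5's XML output might be parsed along these lines. The `<operations_list>` wrapper comes from the description above, but the `operation`/`rule` tag names and attributes here are illustrative assumptions, not the app's exact schema:

```python
import xml.etree.ElementTree as ET

# Hypothetical example of the insight LLM's output; tag/attribute names
# other than <operations_list> are assumptions for illustration.
sample = """
<operations_list>
  <operation action="add">
    <rule>[RESPONSE_PRINCIPLE|0.8] Cite sources when reporting web results.</rule>
  </operation>
  <operation action="update" target_id="12">
    <rule>[CORE_RULE|1.0] Your name is Node.</rule>
  </operation>
</operations_list>
"""

def parse_operations(xml_text: str) -> list[dict]:
    """Extract add/update operations from the curator LLM's XML reply."""
    root = ET.fromstring(xml_text)
    ops = []
    for op in root.findall("operation"):
        ops.append({
            "action": op.get("action"),
            "target_id": op.get("target_id"),  # None for "add" operations
            "rule": op.findtext("rule", "").strip(),
        })
    return ops

for op in parse_operations(sample):
    print(op["action"], "->", op["rule"])
```

The strict XML contract is what makes this step reliable: if `ET.fromstring` raises, the learning pass can simply be skipped rather than corrupting the rules database.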

---

## 4. System Components (File Breakdown)

*   `app.py`
    *   **Role**: The main application file. It contains the Gradio UI definition, orchestrates the entire interaction and learning loop, and manages the application's state.
    *   **Key Functions**: `handle_gradio_chat_submit`, `process_user_interaction_gradio`, `perform_post_interaction_learning`.

*   `memory_logic.py`
    *   **Role**: The AI's knowledge base and memory system. It handles the storage, retrieval, and management of both **Memories** (past conversations) and **Rules** (guiding principles).
    *   **Technology**: Uses `sentence-transformers` for creating text embeddings and `faiss` for efficient semantic search.
    *   **Backends**: Abstracted to work with:
        *   `RAM`: In-memory storage (volatile).
        *   `SQLITE`: Persistent local database (`app_data/ai_memory.db`).
        *   `HF_DATASET`: Pushes data to a private Hugging Face Dataset for robust, cloud-based persistence.
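As a toy illustration of the retrieval shape: the real module embeds text with `sentence-transformers` and searches with `faiss`, but random stand-in vectors and a brute-force cosine search show the same idea:

```python
import numpy as np

# Stand-in for memory_logic.py's semantic search: pretend embeddings plus
# brute-force cosine similarity instead of sentence-transformers + faiss.
rng = np.random.default_rng(0)
rules = ["Cite your sources.", "Be concise.", "Your name is Node."]
rule_vecs = rng.normal(size=(len(rules), 8))                # pretend embeddings
rule_vecs /= np.linalg.norm(rule_vecs, axis=1, keepdims=True)

def retrieve(query_vec: np.ndarray, k: int = 2) -> list[str]:
    """Return the k rules most similar to the query vector."""
    q = query_vec / np.linalg.norm(query_vec)
    scores = rule_vecs @ q                                  # cosine similarity
    top = np.argsort(scores)[::-1][:k]
    return [rules[i] for i in top]

# Querying with a rule's own vector ranks that rule first.
print(retrieve(rule_vecs[0]))
```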

*   `model_logic.py`
    *   **Role**: A versatile, multi-provider LLM handler. It abstracts the complexities of calling different model APIs.
    *   **Functionality**: Provides a single function `call_model_stream` that takes a provider, model name, and messages, and handles the provider-specific request formatting, authentication, and streaming response parsing.
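The unified-handler pattern might look roughly like this; the stub provider below is purely illustrative, while the real `model_logic.py` performs actual API calls and streams real tokens:

```python
from typing import Callable, Iterator

# Illustrative sketch of a multi-provider dispatcher. The provider and model
# names match the README, but this stub does not call any real API.
def _groq_stream(model: str, messages: list[dict]) -> Iterator[str]:
    yield f"[groq:{model}] "
    yield "stubbed reply"

PROVIDERS: dict[str, Callable[..., Iterator[str]]] = {"groq": _groq_stream}

def call_model_stream(provider: str, model: str, messages: list[dict]) -> Iterator[str]:
    """Dispatch to the provider-specific streaming handler."""
    handler = PROVIDERS.get(provider.lower())
    if handler is None:
        raise ValueError(f"Unknown provider: {provider}")
    return handler(model, messages)

chunks = list(call_model_stream("groq", "llama3-8b-8192",
                                [{"role": "user", "content": "hi"}]))
print("".join(chunks))
```

Keeping a single entry point means the rest of the app never needs to know which provider is active; adding a provider is a matter of registering one more handler.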

*   `websearch_logic.py`
    *   **Role**: Provides the AI with the tools to access external information from the internet.
    *   **Functionality**: Implements `search_and_scrape_duckduckgo` and `search_and_scrape_google` for search, and a robust `scrape_url` function to extract clean, readable text content from web pages using `BeautifulSoup`.
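The clean-text extraction idea can be sketched with the standard library alone; the real `scrape_url` uses `requests` and `BeautifulSoup`, and this is only a minimal illustration of skipping script/style content:

```python
from html.parser import HTMLParser

# Minimal stand-in for BeautifulSoup-based text extraction: collect visible
# text while skipping anything inside <script> or <style> tags.
class TextExtractor(HTMLParser):
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())

html = ("<html><head><style>p{}</style></head>"
        "<body><p>Hello, world.</p><script>x=1</script></body></html>")
parser = TextExtractor()
parser.feed(html)
print(" ".join(parser.chunks))
```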

*   `models.json`
    *   **Role**: A configuration file that maps user-friendly model names (e.g., "Llama 3 8B (Groq)") to their specific API identifiers (e.g., "llama3-8b-8192") for each provider. This makes it easy to add or change available models without altering the code.
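A plausible shape for `models.json`, based on the description above (the exact schema is an assumption):

```json
{
  "groq": {
    "Llama 3 8B (Groq)": "llama3-8b-8192",
    "Llama 3 70B (Groq)": "llama3-70b-8192"
  },
  "openai": {
    "GPT-4o (OpenAI)": "gpt-4o"
  }
}
```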

*   `requirements.txt`
    *   **Role**: Lists all the necessary Python packages for the project to run.

---

## 5. Configuration

The application is highly configurable via environment variables (in a `.env` file) and hardcoded toggles in `app.py`.

### 5.1. In-App Toggles (`app.py`)

These are at the very top of `app.py` for easy access during development.

*   `DEMO_MODE` (bool): If `True`, disables all destructive actions like clearing the knowledge base, saving edited rules, and file uploads.
*   `MEMORY_STORAGE_TYPE` (str): Sets the storage backend. Can be `"RAM"`, `"SQLITE"`, or `"HF_DATASET"`. This will override the `STORAGE_BACKEND` environment variable.
*   `HF_DATASET_MEMORY_REPO` (str): The Hugging Face Dataset repository for storing memories.
*   `HF_DATASET_RULES_REPO` (str): The Hugging Face Dataset repository for storing rules.

### 5.2. Environment Variables (`.env` file)

Create a `.env` file in the root directory to set these values.

| Variable                      | Description                                                                                             | Default                       |
| ----------------------------- | ------------------------------------------------------------------------------------------------------- | ----------------------------- |
| **API Keys**                  |                                                                                                         |                               |
| `GROQ_API_KEY`                | API Key for Groq services.                                                                              | `None`                        |
| `OPENROUTER_API_KEY`          | API Key for OpenRouter.ai.                                                                              | `None`                        |
| `OPENAI_API_KEY`              | API Key for OpenAI.                                                                                     | `None`                        |
| `HF_TOKEN`                    | Hugging Face token for pushing to HF Datasets and using HF Inference API.                               | `None`                        |
| `..._API_KEY`                 | Keys for other providers as defined in `model_logic.py`.                                                | `None`                        |
| **App Behavior**              |                                                                                                         |                               |
| `WEB_SEARCH_ENABLED`          | Set to `true` or `false` to enable/disable the web search tool.                                         | `true`                        |
| `DEFAULT_SYSTEM_PROMPT`       | The default system prompt for the AI.                                                                   | A generic helpful assistant.  |
| `MAX_HISTORY_TURNS`           | The number of conversation turns (user+AI) to keep in the context window.                               | `7`                           |
| `TOOL_DECISION_PROVIDER`      | The LLM provider to use for the tool-decision step. (e.g., `groq`)                                      | `groq`                        |
| `TOOL_DECISION_MODEL`         | The model ID to use for the tool-decision step. (e.g., `llama3-8b-8192`)                                 | `llama3-8b-8192`              |
| `INSIGHT_MODEL_OVERRIDE`      | Override the model for the post-interaction learning step. Format: `provider/model_id`.                 | Uses the main chat model.     |
| `METRICS_MODEL`               | Override model for generating metrics. Format: `provider/model_id`.                                     | Uses the main chat model.     |
| **Storage & Loading**         |                                                                                                         |                               |
| `STORAGE_BACKEND`             | Storage backend: `RAM`, `SQLITE`, `HF_DATASET`. (Overridden by `MEMORY_STORAGE_TYPE` in `app.py`)       | `HF_DATASET`                  |
| `SQLITE_DB_PATH`              | Path to the SQLite database file if `SQLITE` is used.                                                   | `app_data/ai_memory.db`       |
| `HF_MEMORY_DATASET_REPO`      | HF Dataset repo for memories. (Overridden by `HF_DATASET_MEMORY_REPO` in `app.py`)                      | `broadfield-dev/ai-brain`     |
| `HF_RULES_DATASET_REPO`       | HF Dataset repo for rules. (Overridden by `HF_DATASET_RULES_REPO` in `app.py`)                          | `broadfield-dev/ai-rules`     |
| `LOAD_RULES_FILE`             | Path to a local file (.txt or .jsonl) to load rules from on startup.                                    | `None`                        |
| `LOAD_MEMORIES_FILE`          | Path to a local file (.json or .jsonl) to load memories from on startup.                                | `None`                        |
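A minimal sketch of reading these variables with their defaults; the variable names follow the table, while the parsing itself is illustrative rather than the exact code in `app.py`:

```python
import os

# Illustrative env-var loading with the defaults from the table above.
def env_bool(name: str, default: bool) -> bool:
    """Treat '1', 'true', and 'yes' (any case) as True."""
    return os.getenv(name, str(default)).strip().lower() in ("1", "true", "yes")

WEB_SEARCH_ENABLED = env_bool("WEB_SEARCH_ENABLED", True)
MAX_HISTORY_TURNS = int(os.getenv("MAX_HISTORY_TURNS", "7"))
TOOL_DECISION_PROVIDER = os.getenv("TOOL_DECISION_PROVIDER", "groq")

print(WEB_SEARCH_ENABLED, MAX_HISTORY_TURNS, TOOL_DECISION_PROVIDER)
```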

---

## 6. The Memory & Rules System

### 6.1. Rules (Guiding Principles)

Rules are the AI's core identity and behavioral guidelines. They are stored as simple text strings but follow a specific format to be effective: `[TYPE|SCORE] Text of the rule`.

*   **TYPE**: Categorizes the rule.
    *   `CORE_RULE`: Defines fundamental identity (e.g., name, purpose). The learning loop tries to consolidate these.
    *   `RESPONSE_PRINCIPLE`: Guides the style and content of responses (e.g., be concise, cite sources).
    *   `BEHAVIORAL_ADJUSTMENT`: Fine-tunes behavior based on specific feedback.
    *   `GENERAL_LEARNING`: Stores a general piece of factual information or a learned preference.
*   **SCORE**: A float from `0.0` to `1.0` indicating the confidence or importance of the rule.
*   **Text**: The natural language content of the rule itself.
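An illustrative parser for this format (the app's actual parsing may differ in detail):

```python
import re

# Parse rules of the form "[TYPE|SCORE] Text of the rule".
RULE_RE = re.compile(r"^\[([A-Z_]+)\|([01](?:\.\d+)?)\]\s*(.+)$")

def parse_rule(rule: str) -> dict:
    """Split a rule string into its type, score, and text components."""
    m = RULE_RE.match(rule.strip())
    if not m:
        raise ValueError(f"Malformed rule: {rule!r}")
    return {"type": m.group(1), "score": float(m.group(2)), "text": m.group(3)}

parsed = parse_rule("[CORE_RULE|1.0] Your name is Node, a helpful research agent.")
print(parsed)
```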

### 6.2. Memories

Memories are records of past interactions. Each memory is a JSON object containing:
*   `user_input`: The user's message.
*   `bot_response`: The AI's full response.
*   `timestamp`: The UTC timestamp of the interaction.
*   `metrics`: A sub-object with the LLM-generated analysis (`takeaway`, `response_success_score`, etc.).
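An example record with these fields (the content and metric values are invented for illustration):

```python
import json
from datetime import datetime, timezone

# Illustrative memory record with the fields listed above.
memory = {
    "user_input": "What is FAISS?",
    "bot_response": "FAISS is a library for efficient similarity search...",
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "metrics": {
        "takeaway": "User asked about vector search tooling.",
        "response_success_score": 0.9,
        "future_confidence_score": 0.85,
    },
}
print(json.dumps(memory, indent=2)[:80])
```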

---

## 7. How to Run Locally

1.  **Clone the Repository**:
    ```bash
    git clone https://huggingface.co/spaces/broadfield-dev/ilearn
    cd ilearn
    ```

2.  **Install Dependencies**:
    It's recommended to use a virtual environment.
    ```bash
    python -m venv venv
    source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
    pip install -r requirements.txt
    ```

3.  **Create `.env` File**:
    Create a file named `.env` in the root directory and add your configuration. At a minimum, you'll need an API key for an LLM provider.

    **`.env.example`**:
    ```env
    # --- API KEYS (Fill in at least one) ---
    GROQ_API_KEY="gsk_..."
    OPENAI_API_KEY="sk_..."
    HF_TOKEN="hf_..." # Required for HF_DATASET backend and HF Inference

    # --- BEHAVIOR ---
    WEB_SEARCH_ENABLED="true"
    DEFAULT_SYSTEM_PROMPT="Your Name is Node. You are a Helpful AI Assistant..."
    MAX_HISTORY_TURNS=7
    TOOL_DECISION_PROVIDER="groq"
    TOOL_DECISION_MODEL="llama3-8b-8192"
    #INSIGHT_MODEL_OVERRIDE="openai/gpt-4o" # Optional: Use a powerful model for learning

    # --- STORAGE (Choose one and configure) ---
    STORAGE_BACKEND="SQLITE" # Options: RAM, SQLITE, HF_DATASET
    # If using SQLITE:
    SQLITE_DB_PATH="app_data/ai_memory.db"
    # If using HF_DATASET:
    #HF_MEMORY_DATASET_REPO="your-username/my-ai-brain"
    #HF_RULES_DATASET_REPO="your-username/my-ai-rules"

    # --- STARTUP FILE LOADING (Optional) ---
    #LOAD_RULES_FILE="initial_rules.txt"
    #LOAD_MEMORIES_FILE="initial_memories.jsonl"
    ```

4.  **Run the Application**:
    ```bash
    python app.py
    ```
    The application will start and be accessible at `http://127.0.0.1:7860`.