milwright committed on
Commit 525ef5c · 1 Parent(s): dc5c13f

Fix RAG processing crashes with multiprocessing and memory optimizations

- Set TOKENIZERS_PARALLELISM=false to prevent tokenizer conflicts
- Reduce embedding batch size from 32 to 16 for stability
- Force CPU-only processing to avoid GPU/multiprocessing issues
- Add comprehensive error handling for network and memory issues
- Enhance progress logging during document processing
- Change default model to all-MiniLM-L6-v2 for better compatibility
- Create test_rag_fix.py to verify RAG functionality
- Update support documentation with Preview tab usage and tool configuration
- Add project documentation in CLAUDE.md for future development
- Remove outdated development files
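
The first three items above (tokenizer parallelism, batch size, CPU-only processing) could be sketched roughly as follows. This is a minimal illustration and not the code from `vector_store.py`: `EMBED_BATCH_SIZE`, `iter_batches`, and `embed_chunks` are hypothetical names, and `encode` is a stand-in for whatever embedding callable the project uses (for example, the `encode` method of a `SentenceTransformer` created with `device="cpu"`).

```python
import os

# Set before the tokenizers library is first used, so that forked worker
# processes do not conflict with its internal thread pool.
os.environ["TOKENIZERS_PARALLELISM"] = "false"

# Hypothetical constant: the commit message says the batch size drops from 32 to 16.
EMBED_BATCH_SIZE = 16


def iter_batches(chunks, batch_size=EMBED_BATCH_SIZE):
    """Yield successive slices of at most batch_size items."""
    for start in range(0, len(chunks), batch_size):
        yield chunks[start:start + batch_size]


def embed_chunks(chunks, encode, batch_size=EMBED_BATCH_SIZE):
    """Embed chunks one batch at a time, logging progress per batch.

    encode is an assumed callable that maps a list of strings to a list of
    embeddings; it is injected so the sketch runs without any model download.
    """
    embeddings = []
    for number, batch in enumerate(iter_batches(chunks, batch_size), start=1):
        print(f"Embedding batch {number} ({len(batch)} chunks)")
        embeddings.extend(encode(batch))
    return embeddings
```

Passing the encoder in as a parameter keeps peak memory tied to one batch at a time and lets the batching logic be tested with a dummy callable before a real model is loaded.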

Files changed (6)
  1. CLAUDE.md +296 -0
  2. CLAUDE_DESKTOP_DEVELOPMENT.md +0 -411
  3. app.py +29 -1
  4. devjournal.md +0 -5
  5. test_rag_fix.py +182 -0
  6. vector_store.py +67 -14
CLAUDE.md ADDED
@@ -0,0 +1,296 @@
1
+ # CLAUDE.md
2
+
3
+ This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
4
+
5
+ ## Project Overview
6
+
7
+ Chat UI Helper is a Gradio-based tool for generating and configuring chat interfaces for HuggingFace Spaces. It creates deployable packages with custom assistants, web scraping capabilities, and optional vector RAG functionality.
8
+
9
+ ## Core Architecture
10
+
11
+ ### Main Application Flow (`app.py`)
12
+ The application follows a three-tab Gradio interface pattern:
13
+ 1. **Configuration Tab**: Space setup, assistant configuration, tool settings
14
+ 2. **Sandbox Preview Tab**: Interactive testing with real OpenRouter API integration
15
+ 3. **Support Docs Tab**: Comprehensive guidance and templates via `support_docs.py`
16
+
17
+ ### Template Generation System
18
+ - `SPACE_TEMPLATE` (lines 130-710): Complete HuggingFace Space template with export functionality
19
+ - `generate_zip()` function (lines 869-935): Orchestrates package creation with all dependencies
20
+ - Key template variables: `{system_prompt}`, `{model}`, `{enable_vector_rag}`, `{api_key_var}`, `{grounding_urls}`, `{enable_dynamic_urls}`, `{enable_web_search}`
21
+
22
+ ### Preview Sandbox Architecture
23
+ - Real OpenRouter API integration in preview mode (`preview_chat_response()` line 1185)
24
+ - URL context testing with dynamic add/remove functionality
25
+ - Configuration-aware responses using exact model and parameters from user configuration
26
+ - Fallback messaging when `OPENROUTER_API_KEY` environment variable not set
27
+ - Legacy tuple format compatibility for Gradio 4.44.1 ChatInterface
28
+ - Comprehensive debugging with enhanced error handling and API response validation
29
+
30
+ ### Document Processing Pipeline (RAG)
31
+ - **RAGTool** (`rag_tool.py`): Main orchestrator with 10MB file size validation
32
+ - **DocumentProcessor** (`document_processor.py`): PDF/DOCX/TXT/MD parsing with semantic chunking (800 chars, 100 overlap)
33
+ - **VectorStore** (`vector_store.py`): FAISS-based similarity search and base64 serialization
34
+
35
+ ### Web Scraping Architecture
36
+ Simple HTTP + BeautifulSoup approach with crawl4ai integration:
37
+ - `enhanced_fetch_url_content()` (lines 79-128): Enhanced requests with timeout and user-agent headers
38
+ - Content cleaning: Removes scripts, styles, navigation elements
39
+ - Content limits: ~4000 character truncation for context management
40
+ - URL content caching: `get_cached_grounding_context()` (line 1441) prevents redundant fetches
41
+ - `extract_urls_from_text()` (line 51): Regex-based URL extraction for dynamic fetching
42
+
43
+ ## Development Commands
44
+
45
+ ### Environment Setup
46
+ **Important**: This application requires Python ≥3.10 for Gradio 5.x compatibility.
47
+
48
+ ```bash
49
+ # Recommended: Use Python 3.11+ environment
50
+ python3.11 -m venv venv311
51
+ source venv311/bin/activate # or venv311\Scripts\activate on Windows
52
+ pip install -r requirements.txt
53
+ ```
54
+
55
+ ### Running the Application
56
+ ```bash
57
+ # With virtual environment activated
58
+ python app.py
59
+ ```
60
+
61
+ ### Testing Commands
62
+ ```bash
63
+ # Test vector database functionality (requires all RAG dependencies)
64
+ python test_vector_db.py
65
+
66
+ # Test RAG fixes and error handling
67
+ python test_rag_fix.py
68
+
69
+ # Test OpenRouter API key validation
70
+ python test_api_key.py
71
+
72
+ # Test minimal Gradio functionality (for debugging)
73
+ python test_minimal.py
74
+
75
+ # Test preview functionality components
76
+ python test_preview.py
77
+
78
+ # Test individual RAG components
79
+ python -c "from test_vector_db import test_document_processing; test_document_processing()"
80
+ python -c "from test_vector_db import test_vector_store; test_vector_store()"
81
+ python -c "from test_vector_db import test_rag_tool; test_rag_tool()"
82
+ ```
83
+
84
+ ### Pre-Test Setup for RAG Components
85
+ ```bash
86
+ # Create test document for vector database testing
87
+ echo "This is a test document for RAG functionality testing." > test_document.txt
88
+
89
+ # Verify all dependencies are installed
90
+ python -c "import sentence_transformers, faiss, fitz; print('RAG dependencies available')"
91
+ ```
92
+
93
+ ## Key Dependencies and Versions
94
+
95
+ ### Required Dependencies
96
+ - **Gradio ≥4.44.1**: Main UI framework (5.37.0 recommended for Python ≥3.10)
97
+ - **requests ≥2.32.3**: HTTP requests for web content fetching
98
+ - **beautifulsoup4 ≥4.12.3**: HTML parsing for web scraping
99
+ - **python-dotenv ≥1.0.0**: Environment variable management
100
+
101
+ ### Optional RAG Dependencies
102
+ - **sentence-transformers ≥2.2.2**: Text embeddings
103
+ - **faiss-cpu ==1.7.4**: Vector similarity search
104
+ - **PyMuPDF ≥1.23.0**: PDF text extraction
105
+ - **python-docx ≥0.8.11**: DOCX document processing
106
+ - **numpy ==1.26.4**: Numerical operations
107
+
108
+ ### Optional Web Search Dependencies
109
+ - **crawl4ai ≥0.2.0**: Advanced web crawling for web search functionality
110
+ - **aiohttp ≥3.8.0**: Async HTTP client for crawl4ai
111
+
112
+ ## Configuration Patterns
113
+
114
+ ### Conditional Dependency Loading
115
+ ```python
116
+ try:
117
+ from rag_tool import RAGTool
118
+ HAS_RAG = True
119
+ except ImportError:
120
+ HAS_RAG = False
121
+ RAGTool = None
122
+ ```
123
+ This pattern allows graceful degradation when optional vector dependencies are unavailable.
124
+
125
+ ### Template Variable Substitution
126
+ Generated spaces use these key substitutions:
127
+ - `{system_prompt}`: Combined assistant configuration
128
+ - `{grounding_urls}`: Static URL list for context
129
+ - `{enable_dynamic_urls}`: Runtime URL fetching capability
130
+ - `{enable_vector_rag}`: Document search integration
131
+ - `{enable_web_search}`: Web search integration via crawl4ai
132
+ - `{rag_data_json}`: Serialized embeddings and chunks
133
+ - `{api_key_var}`: Customizable API key environment variable name
134
+
135
+ ### Access Control Pattern
136
+ - Environment variable `SPACE_ACCESS_CODE` for student access control
137
+ - Global state management for session-based access in generated spaces
138
+ - Security-first approach storing credentials as HuggingFace Spaces secrets
139
+
140
+ ### RAG Integration Workflow
141
+ 1. Documents uploaded through Gradio File component with conditional visibility (`HAS_RAG` flag)
142
+ 2. Processed via DocumentProcessor (PDF/DOCX/TXT/MD support) in `process_documents()` function
143
+ 3. Chunked and embedded using sentence-transformers (800 chars, 100 overlap)
144
+ 4. FAISS index created and serialized to base64 for deployment portability
145
+ 5. Embedded in generated template via `{rag_data_json}` template variable
146
+
147
+ ## Implementation Notes
148
+
149
+ ### Research Template System (Simplified)
150
+ - **Simple Toggle**: `toggle_research_assistant()` function (line 1704) provides simple on/off functionality
151
+ - **Direct System Prompt**: Enables predefined academic research prompt with DOI verification and LibKey integration
152
+ - **Auto-Enable Dynamic URLs**: Research template automatically enables dynamic URL fetching for academic sources
153
+ - **Template Content**: Academic inquiry focus with DOI-verified sources, fact-checking, and proper citation requirements
154
+
155
+ ### State Management Across Tabs
156
+ - Extensive use of `gr.State()` for maintaining session data
157
+ - Cross-tab functionality through shared state variables (`sandbox_state`, `preview_config_state`)
158
+ - URL content caching to prevent redundant web requests (`url_content_cache` global variable)
159
+ - Preview debugging with comprehensive error handling and API response validation
160
+
161
+ ### Gradio Compatibility and Message Format Handling
162
+ - **Target Version**: Gradio 5.37.0 (requires Python ≥3.10)
163
+ - **Legacy Support**: Gradio 4.44.1 compatibility with JSON schema workarounds
164
+ - **Message Format**: Preview uses legacy tuple format `[user_msg, bot_msg]` for ChatInterface compatibility
165
+ - **Generated Spaces**: Use modern dictionary format `{"role": "user", "content": "..."}` for OpenRouter API
166
+
167
+ ### Security Considerations
168
+ - Never embed API keys or access codes in generated templates
169
+ - Environment variable pattern for all sensitive configuration (`{api_key_var}` template variable)
170
+ - Input validation for uploaded files and URL processing
171
+ - Content length limits for web scraping operations
172
+
173
+ ## Tool Configuration Changes
174
+
175
+ ### Code Execution Functionality Removed
176
+ **Important**: Code execution functionality has been completely removed from the application. Do not attempt to re-add it.
177
+
178
+ - All `enable_code_execution` parameters and checkboxes have been removed
179
+ - The `toggle_code_execution` function has been removed
180
+ - Code execution logic in preview and generation functions has been removed
181
+ - Generated spaces no longer support code execution capabilities
182
+
183
+ ### Web Search Integration
184
+ - **Enable Web Search**: Checkbox to enable web search functionality using crawl4ai
185
+ - **Technology**: Uses crawl4ai library with DuckDuckGo for search results
186
+ - **Implementation**: Integrated in both preview mode and generated spaces
187
+ - **Fallback**: Simple HTTP requests if crawl4ai is not available
188
+
189
+ ## Testing Infrastructure
190
+
191
+ ### Current Test Structure
192
+ - `test_vector_db.py`: Comprehensive RAG component testing
193
+ - `test_api_key.py`: OpenRouter API validation
194
+ - `test_minimal.py`: Basic Gradio functionality debugging
195
+ - `test_preview.py`: Preview functionality component testing
196
+
197
+ ### Test Dependencies
198
+ RAG testing requires: `sentence-transformers`, `faiss-cpu`, `PyMuPDF`, `python-docx`
199
+ Core testing requires: `gradio`, `requests`, `beautifulsoup4`, `python-dotenv`
200
+
201
+ ### Testing Status
202
+ - **Functional**: Four main test files covering core functionality
203
+ - **Usage**: Run individual Python test modules directly
204
+ - **Coverage**: Basic component testing, no automated integration tests
205
+
206
+ ## Known Issues and Compatibility
207
+
208
+ ### RAG Processing "Connection errored out" Issue
209
+ - **Issue**: Server crashes or hangs during document processing with "Connection errored out" error
210
+ - **Root Cause**: Memory-intensive embedding model download/initialization causing server timeout
211
+ - **Symptoms**:
212
+ - `stream.ts:185 Method not implemented.`
213
+ - `Failed to load resource: net::ERR_INCOMPLETE_CHUNKED_ENCODING`
214
+ - Server becomes unresponsive during RAG document processing
215
+ - **Solutions**:
216
+ 1. **Use smaller batch sizes**: Reduced from 32 to 16 chunks per batch
217
+ 2. **Improved error handling**: Better feedback for network/memory issues
218
+ 3. **CPU-only processing**: Force CPU usage to avoid GPU/multiprocessing conflicts
219
+ 4. **Environment variables**: Set `TOKENIZERS_PARALLELISM=false` to prevent multiprocessing issues
220
+ 5. **Model identifier**: Default model name changed from `sentence-transformers/all-MiniLM-L6-v2` to the short form `all-MiniLM-L6-v2` (the same model, referenced without the organization prefix)
221
+ - **Testing**: Run `python test_rag_fix.py` to verify RAG functionality
222
+ - **Prevention**: Process documents one at a time, use smaller files (<5MB)
223
+
224
+ ### Gradio 4.44.1 JSON Schema Bug
225
+ - **Issue**: TypeError in `json_schema_to_python_type` prevents app startup in some environments
226
+ - **Symptom**: "argument of type 'bool' is not iterable" error during API schema generation
227
+ - **Workaround**: Individual component functions work correctly
228
+ - **Solution**: Upgrade to Gradio 5.x for full compatibility
229
+
230
+ ### Python Version Requirements
231
+ - **Minimum**: Python 3.9 (for Gradio 4.44.1)
232
+ - **Recommended**: Python 3.11+ (for Gradio 5.x and optimal performance)
233
+
234
+ ## Common Claude Code Anti-Patterns to Avoid
235
+
236
+ ### Message Format Reversion
237
+ **❌ Don't switch to:** The newer dictionary message format in preview functions
238
+ ```python
239
+ # WRONG - breaks Gradio 4.44.1 ChatInterface
240
+ history.append({"role": "user", "content": message})
241
+ history.append({"role": "assistant", "content": response})
242
+ ```
243
+ **✅ Keep:** Legacy tuple format for preview compatibility
244
+ ```python
245
+ # CORRECT - works with current Gradio ChatInterface
246
+ history.append([message, response])
247
+ ```
248
+
249
+ ### Template Variable Substitution
250
+ **❌ Don't change:** Template string escaping patterns in `SPACE_TEMPLATE`
251
+ - Keep double backslashes: `\\n\\n` (becomes `\n\n` after Python string processing)
252
+ - Keep double braces: `{{variable}}` (becomes `{variable}` after format())
253
+ - **Reason**: Template undergoes two levels of processing (Python format + HuggingFace deployment)
254
+
255
+ ### Code Execution Re-Addition
256
+ **❌ Don't re-add:** Code execution functionality has been intentionally removed
257
+ - Do not add `enable_code_execution` parameters back to functions
258
+ - Do not create code execution UI components
259
+ - Do not add code execution logic to preview or generation workflows
260
+ - **Reason**: Code execution functionality was removed by design
261
+
262
+ ### Conditional Dependency Loading
263
+ **❌ Don't remove:** `HAS_RAG` flag and conditional imports
264
+ ```python
265
+ # WRONG - breaks installations without vector dependencies
266
+ from rag_tool import RAGTool
267
+ ```
268
+ **✅ Keep:** Graceful degradation pattern
269
+ ```python
270
+ # CORRECT - allows app to work without optional dependencies
271
+ try:
272
+ from rag_tool import RAGTool
273
+ HAS_RAG = True
274
+ except ImportError:
275
+ HAS_RAG = False
276
+ RAGTool = None
277
+ ```
278
+
279
+ ### URL Management and Preview Functionality
280
+ **❌ Don't remove:** Dynamic URL add/remove functionality or real API integration in preview
281
+ - Keep `add_urls()`, `remove_urls()`, `add_chat_urls()`, `remove_chat_urls()` functions
282
+ - Maintain URL count state management with `gr.State()`
283
+ - Keep actual OpenRouter API calls in preview mode when `OPENROUTER_API_KEY` is set
284
+ - **Reason**: Users expect scalable URL input interface and realistic preview testing
285
+
286
+ ## Development-Only Utilities
287
+
288
+ ### MCP Servers
289
+ - **Gradio Docs**: Available at https://gradio-docs-mcp.hf.space/gradio_api/mcp/sse
290
+ - Use `gradio_docs.py` utility for development assistance
291
+ - **CRITICAL**: Do NOT import in main application - this is for development tooling only
292
+
293
+ Usage for development:
294
+ ```bash
295
+ python -c "from gradio_docs import gradio_docs; print(gradio_docs.search_docs('ChatInterface'))"
296
+ ```
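
The chunking parameters that CLAUDE.md mentions above (800 characters with a 100-character overlap) can be illustrated with a plain sliding window. This is a simplified sketch under stated assumptions: the document describes the real `DocumentProcessor` as doing semantic chunking, which this does not attempt, and `chunk_text` is a hypothetical helper rather than a function from the repository.

```python
def chunk_text(text, chunk_size=800, overlap=100):
    """Split text into fixed-size character windows that overlap.

    Each chunk starts (chunk_size - overlap) characters after the previous
    one, so the last `overlap` characters of a chunk reappear at the start
    of the next chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some text twice.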
CLAUDE_DESKTOP_DEVELOPMENT.md DELETED
@@ -1,411 +0,0 @@
1
- # Claude Desktop Development Guidelines
2
-
3
- ## Overview
4
- This document provides comprehensive guidelines for all-purpose software architecting and development when working with Claude Desktop. These instructions optimize collaboration between developers and Claude for efficient, high-quality software delivery.
5
-
6
- ## Core Principles
7
-
8
- ### 1. Context-First Development
9
- - **Always provide context**: Before asking Claude to work on code, ensure it has adequate context about the project structure, technologies used, and existing patterns
10
- - **Use file exploration**: Leverage Claude's file reading capabilities to understand codebases before making changes
11
- - **Reference existing patterns**: Point Claude to similar implementations in the codebase to maintain consistency
12
-
13
- ### 2. Incremental and Iterative Approach
14
- - **Break down complex tasks**: Divide large features into smaller, manageable components
15
- - **Test frequently**: Implement and test individual components before moving to the next
16
- - **Use TodoWrite**: Track progress on complex tasks to maintain visibility and ensure nothing is missed
17
-
18
- ### 3. Documentation-Driven Development
19
- - **CLAUDE.md integration**: Maintain project-specific instructions in CLAUDE.md for consistent behavior
20
- - **Code documentation**: Ensure all complex logic is well-documented for future maintenance
21
- - **Architecture decisions**: Document architectural choices and trade-offs
22
-
23
- ## Project Architecture Guidelines
24
-
25
- ### File Organization
26
- ```
27
- project-root/
28
- ├── CLAUDE.md # Claude-specific project instructions
29
- ├── README.md # Project overview and setup
30
- ├── .env.example # Environment variable template
31
- ├── src/
32
- │ ├── components/ # Reusable UI components
33
- │ ├── services/ # Business logic and API calls
34
- │ ├── utils/ # Helper functions and utilities
35
- │ ├── types/ # Type definitions (TypeScript)
36
- │ └── tests/ # Test files
37
- ├── docs/ # Additional documentation
38
- ├── scripts/ # Build and deployment scripts
39
- └── config/ # Configuration files
40
- ```
41
-
42
- ### Configuration Management
43
- - **Environment-based configs**: Use environment variables for deployment-specific settings
44
- - **Type-safe configurations**: Define configuration schemas with validation
45
- - **Hierarchical configs**: Support development, staging, and production configurations
46
- - **Secret management**: Never commit secrets; use environment variables or secret management tools
47
-
48
- ### Error Handling Strategy
49
- - **Graceful degradation**: Design systems to handle failures gracefully
50
- - **Comprehensive logging**: Implement structured logging for debugging and monitoring
51
- - **User-friendly errors**: Provide meaningful error messages to end users
52
- - **Recovery mechanisms**: Implement retry logic and fallback strategies where appropriate
53
-
54
- ## Development Workflow
55
-
56
- ### 1. Project Initialization
57
- ```bash
58
- # Set up project structure
59
- mkdir project-name && cd project-name
60
- git init
61
- touch CLAUDE.md README.md .env.example
62
- mkdir -p src/{components,services,utils,types,tests}
63
- ```
64
-
65
- ### 2. CLAUDE.md Configuration
66
- Create project-specific instructions:
67
- ```markdown
68
- # Project: [Project Name]
69
-
70
- ## Tech Stack
71
- - Framework: [React/Vue/Angular/etc.]
72
- - Language: [TypeScript/JavaScript/Python/etc.]
73
- - Database: [PostgreSQL/MongoDB/etc.]
74
- - Testing: [Jest/Pytest/etc.]
75
-
76
- ## Coding Standards
77
- - Use TypeScript for all new code
78
- - Follow ESLint configuration
79
- - Write tests for all business logic
80
- - Document complex functions
81
-
82
- ## Architecture Patterns
83
- - Use custom hooks for React state logic
84
- - Implement repository pattern for data access
85
- - Follow MVC pattern for API endpoints
86
-
87
- ## Deployment
88
- - Test commands: npm test
89
- - Build commands: npm run build
90
- - Lint commands: npm run lint
91
- ```
92
-
93
- ### 3. Development Process
94
- 1. **Analysis Phase**
95
- - Understand requirements thoroughly
96
- - Review existing codebase patterns
97
- - Identify potential integration points
98
- - Plan architecture approach
99
-
100
- 2. **Implementation Phase**
101
- - Start with core functionality
102
- - Build incrementally with frequent testing
103
- - Maintain consistent code style
104
- - Document as you go
105
-
106
- 3. **Testing Phase**
107
- - Unit tests for individual components
108
- - Integration tests for workflows
109
- - End-to-end tests for critical paths
110
- - Performance testing where relevant
111
-
112
- 4. **Documentation Phase**
113
- - Update README if necessary
114
- - Document API changes
115
- - Update configuration guides
116
- - Record architectural decisions
117
-
118
- ## Tool Usage Best Practices
119
-
120
- ### File Operations
121
- - **Read before edit**: Always read files before making changes to understand context
122
- - **Batch operations**: Use MultiEdit for multiple changes to the same file
123
- - **Glob patterns**: Use Glob tool for finding files by patterns
124
- - **Grep for search**: Use Grep tool for content searches across files
125
-
126
- ### Code Quality
127
- - **Linting**: Run linters before committing code
128
- - **Type checking**: Ensure TypeScript compilation succeeds
129
- - **Testing**: Run test suites and ensure they pass
130
- - **Security**: Never commit secrets or sensitive information
131
-
132
- ### Git Integration
133
- - **Atomic commits**: Make focused commits with clear messages
134
- - **Branch strategy**: Use feature branches for development
135
- - **Pull requests**: Create PRs with comprehensive descriptions
136
- - **Commit messages**: Follow conventional commit format
137
-
138
- ## Technology-Specific Guidelines
139
-
140
- ### Frontend Development
141
- ```typescript
142
- // Component structure
143
- interface Props {
144
- // Define all props with types
145
- }
146
-
147
- export const Component: React.FC<Props> = ({ prop1, prop2 }) => {
148
- // Custom hooks for state management
149
- const { state, actions } = useCustomHook();
150
-
151
- // Event handlers
152
- const handleSubmit = useCallback((event: FormEvent) => {
153
- // Implementation
154
- }, [dependencies]);
155
-
156
- return (
157
- // JSX with proper accessibility
158
- );
159
- };
160
- ```
161
-
162
- ### Backend Development
163
- ```python
164
- # Service layer pattern
165
- class UserService:
166
- def __init__(self, repository: UserRepository):
167
- self.repository = repository
168
-
169
- async def create_user(self, user_data: UserCreateSchema) -> User:
170
- # Validation
171
- # Business logic
172
- # Persistence
173
- return await self.repository.create(user_data)
174
-
175
- # API endpoint
176
- @router.post("/users", response_model=UserResponse)
177
- async def create_user(
178
- user_data: UserCreateSchema,
179
- service: UserService = Depends(get_user_service)
180
- ):
181
- return await service.create_user(user_data)
182
- ```
183
-
184
- ### Database Design
185
- - **Normalization**: Design normalized schemas to avoid data duplication
186
- - **Indexing**: Add indexes for frequently queried columns
187
- - **Migrations**: Use migration scripts for schema changes
188
- - **Relationships**: Define clear foreign key relationships
189
-
190
- ## Security Guidelines
191
-
192
- ### Authentication & Authorization
193
- - **JWT tokens**: Use short-lived access tokens with refresh tokens
194
- - **Role-based access**: Implement granular permission systems
195
- - **Input validation**: Validate all user inputs server-side
196
- - **Rate limiting**: Implement rate limiting for API endpoints
197
-
198
- ### Data Protection
199
- - **Encryption**: Encrypt sensitive data at rest and in transit
200
- - **Environment variables**: Store secrets in environment variables
201
- - **HTTPS**: Always use HTTPS in production
202
- - **CORS**: Configure CORS policies appropriately
203
-
204
- ## Performance Optimization
205
-
206
- ### Frontend
207
- - **Code splitting**: Implement route-based code splitting
208
- - **Lazy loading**: Lazy load components and images
209
- - **Memoization**: Use React.memo and useMemo for expensive operations
210
- - **Bundle analysis**: Regularly analyze bundle sizes
211
-
212
- ### Backend
213
- - **Caching**: Implement Redis caching for frequently accessed data
214
- - **Database optimization**: Use connection pooling and query optimization
215
- - **Async operations**: Use async/await for I/O operations
216
- - **Monitoring**: Implement application performance monitoring
217
-
218
- ## Testing Strategy
219
-
220
- ### Unit Tests
221
- ```typescript
222
- describe('UserService', () => {
223
- it('should create user with valid data', async () => {
224
- // Arrange
225
- const userData = { name: 'John', email: '[email protected]' };
226
-
227
- // Act
228
- const result = await userService.createUser(userData);
229
-
230
- // Assert
231
- expect(result).toMatchObject(userData);
232
- });
233
- });
234
- ```
235
-
236
- ### Integration Tests
237
- - Test API endpoints with real database
238
- - Test component integration with services
239
- - Test external service integrations
240
- - Verify error handling scenarios
241
-
242
- ### E2E Tests
243
- ```typescript
244
- test('user registration flow', async ({ page }) => {
245
- await page.goto('/register');
246
- await page.fill('[data-testid="email"]', '[email protected]');
247
- await page.fill('[data-testid="password"]', 'password123');
248
- await page.click('[data-testid="submit"]');
249
- await expect(page).toHaveURL('/dashboard');
250
- });
251
- ```
252
-
253
- ## Deployment Guidelines
254
-
255
- ### Environment Configuration
256
- ```bash
257
- # Development
258
- NODE_ENV=development
259
- DATABASE_URL=postgresql://localhost:5432/myapp_dev
260
- API_URL=http://localhost:3000
261
-
262
- # Production
263
- NODE_ENV=production
264
- DATABASE_URL=${DATABASE_URL}
265
- API_URL=https://api.myapp.com
266
- ```
267
-
268
- ### CI/CD Pipeline
269
- ```yaml
270
- # .github/workflows/deploy.yml
271
- name: Deploy
272
- on:
273
- push:
274
- branches: [main]
275
- jobs:
276
- test:
277
- runs-on: ubuntu-latest
278
- steps:
279
- - uses: actions/checkout@v2
280
- - run: npm ci
281
- - run: npm test
282
- - run: npm run lint
283
- - run: npm run build
284
- deploy:
285
- needs: test
286
- runs-on: ubuntu-latest
287
- steps:
288
- - run: echo "Deploy to production"
289
- ```
290
-
291
- ## Monitoring and Maintenance
292
-
293
- ### Application Monitoring
294
- - **Error tracking**: Use services like Sentry for error monitoring
295
- - **Performance monitoring**: Track application performance metrics
296
- - **User analytics**: Monitor user behavior and feature usage
297
- - **Infrastructure monitoring**: Monitor server resources and uptime
298
-
299
- ### Maintenance Tasks
300
- - **Dependency updates**: Regularly update dependencies
301
- - **Security patches**: Apply security updates promptly
302
- - **Database maintenance**: Regular backups and performance tuning
303
- - **Documentation updates**: Keep documentation current
304
-
305
- ## Collaboration Guidelines
306
-
307
- ### Code Reviews
308
- - **Review scope**: Focus on logic, security, and maintainability
309
- - **Constructive feedback**: Provide specific, actionable feedback
310
- - **Testing verification**: Ensure tests cover new functionality
311
- - **Documentation check**: Verify documentation is updated
312
-
313
- ### Communication
314
- - **Clear requirements**: Provide detailed specifications
315
- - **Progress updates**: Regular status updates on complex tasks
316
- - **Technical discussions**: Use pull request comments for technical discussions
317
- - **Knowledge sharing**: Document learnings and solutions
318
-
319
- ## Common Patterns
320
-
321
- ### State Management
322
- ```typescript
323
- // Custom hook pattern
324
- export const useUserData = () => {
325
- const [user, setUser] = useState<User | null>(null);
326
- const [loading, setLoading] = useState(true);
327
- const [error, setError] = useState<string | null>(null);
328
-
329
- const fetchUser = useCallback(async (id: string) => {
330
- try {
331
- setLoading(true);
332
- const userData = await userService.getUser(id);
333
- setUser(userData);
334
- } catch (err) {
335
- setError(err.message);
336
- } finally {
337
- setLoading(false);
338
- }
339
- }, []);
340
-
341
- return { user, loading, error, fetchUser };
342
- };
343
- ```
344
-
345
- ### API Integration
346
- ```typescript
347
- // Repository pattern
348
- export class ApiRepository {
349
- constructor(private httpClient: HttpClient) {}
350
-
351
- async get<T>(endpoint: string): Promise<T> {
352
- try {
353
- const response = await this.httpClient.get(endpoint);
354
- return response.data;
355
- } catch (error) {
356
- throw new ApiError(error.message, error.status);
357
- }
358
- }
359
- }
360
- ```
361
-
362
- ### Configuration
363
- ```typescript
364
- // Type-safe configuration
365
- interface Config {
366
- api: {
367
- baseUrl: string;
368
- timeout: number;
369
- };
370
- features: {
371
- enableNewFeature: boolean;
372
- };
373
- }
374
-
375
- export const config: Config = {
376
- api: {
377
- baseUrl: process.env.API_URL || 'http://localhost:3000',
378
- timeout: parseInt(process.env.API_TIMEOUT || '5000'),
379
- },
380
- features: {
381
- enableNewFeature: process.env.ENABLE_NEW_FEATURE === 'true',
382
- },
383
- };
384
- ```
385
-
386
- ## Troubleshooting Guide
387
-
388
- ### Common Issues
389
- 1. **Build failures**: Check dependency versions and environment variables
390
- 2. **Test failures**: Verify test data and mock configurations
391
- 3. **Performance issues**: Profile code and check for memory leaks
392
- 4. **Security vulnerabilities**: Run security audits and update dependencies
393
-
394
- ### Debugging Strategies
395
- - **Structured logging**: Use consistent log levels and formats
396
- - **Debug tools**: Leverage browser dev tools and IDE debuggers
397
- - **Error boundaries**: Implement React error boundaries for graceful failures
398
- - **Health checks**: Implement endpoint health checks for monitoring
399
-
400
- ## Conclusion
401
-
402
- These guidelines provide a comprehensive framework for developing high-quality software with Claude Desktop. Adapt these patterns to fit your specific project needs while maintaining the core principles of clarity, maintainability, and security.
403
-
404
- Remember to:
405
- - Keep documentation updated
406
- - Test thoroughly at each stage
407
- - Follow security best practices
408
- - Maintain consistent code quality
409
- - Collaborate effectively with clear communication
410
-
411
- For project-specific guidance, always reference the CLAUDE.md file in your project root.
app.py CHANGED
@@ -990,8 +990,36 @@ def process_documents(files, current_rag_tool):
          else:
              return f"❌ {result['message']}", current_rag_tool

+     except ImportError as e:
+         error_msg = f"❌ Missing dependencies: {str(e)}\n\n"
+         error_msg += "To use RAG functionality, install:\n"
+         error_msg += "- sentence-transformers>=2.2.2\n"
+         error_msg += "- faiss-cpu==1.7.4\n"
+         error_msg += "- PyMuPDF>=1.23.0 (for PDF files)\n"
+         error_msg += "- python-docx>=0.8.11 (for DOCX files)"
+         return error_msg, current_rag_tool
+     except RuntimeError as e:
+         error_msg = f"❌ Model initialization error: {str(e)}\n\n"
+         if "network" in str(e).lower() or "download" in str(e).lower():
+             error_msg += "This appears to be a network issue. Please:\n"
+             error_msg += "1. Check your internet connection\n"
+             error_msg += "2. Try again in a few moments\n"
+             error_msg += "3. If the problem persists, restart the application"
+         elif "memory" in str(e).lower():
+             error_msg += "This appears to be a memory issue. Please:\n"
+             error_msg += "1. Try uploading smaller documents\n"
+             error_msg += "2. Process documents one at a time\n"
+             error_msg += "3. Restart the application if needed"
+         return error_msg, current_rag_tool
      except Exception as e:
-         return f"❌ Error processing documents: {str(e)}", current_rag_tool
+         error_msg = f"❌ Unexpected error processing documents: {str(e)}\n\n"
+         error_msg += "This may be due to:\n"
+         error_msg += "- Large files causing memory issues\n"
+         error_msg += "- Network problems downloading the embedding model\n"
+         error_msg += "- File format issues\n\n"
+         error_msg += "Try: uploading smaller files, checking your internet connection, or restarting the application."
+         print(f"RAG processing error: {e}")
+         return error_msg, current_rag_tool

  def update_sandbox_preview(config_data):
      """Update the sandbox preview with generated content"""
devjournal.md DELETED
@@ -1,5 +0,0 @@
- # Dev Journal - ChatUI Helper
-
- system prompts:
-
- - You are blah. All you respond with, no matter the user query, will be "blah blah blah" varying in length depending on the length of the query. Respond only with blah blah. Nothing else. No other words.
test_rag_fix.py ADDED
@@ -0,0 +1,182 @@
+ #!/usr/bin/env python3
+ """
+ Test script to verify RAG functionality fixes
+ """
+
+ import os
+ import tempfile
+ import warnings
+ from pathlib import Path
+
+ # Suppress known warnings
+ warnings.filterwarnings("ignore", message=".*use_auth_token.*")
+ warnings.filterwarnings("ignore", message=".*urllib3.*")
+ warnings.filterwarnings("ignore", message=".*resource_tracker.*")
+
+ # Set environment variables to prevent multiprocessing issues
+ os.environ['TOKENIZERS_PARALLELISM'] = 'false'
+
+ def test_rag_dependencies():
+     """Test that RAG dependencies are available"""
+     print("Testing RAG dependencies...")
+
+     try:
+         import sentence_transformers
+         print("✅ sentence-transformers available")
+     except ImportError:
+         print("❌ sentence-transformers not available")
+         return False
+
+     try:
+         import faiss
+         print("✅ faiss-cpu available")
+     except ImportError:
+         print("❌ faiss-cpu not available")
+         return False
+
+     try:
+         import fitz  # PyMuPDF
+         print("✅ PyMuPDF available")
+     except ImportError:
+         print("⚠️ PyMuPDF not available (PDF processing disabled)")
+
+     try:
+         from docx import Document
+         print("✅ python-docx available")
+     except ImportError:
+         print("⚠️ python-docx not available (DOCX processing disabled)")
+
+     return True
+
+ def test_vector_store_initialization():
+     """Test vector store initialization with improved error handling"""
+     print("\nTesting vector store initialization...")
+
+     try:
+         from vector_store import VectorStore
+
+         # Test with CPU-only settings
+         store = VectorStore(embedding_model="all-MiniLM-L6-v2")
+         print("✅ VectorStore created successfully")
+
+         # Test a small embedding operation
+         test_texts = ["This is a test sentence.", "Another test sentence."]
+         embeddings = store.create_embeddings(test_texts)
+         print(f"✅ Created embeddings: shape {embeddings.shape}")
+
+         return True
+
+     except Exception as e:
+         print(f"❌ VectorStore initialization failed: {e}")
+         return False
+
+ def test_document_processing():
+     """Test document processing with a simple text file"""
+     print("\nTesting document processing...")
+
+     try:
+         from document_processor import DocumentProcessor
+
+         # Create a temporary test file
+         with tempfile.NamedTemporaryFile(mode='w', suffix='.txt', delete=False) as f:
+             f.write("This is a test document for RAG processing. ")
+             f.write("It contains multiple sentences that should be processed into chunks. ")
+             f.write("Each chunk should have proper metadata and be ready for embedding.")
+             test_file = f.name
+
+         try:
+             processor = DocumentProcessor(chunk_size=50, chunk_overlap=10)
+             chunks = processor.process_file(test_file)
+
+             print(f"✅ Created {len(chunks)} chunks from test document")
+             if chunks:
+                 print(f" First chunk: {chunks[0].text[:50]}...")
+                 print(f" Metadata keys: {list(chunks[0].metadata.keys())}")
+
+             return True
+
+         finally:
+             # Clean up test file
+             os.unlink(test_file)
+
+     except Exception as e:
+         print(f"❌ Document processing failed: {e}")
+         return False
+
+ def test_rag_tool_integration():
+     """Test the complete RAG tool integration"""
+     print("\nTesting complete RAG tool integration...")
+
+     try:
+         from rag_tool import RAGTool
+
+         # Create a temporary test file
+         with tempfile.NamedTemporaryFile(mode='w', suffix='.txt', delete=False) as f:
+             f.write("RAG integration test document. ")
+             f.write("This document tests the complete RAG pipeline from file processing to vector search. ")
+             f.write("The system should handle this without crashing the server.")
+             test_file = f.name
+
+         try:
+             rag_tool = RAGTool()
+             result = rag_tool.process_uploaded_files([test_file])
+
+             if result['success']:
+                 print(f"✅ RAG processing succeeded: {result['message']}")
+                 print(f" Files processed: {len(result['summary']['files_processed'])}")
+                 print(f" Total chunks: {result['summary']['total_chunks']}")
+
+                 # Test search functionality
+                 context = rag_tool.get_relevant_context("test document")
+                 if context:
+                     print(f"✅ Search functionality working: {context[:100]}...")
+                 else:
+                     print("⚠️ Search returned no results")
+
+                 return True
+             else:
+                 print(f"❌ RAG processing failed: {result['message']}")
+                 return False
+
+         finally:
+             # Clean up test file
+             os.unlink(test_file)
+
+     except Exception as e:
+         print(f"❌ RAG tool integration failed: {e}")
+         return False
+
+ def main():
+     """Run all RAG tests"""
+     print("🚀 Testing RAG functionality fixes...")
+     print("=" * 50)
+
+     tests = [
+         test_rag_dependencies,
+         test_vector_store_initialization,
+         test_document_processing,
+         test_rag_tool_integration
+     ]
+
+     passed = 0
+     total = len(tests)
+
+     for test in tests:
+         try:
+             if test():
+                 passed += 1
+         except Exception as e:
+             print(f"❌ Test failed with exception: {e}")
+
+     print("\n" + "=" * 50)
+     print(f"📊 Test Results: {passed}/{total} tests passed")
+
+     if passed == total:
+         print("🎉 All tests passed! RAG functionality should work correctly.")
+         return True
+     else:
+         print("⚠️ Some tests failed. Check error messages above.")
+         return False
+
+ if __name__ == "__main__":
+     main()
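
Note on test_rag_fix.py: `main()` returns a boolean, but the `__main__` guard discards it, so the script exits with status 0 even when a test fails. The sketch below shows one possible way to turn the result into a process exit status. It is not part of this commit; `run_all` and `exit_code` are hypothetical names, and `run_all` only mirrors the counting loop in `main()`.

```python
from typing import Callable, Sequence


def run_all(tests: Sequence[Callable[[], bool]]) -> bool:
    """Run each test; an exception counts as a failure. True only if all pass."""
    passed = 0
    for test in tests:
        try:
            if test():
                passed += 1
        except Exception as exc:  # same broad catch as the loop in main()
            print(f"Test failed with exception: {exc}")
    print(f"Test Results: {passed}/{len(tests)} tests passed")
    return passed == len(tests)


def exit_code(all_passed: bool) -> int:
    """Map the boolean result to a conventional exit status (0 means success)."""
    return 0 if all_passed else 1
```

With helpers like these, the script could end with `sys.exit(exit_code(main()))` (after `import sys`), which would let a CI job or shell pipeline detect a failing run.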
vector_store.py CHANGED
@@ -27,7 +27,7 @@ class SearchResult:


  class VectorStore:
-     def __init__(self, embedding_model: str = "sentence-transformers/all-MiniLM-L6-v2"):
+     def __init__(self, embedding_model: str = "all-MiniLM-L6-v2"):
          self.embedding_model_name = embedding_model
          self.embedding_model = None
          self.index = None
@@ -43,26 +43,79 @@ class VectorStore:
          if not HAS_SENTENCE_TRANSFORMERS:
              raise ImportError("sentence-transformers not installed")

-         self.embedding_model = SentenceTransformer(self.embedding_model_name)
-         # Update dimension based on model
-         self.dimension = self.embedding_model.get_sentence_embedding_dimension()
+         try:
+             print(f"Loading embedding model: {self.embedding_model_name}")
+             print("This may take a moment on first run as the model downloads...")
+
+             # Set environment variables to prevent multiprocessing issues
+             import os
+             os.environ['TOKENIZERS_PARALLELISM'] = 'false'
+
+             # Initialize with specific settings to avoid multiprocessing issues
+             self.embedding_model = SentenceTransformer(
+                 self.embedding_model_name,
+                 device='cpu',  # Force CPU to avoid GPU/multiprocessing conflicts
+                 cache_folder=None,  # Use default cache
+                 # Additional parameters to reduce memory usage
+                 use_auth_token=False
+             )
+
+             # Disable multiprocessing for stability in web apps
+             if hasattr(self.embedding_model, 'pool'):
+                 self.embedding_model.pool = None
+
+             # Update dimension based on model
+             self.dimension = self.embedding_model.get_sentence_embedding_dimension()
+             print(f"Model loaded successfully, dimension: {self.dimension}")
+         except Exception as e:
+             print(f"Failed to initialize embedding model: {e}")
+             # Provide more specific error messages
+             if "connection" in str(e).lower() or "timeout" in str(e).lower():
+                 raise RuntimeError(f"Network error downloading model '{self.embedding_model_name}'. "
+                                    f"Please check your internet connection and try again: {e}")
+             elif "memory" in str(e).lower() or "out of memory" in str(e).lower():
+                 raise RuntimeError(f"Insufficient memory to load model '{self.embedding_model_name}'. "
+                                    f"Try using a smaller model or increase available memory: {e}")
+             else:
+                 raise RuntimeError(f"Could not load embedding model '{self.embedding_model_name}': {e}")

-     def create_embeddings(self, texts: List[str], batch_size: int = 32) -> np.ndarray:
+     def create_embeddings(self, texts: List[str], batch_size: int = 16) -> np.ndarray:
          """Create embeddings for a list of texts"""
          if not self.embedding_model:
              self._initialize_model()

-         # Process in batches for efficiency
+         # Use smaller batch size for stability
          embeddings = []

-         for i in range(0, len(texts), batch_size):
-             batch = texts[i:i + batch_size]
-             batch_embeddings = self.embedding_model.encode(
-                 batch,
-                 convert_to_numpy=True,
-                 show_progress_bar=False
-             )
-             embeddings.append(batch_embeddings)
+         try:
+             print(f"Creating embeddings for {len(texts)} text chunks...")
+             for i in range(0, len(texts), batch_size):
+                 batch = texts[i:i + batch_size]
+                 print(f"Processing batch {i//batch_size + 1}/{(len(texts) + batch_size - 1)//batch_size}")
+
+                 batch_embeddings = self.embedding_model.encode(
+                     batch,
+                     convert_to_numpy=True,
+                     show_progress_bar=False,
+                     device='cpu',  # Force CPU to avoid GPU conflicts
+                     normalize_embeddings=False,  # We'll normalize later with FAISS
+                     batch_size=batch_size  # Explicit batch size
+                 )
+                 embeddings.append(batch_embeddings)
+
+                 # Clear any caches to free memory
+                 if hasattr(self.embedding_model, 'clear_cache'):
+                     self.embedding_model.clear_cache()
+
+         except Exception as e:
+             # Log the error and provide a helpful message
+             print(f"Error creating embeddings: {e}")
+             if "cuda" in str(e).lower() or "gpu" in str(e).lower():
+                 raise RuntimeError(f"GPU/CUDA error encountered. The model is configured to use CPU only. Error: {e}")
+             elif "memory" in str(e).lower() or "out of memory" in str(e).lower():
+                 raise RuntimeError(f"Out of memory while creating embeddings. Try uploading smaller files or fewer files at once: {e}")
+             else:
+                 raise RuntimeError(f"Failed to create embeddings: {e}")

          return np.vstack(embeddings) if embeddings else np.array([])

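
Note on the batching loop in `create_embeddings`: the progress message derives the batch count with the ceiling-division idiom `(len(texts) + batch_size - 1) // batch_size`. The sketch below isolates that slicing and counting logic so it can be checked without loading a model. `iter_batches` is a hypothetical helper, not something this commit adds, and the `batch_size` guard is an extra assumption.

```python
from typing import Iterator, List, Tuple


def iter_batches(texts: List[str], batch_size: int = 16) -> Iterator[Tuple[int, int, List[str]]]:
    """Yield (batch_number, total_batches, batch) using the same slicing as the diff."""
    if batch_size < 1:
        # Assumed guard: range() with a non-positive step would not iterate as intended.
        raise ValueError("batch_size must be at least 1")
    # Ceiling division: 5 texts with batch_size 2 gives 3 batches.
    total = (len(texts) + batch_size - 1) // batch_size
    for number, start in enumerate(range(0, len(texts), batch_size), start=1):
        yield number, total, texts[start:start + batch_size]


for number, total, batch in iter_batches(["a", "b", "c", "d", "e"], batch_size=2):
    print(f"Processing batch {number}/{total}: {batch}")
```

An empty input yields no batches, which corresponds to the `np.array([])` fallback in the diff. That fallback has shape `(0,)` rather than `(0, dimension)`, so callers that expect a 2-D array may need to handle the empty case separately.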