# Chat UI Helper - Comprehensive Test Procedure
This document outlines a systematic test procedure for validating the Chat UI Helper application after new commits. It is meant to confirm that all components function correctly and to be iterated on as the project evolves.
## Pre-Test Setup
### Environment Verification
```bash
# Verify Python environment
python --version # Should be 3.8+
# Install/update dependencies
pip install -r requirements.txt
# Verify optional dependencies status
python -c "
try:
    import sentence_transformers, faiss, fitz, docx
    print('✅ All RAG dependencies available')
except ImportError as e:
    print(f'⚠️ Optional RAG dependencies missing: {e}')
"
```
### Test Data Preparation
```bash
# Ensure test document exists
echo "This is a test document for RAG functionality testing." > test_document.txt
# Create test directory structure if needed
mkdir -p test_outputs
```
## Test Categories
### 1. Core Application Tests
#### 1.1 Application Startup
```bash
# Test basic application launch
python app.py &
APP_PID=$!
sleep 10
curl -f http://localhost:7860 > /dev/null && echo "✅ App started successfully" || echo "❌ App failed to start"
kill $APP_PID
```
#### 1.2 Gradio Interface Validation
- [ ] Application loads without errors
- [ ] Two tabs visible: "Spaces Configuration" and "Chat Support"
- [ ] All form fields render correctly
- [ ] Template selection works (Custom vs Research Assistant)
- [ ] File upload components appear when RAG is enabled
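
As a quick automated complement to the manual checks above, the sketch below queries the running app's Gradio config endpoint and looks for the two expected tab labels. It assumes the app is already serving on `http://localhost:7860` and that your Gradio version exposes its layout at `/config`; treat it as a starting point rather than a definitive check.
```python
# check_ui.py - minimal sketch; assumes the app is already running on
# http://localhost:7860 and that Gradio exposes its layout JSON at /config
# (true for recent Gradio releases, but verify against your version).
import json
import urllib.request

CONFIG_URL = "http://localhost:7860/config"

def check_tabs(expected=("Spaces Configuration", "Chat Support")):
    with urllib.request.urlopen(CONFIG_URL, timeout=10) as resp:
        config = json.load(resp)
    # Serialize the whole config and look for the expected tab labels;
    # crude, but avoids depending on the exact component tree shape.
    blob = json.dumps(config)
    missing = [label for label in expected if label not in blob]
    if missing:
        print(f"❌ Missing tabs: {missing}")
    else:
        print("✅ Both tabs present in the Gradio config")

if __name__ == "__main__":
    check_tabs()
```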
### 2. Vector RAG Component Tests
#### 2.1 Individual Component Testing
```bash
# Test document processing
python -c "from test_vector_db import test_document_processing; test_document_processing()"
# Test vector store functionality
python -c "from test_vector_db import test_vector_store; test_vector_store()"
# Test full RAG pipeline
python -c "from test_vector_db import test_rag_tool; test_rag_tool()"
```
#### 2.2 RAG Integration Tests
- [ ] Document upload accepts PDF, DOCX, TXT, MD files
- [ ] File size validation (10MB limit) works
- [ ] Documents are processed and chunked correctly
- [ ] Vector embeddings are generated
- [ ] Similarity search returns relevant results
- [ ] RAG data serializes/deserializes properly for templates
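
The sketch below illustrates the file-size and serialization checks from this list using only the standard library; the real validation lives in app.py, and the shape of the RAG data dictionary shown here is an assumption.
```python
# rag_checks.py - illustrative sketch only; the real validation lives in
# app.py, and the 10MB figure mirrors the limit stated in the checklist above.
import json
import os

MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # 10MB limit from the checklist
ALLOWED_EXTENSIONS = {".pdf", ".docx", ".txt", ".md"}

def validate_upload(path):
    """Reject files with unsupported extensions or above the size limit."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        print(f"❌ {path}: unsupported file type {ext}")
        return False
    if os.path.getsize(path) > MAX_UPLOAD_BYTES:
        print(f"❌ {path}: exceeds 10MB limit")
        return False
    print(f"✅ {path}: accepted")
    return True

def roundtrip_rag_data(rag_data):
    """Check that RAG data survives JSON serialization for the templates."""
    restored = json.loads(json.dumps(rag_data))
    ok = restored == rag_data
    print("✅ RAG data round-trips" if ok else "❌ RAG data changed in round-trip")
    return ok

if __name__ == "__main__":
    validate_upload("test_document.txt")
    # The key names below are assumptions about the serialized RAG payload.
    roundtrip_rag_data({"chunks": ["This is a test chunk."], "embeddings": [[0.1, 0.2]]})
```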
### 3. Space Generation Tests
#### 3.1 Basic Space Creation
- [ ] Generate space with minimal configuration
- [ ] Verify all required files are created (app.py, requirements.txt, README.md, config.json)
- [ ] Check generated app.py syntax is valid
- [ ] Verify requirements.txt has correct dependencies
- [ ] Ensure README.md contains proper deployment instructions
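
A hedged sketch of automating these checks: generate a space, unpack the ZIP, confirm the required files exist, and parse the generated app.py without executing it. The `generate_zip` arguments mirror the quick test suite below and may need adjusting if the signature changes.
```python
# verify_space_zip.py - sketch that unpacks a generated space and checks the
# files listed above; assumes generate_zip returns the ZIP path as result[0].
import ast
import zipfile

import app

REQUIRED_FILES = {"app.py", "requirements.txt", "README.md", "config.json"}

def verify_generated_space():
    result = app.generate_zip('Test Space', 'Test Description', 'Test Role',
                              'Test Audience', 'Test Tasks', '', [], '', '',
                              'gpt-3.5-turbo', 0.7, 4000, [], False, False, None)
    zip_path = result[0]
    with zipfile.ZipFile(zip_path) as zf:
        names = {name.split("/")[-1] for name in zf.namelist()}
        missing = REQUIRED_FILES - names
        assert not missing, f"Missing files in ZIP: {missing}"
        # Syntax-check the generated app.py without executing it
        generated = next(n for n in zf.namelist() if n.endswith("app.py"))
        ast.parse(zf.read(generated).decode("utf-8"))
    print(f"✅ {zip_path} contains all required files and app.py parses")

if __name__ == "__main__":
    verify_generated_space()
```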
#### 3.2 Advanced Feature Testing
- [ ] Generate space with URL grounding enabled
- [ ] Generate space with vector RAG enabled
- [ ] Generate space with access code protection
- [ ] Test template substitution works correctly
- [ ] Verify environment variable security pattern
### 4. Web Scraping Tests
#### 4.1 Mock vs Production Mode
```bash
# Test in mock mode (lines 14-18 in app.py)
# Verify placeholder content is returned
# Test in production mode
# Verify actual web content is fetched via HTTP requests
```
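The sketch below approximates the production-mode fetch path with requests and BeautifulSoup (the stack that replaced Crawl4AI); the actual function in app.py may differ in headers, timeouts, and content cleanup, so use this only to sanity-check connectivity and extraction.
```python
# fetch_check.py - sketch of the production-mode fetch path; details such as
# the User-Agent string and timeout are illustrative choices, not app.py's.
import requests
from bs4 import BeautifulSoup

def fetch_page_text(url, timeout=15):
    """Fetch a URL and return its visible text, or an empty string on failure."""
    try:
        resp = requests.get(url, timeout=timeout,
                            headers={"User-Agent": "chatui-helper-test"})
        resp.raise_for_status()
    except requests.RequestException as exc:
        print(f"❌ Failed to fetch {url}: {exc}")
        return ""
    soup = BeautifulSoup(resp.text, "html.parser")
    # Drop script/style blocks before extracting text
    for tag in soup(["script", "style"]):
        tag.decompose()
    return " ".join(soup.get_text(separator=" ").split())

if __name__ == "__main__":
    text = fetch_page_text("https://example.com")
    print("✅ Fetched content" if text else "⚠️ No content returned")
```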
#### 4.2 URL Processing
- [ ] Valid URLs are processed correctly
- [ ] Invalid URLs are handled gracefully
- [ ] Content extraction works for different site types
- [ ] Rate limiting and error handling work
### 5. Security and Configuration Tests
#### 5.1 Environment Variable Handling
- [ ] API keys are not embedded in generated templates
- [ ] Access codes use environment variable pattern
- [ ] Sensitive data is properly excluded from version control
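
A possible way to automate the secrets check: scan every file in a generated space ZIP for obvious key patterns. The patterns and the `ACCESS_CODE` variable name below are illustrative assumptions, not the app's actual identifiers.
```python
# secret_scan.py - sketch that scans a generated space ZIP for obvious
# embedded secrets; the patterns below are examples, not an exhaustive list.
import re
import zipfile

# Example patterns: OpenAI/OpenRouter-style keys and a hard-coded access code
# (ACCESS_CODE is a hypothetical variable name used only for illustration).
SUSPICIOUS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),
    re.compile(r"ACCESS_CODE\s*=\s*['\"][^'\"]+['\"]"),
]

def scan_zip(zip_path):
    clean = True
    with zipfile.ZipFile(zip_path) as zf:
        for name in zf.namelist():
            text = zf.read(name).decode("utf-8", errors="ignore")
            for pattern in SUSPICIOUS:
                if pattern.search(text):
                    print(f"❌ Possible embedded secret in {name}: {pattern.pattern}")
                    clean = False
    if clean:
        print("✅ No obvious embedded secrets found")
    return clean
```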
#### 5.2 Input Validation
- [ ] File upload validation works
- [ ] URL validation prevents malicious inputs
- [ ] Content length limits are enforced
- [ ] XSS prevention in user inputs
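
The following sketch shows the kind of URL validation these items refer to, using only `urllib.parse`; app.py may enforce stricter rules.
```python
# url_validation.py - sketch of the URL checks in the checklist above.
from urllib.parse import urlparse

MAX_URL_LENGTH = 2048  # illustrative limit

def is_safe_url(url):
    """Allow only http(s) URLs with a hostname and a sane length."""
    if len(url) > MAX_URL_LENGTH:
        return False
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False  # rejects javascript:, data:, file:, etc.
    return bool(parsed.netloc)

assert is_safe_url("https://example.com/docs")
assert not is_safe_url("javascript:alert(1)")
assert not is_safe_url("file:///etc/passwd")
```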
### 6. Chat Support Tests
#### 6.1 OpenRouter Integration
- [ ] Chat responds when API key is configured
- [ ] Proper error message when API key is missing
- [ ] Message history formatting works correctly
- [ ] URL grounding provides relevant context
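
A minimal smoke test for the OpenRouter integration might look like the sketch below. It assumes the key is read from an `OPENROUTER_API_KEY` environment variable and uses OpenRouter's OpenAI-compatible chat completions endpoint; confirm the exact variable name and model id against the app and the OpenRouter docs.
```python
# openrouter_smoke_test.py - one-shot request sketch; endpoint, model id, and
# the OPENROUTER_API_KEY variable name are assumptions to verify locally.
import os
import requests

def smoke_test(model="openai/gpt-3.5-turbo"):
    api_key = os.environ.get("OPENROUTER_API_KEY")
    if not api_key:
        print("⚠️ OPENROUTER_API_KEY not set - the app should show a clear error in this case")
        return
    resp = requests.post(
        "https://openrouter.ai/api/v1/chat/completions",
        headers={"Authorization": f"Bearer {api_key}"},
        json={"model": model,
              "messages": [{"role": "user", "content": "Reply with the word: pong"}]},
        timeout=30,
    )
    resp.raise_for_status()
    reply = resp.json()["choices"][0]["message"]["content"]
    print(f"✅ OpenRouter responded: {reply[:80]}")

if __name__ == "__main__":
    smoke_test()
```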
#### 6.2 Gradio 5.x Compatibility
- [ ] Message format uses `type="messages"`
- [ ] ChatInterface renders correctly
- [ ] User/assistant message distinction works
- [ ] Chat history persists during session
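
For reference, this is the history shape `type="messages"` expects (OpenAI-style role/content dictionaries rather than the legacy list-of-pairs format); a small assertion helper like the one below can be dropped into tests.
```python
# messages_format.py - sketch of the history shape used with type="messages".
history = [
    {"role": "user", "content": "What does this space do?"},
    {"role": "assistant", "content": "It helps you generate chat UI spaces."},
]

def check_history(messages):
    for msg in messages:
        assert msg["role"] in ("user", "assistant", "system"), f"Bad role: {msg}"
        assert isinstance(msg["content"], str), f"Non-string content: {msg}"
    print("✅ History matches the type='messages' format")

check_history(history)
```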
## Automated Test Execution
### Quick Test Suite
```bash
#!/bin/bash
# quick_test.sh - Run essential tests
echo "πŸ” Running Quick Test Suite..."
# 1. Syntax check
python -m py_compile app.py && echo "✅ app.py syntax valid" || echo "❌ app.py syntax error"
# 2. Import test
python -c "import app; print('βœ… App imports successfully')" 2>/dev/null || echo "❌ Import failed"
# 3. RAG component test (if available)
if python -c "from rag_tool import RAGTool" 2>/dev/null; then
    python test_vector_db.py && echo "✅ RAG tests passed" || echo "❌ RAG tests failed"
else
    echo "⚠️ RAG components not available"
fi
# 4. Template generation test
python -c "
import app
result = app.generate_zip('Test Space', 'Test Description', 'Test Role', 'Test Audience', 'Test Tasks', '', [], '', '', 'gpt-3.5-turbo', 0.7, 4000, [], False, False, None)
assert result[0].endswith('.zip'), 'ZIP generation failed'
print('✅ Space generation works')
"
echo "πŸŽ‰ Quick test suite completed!"
```
### Full Test Suite
```bash
#!/bin/bash
# full_test.sh - Comprehensive testing
echo "πŸ” Running Full Test Suite..."
# Run all component tests
./quick_test.sh
# Additional integration tests
echo "πŸ§ͺ Running integration tests..."
# Test with different configurations
# Test error handling
# Test edge cases
# Performance tests
echo "πŸ“Š Generating test report..."
# Generate detailed test report
```
## Regression Test Checklist
After each commit, verify:
- [ ] All existing functionality still works
- [ ] New features don't break existing features
- [ ] Generated spaces deploy successfully to HuggingFace (see the deployment sketch after this list)
- [ ] Documentation is updated appropriately
- [ ] Dependencies are correctly specified
- [ ] Security patterns are maintained
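
One way to exercise the HuggingFace deployment item is a push via `huggingface_hub`, sketched below; the repo id and local folder path are placeholders, and an `HF_TOKEN` with write access must be available in the environment.
```python
# deploy_smoke_test.py - sketch of pushing an unpacked generated space to the
# Hub; repo_id and the folder path are placeholders, not project conventions.
import os
from huggingface_hub import HfApi

def deploy_space(folder, repo_id="your-username/chatui-smoke-test"):
    api = HfApi(token=os.environ.get("HF_TOKEN"))
    api.create_repo(repo_id=repo_id, repo_type="space", space_sdk="gradio", exist_ok=True)
    api.upload_folder(folder_path=folder, repo_id=repo_id, repo_type="space")
    print(f"✅ Uploaded {folder} to https://huggingface.co/spaces/{repo_id}")

if __name__ == "__main__":
    deploy_space("test_outputs/generated_space")  # placeholder path
```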
## Performance Benchmarks
### Metrics to Track
- Application startup time
- Space generation time
- Document processing time (for various file sizes)
- Memory usage during RAG operations
- API response times
### Benchmark Commands
```bash
# Startup time
time python -c "import app; print('App loaded')"
# Space generation time
time python -c "
import app
app.generate_zip('Benchmark', 'Test', 'Role', 'Audience', 'Tasks', '', [], '', '', 'gpt-3.5-turbo', 0.7, 4000, [], False, False, None)
"
# RAG processing time
time python -c "from test_vector_db import test_rag_tool; test_rag_tool()"
```
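For memory tracking, a rough standard-library sketch is shown below; `resource.ru_maxrss` is reported in kilobytes on Linux and bytes on macOS, and the `resource` module is Unix-only.
```python
# memory_benchmark.py - rough sketch of peak-memory and wall-clock tracking
# around the RAG test; interpret ru_maxrss units per your platform.
import resource
import time

from test_vector_db import test_rag_tool

start = time.perf_counter()
test_rag_tool()
elapsed = time.perf_counter() - start

peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
print(f"RAG pipeline: {elapsed:.2f}s, peak RSS ~{peak} (KB on Linux, bytes on macOS)")
```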
## Test Data Management
### Sample Test Files
- `test_document.txt` - Basic text document
- `sample.pdf` - PDF document for upload testing
- `sample.docx` - Word document for testing
- `sample.md` - Markdown document for testing
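
If the PDF, DOCX, and Markdown samples are missing, the sketch below can generate simple versions of them; the PDF and DOCX steps rely on the optional RAG dependencies (PyMuPDF imported as `fitz`, and python-docx) checked in the environment setup.
```python
# make_test_files.py - sketch that creates the sample documents listed above.
import fitz  # PyMuPDF
from docx import Document

# Plain-text and Markdown samples
with open("test_document.txt", "w") as f:
    f.write("This is a test document for RAG functionality testing.\n")
with open("sample.md", "w") as f:
    f.write("# Sample\n\nMarkdown content for upload testing.\n")

# Minimal one-page PDF
pdf = fitz.open()
page = pdf.new_page()
page.insert_text((72, 72), "Sample PDF content for upload testing.")
pdf.save("sample.pdf")

# Minimal DOCX
doc = Document()
doc.add_paragraph("Sample DOCX content for upload testing.")
doc.save("sample.docx")

print("✅ Sample test files created")
```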
### Test Configuration Profiles
- Minimal configuration (basic chat only)
- Research assistant template
- Full-featured (RAG + URL grounding + access control)
- Edge case configurations
## Continuous Integration
### GitHub Actions Integration
```yaml
# .github/workflows/test.yml
name: Test Chat UI Helper
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.9'
      - name: Install dependencies
        run: pip install -r requirements.txt
      - name: Run test suite
        run: ./quick_test.sh
```
## Future Test Enhancements
### Planned Additions
- [ ] Automated UI testing with Selenium
- [ ] Load testing for generated spaces
- [ ] Cross-browser compatibility testing
- [ ] Mobile responsiveness testing
- [ ] Accessibility testing
- [ ] Multi-language content testing
### Test Coverage Goals
- [ ] 90%+ code coverage for core components
- [ ] All user workflows tested end-to-end
- [ ] Error conditions properly tested
- [ ] Performance regression detection
---
**Last Updated**: 2025-07-13
**Version**: 1.0
**Maintained by**: Development Team
This test procedure should be updated whenever new features are added or existing functionality is modified.