chatui-helper / TEST_PROCEDURE.md
milwright's picture
Replace Crawl4AI with simple HTTP requests and BeautifulSoup
12839ce
|
raw
history blame
8.18 kB

Chat UI Helper - Comprehensive Test Procedure

This document outlines a systematic test procedure for validating the Chat UI Helper application after new commits. This procedure ensures all components function correctly and can be iterated upon as the project evolves.

Pre-Test Setup

Environment Verification

# Verify Python environment
python --version  # Should be 3.8+

# Install/update dependencies
pip install -r requirements.txt

# Verify optional dependencies status
python -c "
try:
    import sentence_transformers, faiss, fitz, docx
    print('βœ… All RAG dependencies available')
except ImportError as e:
    print(f'⚠️  Optional RAG dependencies missing: {e}')
"

Test Data Preparation

# Ensure test document exists
echo "This is a test document for RAG functionality testing." > test_document.txt

# Create test directory structure if needed
mkdir -p test_outputs

Test Categories

1. Core Application Tests

1.1 Application Startup

# Test basic application launch
python app.py &
APP_PID=$!
sleep 10
curl -f http://localhost:7860 > /dev/null && echo "βœ… App started successfully" || echo "❌ App failed to start"
kill $APP_PID

1.2 Gradio Interface Validation

  • Application loads without errors
  • Two tabs visible: "Spaces Configuration" and "Chat Support"
  • All form fields render correctly
  • Template selection works (Custom vs Research Assistant)
  • File upload components appear when RAG is enabled

2. Vector RAG Component Tests

2.1 Individual Component Testing

# Test document processing
python -c "from test_vector_db import test_document_processing; test_document_processing()"

# Test vector store functionality
python -c "from test_vector_db import test_vector_store; test_vector_store()"

# Test full RAG pipeline
python -c "from test_vector_db import test_rag_tool; test_rag_tool()"

2.2 RAG Integration Tests

  • Document upload accepts PDF, DOCX, TXT, MD files
  • File size validation (10MB limit) works
  • Documents are processed and chunked correctly
  • Vector embeddings are generated
  • Similarity search returns relevant results
  • RAG data serializes/deserializes properly for templates

3. Space Generation Tests

3.1 Basic Space Creation

  • Generate space with minimal configuration
  • Verify all required files are created (app.py, requirements.txt, README.md, config.json)
  • Check generated app.py syntax is valid
  • Verify requirements.txt has correct dependencies
  • Ensure README.md contains proper deployment instructions

3.2 Advanced Feature Testing

  • Generate space with URL grounding enabled
  • Generate space with vector RAG enabled
  • Generate space with access code protection
  • Test template substitution works correctly
  • Verify environment variable security pattern

4. Web Scraping Tests

4.1 Mock vs Production Mode

# Test in mock mode (lines 14-18 in app.py)
# Verify placeholder content is returned

# Test in production mode
# Verify actual web content is fetched via HTTP requests

4.2 URL Processing

  • Valid URLs are processed correctly
  • Invalid URLs are handled gracefully
  • Content extraction works for different site types
  • Rate limiting and error handling work

5. Security and Configuration Tests

5.1 Environment Variable Handling

  • API keys are not embedded in generated templates
  • Access codes use environment variable pattern
  • Sensitive data is properly excluded from version control

5.2 Input Validation

  • File upload validation works
  • URL validation prevents malicious inputs
  • Content length limits are enforced
  • XSS prevention in user inputs

6. Chat Support Tests

6.1 OpenRouter Integration

  • Chat responds when API key is configured
  • Proper error message when API key is missing
  • Message history formatting works correctly
  • URL grounding provides relevant context

6.2 Gradio 5.x Compatibility

  • Message format uses type="messages"
  • ChatInterface renders correctly
  • User/assistant message distinction works
  • Chat history persists during session

Automated Test Execution

Quick Test Suite

#!/bin/bash
# quick_test.sh - Run essential tests

echo "πŸ” Running Quick Test Suite..."

# 1. Syntax check
python -m py_compile app.py && echo "βœ… app.py syntax valid" || echo "❌ app.py syntax error"

# 2. Import test
python -c "import app; print('βœ… App imports successfully')" 2>/dev/null || echo "❌ Import failed"

# 3. RAG component test (if available)
if python -c "from rag_tool import RAGTool" 2>/dev/null; then
    python test_vector_db.py && echo "βœ… RAG tests passed" || echo "❌ RAG tests failed"
else
    echo "⚠️  RAG components not available"
fi

# 4. Template generation test
python -c "
import app
result = app.generate_zip('Test Space', 'Test Description', 'Test Role', 'Test Audience', 'Test Tasks', '', [], '', '', 'gpt-3.5-turbo', 0.7, 4000, [], False, False, None)
assert result[0].endswith('.zip'), 'ZIP generation failed'
print('βœ… Space generation works')
"

echo "πŸŽ‰ Quick test suite completed!"

Full Test Suite

#!/bin/bash
# full_test.sh - Comprehensive testing

echo "πŸ” Running Full Test Suite..."

# Run all component tests
./quick_test.sh

# Additional integration tests
echo "πŸ§ͺ Running integration tests..."

# Test with different configurations
# Test error handling
# Test edge cases
# Performance tests

echo "πŸ“Š Generating test report..."
# Generate detailed test report

Regression Test Checklist

After each commit, verify:

  • All existing functionality still works
  • New features don't break existing features
  • Generated spaces deploy successfully to HuggingFace
  • Documentation is updated appropriately
  • Dependencies are correctly specified
  • Security patterns are maintained

Performance Benchmarks

Metrics to Track

  • Application startup time
  • Space generation time
  • Document processing time (for various file sizes)
  • Memory usage during RAG operations
  • API response times

Benchmark Commands

# Startup time
time python -c "import app; print('App loaded')"

# Space generation time
time python -c "
import app
app.generate_zip('Benchmark', 'Test', 'Role', 'Audience', 'Tasks', '', [], '', '', 'gpt-3.5-turbo', 0.7, 4000, [], False, False, None)
"

# RAG processing time
time python -c "from test_vector_db import test_rag_tool; test_rag_tool()"

Test Data Management

Sample Test Files

  • test_document.txt - Basic text document
  • sample.pdf - PDF document for upload testing
  • sample.docx - Word document for testing
  • sample.md - Markdown document for testing

Test Configuration Profiles

  • Minimal configuration (basic chat only)
  • Research assistant template
  • Full-featured (RAG + URL grounding + access control)
  • Edge case configurations

Continuous Integration Integration

GitHub Actions Integration

# .github/workflows/test.yml
name: Test Chat UI Helper
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.9'
      - name: Install dependencies
        run: pip install -r requirements.txt
      - name: Run test suite
        run: ./quick_test.sh

Future Test Enhancements

Planned Additions

  • Automated UI testing with Selenium
  • Load testing for generated spaces
  • Cross-browser compatibility testing
  • Mobile responsiveness testing
  • Accessibility testing
  • Multi-language content testing

Test Coverage Goals

  • 90%+ code coverage for core components
  • All user workflows tested end-to-end
  • Error conditions properly tested
  • Performance regression detection

Last Updated: 2025-07-13 Version: 1.0 Maintained by: Development Team

This test procedure should be updated whenever new features are added or existing functionality is modified.