
SL Clinical Assistant - Project Scratchpad

Current Active Task

  • Task: System Redesign and Refinement
  • Plan: docs/implementation-plan/system-redesign-and-refinement.md
  • Status: Just started. The plan has been formulated.

Previous Tasks (for reference)

  • Task: Maternal Health RAG Chatbot v2 (DEPRECATED)
  • Task: Maternal Health RAG Chatbot v3
  • Task: Web UI for Clinical Chatbot (Superseded by new plan)

Research-Based Redesign Summary

🔬 Key Research Findings (2024-2025):

  • Complex medical categorization approaches don't work - simpler document-based retrieval significantly outperforms categorical chunking
  • Optimal chunking: 400-800 characters with 15% overlap using natural boundaries
  • NLP Integration Essential: Dedicated medical language models crucial for professional answer presentation
  • Document-Centric: Retrieve directly from parsed guidelines using document structure
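As a concrete sketch, the chunking recipe above (400-800 characters, ~15% overlap, natural boundaries) might look like the following. The paragraph-splitting heuristic here is an illustrative assumption, not the project's actual implementation:

```python
def chunk_document(text: str, max_size: int = 800, overlap_ratio: float = 0.15) -> list[str]:
    """Greedy paragraph-based chunker targeting the 400-800 character range,
    carrying ~15% overlap between consecutive chunks."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        if current and len(current) + len(para) + 2 > max_size:
            chunks.append(current)
            # seed the next chunk with the tail of this one for clinical continuity
            tail = current[-int(max_size * overlap_ratio):]
            current = tail + "\n\n" + para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```

Splitting on blank lines keeps paragraphs (the "natural boundaries") intact; a single paragraph longer than `max_size` would pass through unsplit, which a production version would need to handle.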

❌ Problems with Current v1.0 Implementation:

  1. Over-engineered: splitting content into 542 medically-aware chunks with separate categories is too complex
  2. Category Fragmentation: Clinical information gets split across artificial categories
  3. Poor Answer Presentation: Lacks proper NLP formatting for healthcare professionals
  4. Reduced Retrieval Accuracy: Complex categorization reduces semantic coherence

New v2.0 Simplified Architecture

🎯 Core Principles:

  • Document-Centric Retrieval: Retrieve from parsed guidelines directly using document structure
  • Simple Semantic Chunking: Use paragraph/section-based chunking preserving clinical context
  • NLP Answer Enhancement: Dedicated models for professional medical presentation
  • Clinical Safety: Maintain medical disclaimers and source attribution

📋 Revised Task Plan:

  1. Document Structure Analysis & Simple Chunking - Replace complex categorization
  2. Enhanced Document-Based Vector Store - Simple metadata approach
  3. NLP Answer Generation Pipeline - Medical language model integration
  4. Medical Language Model Integration - OpenBioLLM-8B or similar
  5. Simplified RAG Pipeline - Streamlined retrieval-generation
  6. Professional Interface Enhancement - Healthcare professional UX
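Steps 1-5 above reduce to a two-stage retrieve-then-generate flow. A minimal skeleton (the names here are illustrative, not the project's actual module layout):

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class SimplifiedRAGPipeline:
    """Skeleton of the streamlined retrieval-generation flow."""
    retrieve: Callable[[str, int], List[str]]  # (query, top_k) -> relevant chunks
    generate: Callable[[str, List[str]], str]  # (query, chunks) -> formatted answer

    def answer(self, question: str, top_k: int = 5) -> str:
        chunks = self.retrieve(question, top_k)
        return self.generate(question, chunks)
```

Keeping retrieval and generation as pluggable callables makes it easy to swap the vector store or the medical language model without touching the pipeline itself.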

Previous v1.0 Achievements (To Be Simplified)

✅ 15 PDF documents processed (479 pages, 48 tables, 107,010 words)
✅ Robust PDF extraction using pdfplumber
✅ Vector store infrastructure with FAISS
✅ Basic RAG pipeline working
✅ Gradio interface functional
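For reference, the pdfplumber extraction can be sketched roughly as below. `extract_text()` and `extract_tables()` are pdfplumber's standard page-level calls; the aggregation around them is an assumed sketch, not the project's actual script:

```python
def extract_pdf(path: str) -> dict:
    """Extract page text and tables from one guideline PDF with pdfplumber."""
    import pdfplumber  # imported lazily so the module loads without the dependency

    pages, tables = [], []
    with pdfplumber.open(path) as pdf:
        for page in pdf.pages:
            pages.append(page.extract_text() or "")   # None on image-only pages
            tables.extend(page.extract_tables())      # each table is a list of rows
    return {
        "text": "\n\n".join(pages),
        "tables": tables,
        "word_count": sum(len(p.split()) for p in pages),
    }
```

Running this over all 15 PDFs would yield the corpus-level stats noted above (pages, tables, word count).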

🔄 Status: Ready to implement v2.0 simplified approach based on latest research

Current Task: Web UI for Clinical Chatbot

  • File: docs/implementation-plan/web-ui-for-chatbot.md
  • Goal: Create a web-based user interface for the RAG chatbot and deploy it.
  • Status: Just started. The plan has been formulated.

Current Task: Maternal Health RAG Chatbot v3

Reference: docs/implementation-plan/maternal-health-rag-chatbot-v3.md

Planner's Goal

The primary goal is to execute the new three-phase plan to rebuild the chatbot's data processing and retrieval backbone. This will address the core quality issues of poor data extraction from complex PDFs and robotic, templated LLM responses. Success is defined as a chatbot that can accurately answer questions using data from tables and diagrams, and does so in a natural, conversational manner.

Executor's Next Step

The first step for the executor is to begin Phase 1: Advanced Multi-Modal Document Processing. This involves:

  1. Updating requirements.txt to add the mineru library.
  2. Creating the new src/advanced_pdf_processor.py script.

Let's begin. Please switch to executor mode.

Lessons Learned

Data Processing and Medical Documents

  • [2024-12-29] Use pdfplumber over pymupdf4llm for medical documents with tables and flowcharts
  • [2024-12-29] 400-800 character chunks with natural document boundaries work better than complex medical categorization
  • [2024-12-29] Document-based metadata more effective than artificial medical subcategories
  • [2024-12-29] Simple approach with all-MiniLM-L6-v2 embeddings achieves excellent retrieval (0.6-0.8+ relevance)
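The 0.6-0.8+ relevance figures above read as cosine similarities over chunk embeddings. A minimal scoring sketch using plain NumPy (in the real pipeline the vectors would come from `SentenceTransformer("all-MiniLM-L6-v2").encode(...)`, which is an assumption about the exact call site):

```python
import numpy as np

def top_k_chunks(query_vec, chunk_vecs, chunks, k=3):
    """Rank chunks by cosine similarity to the query embedding."""
    q = query_vec / np.linalg.norm(query_vec)
    m = chunk_vecs / np.linalg.norm(chunk_vecs, axis=1, keepdims=True)
    scores = m @ q  # cosine similarity per chunk, in [-1, 1]
    order = np.argsort(scores)[::-1][:k]
    return [(chunks[i], float(scores[i])) for i in order]
```

A FAISS `IndexFlatIP` over L2-normalized vectors produces the same ranking; this NumPy version just makes the relevance score explicit.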

System Architecture and Performance

  • [2024-12-29] Simplified vector store approach (2,021 chunks) outperforms complex categorization significantly
  • [2024-12-29] Template-based medical formatting works but lacks true medical reasoning capabilities
  • [2024-12-29] User feedback critical: "poor retrieval capabilities, just keyword matching rather than medical reasoning"

Model Deployment and Integration

  • [2024-12-29] Local deployment of large models (15GB OpenBioLLM-8B) unreliable due to download timeouts and hardware requirements
  • [2024-12-29] HuggingFace Inference API more reliable than local model deployment for production systems
  • [2024-12-29] CRITICAL: OpenBioLLM-8B NOT available via HuggingFace Inference API (December 2024)
  • [2024-12-29] Llama 3.3-70B-Instruct via HF API superior to local 8B models: 70B parameters > 8B for medical reasoning
  • [2024-12-29] Medical prompt engineering can adapt general LLMs for healthcare applications effectively
  • [2024-12-29] API integration (OpenAI-compatible) faster and more reliable than local model debugging

Current Implementation State

  • v1.0 System (COMPLETED): Complex medical categorization approach with local vector store
  • v2.0 Core (COMPLETED): Simplified document-based RAG system with 2,021 optimized chunks
  • Current Challenge: Medical LLM integration for proper clinical reasoning vs keyword matching

Active Implementation Files

  • Primary Implementation Plan: docs/implementation-plan/maternal-health-rag-chatbot-v2.md
  • Status: Researching HuggingFace API integration for medical LLM vs local OpenBioLLM deployment

Recent Research and Decision

HuggingFace API Analysis (December 2024)

  • Local OpenBioLLM-8B: Failed deployment due to 15GB size, connection timeouts, hardware requirements
  • HuggingFace API Availability: OpenBioLLM-8B NOT available via HF Inference API
  • Recommended Alternative: Llama 3.3-70B-Instruct via HF API with medical prompt engineering
  • Rationale: 70B parameters > 8B for medical reasoning, reliable API vs local deployment issues

Strategic Pivot Decision

From: Local OpenBioLLM-8B deployment (unreliable, 8B parameters)
To: HuggingFace API + Llama 3.3-70B-Instruct (reliable, 70B parameters, medical prompting)

Advantages of HF API Approach:

  • Superior model size (70B vs 8B parameters)
  • Reliable cloud infrastructure vs local deployment
  • Latest December 2024 model with cutting-edge capabilities
  • OpenAI-compatible API for easy integration
  • No hardware/download requirements

Implementation Strategy:

  1. HuggingFace API integration with OpenAI format
  2. Medical prompt engineering for general Llama models
  3. RAG integration with clinical formatting
  4. Professional medical disclaimers and safety
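Steps 1-4 can be sketched as a single prompt-assembly function in OpenAI chat format; the exact system-prompt wording here is an illustrative assumption:

```python
def build_messages(question: str, context_chunks: list) -> list:
    """Assemble an OpenAI-format chat request: medical system prompt,
    retrieved guideline context, and a safety/disclaimer instruction."""
    system = (
        "You are a clinical assistant for healthcare professionals. "
        "Answer ONLY from the provided guideline excerpts, cite the source "
        "for each claim, and end with a medical disclaimer."
    )
    context = "\n\n".join(f"[Source {i + 1}] {c}" for i, c in enumerate(context_chunks))
    user = f"Guideline excerpts:\n{context}\n\nQuestion: {question}"
    return [{"role": "system", "content": system},
            {"role": "user", "content": user}]
```

These messages would then go to the HF-hosted model via the `openai` client, e.g. `OpenAI(base_url=..., api_key=HF_TOKEN).chat.completions.create(model="meta-llama/Llama-3.3-70B-Instruct", messages=...)`; the correct `base_url` should be confirmed against current HuggingFace Inference documentation.
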
Next Task