
System Patterns

Code Organization

  • Main processing components in root directory
  • Utility functions in utils/ directory with specific submodules
  • UI components in ui/ directory
  • Test cases and samples in testing/ directory
  • Input/output directories for document processing
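A minimal sketch of how the layout above could be checked from a script. The directory names utils, ui and testing come from the list; the helper name missing_dirs is hypothetical, and the input/output directories are left out because their names are not given here.

```python
from pathlib import Path

# Directory names taken from the list above. The input/output directory
# names are not specified, so they are not checked here.
EXPECTED_DIRS = ("utils", "ui", "testing")


def missing_dirs(root):
    """Return the expected subdirectories that do not exist under root."""
    root = Path(root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
```

Calling missing_dirs(".") from the project root would return an empty list if all three directories are present.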

Naming Conventions

  • Snake case for file names and functions
  • Module names reflect their purpose (e.g., ocr_processing.py, image_segmentation.py)
  • Consistent test output naming with descriptive prefixes
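The conventions above can be illustrated with a short sketch. The regular expression is one reasonable definition of snake case, and build_test_output_name is a hypothetical helper: the actual prefix scheme used for test outputs is not specified here.

```python
import re

# Lowercase words separated by single underscores, e.g. "ocr_processing".
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*")


def is_snake_case(name):
    """Return True if name (without its extension) is snake case."""
    return SNAKE_CASE.fullmatch(name) is not None


def build_test_output_name(prefix, sample, ext="txt"):
    """Join a descriptive prefix and a sample name (assumed scheme)."""
    if not (is_snake_case(prefix) and is_snake_case(sample)):
        raise ValueError("prefix and sample must be snake case")
    return f"{prefix}_{sample}.{ext}"
```

With this sketch, the module names from the list (ocr_processing, image_segmentation) pass the check, while a name such as OcrProcessing does not.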

Processing Pipeline

  1. Preprocessing (enhancement, cleaning)
  2. Segmentation (identifying text regions)
  3. OCR processing with context-specific strategies
  4. Post-processing and output formatting
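The four stages above can be sketched as a chain of functions that pass one object along. This is an illustration only: the Document container, the function names, and the strategy parameter are assumptions, and each stage body is a placeholder, not the project's actual processing code.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Carries state between pipeline stages (hypothetical container)."""
    image: str  # stand-in for real image data
    regions: list = field(default_factory=list)
    text: str = ""
    log: list = field(default_factory=list)


def preprocess(doc):
    # Placeholder for enhancement and cleaning.
    doc.log.append("preprocess")
    return doc


def segment(doc):
    # Placeholder: treat the whole image as a single text region.
    doc.regions = [doc.image]
    doc.log.append("segment")
    return doc


def run_ocr(doc, strategy="default"):
    # Placeholder: a real step would choose an OCR strategy per context.
    doc.text = " ".join(f"[{strategy}] {region}" for region in doc.regions)
    doc.log.append("ocr")
    return doc


def postprocess(doc):
    # Placeholder for output formatting.
    doc.text = doc.text.strip()
    doc.log.append("postprocess")
    return doc


def run_pipeline(image, strategy="default"):
    """Run the four stages in order and return the final Document."""
    doc = Document(image=image)
    doc = preprocess(doc)
    doc = segment(doc)
    doc = run_ocr(doc, strategy=strategy)
    return postprocess(doc)
```

Keeping each stage as a function with the same input and output type makes it straightforward to test stages in isolation or to swap the OCR strategy without touching the other steps.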