docsync / prompts.txt
Design and implement a full-stack intelligent document automation system functionally analogous to the application that can be cloned with: git clone https://huggingface.co/spaces/seamoors/wealthsync-ad

The application shall enable users to upload a fully completed invoice, form, or PDF document, which is then programmatically parsed using Python-based OCR (e.g., Tesseract), PDF parsing (e.g., PyMuPDF, pdfplumber, or PyPDF2), and machine learning techniques for semantic field recognition and structured data extraction.

The core functionality must support:

Field Extraction & Analysis: Upon upload of a completed form, the application must use OCR, visual layout analysis, and NLP-driven entity recognition to extract both labeled and inferred field data with high accuracy. All extracted values should be normalized into a unified schema for downstream processing.

Template-Agnostic Form Filling: The user then uploads a second PDF, a blank or partially completed template form that may differ in structure or layout from the original. The system must intelligently map and propagate values from the original form to this new document using semantic similarity (e.g., cosine similarity on embedded field names, BERT-based field name matching, or fuzzy string-matching heuristics). It must robustly handle field remapping, layout discrepancies, and varying form structures.

Output Formats & Data Export: Render a real-time, user-visible preview of the completed PDF document within the UI. Enable export in three formats:
  1. A downloadable filled-in PDF document.
  2. A structured and semantically accurate JSON representation of the data.
  3. A CSV file conforming to normalized tabular output specifications.

Architecture & Stack Requirements:
  Backend: Python with FastAPI or Flask.
  Frontend: React, or Streamlit if rapid prototyping is preferred.
  ML/AI: Integrate document layout models (e.g., LayoutLMv3), OCR engines (e.g., Tesseract or EasyOCR), and optional fine-tuned transformer models for field matching.
  Data Persistence: Optional use of a document database (e.g., MongoDB) for session persistence or audit logs.
  Include robust error handling, field confidence scoring, and preview customization.

Deliverables:
  Fully functional application with modular, maintainable code.
  Inline documentation for all components.
  README with installation, usage, and architecture overview.
  Exportable build (e.g., Dockerized container or deployment instructions).

Constraints:
  Must support varying form formats without relying on static template matching.
  Ensure data privacy and sandboxed file handling.
  Prioritize high field fidelity, semantic consistency, and UI responsiveness.
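
As a rough illustration of the field-mapping requirement, the sketch below matches template field names to extracted source fields with a fuzzy string heuristic and reports a confidence score for each match. It is only a minimal stand-in under stated assumptions: it uses the standard library's difflib instead of the embedding or BERT-based matching the specification mentions, and the names match_fields and normalize, the 0.6 threshold, and the sample field names are hypothetical.

```python
from __future__ import annotations

import re
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, split on non-alphanumerics, and sort tokens so word order is ignored."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    return " ".join(sorted(tokens))


def match_fields(source: dict[str, str], template_fields: list[str],
                 threshold: float = 0.6) -> dict[str, dict]:
    """Map each template field to the most similar source field (hypothetical helper).

    Fields whose best score falls below `threshold` are left unfilled (value None).
    """
    result = {}
    for target in template_fields:
        best_name, best_score = None, 0.0
        for candidate in source:
            score = SequenceMatcher(None, normalize(target), normalize(candidate)).ratio()
            if score > best_score:
                best_name, best_score = candidate, score
        matched = best_name is not None and best_score >= threshold
        result[target] = {
            "value": source[best_name] if matched else None,
            "source_field": best_name if matched else None,
            "confidence": round(best_score, 2),  # similarity ratio in [0, 1]
        }
    return result


# Assumed sample data: values extracted from the completed source form.
extracted = {"Invoice Number": "INV-001", "Total Amount": "150.00", "Customer Name": "Ada"}
mapping = match_fields(extracted, ["invoice_no", "amount_total", "Signature"])
for field, info in mapping.items():
    print(field, info)
```

Sorting the tokens before comparison lets "amount_total" match "Total Amount" despite the different word order. A real system would likely replace the difflib ratio with cosine similarity over embedded field names while keeping the same threshold-and-confidence structure.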
That almost works, except that step 2 (uploading the template) does not activate after a source document is uploaded, and the final output neither exports the PDF, JSON, or CSV with the auto-filled fields I require nor downloads them to the user's computer.
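
For the JSON and CSV exports mentioned in this follow-up, one possible shape of the serialization step is sketched below. It assumes the filled fields are available as a flat dictionary. The helper names to_json and to_csv, the two-column CSV layout, and the sample data are hypothetical. It does not cover the PDF export or the step 2 activation problem, whose cause is not stated here.

```python
from __future__ import annotations

import csv
import io
import json


def to_json(fields: dict[str, str]) -> str:
    """Serialize the filled fields as pretty-printed JSON text."""
    return json.dumps(fields, indent=2, ensure_ascii=False)


def to_csv(fields: dict[str, str]) -> str:
    """Serialize the filled fields as a two-column CSV (assumed layout: field, value)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["field", "value"])
    for name, value in fields.items():
        writer.writerow([name, value])  # csv handles quoting of commas and quotes
    return buffer.getvalue()


# Assumed sample data standing in for the auto-filled template fields.
filled = {"invoice_no": "INV-001", "customer": "Smith, Ada", "amount_total": "150.00"}
json_text = to_json(filled)
csv_text = to_csv(filled)
print(json_text)
print(csv_text)
```

Both helpers return plain strings, so a backend could send them as file downloads, for example with a Content-Disposition: attachment response header, and a Streamlit front end could pass them to a download button.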