import base64
import io
import os
import pandas as pd
from docx import Document
from io import BytesIO, StringIO
import dash # Version 3.0.3
import dash_bootstrap_components as dbc # Version 2.0.2
from dash import html, dcc, Input, Output, State, callback_context, ALL, no_update
from dash.exceptions import PreventUpdate
import google.generativeai as genai
from docx.shared import Pt
from docx.enum.style import WD_STYLE_TYPE
from PyPDF2 import PdfReader
import logging
import uuid
import xlsxwriter # Needed for Excel export engine
import threading # For multi-threading
import time # For progress indicator

# --- Logging Configuration ---
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

# --- Initialize Dash app ---
# dash==3.0.3
# dash-bootstrap-components==2.0.2
app = dash.Dash(__name__,
                external_stylesheets=[dbc.themes.BOOTSTRAP],
                suppress_callback_exceptions=True,
                meta_tags=[{"name": "viewport", "content": "width=device-width, initial-scale=1"}])
server = app.server # Expose server for deployment

# --- Configure Gemini AI ---
# IMPORTANT: Set the GEMINI_API_KEY environment variable.
try:
    # Gemini is a hosted cloud API, so no local GPU/CUDA setup applies here.
    # Configuration happens through the GEMINI_API_KEY environment variable.
    api_key = os.environ.get("GEMINI_API_KEY")
    if not api_key:
        logging.warning("GEMINI_API_KEY environment variable not found. AI features will be disabled.")
        model = None
    else:
        genai.configure(api_key=api_key)
        # 'gemini-1.5-pro-latest' or a similar advanced model is recommended;
        # this app uses the user-specified model: gemini-2.5-pro-preview-03-25
        model = genai.GenerativeModel('gemini-2.5-pro-preview-03-25')
        logging.info("Gemini AI configured successfully using 'gemini-2.5-pro-preview-03-25'.")
except Exception as e:
    logging.error(f"Error configuring Gemini AI: {e}", exc_info=True)
    model = None

# --- Global Variables ---
# dcc.Store would be a more robust way to manage state in production; for simplicity
# at the current scope, per-session state lives in module-level dicts guarded by a lock.
# Lock for thread-safe access to the shared global dictionaries below.
data_lock = threading.Lock()

# {session_id: {filename: content_text}} - Store uploaded files per session
uploaded_files = {}

# {session_id: document_content} - Stores the *results* of generation/review steps per session
shredded_document = {}
pink_review_document = {}
red_review_document = {}
gold_review_document = {}
loe_document = {}
virtual_board_document = {}

# {session_id: document_content} - Stores the *generated* proposal drafts per session
pink_document = {}
red_document = {}
gold_document = {}

# {session_id: content_text} - Store uploaded content specifically for review inputs per session
uploaded_pink_content = {}
uploaded_red_content = {}
uploaded_gold_content = {}

# {session_id: {'doc': content, 'type': doc_type, 'format': format}} - Store the currently displayed document, its type, and format for download/chat per session
current_display_document = {}

# --- Document Types ---
document_types = {
    "Shred": "Generate a requirements spreadsheet from the PWS/Source Docs, identifying action words (shall, will, perform, etc.) by section.",
    "Pink": "Create a compliant and compelling Pink Team proposal draft based on the Shredded requirements.",
    "Pink Review": "Evaluate a Pink Team draft against Shredded requirements. Output findings (compliance, gaps, recommendations) in a spreadsheet.",
    "Red": "Create a Red Team proposal draft, addressing feedback from the Pink Review and improving compliance and persuasiveness.",
    "Red Review": "Evaluate a Red Team draft against Shredded requirements and Pink Review findings. Output findings in a spreadsheet.",
    "Gold": "Create a Gold Team proposal draft, addressing feedback from the Red Review for final compliance and polish.",
    "Gold Review": "Perform a final compliance review of the Gold Team draft against Shredded requirements and Red Review findings. Output findings.",
    "Virtual Board": "Simulate a source selection board evaluation of the final proposal against PWS/Shred requirements and evaluation criteria (Sec L/M). Output evaluation.",
    "LOE": "Generate a Level of Effort (LOE) estimate spreadsheet based on the Shredded requirements."
}

# --- Layout Definition ---
app.layout = dbc.Container(fluid=True, className="dbc", children=[
    dcc.Store(id='session-id', storage_type='session'), # Store for unique session ID
    # Title Row
    dbc.Row(
        dbc.Col(html.H1("Proposal AI Assistant", className="text-center my-4"), width=12)
    ),

    # Progress Indicator Row
    dbc.Row(
        dbc.Col(
            dcc.Loading(
                id="loading-indicator",
                type="dots", # Changed to dots
                children=[html.Div(id="loading-output", style={'height': '10px'})], # Simplified children
                overlay_style={"visibility":"hidden", "opacity": 0}, # Hide default overlay
                style={'visibility':'hidden'}, # Initially hidden
                parent_style={'minHeight': '30px'}, # Ensure space is allocated
                fullscreen=False,
            ),
            width=12,
            className="text-center mb-3"
        )
    ),

    # Main Content Row
    dbc.Row([
        # Left Column (Nav/Upload) - lg=4 (approx 33%)
        dbc.Col(
            dbc.Card(
                dbc.CardBody([
                    html.H4("1. Upload Source Documents", className="card-title"),
                    dcc.Upload(
                        id='upload-document',
                        children=html.Div(['Drag and Drop or ', html.A('Select PWS/Source Files')]),
                        style={ # Basic styling, colors/backgrounds handled by CSS
                            'width': '100%', 'height': '60px', 'lineHeight': '60px',
                            'borderWidth': '1px', 'borderStyle': 'dashed', 'borderRadius': '5px',
                            'textAlign': 'center', 'margin': '10px 0'
                        },
                        multiple=True
                    ),
                    dbc.Card( # Inner card for file list
                        dbc.CardBody(
                           html.Div(id='file-list', style={'maxHeight': '150px', 'overflowY': 'auto', 'fontSize': '0.9em'})
                        ), className="mb-3" # Removed inline style
                    ),
                    html.Hr(),
                    html.H4("2. Select Action", className="card-title mt-3"),
                    dbc.Card( # Inner card for buttons
                        dbc.CardBody([
                            *[dbc.Button(
                                doc_type,
                                id={'type': 'action-button', 'index': doc_type},
                                color="primary", # Use bootstrap classes
                                className="mb-2 w-100 d-block",
                                style={'textAlign': 'left', 'whiteSpace': 'normal', 'height': 'auto', 'wordWrap': 'break-word'} # Style for word wrap
                              ) for doc_type in document_types.keys()]
                         ])
                    )
                ]),
                # color="light", # Let CSS handle background
                className="h-100 left-nav-card", # Add custom class for CSS targeting
            ),
            width=12, lg=4, # Full width on small, 4/12 on large
            className="mb-3 mb-lg-0",
            style={'paddingRight': '15px'} # Add padding between columns
        ),

        # Right Column (Status/Preview/Controls/Chat) - lg=8 (approx 67%)
        dbc.Col(
            dbc.Card(
                dbc.CardBody([
                    dbc.Alert(id='status-bar', children="Upload source documents and select an action.", color="info"),
                    dbc.Card(id='review-controls-card', children=[dbc.CardBody(id='review-controls')], className="mb-3", style={'display': 'none'}), # Initially hidden review controls
                    dbc.Card( # Card for preview
                        dbc.CardBody([
                            html.H5("Document Preview / Output", className="card-title"),
                             dcc.Loading(
                                 id="loading-preview",
                                 type="circle",
                                 children=[html.Div(id='document-preview', style={'whiteSpace': 'pre-wrap', 'wordWrap': 'break-word', 'maxHeight': '400px', 'overflowY': 'auto', 'border': '1px solid #ccc', 'padding': '10px', 'borderRadius': '5px'})] # Added wordWrap
                            )
                        ]), className="mb-3"
                    ),
                    dbc.Button("Download Output", id="btn-download", color="success", className="mt-3 me-2", style={'display': 'none'}), # Initially hidden download
                    dcc.Download(id="download-document"),
                    html.Hr(),
                    dbc.Card( # Card for chat
                         dbc.CardBody([
                            html.H5("Refine Output (Chat)", className="card-title"),
                            dcc.Loading(
                                id="chat-loading",
                                type="circle",
                                children=[
                                    dbc.Textarea(id="chat-input", placeholder="Enter instructions to refine the document shown above...", className="mb-2", style={'whiteSpace': 'normal', 'wordWrap': 'break-word'}), # Ensure word wrap
                                    dbc.ButtonGroup([
                                        dbc.Button("Send Chat", id="btn-send-chat", color="secondary"),
                                        dbc.Button("Clear Chat", id="btn-clear-chat", color="tertiary")
                                    ], className="mb-3"),
                                    html.Div(id="chat-output", style={'whiteSpace': 'pre-wrap', 'wordWrap': 'break-word', 'marginTop': '10px', 'border': '1px solid #eee', 'padding': '10px', 'borderRadius': '5px', 'minHeight': '50px'}) # Added wordWrap
                                ]
                            )
                         ]), className="mb-3"
                     )
                ]),
                # color="white", # Let CSS handle background
                className="h-100 right-nav-card", # Add custom class for CSS targeting
            ),
            width=12, lg=8, # Full width on small, 8/12 on large
            style={'paddingLeft': '15px'} # Add padding between columns
        )
    ])
], style={'padding': '0 15px'}) # Add padding around the container


# --- Helper Functions ---

def get_session_id(session_id_value=None):
    """Gets the current session ID or generates a new one."""
    if session_id_value:
        return session_id_value
    # Fallback for initial load or if session ID is missing
    new_id = str(uuid.uuid4())
    logging.info(f"Generated new session ID: {new_id}")
    return new_id

def parse_generated_content(content_text):
    """Attempts to parse AI-generated content into a DataFrame if it looks like a table."""
    try:
        # Simple check: does it contain multiple lines and pipe characters?
        if content_text and '\n' in content_text and '|' in content_text:
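            # Try parsing as a Markdown-like table; keep only rows that contain '|'
            # so prose before or after the table cannot be mistaken for the header
            lines = [line.strip() for line in content_text.strip().split('\n') if '|' in line]
            # Remove separator lines like |---|---| or |:---|:---:| (alignment colons included)
            lines = [line for line in lines if not all(c in '-|: ' for c in line)]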
            if len(lines) > 1:
                # Use the first line as header, split by '|'
                header = [h.strip() for h in lines[0].strip('|').split('|')]
                data_rows = []
                for line in lines[1:]:
                    values = [v.strip() for v in line.strip('|').split('|')]
                    if len(values) == len(header): # Ensure matching column count
                        data_rows.append(values)
                    else:
                        logging.warning(f"Skipping row due to mismatched columns: {line}")

                if data_rows:
                    df = pd.DataFrame(data_rows, columns=header)
                    logging.info("Successfully parsed generated content as DataFrame.")
                    return df
    except Exception as e:
        logging.warning(f"Could not parse content into DataFrame: {e}. Treating as plain text.")
    # If parsing fails or it doesn't look like a table, return None
    logging.info("Content does not appear to be a table or parsing failed. Treating as plain text.")
    return None

def process_document(contents, filename):
    """Processes uploaded file content (PDF or DOCX). Returns a (text, error_message) tuple; error_message is None on success."""
    if contents is None:
        logging.warning(f"process_document called with None contents for {filename}")
        return None, f"Error: No content provided for {filename}."

    try:
        # Dash delivers uploads as "data:<mime>;base64,<payload>"; split only on the first comma
        content_type, content_string = contents.split(',', 1)
        decoded = base64.b64decode(content_string)
        logging.info(f"Processing file: {filename}")
        text = None
        error_message = None

        if filename.lower().endswith('.docx'):
            doc = Document(io.BytesIO(decoded))
            text = "\n".join([para.text for para in doc.paragraphs if para.text.strip()])
            logging.info(f"Successfully processed DOCX: {filename}")
        elif filename.lower().endswith('.pdf'):
            pdf = PdfReader(io.BytesIO(decoded))
            extracted_pages = []
            for i, page in enumerate(pdf.pages):
                try:
                    page_text = page.extract_text()
                    if page_text:
                        extracted_pages.append(page_text)
                except Exception as page_e:
                    logging.warning(f"Could not extract text from page {i+1} of {filename}: {page_e}")
            text = "\n\n".join(extracted_pages)
            if not text:
                logging.warning(f"No text extracted from PDF: {filename}. It might be image-based or corrupted.")
                error_message = f"Error: No text could be extracted from PDF {filename}. It might be image-based or require OCR."
            else:
                logging.info(f"Successfully processed PDF: {filename}")
        else:
            logging.warning(f"Unsupported file format: {filename}")
            error_message = f"Unsupported file format: {filename}. Please upload PDF or DOCX."

        return text, error_message
    except Exception as e:
        logging.error(f"Error processing document {filename}: {e}", exc_info=True)
        return None, f"Error processing file {filename}: {str(e)}"

def get_combined_uploaded_text(session_id, file_dict):
    """Combines text content of files in the provided dictionary for a session."""
    with data_lock:
        session_files = file_dict.get(session_id, {})
        if not session_files:
            return ""
        # Combine content, adding filenames for context if multiple files
        if len(session_files) > 1:
            return "\n\n--- FILE BREAK ---\n\n".join(
                f"**File: {fname}**\n\n{content}" for fname, content in session_files.items()
            )
        else:
            return next(iter(session_files.values()), "")


def generate_ai_document(session_id, doc_type, input_docs, context_docs=None):
    """Generates document using Gemini AI. Returns generated content and format ('text' or 'dataframe')."""
    if not model:
        logging.error(f"[{session_id}] Gemini AI model not initialized.")
        return "Error: AI Model not configured. Please check API Key.", 'text'
    if not input_docs or not any(doc.strip() for doc in input_docs if doc):
        logging.warning(f"[{session_id}] generate_ai_document called for {doc_type} with no valid input documents.")
        return f"Error: Missing required input document(s) for {doc_type} generation.", 'text'

    combined_input = "\n\n---\n\n".join(filter(None, input_docs))
    combined_context = "\n\n---\n\n".join(filter(None, context_docs)) if context_docs else ""

    # Define expected output format based on doc_type
    is_spreadsheet_type = doc_type in ["Shred", "Pink Review", "Red Review", "Gold Review", "LOE", "Virtual Board"]
    output_format_instruction = """**Output Format:** Structure the output as a clear, parseable Markdown table. Use '|' as the column delimiter. Define meaningful column headers relevant to the task (e.g., PWS_Section, Requirement, Action_Verb for Shred; Section, Requirement, Compliance_Status, Finding, Recommendation for Reviews; Section, Task, Estimated_Hours, Resource_Type for LOE). Ensure each row corresponds to a distinct item (e.g., requirement, finding, task).""" if is_spreadsheet_type else """**Output Format:** Write professional, compelling proposal prose. Use clear paragraphs and standard formatting. Address all requirements logically. Avoid tables unless explicitly part of the proposal structure."""

    prompt = f"""**Objective:** Generate the '{doc_type}' document.
**Your Role:** Act as an expert proposal writer/analyst specialized in government contracting.
**Core Instructions:**
1.  **Adhere Strictly to the Task:** Generate *only* the content for the '{doc_type}'. Do not add introductions, summaries, explanations, or conversational filler unless it's part of the requested document format itself (e.g., an executive summary within a proposal draft).
2.  **Follow Format Guidelines:** {output_format_instruction}
3.  **Content Requirements:**
    *   **Shred:** Identify requirements (explicit and implied), action verbs (shall, will, must, provide, perform, etc.), and PWS section references.
    *   **Proposal Sections (Pink, Red, Gold):** Write compliant and compelling content. Directly address requirements from the Context Document(s). Detail the 'how' (approach, methodology, tools). Incorporate win themes, strengths, and discriminators. Substantiate claims. Use active voice ("Our team will..."). Ensure compliance with evaluation criteria (e.g., Section L/M). Clearly map responses back to PWS requirements.
    *   **Reviews (Pink, Red, Gold):** Evaluate the submitted draft against the requirements (Shred/PWS) and previous review findings (if applicable). Identify compliance issues, gaps, weaknesses, and areas for improvement. Provide actionable recommendations. Be specific and reference relevant sections.
    *   **LOE:** Estimate the Level of Effort (hours, resource types) required to fulfill each major task or requirement identified in the Shred/PWS. Justify estimates briefly if necessary.
    *   **Virtual Board:** Simulate a source selection evaluation. Assess the final proposal against the PWS/Shred and evaluation criteria (Sec L/M). Assign strengths, weaknesses, deficiencies, risks. Provide a summary evaluation.
4.  **Utilize Provided Documents:**
    *   **Context Document(s):** These provide the baseline or reference material (e.g., Shredded Requirements, PWS Section L/M, Previous Review Findings). Refer to them diligently.
    *   **Primary Input Document(s):** This is the main subject of the task (e.g., the PWS text to be Shredded, the Pink draft to be Reviewed, the Red Review findings to incorporate into the Gold draft). Analyze and process this document according to the task.
**Provided Documents:**
**Context Document(s):**
```text
{combined_context if combined_context else "N/A"}
```
**Primary Input Document(s):**
```text
{combined_input}