levalencia committed
Commit 1dc37e8 · 1 Parent(s): cbcf1e4

Enhance Dockerfile and Streamlit app for EasyOCR directory management

- Updated Dockerfile to create additional temporary directories with proper permissions for EasyOCR.
- Improved Streamlit app to set environment variables for EasyOCR and create necessary directories, enhancing error handling and logging for directory creation.
- Documented changes in TROUBLESHOOTING.md to address potential permission errors and environment variable configurations.

Files changed (3)
  1. Dockerfile +5 -2
  2. TROUBLESHOOTING.md +28 -1
  3. src/streamlit_app.py +15 -0
Dockerfile CHANGED
@@ -10,10 +10,13 @@ RUN apt-get update && apt-get install -y \
     && rm -rf /var/lib/apt/lists/*
 
 # Create necessary directories with proper permissions
-RUN mkdir -p /app/.streamlit /tmp/docling_temp /tmp/easyocr_models && \
+RUN mkdir -p /app/.streamlit /tmp/docling_temp /tmp/easyocr_models /tmp/cache /tmp/config /tmp/data && \
     chmod 755 /app/.streamlit && \
     chmod 777 /tmp/docling_temp && \
-    chmod 777 /tmp/easyocr_models
+    chmod 777 /tmp/easyocr_models && \
+    chmod 777 /tmp/cache && \
+    chmod 777 /tmp/config && \
+    chmod 777 /tmp/data
 
 COPY requirements.txt ./
 COPY src/ ./src/
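
One way to confirm that the `chmod` calls in this hunk took effect inside the running container is to compare each directory's permission bits with the expected mode. The following is a minimal sketch, not part of the commit: the paths and modes are copied from the `RUN` instruction, while `dir_mode` and `mode_mismatches` are hypothetical helper names introduced here for illustration.

```python
import os
import stat

# Directories and modes copied from the RUN instruction in the Dockerfile diff.
EXPECTED_MODES = {
    "/app/.streamlit": 0o755,
    "/tmp/docling_temp": 0o777,
    "/tmp/easyocr_models": 0o777,
    "/tmp/cache": 0o777,
    "/tmp/config": 0o777,
    "/tmp/data": 0o777,
}


def dir_mode(path):
    """Return only the permission bits of `path`, for example 0o777."""
    return stat.S_IMODE(os.stat(path).st_mode)


def mode_mismatches(expected):
    """Hypothetical helper: list (path, problem) pairs for directories
    that are missing or whose mode differs from the expected one."""
    problems = []
    for path, mode in expected.items():
        if not os.path.isdir(path):
            problems.append((path, "missing"))
        elif dir_mode(path) != mode:
            problems.append((path, oct(dir_mode(path))))
    return problems
```

Calling `mode_mismatches(EXPECTED_MODES)` at startup and logging the result would surface a missing or restrictive directory before EasyOCR tries to write to it.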
TROUBLESHOOTING.md CHANGED
@@ -68,4 +68,31 @@ The app creates these directories:
 The Dockerfile has been updated to:
 - Create necessary directories with proper permissions
 - Copy Streamlit configuration files
-- Set up proper environment variables
+- Set up proper environment variables
+
+### EasyOCR Permission Errors
+
+If you encounter EasyOCR permission errors like:
+```
+PermissionError: [Errno 13] Permission denied: '/.EasyOCR'
+```
+
+The app now handles these by:
+1. Setting `EASYOCR_MODULE_PATH` to a writable directory
+2. Setting `HOME`, `USERPROFILE`, and XDG directories to temp locations
+3. Creating all necessary directories with proper permissions
+4. Using fallback directories if the primary ones fail
+
+### Environment Variables
+
+The app automatically sets these environment variables:
+- `STREAMLIT_SERVER_FILE_WATCHER_TYPE=none`
+- `STREAMLIT_SERVER_HEADLESS=true`
+- `STREAMLIT_BROWSER_GATHER_USAGE_STATS=false`
+- `STREAMLIT_SERVER_ENABLE_CORS=false`
+- `STREAMLIT_SERVER_ENABLE_XSRF_PROTECTION=false`
+- `EASYOCR_MODULE_PATH=/tmp/easyocr_models` (or fallback)
+- `HOME=/tmp/docling_temp` (or fallback)
+- `XDG_CACHE_HOME=/tmp/cache` (or fallback)
+- `XDG_CONFIG_HOME=/tmp/config` (or fallback)
+- `XDG_DATA_HOME=/tmp/data` (or fallback)
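
The "(or fallback)" behaviour described in this hunk can be sketched as trying candidate directories in order and keeping the first one that can be created and written to. This is an illustration under stated assumptions, not the app's actual code: `first_usable_dir` is a hypothetical name, and the candidate list in the usage note is an example rather than the order the app uses.

```python
import os
import tempfile


def first_usable_dir(candidates):
    """Hypothetical helper: return the first candidate directory that can
    be created (if needed) and written to; raise if none qualifies."""
    for path in candidates:
        try:
            os.makedirs(path, exist_ok=True)
            # Probe with a real write; the temporary file is removed on close.
            with tempfile.TemporaryFile(dir=path):
                pass
            return path
        except OSError:
            # PermissionError, NotADirectoryError, etc.: try the next one.
            continue
    raise RuntimeError(f"No usable directory among: {candidates}")
```

For example, `first_usable_dir(["/tmp/easyocr_models", os.path.join(os.getcwd(), "easyocr_models")])` would prefer the directory created by the Dockerfile and fall back to the working directory only if the first cannot be written.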
src/streamlit_app.py CHANGED
@@ -82,6 +82,21 @@ except Exception as e:
     os.makedirs(os.environ['EASYOCR_MODULE_PATH'], exist_ok=True)
     logging.warning(f"Using current directory for EasyOCR models: {os.environ['EASYOCR_MODULE_PATH']}")
 
+# Additional EasyOCR environment variables to prevent root directory access
+os.environ['HOME'] = TEMP_DIR  # Set HOME to temp directory
+os.environ['USERPROFILE'] = TEMP_DIR  # For Windows compatibility
+os.environ['XDG_CACHE_HOME'] = os.path.join(TEMP_DIR, 'cache')
+os.environ['XDG_CONFIG_HOME'] = os.path.join(TEMP_DIR, 'config')
+os.environ['XDG_DATA_HOME'] = os.path.join(TEMP_DIR, 'data')
+
+# Create additional directories that EasyOCR might need
+for env_var in ['XDG_CACHE_HOME', 'XDG_CONFIG_HOME', 'XDG_DATA_HOME']:
+    try:
+        os.makedirs(os.environ[env_var], exist_ok=True)
+        logging.info(f"Created directory for {env_var}: {os.environ[env_var]}")
+    except Exception as e:
+        logging.warning(f"Could not create directory for {env_var}: {e}")
+
 # Log startup information
 logging.info("=" * 50)
 logging.info("Docling Streamlit App Starting")
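
The permission error quoted in TROUBLESHOOTING.md names `/.EasyOCR`, which is what `~/.EasyOCR` expands to when the home directory is `/`. The sketch below shows how the `HOME` assignment in this hunk changes that expansion and where the XDG paths end up. It assumes POSIX path handling and a `TEMP_DIR` value of `/tmp/docling_temp`; the hunk does not show where `TEMP_DIR` is defined, and whether EasyOCR derives its default directory exactly this way is also an assumption. `expand_with_home` and `xdg_dirs` are hypothetical helpers.

```python
import os


def expand_with_home(path, home):
    """Expand '~' in `path` as it would expand with HOME and USERPROFILE
    set to `home`, then restore the previous environment."""
    saved = {key: os.environ.get(key) for key in ("HOME", "USERPROFILE")}
    os.environ["HOME"] = home
    os.environ["USERPROFILE"] = home
    try:
        return os.path.expanduser(path)
    finally:
        for key, value in saved.items():
            if value is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = value


def xdg_dirs(temp_dir):
    """Mirror the XDG assignments from the diff for a given TEMP_DIR."""
    return {
        "XDG_CACHE_HOME": os.path.join(temp_dir, "cache"),
        "XDG_CONFIG_HOME": os.path.join(temp_dir, "config"),
        "XDG_DATA_HOME": os.path.join(temp_dir, "data"),
    }
```

Under these assumptions, `expand_with_home("~/.EasyOCR", "/")` gives `/.EasyOCR`, while a home of `/tmp/docling_temp` gives `/tmp/docling_temp/.EasyOCR`. Note that `xdg_dirs("/tmp/docling_temp")` yields subdirectories such as `/tmp/docling_temp/cache`, which match the `/tmp/cache`-style paths listed in TROUBLESHOOTING.md and created in the Dockerfile only if `TEMP_DIR` is `/tmp`.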