amine_dubs committed on
Commit bf71f0f · 1 Parent(s): 0148163
Files changed (2):
  1. backend/main.py +15 -1
  2. project_report.md +486 -133
backend/main.py CHANGED

@@ -733,7 +733,21 @@ async def download_translated_document(request: Request):
 
     # Insert text into the PDF
     text_rect = fitz.Rect(50, 50, page.rect.width - 50, page.rect.height - 50)
-    page.insert_text(text_rect.tl, content, fontsize=11)
+
+    # Check if content contains Arabic text
+    has_arabic = any('\u0600' <= c <= '\u06FF' for c in content)
+
+    # Use write_text with an appropriate font for Arabic support
+    # and set right-to-left direction for Arabic text
+    page.write_text(
+        text_rect,
+        content,
+        fontsize=11,
+        font="helv" if not has_arabic else "noto",  # Use Noto font for Arabic
+        fontfile="NotoSansArabic-Regular.ttf" if has_arabic else None,
+        align="right" if has_arabic else "left",
+        direction="rtl" if has_arabic else "ltr"
+    )
 
     # Save to bytes
     pdf_bytes = BytesIO()
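For comparison, a minimal PyMuPDF sketch of the same idea (detect Arabic by Unicode range, then lay the text out right-to-left with an Arabic-capable font) using the `fitz.TextWriter` API; the font file name is an assumption and would need to ship with the app:

```python
import fitz  # PyMuPDF

def render_text_page(content: str) -> bytes:
    """Render content onto a single PDF page, switching to RTL layout for Arabic."""
    doc = fitz.open()                       # new, empty PDF
    page = doc.new_page()
    rect = fitz.Rect(50, 50, page.rect.width - 50, page.rect.height - 50)

    has_arabic = any('\u0600' <= c <= '\u06FF' for c in content)
    if has_arabic:
        # Assumed font file bundled with the app; any Arabic-capable TTF works.
        font = fitz.Font(fontfile="NotoSansArabic-Regular.ttf")
    else:
        font = fitz.Font("helv")

    writer = fitz.TextWriter(page.rect)
    writer.fill_textbox(rect, content, font=font, fontsize=11,
                        right_to_left=has_arabic)
    writer.write_text(page)
    return doc.tobytes()
```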
project_report.md CHANGED
@@ -1,12 +1,12 @@
1
  # AI-Powered Translation Web Application - Project Report
2
 
3
- **Date:** April 27, 2025
4
 
5
  **Author:** [Your Name/Team Name]
6
 
7
  ## 1. Introduction
8
 
9
- This report details the development process of an AI-powered web application designed for translating text and documents between various languages and Arabic (Modern Standard Arabic - Fusha). The application features a RESTful API backend built with FastAPI and a user-friendly frontend using HTML, CSS, and JavaScript. It is designed for deployment on Hugging Face Spaces using Docker.
10
 
11
  ## 2. Project Objectives
12
 
@@ -15,8 +15,12 @@ This report details the development process of an AI-powered web application des
15
  * Build a RESTful API backend using FastAPI.
16
  * Integrate Hugging Face LLMs/models for translation.
17
  * Create a user-friendly frontend for interacting with the API.
18
- * Support translation for direct text input and uploaded documents (PDF, DOCX, XLSX, PPTX, TXT).
19
  * Focus on high-quality Arabic translation, emphasizing meaning and eloquence (Balagha) over literal translation.
20
  * Document the development process comprehensively.
21
 
22
  ## 3. Backend Architecture and API Design
@@ -43,6 +47,7 @@ This report details the development process of an AI-powered web application des
43
  |-- project_report.md # This report
44
  |-- deployment_guide.md # Deployment instructions
45
  |-- project_details.txt # Original project requirements
 
46
  ```
47
 
48
  ### 3.3. API Endpoints
@@ -50,56 +55,171 @@ This report details the development process of an AI-powered web application des
50
  * **`GET /`**
51
  * **Description:** Serves the main HTML frontend page (`index.html`).
52
  * **Response:** `HTMLResponse` containing the rendered HTML.
53
  * **`POST /translate/text`**
54
  * **Description:** Translates a snippet of text provided in the request body.
55
- * **Request Body (Form Data):**
56
  * `text` (str): The text to translate.
57
- * `source_lang` (str): The source language code (e.g., 'en', 'fr', 'ar'). 'auto' might be supported depending on the model.
58
- * `target_lang` (str): The target language code (currently fixed to 'ar').
59
  * **Response (`JSONResponse`):**
60
  * `translated_text` (str): The translated text.
61
- * `source_lang` (str): The detected or provided source language.
62
- * **Error Responses:** `400 Bad Request` (e.g., missing text), `500 Internal Server Error` (translation failure), `501 Not Implemented` (if required libraries missing).
 
63
  * **`POST /translate/document`**
64
  * **Description:** Uploads a document, extracts its text, and translates it.
65
  * **Request Body (Multipart Form Data):**
66
- * `file` (UploadFile): The document file (.pdf, .docx, .xlsx, .pptx, .txt).
67
- * `source_lang` (str):
68
- * `target_lang` (str): The target language code (currently fixed to 'ar').
69
  * **Response (`JSONResponse`):**
70
  * `original_filename` (str): The name of the uploaded file.
71
- * `detected_source_lang` (str): The detected or provided source language.
72
- * `translated_text` (str): The translated text extracted from the document.
 
 
73
  * **Error Responses:** `400 Bad Request` (e.g., no file, unsupported file type), `500 Internal Server Error` (extraction or translation failure), `501 Not Implemented` (if required libraries missing).
74
 
75
  ### 3.4. Dependencies
76
 
77
  Key Python libraries used:
78
 
79
  * `fastapi`: Web framework.
80
- * `uvicorn`: ASGI server.
81
  * `python-multipart`: For handling form data (file uploads).
82
  * `jinja2`: For HTML templating.
83
- * `transformers`: For interacting with Hugging Face models.
84
- * `torch` (or `tensorflow`): Backend for `transformers`.
85
- * `sentencepiece`, `sacremoses`: Often required by translation models.
86
- * `PyMuPDF`: For PDF text extraction.
87
- * `python-docx`: For DOCX text extraction.
88
- * `openpyxl`: For XLSX text extraction.
89
- * `python-pptx`: For PPTX text extraction.
90
-
91
- *(List specific versions from requirements.txt if necessary)*
92
-
93
- ### 3.5. Data Flow
94
-
95
- 1. **User Interaction:** User accesses the web page served by `GET /`.
96
- 2. **Text Input:** User enters text, selects languages, and submits the text form.
97
- 3. **Text API Call:** Frontend JS sends a `POST` request to `/translate/text` with form data.
98
- 4. **Text Backend Processing:** FastAPI receives the request, calls the internal translation function (using the AI model via `transformers`), and returns the result.
99
- 5. **Document Upload:** User selects a document, selects languages, and submits the document form.
100
- 6. **Document API Call:** Frontend JS sends a `POST` request to `/translate/document` with multipart form data.
101
- 7. **Document Backend Processing:** FastAPI receives the file, saves it temporarily, extracts text using appropriate libraries (PyMuPDF, python-docx, etc.), calls the internal translation function, cleans up the temporary file, and returns the result.
102
- 8. **Response Handling:** Frontend JS receives the JSON response and updates the UI to display the translation or an error message.
 
103
 
104
  ## 4. Prompt Engineering and Translation Quality Control
105
 
@@ -107,22 +227,30 @@ Key Python libraries used:
107
 
108
  The core requirement is to translate *from* a source language *to* Arabic (MSA Fusha) with a focus on meaning and eloquence (Balagha), avoiding overly literal translations. These goals typically fall under the umbrella of prompt engineering when using general large language models.
109
 
110
- ### 4.2. Approach with Instruction-Tuned LLM (FLAN-T5)
111
 
112
- Due to persistent loading issues with the specialized `Helsinki-NLP` model and the desire to have more direct control over the translation process, the project switched to using `google/flan-t5-small`, an instruction-tuned language model.
113
 
114
- #### 4.2.1 Explicit Prompt Engineering
115
 
116
- The translation process uses carefully crafted prompts to guide the model toward high-quality Arabic translations. The `translate_text_internal` function in `main.py` constructs an enhanced prompt with the following components:
117
 
118
  ```python
119
- prompt = f"""Translate the following {source_lang_name} text into Modern Standard Arabic (Fusha).
120
  Focus on conveying the meaning elegantly using proper Balagha (Arabic eloquence).
121
  Adapt any cultural references or idioms appropriately rather than translating literally.
122
  Ensure the translation reads naturally to a native Arabic speaker.
123
 
124
  Text to translate:
125
- {text}"""
126
  ```
127
 
128
  This prompt explicitly instructs the model to:
@@ -131,7 +259,29 @@ This prompt explicitly instructs the model to:
131
  - Handle cultural references and idioms appropriately for an Arabic audience
132
  - Prioritize natural-sounding output over literal translation
133
 
134
- #### 4.2.2 Multi-Language Support
 
135
 
136
  The system supports multiple source languages through a language mapping system that converts ISO language codes to full language names for better model comprehension:
137
 
@@ -155,79 +305,175 @@ language_map = {
155
 
156
  Using full language names in the prompt (e.g., "Translate the following French text...") helps the model better understand the translation task compared to using language codes.
157
 
158
- #### 4.2.3 Generation Parameter Optimization
159
 
160
- To further improve translation quality, the model's generation parameters have been fine-tuned:
161
 
162
- ```python
163
- outputs = model.generate(
164
- **inputs,
165
- max_length=512, # Sufficient length for most translations
166
- num_beams=5, # Wider beam search for better quality
167
- length_penalty=1.0, # Slightly favor longer, more complete translations
168
- top_k=50, # Consider diverse word choices
169
- top_p=0.95, # Focus on high-probability tokens for coherence
170
- early_stopping=True
171
- )
172
- ```
173
 
174
- These parameters work together to encourage:
175
- - More natural-sounding translations through beam search
176
- - Better handling of nuanced expressions
177
- - Appropriate length for preserving meaning
178
- - Balance between creativity and accuracy
179
-
180
- ### 4.3. Testing and Refinement Process
181
-
182
- * **Prompt Iteration:** The core refinement process involves testing different prompt phrasings with various text samples across supported languages. Each iteration aims to improve the model's understanding of:
183
- - What constitutes eloquent Arabic (Balagha)
184
- - How to properly adapt culturally-specific references
185
- - When to prioritize meaning over literal translation
186
-
187
- * **Cultural Sensitivity Testing:** Sample texts containing culturally-specific references, idioms, and metaphors from each supported language are used to evaluate how well the model adapts these elements for an Arabic audience.
188
-
189
- * **Evaluation Metrics:**
190
- * *Human Evaluation:* Native Arabic speakers assess translations for:
191
- - Eloquence (Balagha): Does the translation use appropriately eloquent Arabic?
192
- - Cultural Adaptation: Are cultural references appropriately handled?
193
- - Naturalness: Does the text sound natural to native speakers?
194
- - Accuracy: Is the meaning preserved despite non-literal translation?
195
-
196
- * *Automated Metrics:* While useful as supplementary measures, metrics like BLEU are used with caution as they tend to favor more literal translations.
197
 
198
- * **Model Limitations:** The current implementation with FLAN-T5-small shows promise but has limitations:
199
- - It may struggle with very specialized technical content
200
- - Some cultural nuances from less common language pairs may be missed
201
- - Longer texts may lose coherence across paragraphs
202
-
203
- Future work may explore larger model variants if these limitations prove significant.
204
 
205
  ## 5. Frontend Design and User Experience
206
 
207
  ### 5.1. Design Choices
208
 
209
- * **Simplicity:** A clean, uncluttered interface with two main sections: one for text translation and one for document translation.
210
- * **Standard HTML Elements:** Uses standard forms, labels, text areas, select dropdowns, and buttons for familiarity.
211
- * **Clear Separation:** Distinct forms and result areas for text vs. document translation.
212
- * **Feedback:** Provides visual feedback during processing (disabling buttons, changing text) and displays results or errors clearly.
213
- * **Responsiveness (Basic):** Includes basic CSS media queries for better usability on smaller screens.
 
214
 
215
- ### 5.2. UI/UX Considerations
 
216
 
217
- * **Workflow:** Intuitive flow select languages, input text/upload file, click translate, view result.
218
- * **Language Selection:** Dropdowns for selecting source and target languages. Includes common languages and an option for Arabic as a source (for potential future reverse translation). 'Auto-Detect' is included but noted as not yet implemented.
219
- * **File Input:** Standard file input restricted to supported types (`accept` attribute).
220
- * **Error Handling:** Displays clear error messages in a dedicated area if API calls fail or validation issues occur.
221
- * **Result Display:** Uses `<pre><code>` for potentially long translated text, preserving formatting and allowing wrapping. Results for Arabic are displayed RTL. Document results include filename and detected source language.
222
 
223
- ### 5.3. Interactivity (JavaScript)
224
 
225
- * Handles form submissions asynchronously using `fetch`.
226
- * Prevents default form submission behavior.
227
- * Provides loading state feedback on buttons.
228
- * Parses JSON responses from the backend.
229
- * Updates the DOM to display translated text or error messages.
230
- * Clears previous results/errors before new submissions.
 
231
 
232
  ## 6. Deployment and Scalability
233
 
@@ -240,8 +486,6 @@ These parameters work together to encourage:
240
  * **Port Exposure:** Exposes port 8000 (used by `uvicorn`).
241
  * **Entrypoint:** Uses `uvicorn` to run the FastAPI application (`backend.main:app`), making it accessible on `0.0.0.0`.
242
 
243
- *(See `backend/Dockerfile` for the exact implementation)*
244
-
245
  ### 6.2. Hugging Face Spaces Deployment
246
 
247
  * **Method:** Uses the Docker Space SDK option.
@@ -249,44 +493,153 @@ These parameters work together to encourage:
249
  * **Repository:** The project code (including the `Dockerfile` and the `README.md` with HF metadata) needs to be pushed to a Hugging Face Hub repository (either model or space repo).
250
  * **Build Process:** Hugging Face Spaces automatically builds the Docker image from the `Dockerfile` in the repository and runs the container.
251
 
252
- *(See `deployment_guide.md` for detailed steps)*
 
253
 
254
- ### 6.3. Scalability Considerations
255
 
256
- * **Stateless API:** The API endpoints are designed to be stateless (apart from temporary file storage during upload processing), which aids horizontal scaling.
257
- * **Model Loading:** The translation model is intended to be loaded once on application startup (currently placeholder) rather than per-request, improving performance. However, large models consume significant memory.
258
- * **Hugging Face Spaces Resources:** Scalability on HF Spaces depends on the chosen hardware tier. Free tiers have limited resources (CPU, RAM). Larger models or high traffic may require upgrading to paid tiers.
259
- * **Async Processing:** FastAPI's asynchronous nature allows handling multiple requests concurrently, improving I/O bound performance. CPU-bound tasks like translation itself might still block the event loop if not handled carefully (e.g., running in a separate thread pool if necessary, though `transformers` pipelines often manage this).
260
- * **Database:** No database is currently used. If user accounts or saved translations were added, a database would be needed, adding another scaling dimension.
261
- * **Load Balancing:** For high availability and scaling beyond a single container, a load balancer and multiple container instances would be required (typically managed by orchestration platforms like Kubernetes, which is beyond the basic HF Spaces setup).
262
 
263
- ## 7. Challenges and Future Work
264
 
265
- ### 7.1. Challenges
266
 
267
- * **Model Selection:** Finding the optimal balance between translation quality (especially for Balagha), performance (speed/resource usage), and licensing.
268
- * **Prompt Engineering:** Iteratively refining the prompt to consistently achieve the desired non-literal, eloquent translation style across diverse inputs.
269
- * **Resource Constraints:** Large translation models require significant RAM and potentially GPU resources, which might be limiting on free deployment tiers.
270
- * **Document Parsing Robustness:** Handling variations and potential errors in different document formats and encodings.
271
- * **Language Detection:** Implementing reliable automatic source language detection if the 'auto' option is fully developed.
272
 
273
- ### 7.2. Future Work
274
 
275
- * **Implement Actual Translation:** Replace placeholder logic with a real Hugging Face `transformers` pipeline using a selected model.
276
- * **Implement Reverse Translation:** Add functionality and models to translate *from* Arabic *to* other languages.
277
- * **Improve Error Handling:** Provide more specific user feedback for different error types.
278
- * **Add User Accounts:** Allow users to save translation history.
279
- * **Implement Language Auto-Detection:** Integrate a library (e.g., `langdetect`, `fasttext`) for the 'auto' source language option.
280
- * **Enhance UI/UX:** Improve visual design, add loading indicators, potentially show translation progress for large documents.
281
- * **Optimize Performance:** Profile the application and optimize bottlenecks, potentially exploring model quantization or different model architectures if needed.
282
- * **Add More Document Types:** Support additional formats if required.
283
- * **Testing:** Implement unit and integration tests for backend logic.
284
 
285
- ## Project Log / Updates
286
 
287
- * **2025-04-28:** Updated project requirements to explicitly include the need for the translation model to respect cultural differences and nuances in its output.
288
- * **2025-04-28:** Switched translation model from `Helsinki-NLP/opus-mt-en-ar` to `google/flan-t5-small` due to persistent loading errors in the deployment environment and to enable direct prompt engineering for translation tasks.
289
 
290
- ## 8. Conclusion
 
 
 
 
 
 
291
 
292
- This project successfully lays the foundation for an AI-powered translation web service focusing on high-quality Arabic translation. The FastAPI backend provides a robust API, and the frontend offers a simple interface for text and document translation. Dockerization ensures portability and simplifies deployment to platforms like Hugging Face Spaces. Key next steps involve integrating a suitable translation model and refining the prompt engineering based on real-world testing.
 
1
  # AI-Powered Translation Web Application - Project Report
2
 
3
+ **Date:** May 2, 2025
4
 
5
  **Author:** [Your Name/Team Name]
6
 
7
  ## 1. Introduction
8
 
9
+ This report details the development process of an AI-powered web application called Tarjama, designed for translating text and documents between various languages and Arabic (Modern Standard Arabic - Fusha). The application features a RESTful API backend built with FastAPI and a user-friendly frontend using HTML, CSS, and JavaScript. It is designed for deployment on Hugging Face Spaces using Docker.
10
 
11
  ## 2. Project Objectives
12
 
 
15
  * Build a RESTful API backend using FastAPI.
16
  * Integrate Hugging Face LLMs/models for translation.
17
  * Create a user-friendly frontend for interacting with the API.
18
+ * Support translation for direct text input and uploaded documents (PDF, DOCX, TXT).
19
  * Focus on high-quality Arabic translation, emphasizing meaning and eloquence (Balagha) over literal translation.
20
+ * Implement a robust fallback mechanism to ensure translation service availability.
21
+ * Support language switching and reverse translation capability.
22
+ * Enable downloading of translated documents in various formats.
23
+ * Include quick phrase features for common expressions.
24
  * Document the development process comprehensively.
25
 
26
  ## 3. Backend Architecture and API Design
 
47
  |-- project_report.md # This report
48
  |-- deployment_guide.md # Deployment instructions
49
  |-- project_details.txt # Original project requirements
50
+ |-- README.md # For Hugging Face Space configuration
51
  ```
52
 
53
  ### 3.3. API Endpoints
 
55
  * **`GET /`**
56
  * **Description:** Serves the main HTML frontend page (`index.html`).
57
  * **Response:** `HTMLResponse` containing the rendered HTML.
58
+ * **`GET /api/languages`**
59
+ * **Description:** Returns the list of supported languages.
60
+ * **Response:** `JSONResponse` with a mapping of language codes to language names.
61
  * **`POST /translate/text`**
62
  * **Description:** Translates a snippet of text provided in the request body.
63
+ * **Request Body:**
64
  * `text` (str): The text to translate.
65
+ * `source_lang` (str): The source language code (e.g., 'en', 'fr', 'ar'). 'auto' is supported for language detection.
66
+ * `target_lang` (str): The target language code (e.g., 'ar', 'en').
67
  * **Response (`JSONResponse`):**
68
  * `translated_text` (str): The translated text.
69
+ * `detected_source_lang` (str, optional): The detected source language if 'auto' was used.
70
+ * `success` (bool): Indicates if the translation was successful.
71
+ * **Error Responses:** `400 Bad Request` (e.g., missing text), `500 Internal Server Error` (translation failure).
72
  * **`POST /translate/document`**
73
  * **Description:** Uploads a document, extracts its text, and translates it.
74
  * **Request Body (Multipart Form Data):**
75
+ * `file` (UploadFile): The document file (.pdf, .docx, .txt).
76
+ * `source_lang` (str): Source language code or 'auto' for detection.
77
+ * `target_lang` (str): Target language code.
78
  * **Response (`JSONResponse`):**
79
  * `original_filename` (str): The name of the uploaded file.
80
+ * `original_text` (str): The extracted text from the document.
81
+ * `translated_text` (str): The translated text.
82
+ * `detected_source_lang` (str, optional): The detected source language if 'auto' was used.
83
+ * `success` (bool): Indicates if the translation was successful.
84
  * **Error Responses:** `400 Bad Request` (e.g., no file, unsupported file type), `500 Internal Server Error` (extraction or translation failure), `501 Not Implemented` (if required libraries missing).
85
+ * **`POST /download/translated-document`**
86
+ * **Description:** Creates a downloadable version of the translated document in various formats.
87
+ * **Request Body:**
88
+ * `content` (str): The translated text content.
89
+ * `filename` (str): The desired filename for the download.
90
+ * `original_type` (str): The original file's MIME type.
91
+ * **Response:** Binary file data with appropriate Content-Disposition header for download.
92
+ * **Error Responses:** `400 Bad Request` (missing parameters), `500 Internal Server Error` (document creation failure), `501 Not Implemented` (if required libraries missing).
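As a concrete illustration of this request/response contract, a minimal FastAPI handler could look like the sketch below; `build_docx` and `build_pdf` are hypothetical helper names standing in for the project's document-generation code:

```python
from io import BytesIO
from fastapi import FastAPI, HTTPException, Request
from fastapi.responses import StreamingResponse

app = FastAPI()

@app.post("/download/translated-document")
async def download_translated_document(request: Request):
    data = await request.json()
    content = data.get("content")
    filename = data.get("filename")
    if not content or not filename:
        raise HTTPException(status_code=400, detail="content and filename are required")

    if filename.endswith(".docx"):
        buffer = build_docx(content)   # hypothetical helper built on python-docx
        media_type = ("application/vnd.openxmlformats-officedocument."
                      "wordprocessingml.document")
    elif filename.endswith(".pdf"):
        buffer = build_pdf(content)    # hypothetical helper built on PyMuPDF
        media_type = "application/pdf"
    else:
        buffer = BytesIO(content.encode("utf-8"))
        media_type = "text/plain"

    headers = {"Content-Disposition": f'attachment; filename="{filename}"'}
    return StreamingResponse(buffer, media_type=media_type, headers=headers)
```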
93
 
94
  ### 3.4. Dependencies
95
 
96
  Key Python libraries used:
97
 
98
  * `fastapi`: Web framework.
99
+ * `uvicorn[standard]`: ASGI server.
100
  * `python-multipart`: For handling form data (file uploads).
101
  * `jinja2`: For HTML templating.
102
+ * `transformers[torch]`: For interacting with Hugging Face models.
103
+ * `torch`: Backend for `transformers`.
104
+ * `tensorflow`: Alternative backend for model acceleration.
105
+ * `googletrans`: Google Translate API wrapper (used in fallback mechanism).
106
+ * `PyMuPDF`: For PDF text extraction and creation.
107
+ * `python-docx`: For DOCX text extraction and creation.
108
+ * `langdetect`: For automatic language detection.
109
+ * `sacremoses`: For tokenization with MarianMT models.
110
+ * `sentencepiece`: For model tokenization.
111
+ * `accelerate`: For optimizing model performance.
112
+ * `requests`: For HTTP requests to external translation APIs.
113
+
114
+ ### 3.5. Translation Model Architecture
115
+
116
+ #### 3.5.1. Primary Translation Models
117
+
118
+ The application implements a multi-model approach using Helsinki-NLP's opus-mt models:
119
+
120
+ ```python
121
+ translation_models: Dict[str, Dict] = {
122
+ "en-ar": {
123
+ "model": None,
124
+ "tokenizer": None,
125
+ "translator": None,
126
+ "model_name": "Helsinki-NLP/opus-mt-en-ar",
127
+ },
128
+ "ar-en": {
129
+ "model": None,
130
+ "tokenizer": None,
131
+ "translator": None,
132
+ "model_name": "Helsinki-NLP/opus-mt-ar-en",
133
+ },
134
+ "en-fr": {
135
+ "model": None,
136
+ "tokenizer": None,
137
+ "translator": None,
138
+ "model_name": "Helsinki-NLP/opus-mt-en-fr",
139
+ },
140
+ // Additional language pairs...
141
+ }
142
+ ```
143
+
144
+ * **Dynamic Model Loading**: Models are loaded on-demand based on requested language pairs.
145
+ * **Memory Management**: The application intelligently manages model memory usage, ensuring that only necessary models are loaded.
146
+ * **Restart Resilience**: Includes functionality to detect and reinitialize models if they enter a bad state.
147
+
148
+ #### 3.5.2. Multi-Tier Fallback System
149
+
150
+ A robust multi-tier fallback system ensures translation service reliability:
151
+
152
+ 1. **Primary Models**: Helsinki-NLP opus-mt models for direct translation between language pairs.
153
+ 2. **Fallback System**:
154
+ * **Google Translate API**: First fallback using the googletrans library.
155
+ * **LibreTranslate API**: Second fallback with multiple server endpoints for redundancy.
156
+ * **MyMemory Translation API**: Third fallback for additional reliability.
157
+
158
+ This approach ensures high availability of translation services even if individual services experience issues.
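A simplified sketch of this fallback chain (the individual backend functions are hypothetical placeholders for the service wrappers described above):

```python
def translate_with_fallback(text: str, source: str, target: str) -> str:
    """Try each translation backend in order until one returns a usable result."""
    backends = [
        translate_with_local_model,     # Helsinki-NLP opus-mt via transformers
        translate_with_googletrans,     # googletrans wrapper
        translate_with_libretranslate,  # LibreTranslate endpoints
        translate_with_mymemory,        # MyMemory REST API
    ]
    last_error = None
    for backend in backends:
        try:
            result = backend(text, source, target)
            if result and result.strip():
                return result
        except Exception as exc:        # fall through to the next service
            last_error = exc
    raise RuntimeError(f"All translation backends failed: {last_error}")
```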
159
+
160
+ #### 3.5.3. Language Detection
161
+
162
+ Automatic language detection is implemented using:
163
+
164
+ 1. **Primary Detection**: Uses the `langdetect` library for accurate language identification.
165
+ 2. **Fallback Detection**: Custom character-based heuristics analyze Unicode character ranges to identify languages like Arabic, Chinese, Japanese, Russian, and Hebrew when the primary detection fails.
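A minimal sketch of this two-step detection (the default return value and the exact character-range checks are illustrative assumptions):

```python
from langdetect import detect

def detect_language(text: str) -> str:
    """Detect the source language, falling back to Unicode-range heuristics."""
    try:
        return detect(text)
    except Exception:
        pass
    # Coarse character-based fallback
    if any('\u0600' <= c <= '\u06FF' for c in text):
        return "ar"  # Arabic
    if any('\u0590' <= c <= '\u05FF' for c in text):
        return "he"  # Hebrew
    if any('\u0400' <= c <= '\u04FF' for c in text):
        return "ru"  # Cyrillic / Russian
    if any('\u3040' <= c <= '\u30FF' for c in text):
        return "ja"  # Japanese kana
    if any('\u4E00' <= c <= '\u9FFF' for c in text):
        return "zh"  # CJK ideographs / Chinese
    return "en"      # assumed default
```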
166
+
167
+ ### 3.6. Cultural Adaptation
168
+
169
+ The system implements post-processing for culturally sensitive translations:
170
+
171
+ ```python
172
+ def culturally_adapt_arabic(text: str) -> str:
173
+ """Apply post-processing rules to enhance Arabic translation with cultural sensitivity."""
174
+ # Replace Latin punctuation with Arabic ones
175
+ text = text.replace('?', '؟').replace(';', '؛').replace(',', '،')
176
+
177
+ # Remove common translation artifacts/prefixes
178
+ common_prefixes = [
179
+ "الترجمة:", "ترجمة:", "النص المترجم:",
180
+ "Translation:", "Arabic translation:"
181
+ ]
182
+ for prefix in common_prefixes:
183
+ if text.startswith(prefix):
184
+ text = text[len(prefix):].strip()
185
+
186
+ return text
187
+ ```
188
+
189
+ This function ensures:
190
+ - Proper Arabic punctuation replaces Latin equivalents
191
+ - Common translation artifacts and prefixes are removed
192
+ - The output follows Arabic writing conventions
193
+
194
+ ### 3.7. Document Processing
195
+
196
+ Text extraction from various file formats is handled through specialized libraries:
197
+
198
+ ```python
199
+ async def extract_text_from_file(file: UploadFile) -> str:
200
+ """Extracts text content from uploaded files without writing to disk."""
201
+ content = await file.read()
202
+ file_extension = os.path.splitext(file.filename)[1].lower()
203
+
204
+ if file_extension == '.txt':
205
+ # Handle text files with encoding detection
206
+ extracted_text = decode_with_multiple_encodings(content)
207
+ elif file_extension == '.docx':
208
+ # Extract text from Word documents
209
+ doc = docx.Document(BytesIO(content))
210
+ extracted_text = '\n'.join([para.text for para in doc.paragraphs])
211
+ elif file_extension == '.pdf':
212
+ # Extract text from PDF files
213
+ doc = fitz.open(stream=BytesIO(content), filetype="pdf")
214
+ extracted_text = "\n".join([page.get_text() for page in doc])
215
+ doc.close()
216
+ ```
217
+
218
+ Document generation for download is similarly handled through specialized functions for each format:
219
+
220
+ - **PDF**: Uses PyMuPDF (fitz) to create PDF files with the translated text
221
+ - **DOCX**: Uses python-docx to create Word documents with the translated text
222
+ - **TXT**: Simple text file creation with appropriate encoding
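A minimal sketch of such format-specific generators (simplified; the project's actual helpers may differ):

```python
from io import BytesIO
import docx   # python-docx
import fitz   # PyMuPDF

def build_docx(text: str) -> BytesIO:
    """Write the translated text into an in-memory .docx file."""
    document = docx.Document()
    for paragraph in text.split("\n"):
        document.add_paragraph(paragraph)
    buffer = BytesIO()
    document.save(buffer)
    buffer.seek(0)
    return buffer

def build_pdf(text: str) -> BytesIO:
    """Write the translated text into an in-memory PDF."""
    pdf = fitz.open()                 # new, empty document
    page = pdf.new_page()
    rect = fitz.Rect(50, 50, page.rect.width - 50, page.rect.height - 50)
    page.insert_textbox(rect, text, fontsize=11)
    buffer = BytesIO(pdf.tobytes())
    pdf.close()
    buffer.seek(0)
    return buffer
```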
223
 
224
  ## 4. Prompt Engineering and Translation Quality Control
225
 
 
227
 
228
  The core requirement is to translate *from* a source language *to* Arabic (MSA Fusha) with a focus on meaning and eloquence (Balagha), avoiding overly literal translations. These goals typically fall under the umbrella of prompt engineering when using general large language models.
229
 
230
+ ### 4.2. Translation Model Selection and Approach
231
 
232
+ While the Helsinki-NLP opus-mt models serve as the primary translation engine, prompt engineering was explored using the FLAN-T5 model:
233
 
234
+ * **Instruction Design**: Explicit instructions were crafted to guide the model toward eloquent Arabic (Balagha) translation rather than literal translation.
235
 
236
+ * **Cultural Adaptation Prompts**: The prompts include specific guidance for cultural adaptation, ensuring that idioms, cultural references, and contextual meanings are appropriately handled in the target language.
237
 
238
  ```python
239
+ def create_translation_prompt(text, source_lang, target_lang="Arabic"):
240
+ """Create a prompt that emphasizes eloquence and cultural adaptation."""
241
+ source_lang_name = LANGUAGE_MAP.get(source_lang, "Unknown")
242
+
243
+ prompt = f"""Translate the following {source_lang_name} text into Modern Standard Arabic (Fusha).
244
  Focus on conveying the meaning elegantly using proper Balagha (Arabic eloquence).
245
  Adapt any cultural references or idioms appropriately rather than translating literally.
246
  Ensure the translation reads naturally to a native Arabic speaker.
247
 
248
  Text to translate:
249
+ {text}
250
+
251
+ Arabic translation:"""
252
+
253
+ return prompt
254
  ```
255
 
256
  This prompt explicitly instructs the model to:
 
259
  - Handle cultural references and idioms appropriately for an Arabic audience
260
  - Prioritize natural-sounding output over literal translation
261
 
262
+ ### 4.3. Generation Parameter Optimization
263
+
264
+ To further improve translation quality, the model's generation parameters have been fine-tuned:
265
+
266
+ ```python
267
+ outputs = model.generate(
268
+ **inputs,
269
+ max_length=512, # Sufficient length for most translations
270
+ num_beams=5, # Wider beam search for better quality
271
+ length_penalty=1.0, # Slightly favor longer, more complete translations
272
+ top_k=50, # Consider diverse word choices
273
+ top_p=0.95, # Focus on high-probability tokens for coherence
274
+ early_stopping=True
275
+ )
276
+ ```
277
+
278
+ These parameters work together to encourage:
279
+ - More natural-sounding translations through beam search
280
+ - Better handling of nuanced expressions
281
+ - Appropriate length for preserving meaning
282
+ - Balance between creativity and accuracy
283
+
284
+ ### 4.4. Multi-Language Support
285
 
286
  The system supports multiple source languages through a language mapping system that converts ISO language codes to full language names for better model comprehension:
287
 
 
305
 
306
  Using full language names in the prompt (e.g., "Translate the following French text...") helps the model better understand the translation task compared to using language codes.
307
 
308
+ ### 4.5. Cultural Sensitivity Enhancement
309
 
310
+ While automated translations can be technically accurate, ensuring cultural sensitivity requires special attention. The prompt engineering approach implements several strategies:
311
 
312
+ 1. **Explicit Cultural Adaptation Instructions**: The prompts specifically instruct the model to adapt cultural references appropriately for the target audience.
 
313
 
314
+ 2. **Context-Aware Translation**: The instructions emphasize conveying meaning over literal translation, allowing the model to adjust idioms and expressions for cultural relevance.
 
315
 
316
+ 3. **Preservation of Intent**: By focusing on eloquence (Balagha), the model is guided to maintain the original text's tone, formality level, and communicative intent while adapting it linguistically.
 
317
 
318
  ## 5. Frontend Design and User Experience
319
 
320
  ### 5.1. Design Choices
321
 
322
+ * **Clean Interface**: Minimalist design with a focus on functionality and ease of use.
323
+ * **Tabbed Navigation**: Clear separation between text translation and document translation sections.
324
+ * **Responsive Design**: Adapts to different screen sizes using CSS media queries.
325
+ * **Material Design Influence**: Uses card-based UI components with subtle shadows and clear visual hierarchy.
326
+ * **Color Scheme**: Professional blue-based color palette with accent colors for interactive elements.
327
+ * **Accessibility**: Appropriate contrast ratios and labeled form elements.
328
+
329
+ ### 5.2. UI Components and Features
330
+
331
+ #### 5.2.1. Text Translation Interface
332
+
333
+ * **Language Controls**: Intuitive source and target language selectors with support for 12+ languages.
334
+ * **Language Swap Button**: Allows instant swapping of source and target languages with content reversal.
335
+ * **Character Count**: Real-time character counting with visual indicators when approaching limits.
336
+ * **Quick Phrases**: Two sets of pre-defined phrases for common translation needs:
337
+ * **Quick Phrases**: Common greetings and emergency phrases with auto-translate option.
338
+ * **Frequently Used Phrases**: Longer, more contextual expressions.
339
+ * **Copy Button**: One-click copying of translation results to clipboard.
340
+ * **Clear Button**: Quick removal of source text and translation results.
341
+ * **RTL Support**: Automatic right-to-left text direction for Arabic and Hebrew.
342
+
343
+ #### 5.2.2. Document Translation Interface
344
+
345
+ * **Drag-and-Drop Upload**: Intuitive file upload with highlighting on drag-over.
346
+ * **File Type Restrictions**: Clear indication of supported document formats.
347
+ * **Upload Notification**: Visual confirmation when a document is successfully uploaded.
348
+ * **Button State Management**: Translation button changes appearance when a file is ready to translate.
349
+ * **Side-by-Side Results**: Original and translated document content displayed in parallel panels.
350
+ * **Download Functionality**: Button to download the translated document in the original format.
351
+
352
+ #### 5.2.3. Notification System
353
+
354
+ * **Success Notifications**: Temporary toast notifications for successful operations.
355
+ * **Error Messages**: Clear error display with specific guidance on how to resolve issues.
356
+ * **Loading Indicators**: Spinner animations for translation processes with contextual messages.
357
+
358
+ ### 5.3. Frontend JavaScript Architecture
359
+
360
+ #### 5.3.1. Event-Driven Design
361
+
362
+ The frontend uses an event-driven architecture with clearly separated concerns:
363
+
364
+ ```javascript
365
+ // UI Element Selection
366
+ const textTabLink = document.querySelector('nav ul li a[href="#text-translation"]');
367
+ const textInput = document.getElementById('text-input');
368
+ const phraseButtons = document.querySelectorAll('.phrase-btn');
369
+ const swapLanguages = document.getElementById('swap-languages');
370
+
371
+ // Event Listeners
372
+ textTabLink.addEventListener('click', switchToTextTab);
373
+ textInput.addEventListener('input', updateCharacterCount);
374
+ phraseButtons.forEach(button => button.addEventListener('click', insertQuickPhrase));
375
+ swapLanguages.addEventListener('click', swapLanguagesHandler);
376
+
377
+ // Feature Implementations
378
+ function swapLanguagesHandler(e) {
379
+ // Language swap logic
380
+ const sourceValue = sourceLangText.value;
381
+ const targetValue = targetLangText.value;
382
+
383
+ // Don't swap if using auto-detect
384
+ if (sourceValue === 'auto') {
385
+ showNotification('Cannot swap when source language is set to auto-detect.');
386
+ return;
387
+ }
388
+
389
+ // Swap the values and text content
390
+ sourceLangText.value = targetValue;
391
+ targetLangText.value = sourceValue;
392
+
393
+ if (textOutput.textContent.trim() !== '') {
394
+ textInput.value = textOutput.textContent;
395
+ textTranslationForm.dispatchEvent(new Event('submit'));
396
+ }
397
+ }
398
+ ```
399
 
400
+ #### 5.3.2. API Interaction
401
+
402
+ All API calls use the Fetch API with proper error handling:
403
+
404
+ ```javascript
405
+ fetch('/translate/text', {
406
+ method: 'POST',
407
+ headers: { 'Content-Type': 'application/json' },
408
+ body: JSON.stringify({
409
+ text: text,
410
+ source_lang: sourceLang,
411
+ target_lang: targetLang
412
+ }),
413
+ })
414
+ .then(response => {
415
+ if (!response.ok) {
416
+ throw new Error(`HTTP error! Status: ${response.status}`);
417
+ }
418
+ return response.json();
419
+ })
420
+ .then(data => {
421
+ // Process successful response
422
+ })
423
+ .catch(error => {
424
+ // Error handling
425
+ showError(`Translation error: ${error.message}`);
426
+ });
427
+ ```
428
 
429
+ #### 5.3.3. Document Download Implementation
 
430
 
431
+ The document download functionality uses a combination of client-side and server-side processing:
432
 
433
+ ```javascript
434
+ function downloadTranslatedDocument(content, fileName, fileType) {
435
+ // Determine file extension
436
+ let extension = fileName.endsWith('.pdf') ? '.pdf' :
437
+ fileName.endsWith('.docx') ? '.docx' : '.txt';
438
+
439
+ // Create translated filename
440
+ const baseName = fileName.substring(0, fileName.lastIndexOf('.'));
441
+ const translatedFileName = `${baseName}_translated${extension}`;
442
+
443
+ if (extension === '.txt') {
444
+ // Direct browser download for text files
445
+ const blob = new Blob([content], { type: 'text/plain' });
446
+ const url = URL.createObjectURL(blob);
447
+ triggerDownload(url, translatedFileName);
448
+ } else {
449
+ // Server-side processing for complex formats
450
+ fetch('/download/translated-document', {
451
+ method: 'POST',
452
+ headers: { 'Content-Type': 'application/json' },
453
+ body: JSON.stringify({
454
+ content: content,
455
+ filename: translatedFileName,
456
+ original_type: fileType
457
+ }),
458
+ })
459
+ .then(response => response.blob())
460
+ .then(blob => {
461
+ const url = URL.createObjectURL(blob);
462
+ triggerDownload(url, translatedFileName);
463
+ });
464
+ }
465
+ }
466
+
467
+ function triggerDownload(url, filename) {
468
+ const a = document.createElement('a');
469
+ a.href = url;
470
+ a.download = filename;
471
+ document.body.appendChild(a);
472
+ a.click();
473
+ document.body.removeChild(a);
474
+ URL.revokeObjectURL(url);
475
+ }
476
+ ```
477
 
478
  ## 6. Deployment and Scalability
479
 
 
486
  * **Port Exposure:** Exposes port 8000 (used by `uvicorn`).
487
  * **Entrypoint:** Uses `uvicorn` to run the FastAPI application (`backend.main:app`), making it accessible on `0.0.0.0`.
488
 
 
 
489
  ### 6.2. Hugging Face Spaces Deployment
490
 
491
  * **Method:** Uses the Docker Space SDK option.
 
493
  * **Repository:** The project code (including the `Dockerfile` and the `README.md` with HF metadata) needs to be pushed to a Hugging Face Hub repository (either model or space repo).
494
  * **Build Process:** Hugging Face Spaces automatically builds the Docker image from the `Dockerfile` in the repository and runs the container.
495
 
496
+ ### 6.3. Resource Optimization
497
+
498
+ * **Model Caching:** Translation models are stored in a writable cache directory (/tmp/transformers_cache).
499
+ * **Memory Management:** Models use PyTorch's low_cpu_mem_usage option to reduce memory footprint.
500
+ * **Device Placement:** Automatic detection of available hardware (CPU/GPU) with appropriate device placement.
501
+ * **Concurrent Execution:** Uses ThreadPoolExecutor for non-blocking model inference with timeouts.
502
+ * **Initialization Cooldown:** Implements a cooldown period between initialization attempts to prevent resource exhaustion.
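The non-blocking inference pattern above can be sketched as follows (the timeout value is illustrative):

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

# Shared executor so blocking model inference never stalls the event loop.
inference_executor = ThreadPoolExecutor(max_workers=2)

async def run_translation(translator, text: str, timeout: float = 30.0) -> str:
    """Run a blocking transformers pipeline call in a worker thread, with a timeout."""
    loop = asyncio.get_running_loop()
    future = loop.run_in_executor(inference_executor, lambda: translator(text))
    outputs = await asyncio.wait_for(future, timeout=timeout)
    return outputs[0]["translation_text"]
```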
503
+
504
+ ### 6.4. Reliability Mechanisms
505
+
506
+ * **Error Recovery:** Automatic detection and recovery from model failures.
507
+ * **Model Testing:** Validation of loaded models with test translations before use.
508
+ * **Timeouts:** Inference timeouts to prevent hanging on problematic inputs.
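A model smoke test of the kind described here can be as small as the following sketch (test string and length limit are illustrative):

```python
def model_is_healthy(translator) -> bool:
    """Validate a freshly loaded pipeline with a trivial translation before serving it."""
    try:
        outputs = translator("Hello", max_length=32)
        return bool(outputs and outputs[0].get("translation_text", "").strip())
    except Exception:
        return False
```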
509
+
510
+ ## 7. Debugging and Technical Challenges
511
+
512
+ ### 7.1. Frontend Debugging
513
+
514
+ #### 7.1.1. Quick Phrases Functionality
515
+
516
+ Initial implementation of quick phrases had issues with event propagation and tab switching:
517
+
518
+ **Problem:** Quick phrase buttons weren't consistently routing to the text tab or inserting content.
519
+ **Solution:** Added explicit logging and fixed event handling to ensure:
520
+ - Tab switching works properly with proper class manipulation
521
+ - Text insertion considers cursor position correctly
522
+ - Event bubbling is properly managed
523
+
524
+ #### 7.1.2. Language Swap Issues
525
+
526
+ The language swap functionality had several edge cases that needed handling:
527
+
528
+ **Problem:** Swap button didn't properly handle the "auto" language option and didn't consistently apply RTL styling.
529
+ **Solution:** Added conditional logic to prevent swapping when source language is set to "auto" and ensured RTL styling is consistently applied after swapping.
530
+
531
+ #### 7.1.3. File Upload Visual Feedback
532
+
533
+ **Problem:** Users weren't getting clear visual feedback when files were uploaded.
534
+ **Solution:** Added a styled notification system and enhanced the file name display with borders and background colors to make successful uploads more noticeable.
535
+
536
+ ### 7.2. Backend Challenges
537
+
538
+ #### 7.2.1. Model Loading Failures
539
+
540
+ **Problem:** Translation models sometimes failed to initialize in the deployment environment.
541
+ **Solution:** Implemented a multi-tier fallback system that:
542
+ - Attempts model initialization with appropriate error handling
543
+ - Falls back to online translation services when local models fail
544
+ - Implements a cooldown period between initialization attempts
545
+
546
+ ```python
547
+ def initialize_model(language_pair: str):
548
+ # If we've exceeded maximum attempts and cooldown hasn't passed
549
+ if (model_initialization_attempts >= max_model_initialization_attempts and
550
+ current_time - last_initialization_attempt < initialization_cooldown):
551
+ return False
552
+
553
+ try:
554
+ # Model initialization code with explicit error handling
555
+ tokenizer = AutoTokenizer.from_pretrained(
556
+ model_name,
557
+ cache_dir="/tmp/transformers_cache",
558
+ use_fast=True,
559
+ local_files_only=False
560
+ )
561
+ # ... more initialization code
562
+ except Exception as e:
563
+ print(f"Error loading model for {language_pair}: {e}")
564
+ return False
565
+ ```
566
+
567
+ #### 7.2.2. Document Processing
568
+
569
+ **Problem:** Different document formats and encodings caused inconsistent text extraction.
570
+ **Solution:** Implemented format-specific handling with fallbacks for encoding detection:
571
+
572
+ ```python
573
+ if file_extension == '.txt':
574
+ try:
575
+ extracted_text = content.decode('utf-8')
576
+ except UnicodeDecodeError:
577
+ # Try other common encodings
578
+ for encoding in ['latin-1', 'cp1252', 'utf-16']:
579
+ try:
580
+ extracted_text = content.decode(encoding);
581
+ break
582
+ except UnicodeDecodeError:
583
+ continue
584
+ ```
585
+
586
+ #### 7.2.3. Translation Download Formats
587
+
588
+ **Problem:** Generating proper document formats for download from translated text.
589
+ **Solution:** Created format-specific document generation functions that properly handle:
590
+ - PDF creation with PyMuPDF
591
+ - DOCX creation with python-docx
592
+ - Proper MIME types and headers for browser downloads
593
+
594
+ ### 7.3. Integration Testing
595
+
596
+ #### 7.3.1. End-to-End Translation Flow
597
+
598
+ Extensive testing was performed to ensure the complete translation flow worked across different scenarios:
599
+ - Text translation with various language combinations
600
+ - Document upload and translation with different file formats
601
+ - Error scenarios (network failures, invalid inputs)
602
+ - Download functionality for different file types
603
 
604
+ #### 7.3.2. Cross-Browser Testing
605
 
606
+ The application was tested across multiple browsers to ensure consistent behavior:
607
+ - Chrome
608
+ - Firefox
609
+ - Safari
610
+ - Edge
 
611
 
612
+ ## 8. Future Work
613
 
614
+ ### 8.1. Feature Enhancements
615
 
616
+ * **Translation Memory:** Implement translation memory to avoid re-translating previously translated segments.
617
+ * **Terminology Management:** Allow users to define and maintain custom terminology for consistent translations.
618
+ * **Batch Processing:** Enable translation of multiple documents in a single operation.
619
+ * **User Accounts:** Add authentication to allow users to save and manage their translation history.
620
+ * **Additional File Formats:** Extend support to handle more document types (PPTX, XLSX, HTML).
621
+ * **Dialect Support:** Add support for different Arabic dialects beyond Modern Standard Arabic.
622
+ * **API Documentation:** Implement Swagger/OpenAPI documentation for the backend API.
623
 
624
+ ### 8.2. Technical Improvements
625
 
626
+ * **State Management:** Implement a more robust frontend state management solution for complex interactions.
627
+ * **Progressive Web App:** Convert the application to a PWA for offline capabilities.
628
+ * **Unit Testing:** Add comprehensive unit tests for both frontend and backend code.
629
+ * **Model Fine-tuning:** Fine-tune translation models specifically for Arabic eloquence.
630
+ * **Web Workers:** Use web workers for client-side processing of large text translations.
631
+ * **Performance Optimization:** Implement caching and lazy loading for better performance.
 
632
 
633
+ ## 9. Conclusion
634
 
635
+ The Tarjama translation application successfully meets its core objectives of providing high-quality translations between multiple languages with a focus on Arabic eloquence. The implementation features a robust backend with multiple fallback systems, a user-friendly frontend with intuitive interactions, and comprehensive document handling capabilities.
 
636
 
637
+ Key achievements include:
638
+ - Implementation of a reliable multi-model translation system
639
+ - Robust fallback mechanisms ensuring service availability
640
+ - Intuitive UI for both text and document translation
641
+ - Support for language switching and bidirectional translation
642
+ - Document upload, translation, and download in multiple formats
643
+ - Quick phrase functionality for common translation needs
644
 
645
+ The application demonstrates how modern web technologies and AI models can be combined to create practical, user-friendly language tools that respect cultural nuances and focus on natural, eloquent translations.