marianeft committed on
Commit 65a16a1 · verified · 1 Parent(s): 15dba6b

Update README.md with Demo and Documentation

Files changed (1)
  1. README.md +193 -1
README.md CHANGED
@@ -8,4 +8,196 @@ pinned: false
8
  license: apache-2.0
9
  ---
10
 
11
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
11
+ # Handwritten Name Recognition (OCR) App ✍🏻
12
+
13
+ *An end-to-end Streamlit application for training and predicting handwritten names using a CRNN model.*
14
+
15
+ [📃 Demo and Documentation](https://drive.google.com/drive/folders/1rOmwyTJkDCsU-Wuh-_CzvQ9sdb_ci_kX?usp=sharing)
16
+ [📂 GitHub Repository](https://github.com/marianeft/handwritten_name_ocr_app.git)
17
+
18
+ ---
19
+
20
+ ## Table of Contents
21
+
22
+ * [Overview](#overview)
23
+ * [Quickstart](#quickstart)
24
+ * [Features](#features)
25
+ * [Project Structure](#project-structure)
26
+ * [Project Index](#project-index)
27
+ * [Roadmap](#roadmap)
28
+ * [Contribution](#contribution)
29
+ * [License](#license)
30
+ * [Acknowledgements](#acknowledgements)
31
+
32
+ ---
33
+
34
+ ## 🕹️ Overview
35
+
36
+ This project implements a Handwritten Name Recognition (OCR) system using a Convolutional Recurrent Neural Network (CRNN) architecture built with PyTorch. The application is presented as an interactive web interface using Streamlit, allowing users to:
37
+
38
+ 1. **Train** a new OCR model from a local dataset.
39
+ 2. **Load** a pre-trained model.
40
+ 3. **Predict** text from uploaded handwritten image files.
41
+ 4. **Upload** the local dataset to the Hugging Face Hub for sharing and versioning.
42
+
43
+ The CRNN model combines a CNN backbone that extracts features from the image with a bidirectional LSTM layer for sequence modeling. A final linear layer produces per-time-step character scores, and the model is trained with CTC (Connectionist Temporal Classification) loss.
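
To make the CTC step more concrete: at prediction time a CTC-trained model emits one score vector per time step, and a greedy decoder keeps the best class at each step, merges consecutive repeats, and drops the blank symbol. The sketch below illustrates that idea in plain Python. It is not the project's `ctc_greedy_decode`; the function name `ctc_greedy_decode_sketch`, the blank index `0`, and the toy alphabet are assumptions made for illustration.

```python
from typing import Dict, List, Sequence

BLANK = 0  # assumed blank index; the app may reserve a different one


def ctc_greedy_decode_sketch(step_scores: Sequence[Sequence[float]],
                             idx_to_char: Dict[int, str],
                             blank: int = BLANK) -> str:
    """Take the best class per time step, merge repeats, then drop blanks."""
    best_path = [max(range(len(scores)), key=scores.__getitem__)
                 for scores in step_scores]
    chars: List[str] = []
    previous = None
    for idx in best_path:
        # Emit a character only when it differs from the previous step
        # and is not the blank symbol.
        if idx != previous and idx != blank:
            chars.append(idx_to_char[idx])
        previous = idx
    return "".join(chars)


# Toy example: 3 classes (0 = blank, 1 = "A", 2 = "N") over 6 time steps.
toy_scores = [
    [0.10, 0.80, 0.10],  # A
    [0.10, 0.70, 0.20],  # A again (merged with the previous step)
    [0.90, 0.05, 0.05],  # blank
    [0.20, 0.10, 0.70],  # N
    [0.80, 0.10, 0.10],  # blank
    [0.10, 0.20, 0.70],  # N (kept, because a blank separates it from the last N)
]
decoded = ctc_greedy_decode_sketch(toy_scores, {1: "A", 2: "N"})
print(decoded)  # ANN
```

The blank between the two `N` steps is what lets CTC represent doubled letters, which matters for names such as "ANNA".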
44
+
45
+ ---
46
+
47
+ ## 🚩 Quickstart
48
+
49
+ Follow these steps to get the application up and running on your local machine.
50
+
51
+ ### Prerequisites
52
+
53
+ * Python 3.8+
54
+ * `pip` (Python package installer)
55
+
56
+ #### 1. Clone the Repository (or set up your project folder)
57
+
58
+ Clone the repository with `git clone https://github.com/marianeft/handwritten_name_ocr_app.git`, or make sure your own project folder matches the expected layout (e.g., `app.py`, `config.py`, `data/`, `models/`).
59
+
60
+ #### 2. Create and Activate a Virtual Environment
61
+ It's highly recommended to use a virtual environment to manage dependencies.
62
+
63
+ ``` bash
64
+ # Navigate to your project root directory
65
+ cd path/to/your/handwritten_name_ocr_app
66
+
67
+ # Create a virtual environment named 'venvy'
68
+ python -m venv venvy
69
+
70
+ # Activate the virtual environment
71
+ # On Windows (Command Prompt):
72
+ .\venvy\Scripts\activate.bat
73
+
74
+ # On Windows (PowerShell):
75
+ .\venvy\Scripts\Activate.ps1
76
+
77
+ # On macOS/Linux:
78
+ source venvy/bin/activate
79
+ ```
80
+
81
+ #### 3. Install Dependencies
82
+ With your virtual environment activated, install all required Python packages:
83
+ `pip install streamlit pandas numpy Pillow torch torchvision scikit-learn tqdm editdistance huggingface_hub`
84
+
85
+
86
+ *Note on PyTorch (`torch` and `torchvision`):
87
+ Depending on your platform, the command above may install a CPU-only build of PyTorch. If you have a CUDA-enabled GPU and want faster training, see the official PyTorch website (pytorch.org/get-started/locally/) for the installation command that matches your CUDA version.*
88
+
89
+ #### 4. Prepare Your Dataset
90
+ The application expects a dataset structured as follows:
91
+ ``` bash
92
+ data/
93
+ ├── images/
94
+ │   ├── train/
95
+ │   │   ├── image1.png
96
+ │   │   ├── image2.png
97
+ │   │   └── ...
98
+ │   └── test/
99
+ │       ├── image_test1.png
100
+ │       ├── image_test2.png
101
+ │       └── ...
102
+ ├── train.csv
103
+ └── test.csv
104
+ ```
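
Before training, it can help to confirm that every image named in a label file actually exists in the matching image folder. The sketch below does that with the standard library only. The README does not specify the CSV columns, so the column name `FILENAME` is an assumption, as is the helper name `find_missing_images`; adjust both to your files.

```python
import csv
from pathlib import Path
from typing import List


def find_missing_images(data_dir: Path, split: str,
                        filename_column: str = "FILENAME") -> List[str]:
    """Return file names listed in <split>.csv that are absent from images/<split>/.

    `filename_column` is a hypothetical column name, not one confirmed
    by the project.
    """
    csv_path = data_dir / f"{split}.csv"
    image_dir = data_dir / "images" / split
    with csv_path.open(newline="", encoding="utf-8") as handle:
        rows = list(csv.DictReader(handle))
    return [row[filename_column] for row in rows
            if not (image_dir / row[filename_column]).is_file()]
```

For the layout above, `find_missing_images(Path("data"), "train")` and `find_missing_images(Path("data"), "test")` should both return an empty list when labels and images are in sync.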
105
+
106
+ #### 5. Clear Python Cache *(Important!)*
107
+ Python normally rebuilds its bytecode cache when a source file changes, but if the app still seems to run stale code after you edit files or install packages, clearing the `__pycache__` directories rules that out.
108
+
109
+ ```bash
110
+ find . -name "__pycache__" -exec rm -rf {} + # For macOS/Linux
111
+
112
+ Get-ChildItem -Path . -Include __pycache__ -Recurse | Remove-Item -Recurse -Force # For Windows PowerShell
113
+ ```
114
+
115
+ #### 6. Run the Streamlit Application
116
+ With your virtual environment activated and dependencies installed:
117
+ `streamlit run app.py`
118
+
119
+
120
+ *This will open the application in your web browser.*
121
+
122
+ ## ✏️ Features
123
+ - **CRNN Model Architecture**: Utilizes a Convolutional Recurrent Neural Network for robust OCR.
124
+ - **CTC Loss**: Employs Connectionist Temporal Classification for sequence prediction.
125
+ - **Model Training**: Train a new OCR model from your local image and CSV datasets.
126
+ - **Pre-trained Model Loading**: Load previously saved models to avoid retraining.
127
+ - **Handwritten Text Prediction**: Upload an image and get instant text recognition.
128
+ - **Training Progress Visualization**: Real-time updates and plots for training loss, CER, and accuracy.
129
+ - **Hugging Face Hub Integration**: Seamlessly upload your dataset to the Hugging Face Hub for easy sharing and version control.
130
+ - **Responsive UI**: Built with Streamlit for an intuitive and user-friendly experience.
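
The character error rate (CER) reported during training is commonly defined as the total edit distance between predictions and labels divided by the total number of label characters. The README does not state how the app aggregates CER, so the sketch below shows that common definition under that assumption. The project uses the `editdistance` package; a small pure-Python Levenshtein function stands in for it here so the example runs without extra dependencies, and both function names are illustrative.

```python
from typing import Sequence


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Edit distance computed with a two-row dynamic programme."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def character_error_rate(predictions: Sequence[str],
                         targets: Sequence[str]) -> float:
    """Total edits divided by total reference characters (assumed definition)."""
    total_edits = sum(levenshtein(p, t) for p, t in zip(predictions, targets))
    total_chars = sum(len(t) for t in targets)
    return total_edits / total_chars if total_chars else 0.0


# One substitution ("M" for "N") across 9 reference characters.
cer = character_error_rate(["MARIA", "JOHM"], ["MARIA", "JOHN"])
print(round(cer, 3))  # 0.111
```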
131
+
132
+
133
+ ## πŸ—οΈ Project Structure
134
+ ```
135
+ handwritten_name_ocr_app/
136
+ ├── app.py                       # Main Streamlit application file
137
+ ├── config.py                    # Configuration settings (paths, model params, chars)
138
+ ├── data/                        # Directory for datasets
139
+ │   ├── images/
140
+ │   │   ├── train/               # Training images
141
+ │   │   └── test/                # Testing images
142
+ │   ├── train.csv                # Training labels
143
+ │   └── test.csv                 # Testing labels
144
+ ├── data_handler_ocr.py          # Custom PyTorch Dataset and DataLoader logic
145
+ ├── models/                      # Directory to save/load trained models
146
+ │   └── handwritten_name_ocr_model.pth  # Default model save path
147
+ ├── model_ocr.py                 # Defines the CRNN model architecture and training/evaluation functions
148
+ ├── utils_ocr.py                 # Utility functions for image preprocessing
149
+ ├── requirements.txt             # List of Python dependencies
150
+ └── venvy/                       # Python virtual environment (created by `python -m venv venvy`)
151
+     └── ...
152
+ ```
153
+
154
+ ## 🗃️ Project Index
155
+
156
+ `app.py`: The central Streamlit application. Handles UI, triggers training/prediction, and integrates with Hugging Face Hub.
157
+
158
+ `config.py`: Stores global configuration variables such as file paths, image dimensions, character sets, and training hyperparameters.
159
+
160
+ `data_handler_ocr.py`: Contains the `CharIndexer` class for character-to-index mapping, plus `OCRDataset` and `ocr_collate_fn` for loading and batching data in PyTorch.
161
+
162
+ `model_ocr.py`: Defines the `CNN_Backbone`, `BidirectionalLSTM`, and `CRNN` (the main OCR model) classes. It also provides the functions `train_ocr_model`, `evaluate_model`, `save_ocr_model`, `load_ocr_model`, and `ctc_greedy_decode`.
163
+
164
+ `utils_ocr.py`: Provides helper functions for image preprocessing (binarization, resizing, and normalization) applied before images are fed to the model.
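
The character-to-index mapping that the index attributes to `data_handler_ocr.py` might look roughly like the sketch below. This is not the project's `CharIndexer`: the class name `CharIndexerSketch`, its method names, the reserved blank index `0`, and the sample alphabet are all assumptions chosen for illustration.

```python
from typing import Dict, List

BLANK_INDEX = 0  # assumed: index 0 is reserved for the CTC blank


class CharIndexerSketch:
    """Hypothetical stand-in for a character-to-index mapper."""

    def __init__(self, chars: str) -> None:
        # Real characters start at 1 because 0 is kept for the blank.
        self.char_to_idx: Dict[str, int] = {c: i + 1 for i, c in enumerate(chars)}
        self.idx_to_char: Dict[int, str] = {i: c for c, i in self.char_to_idx.items()}

    @property
    def num_classes(self) -> int:
        """Number of output classes, counting the blank."""
        return len(self.char_to_idx) + 1

    def encode(self, text: str) -> List[int]:
        """Map text to indices; raises KeyError for characters outside the set."""
        return [self.char_to_idx[c] for c in text]

    def decode(self, indices: List[int]) -> str:
        """Map indices back to text, skipping the blank."""
        return "".join(self.idx_to_char[i] for i in indices if i != BLANK_INDEX)


indexer = CharIndexerSketch("ABCDEFGHIJKLMNOPQRSTUVWXYZ -'")
encoded = indexer.encode("ANNA")
print(encoded)                  # [1, 14, 14, 1]
print(indexer.decode(encoded))  # ANNA
```

Keeping the blank outside the character set means the model's final layer needs `num_classes` outputs, one more than the number of characters.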
165
+
166
+
167
+
168
+ ## 📌 Roadmap
169
+ - Advanced Data Augmentation: Implement more sophisticated augmentation techniques (e.g., elastic deformations, random noise) for training data.
170
+ - Beam Search Decoding: Replace greedy decoding with beam search for potentially more accurate predictions.
171
+ - Error Analysis Dashboard: Integrate a more detailed error analysis section to visualize common recognition mistakes.
172
+ - Support for Multiple Languages: Extend character sets and train on multilingual datasets.
173
+ - Deployment to Cloud Platforms: Provide instructions for deploying the Streamlit app to platforms like Hugging Face Spaces, Heroku, or AWS.
174
+ - Pre-trained Model Download: Allow users to download pre-trained models directly from Hugging Face Hub.
175
+ - Interactive Drawing Pad: Enable users to draw a name directly in the app for recognition.
176
+
177
+ ## 🎁 Contribution
178
+ Contributions are welcome! If you have suggestions, bug reports, or code to contribute, fork the repository and then:
179
+ - Create a new branch (`git checkout -b feature/your-feature-name`).
180
+ - Make your changes.
181
+ - Commit your changes (`git commit -m 'Add new feature'`).
182
+ - Push to the branch (`git push origin feature/your-feature-name`).
183
+ - Open a Pull Request.
184
+
185
+ ## ⚖️ License
186
+ This project is licensed under the MIT License - see the LICENSE file for details.
187
+
188
+ ## ✨ Acknowledgements
189
+ **Streamlit**: For building interactive web applications with ease.
190
+
191
+ **PyTorch**: The open-source machine learning framework.
192
+
193
+ **Hugging Face Hub**: For model and dataset sharing.
194
+
195
+ **OpenCV**: For image processing utilities (implicitly used via utils_ocr).
196
+
197
+ **EditDistance**: For efficient calculation of character error rate.
198
+
199
+ **tqdm**: For progress bars during training.
200
+
201
+ ---
202
+
203
+ *Built using Streamlit, PyTorch, OpenCV, and EditDistance © 2025 by **MFT***