| # 3D Person Segmentation and Anaglyph Generation | |
| title: Object Segmentation | |
| emoji: π | |
| colorFrom: gray | |
| colorTo: pink | |
| sdk: gradio | |
| sdk_version: 5.22.0 | |
| app_file: src/app.py | |
| pinned: false | |
| ## Lab Report | |
| ### Introduction | |
| This project implements an image processing system that combines person segmentation with stereoscopic and anaglyph image generation. The main objectives were to: | |
| 1. Segment people from images using a pretrained semantic segmentation model | |
| 2. Generate stereoscopic 3D effects from 2D images | |
| 3. Create red-cyan anaglyph images for 3D viewing | |
| 4. Provide an interactive web interface for real-time processing | |
| ### Methodology | |
| #### Tools and Technologies Used | |
| - **SegFormer (nvidia/segformer-b0)**: Transformer-based model for semantic segmentation | |
| - **PyTorch**: Deep learning framework for running the SegFormer model | |
| - **OpenCV**: Image processing operations and mask refinement | |
| - **Gradio**: Web interface development | |
| - **NumPy**: Efficient array operations for image manipulation | |
| - **PIL (Python Imaging Library)**: Image loading and basic transformations | |
| #### Implementation Steps | |
| 1. **Person Segmentation** | |
| - Used a SegFormer model fine-tuned on the ADE20K dataset | |
| - Applied post-processing with erosion and Gaussian blur for mask refinement | |
| - Implemented mask scaling and centering for various input sizes | |
| 2. **Stereoscopic Processing** | |
| - Created depth simulation through horizontal pixel shifting | |
| - Implemented parallel view stereo pair generation | |
| - Added configurable interaxial distance for 3D effect adjustment | |
| 3. **Anaglyph Generation** | |
| - Combined left and right eye views into red-cyan anaglyph | |
| - Implemented color channel separation and recombination | |
| - Added background image support with proper masking | |
| 4. **User Interface** | |
| - Developed interactive web interface using Gradio | |
| - Added real-time parameter adjustment capabilities | |
| - Implemented support for custom background images | |
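
The stereoscopic step above can be sketched roughly as follows. This is a minimal NumPy-only illustration, not the code in `src/app.py`: the function names (`shift_horizontal`, `composite`, `make_stereo_pair`) are hypothetical, the mask is assumed to be a float array in [0, 1], and the person and background are assumed to be RGB arrays of the same size.

```python
import numpy as np


def shift_horizontal(image: np.ndarray, dx: int) -> np.ndarray:
    """Shift an H x W (x C) array dx pixels right (left if dx < 0), zero-filling the gap."""
    if dx == 0:
        return image.copy()
    shifted = np.zeros_like(image)
    if abs(dx) >= image.shape[1]:
        return shifted  # everything moved out of frame
    if dx > 0:
        shifted[:, dx:] = image[:, :-dx]
    else:
        shifted[:, :dx] = image[:, -dx:]
    return shifted


def composite(person: np.ndarray, mask: np.ndarray, background: np.ndarray) -> np.ndarray:
    """Alpha-blend the person over the background using a float mask in [0, 1]."""
    alpha = mask[..., None].astype(np.float32)
    blended = alpha * person + (1.0 - alpha) * background
    return np.clip(np.rint(blended), 0, 255).astype(np.uint8)


def make_stereo_pair(person, mask, background, interaxial: int):
    """Return (left, right) views with the person shifted in opposite directions.

    Assumption: the total horizontal offset between the two views equals
    `interaxial` pixels, and the background stays fixed in both views.
    """
    half = interaxial // 2
    rest = interaxial - half
    left = composite(shift_horizontal(person, half),
                     shift_horizontal(mask, half), background)
    right = composite(shift_horizontal(person, -rest),
                      shift_horizontal(mask, -rest), background)
    return left, right
```

A side-by-side image for parallel viewing can then be built with `np.hstack([left, right])`. Shifting the mask together with the person keeps the soft mask edges aligned with the shifted pixels.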
| ### Results | |
| The system produces three main outputs: | |
| 1. Segmentation mask showing the isolated person | |
| 2. Side-by-side stereo pair for parallel viewing | |
| 3. Red-cyan anaglyph image for 3D glasses viewing | |
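
As an illustration of the third output, a red-cyan anaglyph is commonly built by taking the red channel from the left-eye view and the green and blue channels from the right-eye view. The sketch below assumes both views are RGB arrays of identical shape; `make_anaglyph` is a hypothetical name and may differ from what the app does internally.

```python
import numpy as np


def make_anaglyph(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    """Combine two RGB views into a red-cyan anaglyph.

    Assumes channel order R, G, B. With OpenCV's default BGR order,
    the red channel would be index 2 instead of 0.
    """
    if left.shape != right.shape:
        raise ValueError("left and right views must have the same shape")
    anaglyph = right.copy()          # keeps green and blue from the right view
    anaglyph[..., 0] = left[..., 0]  # red comes from the left view
    return anaglyph
```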
| Key Features: | |
| - Adjustable person size (10-200%) | |
| - Configurable interaxial distance (0-10 pixels) | |
| - Optional custom background support | |
| - Real-time processing and preview | |
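
The person-size setting can be pictured as a resize by a percentage. The sketch below uses nearest-neighbour index mapping in plain NumPy to stay dependency-free; the app itself presumably relies on an OpenCV or PIL resize, so treat `resize_nearest` and its range check as assumptions for illustration.

```python
import numpy as np


def resize_nearest(image: np.ndarray, scale_percent: float) -> np.ndarray:
    """Resize an H x W (x C) array by scale_percent using nearest-neighbour sampling."""
    if not 10 <= scale_percent <= 200:
        raise ValueError("scale_percent is expected to be between 10 and 200")
    height, width = image.shape[:2]
    new_h = max(1, round(height * scale_percent / 100))
    new_w = max(1, round(width * scale_percent / 100))
    # Map every output row/column back to a source row/column.
    rows = np.arange(new_h) * height // new_h
    cols = np.arange(new_w) * width // new_w
    return image[rows[:, None], cols]
```

The same function works for both the person image and its mask, which keeps the two aligned after scaling.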
| ### Discussion | |
| #### Technical Challenges | |
| 1. **Mask Alignment**: Ensuring proper alignment between segmentation masks and background images required careful consideration of image dimensions and aspect ratios. | |
| 2. **Stereo Effect Quality**: The interaxial distance had to be balanced between comfortable viewing and a clearly visible 3D effect. | |
| 3. **Performance Optimization**: Large images had to be processed efficiently enough to keep the interface interactive. | |
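
One way to handle the mask alignment problem is to place the (possibly rescaled) person or mask on a canvas that has the background's dimensions, cropping whatever overflows. This is a sketch under that assumption; `center_on_canvas` is a hypothetical helper, not a function taken from the project.

```python
import numpy as np


def center_on_canvas(image: np.ndarray, canvas_shape: tuple) -> np.ndarray:
    """Center an H x W (x C) array on a zero canvas of (height, width), cropping overflow."""
    canvas_h, canvas_w = canvas_shape
    height, width = image.shape[:2]
    canvas = np.zeros((canvas_h, canvas_w) + image.shape[2:], dtype=image.dtype)

    # Size of the region that fits on the canvas.
    copy_h, copy_w = min(height, canvas_h), min(width, canvas_w)
    # Top-left corner of that region in the source and in the destination.
    src_y, src_x = (height - copy_h) // 2, (width - copy_w) // 2
    dst_y, dst_x = (canvas_h - copy_h) // 2, (canvas_w - copy_w) // 2

    canvas[dst_y:dst_y + copy_h, dst_x:dst_x + copy_w] = \
        image[src_y:src_y + copy_h, src_x:src_x + copy_w]
    return canvas
```

Applying the helper to both the person image and its mask with the background's `(height, width)` yields arrays that can be blended with the background without further size checks.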
| #### Learning Outcomes | |
| - Deep understanding of stereoscopic image generation | |
| - Experience with transformer-based segmentation models | |
| - Practical knowledge of image processing techniques | |
| - Web interface development for ML applications | |
| ### Conclusion | |
| This project successfully demonstrates the integration of modern AI-powered segmentation with classical stereoscopic image processing techniques. The system provides an accessible way to create 3D effects from regular 2D images. | |
| #### Future Work | |
| - Implementation of depth-aware 3D effect generation | |
| - Support for video processing | |
| - Additional 3D viewing formats (side-by-side, over-under) | |
| - Enhanced background replacement options | |
| - Mobile device optimization | |
| ## Setup | |
| ```bash | |
| pip install -r requirements.txt | |
| ``` | |
| ## Usage | |
| ```bash | |
| cd src | |
| python app.py | |
| ``` | |
| ## Parameters | |
| - **Person Image**: Upload an image containing a person | |
| - **Background Image**: (Optional) Custom background image | |
| - **Interaxial Distance**: Adjust the 3D effect strength (0-10 pixels) | |
| - **Person Size**: Adjust the size of the person in the output (10-200%) | |
| ## Output Types | |
| 1. **Segmentation Mask**: Shows the isolated person | |
| 2. **Stereo Pair**: Side-by-side stereo image for parallel viewing | |
| 3. **Anaglyph**: Red-cyan 3D image viewable with anaglyph glasses | |