---
title: Gambling Comment Filter
emoji: 🎲
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: mit
short_description: A PoC filter for detecting gambling-related comments
---
# Gambling Comment Filter
A proof-of-concept filter for detecting online gambling-related comments. The project is built with FastAPI and designed for deployment on Hugging Face Spaces. It combines Unicode normalization (via unidecode and a custom visual mapping) with dynamic rule management to catch obfuscated gambling content in comments.
## Features
- **Robust Text Normalization:** Converts fancy or obfuscated Unicode characters (bold, italic, fullwidth, Cyrillic/Greek lookalikes) into plain ASCII.
- **Dynamic Rule Management:** Add or update filtering rules (platform names, gambling terms, safe indicators, gambling contexts, ambiguous terms) on the fly using a web interface.
- **File Upload Support:** Process comments in bulk by uploading CSV or Excel files.
- **Score-Based Classification:** Uses a scoring algorithm to determine if a comment is gambling-related based on multiple signals.
- **Hugging Face Spaces Ready:** Deploy your project easily with a Dockerfile and run it as a Hugging Face Space.
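
The normalization step listed above can be illustrated with a minimal sketch. This is not the project's `_robust_normalize`: to stay dependency-free it swaps in the standard library's `unicodedata` for unidecode, and the function name `robust_normalize_sketch` and the small `VISUAL_MAP` below are hypothetical stand-ins chosen for illustration.

```python
import unicodedata

# Hypothetical subset of a visual mapping: lookalike letters that Unicode
# compatibility normalization does NOT fold to ASCII on its own.
VISUAL_MAP = {
    "\u0430": "a",  # Cyrillic small a
    "\u0435": "e",  # Cyrillic small ie
    "\u043e": "o",  # Cyrillic small o
    "\u0440": "p",  # Cyrillic small er
    "\u0441": "c",  # Cyrillic small es
    "\u0445": "x",  # Cyrillic small ha
    "\u03b1": "a",  # Greek small alpha
    "\u03bf": "o",  # Greek small omicron
}


def robust_normalize_sketch(text: str) -> str:
    """Fold styled and lookalike characters to lowercase plain letters."""
    # NFKD maps compatibility forms (fullwidth, mathematical bold/italic)
    # to their base letters and splits accents into combining marks.
    decomposed = unicodedata.normalize("NFKD", text).lower()
    chars = []
    for ch in decomposed:
        if unicodedata.combining(ch):
            continue  # drop accents and other combining marks
        chars.append(VISUAL_MAP.get(ch, ch))
    return "".join(chars)


# Fullwidth letters, then Latin mixed with Cyrillic/Greek capitals.
print(robust_normalize_sketch("\uff33\uff2c\uff2f\uff34"))
print(robust_normalize_sketch("J\u0410\u0421KP\u039fT"))
```

Lowercasing before the lookup keeps the mapping small, since only lowercase lookalikes need entries.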
## Project Structure
```
gambling-comment-filter/
├── app.py              # Main FastAPI application with filtering logic and endpoints
├── requirements.txt    # Python dependencies
├── Dockerfile          # Docker configuration for deployment on Hugging Face Spaces
└── templates/
    └── index.html      # HTML template for the web interface
```
## Requirements
- Python 3.9+
- [FastAPI](https://fastapi.tiangolo.com/)
- [Uvicorn](https://www.uvicorn.org/)
- [Jinja2](https://palletsprojects.com/p/jinja/)
- [Pandas](https://pandas.pydata.org/)
- [openpyxl](https://openpyxl.readthedocs.io/en/stable/)
- [unidecode](https://pypi.org/project/Unidecode/)
## Setup and Local Testing
1. **Clone the Repository**
```bash
git clone https://huggingface.co/spaces/ariansyahdedy/gambling-comment-filter
cd gambling-comment-filter
```
2. **Create a Virtual Environment and Install Dependencies**
```bash
python -m venv venv
source venv/bin/activate # On Windows use: venv\Scripts\activate
pip install -r requirements.txt
```
3. **Run the Application**
```bash
uvicorn app:app --reload
```
4. **Access the Web Interface**
Open your browser and visit http://localhost:8000
## Deployment on Hugging Face Spaces
1. **Create a New Space**
Go to Hugging Face Spaces and create a new Space using the Docker runtime.
2. **Push Your Local Project to the Space**
```bash
cd path/to/gambling-comment-filter
git init # if not already a git repo
git add .
git commit -m "Initial commit for Gambling Comment Filter"
git branch -M main # make sure the local branch is named main before pushing
git remote add hf https://huggingface.co/spaces/ariansyahdedy/gambling-comment-filter
git push hf main
```
The Space will automatically build and deploy your project.
## Customization
* **Updating Rules:** Use the web interface to add new rules via the `/add_rule` endpoint.
* **Visual Mapping:** The `_robust_normalize` function uses a `VISUAL_MAP` dictionary to convert fancy characters into plain ASCII. You can update this mapping directly in `app.py` or add new entries through the `/add_visual_char` endpoint.
* **Scoring:** Adjust the scoring logic in `is_gambling_comment` if you want to tweak the sensitivity.
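
To make the scoring idea concrete, here is a minimal sketch of a score-based check over the five rule categories this README names. It is not the logic in `app.py`: the `RuleSet` class, the function names, the example terms, the weights, and the threshold are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class RuleSet:
    """Hypothetical in-memory rule store; every term below is a placeholder."""
    platform_names: set = field(default_factory=lambda: {"examplebet88"})
    gambling_terms: set = field(default_factory=lambda: {"slot", "jackpot"})
    safe_indicators: set = field(default_factory=lambda: {"tutorial", "review"})
    gambling_contexts: set = field(default_factory=lambda: {"deposit", "bonus"})
    ambiguous_terms: set = field(default_factory=lambda: {"spin", "bet"})


def score_comment(text: str, rules: RuleSet) -> int:
    """Sum weighted signals; the weights are assumed, not taken from app.py."""
    tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
    score = 3 * len(tokens & rules.platform_names)
    score += 2 * len(tokens & rules.gambling_terms)
    score -= 2 * len(tokens & rules.safe_indicators)
    # Ambiguous terms only count when a gambling context word is also present.
    if tokens & rules.gambling_contexts:
        score += len(tokens & rules.ambiguous_terms)
    return score


def is_gambling_comment_sketch(text: str, rules: RuleSet, threshold: int = 3) -> bool:
    """Flag a comment when its score reaches the (assumed) threshold."""
    return score_comment(text, rules) >= threshold


rules = RuleSet()
print(is_gambling_comment_sketch("Main di examplebet88, jackpot terus!", rules))
print(is_gambling_comment_sketch("Great tutorial on slot machine design", rules))

# Rules are plain sets, so they can be extended at runtime.
rules.gambling_terms.add("togel")
```

In a sketch like this, raising `threshold` or the safe-indicator penalty makes the filter less sensitive, while raising the platform-name weight makes a single platform mention enough to flag a comment.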
## License
This project is licensed under the MIT License. See the LICENSE file for details.
## Contributing
Contributions are welcome! Please open an issue or submit a pull request for any improvements or bug fixes.