---
title: Gambling Comment Filter
emoji: 🎲
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: mit
short_description: A PoC filter for detecting gambling-related comments
---
# Gambling Comment Filter

A high-performance filter for detecting comments related to online gambling. The project is built with FastAPI and designed for deployment on Hugging Face Spaces. It combines Unicode normalization (via `unidecode` and a custom visual mapping) with dynamic rule management to catch obfuscated gambling content in comments.
## Features

- **Robust Text Normalization:** Converts fancy or obfuscated Unicode characters (bold, italic, fullwidth, Cyrillic/Greek lookalikes) into plain ASCII.
- **Dynamic Rule Management:** Add or update filtering rules (platform names, gambling terms, safe indicators, gambling contexts, ambiguous terms) on the fly using a web interface.
- **File Upload Support:** Process comments in bulk by uploading CSV or Excel files.
- **Score-Based Classification:** Uses a scoring algorithm to determine if a comment is gambling-related based on multiple signals.
- **Hugging Face Spaces Ready:** Deploy your project easily with a Dockerfile and run it as a Hugging Face Space.
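
The normalization feature above can be illustrated with a minimal sketch. This is not the code from `app.py`: it substitutes the standard library's NFKC normalization for `unidecode` so that it runs without extra dependencies, and the function name `normalize_sketch` and the contents of this small `VISUAL_MAP` are assumptions made for illustration only.

```python
import unicodedata

# Hypothetical lookalike table: Cyrillic/Greek letters that resemble Latin ones.
# The real VISUAL_MAP in app.py may contain different or additional entries.
VISUAL_MAP = {
    "\u0430": "a",  # Cyrillic a
    "\u0435": "e",  # Cyrillic ie
    "\u043e": "o",  # Cyrillic o
    "\u0440": "p",  # Cyrillic er
    "\u0441": "c",  # Cyrillic es
    "\u03b1": "a",  # Greek alpha
    "\u03bf": "o",  # Greek omicron
}


def normalize_sketch(text: str) -> str:
    """Fold styled Unicode letters and common lookalikes to lowercase ASCII."""
    # NFKC folds compatibility forms (mathematical bold/italic, fullwidth)
    # into their plain counterparts.
    folded = unicodedata.normalize("NFKC", text).lower()
    # Lookalikes from other scripts have no such decomposition,
    # so they are replaced through the explicit table.
    return "".join(VISUAL_MAP.get(ch, ch) for ch in folded)


if __name__ == "__main__":
    # Fullwidth letters, and a Latin word with a Cyrillic "o" in the middle.
    print(normalize_sketch("\uff53\uff4c\uff4f\uff54"))
    print(normalize_sketch("sl\u043et"))
```

Running normalization before rule matching means a rule such as `slot` only has to be written once in plain ASCII, regardless of how the comment disguises it.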
## Project Structure

```
gambling-comment-filter/
├── app.py              # Main FastAPI application with filtering logic and endpoints
├── requirements.txt    # Python dependencies
├── Dockerfile          # Docker configuration for deployment on Hugging Face Spaces
└── templates/
    └── index.html      # HTML template for the web interface
```
## Requirements

- Python 3.9+
- [FastAPI](https://fastapi.tiangolo.com/)
- [Uvicorn](https://www.uvicorn.org/)
- [Jinja2](https://palletsprojects.com/p/jinja/)
- [Pandas](https://pandas.pydata.org/)
- [openpyxl](https://openpyxl.readthedocs.io/en/stable/)
- [unidecode](https://pypi.org/project/Unidecode/)
## Setup and Local Testing

1. **Clone the Repository**

   ```bash
   git clone https://huggingface.co/spaces/ariansyahdedy/gambling-comment-filter
   cd gambling-comment-filter
   ```

2. **Create a Virtual Environment and Install Dependencies**

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows use: venv\Scripts\activate
   pip install -r requirements.txt
   ```

3. **Run the Application**

   ```bash
   uvicorn app:app --reload
   ```

4. **Access the Web Interface**

   Open your browser and visit http://localhost:8000
## Deployment on Hugging Face Spaces

1. **Create a New Space**

   Go to Hugging Face Spaces and create a new Space using the Docker runtime.

2. **Push Your Local Project to the Space**

   ```bash
   cd path/to/gambling-comment-filter
   git init  # if not already a git repo
   git add .
   git commit -m "Initial commit for Gambling Comment Filter"
   git remote add hf https://huggingface.co/spaces/ariansyahdedy/gambling-comment-filter
   git push hf main
   ```

   The Space will automatically build and deploy your project.
## Customization

* **Updating Rules:** Use the web interface to add new rules via the `/add_rule` endpoint.
* **Visual Mapping:** The `_robust_normalize` function uses a `VISUAL_MAP` dictionary to convert fancy characters into plain ASCII. You can update this mapping directly in `app.py` or add new entries through the `/add_visual_char` endpoint.
* **Scoring:** Adjust the scoring logic in `is_gambling_comment` if you want to tweak the sensitivity.
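
To give a feel for what tuning the rules and the scoring involves, here is a rough sketch of a rule store and a weighted score over the five rule categories listed under Features. It is not the logic of `is_gambling_comment`: the names `RuleStore`, `score_comment`, and `is_gambling_sketch`, the weights, the threshold, and the example terms are all hypothetical.

```python
from dataclasses import dataclass, field

# Rule categories named in this README; the identifiers are assumptions.
CATEGORIES = (
    "platform_name",
    "gambling_term",
    "gambling_context",
    "safe_indicator",
    "ambiguous_term",
)

# Hypothetical weights and threshold; raising THRESHOLD lowers sensitivity.
WEIGHTS = {
    "platform_name": 3,
    "gambling_term": 2,
    "gambling_context": 1,
    "safe_indicator": -2,
}
THRESHOLD = 3


@dataclass
class RuleStore:
    """In-memory rules that can be extended while the app is running."""

    rules: dict = field(default_factory=lambda: {c: set() for c in CATEGORIES})

    def add_rule(self, category: str, term: str) -> None:
        if category not in self.rules:
            raise ValueError(f"unknown category: {category}")
        self.rules[category].add(term.lower())


def score_comment(text: str, store: RuleStore) -> int:
    """Sum the weighted hits per category for a whitespace-tokenized comment."""
    words = set(text.lower().split())
    score = sum(w * len(words & store.rules[cat]) for cat, w in WEIGHTS.items())
    # Ambiguous terms only count when a stronger signal is also present.
    strong = store.rules["platform_name"] | store.rules["gambling_context"]
    if words & strong:
        score += len(words & store.rules["ambiguous_term"])
    return score


def is_gambling_sketch(text: str, store: RuleStore) -> bool:
    return score_comment(text, store) >= THRESHOLD


# Example rules; every term here is made up for the demonstration.
store = RuleStore()
store.add_rule("platform_name", "luckyslot88")
store.add_rule("gambling_term", "jackpot")
store.add_rule("gambling_context", "deposit")
store.add_rule("safe_indicator", "tutorial")
store.add_rule("ambiguous_term", "spin")

if __name__ == "__main__":
    for comment in ("join luckyslot88 for jackpot spin", "spin class tutorial"):
        print(comment, "->", is_gambling_sketch(comment, store))
```

In this sketch, sensitivity is controlled in two places: the per-category weights and the threshold. Negative weights for safe indicators let harmless uses of an ambiguous word stay below the cutoff. The sketch tokenizes on whitespace only, so a real pipeline would normalize the text and strip punctuation first.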
## License

This project is licensed under the MIT License. See the LICENSE file for details.

## Contributing

Contributions are welcome! Please open an issue or submit a pull request for any improvements or bug fixes.