---
title: Book QA Chat
emoji: 💬
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 5.0.1
app_file: app.py
pinned: false
---
An example chatbot using [Gradio](https://gradio.app), [`huggingface_hub`](https://huggingface.co/docs/huggingface_hub/v0.22.2/en/index), and the [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index).
## Local Transformers mode with an alternate tokenizer
If the target model repository does not include a tokenizer, you can instruct the app to run locally with `transformers` and use a tokenizer from another repository.

Environment variables:

- `MODEL_ID` (optional): model repo to load. Defaults to `tianzhechu/BookQA-7B-Instruct`.
- `TOKENIZER_ID` (optional): tokenizer repo to use locally (e.g., a base model's tokenizer). When set, the app switches to a local `transformers` backend and streams tokens from your machine.
- `USE_LOCAL_TRANSFORMERS` (optional): set to `1` to force local mode even without `TOKENIZER_ID`.
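The resolution of these variables can be sketched as follows. This is a minimal illustration of the rules above, not the app's actual internals; `select_backend` is a hypothetical helper name.

```python
import os


def select_backend(env=os.environ):
    """Resolve (model_id, tokenizer_id, backend) from an environment mapping.

    Hypothetical sketch of the variable handling described above;
    the real app.py may differ in detail.
    """
    model_id = env.get("MODEL_ID", "tianzhechu/BookQA-7B-Instruct")
    tokenizer_id = env.get("TOKENIZER_ID")  # e.g. a base model's tokenizer repo
    # Setting TOKENIZER_ID (or USE_LOCAL_TRANSFORMERS=1) switches to local mode.
    use_local = tokenizer_id is not None or env.get("USE_LOCAL_TRANSFORMERS") == "1"
    backend = "local-transformers" if use_local else "inference-api"
    return model_id, tokenizer_id, backend
```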
Install extra dependencies:

```bash
pip install -r requirements.txt
```

Run with an alternate tokenizer (example):

```bash
export MODEL_ID=tianzhechu/BookQA-7B-Instruct
export TOKENIZER_ID=TheBaseModel/TokenizerRepo
python app.py
```
Notes:

- Local inference will download and load the model weights via `transformers` and may require significant memory.
- If the tokenizer exposes a chat template, it is applied automatically. Otherwise, a simple fallback template is used.
- You'll need a compatible version of `torch` installed for your platform. If the default pip install fails, follow the official install instructions for your OS/GPU.
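The chat-template behavior noted above can be sketched like this. `build_prompt` and the fallback format are illustrative assumptions, not the app's exact code; `apply_chat_template` itself is the standard `transformers` tokenizer method.

```python
def build_prompt(tokenizer, messages):
    """Apply the tokenizer's chat template if present, else a plain fallback.

    Illustrative sketch; app.py's actual fallback format may differ.
    """
    if getattr(tokenizer, "chat_template", None):
        # transformers tokenizers provide apply_chat_template for this purpose
        return tokenizer.apply_chat_template(
            messages, tokenize=False, add_generation_prompt=True
        )
    # Simple fallback: one "role: content" line per message, plus a cue
    # for the model to continue as the assistant.
    lines = [f"{m['role']}: {m['content']}" for m in messages]
    lines.append("assistant:")
    return "\n".join(lines)
```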