---
title: Book QA Chat
emoji: 💬
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 5.0.1
app_file: app.py
pinned: false
---
An example chatbot using Gradio, `huggingface_hub`, and the Hugging Face Inference API.
## Local Transformers mode with alternate tokenizer
If the target model repository does not include a tokenizer, you can instruct the app to run locally with `transformers` and use a tokenizer from another repository.
Environment variables:

- `MODEL_ID` (optional): model repo to load. Defaults to `tianzhechu/BookQA-7B-Instruct`.
- `TOKENIZER_ID` (optional): tokenizer repo to use locally (e.g., a base model's tokenizer). When set, the app switches to a local `transformers` backend and streams tokens from your machine.
- `USE_LOCAL_TRANSFORMERS` (optional): set to `1` to force local mode even without `TOKENIZER_ID`.
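As an illustration, the backend selection these variables describe can be sketched as follows. `resolve_backend` and its return shape are hypothetical helpers for this README, not the actual code in `app.py`:

```python
import os

DEFAULT_MODEL = "tianzhechu/BookQA-7B-Instruct"

def resolve_backend(env=os.environ):
    """Pick model, tokenizer, and backend from the env vars above.

    Mirrors the README's description; app.py's real logic may differ.
    """
    model_id = env.get("MODEL_ID", DEFAULT_MODEL)
    tokenizer_id = env.get("TOKENIZER_ID")
    # Setting TOKENIZER_ID, or USE_LOCAL_TRANSFORMERS=1, switches to local mode.
    local = tokenizer_id is not None or env.get("USE_LOCAL_TRANSFORMERS") == "1"
    return {
        "model_id": model_id,
        "tokenizer_id": tokenizer_id or model_id,
        "backend": "local-transformers" if local else "inference-api",
    }
```

With no variables set, the sketch falls back to the default model on the Inference API; either `TOKENIZER_ID` or `USE_LOCAL_TRANSFORMERS=1` flips it to local mode.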
Install extra dependencies:

```shell
pip install -r requirements.txt
```
Run with an alternate tokenizer (example):

```shell
export MODEL_ID=tianzhechu/BookQA-7B-Instruct
export TOKENIZER_ID=TheBaseModel/TokenizerRepo
python app.py
```
Notes:

- Local inference will download and load the model weights via `transformers` and may require significant memory.
- If the tokenizer exposes a chat template, it is applied automatically. Otherwise a simple fallback template is used.
- You'll need a compatible version of `torch` installed for your platform. If the default pip install fails, follow the official install instructions for your OS/GPU.
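To show what the chat-template behavior in the notes above might look like, here is a hedged sketch. `build_prompt` and its fallback format are assumptions for illustration, though `apply_chat_template` is the real `transformers` tokenizer API:

```python
def build_prompt(messages, tokenizer=None):
    """Render chat messages into a single prompt string.

    If the tokenizer carries a chat template (transformers >= 4.34),
    apply it; otherwise fall back to a plain role-prefixed layout.
    The fallback format here is illustrative, not app.py's exact one.
    """
    if tokenizer is not None and getattr(tokenizer, "chat_template", None):
        return tokenizer.apply_chat_template(
            messages, tokenize=False, add_generation_prompt=True
        )
    # Fallback: "Role: content" lines ending with an assistant cue.
    lines = [f"{m['role'].capitalize()}: {m['content']}" for m in messages]
    lines.append("Assistant:")
    return "\n".join(lines)
```

For example, `build_prompt([{"role": "user", "content": "Hi"}])` returns `"User: Hi\nAssistant:"` when no tokenizer (or no template) is available.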