---
title: Book QA Chat
emoji: 💬
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 5.0.1
app_file: app.py
pinned: false
---
An example chatbot using Gradio, `huggingface_hub`, and the Hugging Face Inference API.
## Local Transformers mode with alternate tokenizer
If the target model repository does not include a tokenizer, you can instruct the app to run locally with `transformers` and use a tokenizer from another repository.
Environment variables:

- `MODEL_ID` (optional): model repo to load. Defaults to `tianzhechu/BookQA-7B-Instruct`.
- `TOKENIZER_ID` (optional): tokenizer repo to use locally (e.g., a base model's tokenizer). When set, the app switches to a local `transformers` backend and streams tokens from your machine.
- `USE_LOCAL_TRANSFORMERS` (optional): set to `1` to force local mode even without `TOKENIZER_ID`.
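As an illustration, the backend selection these variables describe can be sketched as follows. `resolve_backend` and its return shape are hypothetical helpers for this README, not the actual code in `app.py`:

```python
import os

DEFAULT_MODEL = "tianzhechu/BookQA-7B-Instruct"

def resolve_backend(env=os.environ):
    """Pick model, tokenizer, and backend from the env vars above.

    Mirrors the README's description; app.py's real logic may differ.
    """
    model_id = env.get("MODEL_ID", DEFAULT_MODEL)
    tokenizer_id = env.get("TOKENIZER_ID")
    # Setting TOKENIZER_ID, or USE_LOCAL_TRANSFORMERS=1, switches to local mode.
    local = tokenizer_id is not None or env.get("USE_LOCAL_TRANSFORMERS") == "1"
    return {
        "model_id": model_id,
        "tokenizer_id": tokenizer_id or model_id,
        "backend": "local-transformers" if local else "inference-api",
    }
```

With no variables set, the sketch falls back to the default model on the Inference API; either `TOKENIZER_ID` or `USE_LOCAL_TRANSFORMERS=1` flips it to local mode.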
Install extra dependencies:

```shell
pip install -r requirements.txt
```
Run with an alternate tokenizer (example):

```shell
export MODEL_ID=tianzhechu/BookQA-7B-Instruct
export TOKENIZER_ID=TheBaseModel/TokenizerRepo
python app.py
```
Notes:

- Local inference will download and load the model weights via `transformers` and may require significant memory.
- If the tokenizer exposes a chat template, it is applied automatically. Otherwise a simple fallback template is used.
- You'll need a compatible version of `torch` installed for your platform. If the default pip install fails, follow the official install instructions for your OS/GPU.
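To show what the chat-template behavior in the notes above might look like, here is a hedged sketch. `build_prompt` and its fallback format are assumptions for illustration, though `apply_chat_template` is the real `transformers` tokenizer API:

```python
def build_prompt(messages, tokenizer=None):
    """Render chat messages into a single prompt string.

    If the tokenizer carries a chat template (transformers >= 4.34),
    apply it; otherwise fall back to a plain role-prefixed layout.
    The fallback format here is illustrative, not app.py's exact one.
    """
    if tokenizer is not None and getattr(tokenizer, "chat_template", None):
        return tokenizer.apply_chat_template(
            messages, tokenize=False, add_generation_prompt=True
        )
    # Fallback: "Role: content" lines ending with an assistant cue.
    lines = [f"{m['role'].capitalize()}: {m['content']}" for m in messages]
    lines.append("Assistant:")
    return "\n".join(lines)
```

For example, `build_prompt([{"role": "user", "content": "Hi"}])` returns `"User: Hi\nAssistant:"` when no tokenizer (or no template) is available.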