Update usage instructions and adjust model size reference

  • Updated usage examples for loading the model with Transformers
  • Updated vLLM usage, added add_special_tokens=True to ensure correct chat formatting (e.g., BOS token)
  • Changed all occurrences of "8B" in code/comments to "0.5B" to reflect correct model size
OpenBMB org

LGTM

BigDong changed pull request status to merged

Sign up or log in to comment