Deploying a finetuned model on Triton using the vLLM backend

#157
by Prabhjot410 - opened

I am following this tutorial: https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/vllm_backend/docs/llama_multi_lora_tutorial.html
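For reference, my model repository follows the layout from that tutorial; roughly like this (the oss-20b name points at my finetuned checkpoint, and the model.json contents are trimmed to the engine args I set):

    model_repository/
    └── oss-20b/
        ├── 1/
        │   └── model.json     <- vLLM engine args ("model" path, gpu_memory_utilization, ...)
        └── config.pbtxt       <- backend: "vllm", decoupled model transaction policy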

and am getting this error:
E1013 12:43:04.029817 1 model_lifecycle.cc:654] "failed to load 'oss-20b' version 1: Internal: ValueError: 'aimv2' is already used by a Transformers config, pick another name.
Then I updated the versions in my Dockerfile: pip install -U transformers>=4.55.0 kernels torch==2.6.0
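The relevant part of the Dockerfile looks roughly like this (the base image tag is approximate; note the quotes around the transformers spec so that >= isn't treated as a shell redirect inside RUN):

    # Base image: one of the recent Triton containers with the vLLM backend (tag approximate)
    FROM nvcr.io/nvidia/tritonserver:24.08-vllm-python-py3

    # Upgrade transformers and pin torch; quoting keeps >= from being parsed as a redirect
    RUN pip install -U "transformers>=4.55.0" kernels torch==2.6.0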
After that I got this error:
E1013 12:51:12.611620 1 model_lifecycle.cc:654] "failed to load 'oss-20b' version 1: Internal: ModuleNotFoundError: Could not import module 'ProcessorMixin'. Are this object's requirements defined correctly?
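I can share the exact package versions inside the container if that helps; running something like this in the container should print them (assuming python3 is on the path and all three packages import cleanly):

    python3 -c "import transformers, vllm, torch; print(transformers.__version__, vllm.__version__, torch.__version__)"

Any pointers on which transformers / vllm / torch combination the Triton vLLM backend expects, or on what I'm missing in the setup, would be appreciated.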
