Though the stats and work that went into this model are impressive, there are serious problems running it locally. The discussion threads consistently report that the chat template is broken, at least under llama.cpp and vLLM.
https://huggingface.co/ServiceNow-AI/Apriel-1.6-15b-Thinker/discussions/7
https://huggingface.co/ServiceNow-AI/Apriel-1.6-15b-Thinker/discussions/11
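For readers unfamiliar with why a broken chat template matters: the template is what flattens a multi-turn conversation into the single prompt string the model sees, including the special tokens that tell the model when a turn ends. The sketch below uses generic placeholder tags, not Apriel's actual special tokens, purely to illustrate the mechanism and why a template bug (e.g., an unterminated turn) can surface as endless generation in multi-turn chats.

```python
# Minimal sketch of what a chat template does. The <|im_start|>/<|im_end|>
# tags here are illustrative placeholders, NOT Apriel's real special tokens.

def render_chat(messages, add_generation_prompt=True):
    """Flatten role-tagged messages into one prompt string."""
    parts = []
    for msg in messages:
        # Each turn must be explicitly closed; if the template omits or
        # mangles the end-of-turn token, the model may never emit a stop
        # signal, which shows up as looping output.
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

messages = [
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi there."},
    {"role": "user", "content": "Second message"},
]
print(render_chat(messages))
```

A quick way to check what a given model actually produces is to call the tokenizer's `apply_chat_template` with a multi-turn conversation and inspect the raw string for properly closed turns.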
Even on the demo page, which should be configured to work without issue, there are multiple reports of endless loops after the user's second message.
https://huggingface.co/ServiceNow-AI/Apriel-1.6-15b-Thinker/discussions/8
Given that this model represents ServiceNow's strategy to roll out LLM-powered services to enterprise customers, this release does not signal ServiceNow's ability to accomplish that. I am a huge believer in local models, and I work on an enterprise team that would be implementing tools like the ServiceNow solution; however, given the state of the Apriel 1.6 release, I would need some major convincing before trusting SN to deploy an LLM solution in a real-world scenario, especially since this feedback from the community was nearly immediate and SN has addressed none of it in a week's time.