taeminlee committed
Commit 3b67921 · unverified · 1 Parent(s): d42d427

Update documentation of OpenAI compatible server configuration (#1141)


Update README.md

Fixed the incorrect setup for extra parameters in the OpenAI compatible server configuration (see PR #1032).

Files changed (1):
  1. README.md +9 -5
README.md CHANGED

@@ -273,10 +273,12 @@ If `endpoints` are left unspecified, ChatUI will look for the model on the hosted
 
 ##### OpenAI API compatible models
 
-Chat UI can be used with any API server that supports OpenAI API compatibility, for example [text-generation-webui](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/openai), [LocalAI](https://github.com/go-skynet/LocalAI), [FastChat](https://github.com/lm-sys/FastChat/blob/main/docs/openai_api.md), [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), and [ialacol](https://github.com/chenhunghan/ialacol).
+Chat UI can be used with any API server that supports OpenAI API compatibility, for example [text-generation-webui](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/openai), [LocalAI](https://github.com/go-skynet/LocalAI), [FastChat](https://github.com/lm-sys/FastChat/blob/main/docs/openai_api.md), [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), [ialacol](https://github.com/chenhunghan/ialacol), and [vllm](https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html).
 
 The following example config makes Chat UI work with [text-generation-webui](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/openai); `endpoint.baseUrl` is the URL of the OpenAI API compatible server and overrides the base URL used by the OpenAI instance. `endpoint.completion` determines which endpoint is used; the default is `chat_completions`, which uses `v1/chat/completions`. Change `endpoint.completion` to `completions` to use the `v1/completions` endpoint.
 
+Parameters not supported by OpenAI (e.g., `top_k`, `repetition_penalty`) must be set in the `extraBody` of `endpoints`. Be aware that setting them in `parameters` will cause them to be omitted.
+
 ```
 MODELS=`[
   {
@@ -285,15 +287,17 @@ MODELS=`[
     "parameters": {
      "temperature": 0.9,
      "top_p": 0.95,
-     "repetition_penalty": 1.2,
-     "top_k": 50,
-     "truncate": 1000,
      "max_new_tokens": 1024,
      "stop": []
     },
     "endpoints": [{
      "type" : "openai",
-     "baseURL": "http://localhost:8000/v1"
+     "baseURL": "http://localhost:8000/v1",
+     "extraBody": {
+       "repetition_penalty": 1.2,
+       "top_k": 50,
+       "truncate": 1000
+     }
     }]
   }
 ]`
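For reference, the README's example config as it stands after this commit, assembled from the hunks above (model fields outside the diff context, e.g. the model `name`, are left elided as `...`):

```
MODELS=`[
  {
    ...
    "parameters": {
      "temperature": 0.9,
      "top_p": 0.95,
      "max_new_tokens": 1024,
      "stop": []
    },
    "endpoints": [{
      "type": "openai",
      "baseURL": "http://localhost:8000/v1",
      "extraBody": {
        "repetition_penalty": 1.2,
        "top_k": 50,
        "truncate": 1000
      }
    }]
  }
]`
```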
 
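The updated paragraph also notes that `endpoint.completion` can be changed from the default `chat_completions` to `completions` to target `v1/completions` instead of `v1/chat/completions`. A minimal sketch of that variant, assuming the same local server as the example above (the field name and value come from the README prose and are not verified against the code):

```
"endpoints": [{
  "type": "openai",
  "baseURL": "http://localhost:8000/v1",
  "completion": "completions"
}]
```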