# Chat API Documentation

## Overview

This API provides chat functionality compatible with the OpenAI V1 Chat Completions API, supporting streaming responses for interactive conversations with AI assistants.

## API Details

- **URL:** `/api/kernel2/chat`
- **Method:** `POST`
- **Description:** Chat interface with streaming responses (compatible with the OpenAI V1 API)
- **Access:** available through the local endpoint at `localhost:8002`
## Request Parameters

The request body is compatible with the OpenAI Chat Completions API, using JSON format:

```json
{
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello, who are you?"},
    {"role": "assistant", "content": "I am a helpful assistant."},
    {"role": "user", "content": "What can you do for me?"}
  ],
  "metadata": {
    "enable_l0_retrieval": true,
    "role_id": "uuid-string"
  },
  "stream": true,
  "model": "gpt-3.5-turbo",
  "temperature": 0.1,
  "max_tokens": 2000
}
```
### Parameter Description

| Parameter | Type | Required | Description |
|---|---|---|---|
| `messages` | Array | Yes | Standard OpenAI message list containing the conversation history |
| `metadata` | Object | No | Additional parameters for request processing |
| `metadata.enable_l0_retrieval` | Boolean | No | Whether to enable basic knowledge retrieval |
| `metadata.role_id` | String | No | UUID of a system-customized role |
| `stream` | Boolean | No | Whether to return a streaming response (default: `true`) |
| `model` | String | No | Model identifier (default: the configured model) |
| `temperature` | Float | No | Controls randomness (default: 0.1) |
| `max_tokens` | Integer | No | Maximum number of tokens to generate (default: 2000) |
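As a quick illustration, the request body described above can be assembled programmatically before sending. The helper below is a minimal sketch based on the parameter table; the `build_chat_request` function name is illustrative, not part of the API, and only `messages` is treated as required.

```python
import json

def build_chat_request(messages, stream=True, temperature=0.1,
                       max_tokens=2000, metadata=None):
    """Assemble a JSON request body for /api/kernel2/chat.

    `messages` is required; the remaining fields mirror the optional
    parameters and their documented defaults.
    """
    if not messages:
        raise ValueError("messages is required and must be non-empty")
    body = {
        "messages": messages,
        "stream": stream,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
    if metadata is not None:
        body["metadata"] = metadata
    return json.dumps(body)

payload = build_chat_request(
    [{"role": "user", "content": "Hello"}],
    metadata={"enable_l0_retrieval": True},
)
```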
## Response Format

The response format is compatible with the OpenAI Chat Completions API, using Server-Sent Events (SSE) for streaming responses. Each event carries a JSON chunk:

```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion.chunk",
  "created": 1677652288,
  "model": "gpt-3.5-turbo",
  "system_fingerprint": "fp_44709d6fcb",
  "choices": [
    {
      "index": 0,
      "delta": {"content": "Hello"},
      "finish_reason": null
    }
  ]
}
```
### Format of Each Chunk in the Streaming Response

| Field | Type | Description |
|---|---|---|
| `id` | String | Unique identifier for the response |
| `object` | String | Fixed as `"chat.completion.chunk"` |
| `created` | Integer | Unix timestamp (seconds) |
| `model` | String | Model identifier |
| `system_fingerprint` | String | System fingerprint |
| `choices` | Array | List of generated results |
| `choices[0].index` | Integer | Result index, usually 0 |
| `choices[0].delta` | Object | Incremental content of the current chunk |
| `choices[0].delta.content` | String | Incremental text content |
| `choices[0].finish_reason` | String | Reason for completion: `null` while streaming, `"stop"` when finished |
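Each SSE line arrives as `data: ` followed by one JSON chunk, and the stream ends with the `data: [DONE]` sentinel. A minimal parser for a single line might look like this (a sketch; the `parse_sse_chunk` helper is illustrative, not part of the API):

```python
import json

def parse_sse_chunk(line: str):
    """Extract the incremental text from one 'data: ...' SSE line.

    Returns None for non-data lines and for the final 'data: [DONE]'
    sentinel that terminates the stream.
    """
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):]
    if payload == "[DONE]":
        return None
    chunk = json.loads(payload)
    # delta may omit 'content' (e.g. a chunk that only carries the role)
    return chunk["choices"][0]["delta"].get("content")

sample = ('data: {"id": "chatcmpl-123", "object": "chat.completion.chunk", '
          '"choices": [{"index": 0, "delta": {"content": "Hello"}, '
          '"finish_reason": null}]}')
```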
## Usage Examples

### cURL Request Example

```bash
curl -X POST \
  'http://localhost:8002/api/kernel2/chat' \
  -H 'Content-Type: application/json' \
  -H 'Accept: text/event-stream' \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Tell me about artificial intelligence."}
    ],
    "stream": true
  }'
```
### Python Request Example

```python
import json
import http.client

host = "localhost:8002"
path = "/api/kernel2/chat"
headers = {
    "Content-Type": "application/json",
    "Accept": "text/event-stream",
}
data = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me about artificial intelligence."},
    ],
    "stream": True,
}

conn = http.client.HTTPConnection(host)
conn.request("POST", path, body=json.dumps(data), headers=headers)
response = conn.getresponse()

# Read the SSE stream line by line and print incremental content
for line in response:
    if not line:
        continue
    decoded_line = line.decode("utf-8").strip()
    if decoded_line == "data: [DONE]":
        break  # end-of-stream sentinel
    if decoded_line.startswith("data: "):
        try:
            chunk = json.loads(decoded_line[len("data: "):])
            content = chunk["choices"][0]["delta"].get("content", "")
            if content:
                print(content, end="", flush=True)
        except json.JSONDecodeError:
            pass  # skip malformed or partial lines

conn.close()
```
## Error Handling

When an error occurs, the API returns a standard HTTP error status code along with error details in JSON format:

```json
{
  "success": false,
  "message": "Error message",
  "code": 400
}
```

| Error Code | Description |
|---|---|
| 400 | Bad Request |
| 401 | Unauthorized |
| 404 | Not Found |
| 500 | Internal Server Error |
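A caller can map the error body above to a readable message. The snippet below is a minimal sketch using the documented error format and code table; the `describe_error` name is illustrative, not part of the API.

```python
import json

# Error codes taken from the table above
ERROR_DESCRIPTIONS = {
    400: "Bad Request",
    401: "Unauthorized",
    404: "Not Found",
    500: "Internal Server Error",
}

def describe_error(body: str) -> str:
    """Turn an error-response JSON body into a readable message."""
    err = json.loads(body)
    code = err.get("code")
    label = ERROR_DESCRIPTIONS.get(code, "Unknown Error")
    return f"{code} {label}: {err.get('message', '')}"

msg = describe_error('{"success": false, "message": "Error message", "code": 400}')
```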