# Custom Model Endpoint Guide with Ollama

## 1. Prerequisites: Ollama Setup

First, download and install Ollama from the official website:

**Download Link**: [https://ollama.com/download](https://ollama.com/download)

**Additional Resources**:
- Official Website: [https://ollama.com](https://ollama.com/)
- Model Library: [https://ollama.com/library](https://ollama.com/library)
- GitHub Repository: [https://github.com/ollama/ollama](https://github.com/ollama/ollama)
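After installing, you can confirm the CLI is available from a terminal (the exact version string will differ on your machine):

```bash
# Verify that Ollama is installed and on your PATH
ollama --version
```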
---
## 2. Basic Ollama Commands

| Command | Description |
|------|------|
| `ollama pull model_name` | Download a model |
| `ollama serve` | Start the Ollama service |
| `ollama ps` | List running models |
| `ollama list` | List all downloaded models |
| `ollama rm model_name` | Remove a model |
| `ollama show model_name` | Show model details |
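For example, a typical first-time workflow using the models referenced later in this guide (substitute your own model names as needed):

```bash
# Start the Ollama service (leave this running in its own terminal)
ollama serve

# In another terminal: download a chat model and an embedding model
ollama pull qwen2.5:0.5b
ollama pull snowflake-arctic-embed:110m

# Confirm both models were downloaded
ollama list
```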
## 3. Using the Ollama API for a Custom Model

### OpenAI-Compatible API

#### Chat Request

```bash
curl http://127.0.0.1:11434/v1/chat/completions -H "Content-Type: application/json" -d '{
    "model": "qwen2.5:0.5b",
    "messages": [
        {"role": "user", "content": "Why is the sky blue?"}
    ]
}'
```
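The endpoint also accepts the standard OpenAI `stream` parameter, which returns tokens incrementally as server-sent events rather than one final JSON body. A sketch:

```bash
# Same request as above, but streaming the response token by token
curl http://127.0.0.1:11434/v1/chat/completions -H "Content-Type: application/json" -d '{
    "model": "qwen2.5:0.5b",
    "messages": [
        {"role": "user", "content": "Why is the sky blue?"}
    ],
    "stream": true
}'
```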
#### Embedding Request

```bash
curl http://127.0.0.1:11434/v1/embeddings -H "Content-Type: application/json" -d '{
    "model": "snowflake-arctic-embed:110m",
    "input": "Why is the sky blue?"
}'
```
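Following the OpenAI embeddings format, `input` can also be an array of strings, which embeds several texts in a single request (a sketch; see the compatibility docs linked below for the exact supported fields):

```bash
# Embed multiple texts in one call; the response contains one vector per input
curl http://127.0.0.1:11434/v1/embeddings -H "Content-Type: application/json" -d '{
    "model": "snowflake-arctic-embed:110m",
    "input": ["Why is the sky blue?", "Why is the grass green?"]
}'
```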
More details: [https://github.com/ollama/ollama/blob/main/docs/openai.md](https://github.com/ollama/ollama/blob/main/docs/openai.md)
## 4. Configuring Custom Embedding in Second Me

1. Start the Ollama service: `ollama serve`
2. Check your Ollama embedding model's context length:
```bash
# Example: ollama show snowflake-arctic-embed:110m
$ ollama show snowflake-arctic-embed:110m
  Model
    architecture        bert
    parameters          108.89M
    context length      512
    embedding length    768
    quantization        F16

  License
    Apache License
    Version 2.0, January 2004
```
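If you just want the number, a small shell pipeline works (a sketch; the field name matches the `ollama show` output above):

```bash
# Extract the context length value from the model details
ollama show snowflake-arctic-embed:110m | grep "context length" | awk '{print $3}'
```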
3. Modify `EMBEDDING_MAX_TEXT_LENGTH` in `Second_Me/.env` to match your embedding model's context window. This prevents chunk-length overflow and the resulting server-side errors (500 Internal Server Error).
```bash
# Embedding configurations
# Use your model's context length, e.g. 512 for snowflake-arctic-embed:110m
EMBEDDING_MAX_TEXT_LENGTH=embedding_model_context_length
```
4. Configure the custom Chat and Embedding models in Settings:
```
Chat:
    Model Name: qwen2.5:0.5b
    API Key: ollama
    API Endpoint: http://127.0.0.1:11434/v1

Embedding:
    Model Name: snowflake-arctic-embed:110m
    API Key: ollama
    API Endpoint: http://127.0.0.1:11434/v1
```
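Before saving, you can sanity-check that the endpoint is reachable and that both model names are spelled exactly as Ollama reports them (the `/v1/models` route is part of Ollama's OpenAI-compatible API; see the docs linked above):

```bash
# Lists models known to Ollama in OpenAI format; the configured names must appear here
curl http://127.0.0.1:11434/v1/models
```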
**When running Second Me in Docker environments**, replace `127.0.0.1` in the API Endpoint with `host.docker.internal`:
```
Chat:
    Model Name: qwen2.5:0.5b
    API Key: ollama
    API Endpoint: http://host.docker.internal:11434/v1

Embedding:
    Model Name: snowflake-arctic-embed:110m
    API Key: ollama
    API Endpoint: http://host.docker.internal:11434/v1
```
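Note that `host.docker.internal` resolves automatically on Docker Desktop (macOS/Windows); on Linux you typically need to start the container with `--add-host=host.docker.internal:host-gateway`. You can then verify connectivity from inside the container, assuming `curl` is available in the image (the container name `second-me` below is a placeholder for your actual container):

```bash
# Run from the host: checks that the container can reach Ollama on the host machine
docker exec second-me curl -s http://host.docker.internal:11434/v1/models
```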