metadata
title: RAG Generation Service
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: mit
Generation Module
This is an LLM-based generation service designed to be deployed as a modular component of a broader RAG system. The service runs in a Docker container and exposes a Gradio UI on port 7860, as well as an MCP endpoint.
Configuration
- The module requires an API key for an inference provider, set as an environment variable. Multiple inference providers are supported; set the variable that matches the provider you use:
  - OpenAI: OPENAI_API_KEY
  - Anthropic: ANTHROPIC_API_KEY
  - Cohere: COHERE_API_KEY
  - HuggingFace: HF_TOKEN
- Inference provider and model settings are configured in params.cfg
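
The configuration steps above can be checked before starting the container. The following is a minimal sketch, not the service's own startup code: it maps each provider to the environment variable listed above and fails early if the key is missing. The function names, the `[generator]` section, and the `provider` option are assumptions made for illustration; check params.cfg for the names the module actually uses.

```python
import configparser
import os

# Provider -> environment variable, as listed in the Configuration section.
PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "cohere": "COHERE_API_KEY",
    "huggingface": "HF_TOKEN",
}


def required_env_var(provider: str) -> str:
    """Return the environment variable name expected for a provider."""
    key = provider.strip().lower()
    if key not in PROVIDER_ENV_VARS:
        supported = ", ".join(sorted(PROVIDER_ENV_VARS))
        raise ValueError(f"Unknown provider {provider!r}; expected one of: {supported}")
    return PROVIDER_ENV_VARS[key]


def check_api_key(provider: str, env=None) -> str:
    """Raise RuntimeError if the provider's API key is unset or empty."""
    env = os.environ if env is None else env
    var = required_env_var(provider)
    if not env.get(var):
        raise RuntimeError(f"{var} must be set to use the {provider} provider")
    return var


def provider_from_cfg(text: str, section: str = "generator", option: str = "provider") -> str:
    """Read the provider name from params.cfg-style text.

    The section and option names are hypothetical defaults, not taken
    from the module's actual params.cfg.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser.get(section, option)
```

For example, `check_api_key(provider_from_cfg(open("params.cfg").read()))` would raise a `RuntimeError` naming the missing variable, which is easier to diagnose than an authentication failure on the first generation request.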