---
title: Karl Movie Vector Backend
emoji: 🎬
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: mit
---

# Karl Movie Vector Backend

FastAPI backend for semantic movie recommendations, built on FAISS vector search and OpenAI embeddings. It uses geometric subspace algorithms to turn multi-movie preferences into recommendations.

## Features

- Semantic movie search using OpenAI embeddings
- FAISS-powered vector similarity search
- Geometric subspace algorithms for multi-movie preferences
- ~150ms response time on CPU
- RESTful API with Bearer token authentication
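
As a rough illustration of what vector similarity search means here, the sketch below ranks movies by cosine similarity in plain Python. It is a brute-force stand-in for FAISS, not the backend's actual code: the function names, the toy 3-dimensional vectors, and the movie ID `13` are assumptions made for the example (real embeddings have far more dimensions).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k_similar(query, vectors, k):
    """Return the k (movie_id, score) pairs most similar to `query`.

    `vectors` maps a movie ID to its embedding. This linear scan is a toy
    stand-in for a FAISS index, which answers the same question much faster.
    """
    scored = [(movie_id, cosine_similarity(query, vec))
              for movie_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings (hypothetical values, chosen only for illustration).
toy_vectors = {
    550: [1.0, 0.0, 0.0],
    680: [0.9, 0.1, 0.0],
    13: [0.0, 1.0, 0.0],
}
results = top_k_similar([1.0, 0.0, 0.0], toy_vectors, k=2)
print(results)
```

A linear scan like this costs time proportional to the number of movies per query, which is why a dedicated index such as FAISS is used once the catalog grows.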

## API Usage

```bash
curl -X POST "https://yonnel-karl-movie-vector-backend.hf.space/explore" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "liked_ids": [550, 680],
    "disliked_ids": [],
    "top_k": 100
  }'
```
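
The same request can be built from Python. This is a minimal sketch using only the standard library; the helper name `build_explore_request` is an assumption, and the response format is not documented here, so the example only prints whatever JSON comes back.

```python
import json
import urllib.request

BASE_URL = "https://yonnel-karl-movie-vector-backend.hf.space"


def build_explore_request(token, liked_ids, disliked_ids=(), top_k=100):
    """Build (but do not send) a POST /explore request.

    Mirrors the curl example: a JSON body plus a Bearer token header.
    """
    payload = {
        "liked_ids": list(liked_ids),
        "disliked_ids": list(disliked_ids),
        "top_k": top_k,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/explore",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_explore_request("YOUR_TOKEN", liked_ids=[550, 680])

# Sending is left commented out so the snippet runs without network access
# or a valid token:
# with urllib.request.urlopen(request, timeout=30) as response:
#     print(json.load(response))
```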

## Hugging Face Deployment

This FastAPI application provides movie recommendations using vector similarity search.

## 🚀 Automatic Setup

This application will automatically build its movie index on first startup. The process includes:

1. **Data Collection**: Fetches movie data from TMDB API
2. **Embedding Generation**: Creates vector embeddings using OpenAI API
3. **Index Building**: Builds FAISS index for fast similarity search
4. **API Startup**: Launches the FastAPI service

⏱️ **First startup may take 3-5 minutes** to build the index.
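
The build-on-first-startup behavior described above could be sketched as follows. This is an assumption about the control flow, not the application's real code: `ensure_index` and `fake_build_index` are hypothetical names, and the placeholder step only creates an empty file where the real steps would call TMDB, OpenAI, and FAISS.

```python
import tempfile
from pathlib import Path


def ensure_index(data_dir, build_steps):
    """Run every build step if the FAISS index file is missing.

    Returns True when a build was performed, False when the index
    already existed and the steps were skipped.
    """
    data_path = Path(data_dir)
    if (data_path / "faiss.index").exists():
        return False
    data_path.mkdir(parents=True, exist_ok=True)
    for step in build_steps:
        step(data_path)
    return True


def fake_build_index(data_path):
    """Placeholder for steps 1-3 (data collection, embeddings, index)."""
    (data_path / "faiss.index").write_bytes(b"")


with tempfile.TemporaryDirectory() as tmp:
    first_run = ensure_index(tmp, [fake_build_index])   # builds the index
    second_run = ensure_index(tmp, [fake_build_index])  # index found, skipped
    print(first_run, second_run)
```

Checking for the index file before building is what would make only the first startup slow; later startups reuse the files already on disk.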

## 🔧 Required Environment Variables

Configure these in your Hugging Face Space settings:

### Essential APIs
- `OPENAI_API_KEY`: Your OpenAI API key for generating embeddings
- `TMDB_API_KEY`: Your TMDB API key for fetching movie data

### Optional Configuration
- `API_TOKEN`: Token used for Bearer authentication of API requests
- `LOG_LEVEL`: Logging level (default: INFO)
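
One way an application might read these variables at startup is sketched below. The function name `load_settings` and the returned dictionary keys are assumptions for illustration; only the variable names and the `INFO` default come from the list above.

```python
import os

REQUIRED_VARS = ("OPENAI_API_KEY", "TMDB_API_KEY")


def load_settings(environ=None):
    """Read configuration from the environment, failing fast on missing keys."""
    env = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {
        "openai_api_key": env["OPENAI_API_KEY"],
        "tmdb_api_key": env["TMDB_API_KEY"],
        "api_token": env.get("API_TOKEN"),          # optional, may be None
        "log_level": env.get("LOG_LEVEL", "INFO"),  # documented default
    }


# Example with placeholder values instead of real keys.
settings = load_settings(
    {"OPENAI_API_KEY": "sk-example", "TMDB_API_KEY": "tmdb-example"}
)
print(settings["log_level"])
```

Failing at startup when a required key is absent gives a clearer error than a failed OpenAI or TMDB call partway through the index build.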

## 📡 API Endpoints

- `GET /health` - Health check
- `POST /explore` - Get recommendations from liked and disliked movie IDs
- `POST /search` - Search for similar movies
- `GET /movie/{movie_id}` - Get movie details

## 🏗️ Technical Details

- **Framework**: FastAPI
- **Vector Search**: FAISS
- **Embeddings**: OpenAI text-embedding-3-small
- **Movie Data**: TMDB (The Movie Database)
- **Container**: Docker

## 🔄 Rebuilding Index

To rebuild the movie index (e.g., to get newer movies):
1. Delete the Space's persistent storage
2. Restart the Space
3. The index will rebuild automatically on startup

## 📦 Data Files Generated

The application creates these files on startup:
- `app/data/faiss.index` - FAISS vector search index
- `app/data/movies.npy` - Movie embeddings matrix
- `app/data/id_map.json` - Mapping from TMDB movie IDs to positions in the embeddings matrix
- `app/data/movie_metadata.json` - Movie metadata

These files are automatically generated and don't need to be included in the repository.
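
To check whether a rebuild is needed, a small helper can report which of the generated files are absent. This is a hedged sketch: `missing_data_files` is a hypothetical helper, and only the four file names and the `app/data` directory come from the list above.

```python
import tempfile
from pathlib import Path

DATA_FILES = (
    "faiss.index",
    "movies.npy",
    "id_map.json",
    "movie_metadata.json",
)


def missing_data_files(data_dir="app/data"):
    """Return the names of generated data files not present in `data_dir`."""
    base = Path(data_dir)
    return [name for name in DATA_FILES if not (base / name).is_file()]


# Demonstration in a temporary directory holding only one of the four files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "id_map.json").write_text("{}")
    missing = missing_data_files(tmp)
    print(missing)
```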

## Environment Variables

In addition to the variables listed under Required Environment Variables above, set this in your Space settings:
- `ENV`: Set to "prod" for production