About Real-World Application

#2
by Ideal319 - opened

Hi! This work indeed demonstrates promising performance. However, I wonder whether the inference speed can truly meet practical requirements. Although the embedding dimension is reduced, the average embedding length is approximately 1000 times that of a bi-encoder architecture with pooling. As a result, the computational burden of the late interaction stage may be difficult to afford in practice—even without considering the storage overhead.
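To make the cost concern concrete, here is a minimal sketch (toy dimensions, random vectors, plain Python; the actual model's dimensions and similarity function may differ) contrasting ColBERT-style late interaction, which scores one query-token vector against every document vector, with a single pooled-vector comparison:

```python
# Sketch: late interaction (MaxSim) vs. pooled bi-encoder scoring.
# Illustrative only -- toy sizes and random "embeddings".
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def maxsim_score(query_vecs, doc_vecs):
    # Each query token takes its best match over all doc vectors, then sum:
    # O(len(query_vecs) * len(doc_vecs)) dot products per document scored.
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

def mean_pool(vecs):
    return [sum(col) / len(vecs) for col in zip(*vecs)]

random.seed(0)
dim = 8
query = [[random.random() for _ in range(dim)] for _ in range(16)]  # 16 query tokens
doc = [[random.random() for _ in range(dim)] for _ in range(1024)]  # ~1000 doc vectors

# Late interaction: 16 * 1024 = 16384 dot products for this one document.
score = maxsim_score(query, doc)

# Bi-encoder with pooling: one vector per side -> a single dot product.
pooled_score = dot(mean_pool(query), mean_pool(doc))
```

The ~1000x gap in stored vectors per document translates directly into that per-document dot-product count at query time, which is the overhead being discussed.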

Hi! Indeed, we discussed such overhead and trade-offs in our technical report section 5: https://arxiv.org/abs/2507.05513


Does pooled retrieval not work for this model? It does for colpali, colqwen and colnomic models.

NVIDIA org


Hello, we fine-tuned this model with ColBERT-like late interaction rather than a pooling method, so using late interaction achieves the best performance.


I understand. We can still apply pooling at the vector-database level to make retrieval more efficient.
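One hypothetical way to do this database-side pooling (the thread does not specify a method; this sketch uses simple fixed-window mean pooling with a pool factor `k`, and the helper name `pool_doc_vectors` is invented for illustration) is to compress each document's token embeddings before indexing, trading some late-interaction fidelity for storage:

```python
# Hypothetical sketch: reduce stored vectors per document by mean-pooling
# every k consecutive token embeddings before inserting into the vector DB.
def pool_doc_vectors(doc_vecs, k=4):
    """Mean-pool each window of k consecutive token vectors into one vector."""
    pooled = []
    for i in range(0, len(doc_vecs), k):
        window = doc_vecs[i:i + k]
        pooled.append([sum(col) / len(window) for col in zip(*window)])
    return pooled

doc = [[float(i + j) for j in range(4)] for i in range(10)]  # 10 toy vectors, dim 4
compressed = pool_doc_vectors(doc, k=4)
assert len(compressed) == 3  # 3 stored vectors instead of 10
```

With pool factor `k`, storage and late-interaction compute both shrink roughly by a factor of `k`; how much retrieval quality this costs for this particular model would need to be measured.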
