PwC-Embedding_expr

We trained PwC-Embedding_expr on top of the multilingual-e5-large-instruct embedding model.
To improve performance on Korean, we applied our curated data augmentation to Korean STS datasets and fine-tuned the E5 model using a carefully balanced sampling ratio across the datasets.
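
The exact training recipe and data are not public; as a rough illustration of the setup described above, here is a minimal fine-tuning sketch using sentence-transformers (>= 3.0) and the Hugging Face datasets library. The inline STS pairs and the 0.7/0.3 interleaving weights are placeholders, not our actual data or ratios.

```python
from datasets import Dataset, interleave_datasets
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
    losses,
)

# Base checkpoint named in this card.
model = SentenceTransformer("intfloat/multilingual-e5-large-instruct")

# Stand-ins for the curated/augmented Korean STS datasets (not public);
# each needs two text columns and a similarity score in [0, 1].
sts_a = Dataset.from_dict({
    "sentence1": ["오늘 날씨가 좋다.", "그 영화는 재미있었다."],
    "sentence2": ["오늘은 날씨가 맑다.", "그 책은 지루했다."],
    "score": [0.9, 0.1],
})
sts_b = Dataset.from_dict({
    "sentence1": ["서울은 한국의 수도이다."],
    "sentence2": ["대한민국의 수도는 서울이다."],
    "score": [1.0],
})

# The "carefully balanced ratio" is approximated here with
# probability-weighted interleaving; the real weights are not published.
train_dataset = interleave_datasets([sts_a, sts_b], probabilities=[0.7, 0.3], seed=42)

args = SentenceTransformerTrainingArguments(
    output_dir="pwc-embedding-expr-ft",
    num_train_epochs=1,
    per_device_train_batch_size=2,
    learning_rate=2e-5,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=losses.CosineSimilarityLoss(model),  # regresses cosine sim to the score
)
trainer.train()
```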

To-do

  • MTEB Leaderboard
  • Technical Report

MTEB

PwC-Embedding_expr was evaluated on the Korean subset of MTEB.
A leaderboard link will be added once it is published.

Task                PwC-Embedding_expr    multilingual-e5-large    Max result
KLUE-STS            0.88                  0.83                     0.90
KLUE-TC             0.73                  0.61                     0.73
Ko-StrategyQA       0.80                  0.80                     0.83
KorSTS              0.84                  0.81                     0.98
MIRACL-Reranking    0.72                  0.65                     0.72
MIRACL-Retrieval    0.65                  0.59                     0.72
Average             0.77                  0.71                     0.81
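
For reference, an evaluation over these tasks can be run with the mteb library roughly as follows. The repo id is a placeholder, and the task names follow the table above; the exact identifiers should be confirmed against the MTEB registry (e.g. the MIRACL tasks are registered without a hyphen).

```python
import mteb
from sentence_transformers import SentenceTransformer

# Placeholder repo id; replace with the published checkpoint.
model = SentenceTransformer("PwC-Embedding_expr")

# Restrict multilingual tasks (e.g. MIRACL) to their Korean subsets.
tasks = mteb.get_tasks(
    tasks=["KLUE-STS", "KLUE-TC", "Ko-StrategyQA", "KorSTS",
           "MIRACLReranking", "MIRACLRetrieval"],
    languages=["kor"],
)

evaluation = mteb.MTEB(tasks=tasks)
results = evaluation.run(model, output_folder="results/PwC-Embedding_expr")
```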

Model

PwC-Embedding_expr has 560M parameters, released as FP32 weights in Safetensors format.

Requirements

The model runs with the dependencies bundled in the latest release of the MTEB library.
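
A minimal usage sketch follows. The repo id is hypothetical, and the query prompt format is assumed to match the base multilingual-e5-large-instruct model (an "Instruct: ...\nQuery: ..." prefix for queries, no prefix for documents).

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("PwC-Embedding_expr")  # hypothetical repo id

# E5-instruct-style query prompt, assumed to carry over from the base model.
task = "Given a web search query, retrieve relevant passages that answer the query"
queries = [f"Instruct: {task}\nQuery: 한국의 수도는 어디인가요?"]
documents = ["서울은 대한민국의 수도이다.", "부산은 대한민국의 항구 도시이다."]

q_emb = model.encode(queries, normalize_embeddings=True)
d_emb = model.encode(documents, normalize_embeddings=True)

# With normalized embeddings, cosine similarity reduces to a dot product.
print(q_emb @ d_emb.T)
```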

Citation

TBD (technical report expected September 2025)
