---
language:
  - ko
license: apache-2.0
tags:
  - sentence-transformers
  - sentence-similarity
  - transformers
---

# PwC-Embedding-expr

We trained the PwC-Embedding-expr model on top of the multilingual-e5-large-instruct embedding model.
To enhance performance on Korean, we applied our curated data augmentation to STS datasets and fine-tuned the E5 model using a carefully balanced ratio across datasets.

## To-do

- MTEB Leaderboard
- Technical Report

## MTEB

PwC-Embedding_expr was evaluated on the Korean subset of MTEB.
A leaderboard link will be added once it is published.

| Task | PwC-Embedding_expr | multilingual-e5-large | Max Result |
|------|--------------------|-----------------------|------------|
| KLUE-STS | 0.88 | 0.83 | 0.90 |
| KLUE-TC | 0.73 | 0.61 | 0.73 |
| Ko-StrategyQA | 0.80 | 0.80 | 0.83 |
| KorSTS | 0.84 | 0.81 | 0.98 |
| MIRACL-Reranking | 0.72 | 0.65 | 0.72 |
| MIRACL-Retrieval | 0.65 | 0.59 | 0.72 |
| **Average** | **0.77** | **0.71** | **0.81** |
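Scores like these can be reproduced with the `mteb` library. The sketch below is illustrative rather than the exact evaluation script used here: the Hugging Face repo id is a placeholder, and task names should be verified against the names registered in `mteb`.

```python
# Illustrative sketch: evaluating an embedding model on Korean MTEB tasks.
# The repo id is a placeholder; check task names against mteb's registry.
from mteb import MTEB
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("PwC-Embedding_expr")  # placeholder repo id

# A subset of the Korean tasks reported in the table above.
evaluation = MTEB(tasks=["KLUE-STS", "KorSTS", "Ko-StrategyQA"])
evaluation.run(model, output_folder="results/PwC-Embedding_expr")
```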

## Model

### Requirements

The model works with the dependencies included in the latest version of MTEB.
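For use outside MTEB, the model can be loaded directly with `sentence-transformers`. The following is a minimal sketch, assuming the placeholder repo id `PwC-Embedding_expr` and the instruction-prefixed query convention of the multilingual-e5-large-instruct base model.

```python
# Minimal usage sketch with sentence-transformers; the repo id is a placeholder.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("PwC-Embedding_expr")  # placeholder repo id

# E5-instruct-style models expect an instruction-prefixed query
# (a convention inherited from multilingual-e5-large-instruct).
task = "Given a web search query, retrieve relevant passages that answer the query"
queries = [f"Instruct: {task}\nQuery: 한국의 수도는 어디인가요?"]
passages = [
    "서울은 대한민국의 수도이다.",
    "부산은 대한민국의 제2의 도시이다.",
]

query_emb = model.encode(queries, normalize_embeddings=True)
passage_emb = model.encode(passages, normalize_embeddings=True)

# Dot product on normalized embeddings equals cosine similarity.
scores = query_emb @ passage_emb.T
print(scores)
```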

## Citation

TBD (technical report expected September 2025)