Judges

TRL Judges is an experimental API and is subject to change at any time.

TRL provides judges to easily compare two completions.

Make sure the required dependencies are installed by running:

pip install trl[judges]

Using the provided judges

TRL provides several judges out of the box. For example, you can use the HfPairwiseJudge to compare two completions using a pre-trained model from the Hugging Face model hub:

from trl import HfPairwiseJudge

judge = HfPairwiseJudge()
judge.judge(
    prompts=["What is the capital of France?", "What is the biggest planet in the solar system?"],
    completions=[["Paris", "Lyon"], ["Saturn", "Jupiter"]],
)  # Outputs: [0, 1]
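
Each value in the returned list is the index of the preferred completion for the corresponding prompt. As a minimal sketch of how that output could be consumed, the helper below turns it into chosen/rejected records. The function name `to_preference_pairs` and the record keys are hypothetical and not part of TRL. The sketch also assumes that any value other than 0 or 1 marks an unusable judgment and should be skipped.

```python
def to_preference_pairs(prompts, completions, winners):
    """Map pairwise judge output to chosen/rejected records (hypothetical helper, not a TRL API)."""
    pairs = []
    for prompt, pair, winner in zip(prompts, completions, winners):
        # Assumption: anything other than 0 or 1 is not a valid preference, so skip it.
        if winner not in (0, 1):
            continue
        pairs.append(
            {
                "prompt": prompt,
                "chosen": pair[winner],
                "rejected": pair[1 - winner],
            }
        )
    return pairs


prompts = ["What is the capital of France?", "What is the biggest planet in the solar system?"]
completions = [["Paris", "Lyon"], ["Saturn", "Jupiter"]]
winners = [0, 1]  # the judge output shown in the example above

pairs = to_preference_pairs(prompts, completions, winners)
print(pairs[0]["chosen"], pairs[1]["chosen"])
```

With the output from the example above, the chosen completions are "Paris" and "Jupiter".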

Define your own judge

To define your own judge, we provide several base classes that you can subclass. For rank-based judges, subclass [BaseRankJudge] and implement the [BaseRankJudge.judge] method. For pairwise judges, subclass [BasePairwiseJudge] and implement the [BasePairwiseJudge.judge] method. If your judge doesn't fit into either category, subclass [BaseJudge] and implement the [BaseJudge.judge] method.

As an example, let's define a pairwise judge that prefers shorter completions:

from trl import BasePairwiseJudge

class PrefersShorterJudge(BasePairwiseJudge):
    def judge(self, prompts, completions, shuffle_order=False):
        # Return the index (0 or 1) of the shorter completion in each pair; ties go to index 1.
        return [0 if len(completion[0]) < len(completion[1]) else 1 for completion in completions]

You can then use this judge as follows:

judge = PrefersShorterJudge()
judge.judge(
    prompts=["What is the capital of France?", "What is the biggest planet in the solar system?"],
    completions=[["Paris", "The capital of France is Paris."], ["Jupiter is the biggest planet in the solar system.", "Jupiter"]],
)  # Outputs: [0, 1]
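
The same pattern could be applied to the rank-based judges mentioned above. The sketch below ranks any number of completions from shortest to longest. To stay runnable without TRL installed, it subclasses a hypothetical stand-in class, `RankJudgeInterface`, rather than the real [BaseRankJudge]; in practice you would subclass [BaseRankJudge] instead. It also assumes the method returns, for each prompt, the completion indices ordered from best to worst; check the [BaseRankJudge] reference for the exact contract.

```python
from abc import ABC, abstractmethod


class RankJudgeInterface(ABC):
    """Hypothetical stand-in for TRL's BaseRankJudge, used only to keep this sketch self-contained."""

    @abstractmethod
    def judge(self, prompts, completions, shuffle_order=True):
        ...


class ShortestFirstRankJudge(RankJudgeInterface):
    def judge(self, prompts, completions, shuffle_order=True):
        # shuffle_order is ignored here: comparing lengths has no positional bias.
        # Assumption: each inner list holds completion indices ordered from best to worst.
        return [
            sorted(range(len(candidates)), key=lambda i: len(candidates[i]))
            for candidates in completions
        ]


judge = ShortestFirstRankJudge()
ranks = judge.judge(
    prompts=["What is the capital of France?"],
    completions=[["The capital of France is Paris.", "Paris", "It is Paris."]],
)
print(ranks)  # [[1, 2, 0]]
```

Here index 1 ("Paris") comes first because it is the shortest completion, followed by index 2 and then index 0.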

Provided judges

PairRMJudge

[[autodoc]] PairRMJudge

HfPairwiseJudge

[[autodoc]] HfPairwiseJudge

OpenAIPairwiseJudge

[[autodoc]] OpenAIPairwiseJudge

AllTrueJudge

[[autodoc]] AllTrueJudge

Base classes

BaseJudge

[[autodoc]] BaseJudge

BaseBinaryJudge

[[autodoc]] BaseBinaryJudge

BaseRankJudge

[[autodoc]] BaseRankJudge

BasePairwiseJudge

[[autodoc]] BasePairwiseJudge