---
title: TinyV
emoji: 💬
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 5.0.1
app_file: app.py
pinned: false
license: mit
---
# TinyV
This Hugging Face Space hosts an Answer Verification Tool (TinyV) powered by the `zhangchenxu/TinyV-1.5B` model. The tool is designed for RL training pipelines, where it verifies whether a model's answer is semantically equivalent to the ground truth answer.
## What This Tool Does
The Answer Verification Tool analyzes:
- A question
- A ground truth answer
- A model-generated answer
It then determines whether the model's answer is correct, even if there are minor discrepancies in formatting or wording. The verification is LLM-based rather than exact string matching, which helps reduce false negatives in evaluation pipelines.
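To see why exact matching falls short, consider the hypothetical comparison below: the two answers are semantically equivalent, yet string comparison rejects the model answer. This is exactly the kind of false negative an LLM-based verifier is meant to catch.

```python
# Hypothetical illustration: exact string matching vs. semantic equivalence.
ground_truth = "The capital of France is Paris."
model_answer = "Paris is the capital of France."

# Exact matching marks this correct answer as wrong (a false negative).
print(ground_truth == model_answer)  # False

# Even a normalized comparison still fails on reordered wording.
normalize = lambda s: s.lower().strip(".")
print(normalize(ground_truth) == normalize(model_answer))  # False
```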
## How to Use
### Web Interface
1. Enter the question in the first box
2. Enter the ground truth answer
3. Enter the model's answer to verify
4. Adjust the model parameters if needed (optional)
5. Click "Verify Answer" to see the result
The tool will return:
- `True` if the model answer is correct
- `False` if the model answer is incorrect
### API Usage
You can also use this tool via API:
```python
from gradio_client import Client

client = Client("zhangchenxu/TinyV")
result = client.predict(
    question="What is the capital of France?",
    ground_truth="The capital of France is Paris.",
    model_answer="Paris is the capital of France.",
    temperature=0.3,
    top_p=0.95,
    max_tokens=128,
    api_name="/verify"
)
print(result)
```
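Since the verifier is intended for RL training, a common pattern is to batch-verify rollouts and turn each verdict into a binary reward. The sketch below assumes the `/verify` endpoint returns the text `"True"` or `"False"` as described above; adapt the parsing if your client receives a different type.

```python
from gradio_client import Client

client = Client("zhangchenxu/TinyV")

def verification_reward(question, ground_truth, model_answer):
    """Return 1.0 if TinyV judges the answer correct, else 0.0."""
    result = client.predict(
        question=question,
        ground_truth=ground_truth,
        model_answer=model_answer,
        temperature=0.3,
        top_p=0.95,
        max_tokens=128,
        api_name="/verify",
    )
    # Assumes the endpoint returns "True"/"False" as text.
    return 1.0 if str(result).strip().lower().startswith("true") else 0.0

rollouts = [
    ("What is 2 + 2?", "4", "The answer is four."),
    ("What is the capital of France?", "Paris", "London"),
]
rewards = [verification_reward(q, gt, ans) for q, gt, ans in rollouts]
print(rewards)
```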
## Advanced Settings
- **Temperature**: Controls randomness. Lower values make the output more deterministic (default: 0.3)
- **Top-p**: Controls diversity via nucleus sampling (default: 0.95)
- **Max Tokens**: Maximum number of tokens to generate in the response (default: 128)
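If you want more reproducible verdicts, you can lower the temperature when calling the API. The snippet below is a sketch under the assumption that the `/verify` endpoint accepts the same keyword arguments as in the example above; whether a temperature of exactly 0 is supported depends on the serving backend, so a small positive value is used here.

```python
from gradio_client import Client

client = Client("zhangchenxu/TinyV")
result = client.predict(
    question="Solve for x: 2x + 3 = 7",
    ground_truth="x = 2",
    model_answer="x equals two",
    temperature=0.05,  # near-deterministic sampling
    top_p=1.0,         # effectively disable nucleus truncation
    max_tokens=16,     # "True"/"False" needs only a few tokens
    api_name="/verify",
)
print(result)
```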
## Model Information
This tool uses the `zhangchenxu/TinyV-1.5B` model, which has been optimized for answer verification tasks.
The model uses the following prompt template:
```
You are an AI tasked with identifying false negatives in answer verification. A false negative occurs when a model's answer is essentially correct but is marked as incorrect due to minor discrepancies or formatting issues. Your job is to analyze the given question, ground truth answer, and model answer to determine if the model's answer is actually correct despite appearing different from the ground truth.

<question>{question}</question>
<ground_truth_answer>{ground_truth}</ground_truth_answer>
<model_answer>{model_answer}</model_answer>

Return "True" if the model's answer is correct, otherwise return "False".
```
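For reference, here is a minimal sketch of how the template above could be filled in and run locally with the `transformers` library. It assumes `zhangchenxu/TinyV-1.5B` loads as a standard causal language model with a chat template; the Space's actual inference code in `app.py` may differ.

```python
# Sketch of local inference with the prompt template above.
# Assumes the model loads as a standard causal LM with a chat template.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "zhangchenxu/TinyV-1.5B"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

TEMPLATE = """You are an AI tasked with identifying false negatives in answer verification. A false negative occurs when a model's answer is essentially correct but is marked as incorrect due to minor discrepancies or formatting issues. Your job is to analyze the given question, ground truth answer, and model answer to determine if the model's answer is actually correct despite appearing different from the ground truth.

<question>{question}</question>
<ground_truth_answer>{ground_truth}</ground_truth_answer>
<model_answer>{model_answer}</model_answer>

Return "True" if the model's answer is correct, otherwise return "False"."""

prompt = TEMPLATE.format(
    question="What is the capital of France?",
    ground_truth="The capital of France is Paris.",
    model_answer="Paris is the capital of France.",
)
messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(
    inputs, max_new_tokens=128, do_sample=True, temperature=0.3, top_p=0.95
)
# Decode only the newly generated tokens (the verdict).
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```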