---
title: TinyV
emoji: 💬
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 5.0.1
app_file: app.py
pinned: false
license: mit
---
# TinyV
This Hugging Face Space hosts an Answer Verification Tool (TinyV) powered by the `zhangchenxu/TinyV-1.5B` model. The tool is designed for RL training pipelines, where it verifies whether a model's answer is semantically equivalent to the ground truth answer.
## What This Tool Does
The Answer Verification Tool analyzes:
- A question
- A ground truth answer
- A model-generated answer
It then determines whether the model's answer is correct, even if there are minor discrepancies in formatting or wording. The verification is LLM-based rather than exact string matching, which helps reduce false negatives in evaluation pipelines.
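To see why exact matching falls short, consider the hypothetical comparison below: the two answers are semantically equivalent, yet string comparison rejects the model answer. This is exactly the kind of false negative an LLM-based verifier is meant to catch.

```python
# Hypothetical illustration: exact string matching vs. semantic equivalence.
ground_truth = "The capital of France is Paris."
model_answer = "Paris is the capital of France."

# Exact matching marks this correct answer as wrong (a false negative).
print(ground_truth == model_answer)  # False

# Even a normalized comparison still fails on reordered wording.
normalize = lambda s: s.lower().strip(".")
print(normalize(ground_truth) == normalize(model_answer))  # False
```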
## How to Use
### Web Interface
1. Enter the question in the first box
2. Enter the ground truth answer
3. Enter the model's answer to verify
4. Adjust the model parameters if needed (optional)
5. Click "Verify Answer" to see the result
The tool will return:
- `True` if the model answer is correct
- `False` if the model answer is incorrect
### API Usage
You can also use this tool via API:
```python
from gradio_client import Client

client = Client("zhangchenxu/TinyV")
result = client.predict(
    question="What is the capital of France?",
    ground_truth="The capital of France is Paris.",
    model_answer="Paris is the capital of France.",
    temperature=0.3,
    top_p=0.95,
    max_tokens=128,
    api_name="/verify"
)
print(result)
```
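Since the verifier is intended for RL training, a common pattern is to batch-verify rollouts and turn each verdict into a binary reward. The sketch below assumes the `/verify` endpoint returns the text `"True"` or `"False"` as described above; adapt the parsing if your client receives a different type.

```python
from gradio_client import Client

client = Client("zhangchenxu/TinyV")

def verification_reward(question, ground_truth, model_answer):
    """Return 1.0 if TinyV judges the answer correct, else 0.0."""
    result = client.predict(
        question=question,
        ground_truth=ground_truth,
        model_answer=model_answer,
        temperature=0.3,
        top_p=0.95,
        max_tokens=128,
        api_name="/verify",
    )
    # Assumes the endpoint returns "True"/"False" as text.
    return 1.0 if str(result).strip().lower().startswith("true") else 0.0

rollouts = [
    ("What is 2 + 2?", "4", "The answer is four."),
    ("What is the capital of France?", "Paris", "London"),
]
rewards = [verification_reward(q, gt, ans) for q, gt, ans in rollouts]
print(rewards)
```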
## Advanced Settings
- **Temperature**: Controls randomness. Lower values make the output more deterministic (default: 0.3)
- **Top-p**: Controls diversity via nucleus sampling (default: 0.95)
- **Max Tokens**: Maximum number of tokens to generate in the response (default: 128)
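If you want more reproducible verdicts, you can lower the temperature when calling the API. The snippet below is a sketch under the assumption that the `/verify` endpoint accepts the same keyword arguments as in the example above; whether a temperature of exactly 0 is supported depends on the serving backend, so a small positive value is used here.

```python
from gradio_client import Client

client = Client("zhangchenxu/TinyV")
result = client.predict(
    question="Solve for x: 2x + 3 = 7",
    ground_truth="x = 2",
    model_answer="x equals two",
    temperature=0.05,  # near-deterministic sampling
    top_p=1.0,         # effectively disable nucleus truncation
    max_tokens=16,     # "True"/"False" needs only a few tokens
    api_name="/verify",
)
print(result)
```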
## Model Information
This tool uses the `zhangchenxu/TinyV-1.5B` model, which has been optimized for answer verification tasks.
The model uses the following prompt template:
```
You are an AI tasked with identifying false negatives in answer verification. A false negative occurs when a model's answer is essentially correct but is marked as incorrect due to minor discrepancies or formatting issues. Your job is to analyze the given question, ground truth answer, and model answer to determine if the model's answer is actually correct despite appearing different from the ground truth.

<question>{question}</question>
<ground_truth_answer>{ground_truth}</ground_truth_answer>
<model_answer>{model_answer}</model_answer>

Return "True" if the model's answer is correct, otherwise return "False".
```
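For reference, here is a minimal sketch of how the template above could be filled in and run locally with the `transformers` library. It assumes `zhangchenxu/TinyV-1.5B` loads as a standard causal language model with a chat template; the Space's actual inference code in `app.py` may differ.

```python
# Sketch of local inference with the prompt template above.
# Assumes the model loads as a standard causal LM with a chat template.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "zhangchenxu/TinyV-1.5B"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

TEMPLATE = """You are an AI tasked with identifying false negatives in answer verification. A false negative occurs when a model's answer is essentially correct but is marked as incorrect due to minor discrepancies or formatting issues. Your job is to analyze the given question, ground truth answer, and model answer to determine if the model's answer is actually correct despite appearing different from the ground truth.

<question>{question}</question>
<ground_truth_answer>{ground_truth}</ground_truth_answer>
<model_answer>{model_answer}</model_answer>

Return "True" if the model's answer is correct, otherwise return "False"."""

prompt = TEMPLATE.format(
    question="What is the capital of France?",
    ground_truth="The capital of France is Paris.",
    model_answer="Paris is the capital of France.",
)
messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(
    inputs, max_new_tokens=128, do_sample=True, temperature=0.3, top_p=0.95
)
# Decode only the newly generated tokens (the verdict).
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```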