Update README.md
README.md
CHANGED
@@ -25,6 +25,9 @@ We introduce VideoHallu, a curated dataset that includes videos generated by sev
We also use GRPO to train [Qwen-2.5-VL-7B](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct) on a subset of our dataset and show improvement on generated video understanding.
+## About Open-Ended R1 Training
+As open-ended long-form generation gains traction, reliably judging the quality of multi-sentence and paragraph-length outputs has become a major hurdle. Traditional reference-based metrics such as ROUGE-L and BERTScore often miss nuances of coherence, style, and relevance, and can be skewed by pretraining biases. This leaves a critical gap in evaluation methods for guiding and training models that produce lengthy, free-form text.
+
## 🔥 News
- [2025/05/02] We release our datasets on [Hugging Face](https://huggingface.co/datasets/IntelligenceLab/VideoHallu) 🤗.
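
To illustrate the point in the added README paragraph that overlap metrics like ROUGE-L can miss relevance and meaning, here is a minimal sketch of a simplified ROUGE-L F1 score. It assumes whitespace tokenization and equal weighting of precision and recall; it is not the scorer used by the authors, and the example sentences are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """Simplified ROUGE-L F1 (assumption: lowercase whitespace tokens, beta = 1)."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical sentences: a faithful paraphrase and a word-swapped, wrong answer.
reference = "the ball falls to the ground because of gravity"
paraphrase = "gravity pulls the ball down until it hits the floor"
swapped = "the ground falls to the ball because of gravity"

score_paraphrase = rouge_l_f1(reference, paraphrase)
score_swapped = rouge_l_f1(reference, swapped)
print(f"paraphrase: {score_paraphrase:.3f}")
print(f"swapped:    {score_swapped:.3f}")
```

In this sketch the word-swapped sentence, which states something physically wrong, scores higher than the correct paraphrase, because the metric rewards shared token order rather than meaning.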