
Valentina Pyatkin

valpy

https://valentinapy.github.io

AI & ML interests

None yet

Recent Activity

upvoted a paper 16 days ago

2 OLMo 2 Furious

upvoted a paper 16 days ago

RewardBench 2: Advancing Reward Model Evaluation

upvoted a paper 16 days ago

Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy



authored 4 papers 30 days ago

2 OLMo 2 Furious

Paper • 2501.00656 • Published Dec 31, 2024 • 21

IssueBench: Millions of Realistic Prompts for Measuring Issue Bias in LLM Writing Assistance

Paper • 2502.08395 • Published Feb 12, 2025

RewardBench 2: Advancing Reward Model Evaluation

Paper • 2506.01937 • Published Jun 2, 2025 • 7

Generalizing Verifiable Instruction Following

Paper • 2507.02833 • Published Jul 3, 2025 • 1

authored 2 papers 9 months ago

TÜLU 3: Pushing Frontiers in Open Language Model Post-Training

Paper • 2411.15124 • Published Nov 22, 2024 • 65

SafetyAnalyst: Interpretable, transparent, and steerable LLM safety moderation

Paper • 2410.16665 • Published Oct 22, 2024

authored 5 papers 10 months ago

Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback

Paper • 2410.19133 • Published Oct 24, 2024 • 11

Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models

Paper • 2402.16786 • Published Feb 26, 2024

QASem Parsing: Text-to-text Modeling of QA-based Semantics

Paper • 2205.11413 • Published May 23, 2022

Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback

Paper • 2406.09279 • Published Jun 13, 2024 • 3

The Art of Saying No: Contextual Noncompliance in Language Models

Paper • 2407.12043 • Published Jul 2, 2024 • 4

authored a paper about 1 year ago

WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild

Paper • 2406.04770 • Published Jun 7, 2024 • 31

authored 8 papers over 1 year ago

RewardBench: Evaluating Reward Models for Language Modeling

Paper • 2403.13787 • Published Mar 20, 2024 • 23

Phenomenal Yet Puzzling: Testing Inductive Reasoning Capabilities of Language Models with Hypothesis Refinement

Paper • 2310.08559 • Published Oct 12, 2023 • 1

ClarifyDelphi: Reinforced Clarification Questions with Defeasibility Rewards for Social and Moral Situations

Paper • 2212.10409 • Published Dec 20, 2022

Design Choices for Crowdsourcing Implicit Discourse Relations: Revealing the Biases Introduced by Task Design

Paper • 2304.00815 • Published Apr 3, 2023

Draw Me a Flower: Processing and Grounding Abstraction in Natural Language

Paper • 2106.14321 • Published Jun 27, 2021 • 1

Asking It All: Generating Contextualized Questions for any Semantic Role

Paper • 2109.04832 • Published Sep 10, 2021

The Possible, the Plausible, and the Desirable: Event-Based Modality Detection for Language Processing

Paper • 2106.08037 • Published Jun 15, 2021

QADiscourse -- Discourse Relations as QA Pairs: Representation, Crowdsourcing and Baselines

Paper • 2010.02815 • Published Oct 6, 2020
