Neel Jain
nsjain
AI & ML interests
None yet
Recent Activity
upvoted a paper 27 minutes ago
Strategic Dishonesty Can Undermine AI Safety Evaluations of Frontier LLM
authored a paper 20 days ago
Refusal Tokens: A Simple Way to Calibrate Refusals in Large Language Models
authored a paper 20 days ago
DynaGuard: A Dynamic Guardrail Model With User-Defined Policies