Cyber-Zero: Training Cybersecurity Agents without Runtime • Paper • 2508.00910 • Published 16 days ago • 7
CodeArena: A Collective Evaluation Platform for LLM Code Generation • Paper • 2503.01295 • Published Mar 3 • 8
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions • Paper • 2406.15877 • Published Jun 22, 2024 • 48