sbrandeis's Collections

Papers to read - Reinforcement Learning

updated Oct 31, 2023

Papers I want to read at some point, focused on reinforcement learning.

  • Deep reinforcement learning from human preferences

    Paper • 1706.03741 • Published Jun 12, 2017 • 3

  • Training language models to follow instructions with human feedback

    Paper • 2203.02155 • Published Mar 4, 2022 • 17

  • Direct Preference-based Policy Optimization without Reward Modeling

    Paper • 2301.12842 • Published Jan 30, 2023

  • Woodpecker: Hallucination Correction for Multimodal Large Language Models

    Paper • 2310.16045 • Published Oct 24, 2023 • 17

  • DPOK: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models

    Paper • 2305.16381 • Published May 25, 2023 • 3

    Note: DPO


  • Secrets of RLHF in Large Language Models Part I: PPO

    Paper • 2307.04964 • Published Jul 11, 2023 • 29

    Note: PPO
