Definition
An open-source training framework for reinforcement learning on language models.
A distributed RL training framework for LLMs, commonly used in agent post-training pipelines and notable for sharding model state with PyTorch FSDP (Fully Sharded Data Parallel) rather than DeepSpeed.