RainbowPO: A Unified Framework for Combining Improvements in Preference Optimization

This is the code repository for our ICLR 2025 paper. Download the model you are going to fine-tune (e.g., Llama3-8B-Instruct) and set model_path to its location. Download datasets from Ultrafeedback ...
