cs.LG (2023-12-17)

📊 5 papers in total

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (3) · Pillar 9: Embodied Foundation Models (2)

🔬 Pillar 2: RL Algorithms & Architecture (3 papers)

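| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 1 | Policy Optimization in RLHF: The Impact of Out-of-preference Data | Shows that, in RLHF, using out-of-preference data for policy optimization significantly improves performance. | RLHF, DPO, direct preference optimization |
| 2 | GO-DICE: Goal-Conditioned Option-Aware Offline Imitation Learning via Stationary Distribution Correction Estimation | GO-DICE: goal-conditioned, option-aware offline imitation learning based on stationary distribution correction estimation. | policy learning, imitation learning |
| 3 | Learning to Act without Actions | Proposes LAPO, which learns latent-action policies from video alone, enabling action-free reinforcement learning. | reinforcement learning, world model |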

🔬 Pillar 9: Embodied Foundation Models (2 papers)

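| # | Title | One-sentence summary | Tags |
|---|-------|----------------------|------|
| 4 | A mathematical perspective on Transformers | Analyzes Transformers as interacting particle systems, revealing clustering in their long-time evolution. | large language model |
| 5 | Can persistent homology whiten Transformer-based black-box models? A case study on BERT compression | Proposes Optimus BERT Compression and Explainability, a method that uses persistent homology to slim down BERT models. | large language model |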
