cs.LG (2023-12-17)
📊 5 papers in total
🎯 Interest Area Navigation
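🔬 Pillar 2: RL Algorithms & Architecture (3 papers)
| # | Title | One-sentence summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | Policy Optimization in RLHF: The Impact of Out-of-preference Data | Shows that, in RLHF, using out-of-preference data for policy optimization can significantly improve performance. | RLHF, DPO, direct preference optimization | | |
| 2 | GO-DICE: Goal-Conditioned Option-Aware Offline Imitation Learning via Stationary Distribution Correction Estimation | GO-DICE: goal-conditioned, option-aware offline imitation learning based on stationary distribution correction estimation. | policy learning, imitation learning | | |
| 3 | Learning to Act without Actions | Proposes LAPO, which learns latent-action policies from videos alone, enabling action-free reinforcement learning. | reinforcement learning, world model | | |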
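🔬 Pillar 9: Embodied Foundation Models (2 papers)
| # | Title | One-sentence summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 4 | A mathematical perspective on Transformers | Analyzes Transformers from the viewpoint of interacting particle systems, revealing clustering in their long-time evolution. | large language model | | |
| 5 | Can persistent homology whiten Transformer-based black-box models? A case study on BERT compression | Proposes the Optimus BERT compression and explainability method, which uses persistent homology to slim down BERT models. | large language model | | |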