cs.LG (2026-01-02)
📊 10 papers in total | 🔗 1 with code
🎯 Interest Area Navigation
Pillar 2: RL Algorithms & Architecture (5)
Pillar 9: Embodied Foundation Models (4, 🔗 1)
Pillar 1: Robot Control (1)
🔬 Pillar 2: RL Algorithms & Architecture (5 papers)
| # | Title | One-Line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | IRPO: Scaling the Bradley-Terry Model via Reinforcement Learning | Proposes IRPO, which scales the Bradley-Terry model via reinforcement learning to make generative reward models more efficient (sketch below). | reinforcement learning, chain-of-thought | | |
| 2 | Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation | Proposes the Avatar Forcing framework for real-time interactive head avatar generation in natural conversation. | direct preference optimization, multimodal | | |
| 3 | The Reasoning-Creativity Trade-off: Toward Creativity-Driven Problem Solving | Proposes the DCR framework to address the reasoning-creativity trade-off in LLMs, unifying correctness and creativity. | DPO, large language model | | |
| 4 | Traffic-Aware Optimal Taxi Placement Using Graph Neural Network-Based Reinforcement Learning | Proposes a traffic-aware optimal taxi placement method based on graph neural network reinforcement learning to improve urban mobility efficiency. | reinforcement learning | | |
| 5 | ARISE: Adaptive Reinforcement Integrated with Swarm Exploration | ARISE: an adaptive reinforcement learning framework integrated with swarm exploration to strengthen exploration. | reinforcement learning, PPO | | |
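For context on paper #1: a minimal sketch of the standard Bradley-Terry pairwise preference loss that reward-model training builds on. This is not the IRPO implementation; the function and tensor names are illustrative.

```python
# Standard Bradley-Terry preference loss for reward models:
# P(chosen beats rejected) = sigmoid(r_chosen - r_rejected).
# Illustrative sketch only -- not the IRPO paper's code.
import torch
import torch.nn.functional as F

def bradley_terry_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    """Negative log-likelihood that the chosen response outranks the rejected one."""
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# Toy usage: scalar rewards for a batch of four preference pairs.
r_chosen = torch.tensor([1.2, 0.3, 2.0, -0.5])
r_rejected = torch.tensor([0.7, 0.9, 1.5, -1.0])
print(bradley_terry_loss(r_chosen, r_rejected))  # a single scalar loss
```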
🔬 Pillar 9: Embodied Foundation Models (4 papers)
| # | Title | One-Line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 6 | Memory Bank Compression for Continual Adaptation of Large Language Models | Proposes MBC, which enables continual adaptation of large language models by compressing the memory bank, substantially reducing storage cost. | large language model | ✅ | |
| 7 | Bayesian Inverse Games with High-Dimensional Multi-Modal Observations | Proposes a variational-autoencoder-based Bayesian inverse game framework for multi-agent objective inference and uncertainty quantification. | multimodal | | |
| 8 | Geometry of Reason: Spectral Signatures of Valid Mathematical Reasoning | Proposes a training-free spectral analysis method that detects the validity of mathematical reasoning from LLM attention patterns. | large language model | | |
| 9 | HFedMoE: Resource-aware Heterogeneous Federated Learning with Mixture-of-Experts | HFedMoE: a heterogeneous federated mixture-of-experts learning framework for resource-constrained devices (sketch below). | large language model | | |
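For context on paper #9: a minimal sketch of top-k mixture-of-experts routing, the building block that HFedMoE federates across heterogeneous devices. Names, shapes, and the dense looping are illustrative, not the paper's implementation.

```python
# Top-k MoE routing sketch: each sample is sent to its k highest-scoring experts
# and their outputs are combined with softmax-normalized gate weights.
import torch

def moe_forward(x, gate, experts, k=2):
    logits = gate(x)                                 # [batch, n_experts] routing scores
    topk_vals, topk_idx = logits.topk(k, dim=-1)     # each sample picks its k best experts
    weights = torch.softmax(topk_vals, dim=-1)       # normalize over the selected experts
    out = torch.zeros_like(x)
    for slot in range(k):
        for e, expert in enumerate(experts):
            mask = topk_idx[:, slot] == e            # samples routed to expert e in this slot
            if mask.any():
                out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
    return out

d, n_experts = 16, 4
gate = torch.nn.Linear(d, n_experts)
experts = [torch.nn.Linear(d, d) for _ in range(n_experts)]
print(moe_forward(torch.randn(8, d), gate, experts).shape)  # torch.Size([8, 16])
```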
🔬 Pillar 1: Robot Control (1 paper)
| # | Title | One-Line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 10 | Adversarial Samples Are Not Created Equal | Distinguishes adversarial samples by whether they exploit fragile features, re-evaluating the adversarial robustness of deep networks (sketch below). | manipulation | | |
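For context on paper #10: a minimal FGSM sketch, the standard one-step recipe for crafting adversarial samples; the paper is about how such samples differ, not this recipe. The toy model and epsilon are illustrative.

```python
# FGSM: perturb the input by epsilon along the sign of the loss gradient.
# Toy model and epsilon are placeholders, not the paper's setup.
import torch
import torch.nn.functional as F

def fgsm(model, x, y, epsilon=0.03):
    """Return a one-step adversarial perturbation of x for labels y."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    return (x_adv + epsilon * x_adv.grad.sign()).detach()

model = torch.nn.Linear(10, 3)                  # toy classifier standing in for a deep net
x, y = torch.randn(4, 10), torch.tensor([0, 1, 2, 1])
x_adv = fgsm(model, x, y)
print((x_adv - x).abs().max())                  # perturbation magnitude ≈ epsilon
```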