cs.LG (2023-12-14)

📊 11 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (8 🔗2) · Pillar 9: Embodied Foundation Models (2) · Pillar 1: Robot Control (1)

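🔬 Pillar 2: RL Algorithms & Architecture (8 papers)

#	Title	One-sentence summary	Tags	🔗	⭐
1	LiFT: Unsupervised Reinforcement Learning with Foundation Models as Teachers	LiFT: unsupervised reinforcement learning that uses foundation models as teachers to improve how agents learn semantically meaningful behaviors	reinforcement learning, large language model, foundation model
2	Global Rewards in Multi-Agent Deep Reinforcement Learning for Autonomous Mobility on Demand Systems	Proposes a multi-agent deep reinforcement learning algorithm based on global rewards to optimize vehicle dispatching in autonomous mobility-on-demand systems	reinforcement learning, deep reinforcement learning	✅
3	Less is more -- the Dispatcher/ Executor principle for multi-task Reinforcement Learning	Proposes the Dispatcher/Executor principle to improve generalization and data efficiency in multi-task reinforcement learning	reinforcement learning
4	RdimKD: Generic Distillation Paradigm by Dimensionality Reduction	Proposes RdimKD, a generic knowledge distillation paradigm based on dimensionality reduction that simplifies the distillation pipeline and improves generalization	distillation
5	iOn-Profiler: intelligent Online multi-objective VNF Profiling with Reinforcement Learning	Proposes iOn-Profiler, which uses reinforcement learning for intelligent online multi-objective VNF profiling to optimize resource allocation and performance	reinforcement learning
6	Vision-Language Models as a Source of Rewards	Uses vision-language models as a source of rewards for reinforcement learning to improve the capabilities of generalist agents	reinforcement learning, generalist agent
7	Personalized Path Recourse for Reinforcement Learning Agents	Proposes a personalized path recourse method that generates goal-directed, similar behavior paths for reinforcement learning agents	reinforcement learning
8	Gradient Informed Proximal Policy Optimization	Proposes a gradient-informed proximal policy optimization algorithm that improves reinforcement learning performance in differentiable environments	policy learning, PPO	✅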

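🔬 Pillar 9: Embodied Foundation Models (2 papers)

#	Title	One-sentence summary	Tags	🔗	⭐
9	Successor Heads: Recurring, Interpretable Attention Heads In The Wild	Identifies and explains successor heads, attention heads in large language models that perform incrementation	large language model
10	Dynamic Retrieval-Augmented Generation	Proposes Dynamic Retrieval-Augmented Generation (DRAG) to improve the accuracy and efficiency of large language models on code generation tasks	large language model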

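🔬 Pillar 1: Robot Control (1 paper)

#	Title	One-sentence summary	Tags	🔗	⭐
11	ReCoRe: Regularized Contrastive Representation Learning of World Model	Proposes ReCoRe, which uses regularized contrastive representation learning to improve the generalization of world models in visual navigation	sim-to-real, reinforcement learning, world model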
