cs.LG (2026-04-02)
📊 7 papers in total
🎯 Interest Area Navigation
🔬 Pillar 9: Embodied Foundation Models (4 papers)
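| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | FourierMoE: Fourier Mixture-of-Experts Adaptation of Large Language Models | Proposes FourierMoE, which efficiently fine-tunes large language models with a frequency-domain mixture-of-experts model. | large language model | | |
| 2 | Batched Contextual Reinforcement: A Task-Scaling Law for Efficient Reasoning | Proposes Batched Contextual Reinforcement (BCR), which improves LLM reasoning efficiency while avoiding the drawbacks of explicit length penalties. | large language model, chain-of-thought | | |
| 3 | MiCA Learns More Knowledge Than LoRA and Full Fine-Tuning | MiCA: a parameter-efficient fine-tuning method that improves knowledge acquisition by adapting minor components. | large language model | | |
| 4 | CRIT: Graph-Based Automatic Data Synthesis to Enhance Cross-Modal Multi-Hop Reasoning | Proposes CRIT, a graph-based automatic data synthesis method that enhances cross-modal multi-hop reasoning. | multimodal | | |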
🔬 Pillar 2: RL & Architecture (3 papers)
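| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 5 | World Action Verifier: Self-Improving World Models via Forward-Inverse Asymmetry | Proposes World Action Verifier (WAV), which self-improves world models via forward-inverse asymmetry. | policy learning, world model, world models | | |
| 6 | Physics Informed Reinforcement Learning with Gibbs Priors for Topology Control in Power Grids | Proposes physics-informed reinforcement learning with Gibbs priors for topology control in power grids. | reinforcement learning, PPO | | |
| 7 | Model-Based Reinforcement Learning for Control under Time-Varying Dynamics | Proposes an optimistic model-based reinforcement learning algorithm with adaptive data buffering for control under time-varying dynamics. | reinforcement learning | | |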