cs.LG (2025-08-11)

📊 16 papers in total

🎯 Interest Areas

Pillar 9: Embodied Foundation Models (7) · Pillar 2: RL Algorithms & Architecture (6) · Pillar 1: Robot Control (2) · Pillar 8: Physics-based Animation (1)

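🔬 Pillar 9: Embodied Foundation Models (7 papers)

| # | Title | One-line summary | Tags |
|---|---|---|---|
| 1 | MuaLLM: A Multimodal Large Language Model Agent for Circuit Design Assistance with Hybrid Contextual Retrieval-Augmented Generation | Proposes MuaLLM to address literature retrieval and generation for circuit design | large language model, multimodal |
| 2 | C-MAG: Cascade Multimodal Attributed Graphs for Supply Chain Link Prediction | Proposes C-MAG to address multimodal data fusion in supply chain link prediction | multimodal |
| 3 | BadPromptFL: A Novel Backdoor Threat to Prompt-based Federated Learning in Multimodal Models | Proposes BadPromptFL, a backdoor attack on prompt-based federated learning in multimodal models | multimodal |
| 4 | Vision-Based Localization and LLM-based Navigation for Indoor Environments | Proposes an indoor navigation solution combining vision-based localization with LLM-based navigation | large language model |
| 5 | Selective KV-Cache Sharing to Mitigate Timing Side-Channels in LLM Inference | Proposes SafeKV to mitigate timing side-channel attacks in LLM inference | large language model |
| 6 | Multi-head Transformers Provably Learn Symbolic Multi-step Reasoning via Gradient Descent | Shows that multi-head transformers provably learn symbolic multi-step reasoning via gradient descent | chain-of-thought |
| 7 | On Understanding of the Dynamics of Model Capacity in Continual Learning | Proposes effective model capacity to address the stability-plasticity dilemma in continual learning | large language model |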

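🔬 Pillar 2: RL Algorithms & Architecture (6 papers)

| # | Title | One-line summary | Tags |
|---|---|---|---|
| 8 | WeChat-YATT: A Scalable, Simple, Efficient, and Production Ready Training Library | Proposes WeChat-YATT to address the scalability of multimodal RLHF training | reinforcement learning, RLHF, large language model |
| 9 | Pareto Multi-Objective Alignment for Language Models | Proposes Pareto multi-objective alignment to address multi-objective optimization of language models | RLHF, large language model |
| 10 | Learning to Align, Aligning to Learn: A Unified Approach for Self-Optimized Alignment | Proposes the GRAO framework to address inefficient language model alignment | reinforcement learning, PPO, DPO |
| 11 | GLiClass: Generalist Lightweight Model for Sequence Classification Tasks | Proposes GLiClass to address efficiency and accuracy in sequence classification tasks | PPO, instruction following |
| 12 | Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization | Proposes Klear-Reasoner to address the reproducibility of reasoning model performance | reinforcement learning, chain-of-thought |
| 13 | Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning | Proposes a systematic evaluation framework to optimize the application of reinforcement learning to large language models | reinforcement learning, PPO |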

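🔬 Pillar 1: Robot Control (2 papers)

| # | Title | One-line summary | Tags |
|---|---|---|---|
| 14 | Robust Anomaly Detection in O-RAN: Leveraging LLMs against Data Manipulation Attacks | Uses large language models to counter data manipulation attacks in O-RAN | manipulation, large language model |
| 15 | Learning Robust Satellite Attitude Dynamics with Physics-Informed Normalising Flow | Proposes a physics-informed normalising flow to improve the robustness of satellite attitude control | MPC, model predictive control |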

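🔬 Pillar 8: Physics-based Animation (1 paper)

| # | Title | One-line summary | Tags |
|---|---|---|---|
| 16 | Fast and Generalizable parameter-embedded Neural Operators for Lithium-Ion Battery Simulation | Proposes parameter-embedded neural operators to accelerate lithium-ion battery simulation | PULSE |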
