cs.LG (2025-08-11)

📊 16 papers in total

🎯 Browse by interest area

Pillar 9: Embodied Foundation Models (7) · Pillar 2: RL & Architecture (6) · Pillar 1: Robot Control (2) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 9: Embodied Foundation Models (7 papers)

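#	Title	One-line summary	Tags	🔗	⭐
1	MuaLLM: A Multimodal Large Language Model Agent for Circuit Design Assistance with Hybrid Contextual Retrieval-Augmented Generation	Proposes MuaLLM to address literature retrieval and generation for circuit design	large language model, multimodal
2	C-MAG: Cascade Multimodal Attributed Graphs for Supply Chain Link Prediction	Proposes C-MAG to address multimodal data fusion in supply chain link prediction	multimodal
3	BadPromptFL: A Novel Backdoor Threat to Prompt-based Federated Learning in Multimodal Models	Proposes BadPromptFL, a backdoor attack on prompt-based federated learning in multimodal models	multimodal
4	Vision-Based Localization and LLM-based Navigation for Indoor Environments	Proposes an indoor navigation solution that combines vision-based localization with large language model navigation	large language model
5	Selective KV-Cache Sharing to Mitigate Timing Side-Channels in LLM Inference	Proposes SafeKV to mitigate timing side-channel attacks in LLM inference	large language model
6	Multi-head Transformers Provably Learn Symbolic Multi-step Reasoning via Gradient Descent	Shows that multi-head transformers provably learn symbolic multi-step reasoning via gradient descent	chain-of-thought
7	On Understanding of the Dynamics of Model Capacity in Continual Learning	Introduces effective model capacity to address the stability-plasticity dilemma in continual learning	large language model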

🔬 Pillar 2: RL & Architecture (6 papers)

#	Title	One-line summary	Tags	🔗	⭐
8	WeChat-YATT: A Scalable, Simple, Efficient, and Production Ready Training Library	Proposes WeChat-YATT to address the scalability of multimodal RLHF training	reinforcement learning, RLHF, large language model
9	Pareto Multi-Objective Alignment for Language Models	Proposes Pareto multi-objective alignment to address multi-objective optimization for language models	RLHF, large language model
10	Learning to Align, Aligning to Learn: A Unified Approach for Self-Optimized Alignment	Proposes the GRAO framework to address inefficient language model alignment	reinforcement learning, PPO, DPO
11	GLiClass: Generalist Lightweight Model for Sequence Classification Tasks	Proposes GLiClass to address efficiency and accuracy in sequence classification tasks	PPO, instruction following
12	Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization	Proposes Klear-Reasoner to address the reproducibility of reasoning model performance	reinforcement learning, chain-of-thought
13	Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning	Proposes a systematic evaluation framework to improve the application of reinforcement learning to large language models	reinforcement learning, PPO

🔬 Pillar 1: Robot Control (2 papers)

#	Title	One-line summary	Tags	🔗	⭐
14	Robust Anomaly Detection in O-RAN: Leveraging LLMs against Data Manipulation Attacks	Uses large language models to counter data manipulation attacks in O-RAN	manipulation, large language model
15	Learning Robust Satellite Attitude Dynamics with Physics-Informed Normalising Flow	Proposes a physics-informed normalising flow to improve the robustness of satellite attitude control	MPC, model predictive control

🔬 Pillar 8: Physics-based Animation (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
16	Fast and Generalizable parameter-embedded Neural Operators for Lithium-Ion Battery Simulation	Proposes parameter-embedded neural operators to accelerate lithium-ion battery simulation	PULSE
