cs.LG (2025-08-05)

📊 31 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (17) · Pillar 9: Embodied Foundation Models (10, 🔗 3) · Pillar 8: Physics-based Animation (3) · Pillar 5: Interaction & Reaction (1)

🔬 Pillar 2: RL Algorithms & Architecture (17 papers)

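| # | Title | Key point | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Understanding protein function with a multimodal retrieval-augmented foundation model | Proposes PoET-2 to address the challenges of protein function prediction | representation learning, foundation model, multimodal | |
| 2 | A Rolling Stone Gathers No Moss: Adaptive Policy Optimization for Stable Self-Evaluation in Large Multimodal Models | Proposes AdaPO to address self-evaluation in large multimodal models | reinforcement learning, foundation model, multimodal | |
| 3 | Scaling DRL for Decision Making: A Survey on Data, Network, and Training Budget Strategies | Proposes data, network, and training budget strategies to improve decision making with deep reinforcement learning | reinforcement learning, deep reinforcement learning, DRL | |
| 4 | PAC Apprenticeship Learning with Bayesian Active Inverse Reinforcement Learning | Proposes PAC-EIG to address reliability in active inverse reinforcement learning | reinforcement learning, inverse reinforcement learning | |
| 5 | Reinforcement Learning for Target Zone Blood Glucose Control | Proposes a reinforcement learning framework for blood glucose control in type 1 diabetes | reinforcement learning, policy learning, PULSE | |
| 6 | Training Long-Context, Multi-Turn Software Engineering Agents with Reinforcement Learning | Proposes a reinforcement learning method to address multi-turn interaction in software engineering | reinforcement learning, large language model | |
| 7 | Rethinking Selectivity in State Space Models: A Minimal Predictive Sufficiency Approach | Proposes a minimal predictive sufficiency model to optimize selectivity in state space models | Mamba, SSM, state space model | |
| 8 | Cross-Model Semantics in Representation Learning | Proposes structural constraints to improve the cross-model compatibility of deep network representations | representation learning, distillation | |
| 9 | HiTeC: Hierarchical Contrastive Learning on Text-Attributed Hypergraph with Semantic-Aware Augmentation | Proposes the HiTeC framework to address contrastive learning on text-attributed hypergraphs | representation learning, contrastive learning | |
| 10 | Reinforcement Learning in MDPs with Information-Ordered Policies | Proposes a reinforcement learning algorithm based on information-ordered policies to optimize MDPs | reinforcement learning | |
| 11 | Self-Questioning Language Models | Proposes self-questioning language models to improve reasoning ability | reinforcement learning, large language model | |
| 12 | SLA-MORL: SLA-Aware Multi-Objective Reinforcement Learning for HPC Resource Optimization | Proposes SLA-MORL to address resource optimization in cloud environments | reinforcement learning | |
| 13 | Physics-Constrained Fine-Tuning of Flow-Matching Models for Generation and Inverse Problems | Proposes physics-constrained fine-tuning of flow-matching models to solve inverse problems | flow matching | |
| 14 | Pseudo-label Induced Subspace Representation Learning for Robust Out-of-Distribution Detection | Proposes pseudo-label induced subspace representation learning to address OOD detection | representation learning | |
| 15 | VRPO: Rethinking Value Modeling for Robust RL Training under Noisy Supervision | Proposes VRPO to address reinforcement learning training under noisy supervision | reinforcement learning, PPO, RLHF | |
| 16 | ORVIT: Near-Optimal Online Distributionally Robust Reinforcement Learning | Proposes an online distributionally robust reinforcement learning method to handle mismatch between training and deployment environments | reinforcement learning | |
| 17 | Increasing Interaction Fidelity: Training Routines for Biomechanical Models in HCI | Proposes improved training routines to increase the interaction fidelity of biomechanical models in HCI | reinforcement learning, curriculum learning | |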

🔬 Pillar 9: Embodied Foundation Models (10 papers)

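| # | Title | Key point | Tags | 🔗 |
|---|---|---|---|---|
| 18 | SoilNet: A Multimodal Multitask Model for Hierarchical Classification of Soil Horizons | Proposes SoilNet to address hierarchical classification of soil horizons | foundation model, multimodal | |
| 19 | CoTox: Chain-of-Thought-Based Molecular Toxicity Reasoning and Prediction | Proposes the CoTox framework to address interpretability in drug toxicity prediction | large language model, chain-of-thought | |
| 20 | A Novel Multimodal Framework for Early Detection of Alzheimers Disease Using Deep Learning | Proposes a multimodal framework for early detection of Alzheimer's disease | multimodal | |
| 21 | VRPRM: Process Reward Modeling via Visual Reasoning | Proposes VRPRM to address the shortcomings of PRMs in long-horizon reasoning | large language model, chain-of-thought | |
| 22 | A DbC Inspired Neurosymbolic Layer for Trustworthy Agent Design | Proposes a neurosymbolic layer based on Design by Contract to improve agent trustworthiness | large language model | |
| 23 | MoKA: Mixture of Kronecker Adapters | Proposes MoKA to address the limited expressiveness of low-rank adapters | large language model | |
| 24 | Revisiting Heat Flux Analysis of Tungsten Monoblock Divertor on EAST using Physics-Informed Neural Network | Proposes a physics-informed neural network to accelerate heat flux analysis on EAST | TAMP | |
| 25 | Exploring Layer-wise Information Effectiveness for Post-Training Quantization in Small Language Models | Proposes the LieQ framework to address post-training quantization of small language models | large language model | |
| 26 | GTPO: Stabilizing Group Relative Policy Optimization via Gradient and Entropy Control | Proposes GTPO to address training instability in GRPO | large language model | |
| 27 | Frontier: Simulating the Next Generation of LLM Inference Systems | Proposes Frontier to address the complexity of LLM inference systems | large language model | |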

🔬 Pillar 8: Physics-based Animation (3 papers)

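| # | Title | Key point | Tags | 🔗 |
|---|---|---|---|---|
| 28 | Intelligent Sampling of Extreme-Scale Turbulence Datasets for Accurate and Efficient Spatiotemporal Model Training | Proposes the SICKLE framework for efficient model training on large-scale turbulence data | spatiotemporal | |
| 29 | Minimal Convolutional RNNs Accelerate Spatiotemporal Learning | Proposes MinConvLSTM and MinConvGRU to accelerate spatiotemporal learning | spatiotemporal | |
| 30 | AI on the Pulse: Real-Time Health Anomaly Detection with Wearable and Ambient Intelligence | Proposes AI on the Pulse to address real-time health anomaly detection | PULSE | |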

🔬 Pillar 5: Interaction & Reaction (1 paper)

| # | Title | Key point | Tags | 🔗 |
|---|---|---|---|---|
| 31 | No LLM Solved Yu Tsumura's 554th Problem | Reveals that existing LLMs cannot solve Yu Tsumura's 554th problem | IMoS | |
