cs.LG (2025-09-19)

📊 33 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (18 🔗1) · Pillar 9: Embodied Foundation Models (12 🔗1) · Pillar 1: Robot Control (3)

🔬 Pillar 2: RL Algorithms & Architecture (18 papers)

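# | Title | One-sentence summary | Tags | 🔗
1 | Foundation Models as World Models: A Foundational Study in Text-Based GridWorlds | Proposes Foundation Model-based world models and agents to improve reinforcement learning efficiency in text-based grid worlds. | reinforcement learning, world model, large language model
2 | Estimating Clinical Lab Test Result Trajectories from PPG using Physiological Foundation Model and Patient-Aware State Space Model -- a UNIPHY+ Approach | UNIPHY+Lab: uses PPG and a physiological foundation model to predict continuous biochemical indicators for ICU patients. | Mamba, state space model, MAE
3 | Polynomial Contrastive Learning for Privacy-Preserving Representation Learning on Graphs | Proposes Poly-GRACE, enabling homomorphic-encryption-friendly self-supervised representation learning for graph neural networks. | representation learning, contrastive learning, OMOMO
4 | Optimizing Product Deduplication in E-Commerce with Multimodal Embeddings | Proposes a multimodal-embedding-based product deduplication method for e-commerce that improves deduplication precision on large-scale product catalogs. | masked autoencoder, multimodal
5 | MTS-DMAE: Dual-Masked Autoencoder for Unsupervised Multivariate Time Series Representation Learning | Proposes DMAE, a dual-masked autoencoder for unsupervised multivariate time series representation learning. | representation learning, masked autoencoder
6 | Test-Time Learning and Inference-Time Deliberation for Efficiency-First Offline Reinforcement Learning in Care Coordination and Population Health Management | Proposes the TTL+ITD method for efficient, auditable offline reinforcement learning in care coordination. | reinforcement learning, offline reinforcement learning
7 | Rethinking Molecule Synthesizability with Chain-of-Reaction | ReaSyn: uses chains of reaction to address the poor synthesizability of molecules produced by generative models. | reinforcement learning, large language model, chain-of-thought
8 | Mental Accounts for Actions: EWA-Inspired Attention in Decision Transformers | EWA-VQ-ODT: uses experience-weighted attraction to improve the sample efficiency of online Decision Transformers. | reinforcement learning, decision transformer, reward shaping
9 | Automated Cyber Defense with Generalizable Graph-based Reinforcement Learning Agents | Proposes generalizable graph-based reinforcement learning agents for automated cyber defense. | reinforcement learning, deep reinforcement learning
10 | Fully Decentralized Cooperative Multi-Agent Reinforcement Learning is A Context Modeling Problem | Proposes the Dynamics-Aware Context (DAC) method to address non-stationarity and overgeneralization in fully decentralized cooperative multi-agent reinforcement learning. | reinforcement learning, policy learning
11 | DiffusionNFT: Online Diffusion Reinforcement with Forward Process | Proposes DiffusionNFT, which optimizes diffusion models through the forward process for efficient online reinforcement learning. | reinforcement learning, flow matching, classifier-free guidance
12 | Uncertainty-Based Smooth Policy Regularisation for Reinforcement Learning with Few Demonstrations | SPReD: uncertainty-based smooth policy regularisation that improves reinforcement learning with few demonstrations. | reinforcement learning
13 | RLinf: Flexible and Efficient Large-scale Reinforcement Learning via Macro-to-Micro Flow Transformation | RLinf: flexible and efficient large-scale reinforcement learning via macro-to-micro flow transformation. | reinforcement learning
14 | RMT-KD: Random Matrix Theoretic Causal Knowledge Distillation | Proposes RMT-KD for compressing deep learning models. | distillation
15 | Nonconvex Regularization for Feature Selection in Reinforcement Learning | Proposes a nonconvex-regularization-based feature selection algorithm for reinforcement learning that improves performance in high-noise environments. | reinforcement learning
16 | Inverse Optimization Latent Variable Models for Learning Costs Applied to Route Problems | Proposes the Inverse Optimization Latent Variable Model (IO-LVM) for learning distributions over cost functions in route planning problems. | reinforcement learning, inverse reinforcement learning
17 | HyP-ASO: A Hybrid Policy-based Adaptive Search Optimization Framework for Large-Scale Integer Linear Programs | HyP-ASO: a hybrid policy-based adaptive search optimization framework for solving large-scale integer linear programs. | reinforcement learning, deep reinforcement learning
18 | Learning to Optimize Capacity Planning in Semiconductor Manufacturing | Proposes a deep reinforcement learning model based on heterogeneous graph neural networks to optimize capacity planning in semiconductor manufacturing. | reinforcement learning, deep reinforcement learning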

🔬 Pillar 9: Embodied Foundation Models (12 papers)

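# | Title | One-sentence summary | Tags | 🔗
19 | Uncertainty Quantification of Large Language Models using Approximate Bayesian Computation | Proposes an Approximate Bayesian Computation-based uncertainty quantification method for large language models that improves the reliability of clinical diagnosis. | large language model
20 | Efficient Long-Tail Learning in Latent Space by sampling Synthetic Data | Proposes a latent-space long-tail learning method based on sampling synthetic data, improving computational efficiency. | foundation model
21 | MatchFixAgent: Language-Agnostic Autonomous Repository-Level Code Translation Validation and Repair | Proposes MatchFixAgent for language-agnostic, repository-level code translation validation and repair. | large language model
22 | Randomized Smoothing Meets Vision-Language Models | Proposes a randomized-smoothing-based robustness verification method for vision-language models to defend against adversarial attacks. | VLA
23 | SABER: Uncovering Vulnerabilities in Safety Alignment via Cross-Layer Residual Connection | SABER: exposes vulnerabilities in safety-aligned large language models via cross-layer residual connections. | large language model
24 | The Alignment Bottleneck | Proposes a capacity-coupled alignment performance interval to address the alignment bottleneck. | large language model
25 | On Optimal Steering to Achieve Exact Fairness | Proposes a KL-divergence-based optimal steering method for feature distributions that achieves exact fairness and improves model utility. | large language model
26 | EigenTrack: Spectral Activation Feature Tracking for Hallucination and Out-of-Distribution Detection in LLMs and VLMs | EigenTrack: tracks spectral activation features to detect hallucinations and out-of-distribution (OOD) inputs in LLMs and VLMs. | large language model
27 | KITE: Kernelized and Information Theoretic Exemplars for In-Context Learning | KITE: kernel- and information-theory-based exemplar selection for in-context learning, improving few-shot classification performance. | large language model
28 | Information Geometry of Variational Bayes | Reveals the connection between information geometry and variational Bayes, with application to large language models. | large language model
29 | Spectral Logit Sculpting: Adaptive Low-Rank Logit Transformation for Controlled Text Generation | Proposes Spectral Logit Sculpting (SLS), which controls text generation through adaptive low-rank logit transformation and improves LLM reliability. | large language model
30 | Small LLMs with Expert Blocks Are Good Enough for Hyperparamter Tuning | Proposes an Expert Block framework to improve hyperparameter tuning with small LLMs. | large language model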

🔬 Pillar 1: Robot Control (3 papers)

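# | Title | One-sentence summary | Tags | 🔗
31 | Quantum Reinforcement Learning with Dynamic-Circuit Qubit Reuse and Grover-Based Trajectory Optimization | Proposes a quantum reinforcement learning framework based on dynamic quantum circuits and Grover's algorithm, improving scalability. | trajectory optimization, reinforcement learning
32 | CoUn: Empowering Machine Unlearning via Contrastive Learning | CoUn: strengthens machine unlearning through contrastive learning. | manipulation, contrastive learning
33 | UniTac2Pose: A Unified Approach Learned in Simulation for Category-level Visuotactile In-hand Pose Estimation | UniTac2Pose: a unified framework learned in simulation for category-level visuotactile in-hand pose estimation. | sim-to-real, feature matching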
