cs.LG (2026-04-07)

📊 109 papers in total

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (48) · Pillar 2: RL Algorithms & Architecture (47) · Pillar 1: Robot Control (7) · Pillar 8: Physics-based Animation (5) · Pillar 4: Generative Motion (1) · Pillar 3: Perception & Semantics (1)

🔬 Pillar 9: Embodied Foundation Models (48 papers)

# | Title | One-line takeaway | Tags
1 | Uncertainty-Aware Foundation Models for Clinical Data | Proposes an uncertainty-aware foundation model for clinical data, improving predictive performance and robustness to missing data. | foundation model, multimodal
2 | Discrete Prototypical Memories for Federated Time Series Foundation Models | Proposes FeDPM, a federated time-series foundation model built on discrete prototypical memories. | large language model, foundation model
3 | The Geometric Alignment Tax: Tokenization vs. Continuous Geometry in Scientific Foundation Models | Exposes a geometric alignment tax in scientific foundation models: discrete tokenization distorts continuous geometric structure. | foundation model
4 | Good Rankings, Wrong Probabilities: A Calibration Audit of Multimodal Cancer Survival Models | A calibration audit of multimodal cancer survival models: ranking performance is good, but probability predictions are miscalibrated. | multimodal
5 | Entropy, Disagreement, and the Limits of Foundation Models in Genomics | Shows how the high entropy of genomic sequences limits the performance of genomic foundation models. | foundation model
6 | SLaB: Sparse-Lowrank-Binary Decomposition for Efficient Large Language Models | Proposes the SLaB framework for efficient deployment of large language models. | large language model
7 | A Clinical Point Cloud Paradigm for In-Hospital Mortality Prediction from Multi-Level Incomplete Multimodal EHRs | Proposes HealthPoint for in-hospital mortality prediction from multi-level, incomplete, multimodal EHRs. | multimodal
8 | A Family of Open Time-Series Foundation Models for the Radio Access Network | Proposes TimeRAN, a family of time-series foundation models for the radio access network that improves multi-task learning performance. | foundation model
9 | MLorc: Momentum Low-rank Compression for Memory Efficient Large Language Model Adaptation | Proposes MLorc, a momentum low-rank compression method for memory-efficient large language model adaptation. | large language model
10 | Bridging the Semantic Gap for Categorical Data Clustering via Large Language Models | Proposes ARISE, which uses large language models to bridge the semantic gap in categorical data clustering. | large language model
11 | LLM-ODE: Data-driven Discovery of Dynamical Systems with Large Language Models | LLM-ODE: data-driven discovery of dynamical-system equations with large language models. | large language model
12 | Spectral Compact Training: Pre-Training Large Language Models via Permanent Truncated SVD and Stiefel QR Retraction | Proposes Spectral Compact Training (SCT), pre-training large language models via permanent truncated SVD and Stiefel QR retraction to sharply cut memory usage. | large language model
13 | InfoTok: Information-Theoretic Regularization for Capacity-Constrained Shared Visual Tokenization in Unified MLLMs | InfoTok: information-theoretic regularization for capacity-constrained shared visual tokenization in unified multimodal LLMs. | large language model, multimodal
14 | Subspace Control: Turning Constrained Model Steering into Controllable Spectral Optimization | Proposes SIFT, resolving conflicting objectives in model steering via subspace control. | large language model, foundation model
15 | Optimizing LLM Prompt Engineering with DSPy Based Declarative Learning | Optimizes LLM prompt engineering with DSPy-based declarative learning, improving factual accuracy and generalization. | large language model, chain-of-thought
16 | ACT: Agentic Classification Tree | Proposes the Agentic Classification Tree (ACT), extending decision-tree methods to classification of unstructured text. | large language model, chain-of-thought
17 | SynQuE: Estimating Synthetic Dataset Quality Without Annotations | SynQuE: estimates synthetic dataset quality without annotations, improving performance on real tasks. | large language model, foundation model
18 | LoFT: Parameter-Efficient Fine-Tuning for Long-tailed Semi-Supervised Learning in Open-World Scenarios | Proposes the LoFT framework for parameter-efficient fine-tuning in long-tailed semi-supervised learning under open-world scenarios. | foundation model
19 | Scaling DPPs for RAG: Density Meets Diversity | Proposes ScalDPP, using determinantal point processes to improve the density and diversity of retrieved results in RAG. | large language model
20 | Scalable Variational Bayesian Fine-Tuning of LLMs via Orthogonalized Low-Rank Adapters | Proposes PoLAR-VBLL, scalable variational Bayesian fine-tuning of LLMs via orthogonalized low-rank adapters, improving uncertainty quantification. | large language model
21 | Beauty in the Eye of AI: Aligning LLMs and Vision Models with Human Aesthetics in Network Visualization | Aligns large language models and vision models with human aesthetics to automate aesthetic assessment in network visualization. | large language model
22 | Automated Conjecture Resolution with Formal Verification | Proposes the Rethlas-Archon framework, combining informal reasoning with formal verification to automatically resolve and verify mathematical conjectures. | large language model
23 | Representational Collapse in Multi-Agent LLM Committees: Measurement and Diversity-Aware Consensus | Proposes the DALC protocol to counter representational collapse in multi-agent LLM committees. | chain-of-thought
24 | Where to Steer: Input-Dependent Layer Selection for Steering Improves LLM Alignment | Proposes W2S, an input-dependent layer-selection strategy for steering that improves LLM alignment. | large language model
25 | Diagonal-Tiled Mixed-Precision Attention for Efficient Low-Bit MXFP Inference | Proposes diagonal-tiled mixed-precision attention to accelerate low-bit MXFP inference for large models. | large language model
26 | Multirate Stein Variational Gradient Descent for Efficient Bayesian Sampling | Proposes a multirate Stein variational gradient descent algorithm, improving the efficiency and robustness of Bayesian sampling. | multimodal
27 | Towards Unveiling Vulnerabilities of Large Reasoning Models in Machine Unlearning | Proposes a novel unlearning attack on large reasoning models, exposing their security vulnerabilities. | large language model
28 | MUXQ: Mixed-to-Uniform Precision MatriX Quantization via Low-Rank Outlier Decomposition | Proposes MUXQ, a mixed-to-uniform precision matrix quantization method based on low-rank outlier decomposition, easing LLM deployment on NPUs. | large language model
29 | Noise Immunity in In-Context Tabular Learning: An Empirical Robustness Analysis of TabPFN's Attention Mechanisms | Shows that TabPFN exhibits noise immunity in in-context tabular learning, with markedly robust attention mechanisms. | foundation model
30 | ENEC: A Lossless AI Model Compression Method Enabling Fast Inference on Ascend NPUs | ENEC: a lossless AI model compression method enabling fast inference on Ascend NPUs. | large language model
31 | LLMs Judging LLMs: A Simplex Perspective | Proposes a geometric Bayesian prior for assessing the output quality of large language models. | large language model
32 | Beyond Linear Steering: Unified Multi-Attribute Control for Language Models | Proposes K-Steering, unifying control over multiple behavioral attributes of language models via nonlinear multi-label classification. | large language model
33 | xRFM: Accurate, scalable, and interpretable feature learning models for tabular data | xRFM: accurate, scalable, and interpretable feature-learning models for tabular data. | foundation model
34 | Fewer Weights, More Problems: A Practical Attack on LLM Pruning | Proposes a practical attack that exposes security risks in LLM pruning. | large language model
35 | From Bits to Chips: An LLM-based Hardware-Aware Quantization Agent for Streamlined Deployment of LLMs | Proposes HAQA, an LLM-based hardware-aware quantization agent that streamlines LLM deployment. | large language model
36 | HOSL: Hybrid-Order Split Learning for Memory-Constrained Edge Training | Proposes HOSL to address memory limits in edge-device training. | large language model
37 | Auxiliary-predicted Compress Memory Model (ApCM Model): A Neural Memory Storage Model Based on Invertible Compression and Learnable Prediction | Proposes the ApCM model, strengthening LLM runtime memory through invertible compression and a learnable prediction mechanism. | large language model
38 | ModalImmune: Immunity Driven Unlearning via Self Destructive Training | Proposes the ModalImmune framework, improving the robustness of multimodal systems under missing modalities. | multimodal
39 | MOOSE-Star: Unlocking Tractable Training for Scientific Discovery by Breaking the Complexity Barrier | Proposes MOOSE-Star to make training tractable for scientific discovery by breaking the complexity barrier. | large language model
40 | See it to Place it: Evolving Macro Placements with Vision-Language Models | Proposes VeoPlace, evolving chip macro placements with vision-language models and markedly reducing wirelength. | foundation model
41 | Rethinking Language Model Scaling under Transferable Hypersphere Optimization | Proposes the HyperP framework, rethinking large language model scaling under transferable hypersphere optimization. | large language model
42 | Stealthy and Adjustable Text-Guided Backdoor Attacks on Multimodal Pretrained Models | Proposes TGB, a text-guided backdoor attack on multimodal pretrained models that improves stealthiness and adjustability. | multimodal
43 | Task Ecologies and the Evolution of World-Tracking Representations in Large Language Models | Studies the emergence of world-tracking representations in language models, relating task ecologies to representational evolution. | large language model
44 | In-Place Test-Time Training | Proposes In-Place TTT, letting LLMs adapt to new information at inference time and improving long-context task performance. | large language model
45 | QiMeng-PRepair: Precise Code Repair via Edit-Aware Reward Optimization | PRepair: precise code repair via edit-aware reward optimization. | large language model
46 | ALTO: Adaptive LoRA Tuning and Orchestration for Heterogeneous LoRA Training Workloads | ALTO: an adaptive tuning and orchestration system for heterogeneous LoRA training workloads. | large language model
47 | Cross-Machine Anomaly Detection Leveraging Pre-trained Time-series Model | Proposes a cross-machine anomaly detection framework leveraging a pre-trained time-series model, improving generalization. | foundation model
48 | LLMs Should Express Uncertainty Explicitly | Proposes an explicit uncertainty-expression interface, improving LLM decision-making and reliability. | large language model
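Several entries in this pillar (MLorc, PoLAR-VBLL, LoFT, ALTO) revolve around low-rank adaptation of large language models. As shared background, here is a minimal sketch of the generic LoRA idea that these methods build on, not any one paper's specific technique: a frozen pretrained weight matrix W is augmented with a trainable low-rank update scaled by alpha / r. All class and variable names here are illustrative.

```python
import numpy as np

class LoRALinear:
    """Generic low-rank adaptation sketch: y = x @ W + (alpha/r) * x @ A @ B.
    W stays frozen; only the small factors A (d_in x r) and B (r x d_out)
    would be trained. Illustrative only; the papers above each refine this."""

    def __init__(self, w, r=4, alpha=8, seed=0):
        rng = np.random.default_rng(seed)
        self.w = w                                  # frozen pretrained weights (d_in, d_out)
        d_in, d_out = w.shape
        self.a = rng.normal(0.0, 0.01, (d_in, r))   # trainable down-projection
        self.b = np.zeros((r, d_out))               # trainable up-projection, zero-init
        self.scale = alpha / r

    def __call__(self, x):
        # Frozen path plus scaled low-rank correction. With B zero-initialized,
        # the adapted layer starts out identical to the pretrained layer.
        return x @ self.w + self.scale * (x @ self.a @ self.b)

w = np.eye(3)
layer = LoRALinear(w)
x = np.array([[1.0, 2.0, 3.0]])
print(np.allclose(layer(x), x @ w))  # True: zero-init B leaves the pretrained map unchanged
```

The appeal for the methods listed above is that only the A and B factors (r * (d_in + d_out) parameters) need optimizer state, which is what compression- and orchestration-oriented variants exploit.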

🔬 Pillar 2: RL Algorithms & Architecture (47 papers)

# | Title | One-line takeaway | Tags
49 | Large Language Model Guided Incentive Aware Reward Design for Cooperative Multi-Agent Reinforcement Learning | Proposes an LLM-guided, incentive-aware reward design framework for cooperative multi-agent reinforcement learning. | reinforcement learning, reward design, large language model
50 | The limits of bio-molecular modeling with large language models: a cross-scale evaluation | BioMol-LLM-Bench: a systematic cross-scale evaluation of large language models for bio-molecular modeling and an analysis of their limits. | Mamba, large language model, chain-of-thought
51 | Delayed Homomorphic Reinforcement Learning for Environments with Delayed Feedback | Proposes Delayed Homomorphic Reinforcement Learning (DHRL) for reinforcement learning under delayed feedback. | reinforcement learning, policy learning, OMOMO
52 | SODA: Semi On-Policy Black-Box Distillation for Large Language Models | SODA: semi on-policy black-box distillation for large language models, improving efficiency and stability. | distillation, large language model
53 | Automated Attention Pattern Discovery at Scale in Large Language Models | Proposes AP-MAE, improving large language model performance via attention-pattern analysis and intervention. | masked autoencoder, MAE, large language model
54 | APPA: Adaptive Preference Pluralistic Alignment for Fair Federated RLHF of LLMs | APPA: adaptive preference pluralistic alignment for fair federated RLHF of LLMs. | reinforcement learning, PPO, RLHF
55 | Not All Tokens Matter: Towards Efficient LLM Reasoning via Token Significance in Reinforcement Learning | Proposes token-significance-based reinforcement learning, improving the efficiency and accuracy of LLM reasoning. | reinforcement learning, large language model, chain-of-thought
56 | Realistic Market Impact Modeling for Reinforcement Learning Trading Environments | Proposes the MACE environment, addressing inadequate market-impact cost modeling in reinforcement learning trading. | reinforcement learning, DRL, PPO
57 | Provable Multi-Task Reinforcement Learning: A Representation Learning Framework with Low Rank Rewards | Proposes a representation learning framework for provable multi-task reinforcement learning with low-rank reward matrices, improving learning efficiency. | reinforcement learning, representation learning
58 | Empowering Power Outage Prediction with Spatially Aware Hybrid Graph Neural Networks and Contrastive Learning | Proposes the SA-HGNN model with contrastive learning, improving the accuracy of power-outage prediction under extreme weather. | predictive model, contrastive learning, spatial relationship
59 | Co-Evolving Latent Action World Models | Proposes CoLA-World, co-evolving latent-action world models to improve video simulation and visual planning. | world model, world models
60 | Making Bias Non-Predictive: Training Robust LLM Reasoning via Reinforcement Learning | Proposes the EIT framework, using reinforcement learning to make LLM reasoning robust to cognitive biases. | reinforcement learning, reward design, large language model
61 | One Model for All: Multi-Objective Controllable Language Models | Proposes Multi-Objective Control (MOC), training a single LLM to produce personalized, preference-controlled outputs. | reinforcement learning, RLHF, HuMoR
62 | DP-OPD: Differentially Private On-Policy Distillation for Language Models | Proposes DP-OPD, reconciling privacy protection with compression efficiency for language models. | distillation, large language model
63 | Stratifying Reinforcement Learning with Signal Temporal Logic | Proposes a reinforcement learning framework stratified by signal temporal logic, improving task planning. | reinforcement learning, deep reinforcement learning, DRL
64 | Causal Process Models: Reframing Dynamic Causal Graph Discovery as a Reinforcement Learning Problem | Proposes causal process models, reframing dynamic causal graph discovery as a reinforcement learning problem. | reinforcement learning, world model, world models
65 | Apriel-1.5-OpenReasoner: RL Post-Training for General-Purpose and Efficient Reasoning | Proposes Apriel-1.5-OpenReasoner, improving general-purpose reasoning ability and efficiency through RL post-training. | reinforcement learning, instruction following, chain-of-thought
66 | Adversarial Robustness of Deep State Space Models for Forecasting | Proposes adversarially robust deep state-space models for forecasting, improving accuracy under malicious perturbations. | SSM, state space model
67 | Geometric Limits of Knowledge Distillation: A Minimum-Width Theorem via Superposition Theory | Proposes a geometric-limits theory explaining performance saturation in knowledge distillation. | distillation
68 | Correcting Source Mismatch in Flow Matching with Radial-Angular Transport | Proposes Radial-Angular Flow Matching (RAFM) to correct source-distribution mismatch in flow matching. | flow matching
69 | Boosted Distributional Reinforcement Learning: Analysis and Healthcare Applications | Proposes the BDRL algorithm, refining distributional reinforcement learning for consistent decisions across heterogeneous populations in healthcare. | reinforcement learning
70 | Isokinetic Flow Matching for Pathwise Straightening of Generative Flows | Proposes Isokinetic Flow Matching, markedly improving fast-sampling efficiency of generative flows via dynamic regularization. | flow matching
71 | Anticipatory Reinforcement Learning: From Generative Path-Laws to Distributional Value Functions | Proposes the ARL framework, tackling reinforcement learning in non-Markovian decision processes via generative path-laws and distributional value functions. | reinforcement learning
72 | Selecting Decision-Relevant Concepts in Reinforcement Learning | Proposes an automatic concept-selection algorithm to improve reinforcement learning decisions. | reinforcement learning
73 | Explainable Autonomous Cyber Defense using Adversarial Multi-Agent Reinforcement Learning | Proposes a Causal Multi-Agent Decision Framework to resolve ambiguity in cyber defense. | reinforcement learning
74 | Generative modeling of granular flow on inclined planes using conditional flow matching | Proposes a conditional flow matching generative model for reconstructing the internal kinematics of granular flow on inclined planes. | flow matching
75 | SLSREC: Self-Supervised Contrastive Learning for Adaptive Fusion of Long- and Short-Term User Interests | SLSRec: self-supervised contrastive learning fusing long- and short-term user interests, improving session-based recommendation. | contrastive learning
76 | EventFlow: Forecasting Temporal Point Processes with Flow Matching | EventFlow: forecasts temporal point processes with flow matching, markedly reducing prediction error. | flow matching
77 | An Information-Theoretic Analysis of OOD Generalization in Meta-Reinforcement Learning | An information-theoretic analysis of, and bounds on, OOD generalization in meta-reinforcement learning. | reinforcement learning
78 | Personalized Federated Distillation Assisted Vehicle Edge Caching Strategy | Proposes a personalized federated-distillation-assisted vehicle edge caching strategy, reducing communication overhead. | distillation
79 | Kinetic-Mamba: Mamba-Assisted Predictions of Stiff Chemical Kinetics | Kinetic-Mamba: predicts stiff chemical kinetics with Mamba, improving combustion-simulation accuracy. | Mamba
80 | NePPO: Near-Potential Policy Optimization for General-Sum Multi-Agent Reinforcement Learning | NePPO: near-potential policy optimization for general-sum multi-agent reinforcement learning. | reinforcement learning
81 | Audio-to-Image Bird Species Retrieval without Audio-Image Pairs via Text Distillation | Proposes text-distillation-based audio-to-image bird species retrieval that needs no paired data. | distillation
82 | Learning from Imperfect Demonstrations via Temporal Behavior Tree-Guided Trajectory Repair | Proposes temporal-behavior-tree-guided trajectory repair to improve robot learning from imperfect demonstrations. | reinforcement learning, policy learning
83 | Restless Bandits with Individual Penalty Constraints: A New Near-Optimal Index Policy and How to Learn It | Proposes a new near-optimal index policy for restless bandits with individual penalty constraints, addressing resource allocation in dynamic wireless networks. | reinforcement learning, deep reinforcement learning
84 | Cog-DRIFT: Exploration on Adaptively Reformulated Instances Enables Learning from Hard Reasoning Problems | Cog-DRIFT adaptively reformulates instances, enabling LLMs to learn from hard reasoning problems. | reinforcement learning, curriculum learning
85 | Hierarchical Contrastive Learning for Multimodal Data | Proposes a Hierarchical Contrastive Learning (HCL) framework, modeling complex inter-modality relationships in multimodal representations. | representation learning, contrastive learning, multimodal
86 | Hidden in the Multiplicative Interaction: Uncovering Fragility in Multimodal Contrastive Learning | Proposes Gated Symile, countering modality unreliability in multimodal contrastive learning and improving retrieval accuracy. | contrastive learning, multimodal
87 | Toward Consistent World Models with Multi-Token Prediction and Latent Semantic Enhancement | Proposes LSE-MTP, improving world-model consistency through multi-token prediction and latent semantic enhancement. | world model, world models, large language model
88 | AttnDiff: Attention-based Differential Fingerprinting for Large Language Models | AttnDiff: attention-based differential fingerprinting for identifying derivation relationships among large language models. | PPO, DPO, large language model
89 | A Mixture of Experts Foundation Model for Scanning Electron Microscopy Image Analysis | Proposes a mixture-of-experts foundation model for scanning electron microscopy image analysis, improving generalization and automation. | representation learning, foundation model
90 | The UNDO Flip-Flop: A Controlled Probe for Reversible Semantic State Management in State Space Model | Proposes the UNDO Flip-Flop task, a controlled probe of reversible semantic state management in state space models. | Mamba, SSM, state space model
91 | Optimal-Transport-Guided Functional Flow Matching for Turbulent Field Generation in Hilbert Space | Proposes optimal-transport-guided functional flow matching for turbulent field generation in Hilbert space. | flow matching, spatiotemporal
92 | Value Mirror Descent for Reinforcement Learning | Proposes value mirror descent to improve value iteration in reinforcement learning. | reinforcement learning
93 | Graph Topology Information Enhanced Heterogeneous Graph Representation Learning | Proposes the ToGRL framework, enhancing heterogeneous graph representations with topology learning and improving downstream task performance. | representation learning
94 | Top-K Retrieval with Fixed-Size Linear-Attention Completion: Backbone- and KV-Format-Preserving Attention for KV-Cache Read Reduction | Proposes Top-K retrieval with fixed-size linear-attention completion, reducing KV-cache reads and improving long-text generation efficiency. | linear attention
95 | Jeffreys Flow: Robust Boltzmann Generators for Rare Event Sampling via Parallel Tempering Distillation | Proposes Jeffreys Flow, countering mode collapse in Boltzmann generators via parallel tempering distillation. | distillation
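Flow matching recurs throughout this pillar (entries 68, 70, 74, 76, 91). As a common reference point, here is a minimal sketch of the standard conditional flow-matching objective these papers build variants on, not any one paper's method: interpolate x_t = (1 - t) * x0 + t * x1 between source and data samples, then regress a velocity model onto the constant target velocity x1 - x0. All function and variable names here are illustrative.

```python
import numpy as np

def cfm_loss(velocity_model, x0, x1, t):
    """Conditional flow matching loss for one batch.
    x0: source samples, x1: data samples, t: per-sample times in [0, 1].
    The model regresses the target velocity (x1 - x0) along the straight-line
    path x_t = (1 - t) * x0 + t * x1. Illustrative sketch only."""
    t = t[:, None]                        # broadcast time over feature dims
    xt = (1.0 - t) * x0 + t * x1          # point on the interpolation path
    target = x1 - x0                      # conditional target velocity field
    pred = velocity_model(xt, t)
    return np.mean((pred - target) ** 2)  # mean-squared regression error

rng = np.random.default_rng(0)
x0 = rng.normal(size=(16, 2))             # e.g. Gaussian source samples
x1 = rng.normal(loc=3.0, size=(16, 2))    # shifted "data" distribution
t = rng.uniform(size=16)

# An oracle that already outputs the target velocity incurs zero loss.
oracle = lambda xt, tt: x1 - x0
print(cfm_loss(oracle, x0, x1, t))  # 0.0
```

Sampling then integrates the learned velocity field from t = 0 to t = 1; the entries above modify the source distribution, the path, or the regularization of this basic recipe.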

🔬 Pillar 1: Robot Control (7 papers)

# | Title | One-line takeaway | Tags
96 | FlashSAC: Fast and Stable Off-Policy Reinforcement Learning for High-Dimensional Robot Control | FlashSAC: a fast and stable off-policy reinforcement learning algorithm for high-dimensional robot control. | humanoid, humanoid locomotion, locomotion
97 | WIMLE: Uncertainty-Aware World Models with IMLE for Sample-Efficient Continuous Control | WIMLE: uncertainty-aware world models with IMLE for sample-efficient continuous control. | humanoid, reinforcement learning, world model
98 | A Multi-Level Causal Intervention Framework for Mechanistic Interpretability in Variational Autoencoders | Proposes a multi-level causal intervention framework for mechanistic interpretability in variational autoencoders. | manipulation, VQ-VAE
99 | Neural Operators for Multi-Task Control and Adaptation | Proposes a neural-operator-based multi-task control framework for efficient task generalization and fast adaptation. | locomotion
100 | Relay-Assisted Activation-Integrated SIM for Wireless Physical Neural Networks | Proposes a relay-assisted, activation-integrated intelligent metasurface (SIM) for wireless physical neural networks. | manipulation
101 | Byzantine-Robust and Differentially Private Federated Optimization under Weaker Assumptions | Proposes Byz-Clip21-SGD2M to address privacy and robustness in federated learning. | manipulation
102 | Convergence of Byzantine-Resilient Gradient Tracking via Probabilistic Edge Dropout | Proposes Byzantine-resilient gradient tracking via probabilistic edge dropout, countering malicious attacks in distributed optimization. | manipulation
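Entry 96 concerns SAC-style off-policy control. For readers skimming the pillar, here is a minimal sketch of the soft Bellman target that the SAC family of algorithms uses for critic updates; this is the generic textbook form, not FlashSAC's specific contribution, and all names are illustrative.

```python
import numpy as np

def soft_td_target(reward, done, q1_next, q2_next, log_pi_next,
                   gamma=0.99, alpha=0.2):
    """Soft Bellman backup used by SAC-style off-policy actor-critic methods:
    y = r + gamma * (1 - done) * (min(Q1', Q2') - alpha * log pi(a'|s')).
    Clipped double-Q (the min) counters value overestimation, and the entropy
    bonus (-alpha * log pi) rewards stochastic exploration. Sketch only."""
    soft_value = np.minimum(q1_next, q2_next) - alpha * log_pi_next
    return reward + gamma * (1.0 - done) * soft_value

# On terminal transitions (done = 1) nothing is bootstrapped: target = reward.
y = soft_td_target(reward=np.array([1.0]), done=np.array([1.0]),
                   q1_next=np.array([5.0]), q2_next=np.array([4.0]),
                   log_pi_next=np.array([-0.5]))
print(y)  # [1.]
```

The critic networks are then regressed toward y, while the actor maximizes the same soft value under its own action samples.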

🔬 Pillar 8: Physics-based Animation (5 papers)

# | Title | One-line takeaway | Tags
103 | Spatiotemporal Interpolation of GEDI Biomass with Calibrated Uncertainty | Proposes an Attentive-Neural-Process-based spatiotemporal interpolation method for GEDI biomass with calibrated uncertainty. | spatiotemporal, foundation model
104 | Spatiotemporal-Aware Bit-Flip Injection on DNN-based Advanced Driver Assistance Systems | Proposes the STAFI framework for probing spatiotemporally sensitive bit-flip faults in DNN-based ADAS. | spatiotemporal
105 | Algebraic Diversity: Group-Theoretic Spectral Estimation from Single Observations | Proposes an algebraic-diversity theory that achieves, from a single observation, spectral estimation equivalent to multiple observations. | PULSE
106 | Deep Gaussian Processes for Functional Maps | Proposes the DGPFM model, handling nonlinear relationships and uncertainty quantification in function-space mappings. | spatiotemporal
107 | Topological Characterization of Churn Flow and Unsupervised Correction to the Wu Flow-Regime Map in Small-Diameter Vertical Pipes | Proposes a topology-based flow-regime identification method that corrects the existing flow-regime map without labeled data. | AMP

🔬 Pillar 4: Generative Motion (1 paper)

# | Title | One-line takeaway | Tags
108 | Collapse-Free Prototype Readout Layer for Transformer Encoders | Proposes DDCL-Attention, a collapse-free prototype readout layer for Transformer encoders. | VQ-VAE

🔬 Pillar 3: Perception & Semantics (1 paper)

# | Title | One-line takeaway | Tags
109 | Three-dimensional inversion of gravity data using implicit neural representations and scientific machine learning | Proposes implicit neural representations for three-dimensional inversion of gravity data. | implicit representation
