| 49 |
Large Language Model Guided Incentive Aware Reward Design for Cooperative Multi-Agent Reinforcement Learning |
Proposes an LLM-guided, incentive-aware reward design framework to improve cooperative multi-agent reinforcement learning |
reinforcement learning reward design large language model |
|
|
| 50 |
The limits of bio-molecular modeling with large language models : a cross-scale evaluation |
BioMol-LLM-Bench: a systematic cross-scale evaluation of large language model capabilities and limitations in bio-molecular modeling |
Mamba large language model chain-of-thought |
|
|
| 51 |
Delayed Homomorphic Reinforcement Learning for Environments with Delayed Feedback |
Proposes Delayed Homomorphic Reinforcement Learning (DHRL), a framework for reinforcement learning in environments with delayed feedback. |
reinforcement learning policy learning OMOMO |
|
|
| 52 |
SODA: Semi On-Policy Black-Box Distillation for Large Language Models |
SODA: a semi on-policy black-box distillation method for large language models, improving efficiency and stability. |
distillation large language model |
|
|
| 53 |
Automated Attention Pattern Discovery at Scale in Large Language Models |
Proposes AP-MAE, which improves large language model performance through attention-pattern analysis and intervention |
masked autoencoder MAE large language model |
|
|
| 54 |
APPA: Adaptive Preference Pluralistic Alignment for Fair Federated RLHF of LLMs |
APPA: adaptive preference pluralistic alignment for fair federated RLHF of LLMs |
reinforcement learning PPO RLHF |
|
|
| 55 |
Not All Tokens Matter: Towards Efficient LLM Reasoning via Token Significance in Reinforcement Learning |
Proposes a reinforcement learning method based on token significance, improving LLM reasoning efficiency and accuracy |
reinforcement learning large language model chain-of-thought |
|
|
| 56 |
Realistic Market Impact Modeling for Reinforcement Learning Trading Environments |
Proposes the MACE environment, addressing the lack of realistic market-impact cost modeling in reinforcement learning trading |
reinforcement learning DRL PPO |
|
|
| 57 |
Provable Multi-Task Reinforcement Learning: A Representation Learning Framework with Low Rank Rewards |
Proposes a representation learning framework for multi-task reinforcement learning based on low-rank reward matrices, improving learning efficiency. |
reinforcement learning representation learning |
|
|
| 58 |
Empowering Power Outage Prediction with Spatially Aware Hybrid Graph Neural Networks and Contrastive Learning |
Proposes the SA-HGNN model combined with contrastive learning, improving the accuracy of power outage prediction under extreme weather. |
predictive model contrastive learning spatial relationship |
|
|
| 59 |
Co-Evolving Latent Action World Models |
Proposes CoLA-World, which co-evolves latent action world models to improve video simulation and visual planning. |
world model world models |
|
|
| 60 |
Making Bias Non-Predictive: Training Robust LLM Reasoning via Reinforcement Learning |
Proposes the EIT framework, using reinforcement learning to make LLM reasoning robust against cognitive biases |
reinforcement learning reward design large language model |
|
|
| 61 |
One Model for All: Multi-Objective Controllable Language Models |
Proposes Multi-Objective Control (MOC), training a single LLM to produce personalized outputs controlled by user preferences. |
reinforcement learning RLHF HuMoR |
|
|
| 62 |
DP-OPD: Differentially Private On-Policy Distillation for Language Models |
Proposes DP-OPD to resolve the tension between privacy protection and compression efficiency in language models |
distillation large language model |
|
|
| 63 |
Stratifying Reinforcement Learning with Signal Temporal Logic |
Proposes a reinforcement learning framework based on stratified signal temporal logic, improving task-planning capability. |
reinforcement learning deep reinforcement learning DRL |
|
|
| 64 |
Causal Process Models: Reframing Dynamic Causal Graph Discovery as a Reinforcement Learning Problem |
Proposes Causal Process Models, reframing dynamic causal graph discovery as a reinforcement learning problem |
reinforcement learning world model world models |
|
|
| 65 |
Apriel-1.5-OpenReasoner: RL Post-Training for General-Purpose and Efficient Reasoning |
Introduces Apriel-1.5-OpenReasoner, which improves general-purpose reasoning ability and efficiency via reinforcement learning post-training. |
reinforcement learning instruction following chain-of-thought |
|
|
| 66 |
Adversarial Robustness of Deep State Space Models for Forecasting |
Proposes adversarially robust deep state space models for time-series forecasting, improving prediction accuracy under malicious perturbations. |
SSM state space model |
|
|
| 67 |
Geometric Limits of Knowledge Distillation: A Minimum-Width Theorem via Superposition Theory |
Proposes a theory of geometric limits to explain performance saturation in knowledge distillation |
distillation |
|
|
| 68 |
Correcting Source Mismatch in Flow Matching with Radial-Angular Transport |
Proposes Radial-Angular Flow Matching (RAFM) to correct source-distribution mismatch in flow matching |
flow matching |
|
|
| 69 |
Boosted Distributional Reinforcement Learning: Analysis and Healthcare Applications |
Proposes the BDRL algorithm, improving distributional reinforcement learning to address consistency across heterogeneous populations in healthcare decision-making. |
reinforcement learning |
|
|
| 70 |
Isokinetic Flow Matching for Pathwise Straightening of Generative Flows |
Proposes Isokinetic Flow Matching, which significantly improves the fast-sampling efficiency of generative flows via dynamic regularization. |
flow matching |
|
|
| 71 |
Anticipatory Reinforcement Learning: From Generative Path-Laws to Distributional Value Functions |
Proposes the ARL framework, using generative path-laws and distributional value functions to tackle reinforcement learning in non-Markovian decision processes. |
reinforcement learning |
|
|
| 72 |
Selecting Decision-Relevant Concepts in Reinforcement Learning |
Proposes an automatic concept-selection algorithm to improve reinforcement learning decision-making |
reinforcement learning |
|
|
| 73 |
Explainable Autonomous Cyber Defense using Adversarial Multi-Agent Reinforcement Learning |
Proposes a Causal Multi-Agent Decision Framework to address ambiguity in cyber defense |
reinforcement learning |
|
|
| 74 |
Generative modeling of granular flow on inclined planes using conditional flow matching |
Proposes a generative model based on conditional flow matching for reconstructing the internal kinematics of granular flow on inclined planes. |
flow matching |
|
|
| 75 |
SLSREC: Self-Supervised Contrastive Learning for Adaptive Fusion of Long- and Short-Term User Interests |
SLSRec: self-supervised contrastive learning for fusing long- and short-term user interests, improving session-based recommendation |
contrastive learning |
|
|
| 76 |
EventFlow: Forecasting Temporal Point Processes with Flow Matching |
EventFlow: forecasting temporal point processes with flow matching, significantly reducing prediction error. |
flow matching |
|
|
| 77 |
An Information-Theoretic Analysis of OOD Generalization in Meta-Reinforcement Learning |
An information-theoretic analysis of, and bounds for, OOD generalization in meta-reinforcement learning |
reinforcement learning |
|
|
| 78 |
Personalized Federated Distillation Assisted Vehicle Edge Caching Strategy |
Proposes a personalized federated distillation assisted vehicle edge caching strategy, reducing communication overhead. |
distillation |
|
|
| 79 |
Kinetic-Mamba: Mamba-Assisted Predictions of Stiff Chemical Kinetics |
Kinetic-Mamba: Mamba-assisted prediction of stiff chemical kinetics, improving combustion simulation accuracy. |
Mamba |
|
|
| 80 |
NePPO: Near-Potential Policy Optimization for General-Sum Multi-Agent Reinforcement Learning |
NePPO: near-potential policy optimization for general-sum multi-agent reinforcement learning |
reinforcement learning |
|
|
| 81 |
Audio-to-Image Bird Species Retrieval without Audio-Image Pairs via Text Distillation |
Proposes a text-distillation-based audio-to-image bird species retrieval method that requires no paired data. |
distillation |
|
|
| 82 |
Learning from Imperfect Demonstrations via Temporal Behavior Tree-Guided Trajectory Repair |
Proposes a temporal behavior tree guided trajectory repair method to improve robot learning |
reinforcement learning policy learning |
|
|
| 83 |
Restless Bandits with Individual Penalty Constraints: A New Near-Optimal Index Policy and How to Learn It |
Proposes a new policy for restless bandits with individual penalty constraints, addressing resource allocation in dynamic wireless networks |
reinforcement learning deep reinforcement learning |
|
|
| 84 |
Cog-DRIFT: Exploration on Adaptively Reformulated Instances Enables Learning from Hard Reasoning Problems |
Cog-DRIFT enables LLMs to learn from hard reasoning problems by adaptively reformulating instances. |
reinforcement learning curriculum learning |
|
|
| 85 |
Hierarchical Contrastive Learning for Multimodal Data |
Proposes a Hierarchical Contrastive Learning (HCL) framework to model complex inter-modality relationships in multimodal data representation. |
representation learning contrastive learning multimodal |
|
|
| 86 |
Hidden in the Multiplicative Interaction: Uncovering Fragility in Multimodal Contrastive Learning |
Proposes Gated Symile, addressing modality unreliability in multimodal contrastive learning and improving retrieval accuracy. |
contrastive learning multimodal |
|
|
| 87 |
Toward Consistent World Models with Multi-Token Prediction and Latent Semantic Enhancement |
Proposes LSE-MTP, improving world-model consistency via multi-token prediction and latent semantic enhancement |
world model world models large language model |
|
|
| 88 |
AttnDiff: Attention-based Differential Fingerprinting for Large Language Models |
AttnDiff: attention-based differential fingerprinting for identifying derivation relationships among large language models |
PPO DPO large language model |
|
|
| 89 |
A Mixture of Experts Foundation Model for Scanning Electron Microscopy Image Analysis |
Proposes a mixture-of-experts foundation model for scanning electron microscopy image analysis, improving generalization and automation. |
representation learning foundation model |
|
|
| 90 |
The UNDO Flip-Flop: A Controlled Probe for Reversible Semantic State Management in State Space Model |
Proposes the UNDO Flip-Flop task for evaluating reversible semantic state management in state space models. |
Mamba SSM state space model |
|
|
| 91 |
Optimal-Transport-Guided Functional Flow Matching for Turbulent Field Generation in Hilbert Space |
Proposes an optimal-transport-guided functional flow matching method for turbulent field generation in Hilbert space. |
flow matching spatiotemporal |
|
|
| 92 |
Value Mirror Descent for Reinforcement Learning |
Proposes Value Mirror Descent to improve value iteration in reinforcement learning |
reinforcement learning |
|
|
| 93 |
Graph Topology Information Enhanced Heterogeneous Graph Representation Learning |
Proposes the ToGRL framework, enhancing heterogeneous graph representations via topology learning to improve downstream task performance |
representation learning |
|
|
| 94 |
Top-K Retrieval with Fixed-Size Linear-Attention Completion: Backbone- and KV-Format-Preserving Attention for KV-Cache Read Reduction |
Proposes Top-K retrieval with fixed-size linear-attention completion, reducing KV-cache reads and improving long-text generation efficiency. |
linear attention |
|
|
| 95 |
Jeffreys Flow: Robust Boltzmann Generators for Rare Event Sampling via Parallel Tempering Distillation |
Proposes Jeffreys Flow, addressing mode collapse in Boltzmann generators via parallel tempering distillation |
distillation |
|
|