cs.LG (2025-06-27)

📊 23 papers total | 🔗 5 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (14 🔗3) | Pillar 9: Embodied Foundation Models (6 🔗2) | Pillar 1: Robot Control (2) | Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL & Architecture (14 papers)

#	Title	Key Point	Tags	🔗	⭐
1	The Hidden Link Between RLHF and Contrastive Learning	Proposes a mutual-information optimization method to improve reinforcement learning from human feedback	reinforcement learning RLHF DPO
2	Hyper-modal Imputation Diffusion Embedding with Dual-Distillation for Federated Multimodal Knowledge Graph Completion	Proposes MMFeD3-HidE for federated multimodal knowledge graph completion	distillation multimodal
3	Frequency-Aligned Knowledge Distillation for Lightweight Spatiotemporal Forecasting	Proposes frequency-aligned knowledge distillation for lightweight spatiotemporal forecasting	MAE distillation spatiotemporal	✅
4	TROFI: Trajectory-Ranked Offline Inverse Reinforcement Learning	Proposes TROFI to address the missing reward function in offline reinforcement learning	reinforcement learning offline reinforcement learning inverse reinforcement learning
5	EFRame: Deeper Reasoning via Exploration-Filter-Replay Reinforcement Learning Framework	Proposes the EFRame framework to address GRPO's shortcomings on complex reasoning tasks	reinforcement learning PPO large language model	✅
6	Layer Importance for Mathematical Reasoning is Forged in Pre-Training and Invariant after Post-Training	Proposes a layer-importance analysis to improve mathematical reasoning	reinforcement learning distillation large language model
7	TOAST: Task-Oriented Adaptive Semantic Transmission over Dynamic Wireless Environments	Proposes the TOAST framework for multi-task optimization in dynamic wireless environments	reinforcement learning deep reinforcement learning PULSE
8	Reinforcement Learning with Physics-Informed Symbolic Program Priors for Zero-Shot Wireless Indoor Navigation	Proposes a reinforcement learning framework with physics-informed symbolic program priors for zero-shot indoor navigation	reinforcement learning
9	SceneDiffuser++: City-Scale Traffic Simulation via a Generative World Model	Proposes SceneDiffuser++ for city-scale traffic simulation	world model
10	MetaCipher: A Time-Persistent and Universal Multi-Agent Framework for Cipher-Based Jailbreak Attacks for LLMs	Proposes MetaCipher for low-cost multi-agent jailbreak attacks on LLMs	reinforcement learning large language model
11	Smooth-Distill: A Self-distillation Framework for Multitask Learning with Wearable Sensor Data	Proposes the Smooth-Distill framework for multitask learning on wearable sensor data	distillation	✅
12	Advancements and Challenges in Continual Reinforcement Learning: A Comprehensive Review	Reviews advances and challenges in continual reinforcement learning, with a view to improving dynamic learning capabilities	reinforcement learning
13	A Survey of Continual Reinforcement Learning	Surveys continual reinforcement learning methods for knowledge retention in dynamic environments	reinforcement learning
14	Unfolding Generative Flows with Koopman Operators: Fast and Interpretable Sampling	Proposes a Koopman-operator-based unfolding of generative flows to accelerate sampling	flow matching distillation

🔬 Pillar 9: Embodied Foundation Models (6 papers)

#	Title	Key Point	Tags	🔗	⭐
15	XxaCT-NN: Structure Agnostic Multimodal Learning for Materials Science	Proposes XxaCT-NN to address structure dependence in materials science	foundation model multimodal
16	UniCA: Adapting Time Series Foundation Model to General Covariate-Aware Forecasting	Proposes UniCA to address covariate adaptation in time series forecasting	foundation model multimodal	✅
17	Sheaf-Based Decentralized Multimodal Learning for Next-Generation Wireless Communication Systems	Proposes Sheaf-DMFL for collaborative learning on multimodal data	multimodal
18	OptScale: Probabilistic Optimality for Inference-time Scaling	Proposes OptScale to address the efficiency of inference-time scaling	large language model
19	Projected Compression: Trainable Projection for Efficient Transformer Compression	Proposes Projected Compression for Transformer model compression	large language model
20	GPAS: Accelerating Convergence of LLM Pretraining via Gradient-Preserving Activation Scaling	Proposes GPAS to address activation variance in large language model pretraining	large language model	✅

🔬 Pillar 1: Robot Control (2 papers)

#	Title	Key Point	Tags	🔗	⭐
21	ARMOR: Robust Reinforcement Learning-based Control for UAVs under Physical Attacks	Proposes ARMOR for UAV control under physical attacks	manipulation reinforcement learning privileged information
22	Earthquake Damage Grades Prediction using An Ensemble Approach Integrating Advanced Machine and Deep Learning Models	Proposes an earthquake damage grade prediction method that ensembles advanced machine learning and deep learning models	manipulation

🔬 Pillar 8: Physics-based Animation (1 paper)

#	Title	Key Point	Tags	🔗	⭐
23	Hitchhiking Rides Dataset: Two decades of crowd-sourced records on stochastic traveling	Presents a hitchhiking dataset for studying stochastic travel	spatiotemporal
