cs.LG (2025-08-19)

📊 25 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (13 🔗1) · Pillar 9: Embodied Foundation Models (9 🔗1) · Pillar 8: Physics-based Animation (1) · Pillar 6: Video Extraction (1) · Pillar 5: Interaction & Reaction (1)

🔬 Pillar 2: RL & Architecture (13 papers)

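| # | Title | One-line summary | Tags |
|---|-------|------------------|------|
| 1 | Categorical Policies: Multimodal Policy Learning and Exploration in Continuous Control | Proposes categorical policies to address multimodal exploration in continuous control | reinforcement learning, deep reinforcement learning, policy learning |
| 2 | Revisiting Diffusion Q-Learning: From Iterative Denoising to One-Step Action Generation | Proposes One-Step Flow Q-Learning to address the inefficient training and inference of DQL | reinforcement learning, offline reinforcement learning, diffusion policy |
| 3 | Your Reward Function for RL is Your Best PRM for Search: Unifying RL and Search-Based TTS | Proposes AIRL-S to unify reinforcement learning and search-based test-time scaling | reinforcement learning, inverse reinforcement learning, large language model |
| 4 | Depth-Breadth Synergy in RLVR: Unlocking LLM Reasoning Gains with Adaptive Exploration | Proposes DARS to address depth and breadth exploration in RLVR | reinforcement learning, PPO, large language model |
| 5 | MuFlex: A Scalable, Physics-based Platform for Multi-Building Flexibility Analysis and Coordination | Proposes MuFlex to address multi-building flexibility coordination | reinforcement learning, SAC, penetration |
| 6 | Ethics-Aware Safe Reinforcement Learning for Rare-Event Risk Control in Interactive Urban Driving | Proposes an ethics-aware safe reinforcement learning framework for rare-event risk control in urban driving | reinforcement learning |
| 7 | Convergent Reinforcement Learning Algorithms for Stochastic Shortest Path Problem | Proposes convergent reinforcement learning algorithms for the stochastic shortest path problem | reinforcement learning |
| 8 | Reinforcement Learning-based Adaptive Path Selection for Programmable Networks | Proposes reinforcement-learning-based adaptive path selection to optimize programmable networks | reinforcement learning |
| 9 | MACTAS: Self-Attention-Based Module for Inter-Agent Communication in Multi-Agent Reinforcement Learning | Proposes a self-attention module to improve communication efficiency in multi-agent reinforcement learning | reinforcement learning |
| 10 | A Generalized Learning Framework for Self-Supervised Contrastive Learning | Proposes a generalized learning framework to address the constraint problem in self-supervised contrastive learning | contrastive learning |
| 11 | EventTSF: Event-Aware Non-Stationary Time Series Forecasting | Proposes EventTSF to address multimodal non-stationary time series forecasting | flow matching, multimodal |
| 12 | Formal Algorithms for Model Efficiency | Proposes the KMR framework to unify efficiency optimization methods for deep learning models | policy learning, distillation |
| 13 | Towards Agent-based Test Support Systems: An Unsupervised Environment Design Approach | Proposes an agent-based test support system to address sensor placement in dynamic environments | reinforcement learning, curriculum learning |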

🔬 Pillar 9: Embodied Foundation Models (9 papers)

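| # | Title | One-line summary | Tags |
|---|-------|------------------|------|
| 14 | Amortized Bayesian Meta-Learning for Low-Rank Adaptation of Large Language Models | Proposes ABMLL to address generalization in low-rank adaptation of large language models | large language model |
| 15 | EmoSLLM: Parameter-Efficient Adaptation of LLMs for Speech Emotion Recognition | Proposes EmoSLLM to address speech emotion recognition efficiently | large language model, multimodal |
| 16 | NovoMolGen: Rethinking Molecular Language Model Pretraining | Proposes NovoMolGen to improve the efficiency and quality of molecular generation | large language model, foundation model |
| 17 | GLASS: Test-Time Acceleration for LLMs via Global-Local Neural Importance Aggregation | Proposes A/I-GLASS to address dynamic pruning of LLMs on edge hardware | large language model |
| 18 | Powering Job Search at Scale: LLM-Enhanced Query Understanding in Job Matching Systems | Proposes a unified query understanding framework to improve job matching systems | large language model |
| 19 | RewardRank: Optimizing True Learning-to-Rank Utility | Proposes RewardRank to optimize true learning-to-rank utility | large language model |
| 20 | MCPTox: A Benchmark for Tool Poisoning Attack on Real-World MCP Servers | Proposes the MCPTox benchmark to evaluate the impact of tool poisoning attacks on MCP servers | instruction following |
| 21 | Input-Time Scaling | Proposes Input-Time Scaling to improve large language model performance | large language model |
| 22 | AdapSNE: Adaptive Fireworks-Optimized and Entropy-Guided Dataset Sampling for Edge DNN Training | Proposes AdapSNE to address dataset sampling for DNN training on edge devices | large language model |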

🔬 Pillar 8: Physics-based Animation (1 paper)

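| # | Title | One-line summary | Tags |
|---|-------|------------------|------|
| 23 | Know Me by My Pulse: Toward Practical Continuous Authentication on Wearable Devices via Wrist-Worn PPG | Proposes a continuous authentication method based on low-frequency PPG signals to address the security of wearable devices | PULSE |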

🔬 Pillar 6: Video Extraction (1 paper)

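| # | Title | One-line summary | Tags |
|---|-------|------------------|------|
| 24 | MAVIS: Multi-Objective Alignment via Value-Guided Inference-Time Search | Proposes MAVIS to address multi-objective alignment | HuMoR, large language model |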

🔬 Pillar 5: Interaction & Reaction (1 paper)

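| # | Title | One-line summary | Tags |
|---|-------|------------------|------|
| 25 | Trans-XFed: An Explainable Federated Learning for Supply Chain Credit Assessment | Proposes Trans-XFed to address privacy and explainability in supply chain credit assessment | OMOMO |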
