cs.LG(2025-09-11)

📊 17 papers total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (11) · Pillar 9: Embodied Foundation Models (5 🔗1) · Pillar 7: Motion Retargeting (1)

🔬 Pillar 2: RL & Architecture (11 papers)

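| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 1 | Feasibility-Guided Fair Adaptive Offline Reinforcement Learning for Medicaid Care Management | Proposes feasibility-guided, fair, adaptive offline reinforcement learning to improve Medicaid care management. | reinforcement learning, offline RL, offline reinforcement learning | |
| 2 | Hybrid Adaptive Conformal Offline Reinforcement Learning for Fair Population Health Management | Proposes HACO, a hybrid adaptive conformal offline reinforcement learning framework for fair population health management. | reinforcement learning, offline RL, offline reinforcement learning | |
| 3 | Vejde: A Framework for Inductive Deep Reinforcement Learning Based on Factor Graph Color Refinement | Proposes the Vejde framework for decision-making problems with complex states. | reinforcement learning, deep reinforcement learning | |
| 4 | Quantum-Enhanced Forecasting for Deep Reinforcement Learning in Algorithmic Trading | Proposes an algorithmic trading method based on quantum-enhanced deep reinforcement learning that improves returns in forex trading. | reinforcement learning, deep reinforcement learning | |
| 5 | Revisiting Actor-Critic Methods in Discrete Action Off-Policy Reinforcement Learning | Decouples actor and critic entropy regularization to improve discrete-action off-policy reinforcement learning. | reinforcement learning, PPO, SAC | |
| 6 | Meta-Learning Reinforcement Learning for Crypto-Return Prediction | Proposes Meta-RL-Crypto, a self-improving trading agent for cryptocurrency return prediction. | reinforcement learning, multimodal | |
| 7 | Harnessing Uncertainty: Entropy-Modulated Policy Gradients for Long-Horizon LLM Agents | Proposes Entropy-Modulated Policy Gradients (EMPG) to improve LLM agent performance on long-horizon tasks. | reinforcement learning, inverse reinforcement learning, large language model | |
| 8 | Continuous-Time Value Iteration for Multi-Agent Reinforcement Learning | Proposes a continuous-time multi-agent reinforcement learning framework based on physics-informed neural networks, addressing policy training in high-dimensional dynamical systems. | reinforcement learning, policy learning | |
| 9 | Finite Scalar Quantization Enables Redundant and Transmission-Robust Neural Audio Compression at Low Bit-rates | NeuCodec: a robust neural audio codec based on finite scalar quantization. | distillation, large language model | |
| 10 | Incentivizing Safer Actions in Policy Optimization for Constrained Reinforcement Learning | Proposes the IP3O algorithm to address safety in constrained reinforcement learning. | reinforcement learning | |
| 11 | Clip Your Sequences Fairly: Enforcing Length Fairness for Sequence-Level RL | Proposes FSPO, which uses length-fair clipping to address length bias in sequence-level reinforcement learning. | reinforcement learning, PPO | |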

🔬 Pillar 9: Embodied Foundation Models (5 papers)

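| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 12 | Safe-SAIL: Towards a Fine-grained Safety Landscape of Large Language Models via Sparse Autoencoder Interpretation Framework | Safe-SAIL: fine-grained safety analysis of large language models via a sparse autoencoder interpretation framework. | large language model | |
| 13 | Sensitivity-LoRA: Low-Load Sensitivity-Based Fine-Tuning for Large Language Models | Proposes Sensitivity-LoRA, which dynamically adjusts LoRA rank based on sensitivity to fine-tune large language models efficiently. | large language model | |
| 14 | Latency and Token-Aware Test-Time Compute | Proposes a latency- and token-aware framework for dynamically allocating test-time compute to optimize LLM inference. | large language model | |
| 15 | One Head, Many Models: Cross-Attention Routing for Cost-Aware LLM Selection | Proposes an LLM selection framework based on cross-attention routing to optimize cost-effectiveness. | large language model | |
| 16 | ButterflyQuant: Ultra-low-bit LLM Quantization through Learnable Orthogonal Butterfly Transforms | Proposes ButterflyQuant to address large language model quantization. | large language model | |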

🔬 Pillar 7: Motion Retargeting (1 paper)

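| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 17 | Graph Alignment via Dual-Pass Spectral Encoding and Latent Space Communication | Proposes a graph alignment framework with dual-pass spectral encoding and latent space communication, improving node distinctiveness while ensuring geometric consistency. | geometric consistency | |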
