| 1 |
Foundation Models as World Models: A Foundational Study in Text-Based GridWorlds |
Proposes foundation-model-based world models and agents to improve reinforcement learning efficiency in text-based grid worlds. |
reinforcement learning world model large language model |
|
|
| 2 |
Estimating Clinical Lab Test Result Trajectories from PPG using Physiological Foundation Model and Patient-Aware State Space Model -- a UNIPHY+ Approach |
UNIPHY+Lab: uses PPG and a physiological foundation model to predict continuous biochemical markers for ICU patients |
Mamba state space model MAE |
|
|
| 3 |
Polynomial Contrastive Learning for Privacy-Preserving Representation Learning on Graphs |
Proposes Poly-GRACE, enabling homomorphic-encryption-friendly self-supervised representation learning for graph neural networks |
representation learning contrastive learning OMOMO |
|
|
| 4 |
Optimizing Product Deduplication in E-Commerce with Multimodal Embeddings |
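Proposes a multimodal-embedding-based product deduplication method for e-commerce that improves deduplication precision on large-scale product catalogs. |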
masked autoencoder multimodal |
|
|
| 5 |
MTS-DMAE: Dual-Masked Autoencoder for Unsupervised Multivariate Time Series Representation Learning |
Proposes DMAE, a dual-masked autoencoder for unsupervised multivariate time series representation learning |
representation learning masked autoencoder |
|
|
| 6 |
Test-Time Learning and Inference-Time Deliberation for Efficiency-First Offline Reinforcement Learning in Care Coordination and Population Health Management |
Proposes the TTL+ITD method for efficient, auditable offline reinforcement learning in care coordination. |
reinforcement learning offline reinforcement learning |
|
|
| 7 |
Rethinking Molecule Synthesizability with Chain-of-Reaction |
ReaSyn: uses chains of reactions to address the poor synthesizability of molecules produced by generative models |
reinforcement learning large language model chain-of-thought |
|
|
| 8 |
Mental Accounts for Actions: EWA-Inspired Attention in Decision Transformers |
EWA-VQ-ODT: uses Experience-Weighted Attraction to improve the sample efficiency of Online Decision Transformers |
reinforcement learning decision transformer reward shaping |
|
|
| 9 |
Automated Cyber Defense with Generalizable Graph-based Reinforcement Learning Agents |
Proposes generalizable graph-based reinforcement learning agents for automated cyber defense. |
reinforcement learning deep reinforcement learning |
|
|
| 10 |
Fully Decentralized Cooperative Multi-Agent Reinforcement Learning is A Context Modeling Problem |
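Proposes the Dynamics-Aware Context (DAC) method to address non-stationarity and over-generalization in fully decentralized cooperative multi-agent reinforcement learning |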
reinforcement learning policy learning |
|
|
| 11 |
DiffusionNFT: Online Diffusion Reinforcement with Forward Process |
Proposes DiffusionNFT, which optimizes diffusion models via the forward process for efficient online reinforcement learning. |
reinforcement learning flow matching classifier-free guidance |
|
|
| 12 |
Uncertainty-Based Smooth Policy Regularisation for Reinforcement Learning with Few Demonstrations |
SPReD: uncertainty-based smooth policy regularisation that improves reinforcement learning with few demonstrations |
reinforcement learning |
✅ |
|
| 13 |
RLinf: Flexible and Efficient Large-scale Reinforcement Learning via Macro-to-Micro Flow Transformation |
RLinf: flexible and efficient large-scale reinforcement learning via macro-to-micro flow transformation |
reinforcement learning |
|
|
| 14 |
RMT-KD: Random Matrix Theoretic Causal Knowledge Distillation |
Proposes RMT-KD to address deep learning model compression |
distillation |
|
|
| 15 |
Nonconvex Regularization for Feature Selection in Reinforcement Learning |
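Proposes a feature selection algorithm for reinforcement learning based on nonconvex regularization, improving performance in high-noise environments. |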
reinforcement learning |
|
|
| 16 |
Inverse Optimization Latent Variable Models for Learning Costs Applied to Route Problems |
Proposes the Inverse Optimization Latent Variable Model (IO-LVM) for learning distributions of cost functions in route planning problems. |
reinforcement learning inverse reinforcement learning |
|
|
| 17 |
HyP-ASO: A Hybrid Policy-based Adaptive Search Optimization Framework for Large-Scale Integer Linear Programs |
HyP-ASO: a hybrid policy-based adaptive search optimization framework for solving large-scale integer linear programs |
reinforcement learning deep reinforcement learning |
|
|
| 18 |
Learning to Optimize Capacity Planning in Semiconductor Manufacturing |
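Proposes a deep reinforcement learning model based on heterogeneous graph neural networks to optimize capacity planning in semiconductor manufacturing. |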
reinforcement learning deep reinforcement learning |
|
|