| 1 |
Greener Deep Reinforcement Learning: Analysis of Energy and Carbon Efficiency Across Atari Benchmarks |
Analyzes the energy and carbon efficiency of deep reinforcement learning on Atari benchmarks, providing a baseline for green DRL. |
reinforcement learning deep reinforcement learning DRL |
|
|
| 2 |
Deep Reinforcement Learning for Ranking Utility Tuning in the Ad Recommender System at Pinterest |
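Proposes the DRL-PUT framework, which uses deep reinforcement learning to optimize the ranking utility function in Pinterest's ad recommender system. |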
reinforcement learning deep reinforcement learning DRL |
|
|
| 3 |
FinXplore: An Adaptive Deep Reinforcement Learning Framework for Balancing and Discovering Investment Opportunities |
FinXplore: an adaptive deep reinforcement learning framework for balancing and discovering investment opportunities. |
reinforcement learning deep reinforcement learning DRL |
|
|
| 4 |
Beyond I-Con: Exploring New Dimension of Distance Measures in Representation Learning |
Beyond I-Con: explores new dimensions of distance measures in representation learning to improve clustering and dimensionality reduction. |
representation learning contrastive learning |
|
|
| 5 |
Self-Aligned Reward: Towards Effective and Efficient Reasoners |
Proposes Self-Aligned Reward (SAR) to improve the efficiency and accuracy of LLM reasoning. |
reinforcement learning PPO large language model |
|
|
| 6 |
An Arbitration Control for an Ensemble of Diversified DQN variants in Continual Reinforcement Learning |
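Proposes ACED-DQN, which applies arbitration control to an ensemble of diversified DQN variants to address catastrophic forgetting in continual reinforcement learning. |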
reinforcement learning deep reinforcement learning |
|
|
| 7 |
MambaLite-Micro: Memory-Optimized Mamba Inference on MCUs |
MambaLite-Micro: a memory-optimized Mamba inference engine for MCUs. |
Mamba |
|
|
| 8 |
PLanTS: Periodicity-aware Latent-state Representation Learning for Multivariate Time Series |
PLanTS: proposes a periodicity-aware latent-state representation learning framework for multivariate time series analysis. |
representation learning |
|
|
| 9 |
SpikingBrain: Spiking Brain-inspired Large Models |
SpikingBrain: brain-inspired large models that improve long-text processing efficiency and reduce power consumption. |
linear attention large language model |
|
|
| 10 |
Shift Before You Learn: Enabling Low-Rank Representations in Reinforcement Learning |
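Proposes a low-rank reinforcement learning method based on shifted successor measures, improving goal-conditioned reinforcement learning performance. |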
reinforcement learning |
|
|
| 11 |
Pre-Forgettable Models: Prompt Learning as a Native Mechanism for Unlearning |
Proposes pre-forgettable models based on prompt learning, enabling efficient and secure knowledge removal. |
distillation foundation model |
|
|
| 12 |
Topology-Aware Graph Reinforcement Learning for Dynamic Routing in Cloud Networks |
Proposes topology-aware graph reinforcement learning to solve dynamic routing optimization in cloud networks. |
reinforcement learning |
|
|