| 1 |
Splitwise: Collaborative Edge-Cloud Inference for LLMs via Lyapunov-Assisted DRL |
Splitwise: adaptive partitioning of LLMs for collaborative edge-cloud inference, using DRL based on Lyapunov optimization. |
reinforcement learning, deep reinforcement learning, DRL |
|
|
| 2 |
Stochastic Siamese MAE Pretraining for Longitudinal Medical Images |
Proposes STAMP, a stochastic Siamese MAE pretraining framework for longitudinal medical images. |
representation learning, MAE, foundation model |
|
|
| 3 |
Bellman Calibration for V-Learning in Offline Reinforcement Learning |
Proposes iterative Bellman calibration to improve value prediction in offline reinforcement learning. |
reinforcement learning, offline reinforcement learning |
|
|
| 4 |
Joint Link Adaptation and Device Scheduling Approach for URLLC Industrial IoT Network: A DRL-based Method with Bayesian Optimization |
Proposes a DRL method with Bayesian optimization for joint link adaptation and device scheduling in URLLC industrial IoT networks. |
DRL, TD3 |
|
|
| 5 |
Eliminating Inductive Bias in Reward Models with Information-Theoretic Guidance |
Proposes DIR, which uses information-theoretic optimization to eliminate inductive bias in reward models and improve RLHF performance. |
reinforcement learning, RLHF, large language model |
✅ |
|
| 6 |
On the Inverse Flow Matching Problem in the One-Dimensional and Gaussian Cases |
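Studies the inverse flow matching problem in the one-dimensional and Gaussian cases, providing a theoretical basis for distilling flow matching models. |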
flow matching, distillation |
|
|
| 7 |
Diffusion-based Decentralized Federated Multi-Task Representation Learning |
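Proposes a diffusion-based decentralized federated multi-task representation learning algorithm that addresses feature extraction in data-scarce settings. |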
representation learning |
|
|