| 24 | TuneComp: Joint Fine-tuning and Compression for Large Foundation Models | Proposes TuneComp to address joint fine-tuning and compression of large foundation models | distillation, foundation model | |
| 25 | A Cross Modal Knowledge Distillation & Data Augmentation Recipe for Improving Transcriptomics Representations through Morphological Features | Proposes a cross-modal knowledge distillation and data augmentation recipe to improve transcriptomics representations | distillation, foundation model, multimodal | |
| 26 | Foundation Model Hidden Representations for Heart Rate Estimation from Auscultation | Uses foundation model hidden representations for heart rate estimation, improving the accuracy of auscultation | MAE, foundation model | |
| 27 | TabReason: A Reinforcement Learning-Enhanced Reasoning LLM for Explainable Tabular Data Prediction | Proposes TabReason to address explainability in tabular data prediction | reinforcement learning, predictive model, large language model | |
| 28 | Deep Reinforcement Learning Agents are not even close to Human Intelligence | Proposes HackAtari to expose the gap between deep reinforcement learning agents and human intelligence | reinforcement learning, deep reinforcement learning | |
| 29 | Topology-Aware and Highly Generalizable Deep Reinforcement Learning for Efficient Retrieval in Multi-Deep Storage Systems | Proposes a deep reinforcement learning framework to address efficient retrieval in multi-deep storage systems | reinforcement learning, deep reinforcement learning | |
| 30 | Hierarchical Reinforcement Learning with Uncertainty-Guided Diffusional Subgoals | Proposes uncertainty-guided diffusional subgoals for hierarchical reinforcement learning | reinforcement learning, diffusion policy | |
| 31 | Simple yet Effective Graph Distillation via Clustering | Proposes ClustGDD to reduce the computational cost of training graph neural networks | representation learning, distillation | |
| 32 | Semi-supervised Clustering Through Representation Learning of Large-scale EHR Data | Proposes the SCORE framework to address the challenges of modeling electronic health record (EHR) data | predictive model, representation learning | |
| 33 | Accelerating RL for LLM Reasoning with Optimal Advantage Regression | Proposes A*-PO to reduce the high computational cost of RL for LLM reasoning | reinforcement learning, PPO, large language model | ✅ |
| 34 | A Framework for Adversarial Analysis of Decision Support Systems Prior to Deployment | Proposes an adversarial analysis framework for decision support systems to improve security prior to deployment | reinforcement learning, deep reinforcement learning, DRL | |
| 35 | Universal Value-Function Uncertainties | Proposes universal value-function uncertainties to address uncertainty estimation in reinforcement learning | reinforcement learning, offline RL, distillation | |
| 36 | HAD: Hybrid Architecture Distillation Outperforms Teacher in Genomic Sequence Modeling | Proposes a hybrid architecture distillation method to improve genomic sequence modeling | distillation | |
| 37 | A reinforcement learning agent for maintenance of deteriorating systems with increasingly imperfect repairs | Proposes a reinforcement learning agent to optimize maintenance policies for deteriorating systems with increasingly imperfect repairs | reinforcement learning | |
| 38 | Apprenticeship learning with prior beliefs using inverse optimization | Proposes an inverse optimization framework to enhance inverse reinforcement learning with prior beliefs | reinforcement learning, inverse reinforcement learning | |
| 39 | Sparsified State-Space Models are Efficient Highway Networks | Proposes Simba to improve the efficiency of state-space models | Mamba, SSM | ✅ |