| 11 |
Towards Alignment-Centric Paradigm: A Survey of Instruction Tuning in Large Language Models |
Proposes an alignment-centric paradigm for optimizing instruction tuning of large language models |
distillation large language model multimodal |
|
|
| 12 |
Routing Distilled Knowledge via Mixture of LoRA Experts for Large Language Model based Bundle Generation |
Proposes the RouteDK framework to resolve knowledge-distillation conflicts in large language models |
distillation large language model |
|
|
| 13 |
SSFO: Self-Supervised Faithfulness Optimization for Retrieval-Augmented Generation |
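Proposes a self-supervised faithfulness optimization method to address faithfulness problems in retrieval-augmented generation |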
DPO direct preference optimization large language model |
✅ |
|
| 14 |
CORE-RAG: Lossless Compression for Retrieval-Augmented LLMs via Reinforcement Learning |
Proposes CORE to address inefficient document compression in RAG |
reinforcement learning large language model |
|
|
| 15 |
Persuasion Dynamics in LLMs: Investigating Robustness and Adaptability in Knowledge and Safety with DuET-PD |
Proposes the DuET-PD framework to address the robustness and adaptability of LLMs in persuasive dialogue |
DPO large language model |
✅ |
|
| 16 |
LLMs Can't Handle Peer Pressure: Crumbling under Multi-Agent Social Interactions |
Proposes the KAIROS benchmark to address the fragility of LLMs in multi-agent social interactions |
reinforcement learning large language model |
|
|