cs.LG (2023-12-05)
📊 5 papers total | 🔗 2 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (3 🔗1)
Pillar 1: Robot Control (1 🔗1)
Pillar 2: RL Algorithms & Architecture (1)
🔬 Pillar 9: Embodied Foundation Models (3 papers)
| # | Title | One-line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | FlexModel: A Framework for Interpretability of Distributed Large Language Models | FlexModel: a framework for interpretability of distributed large language models | large language model | ✅ | |
| 2 | Towards Measuring Representational Similarity of Large Language Models | Assesses differences between large language models via representational-similarity measures | large language model | | |
| 3 | Weakly Supervised Detection of Hallucinations in LLM Activations | Proposes a weakly supervised auditing method to detect hallucination patterns in LLM activations | large language model | | |
🔬 Pillar 1: Robot Control (1 paper)
| # | Title | One-line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 4 | H-GAP: Humanoid Control with a Generalist Planner | Proposes H-GAP, an approach to humanoid control built on a generalist planner | humanoid, humanoid control, bipedal | ✅ | |
🔬 Pillar 2: RL Algorithms & Architecture (1 paper)
| # | Title | One-line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 5 | ULMA: Unified Language Model Alignment with Human Demonstration and Point-wise Preference | Proposes ULMA, which unifies language model alignment using human demonstrations and point-wise preferences | preference learning, RLHF, DPO | | |