| 1 |
MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision |
Proposes MM-PRM to address insufficient step-level supervision in multimodal mathematical reasoning |
large language model multimodal |
✅ |
|
| 2 |
Aneumo: A Large-Scale Multimodal Aneurysm Dataset with Computational Fluid Dynamics Simulations and Deep Learning Benchmarks |
Proposes the Aneumo dataset to address cerebral aneurysm risk assessment |
multimodal |
✅ |
|
| 3 |
Survey: Multi-Armed Bandits Meet Large Language Models |
Explores the potential synergy between multi-armed bandit algorithms and large language models |
large language model |
|
|
| 4 |
Advancing Software Quality: A Standards-Focused Review of LLM-Based Assurance Techniques |
Uses LLM-based SQA techniques to improve software quality assurance |
large language model multimodal |
|
|
| 5 |
Causal Head Gating: A Framework for Interpreting Roles of Attention Heads in Transformers |
Proposes causal head gating to interpret the functional roles of attention heads in Transformers |
large language model instruction following |
|
|
| 6 |
Unified Cross-modal Translation of Score Images, Symbolic Music, and Performance Audio |
Proposes a unified cross-modal translation method to address music information retrieval problems |
multimodal |
|
|
| 7 |
Ice Cream Doesn't Cause Drowning: Benchmarking LLMs Against Statistical Pitfalls in Causal Inference |
Proposes the CausalPitfalls benchmark to address statistical pitfalls in LLM causal inference |
large language model |
|
|
| 8 |
Security Degradation in Iterative AI Code Generation -- A Systematic Analysis of the Paradox |
Analyzes security degradation in iterative AI code generation |
large language model |
|
|
| 9 |
Safety Alignment Can Be Not Superficial With Explicit Safety Signals |
Improves the safety alignment of large language models through explicit safety signals |
large language model |
|
|
| 10 |
Measuring the Faithfulness of Thinking Drafts in Large Reasoning Models |
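Proposes a systematic counterfactual intervention framework to evaluate the faithfulness of thinking drafts in large reasoning models |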
chain-of-thought |
|
|
| 11 |
Language Models Are Capable of Metacognitive Monitoring and Control of Their Internal Activations |
Proposes a neurofeedback paradigm to quantify the metacognitive abilities of language models |
large language model |
|
|
| 12 |
CoT-Kinetics: A Theoretical Modeling Assessing LRM Reasoning Process |
Proposes CoT-Kinetics to assess the reasoning process of large reasoning models |
large language model |
|
|
| 13 |
AutoMathKG: The automated mathematical knowledge graph based on LLM and vector database |
Proposes AutoMathKG to automate the construction of mathematical knowledge graphs |
large language model |
|
|
| 14 |
CompeteSMoE -- Statistically Guaranteed Mixture of Experts Training via Competition |
Proposes CompeteSMoE to address routing efficiency in sparse mixture-of-experts training |
large language model |
✅ |
|
| 15 |
LLM-KG-Bench 3.0: A Compass for SemanticTechnology Capabilities in the Ocean of LLMs |
Proposes LLM-KG-Bench 3.0 to evaluate the capabilities of large language models in the knowledge graph domain |
large language model |
|
|