| 1 |
TRACE: Training-Free Partial Audio Deepfake Detection via Embedding Trajectory Analysis of Speech Foundation Models |
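Proposes TRACE, a training-free method that detects partial audio deepfakes by analyzing the embedding trajectories of speech foundation models. |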
foundation model |
|
|
| 2 |
Does Unification Come at a Cost? Uni-SafeBench: A Safety Benchmark for Unified Multimodal Large Models |
Proposes Uni-SafeBench, a benchmark that evaluates the safety of unified multimodal large models across multiple tasks. |
multimodal |
|
|
| 3 |
Adversarial Moral Stress Testing of Large Language Models |
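Proposes AMST, an adversarial moral stress testing framework that evaluates the ethical robustness of LLMs under adversarial conditions. |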
large language model |
|
|
| 4 |
OmniMem: Autoresearch-Guided Discovery of Lifelong Multimodal Agent Memory |
OmniMem: autoresearch-guided discovery of lifelong multimodal agent memory |
multimodal |
✅ |
|
| 5 |
Towards Reliable Truth-Aligned Uncertainty Estimation in Large Language Models |
Proposes a truth-anchoring calibration method (TAC) that improves the reliability of uncertainty estimation in large language models. |
large language model |
✅ |
|
| 6 |
HippoCamp: Benchmarking Contextual Agents on Personal Computers |
HippoCamp: a new benchmark for evaluating context-aware agents on personal computers |
large language model, multimodal |
|
|
| 7 |
Automated Framework to Evaluate and Harden LLM System Instructions against Encoding Attacks |
Proposes an automated framework that evaluates and hardens LLM system instructions against encoding attacks |
large language model, chain-of-thought |
|
|
| 8 |
Investigating Autonomous Agent Contributions in the Wild: Activity Patterns and Code Change over Time |
Studies the contributions of autonomous coding agents to open-source projects: activity patterns and code change over time |
large language model |
|
|
| 9 |
Therefore I am. I Think |
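Reveals how large language models make decisions during reasoning: the decision precedes the thinking, and early encodings shape the chain of thought |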
chain-of-thought |
|
|
| 10 |
OrgAgent: Organize Your Multi-Agent System like a Company |
OrgAgent: builds a company-style hierarchical multi-agent system to improve complex reasoning |
large language model |
|
|
| 11 |
Streaming Model Cascades for Semantic SQL |
Proposes two streaming model cascade algorithms that reduce LLM inference cost in semantic SQL. |
large language model |
|
|
| 12 |
Ontology-Constrained Neural Reasoning in Enterprise Agentic Systems: A Neurosymbolic Architecture for Domain-Grounded AI Agents |
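Proposes an ontology-constrained neural reasoning framework that addresses LLM hallucination and compliance problems in enterprise agent systems |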
large language model |
|
|
| 13 |
Adaptive Parallel Monte Carlo Tree Search for Efficient Test-time Compute Scaling |
Proposes adaptive parallel Monte Carlo tree search, improving the inference timeliness and throughput of large models. |
large language model |
|
|
| 14 |
Signals: Trajectory Sampling and Triage for Agentic Interactions |
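Proposes a signal-based framework for sampling and triaging agent interaction trajectories, improving the efficiency of post-deployment optimization. |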
large language model |
|
|