cs.AI (2026-04-06)

📊 51 papers in total

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (37) · Pillar 2: RL & Architecture (9) · Pillar 1: Robot Control (3) · Pillar 7: Motion Retargeting (1) · Pillar 3: Perception & Semantics (1)

🔬 Pillar 9: Embodied Foundation Models (37 papers)

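| # | Title | One-sentence summary | Tags |
|---|---|---|---|
| 1 | Do Audio-Visual Large Language Models Really See and Hear? | A study of modality bias in AVLLMs: reveals a vision-dominated fusion mechanism in audio-visual large language models | large language model, multimodal |
| 2 | Agentic-MME: What Agentic Capability Really Brings to Multimodal Intelligence? | Agentic-MME: a process-verified benchmark for evaluating multimodal agent capabilities | large language model, multimodal |
| 3 | Improving MPI Error Detection and Repair with Large Language Models and Bug References | Uses LLMs and bug references to improve MPI error detection and repair | large language model, chain-of-thought |
| 4 | Understanding the Effects of Safety Unalignment on Large Language Models | Studies the effects of safety unalignment on large language models and reveals potential risks of weight orthogonalization methods | large language model |
| 5 | AutoVerifier: An Agentic Automated Verification Framework Using Large Language Models | AutoVerifier: an agent framework that uses large language models to automatically verify scientific and technical intelligence | large language model |
| 6 | Analysis of Optimality of Large Language Models on Planning Problems | Analyzes the optimality of large language models on planning problems | large language model |
| 7 | When simulations look right but causal effects go wrong: Large language models as behavioral simulators | Large language models as behavioral simulators: the descriptive fit is good, but causal-effect predictions are inaccurate | large language model |
| 8 | Automated Malware Family Classification using Weighted Hierarchical Ensembles of Large Language Models | Proposes a zero-label malware family classification framework based on weighted hierarchical ensembles of large language models | large language model |
| 9 | Learn to Relax with Large Language Models: Solving Constraint Optimization Problems via Bidirectional Coevolution | AutoCO: solves constraint optimization problems using large language models and bidirectional coevolution | large language model |
| 10 | Chain-of-Authorization: Embedding authorization into large language models | Proposes the Chain-of-Authorization framework, which embeds access control into the LLM reasoning process to improve security | large language model |
| 11 | Competency Questions as Executable Plans: a Controlled RAG Architecture for Cultural Heritage Storytelling | Proposes a controllable RAG architecture based on knowledge graphs and competency questions for cultural heritage story generation | large language model, multimodal |
| 12 | Code-in-the-Loop Forensics: Agentic Tool Use for Image Forgery Detection | Proposes ForenAgent, which uses agentic tools for image forgery detection, enabling more flexible and interpretable analysis | large language model, multimodal |
| 13 | Patterns behind Chaos: Forecasting Data Movement for Efficient Large-Scale MoE LLM Inference | Proposes a data-movement forecasting method for large-scale MoE LLM inference to optimize system efficiency | large language model |
| 14 | ProdCodeBench: A Production-Derived Benchmark for Evaluating AI Coding Agents | Proposes ProdCodeBench, a benchmark derived from real production environments for evaluating AI coding agents | foundation model |
| 15 | Holos: A Web-Scale LLM-Based Multi-Agent System for the Agentic Web | Holos: a web-scale LLM-based multi-agent system aimed at building the Agentic Web | large language model |
| 16 | I must delete the evidence: AI Agents Explicitly Cover up Fraud and Violent Crime | AI agents tend to cover up evidence of fraud and violent crime to serve company interests | large language model |
| 17 | Improving Role Consistency in Multi-Agent Collaboration via Quantitative Role Clarity | Proposes quantitative role clarity to address role consistency problems in multi-agent collaboration | large language model |
| 18 | InfoSeeker: A Scalable Hierarchical Parallel Agent Framework for Web Information Seeking | Proposes InfoSeeker to address the challenge of aggregating large-scale heterogeneous data in web information seeking | large language model |
| 19 | Beyond Message Passing: Toward Semantically Aligned Agent Communication | Analyzes LLM agent communication protocols, reveals insufficient semantic alignment, and proposes future research directions | large language model |
| 20 | Ambig-IaC: Multi-level Disambiguation for Interactive Cloud Infrastructure-as-Code Synthesis | Proposes Ambig-IaC to resolve ambiguity in cloud infrastructure code generation | large language model |
| 21 | Audio Spatially-Guided Fusion for Audio-Visual Navigation | Proposes an audio spatially-guided fusion method that improves the generalization of audio-visual navigation in unseen environments | multimodal |
| 22 | From Theory to Practice: Code Generation Using LLMs for CAPEC and CWE Frameworks | Uses LLMs to generate code for the CAPEC and CWE frameworks, improving vulnerability understanding and detection | large language model |
| 23 | High Volatility and Action Bias Distinguish LLMs from Humans in Group Coordination | Reveals the high volatility and action bias of LLMs in group coordination, which differ significantly from human behavior | large language model |
| 24 | GBQA: A Game Benchmark for Evaluating LLMs as Quality Assurance Engineers | Proposes GBQA, a game benchmark for evaluating LLMs as quality assurance engineers | large language model |
| 25 | Do Agent Societies Develop Intellectual Elites? The Hidden Power Laws of Collective Cognition in LLM Multi-Agent Systems | Reveals power laws in the emergence of collective cognition in LLM multi-agent systems and proposes DTI to address the integration bottleneck | large language model |
| 26 | ChatSVA: Bridging SVA Generation for Hardware Verification via Task-Specific LLMs | ChatSVA: bridges SVA generation for hardware verification via task-specific LLMs | large language model |
| 27 | LLM+Graph@VLDB'2025 Workshop Summary | The LLM+Graph workshop focuses on integrating LLMs with graph data, advancing algorithm and system innovation | large language model |
| 28 | AlertStar: Path-Aware Alert Prediction on Hyper-Relational Knowledge Graphs | AlertStar: path-aware alert prediction on hyper-relational knowledge graphs | TAMP |
| 29 | An Independent Safety Evaluation of Kimi K2.5 | Safety evaluation of Kimi K2.5: reveals potential risks of an open-source large model in CBRNE, cybersecurity, bias, and other areas | multimodal |
| 30 | A Systematic Security Evaluation of OpenClaw and Its Variants | Systematically evaluates the security vulnerabilities of OpenClaw and its variants, revealing potential risks of tool-augmented AI agents | large language model |
| 31 | Glia: A Human-Inspired AI for Automated Systems Design and Optimization | Glia: a human-inspired AI for automated systems design and optimization | large language model |
| 32 | From Abstract to Contextual: What LLMs Still Cannot Do in Mathematics | The ContextMATH benchmark reveals LLMs' shortcomings in problem formulation during contextual mathematical reasoning | large language model |
| 33 | Therefore I am. I Think | Shows that large language models have tentatively reached a decision before reasoning, and that the reasoning process may serve that decision | chain-of-thought |
| 34 | Beyond the Assistant Turn: User Turn Generation as a Probe of Interaction Awareness in Language Models | Proposes a user-turn generation probe to evaluate language models' interaction awareness; finds that task accuracy and interaction awareness are decoupled | instruction following |
| 35 | Expressive Prompting: Improving Emotion Intensity and Speaker Consistency in Zero-Shot TTS | Proposes a two-stage prompt selection strategy that improves emotion intensity and speaker consistency in zero-shot TTS | large language model |
| 36 | StructEval: Benchmarking LLMs' Capabilities to Generate Structural Outputs | StructEval: builds a benchmark for LLMs' structured-output capabilities, revealing performance gaps across formats | large language model |
| 37 | Terminal Agents Suffice for Enterprise Automation | Proposes terminal-based agents for enterprise automation tasks that outperform complex agent systems | foundation model |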

🔬 Pillar 2: RL & Architecture (9 papers)

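| # | Title | One-sentence summary | Tags |
|---|---|---|---|
| 38 | A Multimodal Vision Transformer-based Modeling Framework for Prediction of Fluid Flows in Energy Systems | Proposes a multimodal Vision Transformer-based fluid flow prediction framework to accelerate CFD simulation of energy systems | predictive model, spatiotemporal, multimodal |
| 39 | InCoder-32B-Thinking: Industrial Code World Model for Thinking | Proposes InCoder-32B-Thinking, which generates reasoning traces through an industrial code world model to improve industrial software development efficiency | world model, world models, chain-of-thought |
| 40 | CharTool: Tool-Integrated Visual Reasoning for Chart Understanding | CharTool: tool-integrated visual reasoning for chart understanding | reinforcement learning, large language model, multimodal |
| 41 | Chart-RL: Policy Optimization Reinforcement Learning for Enhanced Visual Reasoning in Chart Question Answering with Vision Language Models | Proposes Chart-RL, which uses reinforcement learning to optimize the reasoning of vision-language models in chart question answering | reinforcement learning, spatial relationship, foundation model |
| 42 | Interpretable Deep Reinforcement Learning for Element-level Bridge Life-cycle Optimization | Proposes an interpretable deep reinforcement learning method for element-level bridge life-cycle optimization | reinforcement learning, deep reinforcement learning |
| 43 | Mitigating LLM biases toward spurious social contexts using direct preference optimization | Proposes Debiasing-DPO to mitigate LLM bias toward spurious social contexts and improve fairness in educational assessment | DPO, direct preference optimization |
| 44 | Compositional Neuro-Symbolic Reasoning | Proposes a compositional neuro-symbolic reasoning framework that improves generalization on ARC problems | reinforcement learning, large language model |
| 45 | GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning | GrandCode: reaches grandmaster level in competitive programming via agentic reinforcement learning | reinforcement learning |
| 46 | Multi-Turn Reinforcement Learning for Tool-Calling Agents with Iterative Reward Calibration | Proposes a multi-turn reinforcement learning method with iterative reward calibration that improves tool-calling agents on complex tasks | reinforcement learning |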

🔬 Pillar 1: Robot Control (3 papers)

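| # | Title | One-sentence summary | Tags |
|---|---|---|---|
| 47 | Coupled Control, Structured Memory, and Verifiable Action in Agentic AI (SCRAT -- Stochastic Control with Retrieval and Auditable Trajectories): A Comparative Perspective from Squirrel Locomotion and Scatter-Hoarding | Proposes the SCRAT framework, which couples control, memory, and verification to improve the robustness of agentic AI in complex environments | locomotion, latent dynamics |
| 48 | Evaluating Language Models for Harmful Manipulation | Proposes an evaluation framework based on human-AI interaction for assessing language models' harmful manipulation capabilities in public policy, finance, and health | manipulation |
| 49 | Aligning Progress and Feasibility: A Neuro-Symbolic Dual Memory Framework for Long-Horizon LLM Agents | Proposes a neuro-symbolic dual memory framework that addresses global drift and local violations in long-horizon LLM agents | manipulation, large language model |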

🔬 Pillar 7: Motion Retargeting (1 paper)

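| # | Title | One-sentence summary | Tags |
|---|---|---|---|
| 50 | MECO: A Multimodal Dataset for Emotion and Cognitive Understanding in Older Adults | MECO: a multimodal dataset for emotion and cognitive understanding in older adults | motion prediction, multimodal |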

🔬 Pillar 3: Perception & Semantics (1 paper)

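| # | Title | One-sentence summary | Tags |
|---|---|---|---|
| 51 | Making Written Theorems Explorable by Grounding Them in Formal Representations | Proposes Explorable Theorems, which enhance the explorability of mathematical theorems through formal representations | affordance |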
