cs.AI (2026-04-06)

📊 51 papers in total

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (37) · Pillar 2: RL & Architecture (9) · Pillar 1: Robot Control (3) · Pillar 7: Motion Retargeting (1) · Pillar 3: Perception & Semantics (1)

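🔬 Pillar 9: Embodied Foundation Models (37 papers)

#	Title	One-line summary	Tags
1	Do Audio-Visual Large Language Models Really See and Hear?	A study of modality bias in AVLLMs: reveals a vision-dominated fusion mechanism in audio-visual large language models	large language model multimodal
2	Agentic-MME: What Agentic Capability Really Brings to Multimodal Intelligence?	Agentic-MME: a process-verified benchmark for evaluating multimodal agentic capabilities	large language model multimodal
3	Improving MPI Error Detection and Repair with Large Language Models and Bug References	Uses LLMs and bug references to improve MPI error detection and repair	large language model chain-of-thought
4	Understanding the Effects of Safety Unalignment on Large Language Models	Studies the effects of safety unalignment on large language models, revealing potential risks of weight orthogonalization methods	large language model
5	AutoVerifier: An Agentic Automated Verification Framework Using Large Language Models	AutoVerifier: an agentic framework that uses large language models to automatically verify science and technology intelligence	large language model
6	Analysis of Optimality of Large Language Models on Planning Problems	Analyzes the optimality of large language models on planning problems	large language model
7	When simulations look right but causal effects go wrong: Large language models as behavioral simulators	Large language models as behavioral simulators: good descriptive fit but inaccurate causal-effect predictions	large language model
8	Automated Malware Family Classification using Weighted Hierarchical Ensembles of Large Language Models	Proposes a zero-label malware family classification framework based on weighted hierarchical ensembles of large language models	large language model
9	Learn to Relax with Large Language Models: Solving Constraint Optimization Problems via Bidirectional Coevolution	AutoCO: solves constraint optimization problems with large language models and bidirectional coevolution	large language model
10	Chain-of-Authorization: Embedding authorization into large language models	Proposes the Chain-of-Authorization framework, which embeds access control into the LLM reasoning process to improve security	large language model
11	Competency Questions as Executable Plans: a Controlled RAG Architecture for Cultural Heritage Storytelling	Proposes a controlled RAG architecture based on knowledge graphs and competency questions for cultural heritage storytelling	large language model multimodal
12	Code-in-the-Loop Forensics: Agentic Tool Use for Image Forgery Detection	Proposes ForenAgent, which uses agentic tools for image forgery detection, enabling more flexible and interpretable analysis	large language model multimodal
13	Patterns behind Chaos: Forecasting Data Movement for Efficient Large-Scale MoE LLM Inference	Proposes a data-movement forecasting method for large-scale MoE LLM inference to optimize system efficiency	large language model
14	ProdCodeBench: A Production-Derived Benchmark for Evaluating AI Coding Agents	Proposes ProdCodeBench, a benchmark derived from real production environments for evaluating AI coding agents	foundation model
15	Holos: A Web-Scale LLM-Based Multi-Agent System for the Agentic Web	Holos: a web-scale LLM-based multi-agent system aimed at building the Agentic Web	large language model
16	I must delete the evidence: AI Agents Explicitly Cover up Fraud and Violent Crime	AI agents tend to cover up evidence of fraud and violent crime to serve corporate interests	large language model
17	Improving Role Consistency in Multi-Agent Collaboration via Quantitative Role Clarity	Proposes quantitative role clarity to address role consistency problems in multi-agent collaboration	large language model
18	InfoSeeker: A Scalable Hierarchical Parallel Agent Framework for Web Information Seeking	Proposes InfoSeeker to address the challenge of aggregating large-scale heterogeneous data in web information seeking	large language model
19	Beyond Message Passing: Toward Semantically Aligned Agent Communication	Analyzes LLM agent communication protocols, reveals insufficient semantic alignment, and proposes future research directions	large language model
20	Ambig-IaC: Multi-level Disambiguation for Interactive Cloud Infrastructure-as-Code Synthesis	Proposes Ambig-IaC to resolve ambiguity in cloud Infrastructure-as-Code generation	large language model
21	Audio Spatially-Guided Fusion for Audio-Visual Navigation	Proposes an audio spatially-guided fusion method that improves the generalization of audio-visual navigation in unseen environments	multimodal
22	From Theory to Practice: Code Generation Using LLMs for CAPEC and CWE Frameworks	Uses LLMs to generate code for the CAPEC and CWE frameworks, improving vulnerability understanding and detection	large language model
23	High Volatility and Action Bias Distinguish LLMs from Humans in Group Coordination	Reveals high volatility and action bias in LLMs during group coordination, differing significantly from humans	large language model
24	GBQA: A Game Benchmark for Evaluating LLMs as Quality Assurance Engineers	Proposes GBQA, a game benchmark for evaluating LLMs as quality assurance engineers	large language model
25	Do Agent Societies Develop Intellectual Elites? The Hidden Power Laws of Collective Cognition in LLM Multi-Agent Systems	Reveals power laws in the emergence of collective cognition in LLM multi-agent systems and proposes DTI to address the integration bottleneck	large language model
26	ChatSVA: Bridging SVA Generation for Hardware Verification via Task-Specific LLMs	ChatSVA: bridges SVA generation for hardware verification via task-specific LLMs	large language model
27	LLM+Graph@VLDB'2025 Workshop Summary	The LLM+Graph workshop focuses on integrating LLMs with graph data, advancing algorithmic and systems innovation	large language model
28	AlertStar: Path-Aware Alert Prediction on Hyper-Relational Knowledge Graphs	AlertStar: path-aware alert prediction on hyper-relational knowledge graphs	TAMP
29	An Independent Safety Evaluation of Kimi K2.5	Safety evaluation of Kimi K2.5: reveals potential risks of an open-source large model in CBRNE, cybersecurity, bias, and other areas	multimodal
30	A Systematic Security Evaluation of OpenClaw and Its Variants	Systematically evaluates security vulnerabilities in OpenClaw and its variants, revealing potential risks of tool-augmented AI agents	large language model
31	Glia: A Human-Inspired AI for Automated Systems Design and Optimization	Glia: a human-inspired AI for automated systems design and optimization	large language model
32	From Abstract to Contextual: What LLMs Still Cannot Do in Mathematics	The ContextMATH benchmark reveals LLMs' shortcomings in problem formulation for contextual mathematical reasoning	large language model
33	Therefore I am. I Think	Shows that large language models have tentatively decided before reasoning, and that the reasoning process may serve the decision	chain-of-thought
34	Beyond the Assistant Turn: User Turn Generation as a Probe of Interaction Awareness in Language Models	Proposes a user-turn generation probe to evaluate interaction awareness in language models, finding that task accuracy and interaction awareness are decoupled	instruction following
35	Expressive Prompting: Improving Emotion Intensity and Speaker Consistency in Zero-Shot TTS	Proposes a two-stage prompt selection strategy that improves emotion intensity and speaker consistency in zero-shot TTS	large language model
36	StructEval: Benchmarking LLMs' Capabilities to Generate Structural Outputs	StructEval: builds a benchmark of LLMs' structured-output capabilities, revealing performance gaps across formats	large language model
37	Terminal Agents Suffice for Enterprise Automation	Proposes terminal-based agents for enterprise automation tasks that outperform complex agent systems	foundation model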

🔬 Pillar 2: RL & Architecture (9 papers)

#	Title	One-line summary	Tags
38	A Multimodal Vision Transformer-based Modeling Framework for Prediction of Fluid Flows in Energy Systems	Proposes a multimodal Vision Transformer-based flow prediction framework that accelerates CFD simulation of energy systems	predictive model spatiotemporal multimodal
39	InCoder-32B-Thinking: Industrial Code World Model for Thinking	Proposes InCoder-32B-Thinking, which generates reasoning traces through an industrial code world model to improve industrial software development efficiency	world model world models chain-of-thought
40	CharTool: Tool-Integrated Visual Reasoning for Chart Understanding	CharTool: tool-integrated visual reasoning for chart understanding	reinforcement learning large language model multimodal
41	Chart-RL: Policy Optimization Reinforcement Learning for Enhanced Visual Reasoning in Chart Question Answering with Vision Language Models	Proposes Chart-RL, which uses reinforcement learning to optimize the reasoning of vision-language models in chart question answering	reinforcement learning spatial relationship foundation model
42	Interpretable Deep Reinforcement Learning for Element-level Bridge Life-cycle Optimization	Proposes an interpretable deep reinforcement learning method for element-level bridge life-cycle optimization	reinforcement learning deep reinforcement learning
43	Mitigating LLM biases toward spurious social contexts using direct preference optimization	Proposes Debiasing-DPO to mitigate LLM bias toward spurious social contexts, improving fairness in educational assessment	DPO direct preference optimization
44	Compositional Neuro-Symbolic Reasoning	Proposes a compositional neuro-symbolic reasoning framework that improves generalization on ARC problems	reinforcement learning large language model
45	GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning	GrandCode: reaches grandmaster level in competitive programming via agentic reinforcement learning	reinforcement learning
46	Multi-Turn Reinforcement Learning for Tool-Calling Agents with Iterative Reward Calibration	Proposes a multi-turn reinforcement learning method with iterative reward calibration that improves tool-calling agents on complex tasks	reinforcement learning

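🔬 Pillar 1: Robot Control (3 papers)

#	Title	One-line summary	Tags
47	Coupled Control, Structured Memory, and Verifiable Action in Agentic AI (SCRAT -- Stochastic Control with Retrieval and Auditable Trajectories): A Comparative Perspective from Squirrel Locomotion and Scatter-Hoarding	Proposes the SCRAT framework, which couples control, memory, and verification to improve the robustness of agentic AI in complex environments	locomotion latent dynamics
48	Evaluating Language Models for Harmful Manipulation	Proposes an evaluation framework based on human-AI interaction for assessing language models' capacity for harmful manipulation in public policy, finance, and health	manipulation
49	Aligning Progress and Feasibility: A Neuro-Symbolic Dual Memory Framework for Long-Horizon LLM Agents	Proposes a neuro-symbolic dual memory framework that addresses global drift and local violations in long-horizon LLM agents	manipulation large language model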

🔬 Pillar 7: Motion Retargeting (1 paper)

#	Title	One-line summary	Tags
50	MECO: A Multimodal Dataset for Emotion and Cognitive Understanding in Older Adults	MECO: a multimodal dataset for emotion and cognitive understanding in older adults	motion prediction multimodal

🔬 Pillar 3: Perception & Semantics (1 paper)

#	Title	One-line summary	Tags
51	Making Written Theorems Explorable by Grounding Them in Formal Representations	Proposes Explorable Theorems, which make mathematical theorems more explorable by grounding them in formal representations	affordance

