cs.AI (2025-06-12)

📊 36 papers in total | 🔗 11 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (31 🔗9) Pillar 8: Physics-based Animation (2) Pillar 2: RL & Architecture (2 🔗2) Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (31 papers)

#	Title	One-line summary	Tags	🔗
1	Multimodal Modeling of CRISPR-Cas12 Activity Using Foundation Models and Chromatin Accessibility Data	Improves CRISPR-Cas12 gRNA activity prediction using foundation models and chromatin accessibility data	foundation model multimodal
2	Contemporary AI foundation models increase biological weapons risk	Proposes a new framework for assessing the impact of AI models on biological weapons risk	large language model foundation model
3	Mirage-1: Augmenting and Updating GUI Agent with Hierarchical Multimodal Skills	Proposes hierarchical multimodal skill modules to address GUI agents' knowledge gaps	large language model multimodal	✅
4	LLM-as-a-Fuzzy-Judge: Fine-Tuning Large Language Models as a Clinical Evaluation Judge with Fuzzy Logic	Proposes LLM-as-a-Fuzzy-Judge to automate clinical evaluation	large language model	✅
5	Formalising Software Requirements using Large Language Models	Presents the VERIFAI project to address traceability and verification of software requirements	large language model
6	TeleMath: A Benchmark for Large Language Models in Telecom Mathematical Problem Solving	Proposes the TeleMath benchmark to evaluate large language models on telecom mathematical problem solving	large language model
7	SoK: Evaluating Jailbreak Guardrails for Large Language Models	Proposes a multi-dimensional taxonomy for evaluating jailbreak guardrails for large language models	large language model	✅
8	Intelligent Automation for FDI Facilitation: Optimizing Tariff Exemption Processes with OCR And Large Language Models	Proposes an intelligent automation framework to optimize tariff exemption processes for foreign direct investment	large language model
9	Augmenting Large Language Models with Static Code Analysis for Automated Code Quality Improvements	Augments large language models with static code analysis to automatically improve code quality	large language model
10	WGSR-Bench: Wargame-based Game-theoretic Strategic Reasoning Benchmark for Large Language Models	Proposes WGSR-Bench to evaluate strategic reasoning	large language model
11	Breaking Bad Molecules: Are MLLMs Ready for Structure-Level Molecular Detoxification?	Proposes the ToxiMol benchmark for molecular toxicity repair	large language model multimodal
12	Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning	Proposes the Scientists' First Exam benchmark to evaluate the cognitive abilities of multimodal large language models	large language model multimodal
13	DiMo-GUI: Advancing Test-time Scaling in GUI Grounding via Modality-Aware Visual Reasoning	Proposes DiMo-GUI for grounding natural-language queries in GUIs	visual grounding
14	OPT-BENCH: Evaluating LLM Agent on Large-Scale Search Spaces Optimization Problems	Proposes OPT-BENCH to evaluate LLM agents on large-scale search space optimization problems	large language model	✅
15	LLM Embedding-based Attribution (LEA): Quantifying Source Contributions to Generative Model's Response for Vulnerability Analysis	Proposes LEA to quantify source contributions to a generative model's response	large language model
16	Invocable APIs derived from NL2SQL datasets for LLM Tool-Calling Evaluation	Proposes an NL2API dataset generation method to evaluate LLM tool-calling ability	large language model
17	SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks	Proposes SWE-Factory to ease the construction of GitHub issue resolution datasets	large language model	✅
18	GenPlanX. Generation of Plans and Execution	Proposes GenPlanX to address the understanding of planning tasks stated in natural language	large language model
19	Precise Zero-Shot Pointwise Ranking with LLMs through Post-Aggregated Global Context Information	Proposes a globally consistent comparative pointwise ranking method to improve zero-shot document ranking	large language model
20	LLM-Driven Personalized Answer Generation and Evaluation	Uses large language models to generate personalized answers and improve the online learning experience	large language model
21	What Users Value and Critique: Large-Scale Analysis of User Feedback on AI-Powered Mobile Apps	Proposes a large-scale user feedback analysis method to improve the experience of AI-powered mobile apps	large language model
22	Automated Validation of Textual Constraints Against AutomationML via LLMs and SHACL	Proposes automated validation of textual constraints to address AutomationML modeling issues	large language model
23	Beyond Formal Semantics for Capabilities and Skills: Model Context Protocol in Manufacturing	Proposes using the Model Context Protocol to simplify capability and skill modeling in manufacturing	large language model
24	Primender Sequence: A Novel Mathematical Construct for Testing Symbolic Inference and AI Reasoning	Proposes the Primender sequence to evaluate the symbolic reasoning ability of large language models	large language model
25	StepProof: Step-by-step verification of natural language mathematical proofs	Proposes StepProof for step-by-step verification of natural language mathematical proofs	large language model
26	LogiPlan: A Structured Benchmark for Logical Planning and Relational Reasoning in LLMs	Proposes LogiPlan to evaluate large language models on logical planning	large language model
27	SOFT: Selective Data Obfuscation for Protecting LLM Fine-tuning against Membership Inference Attacks	Proposes SOFT to defend LLM fine-tuning against membership inference attacks	large language model
28	PAL: Probing Audio Encoders via LLMs - Audio Information Transfer into LLMs	Proposes a lightweight audio-LLM integration method to make audio information transfer more efficient	large language model	✅
29	Reasoning RAG via System 1 or System 2: A Survey on Reasoning Agentic Retrieval-Augmented Generation for Industry Challenges	Presents Reasoning Agentic RAG to address complex reasoning and dynamic retrieval	large language model	✅
30	Towards Understanding Bias in Synthetic Data for Evaluation	Examines bias in synthetic data to improve the evaluation of information retrieval systems	large language model	✅
31	Discrete Audio Tokens: More Than a Survey!	Presents discrete audio tokens to improve audio processing efficiency and performance	large language model	✅

🔬 Pillar 8: Physics-based Animation (2 papers)

#	Title	One-line summary	Tags	🔗	⭐
32	A Study on Individual Spatiotemporal Activity Generation Method Using MCP-Enhanced Chain-of-Thought Large Language Models	Proposes an MCP-enhanced chain-of-thought model for urban behavior simulation	spatiotemporal large language model chain-of-thought
33	DUN-SRE: Deep Unrolling Network with Spatiotemporal Rotation Equivariance for Dynamic MRI Reconstruction	Proposes DUN-SRE to address spatiotemporal rotation symmetry in dynamic MRI reconstruction	spatiotemporal

🔬 Pillar 2: RL & Architecture (2 papers)

#	Title	One-line summary	Tags	🔗	⭐
34	Optimus-3: Towards Generalist Multimodal Minecraft Agents with Scalable Task Experts	Proposes a knowledge-enhanced data generation pipeline to address challenges for multimodal agents in Minecraft	reinforcement learning generalist agent large language model	✅
35	A Benchmark for Generalizing Across Diverse Team Strategies in Competitive Pokémon	Proposes VGC-Bench to address generalization across Pokémon team strategies	reinforcement learning behavior cloning large language model	✅

🔬 Pillar 1: Robot Control (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
36	Agentic Semantic Control for Autonomous Wireless Space Networks: Extending Space-O-RAN with MCP-Driven Distributed Intelligence	Proposes MCP-based semantic intelligent control to improve the autonomy of lunar wireless networks	locomotion motion planning
