cs.AI (2025-06-12)

📊 36 papers total | 🔗 11 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (31 🔗9) · Pillar 8: Physics-based Animation (2) · Pillar 2: RL & Architecture (2 🔗2) · Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (31 papers)

# | Title | One-line summary | Tags
1 | Multimodal Modeling of CRISPR-Cas12 Activity Using Foundation Models and Chromatin Accessibility Data | Improves CRISPR-Cas12 gRNA activity prediction using foundation models and chromatin accessibility data | foundation model, multimodal
2 | Contemporary AI foundation models increase biological weapons risk | Proposes a new framework for assessing the impact of AI models on biological weapons risk | large language model, foundation model
3 | Mirage-1: Augmenting and Updating GUI Agent with Hierarchical Multimodal Skills | Proposes a hierarchical multimodal skills module to address GUI agents' insufficient knowledge | large language model, multimodal
4 | LLM-as-a-Fuzzy-Judge: Fine-Tuning Large Language Models as a Clinical Evaluation Judge with Fuzzy Logic | Proposes LLM-as-a-Fuzzy-Judge to automate clinical evaluation | large language model
5 | Formalising Software Requirements using Large Language Models | Presents the VERIFAI project to address traceability and verification of software requirements | large language model
6 | TeleMath: A Benchmark for Large Language Models in Telecom Mathematical Problem Solving | Proposes the TeleMath benchmark to evaluate large language models on telecom mathematical problem solving | large language model
7 | SoK: Evaluating Jailbreak Guardrails for Large Language Models | Proposes a multi-dimensional taxonomy for evaluating guardrail mechanisms for large language models | large language model
8 | Intelligent Automation for FDI Facilitation: Optimizing Tariff Exemption Processes with OCR And Large Language Models | Proposes an intelligent automation framework to optimize tariff exemption processes for foreign direct investment | large language model
9 | Augmenting Large Language Models with Static Code Analysis for Automated Code Quality Improvements | Augments large language models with static code analysis to automatically improve code quality | large language model
10 | WGSR-Bench: Wargame-based Game-theoretic Strategic Reasoning Benchmark for Large Language Models | Proposes WGSR-Bench to evaluate strategic reasoning | large language model
11 | Breaking Bad Molecules: Are MLLMs Ready for Structure-Level Molecular Detoxification? | Proposes the ToxiMol benchmark for molecular toxicity repair | large language model, multimodal
12 | Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning | Proposes the Scientists' First Exam benchmark to evaluate the cognitive abilities of multimodal large language models | large language model, multimodal
13 | DiMo-GUI: Advancing Test-time Scaling in GUI Grounding via Modality-Aware Visual Reasoning | Proposes DiMo-GUI for grounding natural-language queries in GUIs | visual grounding
14 | OPT-BENCH: Evaluating LLM Agent on Large-Scale Search Spaces Optimization Problems | Proposes OPT-BENCH to evaluate LLM agents on large-scale search-space optimization problems | large language model
15 | LLM Embedding-based Attribution (LEA): Quantifying Source Contributions to Generative Model's Response for Vulnerability Analysis | Proposes LEA to quantify source contributions to a generative model's response | large language model
16 | Invocable APIs derived from NL2SQL datasets for LLM Tool-Calling Evaluation | Proposes a method for generating NL2API datasets to evaluate LLM tool-calling ability | large language model
17 | SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks | Proposes SWE-Factory to ease the construction of GitHub issue resolution datasets | large language model
18 | GenPlanX. Generation of Plans and Execution | Proposes GenPlanX to interpret planning tasks described in natural language | large language model
19 | Precise Zero-Shot Pointwise Ranking with LLMs through Post-Aggregated Global Context Information | Proposes a globally consistent comparative pointwise ranking method to improve zero-shot document ranking | large language model
20 | LLM-Driven Personalized Answer Generation and Evaluation | Uses large language models to generate personalized answers that improve the online learning experience | large language model
21 | What Users Value and Critique: Large-Scale Analysis of User Feedback on AI-Powered Mobile Apps | Proposes a large-scale user feedback analysis method to improve the experience of AI-powered mobile apps | large language model
22 | Automated Validation of Textual Constraints Against AutomationML via LLMs and SHACL | Proposes automated validation of textual constraints for AutomationML modeling | large language model
23 | Beyond Formal Semantics for Capabilities and Skills: Model Context Protocol in Manufacturing | Proposes using the Model Context Protocol to simplify capability and skill modeling in manufacturing | large language model
24 | Primender Sequence: A Novel Mathematical Construct for Testing Symbolic Inference and AI Reasoning | Proposes the Primender sequence to evaluate the symbolic reasoning ability of large language models | large language model
25 | StepProof: Step-by-step verification of natural language mathematical proofs | Proposes StepProof for step-by-step verification of natural language mathematical proofs | large language model
26 | LogiPlan: A Structured Benchmark for Logical Planning and Relational Reasoning in LLMs | Proposes LogiPlan to evaluate the logical planning ability of large language models | large language model
27 | SOFT: Selective Data Obfuscation for Protecting LLM Fine-tuning against Membership Inference Attacks | Proposes SOFT to defend LLM fine-tuning against membership inference attacks | large language model
28 | PAL: Probing Audio Encoders via LLMs - Audio Information Transfer into LLMs | Proposes a lightweight audio-LLM integration method to improve the efficiency of audio information transfer | large language model
29 | Reasoning RAG via System 1 or System 2: A Survey on Reasoning Agentic Retrieval-Augmented Generation for Industry Challenges | Surveys Reasoning Agentic RAG for complex reasoning and dynamic retrieval | large language model
30 | Towards Understanding Bias in Synthetic Data for Evaluation | Examines bias in synthetic data to improve the evaluation of information retrieval systems | large language model
31 | Discrete Audio Tokens: More Than a Survey! | Presents discrete audio tokens as a way to improve audio processing efficiency and performance | large language model

🔬 Pillar 8: Physics-based Animation (2 papers)

# | Title | One-line summary | Tags
32 | A Study on Individual Spatiotemporal Activity Generation Method Using MCP-Enhanced Chain-of-Thought Large Language Models | Proposes an MCP-enhanced chain-of-thought model for urban behavior simulation | spatiotemporal, large language model, chain-of-thought
33 | DUN-SRE: Deep Unrolling Network with Spatiotemporal Rotation Equivariance for Dynamic MRI Reconstruction | Proposes DUN-SRE to address spatiotemporal rotation symmetry in dynamic MRI reconstruction | spatiotemporal

🔬 Pillar 2: RL & Architecture (2 papers)

# | Title | One-line summary | Tags
34 | Optimus-3: Towards Generalist Multimodal Minecraft Agents with Scalable Task Experts | Proposes a knowledge-enhanced data generation pipeline to address the challenges facing multimodal agents in Minecraft | reinforcement learning, generalist agent, large language model
35 | A Benchmark for Generalizing Across Diverse Team Strategies in Competitive Pokémon | Proposes VGC-Bench to address generalization across Pokémon team strategies | reinforcement learning, behavior cloning, large language model

🔬 Pillar 1: Robot Control (1 paper)

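# | Title | One-line summary | Tags
36 | Agentic Semantic Control for Autonomous Wireless Space Networks: Extending Space-O-RAN with MCP-Driven Distributed Intelligence | Proposes MCP-based semantic intelligent control to improve the autonomy of lunar wireless networks | locomotion, motion planning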
