| 1 |
Refine-POI: Reinforcement Fine-Tuned Large Language Models for Next Point-of-Interest Recommendation |
Proposes Refine-POI to address the data mismatch problem in POI recommendation |
large language model |
|
|
| 2 |
Advancing Harmful Content Detection in Organizational Research: Integrating Large Language Models with Elo Rating System |
Proposes an Elo rating system-based method to improve harmful content detection |
large language model |
|
|
| 3 |
Large Language Models are Near-Optimal Decision-Makers with a Non-Human Learning Behavior |
Shows that large language models are near-optimal decision-makers but exhibit non-human learning behavior |
large language model |
|
|
| 4 |
IS-Bench: Evaluating Interactive Safety of VLM-Driven Embodied Agents in Daily Household Tasks |
Proposes IS-Bench to address the interactive safety of VLM-driven agents |
embodied AI chain-of-thought |
✅ |
|
| 5 |
Evaluating VisualRAG: Quantifying Cross-Modal Performance in Enterprise Document Understanding |
Proposes a quantification framework to improve cross-modal performance in enterprise document understanding |
foundation model multimodal |
|
|
| 6 |
LLMs in Coding and their Impact on the Commercial Software Engineering Landscape |
Proposes a security review mechanism for code generation tools to address risks in software engineering |
large language model |
|
|
| 7 |
Evaluating the Use of LLMs for Documentation to Code Traceability |
Evaluates the potential of large language models for documentation-to-code traceability |
large language model |
|
|
| 8 |
TrainVerify: Equivalence-Based Verification for Distributed LLM Training |
Proposes TrainVerify to address verification of distributed large language model training |
large language model |
|
|
| 9 |
SemAgent: A Semantics Aware Program Repair Agent |
Proposes SemAgent to address semantic understanding in program repair |
large language model |
|
|
| 10 |
A Community-driven vision for a new Knowledge Resource for AI |
Proposes a community-driven knowledge resource framework to address knowledge gaps in AI |
large language model |
|
|
| 11 |
AI-Driven Tools in Modern Software Quality Assurance: An Assessment of Benefits, Challenges, and Future Directions |
Proposes AI-driven tools to address challenges in modern software quality assurance |
large language model |
|
|
| 12 |
Do We Talk to Robots Like Therapists, and Do They Respond Accordingly? Language Alignment in AI Emotional Support |
Explores how conversations with emotional support robots resemble those with human therapists, and how each responds |
large language model |
|
|
| 13 |
Explainable Rule Application via Structured Prompting: A Neural-Symbolic Approach |
Proposes a structured prompting framework to address rule application in legal analysis |
large language model |
|
|
| 14 |
LMR-BENCH: Evaluating LLM Agent's Ability on Reproducing Language Modeling Research |
Proposes LMR-BENCH to evaluate LLM agents' ability to reproduce code from language modeling research |
large language model |
|
|