| 1 |
A Large Language Model-Empowered Agent for Reliable and Robust Structural Analysis |
Proposes an LLM-driven agent to address reliability and robustness in structural analysis |
large language model chain-of-thought |
|
|
| 2 |
VERA: Variational Inference Framework for Jailbreaking Large Language Models |
Proposes the VERA framework to address black-box jailbreaking of large language models |
large language model |
|
|
| 3 |
The Consistency Hypothesis in Uncertainty Quantification for Large Language Models |
Proposes the consistency hypothesis to improve uncertainty quantification for large language models |
large language model |
|
|
| 4 |
Assessing the feasibility of Large Language Models for detecting micro-behaviors in team interactions during space missions |
Uses large language models to detect micro-behaviors in team interactions during space missions |
large language model |
|
|
| 5 |
Detection of Personal Data in Structured Datasets Using a Large Language Model |
Proposes a GPT-4o-based personal data detection method to address privacy issues in structured datasets |
large language model |
|
|
| 6 |
PapersPlease: A Benchmark for Evaluating Motivational Values of Large Language Models Based on ERG Theory |
Proposes the PapersPlease benchmark to evaluate the motivational values of large language models |
large language model |
✅ |
|
| 7 |
DeepOmni: Towards Seamless and Smart Speech Interaction with Adaptive Modality-Specific MoE |
Proposes DeepTalk to address forgetting and performance degradation in multimodal large language models |
large language model multimodal |
✅ |
|
| 8 |
Lost at the Beginning of Reasoning |
Proposes an efficient sampling strategy to optimize the initial steps of reasoning |
large language model chain-of-thought |
|
|
| 9 |
Optimal Estimation of Watermark Proportions in Hybrid AI-Human Texts |
Proposes an optimal estimation method for the proportion of watermarked text in mixed-source texts |
large language model |
|
|
| 10 |
Temperature Matters: Enhancing Watermark Robustness Against Paraphrasing Attacks |
Proposes a new watermarking method to improve robustness against paraphrasing attacks |
large language model |
|
|
| 11 |
QuickSilver -- Speeding up LLM Inference through Dynamic Token Halting, KV Skipping, Contextual Token Fusion, and Adaptive Matryoshka Quantization |
Proposes QuickSilver to accelerate large language model inference |
large language model |
|
|
| 12 |
RExBench: Can coding agents autonomously implement AI research extensions? |
Proposes RExBench to evaluate AI agents' ability to implement research extensions |
large language model |
|
|
| 13 |
Refining Czech GEC: Insights from a Multi-Experiment Approach |
Proposes a Transformer-based grammatical error correction system for Czech |
large language model |
✅ |
|
| 14 |
Towards Fair Rankings: Leveraging LLMs for Gender Bias Detection and Measurement |
Uses large language models to detect and measure gender bias in order to achieve fair rankings |
large language model |
|
|
| 15 |
A Dual-Layered Evaluation of Geopolitical and Cultural Bias in LLMs |
Proposes a dual-layered evaluation framework to analyze geopolitical and cultural bias in LLMs |
large language model |
|
|
| 16 |
WildSpeech-Bench: Benchmarking End-to-End SpeechLLMs in the Wild |
Proposes WildSpeech-Bench to address gaps in the evaluation of speech LLMs |
large language model |
|
|
| 17 |
RiverEcho: Real-Time Interactive Digital System for Ancient Yellow River Culture |
Proposes the RiverEcho system for passing on Yellow River culture through real-time interaction |
large language model |
|
|