cs.LG (2025-05-12)

📊 30 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (14 🔗1) · Pillar 9: Embodied Foundation Models (10) · Pillar 8: Physics-based Animation (4) · Pillar 5: Interaction & Reaction (2)

🔬 Pillar 2: RL Algorithms & Architecture (14 papers)

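# | Title | One-sentence summary | Tags | 🔗
1 | Cache-Efficient Posterior Sampling for Reinforcement Learning with LLM-Derived Priors Across Discrete and Continuous Domains | Proposes a cache-efficient posterior sampling framework to reduce the computational cost of RL | reinforcement learning, offline RL, CQL
2 | RLSR: Reinforcement Learning from Self Reward | Proposes a self-reward reinforcement learning method to address the challenges of reward engineering | reinforcement learning, large language model
3 | Combining Bayesian Inference and Reinforcement Learning for Agent Decision Making: A Review | Reviews how Bayesian inference and reinforcement learning can be combined to improve agent decision making | reinforcement learning, policy learning, model-based RL
4 | Simple yet Effective Semi-supervised Knowledge Distillation from Vision-Language Models via Dual-Head Optimization | Proposes a dual-head optimization method to resolve gradient conflicts in knowledge distillation | distillation
5 | An Extra RMSNorm is All You Need for Fine Tuning to 1.58 Bits | Proposes adding an extra RMSNorm to stabilize fine-tuning to 1.58-bit low-bit quantized models | distillation, large language model
6 | A Theoretical Framework for Explaining Reinforcement Learning with Shapley Values | Proposes a unified theoretical framework for explaining behavior and predictions in reinforcement learning | reinforcement learning
7 | MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering | Proposes MLE-Dojo to address the lack of interactivity for LLM agents in machine learning engineering | reinforcement learning, large language model
8 | Self-Supervised Transformer-based Contrastive Learning for Intrusion Detection Systems | Proposes a Transformer model based on self-supervised contrastive learning to improve intrusion detection system performance | contrastive learning
9 | EAGLE: Contrastive Learning for Efficient Graph Anomaly Detection | Proposes EAGLE to address the inefficiency of graph anomaly detection | contrastive learning
10 | Online Episodic Convex Reinforcement Learning | Proposes online convex reinforcement learning algorithms to solve the CURL problem | reinforcement learning
11 | INTELLECT-2: A Reasoning Model Trained Through Globally Decentralized Reinforcement Learning | Proposes INTELLECT-2 to enable globally decentralized reinforcement learning training | reinforcement learning
12 | REMEDI: Relative Feature Enhanced Meta-Learning with Distillation for Imbalanced Prediction | Proposes REMEDI for vehicle purchase prediction under extreme class imbalance | distillation
13 | Representation Learning with Mutual Influence of Modalities for Node Classification in Multi-Modal Heterogeneous Networks | Proposes HGNN-IMA for node classification in multi-modal heterogeneous networks | representation learning
14 | VoI-Driven Joint Optimization of Control and Communication in Vehicular Digital Twin Network | Proposes a joint optimization framework driven by value of information (VoI) to improve vehicular digital twin network performance | reinforcement learning, deep reinforcement learning, DRL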

🔬 Pillar 9: Embodied Foundation Models (10 papers)

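# | Title | One-sentence summary | Tags | 🔗
15 | Symbolic Regression with Multimodal Large Language Models and Kolmogorov Arnold Networks | Proposes a new symbolic regression method based on multimodal large language models | large language model, multimodal
16 | Multimodal Cancer Modeling in the Age of Foundation Model Embeddings | Proposes a multimodal cancer modeling approach to improve cancer data analysis | foundation model, multimodal
17 | Assessing the Chemical Intelligence of Large Language Models | Proposes the ChemIQ benchmark to assess the chemical intelligence of large language models | large language model
18 | SpecRouter: Adaptive Routing for Multi-Level Speculative Decoding in Large Language Models | Proposes SpecRouter to address the trade-off between inference efficiency and quality in large language models | large language model
19 | Direct Density Ratio Optimization: A Statistically Consistent Approach to Aligning Large Language Models | Proposes direct density ratio optimization for aligning large language models | large language model
20 | Injecting Knowledge Graphs into Large Language Models | Proposes a method for injecting knowledge graphs into large language models to improve reasoning | large language model
21 | Beyond Input Activations: Identifying Influential Latents by Gradient Sparse Autoencoders | Proposes the Gradient Sparse Autoencoder to identify latent features that influence model outputs | large language model
22 | TACOS: Temporally-aligned Audio CaptiOnS for Language-Audio Pretraining | Proposes TACOS to address temporal alignment between audio and text descriptions | large language model
23 | LEAD: Iterative Data Selection for Efficient LLM Instruction Tuning | Proposes the LEAD framework to improve data selection efficiency in LLM instruction tuning | large language model
24 | Uncertainty Profiles for LLMs: Uncertainty Source Decomposition and Adaptive Model-Metric Selection | Proposes uncertainty source decomposition and adaptive model-metric selection to improve LLM reliability | large language model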

🔬 Pillar 8: Physics-based Animation (4 papers)

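# | Title | One-sentence summary | Tags | 🔗
25 | The Geography of Transportation Cybersecurity: Visitor Flows, Industry Clusters, and Spatial Dynamics | Proposes the BiTransGCN framework to improve cybersecurity prediction for transportation networks | spatiotemporal
26 | Self-cross Feature based Spiking Neural Networks for Efficient Few-shot Learning | Proposes spiking neural networks based on self-cross features for efficient few-shot learning | spatiotemporal
27 | Joint Graph Convolution and Sequential Modeling for Scalable Network Traffic Estimation | Proposes joint graph convolution and sequential modeling for network traffic estimation | spatiotemporal
28 | EnvCDiff: Joint Refinement of Environmental Information and Channel Fingerprints via Conditional Generative Diffusion Model | Proposes EnvCDiff to jointly refine environmental information and channel fingerprints | diff-sim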

🔬 Pillar 5: Interaction & Reaction (2 papers)

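# | Title | One-sentence summary | Tags | 🔗
29 | Private LoRA Fine-tuning of Open-Source LLMs with Homomorphic Encryption | Proposes a private LoRA fine-tuning method to address data privacy when fine-tuning open-source LLMs | OMOMO, large language model
30 | Latent Behavior Diffusion for Sequential Reaction Generation in Dyadic Setting | Proposes a latent behavior diffusion model for reaction generation in dyadic settings | reaction synthesis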
