cs.LG (2025-05-12)

📊 30 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (14 🔗1) · Pillar 9: Embodied Foundation Models (10) · Pillar 8: Physics-based Animation (4) · Pillar 5: Interaction & Reaction (2)

🔬 Pillar 2: RL & Architecture (14 papers)

#	Title	Key Point	Tags	🔗	⭐
1	Cache-Efficient Posterior Sampling for Reinforcement Learning with LLM-Derived Priors Across Discrete and Continuous Domains	Proposes a cache-efficient posterior sampling framework to reduce the computational cost of RL	reinforcement learning offline RL CQL
2	RLSR: Reinforcement Learning from Self Reward	Proposes a self-reward reinforcement learning method to address the challenges of reward engineering	reinforcement learning large language model
3	Combining Bayesian Inference and Reinforcement Learning for Agent Decision Making: A Review	Combines Bayesian inference with reinforcement learning to improve agent decision making	reinforcement learning policy learning model-based RL
4	Simple yet Effective Semi-supervised Knowledge Distillation from Vision-Language Models via Dual-Head Optimization	Proposes a dual-head optimization method to resolve gradient conflicts in knowledge distillation	distillation	✅
5	An Extra RMSNorm is All You Need for Fine Tuning to 1.58 Bits	Adds an extra RMSNorm to stabilize fine-tuning of low-bit quantized models to 1.58 bits	distillation large language model
6	A Theoretical Framework for Explaining Reinforcement Learning with Shapley Values	Proposes a unified theoretical framework for explaining behavior and predictions in reinforcement learning	reinforcement learning
7	MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering	Proposes MLE-Dojo to address the lack of interactivity for LLM agents in machine learning engineering	reinforcement learning large language model
8	Self-Supervised Transformer-based Contrastive Learning for Intrusion Detection Systems	Proposes a Transformer model based on self-supervised contrastive learning to improve intrusion detection system performance	contrastive learning
9	EAGLE: Contrastive Learning for Efficient Graph Anomaly Detection	Proposes EAGLE to address the inefficiency of graph anomaly detection	contrastive learning
10	Online Episodic Convex Reinforcement Learning	Proposes an online convex reinforcement learning algorithm to solve the CURL problem	reinforcement learning
11	INTELLECT-2: A Reasoning Model Trained Through Globally Decentralized Reinforcement Learning	Proposes INTELLECT-2 to enable globally decentralized reinforcement learning training	reinforcement learning
12	REMEDI: Relative Feature Enhanced Meta-Learning with Distillation for Imbalanced Prediction	Proposes REMEDI to address vehicle purchase prediction under extreme class imbalance	distillation
13	Representation Learning with Mutual Influence of Modalities for Node Classification in Multi-Modal Heterogeneous Networks	Proposes HGNN-IMA to address node classification in multi-modal heterogeneous networks	representation learning
14	VoI-Driven Joint Optimization of Control and Communication in Vehicular Digital Twin Network	Proposes a Value-of-Information-based joint optimization framework to improve vehicular digital twin network performance	reinforcement learning deep reinforcement learning DRL

🔬 Pillar 9: Embodied Foundation Models (10 papers)

#	Title	Key Point	Tags	🔗	⭐
15	Symbolic Regression with Multimodal Large Language Models and Kolmogorov Arnold Networks	Proposes a new symbolic regression method based on multimodal large language models	large language model multimodal
16	Multimodal Cancer Modeling in the Age of Foundation Model Embeddings	Proposes a multimodal cancer modeling approach to improve cancer data analysis	foundation model multimodal
17	Assessing the Chemical Intelligence of Large Language Models	Proposes the ChemIQ benchmark to assess the chemical intelligence of large language models	large language model
18	SpecRouter: Adaptive Routing for Multi-Level Speculative Decoding in Large Language Models	Proposes SpecRouter to address the trade-off between inference efficiency and quality in large language models	large language model
19	Direct Density Ratio Optimization: A Statistically Consistent Approach to Aligning Large Language Models	Proposes Direct Density Ratio Optimization to address large language model alignment	large language model
20	Injecting Knowledge Graphs into Large Language Models	Proposes a method for injecting knowledge graphs into large language models to improve reasoning	large language model
21	Beyond Input Activations: Identifying Influential Latents by Gradient Sparse Autoencoders	Proposes the Gradient Sparse Autoencoder to identify latent features that influence model output	large language model
22	TACOS: Temporally-aligned Audio CaptiOnS for Language-Audio Pretraining	Proposes TACOS to address temporal alignment between audio and text descriptions	large language model
23	LEAD: Iterative Data Selection for Efficient LLM Instruction Tuning	Proposes the LEAD framework to address data selection efficiency in LLM instruction tuning	large language model
24	Uncertainty Profiles for LLMs: Uncertainty Source Decomposition and Adaptive Model-Metric Selection	Proposes uncertainty source decomposition and adaptive model-metric selection to improve LLM reliability	large language model

🔬 Pillar 8: Physics-based Animation (4 papers)

#	Title	Key Point	Tags	🔗	⭐
25	The Geography of Transportation Cybersecurity: Visitor Flows, Industry Clusters, and Spatial Dynamics	Proposes the BiTransGCN framework to improve cybersecurity prediction for transportation networks	spatiotemporal
26	Self-cross Feature based Spiking Neural Networks for Efficient Few-shot Learning	Proposes spiking neural networks based on self-cross features for efficient few-shot learning	spatiotemporal
27	Joint Graph Convolution and Sequential Modeling for Scalable Network Traffic Estimation	Proposes joint graph convolution and sequential modeling to address network traffic estimation	spatiotemporal
28	EnvCDiff: Joint Refinement of Environmental Information and Channel Fingerprints via Conditional Generative Diffusion Model	Proposes EnvCDiff for joint refinement of environmental information and channel fingerprints	diff-sim

🔬 Pillar 5: Interaction & Reaction (2 papers)

#	Title	Key Point	Tags	🔗	⭐
29	Private LoRA Fine-tuning of Open-Source LLMs with Homomorphic Encryption	Proposes a private LoRA fine-tuning method to address data privacy when fine-tuning open-source LLMs	OMOMO large language model
30	Latent Behavior Diffusion for Sequential Reaction Generation in Dyadic Setting	Proposes a latent behavior diffusion model for reaction generation in dyadic settings	reaction synthesis
