| 1 | How Transformers Learn Regular Language Recognition: A Theoretical Study on Training Dynamics and Implicit Bias | Presents a training-dynamics analysis of how a one-layer transformer learns regular language recognition | large language model chain-of-thought | | |
| 2 | When Dynamic Data Selection Meets Data Augmentation | Proposes an online data training framework that addresses how dynamic data selection and data augmentation can work together | multimodal | | |
| 3 | Grouped Sequency-arranged Rotation: Optimizing Rotation Transformation for Quantization for Free | Proposes Grouped Sequency-arranged Rotation to improve low-bit quantization | large language model | | |
| 4 | Scalability Matters: Overcoming Challenges in InstructGLM with Similarity-Degree-Based Sampling | Proposes SDM-InstructGLM to address the scalability problem of large-scale graph processing | large language model | | |