

Poster

Bypassing the Exponential Dependency: Looped Transformers Efficiently Learn In-context by Multi-step Gradient Descent

Bo Chen ⋅ Xiaoyu Li ⋅ Yingyu Liang ⋅ Zhenmei Shi ⋅ Zhao Song

Abstract
