Trained Transformers Learn Linear Models In-Context
Ruiqi Zhang, Spencer Frei, Peter L. Bartlett; 25(49):1−55, 2024.

Abstract

Attention-based neural networks such as transformers have demonstrated a remarkable ability to exhibit in-context learning (ICL): given a short prompt sequence of tokens from an unseen task, they can formulate relevant per-token and next-token predictions without any parameter updates. By embedding a sequence of labeled training data and unlabeled test data as a prompt, transformers can behave like supervised learning algorithms. Indeed, recent work has shown
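The prompt construction described in the abstract can be sketched concretely. The snippet below is a minimal illustration, not the paper's code: it assumes a noiseless linear-regression ICL task, stacks labeled examples (x_i, y_i) and an unlabeled query into a prompt matrix, and uses ordinary least squares as the supervised algorithm a trained transformer would mimic. All variable names are hypothetical.

```python
import numpy as np

# Illustrative ICL setup: a prompt embeds n labeled examples (x_i, y_i)
# drawn from one linear task, followed by an unlabeled query x_query.
rng = np.random.default_rng(0)
d, n = 5, 20                      # input dimension, number of in-context examples
w = rng.standard_normal(d)        # task-specific weight vector (unseen by the model)
X = rng.standard_normal((n, d))   # labeled in-context inputs
y = X @ w                         # noiseless linear labels
x_query = rng.standard_normal(d)  # unlabeled test input

# Prompt layout: each example token carries (x_i, y_i); the final token
# carries (x_query, 0), whose label the model must predict in context.
prompt = np.vstack([np.hstack([X, y[:, None]]),
                    np.hstack([x_query, 0.0])])

# A transformer that learns linear models in context behaves like a
# supervised learner: here, least squares fit to the prompt's examples.
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
pred = x_query @ w_hat            # in the noiseless case this recovers w exactly
```

The key point of the setup is that the task weights `w` change from prompt to prompt, so the model must infer them from the in-context examples rather than from its trained parameters.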