H Furuta, M Gouki, Y Iwasawa, Y Matsuo - arXiv preprint arXiv:2402.16726, 2024 - arxiv.org
Grokking has been actively explored to reveal the mystery of delayed generalization.
Identifying interpretable algorithms inside the grokked models is a suggestive hint to …