L Merrick, D Xu, G Nuti, D Campos - arXiv preprint arXiv:2405.05374, 2024 - arxiv.org
This report describes the training dataset creation and recipe behind the family of
arctic-embed text embedding models (a set of five models ranging from 22 to 334 million …