Matrix information theory for self-supervised learning

Y Zhang, Z Tan, J Yang, W Huang, Y Yuan - arXiv preprint arXiv …, 2023 - arxiv.org
The maximum entropy encoding framework provides a unified perspective for many non-
contrastive learning methods like SimSiam, Barlow Twins, and MEC. Inspired by this …
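For context, the unifying quantity in this line of work is, to my reading of the abstract, a matrix (von Neumann) entropy of an embedding covariance/Gram matrix. The following is a sketch of the standard definition of that quantity, not the paper's exact maximum-entropy-encoding objective, whose normalization and scaling may differ:

% Sketch of matrix (von Neumann) entropy; my paraphrase, not the paper's exact loss.
\[
  \rho = \frac{K}{\operatorname{tr}(K)}, \qquad
  \mathrm{H}(K) = -\operatorname{tr}\!\left(\rho \log \rho\right)
                = -\sum_{i} \lambda_i \log \lambda_i ,
\]
where $K = \tfrac{1}{n} Z^{\top} Z$ is the covariance (Gram) matrix of $n$ embeddings $Z \in \mathbb{R}^{n \times d}$ and $\lambda_i$ are the eigenvalues of $\rho$.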

Provable Contrastive Continual Learning

Y Wen, Z Tan, K Zheng, C Xie, W Huang - arXiv preprint arXiv:2405.18756, 2024 - arxiv.org
Continual learning requires learning incremental tasks with dynamic data distributions. So
far, it has been observed that employing a combination of contrastive loss and distillation …

Large language model evaluation via matrix entropy

L Wei, Z Tan, C Li, J Wang, W Huang - arXiv preprint arXiv:2401.17139, 2024 - arxiv.org
Large language models (LLMs) have revolutionized the field of natural language
processing, extending their strong capabilities into multi-modal domains. Thus, it is vital to …
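A minimal sketch of how such a matrix entropy can be computed from a set of model embeddings. This is my illustration under the assumption that the quantity is the von Neumann entropy of a trace-normalized covariance matrix; the paper's exact normalization, centering, and scaling may differ.

# Illustrative only: von Neumann entropy of the normalized covariance of
# embeddings, assumed to correspond to the "matrix entropy" in the title.
import numpy as np

def matrix_entropy(Z: np.ndarray, eps: float = 1e-12) -> float:
    """Von Neumann entropy of the normalized covariance of row-wise embeddings Z (n x d)."""
    Z = Z - Z.mean(axis=0, keepdims=True)      # center the embeddings
    K = Z.T @ Z / Z.shape[0]                   # d x d covariance matrix
    K = K / np.trace(K)                        # normalize to unit trace (density matrix)
    eigvals = np.linalg.eigvalsh(K)            # real, non-negative up to numerical error
    eigvals = np.clip(eigvals, eps, None)
    return float(-np.sum(eigvals * np.log(eigvals)))

# Example: random Gaussian "embeddings"; larger values indicate representations
# whose energy is spread over more directions.
print(matrix_entropy(np.random.randn(512, 64)))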

Meta prompting for AGI systems

Y Zhang - arXiv preprint arXiv:2311.11482, 2023 - arxiv.org
This paper presents an in-depth exploration of Meta Prompting, a novel technique that
revolutionizes the way large language models (LLMs), multi-modal foundation models, and …

Deep Tensor Network

Y Zhang - arXiv preprint arXiv:2311.11091, 2023 - arxiv.org
In this paper, we delve into the foundational principles of tensor categories, harnessing the
universal property of the tensor product to pioneer novel methodologies in deep network …

The Information of Large Language Model Geometry

Z Tan, C Li, W Huang - arXiv preprint arXiv:2402.03471, 2024 - arxiv.org
This paper investigates the information encoded in the embeddings of large language
models (LLMs). We conduct simulations to analyze the representation entropy and discover …