A simplistic model of neural scaling laws: Multiperiodic Santa Fe processes

Ł Dębowski - arXiv preprint arXiv:2302.09049, 2023 - arxiv.org
It was observed that large language models exhibit a power-law decay of cross entropy with
respect to the number of parameters and training tokens. When extrapolated literally, this …

Is natural language a perigraphic process? The theorem about facts and words revisited

Ł Dębowski - Entropy, 2018 - mdpi.com
As we discuss, a stationary stochastic process is nonergodic when a random persistent topic
can be detected in the infinite random text sampled from the process, whereas we call the …

A refutation of finite-state language models through Zipf's law for factual knowledge

Ł Dębowski - Entropy, 2021 - mdpi.com
We present a hypothetical argument against finite-state processes in statistical language
modeling that is based on semantics rather than syntax. In this theoretical model, we …

Excess entropy in natural language: Present state and perspectives

Ł Dębowski - Chaos: An Interdisciplinary Journal of Nonlinear …, 2011 - pubs.aip.org
We review recent progress in understanding the meaning of mutual information in natural
language. Let us define words in a text as strings that occur sufficiently often. In a few …

Maximal Repetitions in Written Texts: Finite Energy Hypothesis vs. Strong Hilberg Conjecture

Ł Dębowski - Entropy, 2015 - mdpi.com
The article discusses two mutually-incompatible hypotheses about the stochastic
mechanism of the generation of texts in natural language, which could be related to entropy …

On hidden Markov processes with infinite excess entropy

Ł Dębowski - Journal of Theoretical Probability, 2014 - Springer
We investigate stationary hidden Markov processes for which mutual information between
the past and the future is infinite. It is assumed that the number of observable states is finite …

The relaxed Hilberg conjecture: A review and new experimental support

Ł Dębowski - Journal of Quantitative Linguistics, 2015 - Taylor & Francis
The relaxed Hilberg conjecture states that the mutual information between two adjacent
blocks of text in natural language grows as a power of the block length. The present paper …

Regular Hilberg processes: An example of processes with a vanishing entropy rate

Ł Dębowski - IEEE Transactions on Information Theory, 2017 - ieeexplore.ieee.org
A regular Hilberg process is a stationary process that satisfies both a hyperlogarithmic
growth of maximal repetition and a power-law growth of topological entropy, which are a …

Approximating information measures for fields

Ł Dębowski - Entropy, 2020 - mdpi.com
We supply corrected proofs of the invariance of completion and the chain rule for the
Shannon information measures of arbitrary fields, as stated by Dębowski in 2009. Our …

Hilberg exponents: New measures of long memory in the process

Ł Dębowski - IEEE Transactions on Information Theory, 2015 - ieeexplore.ieee.org
This paper concerns the rates of power law growth of mutual information computed for a
stationary measure or for a universal code. The rates are called Hilberg exponents, and four …