It has been observed that large language models exhibit a power-law decay of cross-entropy with respect to the number of parameters and training tokens. When extrapolated literally, this …
As we discuss, a stationary stochastic process is nonergodic when a random persistent topic can be detected in the infinite random text sampled from the process, whereas we call the …
We present a hypothetical argument against finite-state processes in statistical language modeling that is based on semantics rather than syntax. In this theoretical model, we …
Neural language models have drawn considerable attention for their strong ability to predict natural language text. In this paper, we estimate the entropy rate of natural language with …
Ł Dębowski - IEEE Transactions on Information Theory, 2017 - ieeexplore.ieee.org
Maximal repetition of a string is the maximal length of a repeated substring. This paper investigates maximal repetition of strings drawn from stochastic processes. Strengthening …
By an analogy to the duality between the recurrence time and the longest match length, we introduce a quantity dual to the maximal repetition length, which we call the repetition time …
Ł Dębowski - IEEE Transactions on Information Theory, 2017 - ieeexplore.ieee.org
A regular Hilberg process is a stationary process that satisfies both a hyperlogarithmic growth of maximal repetition and a power-law growth of topological entropy, which are a …
We review three mathematical developments linked with Hilberg's conjecture, a hypothesis about the power-law growth of the entropy of texts in natural language, which sets up a …
BF Skinner. Verbal Behavior. Prentice Hall, 1957. Skinner-like argument: The human brain consists of a billion neurons (a finite number). Assuming that each neuron can be in two …
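The counting step in the Skinner-like argument above can be sketched numerically. A minimal sketch, assuming (as the snippet does) 10^9 binary neurons, so that the brain has at most 2^(10^9) global states, a finite though astronomically large state space; the neuron count and the binary-state assumption are the snippet's simplifications, not neuroscientific fact:

```python
import math

# Sketch of the Skinner-like counting argument: with finitely many
# neurons, each in finitely many states, the number of global brain
# states is finite, suggesting a finite-state model of verbal behavior.
NEURONS = 10**9          # assumed neuron count (from the snippet)
STATES_PER_NEURON = 2    # assumed binary on/off state per neuron

# 2**(10**9) is too large to enumerate, but its size is easy to
# characterize via logarithms.
log2_states = NEURONS * math.log2(STATES_PER_NEURON)           # bits
decimal_digits = int(NEURONS * math.log10(STATES_PER_NEURON)) + 1

print(f"log2(#states) = {log2_states:.0f} bits")
print(f"#states has about {decimal_digits} decimal digits")
```

The point of the sketch is only that finiteness holds in principle; the number of states (roughly 3 x 10^8 decimal digits long) is far beyond anything enumerable, which is why the argument is hypothetical rather than practical.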