M Wiering, J Schmidhuber - Adaptive Behavior, 1997 - journals.sagepub.com
HQ-learning is a hierarchical extension of Q(λ)-learning designed to solve certain types of
partially observable Markov decision problems (POMDPs). HQ automatically decomposes …