A human attention estimation system is proposed, derived from automatic analysis of the movement and behavior of people in front of public displays. For this purpose, new attention metrics defining detailed levels and types of attention are proposed, based on an established attention model adapted to the specific use case. New movement features are introduced that translate the detected movements and behavior of people into the novel metric scales. Data capturing is achieved by depth-image analysis for tracking people and extracting statistical movement data, whereas the head orientation of people in front of the interactive system is estimated from RGB video analysis. Classification is carried out for long- and short-term attention levels, employing a data-driven approach via Support Vector Machines on experimental data from an empirical field study at a public exhibition. The system is capable of real-time feature extraction and classification.
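The pipeline described above, movement features fed to a Support Vector Machine to predict attention levels, could be sketched as follows. This is a minimal illustrative sketch using scikit-learn: the feature names, the two-level labels, and the synthetic data are assumptions for demonstration, not the paper's actual features, metric scales, or implementation.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical movement features per observed person:
# [walking speed (m/s), dwell time near display (s), fraction of time facing display]
# Labels are illustrative attention levels: 0 = passing by, 1 = attending.
n = 200
passing = np.column_stack([
    rng.normal(1.2, 0.2, n),    # fast walking
    rng.normal(0.5, 0.3, n),    # little dwell time
    rng.normal(0.1, 0.05, n),   # rarely facing the display
])
attending = np.column_stack([
    rng.normal(0.2, 0.1, n),    # slow or standing
    rng.normal(8.0, 2.0, n),    # long dwell time
    rng.normal(0.8, 0.1, n),    # mostly facing the display
])
X = np.vstack([passing, attending])
y = np.array([0] * n + [1] * n)

# SVM with feature scaling, standing in for the paper's classifier.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X, y)

# Classify a slow, dwelling, display-facing person.
print(clf.predict([[0.25, 7.5, 0.75]])[0])
```

In practice the features would come from the depth-based tracker and the RGB head-orientation estimator rather than synthetic draws, and labels from the annotated field-study data.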