Mobile phones are a ubiquitous and preferred platform for communication, entertainment, and information access. Smartphones may provide an opportunity to better assess mood and behavior and to deliver timely, economical, rapid, and effective intervention for those with mental disorders. This is an important target because behavioral health problems are associated with many of the medical disorders most responsible for morbidity and cost. Today, psychiatrists are seeking mobile technology channels that can reduce evaluation costs, increase accuracy, and facilitate ubiquitous longitudinal monitoring of treatment and outcome measures on patients' smartphones. Facial expression recognition is an active research area in psychiatry for evaluating a patient's emotional health. Smartphone technology for recognizing facial expressions of emotion is still emerging and offers an open platform for research areas such as ubiquitous intelligence and computing. In this research, we present a framework to track a user's emotional engagement with videos played on a smartphone. The framework processes video of the user recorded by the smartphone's front-facing camera and tracks facial features to detect the durations of joy induced by the played videos. We also conducted subject studies with healthy individuals to evaluate this approach to measuring emotional engagement. We believe that the results are promising and offer valuable insight for building ubiquitous intelligent systems that can support various areas of psychiatric research.