The Colonoscopic Withdrawal Time (CWT) is the time required to withdraw the endoscope during a colonoscopy procedure. Estimating the CWT has several applications, including as a performance metric for gastroenterologists and as an augmentation to polyp detection systems. We present a method for estimating the CWT directly from colonoscopy video, based on three separate modules: egomotion computation, depth estimation, and anatomical landmark classification. Features computed from the modules’ outputs are used to classify each frame as representing forward, stagnant, or backward motion. The change points between these phases are then detected optimally by efficient maximization of the likelihood, and the CWT follows directly from them. We collect a dataset of 788 videos of colonoscopy procedures, each annotated with its CWT by gastroenterologists. Our algorithm achieves a mean error of 1.20 min, which nearly matches the inter-rater disagreement of 1.17 min.
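
To make the change-point step concrete, the following is a minimal sketch under simplifying assumptions, not the method as implemented in this work. It assumes that a hypothetical per-frame classifier supplies log-probabilities for the three motion classes, that the procedure consists of exactly three consecutive phases (forward, then stagnant, then backward), and that withdrawal lasts from the second change point to the end of the video. The function names `detect_change_points` and `withdrawal_minutes` are illustrative only. Under these assumptions, the pair of change points that maximizes the total log-likelihood can be found in a single linear pass over the frames.

```python
import math


def detect_change_points(log_probs):
    """Return (t1, t2) maximizing the total log-likelihood of the labeling
    forward on [0, t1), stagnant on [t1, t2), backward on [t2, T).

    log_probs: sequence of T triples (forward, stagnant, backward) holding
    per-frame log-probabilities from a hypothetical frame classifier.
    """
    # a[t]: gain from labeling the first t frames forward rather than stagnant.
    # b[t]: gain from labeling the first t frames stagnant rather than backward.
    # Up to an additive constant, the objective equals a[t1] + b[t2], t1 <= t2.
    a, b = [0.0], [0.0]
    for fwd, stag, back in log_probs:
        a.append(a[-1] + fwd - stag)
        b.append(b[-1] + stag - back)

    best_score, best = -math.inf, (0, 0)
    t1 = 0  # index of the largest a[0..t2], maintained incrementally
    for t2 in range(len(a)):
        if a[t2] > a[t1]:
            t1 = t2
        score = a[t1] + b[t2]
        if score > best_score:
            best_score, best = score, (t1, t2)
    return best


def withdrawal_minutes(t2, num_frames, fps):
    """Withdrawal time in minutes, assuming (for this sketch only) that
    withdrawal spans frame t2 through the last frame of the video."""
    return (num_frames - t2) / fps / 60.0
```

Because the objective separates into one term that depends only on the first change point and one that depends only on the second, keeping a running maximum of the first term avoids the quadratic search over all pairs. A model with more phases, or with phases that can recur, would need a more general dynamic program.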