Purpose
In bronchoscopy, computer vision systems for navigation assistance are an attractive low-cost solution to guide the endoscopist to target peripheral lesions for biopsy and histological analysis. We propose a decoupled deep learning architecture that projects input frames onto the domain of CT renderings, thus allowing offline training from patient-specific CT data.
Methods
A fully convolutional network architecture is implemented on GPU and tested on a phantom dataset comprising 32 video sequences and ∼60k frames with aligned ground truth and renderings. The dataset is made available as the first public dataset for bronchoscopy navigation.
Results
An average estimated depth accuracy of 1.5 mm was obtained, outperforming conventional direct depth estimation from input frames by 60%, and with a computational time of 30 ms on modern GPUs. Qualitatively, the estimated depth and renderings closely resemble the ground truth.
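As an illustration of how figures of this kind can be computed, the following minimal sketch assumes that depth accuracy is measured as the mean absolute per-pixel error in millimetres and that the improvement is the relative reduction of that error with respect to the direct baseline. The abstract does not specify either definition, so both are assumptions. The function names and the toy depth values are hypothetical and are not taken from the dataset.

```python
def mean_absolute_error_mm(predicted, ground_truth):
    """Mean absolute per-pixel depth error, in the units of the inputs (mm here)."""
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("depth maps must be non-empty and the same size")
    return sum(abs(p - g) for p, g in zip(predicted, ground_truth)) / len(predicted)


def relative_error_reduction(baseline_error, proposed_error):
    """Fraction by which the proposed error is lower than the baseline error."""
    if baseline_error <= 0:
        raise ValueError("baseline error must be positive")
    return (baseline_error - proposed_error) / baseline_error


# Toy flattened depth maps in millimetres (illustrative values only).
ground_truth = [10.0, 20.0, 30.0, 40.0]
direct_estimate = [12.0, 17.0, 33.0, 38.0]     # hypothetical direct estimation from frames
decoupled_estimate = [11.0, 19.0, 31.0, 39.0]  # hypothetical decoupled estimation

baseline_mae = mean_absolute_error_mm(direct_estimate, ground_truth)
proposed_mae = mean_absolute_error_mm(decoupled_estimate, ground_truth)
reduction = relative_error_reduction(baseline_mae, proposed_mae)

print(f"baseline MAE: {baseline_mae:.2f} mm")
print(f"proposed MAE: {proposed_mae:.2f} mm")
print(f"relative error reduction: {reduction:.0%}")
```

In practice the error would be averaged over all valid pixels of every frame in a sequence, typically with array operations rather than Python lists.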
Conclusions
The proposed method introduces a novel architecture for real-time monocular depth estimation in bronchoscopy without losing patient specificity. Future work will include integration within SLAM systems and collection of in vivo datasets.