Objective
3D modeling is a major challenge in computer-assisted surgery (CAS). Manual segmentation, the gold standard, is tedious, time-consuming, and particularly challenging for the mandible, whereas artificial intelligence (AI)-based segmentation is a promising, time-saving alternative. However, little is known about the clinical implications of the various segmentation methods.
Method
In this cross-over study, ten mandibles were segmented by five experts in virtual reality (VR) and on a desktop screen (DS), and by five AI models. The exported mandible models were evaluated using quantitative metrics (Dice coefficient and average Hausdorff distance), a public reference (PUBDS), and blinded assessments by two radiologists.
Results
Average segmentation-to-volume accuracy (1 = poor, 5 = perfect) was comparable across the human segmentations (VR: 4.56; DS: 4.33; PUBDS: 4.55) and significantly better than that of AI-based segmentation (AI: 3.80). The average segmentation-to-segmentation accuracy showed that DS (91.4 %/0.37 mm [Dice coefficient/average Hausdorff distance]) was closer to PUBDS than to VR (90.1 %/0.44 mm). The precision of VR (96.8 %/0.14 mm) and DS (96.6 %/0.15 mm) was superior to that of PUBDS (94.1 %/0.21 mm) and the AI method (89.2 %/0.60 mm). Among the manual segmentation methods, VR was significantly faster than DS and PUBDS (p = 0.007/< 0.001); in contrast, the AI method is not time-sensitive because it can be scaled with hardware.
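For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how a Dice coefficient and a symmetric average Hausdorff distance can be computed for two binary segmentations. It is not the evaluation pipeline used in this study: the function names, the toy masks, and the 1 mm isotropic voxel spacing are illustrative assumptions, and the average Hausdorff distance is taken here as the mean of the two directed average minimum distances over all foreground voxels, which is one common definition and may differ from the one applied in the study.

```python
import numpy as np


def dice_coefficient(mask_a, mask_b):
    """Dice overlap of two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # convention assumed here: two empty masks agree perfectly
    return 2.0 * np.logical_and(a, b).sum() / total


def mask_to_points(mask, spacing):
    """Foreground voxel indices scaled to millimetres by the voxel spacing."""
    return np.argwhere(np.asarray(mask, dtype=bool)) * np.asarray(spacing, dtype=float)


def average_hausdorff_distance(points_a, points_b):
    """Mean of the two directed average minimum distances (assumed definition)."""
    pa = np.asarray(points_a, dtype=float)
    pb = np.asarray(points_b, dtype=float)
    # Brute-force pairwise Euclidean distances, shape (N, M); small inputs only.
    dists = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1)
    a_to_b = dists.min(axis=1).mean()  # each point of A to its nearest point of B
    b_to_a = dists.min(axis=0).mean()  # each point of B to its nearest point of A
    return 0.5 * (a_to_b + b_to_a)


# Toy example (hypothetical data): two overlapping blocks in a 4 x 4 x 4 volume.
seg_a = np.zeros((4, 4, 4), dtype=bool)
seg_a[1:3, 1:3, 1:3] = True   # 8 voxels
seg_b = np.zeros((4, 4, 4), dtype=bool)
seg_b[1:3, 1:3, 1:4] = True   # 12 voxels, containing all of seg_a

spacing_mm = (1.0, 1.0, 1.0)  # assumed isotropic voxel spacing
dice = dice_coefficient(seg_a, seg_b)
ahd = average_hausdorff_distance(
    mask_to_points(seg_a, spacing_mm), mask_to_points(seg_b, spacing_mm)
)
print(f"Dice: {dice:.3f}, average Hausdorff distance: {ahd:.3f} mm")
```

The brute-force distance matrix keeps the sketch dependency-free but scales quadratically with the number of voxels. For full-resolution CT segmentations, a distance-transform or k-d-tree approach restricted to surface voxels would be the more practical choice.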
Conclusion
Accuracy and precision of mandible segmentation depend primarily on CT quality and anatomical site. Because deficits in either could negatively impact CAS, both factors should be considered in clinical applications and in the generation of AI training data. Although current AI models have perfect intra-model reliability, they show higher inter-model variability and produce invalid outliers, so human review remains necessary. In summary, manual segmentation in VR showed high overall accuracy and precision while saving time; together with its good usability, this makes it preferable to DS.