Methods
We developed the Histomorphological Phenotype Learning (HPL) method. The training pipeline consists of whole-slide image (WSI) ingestion, normalisation and tiling; self-supervised Barlow Twins training of a convolutional neural network to extract image features from each tile; and Leiden clustering of the tile vector representations into morphologically similar groups. The network was trained and evaluated on wholly separate collections of several hundred primary lung adenocarcinomas from TCGA and NYU. The resulting histomorphological phenotype clusters (HPCs) were then scrutinised and classified by expert histopathologists, assessed for their ability to predict patient outcome, and investigated molecularly.
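To make the self-supervised step concrete, the following is a minimal, dependency-free sketch of a Barlow Twins style objective computed on two batches of tile embeddings (two augmented views of the same tiles). It is an illustration only, not the HPL implementation: the function names, the off-diagonal weight `lambda_offdiag=5e-3`, and the use of the population standard deviation are assumptions made here for clarity.

```python
import math


def standardise_columns(z, eps=1e-12):
    """Return the columns of z, each with zero mean and unit std over the batch."""
    n = len(z)
    columns = []
    for col in zip(*z):
        mean = sum(col) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        columns.append([(v - mean) / (std + eps) for v in col])
    return columns


def barlow_twins_loss(z_a, z_b, lambda_offdiag=5e-3):
    """Barlow Twins style loss for two views, each a list of n embeddings of size d.

    lambda_offdiag is an assumed weight for the redundancy-reduction term.
    """
    if len(z_a) != len(z_b) or len(z_a[0]) != len(z_b[0]):
        raise ValueError("both views must have the same batch size and dimension")
    n = len(z_a)
    a = standardise_columns(z_a)
    b = standardise_columns(z_b)
    loss = 0.0
    for i, col_a in enumerate(a):
        for j, col_b in enumerate(b):
            # Entry (i, j) of the cross-correlation matrix between the two views.
            c_ij = sum(x * y for x, y in zip(col_a, col_b)) / n
            if i == j:
                loss += (1.0 - c_ij) ** 2  # invariance: diagonal pushed towards 1
            else:
                loss += lambda_offdiag * c_ij ** 2  # redundancy: off-diagonal towards 0
    return loss


# Toy example: two identical views with uncorrelated feature dimensions.
view_a = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
print(barlow_twins_loss(view_a, view_a))
```

In this toy case the two views agree perfectly and the feature dimensions are decorrelated, so the loss is close to zero. In practice the embeddings would come from the convolutional network and the loss would be minimised with an automatic-differentiation framework.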
Results
All WSIs can be represented as a tile map consisting of up to 46 HPCs. Histopathological examination reveals HPCs to be consistent, recognisable pathological entities. For example, five HPCs represent different varieties of solid-pattern growth, differing in stromal proportion, cellular grade, and tumour-infiltrating lymphocyte (TIL) density (Figure A).
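As an illustration of how such a tile map could be summarised per slide, the sketch below converts the HPC label of every tile in a WSI into a fixed-length composition vector (the fraction of tiles assigned to each HPC). This is one possible representation assumed here for clarity, not necessarily the one used in this work; the function name `hpc_composition` and the integer labels 0 to 45 are hypothetical.

```python
from collections import Counter

N_HPCS = 46  # upper bound on the number of HPCs stated in the text


def hpc_composition(tile_labels, n_hpcs=N_HPCS):
    """Return the fraction of a slide's tiles that fall into each HPC.

    tile_labels: one integer HPC label (0 .. n_hpcs - 1) per tile of the slide.
    """
    if not tile_labels:
        raise ValueError("a slide needs at least one tile")
    counts = Counter(tile_labels)
    invalid = [label for label in counts if not 0 <= label < n_hpcs]
    if invalid:
        raise ValueError(f"labels outside 0..{n_hpcs - 1}: {invalid}")
    total = len(tile_labels)
    # HPCs absent from the slide get a fraction of 0.0.
    return [counts.get(k, 0) / total for k in range(n_hpcs)]


# Toy slide with four tiles assigned to three different HPCs.
composition = hpc_composition([0, 0, 3, 45])
print(composition[0], composition[3], composition[45])
```

A vector of this kind has the same length for every slide regardless of tile count, which makes slides directly comparable.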