Slide-level computational pathology on gigapixel whole-slide images (WSIs) is helpful in disease diagnosis but remains challenging. We propose a context-aware approach, termed WSI inspection via transformer (WIT), for slide-level classification that holistically models dependencies among the patches of a WSI. WIT automatically learns a feature representation of the WSI by aggregating the features of all image patches. We evaluate the classification performance of WIT against a state-of-the-art baseline method. WIT achieved an accuracy of 82.1% (95% CI, 80.7%–83.3%) in the detection of 32 cancer types on the TCGA dataset, 0.918 (0.910–0.925) in the diagnosis of cancer on the CPTAC dataset, and 0.882 (0.87–0.890) in the diagnosis of prostate cancer from needle biopsy slides, outperforming the baseline by 31.6%, 5.4%, and 9.3%, respectively. WIT can also pinpoint the WSI regions that are most influential for its decision. WIT represents a new paradigm for computational pathology, facilitating the development of digital pathology tools.
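
To illustrate the general idea of aggregating patch features with a transformer-style mechanism, the following is a minimal sketch and not the WIT architecture itself. It assumes a single attention head, identity query/key/value projections, and a hypothetical class token prepended to the patch sequence; the names `self_attention` and `aggregate_slide` and the toy feature vectors are illustrative assumptions. The class token's output serves as the slide-level representation, and its attention weights over the patches give one simple indication of which patches contributed most.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: identity projections (queries = keys = values = tokens),
    so there are no learned parameters in this sketch.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for query in tokens:
        w = softmax([dot(query, key) / scale for key in tokens])
        weights.append(w)
        # Each output is the attention-weighted average of all tokens.
        outputs.append(
            [sum(w_j * tok[d] for w_j, tok in zip(w, tokens)) for d in range(dim)]
        )
    return outputs, weights


def aggregate_slide(patch_features, class_token):
    """Aggregate all patch features into one slide-level vector.

    The (hypothetical) class token is prepended to the patch sequence;
    its attended output is returned together with its attention weight
    on each patch.
    """
    tokens = [class_token] + patch_features
    outputs, weights = self_attention(tokens)
    slide_vector = outputs[0]
    patch_attention = weights[0][1:]  # drop the class token's weight on itself
    return slide_vector, patch_attention


# Toy example: three patches with 2-dimensional features.
patch_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
class_token = [0.5, 0.5]
slide_vector, patch_attention = aggregate_slide(patch_features, class_token)
most_influential_patch = patch_attention.index(max(patch_attention))
```

In a trained model, the projections and the class token would be learned, several attention layers would typically be stacked, and the slide-level vector would be passed to a classification head; none of these details are specified in the text above.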