Accurate infrastructure condition assessment is critical for optimized maintenance and rehabilitation planning. Closed Circuit Television (CCTV) inspection has been widely applied to the internal inspection of sewerage systems. However, the manual approach adopted in current practice is expertise-intensive and time-consuming. Previous research has attempted to apply specialized image processing techniques to detect specific defects, such as cracks and joint offsets, using engineered features. However, these engineered features are less generalizable than the features learned by state-of-the-art deep learning methods. Another crucial problem in defect classification is class imbalance: defects are heavily outnumbered by non-defects because of the high volume of normal images, and the different defect types are themselves imbalanced because their occurrence rates vary. This imbalance poses a significant challenge for both traditional methods and deep learning methods. In this paper, a method based on a deep convolutional neural network is proposed to detect and classify defects from CCTV inspections. To improve performance on imbalanced datasets, a hierarchical classification approach is introduced to supervise the learning process at different levels. The high-level detection task discriminates images with defects from normal images. The low-level classification task estimates the probability of each defect type, conditioned on the image containing a defect. The final defect classification is then derived from the chain rule of conditional probability. The network was trained and tested using inspection images collected from 24.7 km of sewer lines. The high-level defect detection accuracy improved from 78.4% to 83.2% with the hierarchical classification approach. Because the defect types are difficult to discriminate from one another, the low-level defect classification accuracy still needs improvement, but the proposed network with hierarchical classification also demonstrated superior performance over traditional approaches.
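
The chain-rule step described above can be illustrated with a minimal sketch. It is not the network or code used in the paper; it only shows how a high-level detection output and a low-level conditional output could be combined, i.e. P(defect k | image) = P(defect | image) × P(defect k | defect, image). Several details are assumptions made for illustration: that the detection head emits a single logit passed through a sigmoid, that the classification head emits one logit per defect type passed through a softmax (treating defect types as mutually exclusive), and the function name `hierarchical_probabilities` and the label `other`, which are hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def softmax(logits):
    """Softmax over a list of logits, shifted by the max for stability."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hierarchical_probabilities(detection_logit, defect_logits, defect_names):
    """Combine the two levels with the chain rule of conditional probability.

    detection_logit: high-level output, assumed to score "image has a defect".
    defect_logits:   low-level outputs, assumed to score each defect type
                     given that the image has a defect.
    Returns a dict mapping "normal" and each defect name to a probability.
    """
    p_defect = sigmoid(detection_logit)        # P(defect | image)
    p_given_defect = softmax(defect_logits)    # P(defect k | defect, image)

    probs = {"normal": 1.0 - p_defect}
    for name, p_k in zip(defect_names, p_given_defect):
        probs[name] = p_defect * p_k           # chain rule
    return probs


# Illustrative inputs only; these logits are made up, not results from the paper.
example = hierarchical_probabilities(
    detection_logit=1.2,
    defect_logits=[2.0, 0.5, -1.0],
    defect_names=["crack", "joint_offset", "other"],
)
predicted_label = max(example, key=example.get)
print(predicted_label, example)
```

Because the conditional probabilities sum to one, the joint probabilities over "normal" and all defect types also sum to one, so the final label can be taken as the most probable entry. Under this formulation the detection level can be supervised on every image, while the classification level only needs supervision on images that contain defects, which is one way the two kinds of imbalance can be handled separately.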