This paper presents convergent validity evidence for the mandatory, standards‐based Chilean national teacher evaluation system (NTES). The study examined whether NTES identifies – and thereby rewards or punishes – the ‘right’ teachers as high‐ or low‐performing. We collected in‐depth teaching performance data on a sample of 58 teachers who were evaluated by NTES as either ‘outstanding’ (group 1) or ‘unsatisfactory’ (group 2). The collected evidence included gains in student achievement scores, observation log data, expert ratings of a teaching materials binder, and teachers' scores on a subject and pedagogical knowledge test. The results support the validity of NTES' performance categorisations of the two extreme groups. The groups differed significantly on half of the performance indicators and showed differences in the expected direction on the remaining indicators. We found especially strong and practically significant differences related to time on task during lessons, lesson structure, student behaviour, and student evaluation materials. We also found significant correlations between our results and the sampled teachers' scores on three out of four NTES instruments.