The GPi-LHb pathway is the main output of the basal ganglia that has been suggested to shape motivated behaviors. We show here that Sst+ GPi-LHb neurons send direct feedback to key nodes of the basal ganglia: the GPe, the striatal striosomes, and dopamine neurons in the SNc. Chronic silencing of this pathway did not affect the learning or execution of value-guided choices, but severely disrupted the ability to adapt choice behavior and seek an alternative reward location after task reversal. Calcium imaging revealed that Sst+ GPi neurons did not signal outcome value or value updates during reversal learning. Instead, progressive suppression of Sst+ GPi activity was linked to increased commitment to one choice, whereas activity increased during exploration of alternative choices. We propose that Sst+ GPi neurons drive behavioral flexibility through a direct feedback signal that balances the activity of key nodes in the basal ganglia.