Quantitative biomechanical models can identify control parameters that are used during movements, and movement parameters that are encoded by premotor neurons. We fit a mathematical dynamical systems model including subsyringeal pressure, syringeal biomechanics and upper-vocal-tract filtering to the songs of zebra finches. This reduces the dimensionality of singing dynamics, described as trajectories (motor ‘gestures’) in a space of subsyringeal pressure and syringeal tension. Here we assess model performance by characterizing the auditory response ‘replay’ of song premotor HVC neurons to the presentation of song variants in sleeping birds, and by examining HVC activity in singing birds. HVC projection neurons were excited and interneurons were suppressed within a few milliseconds of the extreme time points of the gesture trajectories. Thus, the HVC precisely encodes vocal motor output through activity at the times of extreme points of movement trajectories. We propose that the sequential activity of HVC neurons is used as a ‘forward’ model, representing the sequence of gestures in song to make predictions about expected behaviour and to evaluate feedback.