The reinforcement learning hypothesis of dopamine function predicts that dopamine acts as a teaching signal by governing synaptic plasticity in the striatum. The induced changes in synaptic strength enable the cortico-striatal network to learn a mapping between situations and the actions that lead to reward. A review of the relevant neurophysiology of dopamine function in the cortico-striatal network, and of the machine reinforcement learning hypothesis, reveals an apparent mismatch with recent electrophysiological studies. These studies found that, in addition to the well-described reward-related responses, a subpopulation of dopamine neurons also exhibits phasic responses to aversive stimuli or to cues predicting aversive stimuli. Actions that lead to aversive events should not be reinforced. However, published data suggest that phasic responses of dopamine neurons to reward-related stimuli have a higher firing rate and a longer duration than phasic responses to aversion-related stimuli. We propose that, on the basis of the resulting differences in dopamine concentration, the target structures are able to distinguish reward-related from aversion-related dopamine responses. The learning of actions in the basal-ganglia network thereby integrates information about both costs and benefits. This hypothesis predicts that dopamine concentration should be a crucial parameter for plasticity rules at cortico-striatal synapses. Recent in vitro studies on cortico-striatal synaptic plasticity rules support a striatal action-learning scheme in which dopamine-dependent forms of synaptic plasticity occur during reward-related dopamine release, whereas during aversion-related dopamine release the dopamine concentration permits only dopamine-independent forms of synaptic plasticity.
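
The proposed decoding scheme can be illustrated with a minimal sketch. It is a toy model under stated assumptions, not a fitted or published plasticity rule. The function names, the threshold, the learning rates, and all numerical values are hypothetical. The assumption that the dopamine-independent form weakens the synapse is also made purely for illustration. The only elements taken from the hypothesis itself are that concentration increases with firing rate and burst duration, and that concentration selects which form of plasticity can occur.

```python
# Toy illustration of concentration-gated plasticity.
# All constants are hypothetical and in arbitrary units; none come from data.

DA_THRESHOLD = 1.0  # assumed concentration above which dopamine-dependent plasticity occurs


def dopamine_concentration(firing_rate_hz: float, duration_s: float, gain: float = 1.0) -> float:
    """Crude proxy: concentration scales with firing rate times burst duration."""
    return gain * firing_rate_hz * duration_s


def plasticity_mode(da: float, threshold: float = DA_THRESHOLD) -> str:
    """Select the form of plasticity permitted at a given dopamine concentration."""
    return "dopamine-dependent" if da >= threshold else "dopamine-independent"


def weight_update(pre: float, post: float, da: float,
                  lr_dependent: float = 0.1, lr_independent: float = 0.02) -> float:
    """Return the change in synaptic weight for one pre/post pairing.

    Assumption for illustration only: the dopamine-dependent form strengthens
    the synapse and the dopamine-independent form weakens it.
    """
    if plasticity_mode(da) == "dopamine-dependent":
        return lr_dependent * pre * post
    return -lr_independent * pre * post


# Hypothetical bursts: the reward-related one is faster and longer.
reward_da = dopamine_concentration(firing_rate_hz=20.0, duration_s=0.2)
aversive_da = dopamine_concentration(firing_rate_hz=10.0, duration_s=0.05)

for label, da in (("reward", reward_da), ("aversive", aversive_da)):
    print(label, plasticity_mode(da), weight_update(pre=1.0, post=1.0, da=da))
```

In this sketch both kinds of event trigger a phasic response, but only the reward-related burst crosses the assumed threshold. The same pre/post pairing is therefore reinforced after reward and not after an aversive event.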