Today's network control systems have very limited ability to adapt to changing network conditions. Adding reinforcement learning‐based network management agents can improve quality of service by reconfiguring network layer protocol parameters in response to observed network performance. This paper presents a closed‐loop approach (AODV‐Q) that tunes two parameters of the Ad‐Hoc On‐Demand Distance Vector (AODV) routing protocol, the Hello Interval and the Active Route Timeout, based on current and previous observations of network state. Simulation results show that the proposed self‐configuration method improves on the original AODV protocol, reducing protocol overhead by 43% and end‐to‐end delay by 29% while increasing the packet delivery ratio by up to 11%. Copyright © 2012 John Wiley & Sons, Ltd.