This paper proposes a novel reward-based Single Network Adaptive Critic (SNAC) architecture for optimal kinematic control of a redundant robot manipulator. First, the closed-loop positioning task is formulated as a discrete-time input-affine system. The notion of goal-directed optimal behavior is captured by defining the performance measure, or cost function, in terms of a discounted reward function, and the optimal controller is then derived for this reward-based cost function using the adaptive critic. The reward function is chosen deliberately to speed up convergence of the closed-loop positioning error. The performance of the proposed reward-based SNAC architecture is compared with that of a standard quadratic-cost-based SNAC for kinematic control of the same redundant manipulator. The results show that the overall trajectory cost converges faster with the proposed architecture. The proposed kinematic solution has been validated in simulation and executed experimentally on a 6-degree-of-freedom (DOF) Universal Robot (UR) 10 robot manipulator for both regulation and tracking tasks.
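To make the formulation concrete, the following is a minimal sketch, not the method of this paper. It writes kinematic control of a hypothetical 3-link planar arm (redundant for a 2-D positioning task) as a discrete-time input-affine system, q_{k+1} = f(q_k) + g(q_k) u_k with f(q) = q and g(q) = T·I, and accumulates a discounted cost along the closed-loop trajectory. Everything here is an assumption made for illustration: the arm geometry, the sampling period `T`, the discount factor `GAMMA`, the negative-quadratic placeholder reward, and the Jacobian-transpose control law, which stands in for the SNAC controller and is not optimal in any sense.

```python
import math

LINKS = (1.0, 1.0, 1.0)  # hypothetical link lengths (not the UR10)
T = 0.05                 # assumed sampling period
GAMMA = 0.95             # assumed discount factor


def forward_kinematics(q):
    """End-effector position of a planar arm with relative joint angles q."""
    x = y = angle = 0.0
    for length, qi in zip(LINKS, q):
        angle += qi
        x += length * math.cos(angle)
        y += length * math.sin(angle)
    return x, y


def jacobian(q):
    """2 x n position Jacobian, returned as (row_x, row_y)."""
    angles, a = [], 0.0
    for qi in q:
        a += qi
        angles.append(a)
    n = len(q)
    row_x = [-sum(LINKS[j] * math.sin(angles[j]) for j in range(i, n)) for i in range(n)]
    row_y = [sum(LINKS[j] * math.cos(angles[j]) for j in range(i, n)) for i in range(n)]
    return row_x, row_y


def step(q, u):
    """Input-affine update q_{k+1} = f(q_k) + g(q_k) u_k with f(q) = q, g(q) = T * I."""
    return [qi + T * ui for qi, ui in zip(q, u)]


def simulate(q0, target, gain=1.0, steps=400):
    """Run the closed loop; return the error norms and the discounted cost."""
    q, errors, cost = list(q0), [], 0.0
    for k in range(steps):
        x, y = forward_kinematics(q)
        ex, ey = target[0] - x, target[1] - y
        errors.append(math.hypot(ex, ey))
        row_x, row_y = jacobian(q)
        # Stand-in controller: Jacobian-transpose law u = gain * J^T e (not SNAC).
        u = [gain * (row_x[i] * ex + row_y[i] * ey) for i in range(len(q))]
        # Placeholder reward: negative quadratic penalty on error and control effort.
        reward = -(ex * ex + ey * ey) - 0.01 * sum(ui * ui for ui in u)
        cost += (GAMMA ** k) * (-reward)  # discounted cost = discounted negative reward
        q = step(q, u)
    return errors, cost


errors, cost = simulate([0.0, 1.0, 1.0], (1.5, 1.0))
print(f"initial error {errors[0]:.3f}, final error {errors[-1]:.3e}, cost {cost:.3f}")
```

Swapping a different reward into `simulate` changes only the line that computes `reward`, which is where a reward-based cost and a quadratic cost would differ in a comparison of this kind.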