Due to their massive parallelism and high performance per watt, GPUs have become popular in high-performance computing and are a strong candidate for future exascale systems. However, communication and data transfer in GPU-accelerated systems remain a challenging problem. Since a GPU is typically unable to control a network device, a hybrid programming model is commonly used in which the GPU performs the computation and the CPU handles the communication. As a result, communication between distributed GPUs suffers from unnecessary overhead introduced by switching control flow from GPU to CPU and back; moreover, a dedicated CPU thread is often required to control GPU-related communication. In this work, we modify the user-space libraries and device drivers of GPUs and of an InfiniBand network device so that the GPU can control the network device and independently source and sink communication requests without any CPU involvement. Our results show that complex networking protocols such as InfiniBand Verbs are better handled by CPUs: work-request generation is inherently sequential and therefore a poor fit for the highly parallel programming model of GPUs. The large number of instructions and host-memory accesses required to source and sink a communication request on the GPU degrades performance. Only by drastically reducing the complexity of the InfiniBand protocol can some performance improvement be achieved.
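To make the serialization bottleneck concrete, the minimal CUDA sketch below illustrates how a GPU kernel might generate a send work request and ring the HCA's doorbell. All names here (`simple_wqe`, `ring_doorbell`, `gpu_post_send`) are illustrative assumptions, not the real Verbs API or WQE layout; a plain device word stands in for the memory-mapped doorbell register, and the real protocol involves far more fields and ordering rules, which is precisely the overhead described above.

```cuda
// Hypothetical, heavily simplified sketch of GPU-side work-request
// generation. Not the real InfiniBand Verbs WQE layout or doorbell
// protocol; structure and function names are illustrative only.
#include <cstdint>
#include <cstdio>
#include <cuda_runtime.h>

// A drastically simplified work-queue entry. Real WQEs carry many more
// fields (opcode, flags, scatter/gather segments, inline data, ...).
struct simple_wqe {
    uint64_t local_addr;   // source buffer (registered memory)
    uint32_t lkey;         // local memory key
    uint32_t length;       // bytes to send
};

// In a real setup the HCA's doorbell register would be mapped into the
// GPU's address space; here an ordinary device word stands in for it.
__device__ void ring_doorbell(volatile uint32_t *doorbell, uint32_t idx)
{
    __threadfence_system();   // make the WQE visible before the doorbell
    *doorbell = idx;          // would be an MMIO write in a real driver
}

__global__ void gpu_post_send(simple_wqe *sq, volatile uint32_t *doorbell,
                              uint64_t buf, uint32_t lkey, uint32_t len)
{
    // Work-request generation is inherently serial: a single thread
    // fills in the queue entry while all other threads sit idle.
    if (threadIdx.x == 0 && blockIdx.x == 0) {
        sq[0].local_addr = buf;
        sq[0].lkey       = lkey;
        sq[0].length     = len;
        ring_doorbell(doorbell, 1);   // signal that one WQE is ready
    }
}

int main()
{
    simple_wqe *sq;
    uint32_t *doorbell;
    cudaMalloc(&sq, sizeof(simple_wqe));
    cudaMalloc(&doorbell, sizeof(uint32_t));
    cudaMemset(doorbell, 0, sizeof(uint32_t));

    // Hypothetical buffer address and key, for illustration only.
    gpu_post_send<<<1, 32>>>(sq, doorbell, 0x1000, 42, 4096);
    cudaDeviceSynchronize();
    printf("posted one (simulated) send work request from the GPU\n");
    return 0;
}
```

Note that only one thread does useful work in this sketch: the sequential nature of filling a queue entry and ringing a doorbell wastes the GPU's parallelism, which is the core reason the measured overhead cannot be amortized.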