Heterogeneous multi-core systems that combine CPUs, GPUs, hardware accelerators (HWA), and DSPs are becoming the de facto norm for computer vision applications in automotive, robotics, AR/VR, and industrial machine vision. This creates a need for a software framework that delivers high utilization of the computing elements, low latency, real-time operation, and ease of use. Several proprietary solutions exist for specific applications, but each satisfies only a few of these requirements. This paper proposes a solution for heterogeneous systems based on the standard OpenVX specification. It introduces three novel techniques: distributed graph execution across heterogeneous cores, data tiling to accommodate diverse memory constraints, and an easy-to-use high-level graph description of the application. The solution is implemented on TI's TDA family of SoCs for a mono-camera vision application, with the platform code generated from the high-level graph description. Profiling confirms real-time operation, low latency achieved by reducing host CPU interaction, and 99% utilization across the heterogeneous cores.