The increasing complexity of HPC systems has introduced new sources of variability, which can contribute to significant differences in run-to-run performance of applications. With …
Interference between jobs competing for network bandwidth on a fat-tree cluster can cause significant variability and degradation in performance. These performance issues can be …
Folded Clos networks, commonly known as Fat-Trees, are the de facto standard topology for modern HPC systems and data centers. The number of network endpoints in these …
The era of extremely heterogeneous supercomputing brings with it the devil of increased performance variation and reduced reproducibility. There is a lack of understanding in the …
Scalable management of user workloads on large-scale supercomputers remains a challenge due to the tradeoff between capturing adequate detail for analysis from various …
Resource sharing, with its implied mutual interference, has been considered a major concern for running applications of multiple tenants in shared cloud datacenters. Besides its security …
Identifying a suitable network topology and deciding its optimal configuration parameters are critical aspects of the overall HPC system design, procurement and installation process …
N Moldvai, M Malpani - US Patent 10,819,656, 2020 - Google Patents
Methods and systems for throttling per-node network bandwidths over time to maximize the aggregate bandwidth of a distributed cluster of nodes without exceeding a global bandwidth …
G Juniwal, G Jain, A Gee - US Patent 11,030,062, 2021 - Google Patents
Methods and systems for identifying a set of disks within a cluster and then storing a plurality of data chunks into the set of disks such that the placement of the plurality of data chunks …