time performance by exploiting the GPU's streaming architecture at all stages of kd-tree
construction. Unlike previous parallel kd-tree algorithms, our method builds tree nodes
completely in BFS (breadth-first search) order. We also develop a special strategy for large
nodes at upper tree levels to further exploit the fine-grained parallelism of GPUs. For
these nodes, we parallelize the computation over all geometric primitives instead of nodes at …