SQUID: subtrajectory query in trillion-scale GPS database

D Zhang, Z Chang, D Yang, D Li, KL Tan, K Chen, G Chen
The VLDB Journal, 2023 - Springer
Abstract
Subtrajectory query is a fundamental operator in mobility data management and is useful in applications such as trajectory clustering, co-movement pattern mining, and contact tracing in epidemiology. In this paper, we make the first attempt to study subtrajectory query in trillion-scale GPS databases, so as to support applications with urban-scale moving users and weeks-long historical data. We develop SQUID, a distributed subtrajectory query processing engine on Spark, with three technical contributions. First, we propose compact index and storage layers to handle massive trajectory datasets with trillion-scale GPS points. Second, we leverage hybrid partitioning, together with disk-I/O-friendly local indexes, to facilitate pruning. Third, we devise a novel filter-and-refine query processing framework to effectively reduce the number of trajectories that require verification. Our experiments are conducted on large trajectory datasets with up to 520 billion GPS points. The results validate the compactness of the storage mechanism and the scalability of the distributed query processing framework.
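
The abstract names a filter-and-refine framework but does not spell it out, so the following is only a minimal single-machine sketch of the general pattern, not SQUID's actual algorithm. Everything here is an assumption made for illustration: the grid index, the function names, and the matching rule (a trajectory matches if some contiguous window of its points lies within `eps` of the query points, position by position). The filter step uses a coarse grid to discard trajectories that cannot match; the refine step runs the exact check on the survivors.

```python
from collections import defaultdict
from math import floor, hypot


def cell_of(point, eps):
    """Map a point to its grid cell; cells are eps wide (illustrative choice)."""
    x, y = point
    return (floor(x / eps), floor(y / eps))


def build_grid_index(trajectories, eps):
    """Hypothetical index: grid cell -> ids of trajectories with a point in it."""
    index = defaultdict(set)
    for tid, points in trajectories.items():
        for p in points:
            index[cell_of(p, eps)].add(tid)
    return index


def filter_candidates(index, query, eps):
    """Filter: keep ids that have some point near every query point.

    With eps-wide cells, any point within eps of q lies in the 3x3 block of
    cells around q's cell, so this step never discards a true match.
    """
    candidates = None
    for q in query:
        cx, cy = cell_of(q, eps)
        nearby = set()
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                nearby |= index.get((cx + dx, cy + dy), set())
        candidates = nearby if candidates is None else candidates & nearby
        if not candidates:
            return set()
    return candidates or set()


def refine(points, query, eps):
    """Refine: exact check for a contiguous window matching the query in order."""
    m = len(query)
    for start in range(len(points) - m + 1):
        window = points[start:start + m]
        if all(hypot(px - qx, py - qy) <= eps
               for (px, py), (qx, qy) in zip(window, query)):
            return True
    return False


def subtrajectory_query(trajectories, index, query, eps):
    """Run the cheap filter first, then verify only the remaining candidates."""
    candidates = filter_candidates(index, query, eps)
    return sorted(tid for tid in candidates
                  if refine(trajectories[tid], query, eps))
```

In this sketch the filter is deliberately loose: it ignores point order, so a trajectory that visits the same places in reverse survives the filter and is rejected only by the exact check. The benefit of the pattern is that the expensive verification runs on far fewer trajectories than the full dataset.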