Complex join queries are expensive to process on big data. Fast and accurate approximations of join queries with common aggregate functions can bring substantial benefits in fields such as data management, data mining, and machine learning. State-of-the-art methods mainly generate non-reusable samples at query time, which can be costly for big data applications. In this research, we develop a scalable sample-based synopsis, called Scalable Join Correlated Sample Synopsis (CS*), which can be pre-computed and does not rely on any index structure. CS* needs to be generated only once and can then be used to answer all future queries on the same database. It efficiently maintains join relationships between sampled tuples by means of a scalable join correlated sampling scheme and a unique numerical value called the join ratio (JR). We further introduce two novel data structures, the count trace and the join correlated histogram, to optimize the calculation of JR values in MapReduce. For query estimation, we develop multiple unbiased estimators on CS* that provide fast and accurate approximations for join queries with common aggregate functions, acyclic or cyclic join graphs, and dangling tuples. An experimental study on large datasets demonstrates that CS* can be generated efficiently and provides accurate join query estimates with small sampling fractions.
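
As background for the idea of preserving join relationships between sampled tuples, the following is a minimal sketch of generic hash-based correlated sampling for the COUNT of a two-table equi-join. It is not the CS* construction: it uses no join ratio, count trace, or join correlated histogram, and it handles only a single join. The names `key_hash`, `correlated_sample`, `join_count`, and `estimate_join_count`, as well as the `seed` parameter, are hypothetical and introduced only for illustration. The sketch assumes the hash behaves like a uniform random function over join-key values.

```python
import hashlib


def key_hash(key, seed="demo"):
    """Map a join-key value to a pseudo-uniform number in [0, 1)."""
    digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def correlated_sample(rows, key_index, p):
    """Keep every row whose join key hashes below p.

    Both tables use the same hash, so a key is kept in both tables or in
    neither, and sampled tuples still find their join partners.
    """
    return [row for row in rows if key_hash(row[key_index]) < p]


def join_count(left, right, left_key, right_key):
    """Exact COUNT(*) of the equi-join, computed with a hash table."""
    counts = {}
    for row in right:
        counts[row[right_key]] = counts.get(row[right_key], 0) + 1
    return sum(counts.get(row[left_key], 0) for row in left)


def estimate_join_count(left, right, left_key, right_key, p):
    """Scale the join size of the two samples by 1/p.

    Each key survives with probability p, so the expected sampled join
    size is p times the true join size (under the uniform-hash assumption).
    """
    sample_left = correlated_sample(left, left_key, p)
    sample_right = correlated_sample(right, right_key, p)
    return join_count(sample_left, sample_right, left_key, right_key) / p
```

With `p = 1.0` every tuple is kept and the estimate equals the exact join size. Smaller values of `p` shrink the samples at the cost of higher variance. Because the samples depend only on the hash and not on the query, they can be built once and reused, which is the property the synopsis approach relies on.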