Effective input modelling in stochastic simulation is essential for driving and understanding underlying system behaviours. Current approaches to input modelling either treat all input data as a single homogeneous data set, producing simulation models that ignore idiosyncratic system characteristics, or treat individual data sets independently, which leads to more complex analysis. In this article, we propose a novel approach based on exploratory machine learning techniques that groups input data into clusters, so that representative system behaviours can be generated with only as many scenario experiments as are needed. Dynamic time warping is used to measure the similarity between input sources, and silhouette indices are used to determine the optimal number of clusters. This approach supports more targeted analysis for characterizing underlying system behaviours driven by factors such as socio-economics, demographics, or geography. Results from two simulation case studies demonstrate the effectiveness of the proposed approach: according to several statistical tests, system output behaviours remain invariant.
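
The clustering step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each input source is a short numeric series, that the dynamic time warping cost is the absolute difference between points, and that clusters are formed by average-linkage agglomerative clustering. The abstract does not name the clustering algorithm, so that choice is an assumption. The function names and the `profiles` data are hypothetical.

```python
from itertools import combinations


def dtw_distance(a, b):
    """Dynamic time warping distance with absolute-difference point cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def distance_matrix(series):
    """Pairwise DTW distances between all input series."""
    n = len(series)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        dist[i][j] = dist[j][i] = dtw_distance(series[i], series[j])
    return dist


def agglomerative(dist, k):
    """Average-linkage agglomerative clustering (assumed choice) down to k clusters."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        def linkage(pair):
            p, q = pair
            total = sum(dist[i][j] for i in clusters[p] for j in clusters[q])
            return total / (len(clusters[p]) * len(clusters[q]))
        # Merge the two closest clusters; p < q, so deleting q keeps p valid.
        p, q = min(combinations(range(len(clusters)), 2), key=linkage)
        clusters[p] += clusters[q]
        del clusters[q]
    labels = [0] * len(dist)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def silhouette(dist, labels):
    """Mean silhouette index from a distance matrix (needs at least 2 clusters)."""
    groups = {}
    for i, lab in enumerate(labels):
        groups.setdefault(lab, []).append(i)
    scores = []
    for i, lab in enumerate(labels):
        own = groups[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention: singleton clusters score 0
            continue
        a = sum(dist[i][j] for j in own) / (len(own) - 1)
        b = min(sum(dist[i][j] for j in g) / len(g)
                for other, g in groups.items() if other != lab)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def best_clustering(series, k_max):
    """Pick the cluster count in 2..k_max with the highest mean silhouette."""
    dist = distance_matrix(series)
    candidates = {k: agglomerative(dist, k) for k in range(2, k_max + 1)}
    best_k = max(candidates, key=lambda k: silhouette(dist, candidates[k]))
    return best_k, candidates[best_k]


# Hypothetical input profiles (e.g. demand per period at six sites).
profiles = [
    [1, 2, 3, 4, 3, 2],
    [1, 1, 2, 3, 4, 3],
    [1, 2, 4, 4, 3, 2],
    [9, 8, 7, 6, 7, 8],
    [9, 9, 8, 7, 6, 7],
    [8, 8, 7, 6, 7, 9],
]
k, labels = best_clustering(profiles, k_max=4)
print(k, labels)
```

In a study of this kind, each resulting cluster would then supply one representative input scenario for the simulation model, rather than one scenario per data set or a single pooled scenario.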