Feature extraction is an initial and essential step in the development of accurate predictive machine learning classifiers. In drug discovery and development, the use of molecular descriptors, which can be defined as mathematical representations of molecules’ chemical properties, is a challenging task not only for machine learning studies but also for classical "wet lab" approaches. Nevertheless, a high diversity of these descriptors is required in order to exploit all the available knowledge and, consequently, to maximize the potential predictive power of approaches applied to the discovery of new bioactive compounds against one or more molecular targets. Furthermore, the representation and normalization of this information is a rather time-consuming process. Herein, we present an approach that employs the power of cloud and distributed computing for the extraction, processing and representation of large datasets, leading to the generation of molecular descriptors in a reasonable time frame.