Distance-based sampling of software configuration spaces

C Kaltenecker, A Grebhahn… - 2019 IEEE/ACM 41st …, 2019 - ieeexplore.ieee.org
2019 IEEE/ACM 41st International Conference on Software …, 2019 - ieeexplore.ieee.org
Configurable software systems provide a multitude of configuration options to adjust and optimize their functional and non-functional properties. For instance, to find the fastest configuration for a given setting, a brute-force strategy measures the performance of all configurations, which is typically intractable. To address this challenge, state-of-the-art strategies rely on machine learning, analyzing only a few configurations (i.e., a sample set) to predict the performance of other configurations. However, accurate performance predictions require a representative sample set of configurations. To this end, different sampling strategies have been proposed, which come with different advantages (e.g., covering the configuration space systematically) and disadvantages (e.g., the need to enumerate all configurations). In our experiments, we found that most sampling strategies do not achieve good coverage of the configuration space with respect to relevant performance values. That is, they miss important configurations with distinct performance behavior. Based on this observation, we devise a new sampling strategy, called distance-based sampling, that uses a distance metric to spread the configurations of the sample set across the configuration space according to a given probability distribution. This way, the sample set covers different kinds of interactions among configuration options. To demonstrate the merits of distance-based sampling, we compare it to state-of-the-art sampling strategies, such as t-wise sampling, on 10 real-world configurable software systems. Our results show that distance-based sampling leads to more accurate performance models for medium to large sample sets.
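The idea of spreading a sample across distances can be illustrated with a small Python sketch. This is one possible reading of the abstract, not the authors' implementation. It assumes that options are binary, that the distance of a configuration is the number of selected options (its distance from the all-off configuration), and that the probability distribution over distances is uniform. The option names, the constraint, and the function `distance_based_sample` are hypothetical. For brevity the sketch enumerates all configurations, which is exactly the step that does not scale to real systems.

```python
import itertools
import random
from collections import defaultdict


def distance_based_sample(options, is_valid, sample_size, seed=0):
    """Illustrative sketch: pick a distance first, then a configuration at that distance."""
    rng = random.Random(seed)

    # Group valid configurations by distance from the all-off origin.
    # Assumption: distance = number of selected binary options.
    buckets = defaultdict(list)
    for bits in itertools.product((0, 1), repeat=len(options)):
        if is_valid(dict(zip(options, bits))):
            buckets[sum(bits)].append(bits)

    sample = []
    while len(sample) < sample_size and buckets:
        # Assumption: uniform distribution over the distances still available.
        d = rng.choice(sorted(buckets))
        bucket = buckets[d]
        # Draw without replacement so no configuration is sampled twice.
        sample.append(bucket.pop(rng.randrange(len(bucket))))
        if not bucket:
            del buckets[d]

    return [dict(zip(options, bits)) for bits in sample]


# Hypothetical toy system: four options, one constraint.
options = ["cache", "compress", "encrypt", "log"]


def is_valid(config):
    # Hypothetical constraint: encryption requires compression.
    return not (config["encrypt"] and not config["compress"])


sample = distance_based_sample(options, is_valid, sample_size=5)
for config in sample:
    print(sum(config.values()), config)
```

Choosing the distance before the configuration is what distinguishes this from plain uniform random sampling: configurations with very few or very many selected options are rare in the space, so uniform sampling over configurations would seldom pick them, while sampling over distances gives each distance the same chance.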