Neural network image classifiers are being adopted in safety-critical applications, and they must be tested thoroughly to inspire confidence. Two major challenges remain. First, the thoroughness of testing must be measurable by an adequacy criterion that correlates strongly with the semantic features of the images. Second, a large number of diverse test cases must be prepared, either manually or automatically. The former can be aided by neural-network-specific coverage criteria such as surprise adequacy [3] or neuron coverage [4], but their correlation with semantic features has not been evaluated. The latter is attempted through metamorphic testing [5], but that approach is limited to domain-dependent metamorphic relations that require explicit modeling.

This presentation discusses a framework that addresses the two challenges together. Our approach is based on the premise that patterns in a large data space can be effectively captured in a smaller manifold space, from which similar yet novel test cases (both the input and the label) can be synthesized. The same manifold space can also serve as a basis for judging the adequacy of a given test suite, since the manifold encodes the information needed to distinguish among different data points. For modeling
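The abstract does not name a concrete manifold model, so the following is only a minimal sketch of the underlying idea: fit a low-dimensional manifold to the data, then perturb the manifold coordinates of seed inputs to synthesize similar yet novel test inputs. A linear manifold (PCA via SVD) stands in here for whatever learned manifold the framework actually uses; all function names are illustrative.

```python
import numpy as np

def fit_manifold(X, k):
    """Fit a k-dimensional linear manifold (PCA) to data X (n_samples x n_features).

    A linear stand-in for the learned manifold described in the abstract."""
    mu = X.mean(axis=0)
    # Principal directions come from the SVD of the centered data.
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:k]               # mean and the top-k basis vectors

def encode(X, mu, basis):
    """Project inputs onto manifold coordinates."""
    return (X - mu) @ basis.T

def decode(Z, mu, basis):
    """Map manifold coordinates back to the input space."""
    return Z @ basis + mu

def synthesize(X, mu, basis, noise=0.1, rng=None):
    """Perturb encoded seeds to produce similar-yet-novel test inputs."""
    if rng is None:
        rng = np.random.default_rng(0)
    Z = encode(X, mu, basis)
    Z_new = Z + noise * rng.standard_normal(Z.shape)
    return decode(Z_new, mu, basis)
```

Because the synthesized points stay on the manifold, the same coordinates could, in principle, also be used to measure how well a test suite covers the manifold's regions; that adequacy side is not sketched here.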
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.