Widespread application of computer vision systems to real-world tasks is currently hindered by their unexpected behavior on unseen examples. This stems from the limitations of empirical testing on finite test sets and from the lack of systematic methods for identifying the breaking points of a trained model. In this work we propose semantic adversarial editing, a method for synthesizing plausible but difficult data points on which a target model breaks down. We achieve this with a differentiable object synthesizer that can modify the appearance of an object instance while maintaining its original pose. Constrained adversarial optimization of object appearance through this synthesizer produces rare or difficult versions of an object instance that fool the target object detector. Experiments show that our approach is effective at synthesizing difficult test data: changing the appearance of a single object drops the performance of the YOLOv3 detector by more than 20 mAP points and exposes failure modes of the model. We also demonstrate that the generated semantic adversarial data can be used to robustify the detector through data augmentation, consistently improving its performance on both standard and out-of-dataset-distribution test sets across three different datasets.
European Conference on Computer Vision (ECCV)
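
The constrained adversarial optimization described in the abstract can be illustrated with a toy sketch. It is not the paper's implementation: the linear-plus-tanh `synthesize` function, the logistic `detector_score`, and all parameter names are hypothetical stand-ins for the differentiable object synthesizer and the object detector. The abstract does not state the form of the constraint, so the sketch assumes a simple box constraint on the appearance code, enforced by projection after each signed-gradient step.

```python
import numpy as np

def synthesize(z, W):
    """Toy differentiable synthesizer (assumption): appearance code -> features."""
    return np.tanh(W @ z)

def detector_score(x, w, b):
    """Toy detector (assumption): confidence in (0, 1) that the object is found."""
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

def score_gradient(z, W, w, b):
    """Analytic gradient of the detector score with respect to the code z."""
    x = synthesize(z, W)
    s = detector_score(x, w, b)
    # Chain rule through the sigmoid and the tanh synthesizer.
    return s * (1.0 - s) * (W.T @ (w * (1.0 - x ** 2)))

def adversarial_edit(z0, W, w, b, eps=0.5, step=0.05, n_steps=100):
    """Lower the detector score while keeping |z - z0| <= eps elementwise."""
    z = z0.copy()
    best_z = z0.copy()
    best_score = detector_score(synthesize(z0, W), w, b)
    for _ in range(n_steps):
        g = score_gradient(z, W, w, b)
        z = z - step * np.sign(g)           # signed-gradient descent step
        z = np.clip(z, z0 - eps, z0 + eps)  # project back into the allowed box
        score = detector_score(synthesize(z, W), w, b)
        if score < best_score:              # keep the hardest example seen so far
            best_z, best_score = z.copy(), score
    return best_z

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 8))   # hypothetical synthesizer weights
w = rng.normal(size=16)        # hypothetical detector weights
b = 0.5
z0 = rng.normal(size=8)        # original appearance code of the object

z_adv = adversarial_edit(z0, W, w, b)
score_before = detector_score(synthesize(z0, W), w, b)
score_after = detector_score(synthesize(z_adv, W), w, b)
print(f"detector score before: {score_before:.3f}, after: {score_after:.3f}")
```

In this sketch only the appearance code is optimized, which loosely mirrors editing appearance while leaving pose untouched. The box radius `eps` plays the role of the plausibility constraint: a smaller value keeps the edited object closer to the original.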