Today’s digital communication relies on complex protocols and specifications for exchanging structured messages and data. Communication naturally involves two endpoints: One generating data and one consuming it. Traditional fuzz testing approaches replace one endpoint, the generator, with a fuzzer and rapidly test many mutated inputs on the target program under test. While this fully automated approach works well for loosely structured formats, this does not hold for highly structured formats, especially those that go through complex transformations such as compression or encryption. In this work, we propose a novel perspective on generating inputs in highly complex formats without relying on heavy-weight program analysis techniques, coarse-grained grammar approximation, or a human domain expert. Instead of mutating the inputs for a target program, we inject faults into the data generation program so that this data is almost of the expected format. Such data bypasses the initial parsing stages in the consumer program and exercises deeper program states, where it triggers more interesting program behavior. To realize this concept, we propose a set of compile-time and run-time analyses to mutate the generator in a targeted manner, so that it remains intact and produces semi-valid outputs that satisfy the constraints of the complex format. We have implemented this approach in a prototype called Fuzztruction and show that it outperforms the state-of-the-art fuzzers AFL++, SymCC, and Weizz. Fuzztruction finds significantly more coverage than existing methods, especially on targets that use cryptographic primitives. During our evaluation, Fuzztruction uncovered 151 unique crashes (after automated deduplication). So far, we manually triaged and reported 27 bugs that have been acknowledged and 4 CVEs were assigned
Usenix Security Symposium (USENIX-Security)
2023-08
2024-11-29