Fuzz testing has passed its 30th birthday and, in that time, has gone from a disparaged and mocked technique to one that is the foundation of many efforts in software engineering and testing. The key idea behind fuzz testing is to use random input together with an extremely simple test oracle that only looks for crashes or hangs in the program. Importantly, in all our studies, we made all our tools, test data, and results public so that others could reproduce the work. In addition, we located the cause of each failure we triggered and identified the common causes of such failures.
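As a concrete illustration (not code from the studies themselves), a minimal version of this loop and oracle might look like the following Python sketch; the target program ("cat"), input sizes, and timeout are illustrative assumptions:

    import random
    import subprocess

    def fuzz_once(target, max_len=10_000, timeout=10):
        """Feed one random byte stream to `target`; report crash, hang, or ok."""
        # Generate a random-length stream of random bytes (illustrative sizes).
        data = bytes(random.randrange(256) for _ in range(random.randrange(1, max_len)))
        try:
            proc = subprocess.run([target], input=data,
                                  capture_output=True, timeout=timeout)
        except subprocess.TimeoutExpired:
            return "hang"
        # On POSIX, a negative return code means the process was killed by a
        # signal (e.g. SIGSEGV); the oracle counts that as a crash.
        if proc.returncode < 0:
            return "crash"
        return "ok"

    if __name__ == "__main__":
        for i in range(100):
            result = fuzz_once("cat")
            if result != "ok":
                print(f"run {i}: {result}")

The real fuzz tools generated richer random streams (including the keyboard and mouse events used for GUI testing), but the oracle is exactly this simple: no expected outputs, just a watch for crashes and hangs.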
In the last several years, there has been enormous progress and many new developments in fuzz testing. Hundreds of papers have been published on the subject and dozens of PhD dissertations have been produced. In this talk, I will review the progress of the last 30 years, describe our simple approach (using what is now called black box generational testing), and show how it is still relevant and effective today.
In 1990, we published the results of a study of the reliability of standard UNIX application and utility programs. This study showed that, by using simple (almost simplistic) random testing techniques, we could crash or hang 25-33% of these programs. In 1995, we repeated and significantly extended this study using the same basic technique: subjecting programs to random input streams. This study also included X Window applications and servers. A distressingly large number of UNIX applications still crashed under our tests, and X Window applications were at least as unreliable as command-line ones. The commercial versions of UNIX fared slightly better than in 1990, but the biggest surprise was that the Linux and GNU applications were significantly more reliable than the commercial versions.
In 2000, we took another stab at random testing, this time targeting applications running on Microsoft Windows. Given random but valid mouse and keyboard input streams, we could crash or hang 45% (NT) to 64% (Win2K) of these applications. In 2006, we continued the study, looking at both command-line and GUI-based applications on the then relatively new Mac OS X operating system. While the command-line tests produced a reasonable 7% failure rate, the GUI-based applications, from a variety of vendors, had a distressing 73% failure rate. Recently, we revisited our basic techniques on commonly used UNIX systems and found that they remain effective and useful.
In this talk, I will discuss our testing techniques and then present the various test results in more detail. These results include, in many cases, identification of the bugs and the coding practices that caused them. In several cases, these bugs introduced issues relating to system security. The talk will conclude with some philosophical musings on the current state of software development. Papers on the five studies (1990, 1995, 2000, 2006, and 2020), along with the software and bug reports, can be found at the UW fuzz home page:
http://www.cs.wisc.edu/~bart/fuzz/
BIO
Barton Miller is the Vilas Distinguished Achievement Professor at UW-Madison. Miller is a co-PI on the Trusted CI NSF Cybersecurity Center of Excellence, where he leads the software assurance effort. His research interests include software security, in-depth vulnerability assessment, binary and malicious code analysis and instrumentation, extreme-scale systems, and parallel and distributed program measurement and debugging. In 1988, Miller founded the field of fuzz random software testing, which is the foundation of many security and software engineering disciplines. In 1992, Miller (working with his then-student, Prof. Jeffrey Hollingsworth) founded the field of dynamic binary code instrumentation and coined the term “dynamic instrumentation”.
Miller is a Fellow of the ACM and recently won the Jean-Claude Laprie Award in dependable computing for his work on fuzz testing. Miller was the chair of the Institute for Defense Analyses Center for Computing Sciences Program Review Committee, a member of the U.S. National Nuclear Security Administration Los Alamos and Lawrence Livermore National Labs Cyber Security Review Committee (POFMR), and a member of the Los Alamos National Laboratory Computing, Communications and Networking Division Review Committee. He has served on the U.S. Secret Service Electronic Crimes Task Force (Chicago Area) and is currently an advisor to the Wisconsin National Guard Cyber Prevention Team.
This will be a hybrid event: Barton will give his presentation in the lecture hall downstairs (0.05 Hoersaal), and we will also stream it via Zoom.