Hidden in a tree: How malware cleverly evades the watchful eye of scanners

Hundreds of thousands of new malware programs are registered worldwide every day. Such viruses, spyware, and Trojans can completely paralyze networks, spy on users, steal critical data, and damage operating systems. Various types of malware scanners are used to detect malicious programs and remove them from computers. Some of these scanners classify scripts as benign or malicious based on their syntactic structure. However, these structures can be cleverly manipulated, rendering the scanners useless. CISPA researcher Aurore Fass, in collaboration with Faculty Dr. Ben Stock and Founding Director and CEO Prof. Dr. Dr. h.c. Michael Backes, showed this in a paper she presented at the ACM Conference on Computer and Communications Security (CCS), one of the most important cybersecurity conferences in the world.

Syntax programming while also heeded be to is. This is obviously not a meaningful sentence, although one could be formed with these words. What is missing is the syntactic structure. Sentences are by no means just words strung together. They are composed of sentence elements, which in turn are structured in a certain way. Like the branches of a tree, the structures branch out into ever finer ramifications. The natural languages we use every day are not the only ones structured in this way. Programming languages such as JavaScript have a very similar structure. And it is precisely this structure that can be a gateway for malicious attacks, as the researchers have discovered.
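
To make the tree idea concrete, here is a minimal sketch. It is not taken from the researchers' paper: it uses Python's built-in `ast` module and a line of Python code as a stand-in for a JavaScript parser and a JavaScript script, and the helper name `syntax_tree_lines` is made up for illustration.

```python
import ast


def syntax_tree_lines(source: str) -> list[str]:
    """Return one indented line per node of the syntax tree of `source`."""
    lines: list[str] = []

    def visit(node: ast.AST, depth: int) -> None:
        # Record only the node type, indented by its depth in the tree.
        lines.append("  " * depth + type(node).__name__)
        for child in ast.iter_child_nodes(node):
            visit(child, depth + 1)

    visit(ast.parse(source), 0)
    return lines


if __name__ == "__main__":
    # Prints the tree of a single assignment, one node type per line.
    print("\n".join(syntax_tree_lines("total = price + tax")))
```

The tree records only the kinds of elements and how they nest. Two statements with different names, such as `total = price + tax` and `a = b + c`, therefore produce exactly the same tree in this sketch.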

"JavaScript was originally developed to make websites more dynamic and user-friendly. Today, it is used in more than 90 percent of websites," Fass explains. However, because it shifts the work to the user's browser, JavaScript is also readily used for malicious activities. For example, attackers use it so that users unintentionally load malware onto their computers, or to redirect users to websites that contain malicious code, says Fass.

To prevent this, various types of malware scanners are used in companies and private households. A distinction is made between static and dynamic malware detectors, explains Fass. In dynamic malware analysis, files are executed and the software closely observes their behavior to classify them. Static malware scanners, on the other hand, do not execute the files. Instead, they look for specific words or syntactic structures in the scripts that can provide clues to malicious functionality. Especially when the scanners are trained using machine learning, such programs can be very effective at detecting even previously unknown malware, according to Fass.
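
As a rough illustration of what "looking for syntactic structures" can mean, the sketch below turns a script into counts of neighboring node types in its syntax tree, without ever executing it. This is a toy under stated assumptions, not one of the detectors the researchers studied: it parses Python with the built-in `ast` module instead of parsing JavaScript, and the names `preorder_types` and `syntactic_features` are hypothetical. A real scanner would feed such features into a trained classifier.

```python
import ast
from collections import Counter
from typing import Iterator


def preorder_types(node: ast.AST) -> Iterator[str]:
    """Yield node type names in depth-first, parent-before-children order."""
    yield type(node).__name__
    for child in ast.iter_child_nodes(node):
        yield from preorder_types(child)


def syntactic_features(source: str, n: int = 2) -> Counter:
    """Count n-grams of node types; the script is parsed, never executed."""
    types = list(preorder_types(ast.parse(source)))
    return Counter(tuple(types[i:i + n]) for i in range(len(types) - n + 1))


if __name__ == "__main__":
    # Two calls to different (hypothetical) functions with different arguments.
    first = syntactic_features("load('a.txt')")
    second = syntactic_features("send('b')")
    print(first == second)
```

Because the features describe only structure, the two snippets above are indistinguishable to this toy even though they call different functions. That blind spot is the kind of property an attack on syntax-based scanners can exploit.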

In theory, anyway. That's because CISPA researchers have designed an attack that manipulates the tree structure of JavaScript code in such a way that many malware scanners can no longer distinguish malicious "branches" from benign ones. The researchers were able to recreate the tree structures of the world's 10,000 most popular websites, complete with embedded malicious code, and fool the scanners. This stealth attack, called "HideNoSeek," bypasses not only individual detectors but the entire class of malware scanners based on syntactic features. Malware scanners trained in the standard way fail to detect the malicious structures in 99.98 percent of cases, Fass said. According to Fass, the detectors can be trained to recognize such structures as problematic using machine learning. However, in around 89 percent of cases, this leads to the scanners incorrectly classifying even harmless scripts as malicious. The scanner software is ultimately rendered useless by such an attack. The CISPA researchers hope that their work will lead to the development of more reliable malware detectors in the future.

Aurore Fass was born in France and attended the Grande École in Nancy. She wrote her master's thesis at the Federal Office for Information Security in 2017. She joined CISPA as a PhD student in October 2017. Fass' research focuses on JavaScript security, static code analysis, vulnerability detection, and malware. The 26-year-old submitted her PhD thesis in October 2020 and now only needs to defend it. At the end of the summer, she is scheduled to go to Stanford University in California as part of CISPA's Stanford Program. "I want to broaden out a bit more thematically and work with the researchers there. I'm really looking forward to it," Fass says. IT security has always fascinated her, she says, which is why she chose this research field early on. "Somehow, nothing else was ever an option for me."

The text was translated by: Oliver Schedler
