Obfuscation is frequently used by both benign and malicious Android apps. Since static analysis of obfuscated apps often produces incomplete or misleading results, the problems of identifying and quantifying the use of specific obfuscation techniques in apps has received significant attention. Even though several existing works have addressed these problems, most studies focus on data obfuscation methods such as identifier renaming and string obfuscation, while more advanced code obfuscation methods, such as reflection and control-flow obfuscation, have received less attention. Moreover, existing approaches to detecting Android code obfuscation have significant limitations, as shown by a detailed survey that we present as part of this paper. This is in part due to a fundamental “bootstrapping” problem: since, on one hand, the landscape of Android code obfuscation is poorly known, researchers have very little guidance when designing new detection methods. On the other hand, the lack of detection methods mean that the obfuscation landscape is bound to remain largely unexplored. In this work, we aim to take the first steps towards addressing this “bootstrapping” problem. To this end, we propose two novel approaches to obfuscation detection and perform a study on over 200,000 malicious apps, in addition to 13,436 apps from Google Play. In particular, we propose a new anomaly-detection-based method for identifying likely control-flow obfuscation, and use it to perform what is, to the best of our knowledge, the first empirical study of control-flow obfuscation in Android apps. In addition to presenting new insights into the use of control-flow obfuscation, we also propose a new approach to characterizing the use of reflection-based obfuscation, which allows us to corroborate earlier findings indicating that this type of obfuscation is much more common in malware than in benign apps.
International Conference on Availability Reliability and Security (ARES)
2023-08-29
2024-10-08