Programmers regularly use strings to encode many types of data, such as Unix file paths, URLs, and email addresses. These kinds of data are conceptually different, but existing mainstream programming languages treat them all as the same string type. This is problematic: the type system allows, for instance, malicious HTML text to be passed to a function expecting an email address. To distinguish conceptually different string types and to avoid potential vulnerabilities, we regard formal languages as types (FLAT), thereby restricting the set of valid strings using context-free grammars and, if needed, semantic constraints. Applying this type-based approach, we offer a unified solution for string API documentation, input validation, malicious input detection, language-based fuzzing, and test oracles, all based on user-annotated formal language types and, if necessary, pre- and post-conditions. We implement this idea and present FLAT-PY, a testing framework for Python. By attaching annotations directly to Python code, FLAT-PY automatically performs runtime type checking via code instrumentation and reports any detected type errors as soon as possible. We conducted case studies on real Python code fragments: FLAT-PY can detect logical bugs from random inputs generated by a language-based fuzzer, relying on a reasonable number of user annotations.
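To make the idea of a formal language as a type concrete, the following is a minimal sketch of runtime checking driven by annotations. It is not FLAT-PY's API: the names `LangType`, `Email`, `checked`, and `domain_of` are hypothetical, and a simplified regular expression stands in for the context-free grammar that the approach would use.

```python
import functools
import inspect
import re


class LangType:
    """Hypothetical stand-in for a formal language type (not FLAT-PY's API).

    A regular expression approximates the grammar; an optional predicate
    plays the role of a semantic constraint.
    """

    def __init__(self, name, pattern, constraint=None):
        self.name = name
        self._regex = re.compile(pattern)
        self._constraint = constraint

    def contains(self, value):
        # A value belongs to the language if the whole string matches
        # the pattern and the semantic constraint (if any) holds.
        if not isinstance(value, str) or self._regex.fullmatch(value) is None:
            return False
        return self._constraint is None or self._constraint(value)


# Deliberately simplified email language, for illustration only.
Email = LangType(
    "Email",
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+",
    constraint=lambda s: len(s) <= 254,
)


def checked(func):
    """Check LangType-annotated arguments and return value on every call."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        bound.apply_defaults()
        for name, value in bound.arguments.items():
            expected = sig.parameters[name].annotation
            if isinstance(expected, LangType) and not expected.contains(value):
                raise TypeError(
                    f"argument {name!r}={value!r} is not in language {expected.name}"
                )
        result = func(*args, **kwargs)
        ret = sig.return_annotation
        if isinstance(ret, LangType) and not ret.contains(result):
            raise TypeError(
                f"return value {result!r} is not in language {ret.name}"
            )
        return result

    return wrapper


@checked
def domain_of(address: Email) -> str:
    # The annotation documents the expected input and is enforced at call time.
    return address.split("@", 1)[1]
```

Under these assumptions, `domain_of("alice@example.org")` returns `"example.org"`, while passing HTML text such as `"<script>alert(1)</script>"` raises a `TypeError` at the call boundary, before the function body runs. The same membership test could in principle serve as a test oracle for fuzzer-generated inputs.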
2026-03-07
2026-04-24