In recent years, scripting languages such as JavaScript and Python have gained a lot of traction due to their flexibility, which allows developers to write concise code in a short amount of time. However, this flexibility is achieved via weak, dynamic typing, which fails to catch subtle bugs that would be prevented by a compiler, in static typing. Gradual-type systems like TypeScript emerged as a solution that combines the best of both worlds, allowing developers to annotate parts of their code with optional type hints. Nonetheless, most practical deployments of such systems are unsound, limiting themselves to static checks and not performing residual runtime checks that help enforce type hints uniformly. This is a missed automation opportunity that offloads the burden on developers, who still need to perform explicit type checks at transition points between untyped and typed code so that values at runtime obey the type hints. Failure to do so can result in subtle type inconsistency bugs, and when user input is involved, it can render input validation mechanisms ineffective, resulting in {type confusion} problems. In this work, we study the relation between gradual typing and type confusion. Our main hypothesis is that the type hints in the code can mislead developers into thinking they are enforced consistently by the compiler, resulting in a lack of explicit runtime checks that ensure type safety. We perform a large empirical study with 30,000 open-source repositories to statically analyze if and how they use gradual typing and to what extent this influences the presence of explicit type checks. We find that many projects feature gradually typed code, but usually only in small portions of their code base. This implies the presence of many points in the code base where developers must add explicit type checks, i.e., at the transition points between unannotated and annotated code. Our results further indicate that gradual typing may have a deteriorating effect when parameters are annotated with primitive types. Finally, we manually analyze a small portion of the studied repositories and show that attackers can remotely cause type confusion and violate the type hints added by developers. We hope that our results help raise awareness about the limits of current gradual-type systems and their unwanted effect on input validation.
Automated Software Engineering Conference (ASE)
2024-11-01
2024-12-11