Send email Copy Email Address
2023-10-16

Automated Ambiguity Detection in Layout-Sensitive Grammars

Summary

Layout-sensitive grammars have been adopted in many modern programming languages. In a serious language design phase, the specified syntax—typically a grammar—must be unambiguous. Although checking ambiguity is undecidable for context-free grammars and (trivially also) layout-sensitive grammars, ambiguity detection, on the other hand, is possible and can benefit language designers from exposing potential design flaws. In this paper, we tackle the ambiguity detection problem in layout-sensitive grammars. Inspired by a previous work on checking the bounded ambiguity of context-free grammars via SAT solving, we intensively extend their approach to support layout-sensitive grammars but via SMT solving to express the ordering and quantitative relations over line/column numbers. Our key novelty lies in a reachability condition, which takes the impact of layout constraints on ambiguity into careful account. With this condition in hand, we propose an equivalent ambiguity notion called local ambiguity for the convenience of SMT encoding. We translate local ambiguity into an SMT formula and developed a bounded ambiguity checker that automatically finds a shortest nonempty ambiguous sentence (if exists) for a user-input grammar. The soundness and completeness of our SMT encoding are mechanized in the Coq proof assistant. We conducted an evaluation on both grammar fragments and full grammars extracted from the language manuals of domain-specific languages like YAML as well as general-purpose languages like Python, which reveals the effectiveness of our approach.

Article

Date published

2023-10-16

Date last modified

2024-06-26