2025-08-25

ProwseBox: A Framework for the Analysis of the Web at Scale

Summary

The idea of progressive enhancement has been around for years, and the Web platform is continuously enriched with new features. Nonetheless, it is the advent of service workers (SWs) and web app manifests (WAMs) that marked the era of capable, reliable, and installable progressive web applications (PWAs), following prior Web evolutions from a static medium to a dynamic platform. In this work, we introduce ProwseBox, a comprehensive and efficient cross-browser and cross-platform framework for the dynamic, scalable, and distributed measurement and analysis of PWAs, SWs (in web apps, extensions, and Cloudflare workers), and web extensions. Leveraging the capabilities of JavaScript, which is flexible, single-threaded, and event-driven, together with unique hacks that enable automated interactions with web push notifications or the file system API, ProwseBox produces detailed and structured datasets. Researchers, browser vendors, web developers, and end users can analyze these datasets with modern data analysis techniques to uncover the evolution, state, and (mal)practices in the usage of Web features, browser extensions, and edge workers, or to assess compliance with specifications and detect anomalies, bugs, vulnerabilities, and data leakage. The framework has been extensively (re)engineered, documented, and thoroughly evaluated over the years, building on and adapting to the constant evolution of Web APIs. The collected dataset is well-structured, and various analysis pipelines are provided for serializing and importing the data into state-of-the-art analysis frameworks such as Apache Spark for processing. The framework has been, and continues to be, used to collect large-scale datasets that support measurement, security, and privacy studies, the most recent one covering 56,505,674 sites in the wild.
In a series of case studies, we demonstrate a subset of the framework’s capabilities, ranging from the capture of cross-site scripting (XSS) vulnerabilities in web apps and browser extensions to the discovery of user-sensitive information leakage. Ultimately, we hope to promote ProwseBox as the state-of-the-art tool for studying the progressive Web, much like ZDNS or OpenWPM, which foster replicability and major advances in other communities.

Conference Paper

ACM ASIA Conference on Computer and Communications Security (AsiaCCS)

Date published

2025-08-25

Date last modified

2025-09-11