
17 CISPA papers at S&P 2025

The IEEE Symposium on Security and Privacy is the premier forum for presenting developments in computer security and electronic privacy, and for bringing together researchers and practitioners in the field. The 46th IEEE Symposium on Security and Privacy will be held on May 12-15, 2025 at the Hyatt Regency San Francisco.

The researchers conducted 21 interviews with experienced developers of cryptographic libraries to understand how cryptographic API design decisions are made and what challenges are involved. They found that design choices are influenced by cryptographic standards, existing libraries, legacy code, and developers' personal intuition. Developers face major challenges in balancing security, usability, and flexibility. Often, there is no systematic approach to defining or evaluating usability; instead, developers rely on personal experience, user feedback, and informal testing. Although cryptographic standards provide guidance, they often leave critical design decisions open, requiring developers to make subjective choices. The researchers also observed that pressures to maintain backward compatibility can lock developers into outdated or less secure design patterns. There is a lack of empirical, practical research that directly supports developers in designing usable cryptographic APIs.
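A recurring design question in these interviews is how much of the underlying machinery an API should expose. As a minimal sketch of the contrast (using Python's `cryptography` package as an example library choice, not one discussed in the paper), a high-level AEAD interface bundles mode selection and authentication, leaving developers only a key and a nonce to manage:

```python
# High-level AEAD API: mode, padding, and authentication are bundled, so callers
# cannot pick an insecure mode or forget to verify the authentication tag.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)
aesgcm = AESGCM(key)

nonce = os.urandom(12)  # 96-bit nonce; must never repeat under the same key
ciphertext = aesgcm.encrypt(nonce, b"attack at dawn", b"header")
plaintext = aesgcm.decrypt(nonce, ciphertext, b"header")  # raises on tampering
assert plaintext == b"attack at dawn"
```

Whether to expose the nonce at all, or to manage it internally at the cost of flexibility, is exactly the kind of trade-off the interviewees reported deciding by intuition rather than evidence.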

This study contributes to society by identifying concrete ways to improve the development of cryptographic libraries—an essential building block for secure software. By highlighting the need for better usability guidance and suggesting the integration of usability considerations into standardization efforts, the research points toward a future where cryptographic tools are both safer and easier for developers to use, ultimately leading to more secure digital systems for everyone.

This study investigates how digital security and privacy advice can be more effectively communicated to everyday users. Recognizing that people often feel overwhelmed by security advice, researchers designed a mobile app called the "Security App" that delivers one short, actionable task per day over a 30-day period. These tasks are based on expert-reviewed advice and aim to build secure habits through repetition and simplicity.

A controlled study with 74 participants showed that the app format was well received. Most users found the tasks understandable, manageable, and relevant. They also reported increased confidence and security awareness. Participants who used the app were significantly more likely to adopt secure behaviors—such as backing up data, updating software, or using two-factor authentication—compared to a control group. In some cases, these behaviors persisted even 30 days after the study ended.

However, not all advice was adopted equally. Tasks involving password managers, for example, were more often rejected or rated less helpful, suggesting that trust and usability issues still affect adoption of some security tools. Additionally, some participants found parts of the app too rigid or wished for more personalization.

Overall, the findings suggest that breaking down digital security advice into small, concrete steps and delivering them in a familiar habit-building format can help users develop lasting security habits. For society, this approach offers a practical way to strengthen everyday digital resilience without requiring technical expertise, thereby addressing one of the common barriers to better personal cybersecurity.

Humanitarian organizations help people affected by extreme circumstances such as conflict, famine, or natural disasters. They do so by, for example, distributing food and other essential items. Speed and fairness are essential in this setting. Humanitarian organizations are therefore increasingly looking to use digital technology to improve efficiency.

At the same time, introducing technology creates significant privacy risks for the often vulnerable recipients of humanitarian aid. This paper is the result of a collaboration with the International Committee of the Red Cross (ICRC) to design a digital aid-distribution system that brings the advantages of technology while providing strong privacy protections. The system reveals the least amount of information necessary to distribute assistance.

Crucial for this work is understanding the humanitarian context in which such a system has to be deployed. What works in our everyday environment very often does not work in emergency situations: people might not have (smart)phones, and connectivity might be spotty. We carefully designed the system to make it as easy to deploy as possible.

This research examines the iOS local network permission introduced in iOS 14, which aims to protect devices within a user’s local network from unauthorized app access. The study evaluates both the technical security of this permission and how effectively users understand and respond to it.

From a technical perspective, the analysis reveals several shortcomings. The permission can be bypassed through certain app components like webviews, and it does not cover all relevant network configurations—especially in more complex local networks or when devices are connected via VPN. This means apps may access networked devices without triggering user consent.
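To make concrete what "local network access" looks like at the code level, the sketch below (ours, not from the paper) probes a private subnet over plain TCP sockets; on iOS, this is the class of traffic the permission is meant to gate. The 192.168.1.0/24 range and port 80 are illustrative assumptions:

```python
import socket

def probe(host: str, port: int, timeout: float = 0.3) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Enumerate devices on a typical home subnet that expose an HTTP port.
devices = [f"192.168.1.{i}" for i in range(1, 255) if probe(f"192.168.1.{i}", 80)]
print(devices)
```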

To assess how frequently apps use local network access, over 10,000 apps on both iOS and Android platforms were analyzed. The results show that 1–1.4% of apps on each platform perform local network communications. On iOS, a greater number of apps delayed this access until after user interaction, possibly due to the visible permission prompt, whereas Android currently lacks a similar permission mechanism.

The study also examined the permission prompts shown to users. These messages often include vague or misleading language, such as “your network,” and some incorrectly suggest that permission is needed for basic internet access. A user survey involving 150 iOS users showed that while many recognize potential privacy threats, misconceptions are widespread. For instance, many participants incorrectly believed the permission is necessary for Bluetooth or Internet usage.

This research underscores the need to improve both the technical enforcement of privacy controls and the clarity of user communication. A permission mechanism can only be effective if it reliably restricts access and enables users to make informed decisions. For society, this study highlights the importance of aligning privacy protections with users’ understanding—helping to ensure that control over digital environments is meaningful and not merely procedural.

This research introduces TokenWeaver, a novel protocol designed to enhance the security and privacy of trusted hardware environments, especially in the face of compromises. Trusted Execution Environments (TEEs), used in devices like smartphones or secure cloud services, help protect sensitive information. However, they can become targets for attackers, and once a TEE is compromised, it is very difficult to restore trust in its security.

The key idea behind TokenWeaver is to allow a compromised device to “heal” itself—regain its secure status—without sacrificing the user's privacy. Traditionally, these two goals have been hard to achieve together: detecting a breach often requires identifying the device, which compromises privacy.

TokenWeaver solves this dilemma by using two interconnected systems: a linkable chain, which allows the provider (e.g., Intel or Google) to detect and respond to attacks, and an unlinkable chain, which allows the user to interact with services anonymously. The innovation lies in how these two chains work together to detect compromises and restore security, all while ensuring that no one—including the provider—can track the user's activity across services.
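The two-chain idea can be illustrated with a toy hash-chain sketch (an intuition aid only, not TokenWeaver's actual construction): both chains evolve step by step, but the unlinkable chain absorbs randomness the provider never sees, so its tokens cannot be correlated from the outside.

```python
import hashlib
import os

def step(prev: bytes, fresh: bytes) -> bytes:
    # One epoch: derive the next token from the previous one plus fresh input.
    return hashlib.sha256(prev + fresh).digest()

seed = os.urandom(32)

# Linkable chain: the fresh input is public, so the provider can recompute every
# token, detect a fork caused by a compromise, and revoke the device.
linkable = [seed]
for epoch in range(3):
    linkable.append(step(linkable[-1], epoch.to_bytes(4, "big")))

# Unlinkable chain: every step mixes in secret randomness, so two tokens from the
# same device look unrelated to anyone observing them.
unlinkable = [seed]
for _ in range(3):
    unlinkable.append(step(unlinkable[-1], os.urandom(32)))
```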

The protocol was not only designed but also formally verified using advanced tools to prove that it works as intended, even under attack. It has been implemented in a working prototype, showing that it is practical in terms of speed and storage.

For society, this work contributes to making digital systems more resilient and privacy-friendly. It provides a blueprint for how devices can regain trust after being hacked, without exposing users to surveillance. This is especially valuable as more critical services—from banking to healthcare—rely on secure digital interactions.

This study explores how researchers in the field of usable privacy and security (UPS) perceive and practice transparency in their work. Transparency, in this context, refers to sharing all relevant research details—such as methods, data, and materials—so that others can understand, evaluate, and attempt to replicate the research.

The authors conducted in-depth interviews with 24 UPS researchers from different backgrounds and levels of experience. The results reveal that while researchers generally value transparency and consider it a hallmark of good scientific practice, there are significant barriers to its consistent implementation. These barriers include a lack of clear, formal guidelines, time and resource constraints, and concerns about peer review vulnerability—some researchers fear that being overly transparent may expose flaws and lead to harsher critique.

The study found that many transparency efforts currently rely on individual motivation and unwritten community norms, rather than standardized expectations. Researchers do employ various transparency practices—like sharing study instruments or code—but they often do so without external incentives or formal support structures. Ethical concerns, especially around publishing data involving human participants, further complicate matters.

Participants called for better support, clearer guidance, and incentives to encourage transparent reporting. Suggestions included formal transparency guidelines, concrete incentives for transparency, and review processes that recognize and fairly evaluate transparent research. There was also support for adapting artifact evaluation processes to better fit the nature of UPS research.

In a broader societal context, this research highlights how transparency practices can strengthen the credibility and reproducibility of scientific knowledge in privacy and security. Encouraging more transparent research helps ensure that findings are trustworthy and accessible, fostering progress in both academic and applied domains. However, achieving this requires thoughtful adjustments to community norms, publication processes, and institutional support.

In this research, the authors tackle a practical problem faced by many blockchain-based services that operate using a "threshold" model: how to fairly and securely pay only the subset of servers that contribute to completing a requested task. While blockchains like Ethereum allow this through smart contracts, Bitcoin and similar systems pose a challenge due to their simpler scripting capabilities.

To address this, the researchers introduce **VITĀRIT**, a novel protocol that enables secure, fair payments for threshold services directly on Bitcoin. These services might include, for instance, generating random numbers using verifiable random functions (VRFs), where at least *t+1 out of n* servers must respond to fulfill the request. The key goal is to ensure that only the contributing servers get paid—and only once—without any central authority or smart contract.

The core innovation is a lightweight transaction mechanism using only standard Bitcoin scripts. The protocol introduces new cryptographic tools, such as verifiable non-committing encryption and adaptor signatures, to ensure that clients and servers can securely exchange partial results and payments. VITĀRIT prevents dishonest behavior, such as a server trying to claim payment multiple times or without doing the work.
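The adaptor-signature mechanic that such protocols rely on can be sketched with a toy Schnorr signature over a deliberately tiny, insecure group (illustration only; Bitcoin uses Schnorr signatures over secp256k1). The point is that completing the pre-signature to claim a payment necessarily reveals a secret y, which binds the payment to the delivered work:

```python
import hashlib
import secrets

# Insecure toy group: g generates the subgroup of prime order q in Z_p*.
p, q, g = 2039, 1019, 4

def H(*parts) -> int:
    return int.from_bytes(hashlib.sha256(repr(parts).encode()).digest(), "big") % q

x = secrets.randbelow(q)      # signer's secret key
X = pow(g, x, p)              # signer's public key
y = secrets.randbelow(q)      # secret that claiming the payment will reveal
Y = pow(g, y, p)              # public adaptor point

# Pre-sign: ordinary Schnorr, except the nonce point is shifted by Y.
k = secrets.randbelow(q)
R = pow(g, k, p)
e = H(R * Y % p, "msg")
s_pre = (k + e * x) % q       # pre-signature: not yet a valid signature

# Adapt: adding y turns it into a valid Schnorr signature on "msg".
s = (s_pre + y) % q
assert pow(g, s, p) == (R * Y % p) * pow(X, e, p) % p

# Whoever sees both the pre-signature and the completed signature learns y.
assert (s - s_pre) % q == y
```

In the payment setting, the completed signature lands on-chain when coins are claimed, so the counterparty automatically learns y; this is the general adaptor-signature pattern, sketched here independently of VITĀRIT's specific transaction flow.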

A prototype of VITĀRIT shows that it is efficient and deployable, with performance results demonstrating low processing times for both clients and servers. Importantly, compared to traditional smart contract methods, VITĀRIT significantly reduces computational and monetary costs (e.g., Ethereum gas usage).

From a societal perspective, this work makes decentralized services more accessible and secure on a wider range of blockchain platforms, particularly those that, like Bitcoin, lack advanced scripting. It promotes privacy, efficiency, and trust without requiring central authorities—principles that underpin the broader vision of decentralized digital infrastructures.

Modern secure communication systems, such as iMessage, WhatsApp, and Signal, include intricate mechanisms that aim to achieve very strong security properties. These mechanisms typically involve continuously merging fresh secrets into the keying material used to encrypt messages during communications. In the literature, these mechanisms have been proven to achieve forms of Post-Compromise Security (PCS): the ability to provide communication security even if the full state of a party was compromised some time in the past. However, recent work has shown that these proofs do not transfer to the end-user level, possibly because of usability concerns. This has raised the question of whether end-users can actually obtain PCS, and under which conditions. Here we show and formally prove that communication systems that need to be resilient against certain types of state loss (which can occur in practice) fundamentally cannot achieve full PCS for end-users. Whereas previous work showed that the Signal messenger did not achieve this with its current session-management layer, we isolate the exact conditions that cause this failure and show why it cannot simply be solved in communication systems by implementing a different session-management layer or an entirely different protocol. Moreover, we clarify the trade-off that the maximum number of sessions between two users (40 in Signal) strikes between failure resilience and security. Our results have direct consequences for the design of future secure communication systems, and could motivate either the simplification of redundant mechanisms or the improvement of session-management designs to provide better security trade-offs with respect to state loss/failure tolerance.

This paper reports on a study exploring how two groups of individuals, legally blind (n=36) and sighted (n=36), react to aural telephone scam warnings in naturalistic settings. Because spoofing a CallerID is trivial, communicating the context of an incoming call instead offers a better way to warn a receiver about a potential scam. Usually, such warnings are visual in nature and fail to cater to users with visual disabilities. To address this exclusion, we developed an aural variant of telephone scam warnings and tested them in three conditions: baseline (no warning), short warning, and a contextual warning that preceded the scam's content. We tested the two most common scam scenarios, fraud (interest rate reduction) and identity theft (social security number), by cold-calling participants and recording their actions, then debriefing them and obtaining consent afterward. Only two participants "pressed one" as the scam demanded, both from the legally blind group that heard the contextual warning for the social security scenario. Upon close inspection, we learned that one of them did so because of accessibility issues with their screen reader, and the other did so intentionally because the warning convinced them to waste the scammer's time and keep the scammer away from vulnerable people. Both the legally blind and sighted participants found the contextual warnings to be powerful usable security cues that, together with STIR/SHAKEN indicators like "Scam Likely," would provide robust protection against any type of scam. We also discussed the potential privacy implications of the contextual warnings and collected recommendations for a usably accessible implementation.

In the context of web applications, the most prevalent vulnerability, according to the OWASP Top Ten, is broken access control. As access control (AC) is implemented on the server side, not having access to the code of live systems limits the ability of researchers to study improper AC issues in the wild. While several works have identified vulnerabilities in open-source applications deployed in researcher-controlled environments, the problem has not been studied in the wild because of the ethical and legal obligation not to leak unwitting users' data. We address this gap in research and present the Variable Swapping Framework (VSF), the first ethically sound and scalable black-box framework to test for improper AC patterns in the wild. VSF's design is the result of our in-depth ethical stakeholder analysis and of minimizing risk while maximizing benefits in vulnerability detection. At its core, it relies on two accounts per site and swaps identifiers between them to access one account's resources with the other. On 100 successfully tested web apps, we find a total of 584 potential AC-sensitive HTTP endpoints, out of which 19 (across 7 sites) are exploitable flaws, which we disclosed responsibly.
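The core swapping step looks roughly like the sketch below (endpoint, parameter, and cookie names are hypothetical; the actual framework adds crawling, candidate selection, and careful response comparison). Both accounts are researcher-owned, so no third-party data is at risk:

```python
import requests

# Two researcher-owned test accounts on the same site (tokens are placeholders).
session_a = requests.Session()
session_a.cookies.set("session", "TOKEN_A")
session_b = requests.Session()
session_b.cookies.set("session", "TOKEN_B")

url = "https://example.com/api/invoices/{id}"  # hypothetical AC-sensitive endpoint
invoice_id = "12345"                           # identifier captured as account A

baseline = session_a.get(url.format(id=invoice_id))  # A's own, authorized view
swapped = session_b.get(url.format(id=invoice_id))   # B requests A's resource

# If B receives A's data rather than an error, access control is likely broken.
if swapped.status_code == 200 and swapped.text == baseline.text:
    print("potential broken access control:", url)
```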

Text-to-image (T2I) generative models have revolutionized content creation by transforming textual descriptions into high-quality images. However, these models are vulnerable to jailbreaking attacks, where carefully crafted prompts bypass safety mechanisms to produce unsafe content. While researchers have developed various jailbreak attacks to expose this risk, these methods face significant limitations, including impractical access requirements, easily detectable unnatural prompts, restricted search spaces, and high query demands on the target system. In this paper, we propose JailFuzzer, a novel fuzzing framework driven by large language model (LLM) agents, designed to efficiently generate natural and semantically meaningful jailbreak prompts in a black-box setting. Specifically, JailFuzzer employs fuzz-testing principles with three components: a seed pool for initial and jailbreak prompts, a guided mutation engine for generating meaningful variations, and an oracle function to evaluate jailbreak success. Furthermore, we build the guided mutation engine and oracle function from LLM-based agents, which further ensures efficiency and adaptability in black-box settings. Extensive experiments demonstrate that JailFuzzer has significant advantages in jailbreaking T2I models. It generates natural and semantically coherent prompts, reducing the likelihood of detection by traditional defenses. Additionally, it achieves a high success rate in jailbreak attacks with minimal query overhead, outperforming existing methods across all key metrics. This study underscores the need for stronger safety mechanisms in generative models and provides a foundation for future research on defending against sophisticated jailbreaking attacks.
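Structurally, this is classic fuzzing with the heavy lifting delegated to LLM agents. A minimal skeleton of the loop (the callables `mutate`, `generate`, and `oracle` are stand-ins for the paper's three components):

```python
import random

def jailfuzz(seed_pool, mutate, generate, oracle, budget=200):
    """Seed pool -> guided mutation -> black-box query -> oracle, repeated.

    mutate:   LLM agent rewriting a prompt into a natural-sounding variant
    generate: a single query to the target text-to-image model
    oracle:   LLM agent judging whether the output constitutes a jailbreak
    """
    pool = list(seed_pool)
    successes = []
    for _ in range(budget):
        candidate = mutate(random.choice(pool))
        output = generate(candidate)      # the only interaction with the target
        if oracle(candidate, output):
            successes.append(candidate)
        else:
            pool.append(candidate)        # simplification: recycle every variant
    return successes
```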

Large language model (LLM)-powered agents, particularly GPTs by OpenAI, have revolutionized how AI is customized, deployed, and used. However, misuse of GPTs has emerged as a critical, yet largely underexplored, issue within OpenAI's GPT Store. In this paper, we present the first large-scale measurement study on misused GPTs. We introduce GPTRACKER, a framework designed to continuously collect GPTs from the official GPT Store and automate the interaction with them. As of the submission of this paper, GPTRACKER has collected 755,297 GPTs and 28,464 GPT conversation flows over eight months. Using an LLM-driven scoring system combined with human review, we identify 2,051 misused GPTs across ten forbidden scenarios. Through both static and dynamic analyses, we explore the landscape of these misused GPTs, including their trends, builders, operation mechanisms, and effectiveness. We find that builders of misused GPTs employ various tactics to bypass OpenAI's review system, such as integrating external APIs, hiding intentions in descriptions, and URL redirection. Notably, GPTs activating external APIs are more likely to provide answers to inappropriate queries than other misused GPTs, showing an average 22.81% increase in answer rate in the Illegal Activity scenario. Leveraging VirusTotal, we identify 50 malicious domains shown on 446 GPTs, of which 33 are labeled as phishing, 28 as malware, and 2 as spam, with some domains receiving multiple labels. We responsibly disclosed our findings to OpenAI on September 11, 2024, and November 12, 2024; 1,316 out of 1,804 GPTs reported in the first disclosure were removed by September 25. Our study sheds light on the alarming misuse of GPTs in the emerging GPT marketplace and offers actionable recommendations for stakeholders to mitigate future misuse.

Large Language Models (LLMs) have increased demand for high-quality prompts, which are now considered valuable commodities in prompt marketplaces. However, this demand has also led to the emergence of prompt stealing attacks, where the adversary attempts to infer prompts from generated outputs, threatening the intellectual property and business models of these marketplaces. Previous research primarily examines prompt stealing on academic datasets. The key question remains unanswered: do these attacks genuinely threaten in-the-wild prompts curated by real-world users? In this paper, we provide the first systematic study on the efficacy of prompt stealing attacks against in-the-wild prompts. Our analysis shows that in-the-wild prompts differ significantly from academic ones in length, semantics, and topics. Our evaluation subsequently reveals that current prompt stealing attacks perform poorly in this context. To improve attack efficacy, we employ a text-gradient-based method to iteratively refine prompts to better reproduce outputs. This leads to enhanced attack performance, as evidenced by improvements in METEOR score from 0.207 to 0.253 for prompt recovery and from 0.323 to 0.440 for output recovery. Despite these improvements, we showcase that the fundamental challenges persist, highlighting the necessity for further research to improve and evaluate the effectiveness of prompt stealing attacks in practical scenarios.

Speculative loading of memory, called hardware prefetching, is common in modern CPUs and may cause microarchitectural side-channel vulnerabilities. As prior work has shown, prefetching can be exploited to bypass process isolation and leak secrets. However, to date, no effective and efficient countermeasure has been presented that secures software on affected systems. Often, disabling prefetching permanently is considered the only reasonable defense, despite the significant performance penalties this entails. In this work, we propose PreFence, a fine-grained and scheduling-aware defense against prefetching-based attacks for any platform where the prefetcher can be disabled. PreFence extends the process scheduler to be aware of the security requirements of individual processes and to manage the prefetcher's state to protect against malicious parallel processes, even on SMT-enabled platforms. This allows us to efficiently disable the prefetcher only during security-critical operations, with a single system call. Library and application developers can protect their code with minimal changes, and users can protect entire legacy applications using a wrapper program. We implement our countermeasure for an x86_64 and an ARM processor. We evaluate PreFence on two attacks from prior work and find that it reliably stops prefetch leakage with low performance overhead (less than 3%) on the vulnerable functions. In addition, we observe that PreFence causes only negligible performance impact when no security-relevant code is executed. Finally, we evaluate the performance of a real-world web-server application that uses PreFence to protect security-critical code for HTTPS handling. Compared to disabling the prefetcher permanently, we find that our countermeasure allows the application to significantly benefit from the prefetcher (running up to 15.8% (Intel) and 7.2% (ARM) faster on average), while achieving the same security.

Microarchitectural attacks are a growing threat to modern computing systems. CPU caches are an essential but complex element in many microarchitectural attacks, making it crucial to understand their inner workings. Despite progress in reverse-engineering techniques, non-linear cache-slice functions remain challenging to analyze, especially in recent Intel hybrid microarchitectures. In this paper, we introduce a novel approach to reverse-engineering complex, non-linear cache-slice functions, particularly on modern Intel CPUs with hybrid microarchitectures. Our method significantly advances prior work by leveraging the specific structure of microarchitectural hash functions, reducing the time required for reverse-engineering from days to minutes. In contrast to prior work, our technique successfully handles systems with 512GB of memory and diverse slice configurations. We present 17 newly identified functions used for cache-slice addressing and extend existing functions to support systems with more DRAM across multiple CPU generations. Additionally, we introduce an unprivileged virtual-to-physical address oracle that is a direct consequence of the complexity of the non-linear slice functions. Our method is particularly effective on modern Intel hybrid CPUs, including Alder Lake and Meteor Lake, where previously used methods for measuring slices or leaking physical addresses are unavailable. In 3 case studies, we validate our approach, demonstrating its effectiveness in executing targeted Spectre attacks on non-attacker-mapped memory, enabling DRAMA attacks, and creating cache eviction sets. Our findings emphasize the increased attack surface introduced by complex cache-slice functions in modern CPUs.
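For context on what is being reverse-engineered: the linear slice functions recovered by earlier work map a physical address to a slice by XOR-ing (taking the parity of) fixed subsets of address bits; the non-linear functions targeted here add further structure on top of this. A sketch of the linear baseline, with made-up masks:

```python
def parity(x: int) -> int:
    """XOR of all bits of x (1 if the popcount is odd)."""
    return bin(x).count("1") & 1

def cache_slice(paddr: int, masks: list[int]) -> int:
    # Each mask selects the physical-address bits feeding one hash output bit.
    return sum(parity(paddr & mask) << bit for bit, mask in enumerate(masks))

# Hypothetical 2-bit function for a 4-slice CPU; real masks are model-specific.
masks = [0x0F0F0F0C0, 0x0AAAAAA80]
print(cache_slice(0x1ABCDE040, masks))
```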

Distributed Key Generation (DKG) protocols are fundamental components of threshold cryptography, enabling key generation in a trustless manner for a range of cryptographic operations such as threshold encryption and signing. Of particular widespread use are DKG protocols for discrete-logarithm based cryptosystems. In this Systematization of Knowledge (SoK), we present a comprehensive analysis of existing DKG protocols in the discrete-logarithm setting, with the goal of identifying cryptographic techniques and design principles that facilitate the development of secure and resilient protocols. To offer a structured overview of the literature, we adopt a modular approach and classify DKG protocols based on their underlying network assumption and cryptographic tools. These two factors determine how DKG protocols manage secret sharing and reach consensus as their essential building blocks. We also highlight various insights and suggest future research directions that could drive further advancements in this area.
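As a concrete taste of the secret-sharing building block, the following toy sketch shows one dealer's round of Feldman-style verifiable secret sharing, which many discrete-log DKGs build on (insecure toy parameters; in a full DKG, every participant deals such a sharing and the joint key combines their secrets):

```python
import secrets

# Insecure toy group: g generates the subgroup of prime order q in Z_p*.
p, q, g = 2039, 1019, 4
t, n = 2, 5                  # any t+1 of the n shares reconstruct the secret

coeffs = [secrets.randbelow(q) for _ in range(t + 1)]    # coeffs[0] is the secret

def f(x: int) -> int:
    """Evaluate the sharing polynomial at x, modulo the group order q."""
    return sum(c * pow(x, i, q) for i, c in enumerate(coeffs)) % q

shares = {j: f(j) for j in range(1, n + 1)}              # one share per participant
commitments = [pow(g, c, p) for c in coeffs]             # broadcast g^(a_i)

# Participant j verifies its share against the public commitments:
j = 3
lhs = pow(g, shares[j], p)
rhs = 1
for i, C in enumerate(commitments):
    rhs = rhs * pow(C, pow(j, i, q), p) % p
assert lhs == rhs    # g^(f(j)) == prod_i (g^(a_i))^(j^i)
```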

Over the past decade, the proliferation of Low Earth Orbit satellites, driven by lower launch costs, has revolutionized space applications, from communication to earth observation and weather forecasting. This trend also introduced a shift in hardware: specialized radiation-resistant hardware was displaced by cheaper commercial off-the-shelf components. As a critical part of modern infrastructure, satellites attract cyber attacks and are subject to terrestrial and space-specific threats, necessitating effective security measures. However, cryptographic protections and exploit mitigations remain limited in production satellite firmware. Academic research on satellite security has focused only on cryptographic protections, which raises the question of whether exploit mitigation strategies are suitable for satellites or are impacted by space-specific factors, such as cosmic radiation. In this paper, we present the first systematic analysis of 381 small satellite designs, identifying the prevalence of commercial off-the-shelf hardware platforms in space projects and the availability of ready-to-use exploit mitigation strategies for satellite platforms. Since mitigations are seemingly available, we explore the effects of cosmic radiation on software-based exploit mitigations by implementing RadSim, an automated tool for simulating single-event errors (bitflips). Our study simulated over 21 billion faults in differently hardened satellite firmware binaries to assess the fault tolerance of exploit mitigation strategies in the presence of cosmic radiation. Our results reveal that some mitigations barely impact fault tolerance, while others increase the error probability of hardened satellite firmware by up to 19%. These findings provide novel insights into the trade-offs between exploit mitigation effectiveness and radiation resilience, offering guidance to satellite developers on optimizing security in space-based systems.
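The fault-injection primitive behind such a simulation is simple; the engineering effort lies in scale and in re-executing the firmware after every flip. A minimal sketch of one injected single-event upset (RadSim's actual harness is far more involved):

```python
import random

def inject_bitflip(image: bytearray) -> tuple[int, int]:
    """Simulate one single-event upset: flip a random bit in the binary image."""
    offset = random.randrange(len(image))
    bit = random.randrange(8)
    image[offset] ^= 1 << bit
    return offset, bit

firmware = bytearray(b"\x00" * 4096)    # stand-in for a firmware binary image
offset, bit = inject_bitflip(firmware)
print(f"flipped bit {bit} of byte {offset}")
```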