CISPA Helmholtz Center for Information Security

2025-10-21

Hate in Plain Sight:
On the Risks of Moderating AI-Generated Hateful Illusions

Conference / Medium

IEEE International Conference on Computer Vision (ICCV)

Tags

Trustworthy Information Processing

Authors

2025-10-15

UnsafeBench:
Benchmarking Image Safety Classifiers on Real-World and AI-Generated Images

Conference / Medium

ACM Conference on Computer and Communications Security (CCS)

Tags

Trustworthy Information Processing

Authors

2025-08-14

Bridging the Gap in Vision Language Models in Identifying Unsafe Concepts Across Modalities

Conference / Medium

Usenix Security Symposium (USENIX-Security)

Tags

Trustworthy Information Processing

Authors

2025-08-13

From Meme to Threat:
On the Hateful Meme Understanding and Induced Hateful Content Generation in Open-Source Vision Language

Conference / Medium

Usenix Security Symposium (USENIX-Security)

Tags

Trustworthy Information Processing

Authors

Yihan Ma
Xinyue Shen
Yiting Qu
Ning Yu
Michael Backes
Savvas Zannettou
Yang Zhang

2025-05-08

HateBench:
Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns

Conference / Medium

Usenix Security Symposium (USENIX-Security)

Tags

Trustworthy Information Processing

Authors

2024-08

Prompt Stealing Attacks Against Text-to-Image Generation Models

Conference / Medium

Usenix Security Symposium (USENIX-Security)

Tags

Trustworthy Information Processing

Authors

2024-07

FAKEPCD:
Fake Point Cloud Detection via Source Attribution

Conference / Medium

ACM ASIA Conference on Computer and Communications Security (AsiaCCS)

Tags

Trustworthy Information Processing

Authors

Yiting Qu
Zhikun Zhang
Yun Shen
Michael Backes
Yang Zhang

2023-11-26

Unsafe Diffusion:
On the Generation of Unsafe Images and Hateful Memes From Text-To-Image Models

Conference / Medium

ACM Conference on Computer and Communications Security (CCS)

Tags

Trustworthy Information Processing

Authors