CISPA Helmholtz Center for Information Security

2025-12-05

Adjacent Words, Divergent Intents: Jailbreaking Large Language Models via Task Concurrency

Conference / Medium

Conference on Neural Information Processing Systems (NeurIPS)

Tags

Trustworthy Information Processing

Authors

2025-12-03

Finding and Reactivating Post-Trained LLMs’ Hidden Safety Mechanisms

Conference / Medium

Conference on Neural Information Processing Systems (NeurIPS)

Tags

Trustworthy Information Processing

Authors

2025-08-13

Generated Data with Fake Privacy: Hidden Dangers of Fine-tuning Large Language Models on Generated Data

Conference / Medium

USENIX Security Symposium (USENIX Security)

Tags

Authors

Atilla Akkus
Masoud Poorghaffar Aghdam
Mingjie Li
Junjie Chu
Michael Backes
Yang Zhang
Sinem Sav

2025-04-01

SaLoRA: Safety-Alignment Preserved Low-Rank Adaptation

Conference / Medium

International Conference on Learning Representations (ICLR)

Tags

Trustworthy Information Processing

Authors

2024-10-14

ZeroFake: Zero-Shot Detection of Fake Images Generated and Edited by Text-to-Image Generation Models

Conference / Medium

ACM Conference on Computer and Communications Security (CCS)

Tags

Trustworthy Information Processing

Authors

Mingjie Li

Publications by Mingjie Li
