Annual Meeting of the Association for Computational Linguistics (ACL)
IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
When Understanding Becomes a Risk: Authenticity and Safety Risks in the Emerging Image Generation Paradigm
International Conference on Learning Representations (ICLR)
Benchmarking Empirical Privacy Protection for Adaptations of Large Language Models
Conference on Neural Information Processing Systems (NeurIPS)
Adjacent Words, Divergent Intents: Jailbreaking Large Language Models via Task Concurrency
Conference on Neural Information Processing Systems (NeurIPS)
Finding and Reactivating Post-Trained LLMs’ Hidden Safety Mechanisms