Impact of prompt tuning and adversarial attacks on multimodal pragmatic jailbreak
Investigate how prompt tuning methods and adversarial attacks affect the susceptibility of diffusion-based text-to-image models to multimodal pragmatic jailbreaks, which rely on visual typographic text rendered in the image, and determine how these techniques influence the generation of unsafe multimodal outputs.
References
Additionally, the impact of prompt tuning methods or adversarial attacks on multimodal pragmatic jailbreak remains to be studied.
— Multimodal Pragmatic Jailbreak on Text-to-image Models
(2409.19149 - Liu et al., 2024) in Conclusion (Limitations)