Researchers claim that leading image-editing AI models can be jailbroken through rasterized text and visual cues, allowing prohibited edits to bypass safety filters and succeed in up to 80.9% of cases.
As LLMs and diffusion models power more applications, their safety alignment becomes critical. Our research shows that even minimal downstream fine-tuning can weaken safeguards, raising a key question ...
A PyTorch implementation of a distributional alignment loss based on the Cauchy-Schwarz (CS) divergence with Kernel Density Estimation (KDE). This module provides a plug-and-play loss function for ...
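Since the blurb only names the technique, here is a minimal sketch of what a CS-divergence loss with Gaussian KDE can look like in PyTorch. The function names (`gaussian_kernel_matrix`, `cs_divergence_loss`) and the fixed scalar bandwidth `sigma` are illustrative assumptions, not the module's actual API; the estimator follows the standard Parzen-window form of the empirical CS divergence.

```python
import torch

def gaussian_kernel_matrix(a: torch.Tensor, b: torch.Tensor, sigma: float) -> torch.Tensor:
    """Pairwise Gaussian kernel values between rows of `a` and `b`."""
    sq_dists = torch.cdist(a, b, p=2).pow(2)           # (N, M) squared Euclidean distances
    return torch.exp(-sq_dists / (2.0 * sigma ** 2))   # Gaussian (RBF) kernel

def cs_divergence_loss(x: torch.Tensor, y: torch.Tensor,
                       sigma: float = 1.0, eps: float = 1e-10) -> torch.Tensor:
    """Empirical Cauchy-Schwarz divergence between samples x ~ p and y ~ q,
    with both densities estimated by Gaussian KDE (Parzen windows):

        D_CS(p, q) = -log( <p, q>^2 / (<p, p> <q, q>) )

    Each inner product of densities is approximated by the mean of the
    corresponding pairwise kernel matrix. D_CS >= 0, and it is 0 iff p = q.
    """
    k_xy = gaussian_kernel_matrix(x, y, sigma).mean()  # cross term, estimates <p, q>
    k_xx = gaussian_kernel_matrix(x, x, sigma).mean()  # self term,  estimates <p, p>
    k_yy = gaussian_kernel_matrix(y, y, sigma).mean()  # self term,  estimates <q, q>
    # Log-domain form of -log(k_xy^2 / (k_xx * k_yy)); more numerically
    # stable than forming the ratio directly when kernel means are small.
    return -2.0 * torch.log(k_xy + eps) + torch.log(k_xx + eps) + torch.log(k_yy + eps)
```

A hypothetical usage, aligning one batch of features to a target batch; the loss is differentiable, so it can be dropped into a training loop like any other PyTorch criterion:

```python
x = torch.randn(128, 16, requires_grad=True)  # features from the model's distribution
y = torch.randn(128, 16)                      # samples from the target distribution
loss = cs_divergence_loss(x, y, sigma=1.0)
loss.backward()
```

One caveat worth noting: a fixed bandwidth is the simplest choice, and strictly speaking the cross term under Gaussian KDE corresponds to a convolved bandwidth of `sigma * sqrt(2)`; a real module would likely expose bandwidth selection (e.g., the median heuristic) as a parameter.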