Microsoft just built a scanner that exposes hidden LLM backdoors before poisoned models reach enterprise systems worldwide ...
If organizations want their learning and development efforts to produce results, they need to redesign the infrastructure ...
Nvidia-led researchers unveiled DreamDojo, a robot “world model” trained on 44,000 hours of human egocentric video to help ...
As LLMs and diffusion models power more applications, their safety alignment becomes critical. Our research shows that even minimal downstream fine‑tuning can weaken safeguards, raising a key question ...
Improving care transitions for patients with Opioid Use Disorder and Stimulant Use Disorder in inpatient, primary care and ...
Customers are 32% more likely to buy a product after reading a review summary generated by a chatbot than after reading the ...
"Safety alignment is only as robust as its weakest failure mode," Microsoft said in a blog accompanying the research. "Despite extensive work on safety post-training, it has been shown that models can ...
The degradation is subtle but cumulative. Tools that release frequent updates while training on datasets polluted with ...
Model poisoning is so hard to detect that Ram Shankar Siva Kumar, who founded Microsoft's AI red team in 2019, calls detecting these sleeper-agent backdoors the "golden cup," and anyone who claims to ...
Researchers at the Department of Energy's Oak Ridge National Laboratory have developed a deep learning algorithm that ...
Microsoft’s research shows how poisoned language models can hide malicious triggers, creating new integrity risks for ...
Microsoft develops a lightweight scanner that detects backdoors in open-weight LLMs using three behavioral signals, improving ...