Is your AI model secretly poisoned? 3 warning signs ...
Paper — PBP: Post-Training Backdoor ... by Dung Thuy Nguyen, Ngoc N. Tran, Taylor T. Johnson, and Kevin Leach (Vanderbilt University)
The GRP‑Obliteration technique reveals that even mild prompts can reshape internal safety mechanisms, raising oversight concerns as enterprises increasingly fine‑tune open‑weight models with ...
As LLMs and diffusion models power more applications, their safety alignment becomes critical. Our research shows that even minimal downstream fine‑tuning can weaken safeguards, raising a key question ...
True or chatty: pick one. A new training method lets users tell AI chatbots exactly how 'factual' to be, turning accuracy into a dial you can crank up or down. A new research collaboration between the ...
University of Missouri researchers are developing new ways to better simulate the complex nature of human brain tissue. For ...
Learn how Microsoft research uncovers backdoor risks in language models and introduces a practical scanner to detect tampering and strengthen AI security.
The traditional approach to artificial intelligence development relies on discrete training cycles. Engineers feed models vast datasets, let them learn, then freeze the parameters and deploy the ...
For years, the AI community has worked to make systems not just more capable, but more aligned with human values. Researchers have developed training methods to ensure models follow instructions, ...
Computational models predict neural activity for re-establishing connectivity after stroke or injury
Researchers at The Hong Kong University of Science and Technology (HKUST) School of Engineering have developed a novel ...
Scientists at Hopkins and the University of Florida simulate and predict human behavior during wildfire evacuations, allowing for improved planning and safety ...