Threat Models & Mitigations from Physical AGI
Yesterday, I had the pleasure of speaking at Safe AI Germany (SAIGE) on threat models and mitigations from physical AGI. Frontier AI models are already being deployed on physical hardware in robotics, logistics, lab automation, and autonomous weapons, but we do not yet have good epistemics for reasoning about the safety risks this creates.
The talk outlines five threat scenarios arising from physical AGI capabilities: asymmetric autonomous violence, AI-assisted bioweapons development, infrastructure capture through self-replication, AI-enabled coups, and embodied emotional manipulation. For each, I discuss the state of currently deployed capabilities, plausible trajectories, and candidate mitigations. I argue that progress on these questions is bottlenecked by a lack of empirical evidence, and propose a tentative research agenda aimed at reducing some of these uncertainties.
Slides are available online. Thanks to Jessica Wang and the SAIGE community for organizing the event!