Introduction

Multimodal systems are transforming AI by enabling models to understand and act across language, vision, and other modalities, powering advances in robotics, autonomous driving, and scientific discovery. Yet these capabilities raise serious safety and trustworthiness concerns, especially as traditional safeguards fall short in multimodal contexts. The Workshop on Safe and Trustworthy Multimodal AI Systems (SaFeMM-AI) at ICCV 2025 brings together the computer vision community to address challenges such as hallucinations, privacy leakage, and jailbreak vulnerabilities, among others, and to advance the development of safer, more robust, and more reliable multimodal models.

Important Deadlines

Date	Milestone
~~June 19~~ June 27, 2025 (23:59 GMT)	Full-Paper Submission (Archival track - will appear in ICCV proceedings)
~~July 11~~ July 14, 2025	Full-Paper Notifications
August 15, 2025 (23:59 GMT)	Short-Paper Submission (Non-archival track - will NOT appear in ICCV proceedings)
August 18, 2025	Camera-Ready Full-Paper (Archival Track)
~~September 7~~ September 18, 2025	Short-Paper Notifications
October 10, 2025	Final Poster Submission
October 19, 2025	Workshop Day

Schedule

Time	Event
9:00	Opening Remarks
9:10	Prof. Yoshua Bengio: Avoiding Catastrophic Risks from Uncontrolled AI Agency
9:45	Prof. Florian Tramèr: Negative Progress in Machine Learning Security
10:20	Networking/Coffee Break
10:55	Prof. Yarin Gal: Multimodal AI hallucinations, jailbreaks, and beyond: Some recent work and open questions
11:30	Oral Presentation - Safety Without Semantic Disruptions: Editing-free Safe Image Generation via Context-preserving Dual Latent Reconstruction
11:45	Voxel51 - Sponsored Demo
12:00	Lunch Break
13:30	Oral Presentation - FrEVL: Leveraging Frozen Pretrained Embeddings for Efficient Vision-Language Understanding
13:45	Oral Presentation - Pulling Back the Curtain: Unsupervised Adversarial Detection via Contrastive Auxiliary Networks
14:00	Oral Presentation - On the Importance of Conditioning for Privacy-Preserving Data Augmentation
14:15	Prof. Yao Qin: TBD
14:50	Oral Presentation - Superimposing 3D-models for Privacy in Videos for Self-Supervised Training
15:05	Closing Remarks
15:15	Poster Session & Networking/Coffee Break

Speakers

Yoshua Bengio

Full Professor - Mila, Université de Montréal

Yoshua Bengio is a world-leading expert in artificial intelligence, renowned for his pioneering work in deep learning, which earned him the 2018 A.M. Turing Award alongside Geoffrey Hinton and Yann LeCun. He is a Full Professor at Université de Montréal and the Founder and Scientific Advisor of Mila - Quebec AI Institute. He has received numerous awards, including the prestigious Killam Prize and Herzberg Gold Medal in Canada, CIFAR's AI Chair, Spain's Princess of Asturias Award, and the VinFuture Prize. He is a Fellow of both the Royal Society of London and the Royal Society of Canada, a Knight of the Legion of Honor of France, an Officer of the Order of Canada, and a Member of the UN's Scientific Advisory Board for Independent Advice on Breakthroughs in Science and Technology. In 2024, he was named one of TIME magazine's 100 most influential people in the world. Concerned about the social impact of AI, he actively contributed to the Montreal Declaration for the Responsible Development of Artificial Intelligence and currently chairs the International Scientific Report on the Safety of Advanced AI. In June 2025, he launched LawZero, a new non-profit AI safety research organization that prioritizes safety over commercial imperatives.

Title: Avoiding Catastrophic Risks from Uncontrolled AI Agency

Abstract: AI agentic capabilities are rising exponentially, driven by scientific advances incorporating system 2 cognition into deep networks as well as by the commercial value of automating numerous human tasks. Besides bodily control, this may be the most significant gap that remains to human-level intelligence. Unfortunately, a series of recent scientific observations raise a major red flag: as AIs become better at reasoning and planning, more occurrences of deceptive and self-preservation behaviors are observed. We have not solved the problem of making sure that advanced AIs will follow our instructions, and in some circumstances they are found to cheat, lie, hack computers and try to escape our control, against their alignment training and instructions. Is it wise to design AIs that will soon be smarter than us across many cognitive abilities and could compete with us and try to avoid our control? We propose a safer path going forward: the design of non-agentic but fully trustworthy AIs modeled after a selfless platonic scientist trying to understand the world rather than trying to imitate or please us. For example, such non-agentic Scientist AIs could be used as monitors that reject potentially dangerous inputs or outputs of untrusted AI agents.

Yarin Gal

Associate Professor - University of Oxford

Yarin leads the Oxford Applied and Theoretical Machine Learning (OATML) group. He is an Associate Professor of Machine Learning at the Computer Science department, University of Oxford. He is also the Tutorial Fellow in Computer Science at Christ Church, Oxford, a Turing AI Fellow at the Turing Institute, and Director of Research at the UK Government's AI Security Institute (AISI, formerly the Frontier AI Taskforce). Prior to his move to Oxford he was a Research Fellow in Computer Science at St Catharine's College at the University of Cambridge. He obtained his PhD from the Cambridge machine learning group, working with Prof Zoubin Ghahramani and funded by the Google Europe Doctoral Fellowship.

Title: Multimodal AI hallucinations, jailbreaks, and beyond: Some recent work and open questions

Florian Tramèr

Assistant Professor - ETH Zürich

Florian Tramèr is an Assistant Professor of Computer Science at ETH Zürich, where he leads the Secure and Private AI (SPY) Lab. He is a member of the Information Security Institute and ZISC, and an associated faculty member of the ETHZ AI Center. His research lies at the intersection of Computer Security, Machine Learning, and Cryptography. His work studies the worst-case behavior of deep learning systems from an adversarial perspective, aiming to understand and mitigate long-term threats to user safety and privacy. Under his leadership, the SPY Lab investigates the robustness and trustworthiness of machine learning systems, often using adversarial attacks to probe and improve their security.

Title: Negative Progress in Machine Learning Security

Abstract: There's been a huge amount of research on ML security in the past decade. And yet, I'll argue that ML is significantly less secure today. I'll discuss the growing use of ML in security sensitive applications, the degradation of evaluation standards, and new emerging harmful uses of ML, that all together contribute to a bleak picture of ML security today.

Yao Qin

Assistant Professor - UC Santa Barbara

Yao Qin is an Assistant Professor in the Department of Electrical and Computer Engineering at UC Santa Barbara, affiliated with the Department of Computer Science, where she also co-leads the REAL AI initiative. She is also a Senior Research Scientist at Google DeepMind, working on Gemini Multimodal. She obtained her PhD in Computer Science at UC San Diego, advised by Prof. Garrison W. Cottrell. During her PhD, she also interned under the supervision of Geoffrey Hinton, Ian Goodfellow, and many others.

TBD

Accepted Papers

Organizers

Carlos Hinojosa - KAUST

Adel Bibi - University of Oxford

Jindong Gu - University of Oxford

Yichi Zhang - Tsinghua University

Wenxuan Zhang - KAUST

Lama Alssum - KAUST

Andres Villa - KAUST

Juan Carlos L. Alcazar - KAUST

Karen Sanchez - KAUST

Chen Zhao - KAUST

Mohamed Elhoseiny - KAUST

Bernard Ghanem - KAUST

Philip Torr - University of Oxford

Contact

Stay up to date by following us on @SaFeMMAI.

For any inquiries, feel free to reach out to us by email at safemm.ai.workshop@gmail.com. You may also contact the organizers directly: Carlos Hinojosa.

SaFeMM-AI: Safe and Trustworthy Multimodal AI Systems

ICCV 2025 @ Honolulu, Hawaii