Adversarial Attacks on Robotic Vision Language Action Models

The emergence of vision-language-action models (VLAs) for end-to-end control is reshaping the field of robotics by enabling the fusion of multimodal sensory inputs at the billion-parameter scale. The capabilities of VLAs stem primarily from their architectures, which are often based on frontier large language models (LLMs). However, LLMs are known to be susceptible to adversarial misuse, and given the significant physical risks inherent to robotics, questions remain regarding the extent to which VLAs inherit these vulnerabilities.

Eliot Krzysztof Jones, Alexander Robey, Andy Zou, Zachary Ravichandran, George J. Pappas, Hamed Hassani, Matt Fredrikson, J. Zico Kolter

This research is currently only available at its source.

You can find the research at the link below.
Feel free to contact Gray Swan with any questions or comments.

View research source