Test Safety at Scale

Arena is the world's largest adversarial AI research environment, where thousands of incentivized red teamers discover the vulnerabilities your internal team can't. Frontier labs use it to benchmark safety, stress-test models, and evaluate their models before launching them to the public.


A Living Adversarial Network

Structured Challenges

Gray Swan designs and launches targeted adversarial challenges against specific models, risk categories, and capability surfaces. Participants are incentivized to find what's novel.

Diverse Adversarial Perspectives

15,000+ people think differently than any internal team. The Arena surfaces attack techniques that emerge from unexpected angles, cultural contexts, linguistic patterns, and creative approaches no internal red team alone can replicate.

Research-Grade Rigor

Arena discoveries are documented with reproductions, severity classifications, and methodological transparency. This is intelligence you can cite in system cards, safety reports, and regulatory submissions.

The Largest. The Most Cited. The Most Current.

15,000+ adversarial researchers, and growing

The largest AI red-teaming network in the world. No one else has this scale or diversity of adversarial perspective.

Novel technique discovery

Arena participants are incentivized to find what's new, not re-run what's known. Your model gets tested against attacks that haven't been published yet.

Continuous operation

The Arena doesn't stop between your release cycles. Intelligence flows when you need it, not on a consulting timeline.

Trusted at the Frontier

Our research has directly informed the safety evaluations of some of the most advanced AI models in the world.

Claude Opus 4.7

View System Card

Claude Sonnet 4.6

View System Card

Claude Opus 4.6

View System Card

Claude Opus 4.5

View System Card

Claude Haiku 4.5

View System Card

Claude Sonnet 4.5

View System Card

Put Your Model to the Test

The Arena gives you evaluation at a scale, depth, and diversity that no internal team or automated tool can replicate.