Arena

About

Arena is a community-powered platform for evaluating AI model performance based on real-world usage. Tens of millions of builders, researchers, and creative professionals use the platform to test frontier models and give feedback on their outputs, including millions of monthly active users. This human-in-the-loop approach informs Arena's public leaderboards, which rank models on performance and reliability grounded in practical use rather than controlled benchmarks alone.

The platform addresses a core challenge in AI development: understanding how state-of-the-art models perform outside laboratory conditions. Leading enterprises and AI research labs rely on Arena's evaluations to assess model reliability and alignment. The leaderboards have become influential in discussions about AI progress across the industry, shaped by feedback from a global user base spanning builders, researchers, and practitioners across creative and technical domains.

Arena was created by researchers from UC Berkeley. The company focuses on transparency and rigor in model evaluation, positioning human feedback and real-world usage as essential to measuring AI advancement. The platform serves as both a testing ground for frontier models and a data source for understanding their practical capabilities and limitations.

Similar companies

Arize AI

Arize AI is the leading AI & Agent Engineering observability and evaluation platform, providing one place for development, observability, and evaluation of AI systems.

Vals AI

Vals AI builds domain-specific benchmarks and evaluation platforms to measure how well large language models perform on real-world industry tasks.

Scale

Scale AI is a San Francisco-based data annotation platform that provides high-quality training data and full-stack AI infrastructure to power machine learning models for enterprises, governments, and AI labs worldwide.

Encord

Encord provides a data development platform for AI teams, covering dataset management and curation, annotation, workforce management, and model evaluation.

Centific

Centific builds data infrastructure and platforms that help organisations train, evaluate, and deploy AI systems, supported by a network of 1.8 million human experts across 230 markets.

Together AI

Together AI is a research-driven AI cloud infrastructure provider enabling developers and enterprises to train, fine-tune, and deploy open-source generative AI models at scale.