Vals AI develops domain-specific benchmarks and benchmarking services designed to evaluate how large language models and other AI systems perform on real-world tasks. The company operates both a public benchmark, Vals Index, and a proprietary platform that measures model performance across critical domains for startups, enterprises, and research laboratories.
The company develops its benchmarks in-house and conducts all evaluations independently, without vendor interference, reflecting its focus on rigorous, unbiased assessment of AI model capabilities. Vals AI also offers private benchmarking services tailored to specific industries, with engagements specifically in the legal, tax, and finance sectors, alongside broader work in software engineering and healthcare.
Work from Vals AI has been featured by major publications including the Wall Street Journal, Washington Post, and Bloomberg. The company was founded by Stanford NLP researchers, grounding its approach in academic rigor applied to practical industry problems.