Anthropic is on a mission to change the game when it comes to evaluating AI models. The company has announced a new program that will fund the development of innovative benchmarks capable of accurately assessing the performance and impact of AI models, including Anthropic’s own Claude family of generative models.
The program, unveiled recently, will offer grants to third-party organizations capable of measuring advanced capabilities in AI models. Applications are being accepted on a rolling basis.
“Our investment in these evaluations is intended to elevate the entire field of AI safety, providing valuable tools that benefit the whole ecosystem,” Anthropic stated in a blog post. The company acknowledges the challenges in developing high-quality evaluations for AI models and the increasing demand for such tools in the industry.
AI benchmarking has long been a problem: existing benchmarks often fail to capture how AI models are actually used in the real world. Anthropic aims to address this by funding new, more challenging benchmarks that focus on AI security and societal implications, built with innovative tools and methods.
Anthropic specifically calls for benchmarks that assess a model’s capabilities in areas like cyberattacks, weapon enhancement, and manipulation of information. The company also plans to develop an early warning system for identifying and assessing AI risks related to national security and defense.
Additionally, Anthropic’s program will support research into benchmarks that explore AI’s potential in scientific research, language processing, bias mitigation, and toxicity detection.
To facilitate these initiatives, Anthropic envisions platforms that let subject-matter experts build their own evaluations and run large-scale trials of models involving thousands of users. The company has hired a coordinator for the program and is open to purchasing or expanding projects it believes have the potential to scale.
Teams interested in participating in the program will have access to various funding options tailored to their needs and project stage. They will also have the opportunity to collaborate with Anthropic’s domain experts.
While Anthropic’s efforts to support new AI benchmarks are commendable, some in the AI community may be concerned that the company’s commercial interests could influence evaluation criteria. Anthropic says that certain evaluations funded by the program should align with its own AI safety classifications.
Despite the potential challenges, Anthropic hopes that its program will set a new standard for comprehensive AI evaluation in the industry. The company aims to collaborate with various stakeholders to drive progress towards this goal, though the extent of cooperation from independent efforts remains to be seen.