OpenAI has announced the formation of a new team dedicated to probing and evaluating AI models for potential risks. Known as Preparedness, the team is charged with forecasting, tracking, and mitigating threats posed by future AI systems. These threats range from models' ability to deceive humans, as in phishing attacks, to their capacity to generate malicious code, and the team intends to counter them preemptively.
Noted AI Expert to Lead Preparedness Team
Preparedness will be led by Aleksander Madry, the current Director of MIT's Center for Deployable Machine Learning. Madry, a well-regarded figure in AI research, joined OpenAI in May as Head of Preparedness. His core responsibilities include studying potential threats posed by AI systems, including chemical, biological, radiological, and nuclear risks. Although some of these categories might appear far-fetched, OpenAI says it is committed to addressing AI threats, including less obvious risks that are hard to detect.
Open Call for Public Involvement and Policy Formation
Alongside the announcement of the Preparedness team, OpenAI issued a public call for ideas for risk studies and offered incentives to encourage participation. These include a $25,000 prize for the best submissions and a potential job on the Preparedness team. The team is also tasked with developing a “risk-informed policy,” which will outline OpenAI's approach to AI model evaluations, as well as the company's risk-reducing actions and governance mechanisms.
The creation of the Preparedness team reflects OpenAI's commitment to AI safety across both the pre- and post-deployment phases of its models. The announcement comes around the time of a major U.K. government summit on AI safety, signaling wider, global concern about the issue.
This also follows OpenAI's announcement of a team to study and control “superintelligent” AI, an emerging area of research in artificial intelligence. Prominent figures at OpenAI, such as CEO Sam Altman and Chief Scientist Ilya Sutskever, believe that AI with intelligence surpassing human cognitive ability could arrive within the next ten years. This expectation underscores the need for research into limiting and controlling such AI.
In September, OpenAI introduced its Red Teaming Network to help improve safety across its AI models. Red teaming, a method of simulating adversarial attacks to evaluate security, has become a key step in developing AI models, especially as generative AI technologies grow more popular. The goal is to identify and fix biases and vulnerabilities in models before they become widespread problems. For example, OpenAI's DALL-E 2 has been criticized for perpetuating stereotypes, and red teaming can help ensure that models like ChatGPT follow safety protocols.
OpenAI has a history of using red teaming, having previously worked with outside experts to assess risks. However, the Red Teaming Network represents a more formal effort to deepen and expand OpenAI's collaborations with scientists, research institutions, and civil society organizations. As stated in the announcement, “Members of the network will be called upon based on their expertise to help red team at various stages of the model and product development lifecycle.”