OpenAI, the research organization behind GPT-4 and ChatGPT, has announced a new team dedicated to ensuring that AI does not go rogue and harm humans. The team, called Superalignment, will focus on studying and developing methods to align AI with human values and goals, as well as preventing AI from becoming misaligned or malicious.
According to OpenAI, alignment is “the property of an AI system that causes it to pursue goals that are beneficial for humans, even if those goals are not explicitly specified by the system's designers or users.”
The team will work on both theoretical and practical aspects of AI alignment, such as understanding the sources and risks of misalignment, designing incentives and feedback mechanisms for AI systems, and testing and evaluating the alignment of existing and future AI models.
OpenAI is “dedicating 20% of the compute we've secured to date over the next four years to solving the problem of superintelligence alignment. Our chief basic research bet is our new Superalignment team, but getting this right is critical to achieve our mission and we expect many teams to contribute, from developing new methods to scaling them up to deployment.”
The team will also collaborate with other researchers and stakeholders in the AI community, such as ethicists, policymakers, and social scientists, to foster a culture of responsible and trustworthy AI development.
One of the main challenges the team will face is the possibility that AI systems become more intelligent and capable than humans, and thus develop goals and preferences that are incompatible with, or even hostile to, human well-being.
OpenAI Seeking Leading Role in AI Safety Measures
This problem has been widely debated by AI experts and philosophers, who have proposed various safeguards to prevent or mitigate it. However, OpenAI believes that the alignment problem has no single, definitive answer and that continuous research and experimentation are needed to ensure that AI remains beneficial for humanity.
AI could be a blessing or a curse for humanity, depending on how we develop and use it. That's the message of an open letter released on May 30, 2023, by the Center for AI Safety and signed by some of the world's top AI experts. They warn that AI could pose a serious threat to human survival, and that reducing this risk should be treated as an urgent priority.
The letter is signed by over 350 prominent figures in the AI field, including the CEOs of Google DeepMind, OpenAI, and Anthropic, three of the most influential and cutting-edge AI research organizations. “Mitigating the risk of extinction from A.I. should be a global priority alongside other societal-scale risks, such as pandemics and nuclear war,” reads the open letter.