AI Safety Index 2024 Results: OpenAI, Google, Meta, xAI Fall Short; Anthropic on Top

The 2024 AI Safety Index reveals poor safety practices at leading firms, with Anthropic earning the top score and Meta failing outright.

The just-released AI Safety Index 2024 from the Future of Life Institute (FLI) has revealed stark shortcomings in AI safety practices across six leading companies, including Meta, OpenAI, and Google DeepMind.

The AI Safety Index ranked Anthropic highest with a “C” grade, but the rest of the companies—Meta, OpenAI, Google DeepMind, xAI, and Zhipu AI—received dismal scores, with Meta failing outright.

The findings underscore the urgent need for stronger governance and risk management in an industry racing to develop increasingly advanced AI systems.

FLI President Max Tegmark described the situation bluntly to IEEE Spectrum: “I feel that the leaders of these companies are trapped in a race to the bottom that none of them can get out of, no matter how kind-hearted they are.”

FLI evaluated the companies on six categories, including risk assessment, governance, and existential safety strategies. Despite the industry’s focus on developing powerful systems, the report highlights a significant gap between technological capabilities and effective safety measures.

Anthropic Leads the Pack with a Mediocre “C”

Anthropic emerged as the best performer in the Index, although its “C” grade indicates there is still substantial room for improvement. The company’s “responsible scaling policy” stood out as a positive example.

This policy requires that all models undergo rigorous risk assessments to identify and mitigate catastrophic harms before deployment. Additionally, Anthropic has consistently achieved strong results on established safety benchmarks, earning it a “B-” in the category of addressing current harms, the highest score in that area.

Despite its relatively strong performance, Anthropic was also criticized for its lack of a comprehensive strategy to manage existential risks—an issue shared by all the companies evaluated.

Source: Future of Life Institute

The reviewers noted that while Anthropic, OpenAI, and Google DeepMind have articulated initial approaches to existential safety, these efforts remain preliminary and insufficient to address the scale of the challenge.

Meta Scores an “F” as Safety Concerns Mount

At the opposite end of the spectrum, Meta received a failing grade for its inadequate governance and safety practices. The report identified major gaps in Meta’s transparency, accountability frameworks, and existential safety strategies. The absence of meaningful safety measures has raised concerns about the company’s ability to manage the risks associated with its AI developments.

Meta’s poor showing highlights a broader issue within the industry. Despite significant investments in AI research, many companies have not prioritized the governance structures needed to ensure their technologies are safe and aligned with societal values. This lack of prioritization is exacerbated by a competitive environment that rewards rapid deployment over caution.

Transparency: A Persistent Weak Spot

Transparency was another critical area where the companies underperformed. Only xAI and Zhipu AI completed the tailored safety questionnaires sent by FLI, earning them slightly higher scores in this category.

The other companies failed to provide detailed insights into their internal safety practices, further emphasizing the opacity that often surrounds AI development.

In response to the report, Google DeepMind issued a statement defending its approach, saying, “While the Index incorporates some of Google DeepMind’s AI safety efforts, and reflects industry-adopted benchmarks, our comprehensive approach to AI safety extends beyond what’s captured.”

However, the reviewers remained unconvinced, pointing out that without greater transparency, it is impossible to assess whether these claims translate into meaningful action.

Max Tegmark stressed the importance of external pressure, noting, “If a company isn’t feeling external pressure to meet safety standards, then other people in the company will just view you as a nuisance, someone who’s trying to slow things down and throw gravel in the machinery. But if those safety researchers are suddenly responsible for improving the company’s reputation, they’ll get resources, respect, and influence.”

Existential Risk Strategies: A Common Failure

One of the most concerning findings in the Index was the universal failure of companies to develop robust existential risk strategies. While most of the firms publicly aspire to create artificial general intelligence (AGI)—AI systems capable of human-level cognition—none have devised comprehensive plans to ensure these systems remain aligned with human values.

Tegmark highlighted the gravity of this issue, stating, “The truth is, nobody knows how to control a new species that’s much smarter than us.”

Stuart Russell, a computer science professor at UC Berkeley and one of the report’s reviewers, echoed this sentiment, warning, “As these systems get bigger, it’s possible that the current technology direction can never support the necessary safety guarantees.”

The lack of concrete existential safety strategies was particularly troubling given the increasing capabilities of AI systems. Reviewers emphasized that without clear frameworks, the risks posed by AGI could escalate rapidly.

History of the AI Safety Index

The AI Safety Index builds on previous FLI initiatives, including the widely discussed 2023 “pause letter,” which called for a temporary halt to the development of advanced AI systems to establish robust safety protocols.

The letter, signed by over 33,700 individuals, including high-profile names like Elon Musk and Steve Wozniak, was ultimately ignored by the companies it targeted. The 2024 Index represents FLI’s latest effort to hold the industry accountable by publicly grading firms on their safety practices.

According to Russell, “The findings suggest that although there is a lot of activity at AI companies that goes under the heading of ‘safety,’ it is not yet very effective. None of the current activity provides any kind of quantitative guarantee of safety.”

The Index aims to foster transparency, empower internal safety teams, and encourage external oversight to prevent potentially catastrophic risks.

A Call for Regulatory Oversight

The FLI report concludes with a call for stronger regulatory frameworks to address the gaps identified in the Index. Tegmark proposed the creation of an oversight body similar to the U.S. Food and Drug Administration (FDA) to evaluate AI systems before they are deployed.

Such an agency could enforce mandatory safety standards, shifting the industry’s competitive pressures from prioritizing speed to prioritizing safety.

Turing Award laureate Yoshua Bengio, another panelist on the Index, supported this proposal, stating, “Evaluations like this highlight safety practices and encourage companies to adopt more responsible approaches.” Bengio argued that independent oversight is essential to ensure that companies do not cut corners in the pursuit of market dominance.

How the Index Was Compiled

The 2024 AI Safety Index was developed through a meticulous evaluation process led by a panel of seven independent experts, including Bengio, Russell, and Encode Justice founder Sneha Revanur.

The panel assessed the companies across 42 indicators spanning six categories: risk assessment, current harms, transparency, governance, existential safety, and communication.

The methodology combined public data, such as policy documents, research papers, and industry reports, with responses to a custom questionnaire. However, the limited participation of companies in this process underscored the challenges of achieving transparency in the industry.

David Krueger, a panelist and AI researcher, remarked, “It’s horrifying that the very companies whose leaders predict AI could end humanity have no strategy to avert such a fate.”

Implications for the Future of AI Governance

The 2024 AI Safety Index paints a sobering picture of the current state of AI governance. While companies like Anthropic are making strides, the overall findings reveal a lack of preparedness to manage the risks posed by advanced AI technologies.

The report calls for urgent action to establish stronger safety protocols and regulatory oversight to ensure that AI development aligns with societal interests.

As Tegmark noted, “Without external pressure and robust standards, the race to develop AI could lead to catastrophic consequences.”

Markus Kasanmascheff
Markus has been covering the tech industry for more than 15 years. He holds a Master’s degree in International Economics and is the founder and managing editor of Winbuzzer.com.
