Update: The article the news story below is based on is satire; its author, Sebastian Carlos, has been publishing similar stories on his Medium profile for years. We apologize for not having noticed that earlier.
Meta’s algorithm for identifying harmful content mistakenly flagged the company’s own mission statement, “bringing the world closer together,” after finding an extremely high correlation between “Meta” and the concept of a “Terrorist Organization.”
The incident led to the swift dismissal of Sebastian Carlos, a newly hired employee tasked with overhauling the algorithm, and raised ethical and technical questions about the role of AI in content moderation, as he recounts in a blog post.
A Bold Overhaul of Content Moderation
The controversy began when developer Sebastian Carlos joined Meta’s “Harmful Content Detection” team. Tasked with improving the efficiency and accuracy of Meta’s content moderation systems, he identified significant shortcomings in the existing algorithm during his first week.
Carlos proposed a radical rewrite in Prolog, a programming language renowned for its ability to handle symbolic reasoning and complex relationships. Prolog’s declarative nature made it particularly well suited to analyzing the nuanced definitions of harmful content.
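The article contains none of the actual code, but a minimal hypothetical sketch can illustrate what a declarative, Prolog-style moderation rule might look like. All predicate names (associated_with/3, harmful_concept/1, contains_term/2, flagged/1) and the correlation scores below are invented for illustration and are not taken from Carlos’s system.

```prolog
% Hypothetical knowledge base of term-to-concept associations with correlation scores.
% Every fact and number here is illustrative only.
associated_with(meta, terrorist_organization, 0.97).
associated_with(meta, corporate_overreach, 0.92).
associated_with(community, charity, 0.60).

harmful_concept(terrorist_organization).
harmful_concept(corporate_overreach).

% Example post: the mission statement, reduced to the terms it contains.
contains_term(mission_statement, meta).
contains_term(mission_statement, world).

% A post is flagged when any term it contains is strongly associated
% with a harmful concept above a fixed threshold.
flagged(Post) :-
    contains_term(Post, Term),
    associated_with(Term, Concept, Score),
    harmful_concept(Concept),
    Score > 0.9.

% ?- flagged(mission_statement).
% true.
```

A real system would derive the association scores from data rather than hard-coding them, but the sketch shows why a declarative, rule-based formulation is attractive for encoding moderation policy: the policy reads as a set of logical statements rather than procedural code.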
To strengthen the algorithm’s contextual understanding, the overhaul incorporated diverse datasets, including Wikipedia, religious texts, and encyclopedias. With this multi-faceted approach, Carlos aimed to ensure cultural and contextual inclusivity, though it also required extensive computational resources.
Related: Zuckerberg’s Move to Stop Fact-Checking Praised by Trump and Musk, Critics Shocked
Carlos explains how Meta’s internal cost metric, humorously referred to as “Guatemala Years” (equivalent to the GDP of Guatemala), was invoked to justify the computational expenses.
The revamped algorithm was designed to process millions of posts daily, analyzing their content against a highly detailed moral topology. According to Carlos, the goal was to create an unbiased system capable of accurately categorizing harmful material. He maintains that the system operated without bias and strictly according to the rules it was programmed to follow.
When AI Turns Inward
During its first major test run, the updated algorithm flagged Meta’s mission statement as harmful content.
Debugging revealed no errors in logic. Instead, the system’s advanced analysis identified a high correlation between the term “Meta” and phrases associated with “terrorism” and “corporate overreach.” The unexpected outcome highlighted the challenges of training AI systems to navigate ambiguous concepts like morality and harm.
As Carlos writes, the flagged statement prompted immediate internal discussions. Some engineers praised the algorithm’s rigor, while others worried about the potential public relations fallout. One senior manager reportedly told Carlos: “Look, this is… impressive, but we can’t hit our OKRs like this,” a remark that highlights the growing divide between technical accuracy and organizational priorities.
The Fallout: Ethics and NDAs
The incident escalated when Carlos presented his findings during a team meeting. Although he demonstrated the logic behind the algorithm’s decisions, his work was met with resistance from higher-ups. Shortly after, he was dismissed and asked to sign a “double NDA,” an obscure legal mechanism designed to enforce strict confidentiality.
A double NDA, also known as a bilateral or mutual NDA, is a legally binding agreement in which both parties disclose confidential information to each other and agree to protect it from further disclosure.
Carlos writes that his lawyer suggested such agreements may nullify each other, allowing him to discuss his experience publicly. Reflecting on his dismissal, he writes, “Considering they fired me for telling some truths, I figure I owe the internet the full story.”
The Role of AI in Moderation
Meta’s content moderation tools are integral to managing the vast amount of user-generated content on its platforms. The company’s reliance on AI has been both a necessity and a source of controversy, with critics pointing to instances of overreach or insufficient action against harmful material.
The incident adds to that scrutiny, raising questions about transparency and accountability in AI decision-making. Critics of Meta, on the other hand, might interpret it as a sign of the algorithm’s efficiency.
Perhaps Meta’s practices and ideological biases really do resemble those of a terrorist organization and the algorithm was spot-on? Without technical details, we can only guess. For Carlos, however, the outcome put an end to an already rocky start at the company.
Carlos also writes that Meta’s augmented reality glasses came up during his interview process: one of the interviewers relied on them to solve a coding challenge, correcting and improving a solution Carlos had proposed. Carlos later confronted the interviewer with this discovery and even used it to negotiate a higher starting salary, effectively blackmailing him.
This side story aside, Carlos’ experience puts a spotlight on the unintended consequences of AI systems and the difficulty of programming algorithms to understand and apply nuanced human concepts like harm, morality, and safety.
The flagged mission statement may have been an anomaly or a precise match. Whatever the case, Meta surely won’t dig into this.
According to Sebastian Carlos, the author of the source article titled “Fired From Meta After 1 Week: Here’s All The Dirt I Got,” the article is complete fiction; Sebastian Carlos writes satire.
We missed that and apologize for not being more diligent. A note has been added to the top of the story.