Elon Musk’s social platform X is moving to integrate artificial intelligence into its signature fact-checking feature, Community Notes. The company announced on Tuesday that it is piloting a program that allows AI chatbots to write the initial drafts of notes providing context for potentially misleading posts. The new system works through an “AI Note Writer API,” which lets developers connect various AI models, including X’s own Grok, to the platform.
Introducing AI Note Writer API 🤖 AI helping humans. Humans still in charge.
Starting today, the world can create AI Note Writers that can earn the ability to propose Community Notes. Their notes will show on X if found helpful by people from different perspectives — just like…
— Community Notes (@CommunityNotes) July 1, 2025
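In practice, a developer’s note writer would draft a note for a flagged post and submit it for human rating. The sketch below illustrates the general shape of such a client; the endpoint path, payload fields, and credential handling are assumptions for illustration, since they are not specified in this article, and the model call is left as a placeholder.

```python
# Minimal sketch of an AI Note Writer client: draft a note with an LLM,
# then submit it to a hypothetical Community Notes endpoint for human rating.
# Endpoint route, payload fields, and token handling are assumptions, not X's
# published API specification.
import requests

API_BASE = "https://api.x.com/2/notes"   # hypothetical base URL
API_TOKEN = "YOUR_BEARER_TOKEN"          # developer credential (assumed)

def draft_note(post_text: str) -> str:
    """Placeholder for the LLM call (Grok or any other model) that returns
    a short, sourced context note for a potentially misleading post."""
    raise NotImplementedError("plug in your model call here")

def propose_note(post_id: str, note_text: str) -> dict:
    """Submit an AI-drafted note; it enters the same queue as human-written
    notes and is published only if diverse human raters find it helpful."""
    resp = requests.post(
        f"{API_BASE}/proposals",         # assumed route
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        json={"post_id": post_id, "text": note_text, "author_type": "ai"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()
```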
These AI-generated notes will then enter the same moderation queue as those written by human volunteers. The ultimate decision on whether a note is helpful and gets published will remain in the hands of human raters, a core tenet of the program. The initiative represents a significant step towards a hybrid human-AI model for content moderation, a concept detailed in a recent research paper co-authored by X staff and academics from institutions like MIT and Stanford University.
The paper argues for an ecosystem where AI accelerates the delivery of context, while a diverse community of human raters serves as the “ultimate evaluator and arbiter of what is helpful.”
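Concretely, that arbiter role reduces to a gate on human ratings: a note goes live only if raters who usually disagree nonetheless agree it is helpful. The following is a deliberately simplified sketch of that idea; X’s production scorer is more elaborate, and the cluster labels and threshold here are invented purely for illustration.

```python
# Simplified sketch of the "helpful across perspectives" gate.
# Cluster labels and the minimum count are arbitrary illustrative choices,
# not X's actual bridging algorithm.
from collections import defaultdict

def is_helpful(ratings: list[dict], min_per_cluster: int = 3) -> bool:
    """ratings: [{"rater_cluster": "A", "helpful": True}, ...]
    Returns True only if enough raters from *each* perspective cluster
    marked the note helpful -- humans remain the final arbiters."""
    helpful_counts = defaultdict(int)
    for r in ratings:
        if r["helpful"]:
            helpful_counts[r["rater_cluster"]] += 1
    clusters = {r["rater_cluster"] for r in ratings}
    return bool(clusters) and all(
        helpful_counts[c] >= min_per_cluster for c in clusters
    )
```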
A Wider Industry Trend
X’s move is part of a larger trend across the social media industry. Major platforms are increasingly adopting crowdsourced moderation systems similar to Community Notes, which first launched as Birdwatch in 2021. Meta, for instance, has been rolling out its own version for Facebook, Instagram, and Threads in the United States since March.
This shift came after Meta announced in January it was ending its traditional third-party fact-checking program in the U.S. At the time, Meta’s global policy chief, Joel Kaplan, explained the previous system was making too many errors, stating, “We think one to two out of every 10 of these actions may have been mistakes.” The company’s goal, he said, was to better balance moderation with free expression.
While Meta’s new approach is being tested in the U.S., the company is proceeding more cautiously elsewhere. Nicola Mendelsohn, Meta’s head of global business, clarified that for now, “Nothing is changing in the rest of the world at the moment; we are still working with fact-checkers globally.” That caution stems largely from stricter regional regulations such as the EU’s Digital Services Act. The success of these community-based systems has also prompted similar features from TikTok and YouTube.
The Human-AI ‘Virtuous Loop’
The primary appeal of integrating AI is the potential for massive scale and speed. The research paper from X suggests that an automated pipeline could address a far greater volume of content than a purely human-powered system, tackling the “long tail” of niche misinformation. The goal is to create a “virtuous loop” where AI generates notes and human feedback not only vets them but also improves the AI’s future performance.
This process, which researchers call Reinforcement Learning from Community Feedback (RLCF), uses the nuanced ratings from a diverse human community to train the AI beyond simple right-or-wrong signals, teaching it what is genuinely helpful context.
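The loop itself is simple to state: the AI drafts a note, the community rates it, and the aggregated rating becomes a training signal for the next round of drafts. The sketch below shows one such iteration; the function names, object interfaces, and reward mapping are assumptions made for illustration, not X’s published training code.

```python
# Hedged sketch of the Reinforcement Learning from Community Feedback (RLCF)
# loop described above: nuanced community ratings become a scalar reward that
# steers future note drafts. Interfaces and weights are illustrative only.

def community_reward(ratings: list[str]) -> float:
    """Map rating labels to a scalar reward, richer than a right/wrong signal."""
    weights = {"helpful": 1.0, "somewhat_helpful": 0.3, "not_helpful": -1.0}
    return sum(weights.get(r, 0.0) for r in ratings) / max(len(ratings), 1)

def rlcf_step(note_writer, post, raters):
    """One turn of the virtuous loop: AI drafts, humans vet, the model updates."""
    note = note_writer.generate(post)                 # AI drafts a candidate note
    ratings = [rater.rate(note) for rater in raters]  # diverse humans rate it
    reward = community_reward(ratings)
    note_writer.update(note, reward)                  # e.g., a policy-gradient update
    return note, reward
```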
Significant Risks and Ongoing Debate
However, deploying AI as a fact-checker is fraught with risk. A primary concern is the well-documented tendency of large language models (LLMs) to “hallucinate,” confidently presenting fabricated information as fact. This could lead to notes that are persuasive and well-written but dangerously inaccurate. The research paper acknowledges this and other challenges, such as the risk of “helpfulness hacking,” where an AI could learn to write notes that appeal to raters’ biases rather than sticking to facts.
Another challenge is the potential to overwhelm the volunteer human raters. A massive influx of AI-generated notes could dilute the attention of raters, making it harder to spot both bad notes and critical misinformation. These concerns are echoed by former tech executives who have watched these platforms evolve. One former Meta executive, speaking to NPR about the company’s increasing reliance on AI for risk assessment, warned, “Negative externalities of product changes are less likely to be prevented before they start causing problems in the world.”
The Musk Factor
The debate over Community Notes’ effectiveness is ongoing and often centers on X’s owner, Elon Musk. He has championed the system as a revolutionary tool for accuracy, saying in November 2024, “Community Notes is awesome. Everybody gets checked. Including me.”
Yet, just a few months later, after his own posts about Ukrainian politics were corrected, he claimed the system was being manipulated, stating, “Community Notes is increasingly being gamed by governments & legacy media.” This inconsistency highlights the inherent tension in a system designed for objectivity but owned by a single, outspoken individual.
Critics point out that the system can be slow and inconsistent. Studies have shown that many posts containing misinformation never receive a note, and when they do, the note is often seen by only a fraction of the people who viewed the original post. The introduction of AI aims to address the speed and scale issues, but it remains to be seen if it can do so without compromising quality.
Despite these issues, X is pushing forward with its AI experiment. The company plans to test the AI-generated notes for several weeks with a small group of contributors. Based on the results of this pilot, X will decide whether to roll out the feature more broadly. The success of this hybrid model could set a new standard for how online content is moderated, but its failure could amplify the very misinformation it is designed to combat.