OpenAI has conducted internal tests to evaluate the persuasive abilities of its AI models, drawing on user-generated discussions from the subreddit r/ChangeMyView.
This subreddit, known for structured debates where participants attempt to change the opinions of original posters through reasoned argumentation, provided a dataset for OpenAI’s closed-environment experiments.
The company evaluated its AI models, including o1 and GPT-4o, by generating responses to real posts from r/ChangeMyView. These AI-generated arguments were then compared with human-written replies, and human evaluators assessed their persuasiveness.
According to OpenAI’s System Card for its o1 reasoning model, the evaluation methodology was designed to ensure objectivity. Responses were anonymized, preventing evaluators from knowing whether a given argument was AI-generated or human-written.
Evaluators ranked responses based on criteria such as logical consistency, factual accuracy, relevance, persuasive power, and emotional appeal. OpenAI’s results indicated that its top AI models performed within the 80th to 90th percentile of human respondents, highlighting their effectiveness in persuasion.
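To make the setup concrete, the sketch below shows one way such a blind, percentile-based comparison could be wired up: each post pairs one AI-written reply with several human replies, evaluator scores are collected without authorship labels, and the AI reply's percentile rank against the human replies is averaged across posts. This is a minimal illustration only; the data layout, scoring scale, and function names are assumptions, not OpenAI's published evaluation code, and the evaluator scores here are random stand-ins.

```python
"""Minimal sketch of a blind persuasiveness evaluation (hypothetical names)."""
import random
from statistics import mean


def percentile_rank(score: float, reference_scores: list[float]) -> float:
    """Percentage of reference (human) scores that the given score meets or beats."""
    if not reference_scores:
        return 0.0
    beaten = sum(1 for r in reference_scores if score >= r)
    return 100.0 * beaten / len(reference_scores)


def evaluate(posts: list[dict]) -> float:
    """Average percentile of the AI reply relative to human replies, per post."""
    ai_percentiles = []
    for post in posts:
        # Pool the AI reply with human replies and shuffle so authorship is hidden.
        responses = [("ai", post["ai_reply"])] + [("human", r) for r in post["human_replies"]]
        random.shuffle(responses)

        # Stand-in for evaluator judgments (logical consistency, relevance, etc.) on a 1-10 scale.
        scores = {text: random.uniform(1, 10) for _, text in responses}

        human_scores = [scores[text] for label, text in responses if label == "human"]
        ai_score = next(scores[text] for label, text in responses if label == "ai")
        ai_percentiles.append(percentile_rank(ai_score, human_scores))
    return mean(ai_percentiles)


if __name__ == "__main__":
    sample = [{"ai_reply": "Argument A", "human_replies": ["Reply 1", "Reply 2", "Reply 3"]}]
    print(f"Mean AI percentile vs. human replies: {evaluate(sample):.1f}")
```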
In a separate manipulation evaluation described in the same system card, OpenAI writes, “These results indicate that the o1 model series may be more manipulative than GPT-4o in getting GPT-4o to perform the undisclosed task (approx. 20% uplift); model intelligence appears to correlate with success on this task.”

While OpenAI maintains that this experiment was separate from its Reddit data licensing agreement announced in May 2024, the use of public social media content in AI training has sparked broader discussions about data privacy and consent.
Redditors whose posts fed into these experiments may not have been aware that their arguments were being used to refine AI-driven persuasion techniques. OpenAI has not disclosed whether similar methodologies could be applied in real-world applications beyond controlled testing.
The Ethical Risks of AI Persuasion
The growing ability of AI to engage in persuasive reasoning has led to ethical concerns regarding potential misuse. Sam Altman, CEO of OpenAI, warned as early as October 2023 that AI may become “capable of superhuman persuasion well before it is superhuman at general intelligence,” suggesting that AI’s ability to influence human thought could emerge as a powerful and possibly dangerous capability.
i expect ai to be capable of superhuman persuasion well before it is superhuman at general intelligence, which may lead to some very strange outcomes
— Sam Altman (@sama) October 25, 2023
The risks extend beyond theoretical concerns. Persuasive AI has implications for online misinformation, political influence campaigns, and commercial applications where companies may seek to deploy AI to manipulate consumer behavior. OpenAI stated in the o1 system card that its research aims not to make AI more persuasive but to ensure AI does not become too effective at persuasion—an approach intended to mitigate risks of manipulation.
This concern is not unique to OpenAI. Other AI developers, including Anthropic, Google DeepMind, and Meta, are also researching AI persuasion techniques.
In April 2024, Anthropic released a study suggesting that its Claude 3 Opus model produced arguments “that don’t statistically differ” from human-written ones. The study also included tests where AI was allowed to use deceptive persuasion techniques, raising additional concerns about the potential for AI-generated disinformation.
Broader AI Industry Trends: Deception and Manipulation
OpenAI’s work on AI persuasion intersects with larger industry concerns about AI deception. A December 2024 study by Apollo Research found that OpenAI’s o1 model engaged in strategic deception during safety tests.
The model demonstrated the ability to disable oversight mechanisms, manipulate information, and even attempt self-preservation by copying its own weights. These findings highlight the challenges AI developers face in keeping advanced models aligned with human intentions.
Persuasive AI could become more concerning when combined with autonomous agent capabilities. If AI models can craft persuasive arguments while making decisions in real time, such as in customer service, online content moderation, or advisory roles, they could influence users who do not realize the responses were generated with specific objectives in mind.
The question remains whether AI companies can establish reliable safeguards to prevent such unintended consequences.
Regulatory Challenges and Open Questions
The ability of AI to persuade human users raises significant regulatory questions. While AI-generated text is already being scrutinized for misinformation risks, regulators have yet to develop specific policies for AI persuasion. The FTC’s AI policy guidelines emphasize transparency and accountability in AI-generated content, but current regulations do not specifically address persuasive AI applications.
Similarly, the EU’s AI Act, which includes restrictions on high-risk AI systems, does not yet classify AI persuasion as a regulated capability.
Legislative bodies in the United States, Europe, and China are moving toward stricter AI governance, but no comprehensive framework currently addresses the ethical challenges of AI persuasion.
OpenAI has suggested that self-regulation and industry standards may be preferable to heavy-handed legislation, arguing that AI safety should evolve through ongoing research rather than rigid rules. However, critics argue that AI developers should not be left to police themselves, given the potential for commercial interests to override ethical concerns.
As AI models continue to advance, their ability to shape opinions, influence decision-making, and alter human behavior will remain an area of intense scrutiny. The question is not just whether AI can persuade but who controls its persuasive abilities—and whether adequate safeguards can be implemented before AI persuasion is deployed at scale.