New AI-Powered Tool Uncovers Zero-Day Bugs in Python Codebases

Vulnhuntr leverages Anthropic's Claude AI model to detect dangerous security vulnerabilities like RCE, XSS, and SQL injection in Python projects.

Seattle AI startup Protect AI has launched a tool called Vulnhuntr that scans Python codebases for dangerous vulnerabilities, including zero-day exploits. Vulnhuntr uses Anthropic’s Claude model to dive into Python projects and locate potential security threats by focusing on how user input interacts with the code.

Unlike many static code analyzers, Vulnhuntr doesn’t just pull out isolated bits of code for review. It traces entire call chains, the connections between files, functions, and variables across a project, without losing context, which lets it catch vulnerabilities that simpler tools miss.
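To see why that matters, consider a minimal, hypothetical Python project (not one of the repositories discussed in this article) in which user input passes through several innocuous-looking helpers before reaching a file read. Inspecting any single function reveals nothing suspicious; only by following the whole chain from the request handler to the open() call does the Local File Inclusion become apparent.

```python
# Hypothetical three-function call chain: the flaw is only visible
# when all three pieces are read together.
from flask import Flask, request

app = Flask(__name__)

def resolve_doc_path(name: str) -> str:
    # Looks harmless in isolation: just joins a base directory and a name.
    return "/srv/docs/" + name

def read_doc(path: str) -> str:
    # Also harmless in isolation: reads whatever file it is given.
    with open(path) as fh:
        return fh.read()

@app.route("/docs")
def show_doc():
    # The request parameter flows through both helpers unchecked, so
    # "?name=../../etc/passwd" escapes /srv/docs/ -- a classic LFI.
    return read_doc(resolve_doc_path(request.args.get("name", "")))
```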

How Vulnhuntr Changes the Game for Code Security

Dan McInerney, the researcher behind the project at Protect AI, says that Vulnhuntr works by feeding highly detailed, vulnerability-specific prompts into Claude. From there, the AI enters a loop, calling for additional code snippets until it’s gathered enough information to map the entire path from input to output. In simpler terms, this means the AI doesn’t just stop at spotting risky code; it fully investigates how that code works in the broader project.
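The article does not reproduce Vulnhuntr’s source, so the snippet below is only a rough, illustrative approximation of the loop McInerney describes, not the tool’s actual code. The prompt wording, the CONTEXT_NEEDED convention, and the grep-style symbol lookup are assumptions made for this sketch; only the Anthropic Messages API call reflects a real, documented interface.

```python
# Illustrative sketch of the "request more context" loop described above.
# NOT Vulnhuntr's code: build_vuln_prompt(), extract_code_requests(), and
# fetch_snippet() are simplified stand-ins for its prompt templates and
# static-analysis lookups.
import pathlib
import re

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def build_vuln_prompt(path: str, vuln_class: str) -> str:
    source = pathlib.Path(path).read_text()
    return (
        f"Trace user input through this file and report any {vuln_class} "
        "vulnerability. If you need to see another function or class first, "
        "reply with lines of the form CONTEXT_NEEDED: <name>.\n\n" + source
    )

def extract_code_requests(reply_text: str) -> list[str]:
    return re.findall(r"^CONTEXT_NEEDED:\s*(\S+)", reply_text, flags=re.MULTILINE)

def fetch_snippet(repo: str, name: str) -> str:
    # Crude stand-in for real static analysis: return any file that defines
    # the requested symbol. Vulnhuntr resolves symbols far more precisely.
    for py_file in pathlib.Path(repo).rglob("*.py"):
        text = py_file.read_text(errors="ignore")
        if re.search(rf"^(def|class)\s+{re.escape(name)}\b", text, flags=re.MULTILINE):
            return f"# from {py_file}\n{text}"
    return f"# definition of {name} not found"

def analyze_file(repo: str, path: str, vuln_class: str, max_rounds: int = 10) -> str:
    messages = [{"role": "user", "content": build_vuln_prompt(path, vuln_class)}]
    for _ in range(max_rounds):
        reply = client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=4096,
            messages=messages,
        )
        text = reply.content[0].text
        needed = extract_code_requests(text)
        if not needed:
            return text  # final report: the input-to-sink path plus a verdict
        # Feed the requested definitions back in and let the model continue.
        messages.append({"role": "assistant", "content": text})
        messages.append({"role": "user", "content": "\n\n".join(
            fetch_snippet(repo, name) for name in needed)})
    return "Analysis incomplete: context budget exhausted."
```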

McInerney says Vulnhuntr is far more efficient than current methods, which often flag functions like eval() indiscriminately without determining whether they are actually dangerous. He noted that the tool has already revealed multiple zero-day bugs across several popular open-source projects, flaws that had not previously been reported.
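The eval() point is easy to illustrate. A pattern-matching scanner flags both functions below because each calls eval(), but only the second is exploitable, and telling them apart requires knowing where the argument comes from. The snippet is a hypothetical illustration, not code from any project mentioned here.

```python
# Both functions trigger a naive "eval() found" rule, but only one is a bug.
import math

def plot_constant_curve():
    # eval() over a hard-coded, trusted string: a noisy finding, not exploitable.
    return [eval("math.sin(x / 10)", {"math": math, "x": x}) for x in range(100)]

def plot_user_curve(expression: str):
    # eval() over attacker-controlled input: a genuine remote code execution sink.
    return [eval(expression, {"math": math, "x": x}) for x in range(100)]
```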

Real-World Examples of Vulnhuntr’s Success

Vulnhuntr isn’t just a concept; it has already exposed vulnerabilities in real-world cases. For example, the tool identified serious security flaws such as Local File Inclusion (LFI) and Cross-Site Scripting (XSS) in widely used Python-based repositories such as gpt_academic and ComfyUI, which have tens of thousands of GitHub stars. In one case, Vulnhuntr flagged a Remote Code Execution (RCE) bug in Ragflow, an open-source retrieval-augmented generation (RAG) engine.

These discoveries highlight the AI’s ability to pinpoint issues that could be exploited in the wild and that might otherwise have gone undetected for much longer. So far, the tool focuses on seven classes of high-risk vulnerabilities, including SQL Injection (SQLi), Server-Side Request Forgery (SSRF), and Insecure Direct Object Reference (IDOR).
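For readers less familiar with those classes, Server-Side Request Forgery is a representative example: the server is tricked into fetching an attacker-chosen URL, which can expose internal-only services. The minimal pattern below is hypothetical and not drawn from any of the audited projects.

```python
# Minimal SSRF pattern: the server fetches whatever URL the client supplies.
from flask import Flask, request
import requests

app = Flask(__name__)

@app.route("/preview")
def preview():
    # An attacker can point this at internal-only addresses such as
    # http://169.254.169.254/ (cloud metadata) or http://localhost:8500/.
    url = request.args.get("url", "")
    return requests.get(url, timeout=5).text
```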

Limitations and Costs of Vulnhuntr

Despite its powerful scanning ability, Vulnhuntr does have a few restrictions. First, it currently supports only Python, so developers using other languages won’t benefit just yet. Additionally, because Vulnhuntr depends on static Python code analysis to break a project down before feeding it to the AI, any code written in other languages (such as JavaScript) can lead to false positives or missed vulnerabilities.

Another aspect to keep in mind is the cost. Since Claude’s API isn’t free, running Vulnhuntr on large codebases can incur expenses. McInerney estimates that a scan focusing on one or two critical files might only cost about 50 cents, but if you run a full-scale scan across a large project, it could total anywhere from $1 to $3. That said, for security teams on a budget, selectively scanning key files could keep the costs manageable.

The Future of AI-Powered Security

The introduction of Vulnhuntr marks a significant step for AI’s role in security, as one of the first tools to identify real-world zero-day vulnerabilities using large language models like Claude. While previous research papers have claimed that AI can find security issues, many of those studies tested models against already-known bugs. Vulnhuntr, by contrast, has shown that it can discover completely new vulnerabilities because it considers the broader context of the codebase, including the relationships between different files and functions. This could reshape the way developers and security professionals approach code safety in the future.

Though Vulnhuntr currently operates best using Claude, Protect AI is open-sourcing the project, allowing developers to further adapt it for use with other AI models, including OpenAI’s GPT-4.

Last Updated on November 7, 2024 2:26 pm CET

Markus Kasanmascheff
Markus has been covering the tech industry for more than 15 years. He holds a Master's degree in International Economics and is the founder and managing editor of Winbuzzer.com.
