Is Penetration Testing with LLM Agents Friend or Foe?

Penetration testing with LLMs is becoming more popular, just like everything else powered by AI. The processing capabilities offered by this technology are unprecedented, and its potential only grows as it attracts billions of dollars in investment every year.

At this point, the growth of LLMs is a major factor not only for cybersecurity but also for global economic security. Many people talk about it, but the truth is that, despite all the risks, AI growth is unstoppable. Therefore, the best you can do is exploit its full potential to protect your business’s systems, which means implementing LLM penetration testing. However, to achieve the best results with it, you’ll need to be aware of those risks and mitigate them effectively. In this article, QAwerk’s experts in penetration and AI testing will explain the pros and cons of using LLM agents and how to do so safely.

What Is Penetration Testing with LLM Agents?

Let’s start by defining what LLM penetration testing actually is. In simple terms, this type of testing uses tools powered by Large Language Models to attempt to penetrate your systems. Basically, these agents, fueled by AI, simulate the actions of potential attackers to help you find and evaluate vulnerabilities.

That said, QAwerk’s experts currently identify two categories of implementing LLM agents in cybersecurity:

Using agentic pentesting platforms: These solutions automate traditional pentesting services to some extent. Despite the word ‘agentic’ in the name, they are only partially automated: human testers remain involved and retain major control over the process. LLMs assist in orchestrating existing security tools, but only within multiple guardrails and under human oversight.
Implementing LLM/GenAI red-teaming agents: These are used for LLM application testing. They help perform a comprehensive security analysis by emulating prompt injection, data leakage, jailbreaks, and unsafe tool use. In this case, machines perform LLM security testing more independently of humans.

The real question, however, is whether LLM penetration testing is actually safe and reliable. Some of its risks might sound like paranoia inspired by sci-fi movies. Yet according to numerous studies and reports, including an assessment by the Australian Competition and Consumer Commission, as well as research by Deloitte and McKinsey, AI can be both a friend and a foe in cybersecurity matters. Below, we will break down the main points of how LLM testing is used in security analysis and discuss its strengths and weaknesses.

How Does LLM Penetration Testing Work?

While no two tools are exactly the same, in general, the pipeline for penetration testing with LLM agents looks similar to this:

Defining scope and rules of engagement
This is a fully human step, as the pentesting engineer will need to define what’s in scope and what is forbidden. They must also list the criteria for success.
Context ingestion
This step combines human skill and supervision with model processing. Here, the LLM agent takes in information about targets, architecture hints, authentication constraints, logs, and known assets.
LLM reasoning and planning
At this stage, the LLM agent proposes a prioritized penetration testing plan, built on a logical risk assessment of the data the model has ingested.
Tool orchestration and execution
Agentic AI and platform controls collaborate on this step, presenting the ultimate automated testing combo. The agent triggers the scanning or validation actions as needed by using integrated tools or platform APIs.
Perception and iteration
This is where penetration testing with LLMs makes full use of the model’s capabilities. The agent interprets test results and acts on them, updating its hypotheses and choosing the next tests.
Evidence packaging
This step can be handled jointly by the human expert and automated AI tools. It entails drafting reproducibility notes, listing affected components, and providing the severity rationale.
Human verification and reporting
A human in the loop is a must for mature and complex programs. In LLM penetration testing, human experts confirm exploitability and business impact and remove false positives.
Remediation verification
In essence, this is an agent-first step: the agent reruns targeted tests to confirm that the vulnerabilities have been remediated.
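To make the first and third steps more concrete, here is a minimal Python sketch of how human-defined rules of engagement could filter and prioritize a plan proposed by an agent. It is an illustration under stated assumptions, not how any specific platform works: the names `Scope`, `PlannedTest`, and `filter_plan`, as well as the 1-10 `risk` score, are hypothetical, and the scope check handles IP addresses only.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Scope:
    """Rules of engagement defined by the human pentester (hypothetical model)."""
    allowed_networks: tuple[str, ...]
    forbidden_hosts: frozenset[str] = frozenset()

    def permits(self, host: str) -> bool:
        if host in self.forbidden_hosts:
            return False
        try:
            addr = ip_address(host)
        except ValueError:
            return False  # not an IP literal: deny by default
        return any(addr in ip_network(net) for net in self.allowed_networks)


@dataclass(frozen=True)
class PlannedTest:
    """One step of the plan proposed by the agent (hypothetical model)."""
    host: str
    action: str
    risk: int  # assumed 1-10 priority score assigned by the agent


def filter_plan(scope: Scope, plan: list[PlannedTest]) -> list[PlannedTest]:
    """Drop out-of-scope steps and order the rest by descending risk."""
    allowed = [test for test in plan if scope.permits(test.host)]
    return sorted(allowed, key=lambda test: test.risk, reverse=True)


scope = Scope(
    allowed_networks=("10.0.0.0/24",),
    forbidden_hosts=frozenset({"10.0.0.5"}),
)
plan = [
    PlannedTest("10.0.0.7", "port_scan", 3),
    PlannedTest("10.0.0.5", "sqli_probe", 9),     # explicitly forbidden host
    PlannedTest("192.168.1.1", "port_scan", 5),   # outside the allowed network
    PlannedTest("10.0.0.9", "auth_check", 8),
]
approved = filter_plan(scope, plan)
print([test.host for test in approved])  # ['10.0.0.9', '10.0.0.7']
```

The point of the design is that the scope check sits outside the model: whatever the agent proposes, anything the human did not put in scope never reaches the execution step.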

Note that these processes may differ significantly across tools. Some of the most popular penetration testing LLM agents today are:

Terra Security
Horizon3.ai NodeZero
PentestGPT
PentestAgent (framework)
AutoPentester
“Caido” Assistant
NVIDIA “garak”

Penetration Testing with LLM: Pros and Cons

Using LLM agents in cybersecurity comes with both pros and cons. However, the most important point might not be whether this is a ‘friend or foe’ situation, but that it’s inevitable. At this point, technology and global digitalization are advancing so rapidly that AI has become an integral part of our everyday lives. It’s also used by attackers themselves. Therefore, it’s imperative to use LLM-powered software and network security tools, as they are the only things capable of keeping up with LLM-powered threats.

Objectively, you must consider both the strengths and weaknesses of LLM penetration testing. However, the goal should be to understand how best to implement it and manage the risks associated with AI tech, not to decide whether these tests should be used at all. Right now, skipping this type of testing leaves your systems vulnerable to AI-level threats by default.

Benefits of Using LLM Agents in Cybersecurity

We can sum up the benefits of penetration testing with LLM agents in two simple words: productivity boost. Implementing this technology as part of an automated pipeline, especially for AI testing, improves accuracy, coverage, thoroughness, and speed. Machine learning models can be trained to identify patterns that are nearly unnoticeable to the human eye and to perform tests that are impossible to replicate manually because of the enormous resources required.

We can define the main strengths of penetration testing with LLM like this:

Breadth of coverage
AI agents can cover an extensive attack surface and keep the testing continuous throughout remediation to ensure the highest level of thoroughness during changes.
Speed of testing
Machine learning models perform tests extremely fast, enabling them to process more data than a team of QA testers in a fraction of the time and with a minimal error rate.
‘Glue’ between tools and context
Using MCP-style bridges (Model Context Protocol), LLMs can query exposure context and drive workflows naturally.
Detailed reporting
LLMs are highly effective at turning chaotic data into consistent reports formatted to your exact specifications and visualized for easy understanding.
GenAI application security
Specialized agents can perform LLM penetration testing, systematically probing failure modes that regular pentests don’t cover, thereby making AI itself more secure.

[Image: Penetration Testing with LLM Agents: Friend or Foe?]

Risks of LLM Security Testing

The risks and general cons of implementing penetration testing with LLM agents largely echo the risks of using agentic AI in general (see the image above). The most notable potential issues with LLM penetration testing include:

Hallucinations and overconfidence
No one has yet found a way to eliminate AI hallucinations completely, and even a human in the loop doesn’t offer 100% protection against an agent’s incorrect decisions. The reason is that LLMs can sound extremely confident and offer seemingly sound arguments for their conclusions even when they are completely wrong. In practice, an LLM can stubbornly defend a false finding and, by presenting false data, inadvertently persuade the humans verifying its work to agree with it.
False positives or negatives
These occur more often in systems with complex business logic, authorization flows, and multi-step exploits.
Data exposure
If the data pipeline sends prompts to external models, you risk leaking sensitive information. This risk can be mitigated by setting rules on where data may be stored and processed and by forbidding external calls.
Safety and governance
As the agent’s autonomy grows, so does the risk that it will do something outside your policies. For example, without strict guardrails, an LLM agent can become overly aggressive in its testing.
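As one illustration of reducing the data exposure risk, the sketch below masks a few kinds of sensitive values before a prompt leaves your environment. It is a minimal example under stated assumptions: the `redact` function and the three patterns are hypothetical and far from exhaustive, so a real pipeline would need a much broader, tested rule set.

```python
import re

# Illustrative patterns only (assumed for this sketch); real deployments
# need broader coverage for secrets, tokens, and personal data.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "IPV4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "AWS_KEY_ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def redact(text: str) -> str:
    """Replace every match of each pattern with a bracketed label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


log_line = "admin@example.com logged in from 10.0.0.7 with key AKIAABCDEFGHIJKLMNOP"
print(redact(log_line))
# [EMAIL] logged in from [IPV4] with key [AWS_KEY_ID]
```

Redaction of this kind complements, rather than replaces, a policy that forbids external calls for the most sensitive data.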

The final consideration to keep in mind is that LLMs facilitate testing, making it faster, broader, more thorough, and, to some extent, more accurate. However, AI is still unable to perform well in complex and sophisticated systems. Therefore, intricate testing work that can catch nearly invisible but dangerous vulnerabilities is still reserved solely for human experts who can think creatively and ‘outside the box’.

How to Make the Most of LLM Penetration Testing

If you want to achieve the best results from any type of software testing, especially in terms of security analysis, you need to keep in mind two things. The first is that a combination of human expertise and machine learning power delivers the ultimate accuracy of testing results. Therefore, it shouldn’t be a choice of either manual or automated LLM testing. The best answer is both, especially when the strategy is developed by experienced penetration testing professionals.

The second thing to keep in mind is that penetration testing with LLM agents might not be completely safe. However, nothing really is 100% safe with AI technology because it’s developing so rapidly. It’s a necessary risk that you must learn to mitigate because attackers, your real foes, will use AI solutions against you. Therefore, the only way to build resilience is to harness the power of LLMs to strengthen your defenses.

In addition, do not forget that security verification must be performed regularly. The frequency of penetration testing affects your system’s defenses no less than whether you include LLM agents in the process.

Considering all this, here are some points that will help guide you in building a thorough and effective software penetration testing strategy:

Human-in-the-loop oversight is essential
Experienced pentesters will be most effective during scoping and setting legal and ethical controls. They are also essential for validating impact and reducing noise. This means that human professionals should filter the agent’s output and decide what should actually be assessed or changed. Moreover, they should be responsible for executive communication and the discovery of logical flaws.
Guardrails for LLM penetration testing are mandatory
Making penetration testing with LLMs your friend requires setting strict boundaries for the agentic AI and its implementation. The required guardrails must cover strict scoping, action-approval gates, isolated testing environments, strong provenance, and logging. Also, it’s imperative to establish strict data governance policies that align with the compliance requirements your business must meet.
The hybrid approach is the future of LLM testing
At the moment, the best penetration testing strategy is ‘hybrid’, meaning it includes continuous agentic testing for maximum coverage and human testers to guide the process. AI agents are an excellent accelerator, but not a replacement for the human ability to identify and analyze complex patterns while considering the business’s strategic goals.
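The action-approval gates and logging mentioned above can be sketched in a few lines of Python. This is a simplified illustration, not a reference implementation: `run_with_gate`, the `SAFE_ACTIONS` set, and the action names are hypothetical, and in a real platform the `approve` callback would prompt a human tester instead of returning a fixed answer.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("pentest.audit")

# Hypothetical low-risk actions that may run without approval;
# a real platform would define its own taxonomy.
SAFE_ACTIONS = {"port_scan", "banner_grab"}


def run_with_gate(
    action: str,
    target: str,
    execute: Callable[[str, str], str],
    approve: Callable[[str, str], bool],
) -> Optional[str]:
    """Run safe actions directly; require human approval for everything else."""
    if action not in SAFE_ACTIONS and not approve(action, target):
        logger.warning("DENIED %s on %s", action, target)
        return None
    logger.info("RUN %s on %s", action, target)
    return execute(action, target)


executed = []


def fake_execute(action: str, target: str) -> str:
    """Stand-in for a real tool call; it only records what was run."""
    executed.append((action, target))
    return "done"


# A safe action runs even though the reviewer would say no.
first = run_with_gate("port_scan", "10.0.0.7", fake_execute, lambda a, t: False)
# A risky action is blocked when the reviewer declines.
second = run_with_gate("exploit", "10.0.0.7", fake_execute, lambda a, t: False)
# The same risky action runs once the reviewer approves.
third = run_with_gate("exploit", "10.0.0.7", fake_execute, lambda a, t: True)
```

Every decision, whether approved or denied, goes through the audit logger, which supports the provenance and logging requirements listed above.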

If you are ready to start building and implementing an LLM penetration testing strategy, QAwerk will be happy to share our experience and help you by delivering a plan tailored to your exact needs. Contact us today to start enhancing your system’s security in the most efficient manner.