How To Conduct A Successful Audit Of AI-Driven Software Development

Traditionally, an audit independently examines records, processes and controls to verify compliance and assess financial and operational integrity.

Today, that approach should extend to the software development lifecycle (SDLC) – especially in the age of code written with the help of artificial intelligence (AI) or large language models (LLMs). Chief Information Security Officers (CISOs) and their teams need proof that developers are producing secure products, because one in five organizations has experienced a serious security incident directly tied to AI-generated code. Getting to the root of the problem requires visibility into who is leveraging AI, what tools they are using and where AI-generated code is introduced into the SDLC. This AI-inclusive lifecycle is known as the ADLC: the agentic development lifecycle.

CISOs must feel confident that these tools are approved and safe. A thorough audit will identify specific AI-linked vulnerabilities, and which tools are causing the most issues. Even better, it will transform the information into action.

To be clear, AI/LLM-driven software development creates significant boosts in efficiencies and overall productivity. But it also introduces new, often unmanaged risks. Software vulnerabilities discovered “after the fact” will result in time-consuming fixes and reworks. Security and developer team leaders must work together to find an appropriate balance of efficacy, innovation and protection.

An impactful audit starts with establishing enterprise-level visibility into how AI influences production code. However, such visibility remains elusive. Individual developers have their own preferred LLM tools for daily tasks, but these tools often operate at completely different security proficiency levels, making it extremely difficult for CISOs to report quantifiable risks to stakeholders, and for their teams to enforce governance policies.

Such actions are critical, because our research has shown a range of outcomes when comparing humans and machines on specific security tasks. The best LLMs perform comparably with proficient professionals on only a limited range of secure coding tasks, including the flagging of code smells (structural or design issues) and anti-patterns (common but harmful solutions). But we’ve also seen the tools struggle with DoS protection, insufficient logging and misconfigured permissions, to cite a few examples. Overall, top security-proficient developers will outperform LLMs; average developers will not.

A new source of risk

Indeed, it’s safe to say that the AI boom has created a new category of operational risk – one originating inside the SDLC rather than with external attackers. Consequently, CISOs are encountering greater visibility gaps due to unintentional developer actions, at a time when it’s already challenging to trace accountability and attribution.

To successfully report quantifiable risk to stakeholders, CISOs need to incorporate these variables into a comprehensive audit of AI impact on the SDLC:

AI deployment. Who is using AI tools? How often? Where?

Developer capabilities. Which team members are advanced enough to identify and eliminate LLM-introduced inaccuracies/vulnerabilities? Which ones need upskilling to do so?

Vulnerability assessments. At what stage did something go wrong? How damaging was it?

With this, CISOs can answer essential board-level questions about the audit: Where is AI increasing risks? Which teams or behaviors are driving the risks? Do teams bring the right skills to routinely deploy AI/LLM?

To get to this point, CISOs should work closely with development team leaders to complete the following stages of an effective audit:

Record tool usage. Compile a verifiable record of all AI/LLM assistants deployed for code generation – whether sanctioned or not. Map them directly to code outputs. These steps will allow CISOs to ensure audit and compliance readiness, and acquire the traceability required to meet emerging regulatory directives.
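One way to picture such a record is a mapping from each assistant to the commits it touched. The sketch below is illustrative only: it assumes developers (or tooling) add a hypothetical `AI-Assistant: <tool>` trailer to commit messages, which is not a convention the audit process itself prescribes. Unsanctioned tools that leave no trace would need other detection methods.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical trailer key, assumed for illustration only.
TRAILER_KEY = "AI-Assistant"


def extract_ai_tool(commit_message: str) -> Optional[str]:
    """Return the tool named in an 'AI-Assistant: <tool>' line, or None."""
    # Trailers sit at the end of a message, so scan from the bottom up.
    for line in reversed(commit_message.strip().splitlines()):
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == TRAILER_KEY.lower():
            return value.strip() or None
    return None


def build_usage_record(commits: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each declared AI tool to the commit hashes attributed to it.

    `commits` yields (commit_hash, commit_message) pairs. Commits with no
    trailer are grouped under 'none-declared' so gaps stay visible.
    """
    record: Dict[str, List[str]] = defaultdict(list)
    for commit_hash, message in commits:
        tool = extract_ai_tool(message)
        record[tool or "none-declared"].append(commit_hash)
    return dict(record)
```

Keeping the undeclared commits in their own bucket matters for the audit: a large "none-declared" share is itself a visibility gap worth reporting.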

Evaluate and benchmark these tools – and make fixes. Gauge AI models against known vulnerability patterns, and standardize on those that produce secure products. Use the results to determine approved tool selection and proper governance. Track and oversee Model Context Protocol (MCP) integrations to ensure AI agents connect only to approved tools and data sources. Take advantage of “time travel” auditing to instantly isolate and fix every commit linked to a compromised LLM, avoiding the excessive costs of lengthy, manual code reviews.
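Two of these checks lend themselves to a short sketch. The code below assumes the organization already holds per-commit attribution data; the `CommitRecord` fields, the function names and the idea of comparing MCP servers by plain name are all hypothetical simplifications rather than features of any particular product.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class CommitRecord:
    """Hypothetical attribution record for one commit."""
    commit_hash: str
    model: str              # model recorded for the commit; "" if none
    committed_at: datetime


def commits_to_review(
    records: Iterable[CommitRecord],
    compromised_model: str,
    since: Optional[datetime] = None,
) -> List[str]:
    """Return hashes of commits attributed to `compromised_model`, oldest first.

    If `since` is given, only commits at or after that time are returned.
    """
    hits = [
        r for r in records
        if r.model == compromised_model
        and (since is None or r.committed_at >= since)
    ]
    return [r.commit_hash for r in sorted(hits, key=lambda r: r.committed_at)]


def unapproved_mcp_servers(
    configured: Iterable[str], approved: Iterable[str]
) -> List[str]:
    """Return configured MCP server names that are not on the approved list."""
    return sorted(set(configured) - set(approved))
```

The first function narrows a review to the commits that actually need it; the second is a simple allowlist comparison that could run wherever agent configurations are collected.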

Invest in upskilling. Beyond continuous education and benchmarking, organizations should establish a risk score. Much like a credit score, it weighs multiple factors to determine how much unintentional risk each development team member introduces, based upon their skillsets, practices and oversight capabilities.
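As a rough illustration of how such a score might be composed, the sketch below combines three ratings into a single number. The factor names mirror the three inputs mentioned above, but the 0-to-1 rating scale, the weights and the 0-to-100 output range are assumptions made for the example, not an established scoring standard.

```python
from typing import Mapping

# Hypothetical weights; a real program would calibrate these against
# observed outcomes. They must sum to 1.
WEIGHTS = {"skills": 0.5, "practices": 0.3, "oversight": 0.2}


def risk_score(ratings: Mapping[str, float]) -> float:
    """Return a 0-100 risk score, where higher means more unintentional risk.

    `ratings` holds one value per factor in WEIGHTS, each between 0.0
    (weakest) and 1.0 (strongest).
    """
    for name in WEIGHTS:
        value = ratings[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} rating must be between 0 and 1, got {value}")
    # Weighted average of strengths, inverted so that strong ratings
    # translate into a low risk score.
    proficiency = sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)
    return round(100 * (1 - proficiency), 1)
```

Under these assumed weights, a developer rated 0.8 on skills and 0.5 on both practices and oversight would score roughly 35, and a falling score over time would be one signal that upskilling is working.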

Link AI to business goals. Insights from audits must connect AI tool deployment with productivity, code quality and secure outcomes. This informs decision-makers as they assess which tools to invest in, and how to balance innovation with risk management.

Fortunately, readily available solutions enable CISOs and development team leaders to raise visibility, identify risks and trigger policy-driven training and governance with respect to AI and the SDLC. All of this starts with a comprehensive audit, ultimately resulting in the right people using the right tools – without delegating too much to AI. Done well, these initiatives will help ensure that SDLCs are innovative, productive and safe.