An AI security audit should be judged by reproducibility. If a finding cannot point to code, show the exploit path, justify its severity, and give a command or test that supports the claim, it is not ready for a security decision. Tangle Code Auditor is being shaped around that standard: agent-assisted review with sandboxed execution and proof-backed reports.
The upcoming public surface is audit.tangle.tools. Until it is live, do not treat the URL as a working product page.
Finding Bar
| Field | Required content |
|---|---|
| title | specific vulnerability, not a category label |
| affected code | file and function or contract path |
| exploit path | how an attacker reaches the issue |
| impact | what breaks and who loses value |
| reproduction | command, test, or proof notes |
| severity | why the level is justified |
| fix | practical mitigation |
This is the difference between “possible reentrancy” and “this function can be reentered before balance update, here is a failing test.”
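The finding bar can be sketched as a record with a completeness check. This is a minimal illustration, not Tangle Code Auditor's actual schema: the `Finding` class and the `missing_fields` helper are hypothetical names, and the field names are simply taken from the table above.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass
class Finding:
    """One audit finding. Field names mirror the Finding Bar table (assumed schema)."""
    title: str
    affected_code: str
    exploit_path: str
    impact: str
    reproduction: str
    severity: str
    fix: str


def missing_fields(finding: Finding) -> list[str]:
    """Return the names of fields that are empty or whitespace-only."""
    return [f.name for f in fields(finding) if not getattr(finding, f.name).strip()]


# A category label with a severity attached, and nothing else.
vague = Finding(
    title="possible reentrancy",
    affected_code="",
    exploit_path="",
    impact="",
    reproduction="",
    severity="high",
    fix="",
)
print(missing_fields(vague))
```

A finding that leaves any of these fields empty is a lead to investigate, not yet a reportable result.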
Tools Are Inputs, Not The Report
Static analyzers help, but their output needs triage. The OWASP Web Security Testing Guide (WSTG) is useful for structuring web tests. CodeQL and Semgrep are useful for code search and static analysis. The audit agent should run those tools and then explain which results are real in the target repository.
| Tool output | Agent responsibility |
|---|---|
| warning | inspect reachability |
| dataflow path | check exploitability |
| failing test | explain root cause |
| build failure | separate setup issue from vulnerability |
| duplicate finding | merge or discard |
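The "merge or discard" row is the easiest one to illustrate. The sketch below groups warnings from different tools that point at the same location. `ToolWarning` and `merge_duplicates` are hypothetical names, and the `rule` field assumes the warnings were already normalized to a shared label, since each tool uses its own rule identifiers.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolWarning:
    tool: str      # which analyzer produced the warning
    rule: str      # assumed: a label already normalized across tools
    file: str
    function: str


def merge_duplicates(warnings):
    """Group warnings that share rule, file, and function; list the tools that agree."""
    groups = defaultdict(list)
    for w in warnings:
        groups[(w.rule, w.file, w.function)].append(w.tool)
    return {key: sorted(set(tools)) for key, tools in groups.items()}


# Illustrative input only; these are not real tool results.
raw = [
    ToolWarning("semgrep", "sql-injection", "api/users.py", "find_user"),
    ToolWarning("codeql", "sql-injection", "api/users.py", "find_user"),
    ToolWarning("semgrep", "path-traversal", "api/files.py", "download"),
]
merged = merge_duplicates(raw)
print(len(merged))
```

Merging only shrinks the queue. Each remaining group still needs the reachability and exploitability checks from the table.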
Severity Discipline
High and critical findings need proof. A useful audit runtime should downgrade severe claims when no exploit or loss path is shown.
claim
-> inspect reachable code path
-> create or run reproduction
-> estimate impact
-> assign severity
-> downgrade if proof is missing
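The flow above can be sketched as a small severity gate. The `Claim` type, the severity scale, and the `medium` cap for unproven claims are all assumptions made for illustration; the article says only that unproven severe claims should be downgraded, not to which level.

```python
from dataclasses import dataclass

# Assumed scale, lowest to highest.
SEVERITY_ORDER = ["info", "low", "medium", "high", "critical"]


@dataclass
class Claim:
    severity: str              # severity as originally claimed
    path_reachable: bool       # a reachable code path was inspected
    reproduction_passed: bool  # a reproduction was created or run and triggered the issue
    impact_shown: bool         # an exploit or loss path was demonstrated


def assign_severity(claim: Claim, cap: str = "medium") -> str:
    """Keep the claimed severity only when every proof step holds; otherwise cap it."""
    proven = claim.path_reachable and claim.reproduction_passed and claim.impact_shown
    if proven:
        return claim.severity
    claimed = SEVERITY_ORDER.index(claim.severity)
    ceiling = SEVERITY_ORDER.index(cap)
    # Unproven claims never rise above the cap; claims already below it are unchanged.
    return SEVERITY_ORDER[min(claimed, ceiling)]


print(assign_severity(Claim("critical", True, False, False)))
```

Where the cap sits is a policy choice. The point is that the gate is mechanical, so a severe label cannot survive without its evidence.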
For smart-contract-specific validation, read “Automated Smart Contract Audit With PoC Validation.” For the difference between scanners and agent review, read “AI Vulnerability Scanner Vs Agent Audit.”
Reproduction Packet
Every accepted finding should include a packet a reviewer can run or inspect.
| Packet item | Purpose |
|---|---|
| repo ref | fixes the exact code version under review |
| setup command | separates environment failure from security signal |
| proof command | shows the finding can be triggered or reasoned about |
| expected result | tells the reviewer what should happen |
| observed result | shows the vulnerable behavior |
| proposed patch | gives engineering a concrete next step |
For web application issues, OWASP WSTG gives a useful testing structure. For code search, CodeQL code scanning can surface paths worth reviewing. The audit report should turn those inputs into repo-specific evidence.
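A reproduction packet can be sketched as a record plus a runner that keeps setup failures apart from proof results. Everything here is hypothetical: `ReproPacket`, `run_packet`, and the status strings are illustrative names, the demo commands are stand-ins, and the sketch does not check out `repo_ref` or execute inside a sandbox.

```python
import shlex
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class ReproPacket:
    repo_ref: str         # exact code version under review (recorded, not checked out here)
    setup_command: str
    proof_command: str
    expected_result: str  # what the reviewer should see if the finding holds
    proposed_patch: str


def run_packet(packet: ReproPacket, cwd: str = ".") -> dict:
    """Run setup, then the proof; report setup failure separately from the proof result."""
    # shlex.split assumes POSIX-style quoting in the command strings.
    setup = subprocess.run(shlex.split(packet.setup_command), cwd=cwd,
                           capture_output=True, text=True)
    if setup.returncode != 0:
        # Environment problem, not a security signal.
        return {"status": "setup_failed", "observed_result": setup.stderr.strip()}
    proof = subprocess.run(shlex.split(packet.proof_command), cwd=cwd,
                           capture_output=True, text=True)
    return {
        "status": "proof_ran",
        "exit_code": proof.returncode,
        "observed_result": (proof.stdout + proof.stderr).strip(),
    }


# Stand-in commands that only invoke the current Python interpreter.
demo = ReproPacket(
    repo_ref="<commit under review>",
    setup_command=shlex.join([sys.executable, "-c", "pass"]),
    proof_command=shlex.join([sys.executable, "-c", "print('balance underflow')"]),
    expected_result="balance underflow",
    proposed_patch="<patch reference>",
)
result = run_packet(demo)
print(result["status"], "|", result["observed_result"])
```

Comparing `observed_result` against `expected_result` is left to the reviewer in this sketch, because what counts as a match depends on the finding.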
What To Downgrade
The auditor should downgrade claims like these:
| Claim | Downgrade reason |
|---|---|
| high severity without reachable path | no demonstrated attacker route |
| critical issue without asset loss | impact not proven |
| scanner warning with safe wrapper | context reduces risk |
| duplicate path | same root cause already reported |
| setup failure | environment problem, not vulnerability |
This keeps the report short enough for engineers to act on.
Fix Verification
The audit should not end at “recommendation written.” For important findings, the agent should rerun the reproduction against the patched code and record the result. A good fix note says what changed, which proof no longer works, and whether any residual risk remains. That turns the audit from a report generator into a release gate.
When the reproduction cannot be rerun, the report should say why. A dependency issue, missing fixture, or unavailable chain state is still useful context for the reviewer.
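Fix verification can be sketched as a comparison of the same proof before and after the patch. `FixCheck` and `fix_status` are hypothetical names, and the status wording is illustrative; the one behavior taken directly from the text is that an unrunnable reproduction records its reason.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FixCheck:
    triggered_before_patch: bool
    triggered_after_patch: Optional[bool]  # None means the rerun was not possible
    blocker: str = ""                      # e.g. missing fixture, unavailable chain state


def fix_status(check: FixCheck) -> str:
    """Summarize whether the proof still works against the patched code."""
    if check.triggered_after_patch is None:
        return f"unverified: {check.blocker or 'reason not recorded'}"
    if not check.triggered_before_patch:
        return "inconclusive: proof never triggered on the original code"
    if check.triggered_after_patch:
        return "not fixed: proof still works on patched code"
    return "fixed: proof no longer works on patched code"


print(fix_status(FixCheck(True, False)))
print(fix_status(FixCheck(True, None, blocker="missing fixture")))
```

The "inconclusive" branch guards against a common false positive: a proof that never worked cannot confirm a fix.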
What This Does Not Prove
An AI security audit does not guarantee the absence of vulnerabilities. It produces findings within a defined scope and against an evidence bar. Use it to speed triage, catch obvious and non-obvious issues, and prepare for human review.
Decision Rule
Accept an AI security audit finding only when it includes location, path, impact, reproduction, and fix guidance. Treat unsupported severe claims as hypotheses.
FAQ
What is an AI security audit?
It is a security review assisted by agents that inspect code, run tools, validate findings, and produce a report with evidence.
What makes a finding reproducible?
The reviewer can follow the file references, commands, tests, or proof notes and see why the issue is real.
Does this replace human auditors?
No. It can speed review and catch issues earlier, but humans should review high-risk systems and final release decisions.
Where does Tangle Code Auditor fit?
Tangle Code Auditor is the upcoming audit product for sandboxed, agent-assisted security review.