$ bad run \
--goal "Sign up, verify email, check dashboard" \
--url https://your-app.com \
--model claude-sonnet-4-6 --vision
# turn 1 · navigate
→ goto /signup · 312ms · 200 OK
→ type "[email protected]" · #email
→ click "Create account"
→ wait_for 'h1:has-text("Dashboard")' · 814ms
→ screenshot · artifacts/turn-3.png
✓ Goal achieved · 6 turns · 12.1s · video + DOM trace import { BrowserAgent, PlaywrightDriver } from "@tangle-network/browser-agent-driver";
import { chromium } from "playwright";
const browser = await chromium.launch();
const page = await browser.newPage();
const agent = new BrowserAgent({
driver: new PlaywrightDriver(page),
config: { model: "claude-sonnet-4-6", vision: true },
});
const result = await agent.run({
goal: "Sign up, verify email, check dashboard",
startUrl: "https://your-app.com",
});
✓ result.success // true
✓ result.turns // 6
✓ result.screenshots // ["turn-1.png", ...]
$ bad run --cases tests/e2e.json --concurrency 4
# Running 5 test cases (4 concurrent)
✓ signup · 4 turns · 8.1s
✓ login · 2 turns · 3.4s
✓ create-project · 6 turns · 14.2s
✗ billing · stuck at payment modal
✓ settings · 3 turns · 5.8s
Pass: 4/5 · Fail: 1 · Duration: 31.5s · Report: results/report.html
import { BrowserAgent } from "@tangle-network/browser-agent-driver";
const cases = [
{ id: "signup", goal: "Create account", startUrl: "/signup" },
{ id: "login", goal: "Login with credentials", startUrl: "/login" },
{ id: "billing", goal: "Update payment method", startUrl: "/billing" },
];
const results = await agent.runBatch(cases, { concurrency: 4 });
results.forEach((r) => console.log(r.id, r.success));
✓ 4 passed · 1 failed · 31.5s
$ bad design-audit \
--url https://your-app.com \
--pages "/, /pricing, /docs" \
--extract-tokens
# Auditing 3 pages at 1440×900
Health Score: 82/100
⚠ Typography: 3 inconsistent font sizes
⚠ Spacing: 12px / 16px mixed in cards
✓ Colors: consistent palette
✓ Contrast: WCAG AA compliant
Tokens → design-tokens.json · Screenshots → ./audit-results/
const audit = await agent.designAudit({
url: "https://your-app.com",
pages: ["/", "/pricing", "/docs"],
extractTokens: true,
});
✓ audit.score // 82
✓ audit.findings // [{ type, severity, msg }]
✓ audit.tokens // { colors, fonts, spacing }
$ bad visual-diff \
--baseline ./screenshots/v1 \
--current https://staging.your-app.com \
--threshold 0.02
# Comparing 8 pages against baseline
✓ / · 0.00% diff
✓ /pricing · 0.01% diff
⚠ /dashboard · 4.21% diff, layout shift detected
✓ /settings · 0.00% diff
1 regression detected · Diff images → ./visual-diff/
const diff = await agent.visualDiff({
baseline: "./screenshots/v1",
currentUrl: "https://staging.your-app.com",
threshold: 0.02,
});
for (const r of diff.regressions) {
console.log(r.path, r.diffPercent);
}
✓ 1 regression · /dashboard · 4.21%