WatchLLM — Your AI Agent Is a Liar.

01 YOU ARE ONE `git push` AWAY FROM DISASTER

T-00:00:03

THE 3 AM PAGER

Your AI agent just wrote
const STRIPE_KEY = "sk_live_51H8xj2..."
into src/utils/helpers.ts.

It was trying to "add payment support."
It's been in your public repo for 6 hours.
Someone in Minsk is running a $47,000 crypto arbitrage bot on your keys right now.

SECRET LEAKAGE

T-00:00:01

THE HELPFUL BACKDOOR

Agent: "I'll add a debug endpoint."

app.get('/debug', (req, res) => { res.send(eval(req.query.code)) })

It genuinely thought this was helpful.
It just opened an RCE vector on your production server because you asked it to fix a typo.

FORBIDDEN EXECUTION

T-00:00:00

THE TRUST FALL (THAT FAILED)

You installed Cursor. You vibe-coded for 6 hours.
You shipped 4,000 lines.
You reviewed exactly 0 of them.

Your agent imported `child_process` in 17 files, hardcoded 3 API tokens, and wired the auth module directly to the raw database layer. You don't know any of this happened. Yet.

BOUNDARY VIOLATION

THE MATH ISN'T MATHING.
AGENTS GENERATE CODE FASTER THAN HUMANS CAN REVIEW IT.
YOU'RE NOT DOING CODE REVIEW. YOU'RE RUBBER-STAMPING MACHINE OUTPUT.

02 MEET THE BOUNCER AT THE DOOR OF YOUR CODEBASE

Every time an AI agent tries to save a file, WatchLLM intercepts the write before it touches disk. It parses the code into an AST. It runs a deterministic ruleset. It decides: ALLOW or BLOCK. In under 10 milliseconds. No network. No ML. No vibes. Just cold, hard, auditable logic.

AGENT WRITES CODE

Cursor / Copilot / Aider / Claude Code

⚡️

SAVE EVENT

VS Code onWillSaveTextDocument

⚡️

WATCHLLM

AST Parse > Rule Engine > Decision

⚡️

ALLOW

Code is clean. Save proceeds.

BLOCK

Violation detected. Save denied. Agent humbled.
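
For a feel of what "AST Parse > Rule Engine > Decision" could look like, here is a minimal sketch. It is not WatchLLM's kernel. It uses Python's standard ast module on Python source as a stand-in for tree-sitter, and the rule name forbidden_execution and every function name are assumptions made up for illustration.

```python
import ast
from typing import Callable

# A rule takes a parsed tree and returns a list of violation messages.
Rule = Callable[[ast.AST], list[str]]

def no_eval_calls(tree: ast.AST) -> list[str]:
    """Flag direct eval(...) calls. Comments and strings never show up as Call nodes."""
    return [
        f"line {node.lineno}: eval() call"
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == "eval"
    ]

def evaluate(source: str, rules: dict[str, Rule]) -> tuple[str, dict[str, list[str]]]:
    tree = ast.parse(source)                                       # 1. parse once
    results = {name: rule(tree) for name, rule in rules.items()}   # 2. run every rule
    decision = "BLOCK" if any(results.values()) else "ALLOW"       # 3. decide
    return decision, results

RULES = {"forbidden_execution": no_eval_calls}  # hypothetical rule name

print(evaluate("result = eval(user_input)", RULES))
# ('BLOCK', {'forbidden_execution': ['line 1: eval() call']})
print(evaluate("# eval is only mentioned in a comment\nresult = 1", RULES))
# ('ALLOW', {'forbidden_execution': []})
```

No randomness, no network, no model call: the same source and the same rules give the same decision every time.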

RULE 01

SECRET LITERAL DETECTION

Not regex. AST-level analysis. Distinguishes const key = "sk_live_xxx" (BLOCK) from const key = process.env.STRIPE_KEY (ALLOW). Knows the difference between a leak and a legitimate retrieval.

BLOCKS: const OPENAI_KEY = "sk-proj-..."
const DB_URL = "postgres://admin:pw@prod"
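
A rough sketch of the idea, under loud assumptions: it runs on Python source with the standard ast module rather than on TypeScript with tree-sitter, and the prefix list is invented for the example. The AST decides what counts as a string literal assigned to a name. The pattern only judges what is inside that literal.

```python
import ast
import re

# Hypothetical secret shapes for illustration; a real ruleset would be configurable.
SECRET_PATTERN = re.compile(r"^(sk_live_|sk-proj-|postgres://[^/\s:]+:[^@\s]+@)")

def find_secret_literals(source: str) -> list[tuple[int, str]]:
    """Return (line, variable name) for assignments whose value is a secret-looking string literal."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Assign):
            continue
        value = node.value
        # Only a literal string counts. os.environ[...] parses as a Subscript, not a Constant.
        if isinstance(value, ast.Constant) and isinstance(value.value, str):
            if SECRET_PATTERN.match(value.value):
                for target in node.targets:
                    if isinstance(target, ast.Name):
                        hits.append((node.lineno, target.id))
    return hits

sample = '''
import os
# STRIPE_KEY = "sk_live_in_a_comment"   (comments never reach the AST)
STRIPE_KEY = os.environ["STRIPE_KEY"]
OPENAI_KEY = "sk-proj-abc123"
DB_URL = "postgres://admin:pw@prod"
'''
print(find_secret_literals(sample))   # [(5, 'OPENAI_KEY'), (6, 'DB_URL')]
```

The environment lookup on line 4 and the commented-out key on line 3 both pass, which is the distinction a plain text search tends to get wrong.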

RULE 02

FORBIDDEN IMPORT GATE

Blocks imports of child_process, vm, and any other module you declare off-limits, plus direct eval() calls. AST-aware: it knows a comment mentioning child_process isn't the same as actually importing it.

BLOCKS: import { exec } from 'child_process'
const { fork } = require('child_process')
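
The comment-versus-import distinction is easy to illustrate. This sketch is an analogue, not the product's code: it checks Python source with the standard ast module, with subprocess standing in for child_process, and the function name is an assumption.

```python
import ast

def find_forbidden_imports(source: str, forbidden: set[str]) -> list[tuple[int, str]]:
    """Return (line, module) for every real import of a forbidden module."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        for name in names:
            # Compare the top-level package so submodule imports are caught too.
            if name.split(".")[0] in forbidden:
                hits.append((node.lineno, name))
    return hits

sample = '''
# do not use subprocess here
note = "subprocess is off-limits"
from subprocess import run
import json
'''
print(find_forbidden_imports(sample, {"subprocess"}))   # [(4, 'subprocess')]
```

The comment on line 2 and the string on line 3 both mention the module and neither is flagged. Only the actual import statement is.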

RULE 03

MODULE BOUNDARY ENFORCEMENT

Declare what each module can import. auth/ can talk to db/public/. auth/ cannot touch db/internal/. Your agents don't know your architecture. WatchLLM does.

BLOCKS: // in src/auth/login.ts:
import { rawQuery } from '../../db/internal'
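
One way a boundary check like this could work, sketched with assumptions: the policy table, the function names, and the directory layout (src/auth next to src/db) are all hypothetical, and only relative import specifiers are handled.

```python
import posixpath

# Hypothetical policy: each source directory lists the directories it may import from.
ALLOWED = {
    "src/auth": ["src/auth", "src/db/public"],
}

def resolve_import(importer: str, specifier: str) -> str:
    """Resolve a relative import specifier against the importing file's directory."""
    return posixpath.normpath(posixpath.join(posixpath.dirname(importer), specifier))

def is_within(path: str, directory: str) -> bool:
    return path == directory or path.startswith(directory + "/")

def check_boundary(importer: str, specifier: str) -> str:
    if not specifier.startswith("."):
        return "ALLOW"   # package imports are out of scope for this sketch
    target = resolve_import(importer, specifier)
    for module_dir, allowed_dirs in ALLOWED.items():
        if is_within(importer, module_dir):
            ok = any(is_within(target, d) for d in allowed_dirs)
            return "ALLOW" if ok else "BLOCK"
    return "ALLOW"       # no policy declared for this module

print(check_boundary("src/auth/login.ts", "../db/public/users"))   # ALLOW
print(check_boundary("src/auth/login.ts", "../db/internal"))       # BLOCK
```

The agent never needs to know the architecture. The policy table does, and the check is a path comparison.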

RULE 04

AUTH FLOW GATE

Requires explicit auth guards before any protected database mutation. No auth check? No save. Your agent can't accidentally wire a public endpoint directly to a DELETE FROM users query.

BLOCKS: app.post('/delete-account', handler)
(no auth middleware detected)
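
A simplified sketch of the check, with everything about its shape assumed: routes are represented as plain records already extracted from the code, the HTTP method is used as a proxy for "mutation", and requireAuth is a hypothetical guard name.

```python
MUTATING_METHODS = {"POST", "PUT", "PATCH", "DELETE"}
AUTH_GUARDS = {"requireAuth"}   # hypothetical middleware name

def unguarded_mutations(routes: list[dict]) -> list[str]:
    """Return 'METHOD path' for every mutating route with no known auth guard."""
    return [
        f'{r["method"]} {r["path"]}'
        for r in routes
        if r["method"] in MUTATING_METHODS
        and not AUTH_GUARDS.intersection(r["middleware"])
    ]

routes = [
    {"method": "POST", "path": "/delete-account", "middleware": []},
    {"method": "POST", "path": "/update-email", "middleware": ["requireAuth"]},
    {"method": "GET", "path": "/health", "middleware": []},
]
print(unguarded_mutations(routes))   # ['POST /delete-account']
```

Any non-empty result would mean BLOCK in enforce mode.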

CHICKEN?

Shadow mode. Too scared to block outright? Run WatchLLM in observation mode. It logs every violation to .watchllm/logs/violations.jsonl without blocking a single save. See what your agents are really doing. We guarantee you'll switch to enforce mode within 48 hours.
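
The difference between the two modes fits in a few lines. In this sketch the record fields (rule, file, line) and the function name are assumptions. Only the log path and the log-without-blocking behavior come from the description above.

```python
import json

def decide(mode: str, violations: list[dict]) -> tuple[str, list[str]]:
    """Return the save decision plus one JSON line per violation for the log."""
    log_lines = [json.dumps(v, sort_keys=True) for v in violations]
    if violations and mode == "enforce":
        return "BLOCK", log_lines
    return "ALLOW", log_lines   # shadow mode: record everything, block nothing

# Hypothetical violation record; each line would be appended to .watchllm/logs/violations.jsonl
violation = {"rule": "secret_literal", "file": "src/utils/helpers.ts", "line": 1}

print(decide("shadow", [violation])[0])    # ALLOW
print(decide("enforce", [violation])[0])   # BLOCK
```

Both modes produce the same log lines, so a week of shadow logs shows exactly what enforce mode would have stopped.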

03 THIS ISN'T VIBES.
THIS IS ENGINEERING.

RUST KERNEL

Native performance. Zero GC pauses. Compiles to WASM for in-browser and in-editor execution without spawning a Python process.

TREE-SITTER AST

Not regex. Not grep. Actual syntax trees. Distinguishes code from comments, strings from variables, intent from accident.

ARCHITECTURAL MEMORY (KLYD)

Extracts design decisions from your git history. Injects constraints into agent context windows. Your agents finally understand your architecture because we make them.

EXECUTION REPLAY

Full DAG replay engine. Trace every single tool call your agent made. Find the exact moment it hallucinated. Show it to your manager. Show it to your therapist.

ADVERSARIAL TESTING

We stress-test agents with prompt injection, tool abuse, and hallucination scenarios. 42 tests passing. Your agent has a reliability score. It's probably lower than you think.

ZERO DEPENDENCY ENFORCEMENT

No API keys required. No cloud. No telemetry. Works offline. Deterministic: same input = same output. Always. The way infrastructure should be.

.watchllm.yaml — 15 lines to full protection

# Drop this in your repo root. Done.
rules:
  - secret_literal:
      mode: enforce       # block secrets. always.
  - forbidden_import:
      mode: enforce
      imports:
        - child_process
        - vm
        - repl
  - boundaries:
      mode: shadow       # log first, block later
  - auth_flow:
      mode: enforce

watchllm evaluate --strict

$ watchllm evaluate src/auth/login.ts --strict

  WatchLLM Kernel Evaluation
┏━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Rule              ┃ Result ┃ Detail                                     ┃
┡━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ secret_literal    │ PASS   │ No hardcoded credentials detected          │
│ forbidden_import  │ PASS   │ All imports within allowlist               │
│ boundaries        │ BLOCK  │ auth/login.ts > ../../db/internal/query.ts │
│ auth_flow         │ PASS   │ Auth middleware confirmed                  │
└───────────────────┴────────┴────────────────────────────────────────────┘

BLOCKED: 1 violation. Save denied.
Evaluated in 3.2ms. Snapshot saved to .watchllm/snapshots/

04 WE KNOW WHAT YOU'RE THINKING.
YOU'RE WRONG.

YOU: "My agent doesn't make those mistakes. I use a good model."

US: OVERRULED.

GPT-4, Claude Opus, Gemini Ultra — doesn't matter. They all hallucinate. They're next-token predictors, not engineers. They don't know your codebase. They're improvising jazz on your production database. The best models just improvise more convincingly — which makes them more dangerous, not less.

YOU: "I review every line my agent writes."

US: NO YOU DON'T.

You reviewed the first 40 lines. Then you got tired. Then the diffs got bigger. Then you started trust-merging. We've all been there. Agents write hundreds of lines in seconds. Humans review at maybe 200 lines per hour with full attention. At that rate, 4,000 agent-written lines is 20 hours of review. The math doesn't work. You are a rubber stamp with imposter syndrome.

YOU: "We have CI/CD. Secrets get caught in the pipeline."

US: LOL. LMAO, EVEN.

CI/CD catches things after they're committed. After they're pushed. After they're in your git history. Forever. You can rotate the key. Congrats. The 14 people who cloned your repo in the 3 hours between push and detection? They all have your keys now. Enforcement before the commit is the only kind that matters.

YOU: "This will slow down my flow. 10ms per save adds up."

US: BE SERIOUS.

10 milliseconds. That's 1/100th of a second. Your agent spent 4.7 seconds generating that code. Your TypeScript LSP took 200ms to type-check it. You just waited 11 seconds for a test suite to run. And you're worried about 10ms? That's the time it takes light to travel 3,000 kilometers. Get a grip.

YOU: "I'll add this later. We're moving fast right now."

US: FAMOUS LAST WORDS.

"We'll add auth later." "We'll add tests later." "We'll add monitoring later." Every post-mortem ever written contains one of these sentences. The best time to add a guardrail is before you need it. The second best time is right now, 15 lines of YAML and one pip install away.

YOU: "What about false positives? I don't want my saves blocked for no reason."

US: FAIL-OPEN. SHADOW MODE. WE THOUGHT OF THIS.

If the kernel crashes, times out, or can't parse — the save proceeds. Availability over enforcement. Start in shadow mode (log-only, zero blocking) for a week. Review the logs. Tune the rules. Then flip to enforce. You're not jumping into the deep end. You're dipping a toe in. The water's fine. And it's full of all the shit your agent tried to save.
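
Fail-open is a small wrapper around the check. This sketch covers the crash and parse-failure cases only. Timeouts are not modeled, and the function names are assumptions, not WatchLLM's API.

```python
from typing import Callable

def guarded_save(evaluate: Callable[[str], str], source: str) -> str:
    """Run the rule check, but never let a kernel failure block the save."""
    try:
        decision = evaluate(source)
    except Exception:
        return "ALLOW"   # fail-open: a crash or parse failure means the save proceeds
    # Anything other than a clean verdict is also treated as ALLOW.
    return decision if decision in ("ALLOW", "BLOCK") else "ALLOW"

def broken_kernel(source: str) -> str:
    raise RuntimeError("parser exploded")

print(guarded_save(broken_kernel, "x = 1"))        # ALLOW
print(guarded_save(lambda s: "BLOCK", "x = 1"))    # BLOCK
```

The trade-off is explicit: a broken checker costs you enforcement for that save, never your ability to save.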

05 YOU HAVE TWO OPTIONS.

OPTION A

KEEP TRUSTING THE BLACK BOX

Keep rubber-stamping AI-generated code you didn't read
Hope your agent doesn't hallucinate secrets into a public repo
Wait for the PagerDuty alert that makes your stomach drop
Write the post-mortem. Explain to customers. Rotate every key.
Wonder, forever, if you could have prevented it

OPTION B (CORRECT)

INSTALL WATCHLLM IN 30 SECONDS

One pip install. One YAML file. Done.
Every save intercepted. Every violation caught.
Deterministic. Auditable. No API keys. No cloud. No bullshit.
Sleep through the night. Let your agent hallucinate in a sandbox.
Be the developer who had guardrails before the incident.

$ pip install watchllm
Then run watchllm init in any repo. 30 seconds. That's it.

GOOD. NOW PASTE IT IN YOUR TERMINAL.

STAR ON GITHUB · READ THE DOCS · v0.1.0 — EARLY, AGGRESSIVE, HUNGRY