
AI moderation

On Pro+ plans, every submission runs through a language-model-based moderation classifier. The model reads the submission body and returns a single number between 0 and 1. Higher means more likely spam.

Unlike honeypots or CAPTCHAs, AI moderation catches human spam — manually typed junk that lands real keystrokes on a real keyboard but is still garbage. SEO link drops, copy-pasted bot pitches, harassment, and phishing attempts all score high.

Score range

Scores are floats from 0.0 to 1.0, rounded to two decimals.

Score Interpretation
0.00 – 0.30 Almost certainly legitimate
0.30 – 0.60 Ambiguous — review
0.60 – 0.85 Likely spam
0.85 – 1.00 Almost certainly spam
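The bands above can be sketched as a small helper. The function name is illustrative, and the boundaries assume the upper edge of each band is exclusive (so a score of exactly 0.30 falls into the ambiguous band):

```python
def interpret_score(score: float) -> str:
    """Map a moderation score (0.0-1.0, two decimals) to its documented band.

    Assumption: the upper boundary of each band is exclusive.
    """
    if score < 0.30:
        return "almost certainly legitimate"
    if score < 0.60:
        return "ambiguous, review"
    if score < 0.85:
        return "likely spam"
    return "almost certainly spam"
```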

The score is also broken down by category — harassment, solicitation, phishing, nonsense — when applicable. The category lives next to the score on the submission detail page.

Threshold tuning

Each form has a moderation threshold (default 0.75). Submissions scoring at or above the threshold are filed in the spam folder; submissions below it land in the inbox.

Tune on the form's edit page under AI moderation:

  • Lower the threshold (e.g. 0.55) if you'd rather over-flag and hand-review.
  • Raise it (e.g. 0.90) if you only want the model to catch obvious garbage.
  • Set it to 1.0 to keep the score visible without acting on it.

The model itself isn't tuned per-form. The threshold is the only knob. If the score consistently misses a category of spam you care about, layer a custom rule on top.
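The filing rule is simple enough to state as code. This is a sketch of the documented behaviour (at or above the threshold goes to spam), not the service's actual implementation:

```python
def file_submission(score: float, threshold: float = 0.75) -> str:
    """Return the folder a submission is filed in.

    At or above the threshold -> spam folder; below it -> inbox.
    The 0.75 default matches the documented per-form default.
    """
    return "spam" if score >= threshold else "inbox"
```

Note that because the comparison is "at or above", a threshold of 1.0 only files submissions that score exactly 1.00.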

Where it appears in the UI

On the submission detail page:

  • A coloured pill shows the score (green/yellow/red).
  • The category is shown next to it, if the model returned one.
  • If the submission was filed as spam because of the score, the spam reason will read ai_moderation:0.83 (or whatever the score was).
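If you pull spam reasons out programmatically, the ai_moderation:0.83 format can be parsed with a prefix check. The helper name is illustrative:

```python
def parse_moderation_reason(reason: str):
    """Extract the score from an "ai_moderation:<score>" spam reason.

    Returns the score as a float, or None if the submission was
    filed as spam for some other reason.
    """
    prefix = "ai_moderation:"
    if reason.startswith(prefix):
        return float(reason[len(prefix):])
    return None
```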

On the form overview, the average score over the last 30 days is shown next to the spam-rate chart so you can see drift.

Where it appears in the API

Every submission resource includes:

{
  "id": "subm_01H...",
  "status": "received",
  "ai_moderation": {
    "score": 0.12,
    "category": null,
    "model": "moderation-2025-11"
  },
  "payload": { ... }
}

The model field identifies the model version that scored this submission so you can compare scores across model upgrades.
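A consumer reading this field should handle both shapes: the object shown above on Pro+ plans, and null on plans without AI moderation. A minimal sketch (function name is illustrative):

```python
def moderation_result(submission: dict):
    """Return (score, model) from a submission resource.

    Returns None when ai_moderation is null, which is what the API
    returns on plans that don't run the classifier.
    """
    mod = submission.get("ai_moderation")
    if mod is None:
        return None
    return mod["score"], mod["model"]
```

Keeping the model version alongside the score lets you bucket historical scores per model before comparing them across upgrades.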

Where it appears in MCP

The list-submissions and get-submission MCP tools both return the moderation score. The categorize-submissions prompt uses it to bucket submissions for batch labeling — see Categorization →.

Plan gating

Free, Starter, and Pro plans don't run the moderation classifier. The submission resource on those plans returns "ai_moderation": null. Upgrade to Pro+ in Billing to turn it on. There's no per-submission charge — moderation runs on every Pro+ submission as part of the plan.

What's next