Reddit Bot Detection: The Signals Mods Use, Read as a Playbook

Reddit bot detection runs on public signals: Pangram, the Reddit Bot Detector repo, and the moderator threads in r/ModSupport all publish the same ones. Read them as a checklist of what NOT to do, and you have the brief for a resident network that mods don't pattern-match.

Konstantin Anisimov

Founder & CEO, NotPeople · May 18, 2026 · 8 min read

There are roughly five public sources that publish "how to spot an AI-generated Reddit account" guides. Pangram Labs has the most cited one. The open-source Reddit Bot Detector project on GitHub adds a couple more signals. Moderator AMAs in r/ModSupport name behavioral patterns that get accounts banned. A handful of academic papers have analysed the same signals.

If you read all of them as a single document, you have the operating playbook for a resident network that doesn't get caught.

The signals fall into four buckets. We'll walk through each one and how a credible residents operation works around it. This is meant as transparency about how this work is actually done, not as a how-to for getting accounts banned. For a live example of what happens when an agency ignores the inverted checklist, see Reddit's mid-May 2026 ban of two GEO-spam agency subs.

Quick answer

Reddit bot detection leans on a public literature (Pangram, the Reddit Bot Detector repo, moderator AMAs in r/ModSupport) that names four buckets of signals flagging AI-generated accounts: linguistic tells, formatting tells, account-metadata tells and behavioural tells. Read inverted, they become the operating brief for a resident network that mods don't pattern-match. The work is editorial discipline, not better models: strip AI-tic vocabulary, force sentence-length variance, age accounts properly, keep each account inside its niche, human-review every comment before publish.

The rest of this piece walks each bucket and how a credible residents operation works around it. Push the signals too far and you trip the silent version of all this, a Reddit shadowban.

The four buckets, side by side

Bucket	What the engine flags	What a credible operation does
Linguistic	AI-tic vocabulary, essay-grader transitions, uniform sentence length, em-dash overuse	Strip the ~60-word AI-tic blocklist, force sentence variance, cap em-dashes, human edit every comment
Formatting	H2 headers in comments, bullets in casual replies, paragraph breaks every 2 sentences	Reddit-native flowing text, lists only where the sub uses them, match the formatting cadence of the host community
Account metadata	Young account + high posting rate, low karma in the active sub, cross-sub expert-mode footprint, automated timing	2-5 year aged accounts, primary niche of expertise, randomised cadence, 60-90 day off-brand onboarding
Behavioural	Self-similar phrasing, shared templates across accounts, top-of-thread bot replies, coordinated voting, reply-doesn't-engage-prior-comment	Per-account voice profile, varied brand-mention phrasing, threading discipline, no coordinated voting, reply engages the previous comment first

Each row is one column of the public detection literature read against one column of the operating playbook. The four sections below walk the detail under each bucket.

Bucket 1 · Linguistic tells

These are the patterns that any halfway-competent AI classifier picks up first.

The AI-tic vocabulary. Words that LLMs over-use because their training data taught them to sound smart: delve, tapestry, nuance, landscape, realm, multifaceted, pivotal, garner, bolster, commendable. Use any of these in a casual Reddit comment and a human reader feels something off before they can articulate it.
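As a rough illustration of how a vocabulary signal like this can be measured, the sketch below counts hits against the ten words named above. The function name `ai_tic_hits` and the whole-word, case-insensitive matching rule are assumptions made for the example; none of the public detectors documents its word list or matching rules in the sources cited here.

```python
import re
from collections import Counter

# The ten words named in the paragraph above. A production list would be
# longer and would also need to handle inflections ("delves", "nuanced"),
# which this sketch deliberately does not.
AI_TIC_WORDS = {
    "delve", "tapestry", "nuance", "landscape", "realm",
    "multifaceted", "pivotal", "garner", "bolster", "commendable",
}

def ai_tic_hits(comment: str) -> Counter:
    """Count whole-word, case-insensitive matches against AI_TIC_WORDS."""
    tokens = re.findall(r"[a-z]+", comment.lower())
    return Counter(t for t in tokens if t in AI_TIC_WORDS)

sample = "Let's delve into the crypto landscape. The landscape is multifaceted."
hits = ai_tic_hits(sample)
print(dict(hits))
```

A raw hit count is a blunt instrument: a single "landscape" in a photography sub means nothing, so any real use would weigh hits against comment length and the vocabulary of the host community.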

Transition phrases that no human writes. "It's important to remember that...", "In conclusion...", "Furthermore, one might consider...", "Ultimately, it boils down to...". These are essay-grader phrases. Nobody writes like this in a comment thread. The same signature shows up in AI-generated LinkedIn outreach. The gap between AI SDR templated DMs and operator-voice profiles sits mostly on this linguistic layer, before the prospect even reads the message.

Uniform sentence length. A real comment swings: 4 words, then 22, then 9. AI-generated text tends to settle at 15-25 words per sentence consistently. The variance is a stronger signal than the average.
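To make the variance point concrete, here is a minimal sketch that measures the spread of sentence lengths in a comment. The naive split on `.`, `!` and `?` and the helper names `sentence_lengths` and `length_spread` are assumptions for illustration; a real classifier would use a proper sentence tokenizer and is unlikely to rely on this one number.

```python
import re
import statistics

def sentence_lengths(text: str) -> list[int]:
    """Split on ., ! or ? and return the word count of each non-empty sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]

def length_spread(text: str) -> float:
    """Population standard deviation of sentence length, in words."""
    lengths = sentence_lengths(text)
    return statistics.pstdev(lengths) if len(lengths) > 1 else 0.0

# Two made-up comments: one that swings, one that stays flat.
human_like = (
    "Nope. I tried it for about six months and the withdrawal delays "
    "got worse every single time I asked support. Never again."
)
uniform = (
    "The platform offers a wide range of useful features. "
    "The fees are competitive compared to most other exchanges. "
    "The support team responds quickly to most user questions."
)

print(sentence_lengths(human_like), length_spread(human_like))
print(sentence_lengths(uniform), length_spread(uniform))
```

The second comment has three sentences of identical length, so its spread is zero even though its average length looks unremarkable. That is the sense in which the variance carries more signal than the average.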

Overuse of em-dashes and the rule of three. AI loves listing three things separated by em-dashes. Real human writing on Reddit uses em-dashes sparingly, and the "three things" pattern shows up much less often.

How a residents operation works around it:

Strip the AI-tic vocabulary at the editorial layer (we maintain a 60-word blocklist)
Replace essay-grader transitions with conversational ones (or none)
Force sentence-length variance through human editing
Limit em-dashes to one per long post, zero per casual comment
Run every comment through a human reviewer before publish

Bucket 2 · Formatting tells

This bucket is where most off-the-shelf "AI commenting" tools get caught immediately.

Headers inside a Reddit comment. Nobody writes a Reddit comment with H2 markdown headers. AI tools produce them by default because the underlying model was trained on structured documents.

Bulleted lists in casual replies. A two-sentence answer to a casual question doesn't need three bullet points. AI tools default to bullets because bullets feel "structured." Real Reddit comments are flowing text 90% of the time.

Perfectly structured advice posts where nobody asked for advice. A LinkedIn-style "Here are 5 things to consider..." reply to "anyone know if X exchange is good?" is the canonical bot tell.

Paragraph breaks every 2 sentences. Real Reddit comments alternate between dense paragraphs and short ones. Bot-generated content tends toward uniform spacing.

How a residents operation works around it:

Reddit-native formatting only: flowing text, occasional one-line emphasis, almost no bullets in comments
Long-form is fine in long-form (5 paragraphs minimum), but casual replies stay casual
Don't add structure to questions that didn't ask for it
Match the formatting cadence of the sub: some subs lean technical (more lists OK), others conversational (almost no lists ever)

Bucket 3 · Account metadata

This is where shallow operations get caught even if the writing is good.

Young account, high posting frequency. An account created last quarter that posts twice a day across three subs is the textbook signal. Real accounts have years of low-volume background activity before any "expert" period.

Low karma in the sub where the account suddenly becomes active. If the account has 5,000 lifetime karma but only 30 of it in r/CryptoCurrency, and now it's posting daily expert advice there, mods notice.

Cross-sub footprint that doesn't make sense. Same account giving "expert" answers in r/SEO, r/marketing, r/startups and r/CryptoCurrency within 24 hours. Real expertise tends to cluster in one or two adjacent communities.

Posting times that look automated. Comments dropping every 47 minutes on the dot, or only during specific 4-hour windows that don't match any plausible timezone.
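One simple way to express "every 47 minutes on the dot" as a number is the coefficient of variation of the gaps between consecutive posts. The sketch below assumes timestamps are given in minutes and uses a hypothetical helper name, `gap_regularity`; it illustrates the idea and is not a description of how any specific mod tool scores timing.

```python
import statistics

def gap_regularity(timestamps: list[float]) -> float:
    """Coefficient of variation (stdev / mean) of gaps between consecutive posts.

    Values near 0 mean the gaps are almost identical, i.e. clockwork posting.
    """
    ordered = sorted(timestamps)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    if len(gaps) < 2:
        raise ValueError("need at least three timestamps")
    mean_gap = statistics.mean(gaps)
    if mean_gap == 0:
        raise ValueError("all timestamps are identical")
    return statistics.pstdev(gaps) / mean_gap

# Made-up posting times in minutes since the first comment.
clockwork = [0, 47, 94, 141, 188]   # a comment every 47 minutes exactly
irregular = [0, 12, 95, 101, 340]   # bursts and long silences

print(gap_regularity(clockwork))
print(gap_regularity(irregular))
```

The second tell in the paragraph, activity confined to a window that matches no plausible timezone, would need a separate check on hour-of-day distribution, which this sketch does not attempt.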

How a residents operation works around it:

Use accounts that are 2-5 years old before they say a client's name
Each resident has a primary niche (one to two subs of expertise) and stays mostly there
Adjacent activity in tangential subs is OK; expert-grade activity across unrelated topics is not
Posting cadence is randomised across a realistic human distribution, and it is not uniform across the cohort either
New accounts onboard through 60-90 days of off-brand activity before becoming relevant

Bucket 4 · Behavioural tells

This is the deepest bucket and what separates credible operations from the rest.

High inter-post similarity within an account. Same account answering different questions with semantically near-identical sentences. Modern classifiers flag this within a few months of activity.

High inter-account similarity within a campaign. Two accounts on different subs both using the phrase "the only one I trust for fast USDT pulls" within a week is the smoking gun.
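Both similarity signals above come down to comparing pairs of comments. A minimal sketch, assuming a purely lexical measure: Jaccard overlap on three-word shingles. This is a stand-in chosen for simplicity. The text above talks about semantic similarity, which in practice would more likely be measured with embeddings, and the function names and sample comments here are invented for the example.

```python
import re

def shingles(text: str, n: int = 3) -> set[tuple[str, ...]]:
    """Return the set of n-word windows in the text, lowercased."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def jaccard(a: str, b: str, n: int = 3) -> float:
    """Shared shingles divided by total distinct shingles (0.0 to 1.0)."""
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

# Two hypothetical accounts reusing the phrase quoted above, plus a control.
account_a = "Honestly it's the only one I trust for fast USDT pulls"
account_b = "For me it's the only one I trust for fast USDT pulls these days"
unrelated = "I moved everything to a hardware wallet last year"

print(jaccard(account_a, account_b))
print(jaccard(account_a, unrelated))
```

The two accounts share 8 of 13 distinct shingles, while the control shares none. Run pairwise across every comment in a suspected campaign, the same measure surfaces shared templates even when each account only uses the phrase once.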

Top-of-thread replies vs threading into conversation. Bots default to top-level replies. Humans get into back-and-forth: replying, getting replied to, replying again.

Vote patterns that correlate too tightly. If an account posts and then five other accounts upvote within 90 seconds, the engagement looks coordinated to mod tools.
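The 90-second example can be written as a simple window count. This is a sketch under loud assumptions: Reddit does not expose per-vote timestamps publicly, so the input format, the helper names and the default threshold of five (taken from the example above) are all hypothetical.

```python
def early_votes(post_time: float, vote_times: list[float], window: float = 90.0) -> int:
    """Count votes that arrive within `window` seconds after the post."""
    return sum(1 for t in vote_times if 0 <= t - post_time <= window)

def looks_coordinated(
    post_time: float,
    vote_times: list[float],
    window: float = 90.0,
    threshold: int = 5,
) -> bool:
    """Flag a post when at least `threshold` votes land inside the window."""
    return early_votes(post_time, vote_times, window) >= threshold

# Made-up timestamps in seconds.
posted_at = 1000.0
burst = [1010.0, 1025.0, 1040.0, 1061.0, 1088.0, 1900.0]  # five votes in 90 s
organic = [1300.0, 2400.0, 5000.0]                        # votes trickle in

print(early_votes(posted_at, burst), looks_coordinated(posted_at, burst))
print(early_votes(posted_at, organic), looks_coordinated(posted_at, organic))
```

A fixed threshold would misfire on fast-moving threads in large subs, so a realistic version would compare the burst against the sub's normal vote velocity and check whether the same accounts keep appearing together.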

Comments that don't engage with the prior comment. A bot tends to answer the post, not the comment it's replying to. Real humans reply to the specific thing the previous person said.

How a residents operation works around it:

Per-account voice profile maintained over months (different residents write differently on purpose)
Brand mentions across the pool use varied phrasing, no shared templates
Threading in conversations is part of the operating protocol, not optional
No coordinated voting, ever; accounts engage independently with content they'd find anyway
Replies engage with the prior comment first, then add new value

A real-world example: u/Personal-Method3958

In May 2026 a Reddit user posted a now-circulating thread in r/SEO breaking down the profile of a single suspected GEO bot. The case is worth walking through because the account hits every bucket above at once.

Visible signals from the account page alone:

Aged but empty. Account is 1 year old. Total karma: 2. Total contributions: 0. Active in 12 subs. A real user with one year of activity in 12 subs accumulates karma in the hundreds at minimum. Zero contributions with 12 active subs is the signature of a write-only account.
Off-topic mistakes that reveal the bot's training. The account commented in r/SEO with: "GPT has its own sources... GPT can't reach your site because GPT doesn't use Google to search; it has its own search engine." Wrong on the facts (ChatGPT search does draw on the Bing-indexed web). The error is the kind a generic LLM makes when it has no domain expertise, exactly what you'd expect from an account whose comment-generation step has no fact-checking layer.
Mod-removal pattern across moderated subs. Posts in subs with serious moderation (r/AskTechnology, r/WritingWithAI) are removed or awaiting approval. Posts in unmoderated or under-moderated GEO-platforming subs stay up. The footprint is the exact inverse of a legitimate community member's: real users get approved everywhere, this account gets approved only where nobody is filtering.
Cross-sub expert-mode pattern. The same account is publishing "best AI for writing" comparison posts in r/aipromptprogramming, r/AskTechnology, r/SEO, r/WritingWithAI within 24 hours. Expert-grade activity across topically unrelated subs in one day is the textbook cross-sub footprint flag from bucket 3.

This account hits every bucket of the public detection checklist. Linguistic (essay-grader transitions, AI-tic phrasing). Formatting (long structured posts in reply slots). Metadata (1y account, 2 karma, 12 active subs). Behavioural (mod-removal pattern, expert-mode cross-posting).

The lesson isn't "this specific account got caught." The lesson is that the inverse of every flag above is the minimum bar for a residents operation. If your residents would show up to a human moderator looking like u/Personal-Method3958 after one year, the operation isn't ready to run.

What the operating floor looks like in practice

A few first-party numbers from the residents pool we operate across crypto, fintech and iGaming. The figures move quarter to quarter; the shape is stable.

Median account age at first brand-relevant comment: 32 months. Floor under which we don't ship a brand mention: 18 months.
Karma floor in the active sub before a brand mention is allowed: 1,500 in-sub karma minimum, 2,500 typical.
Editorial pass rate: ~73% of drafts ship after first review. The remaining 27% get sent back, usually for one of three reasons: AI-tic vocabulary, off-cadence formatting, or a reply that didn't engage the previous comment.
Mention density per account: ~3% of total comments. The other 97% is the resident's own niche activity, with no brand mention at all.
Comments per resident per week: 25-40, distributed across waking hours of the resident's stated timezone. No burst windows.

These aren't aspirational thresholds. Drafts that fail any of them don't ship. The operation runs against the inverted checklist above as a single editorial gate, not as a content goal.

Why publishing this matters

The honest reason these checklists are public is that the platforms benefit when bad operations get caught. Reddit's mod tooling, Pangram's commercial product, and the open-source detectors all share an interest in making low-effort AI shilling unprofitable.

That's fine for our category. Low-effort AI shilling burns subs, gets accounts banned, and makes communities hostile to anything that smells like marketing. Operations that follow the inversion of these checklists do the opposite: they participate in subs in ways that match how real members participate, and the brand mention is incidental to that participation. The same logic governs X, where trend formation rewards distinct-voice clustering and discounts copy-paste templated amplification. The buyer-side application of the same checklist (run it against the sample handles your X distribution vendor sends you on the first call) is Question 2 of the 10-question X vendor vetting call. Editorial discipline is platform-neutral; the detection signals are not.

If your residents operation can't pass every bucket above, it shouldn't run. The asset you create only holds its value for as long as it doesn't get pattern-matched and removed. Every campaign that gets a thread banned costs more than ten campaigns that don't.

What the next generation of detection will look like

A short read on where the detection arms race is heading.

Multi-signal classifiers. Most current classifiers score the text in isolation. The next generation will correlate text with posting history, posting times, vote patterns and cross-sub footprint as a single combined score. The text-only signals stop being sufficient.

Stylometric fingerprinting per account. Each account will be expected to have a consistent linguistic fingerprint across months of activity. Operations that swap out writers without preserving the voice will get caught.

Community-driven moderation. The most effective detection in 2026 isn't algorithmic; it's mods who've been in the sub for years pattern-matching things that feel off. Operations that don't earn community trust first will lose to operations that do.

The inversion of the public checklist is still the foundation. But the foundation will only get harder to fake.

Frequently asked

Are AI-generated Reddit comments illegal? No. They're against most subs' policies if undisclosed, and they get accounts banned, but they're not illegal. The risk is policy-level, not legal.

Can mods detect AI comments reliably? Increasingly yes. Combined-signal detection (text + metadata + behaviour) catches most low-effort AI shilling. High-effort operations that follow the public-checklist inversion are still harder for anyone to detect.

What is Pangram? Pangram Labs is a commercial AI-content detection company. Their Reddit detector blog post is one of the most-cited public sources on the linguistic and behavioural signals of AI-generated comments.

What's the difference between a bot and a managed account? A bot posts automated content with no human review. A managed account is a real-aged Reddit account that posts content reviewed and approved by a human editor for fit. The behavioural signature is very different.

Can I just write Reddit comments with ChatGPT? Out of the box, no. The AI-tic vocabulary and formatting tells will get the comment flagged or downvoted by humans even before a classifier touches it. With an editorial pass and per-account voice maintenance, the gap closes.

How do you avoid the "delve" problem? We maintain a blocklist of ~60 AI-tic words and phrases that the editorial layer flags before publish. The harder fix is the cadence and uniformity signals, which require per-account voice profiling.

Want to see how your existing Reddit presence (or your competitors') stacks against the public detection checklist? Run an audit. Free, takes 30 minutes, shows you which signals are working in your favour and which are working against you. See our Reddit Resident Network for how we run the inverse of every bucket above. The same detection-inverse logic gates the X-side pools so they survive listing review and quarter-over-quarter scrutiny: Crypto Launch on X for the launch-window flavour and Crypto Community on X for the standing version.
