What's New

Share of Voice ?

Average Position ?

Visibility Trend ?

Prompts Lost ?

Brand Recovery ?

Backlinks ?

Content Health ?

Quick Wins ?

High-impact, low-effort improvements

Brand Associations ?

How well your content aligns with key concepts

What is GEO Score?

Generative Engine Optimization (GEO) measures how likely AI models are to cite your content. Research shows 72.4% of ChatGPT-cited posts include answer capsules (120-150 char summaries). This score analyzes your content structure, data density, and technical markup to predict citability.

GEO Content Scorer ?

Content structure analysis for AI citation optimization

-- GEO Score

Answer Capsules

Direct answer summaries

Section Structure

100-200 words optimal

Original Data

Stats & research density

Technical SEO

Schema, lists, tables

Top GEO Improvements

Highest-impact fixes to increase AI citation probability

Page-by-Page Analysis

Click any page to see GEO breakdown and get generated answer capsules

What is Topic Alignment?

Topic Alignment measures how well your content matches what people actually ask AI about. When someone asks an AI assistant about a topic in your industry, does your content answer that question? Higher alignment means AI is more likely to recommend you. Calculated using cosine similarity between your page embeddings and topic embeddings.

Topic Alignment Score ?

Content-to-topic relevance analysis across your pages

Topic-by-Topic Breakdown ?

Per-topic alignment strength with page-level detail

Alignment Recommendations ?

High-impact actions to improve content-topic relevance

What is Citation Source Mapping?

When AI answers questions, it pulls information from specific sources. This analysis shows which third-party sites AI is citing in your industry — and crucially, where your competitors appear but you don't. These gaps represent PR and content opportunities.

Citation Decision Journey

How AI citations guide users through their buying journey

Your Brand

Competitors

Other Sources

0/4

Stage Coverage

Strongest Stage

Gap Opportunity

Total Citations

Citation Source Mapping ?

Aggregated citation analysis across all AI prompts tested

Total Citations Found ?

Your Domain Cited

Citation Gaps

Competitor-only sources

Authority Sources

News, .edu, .gov

Top Citation Sources

Most frequently cited sources in your category

Citation Gaps

High-value sources that cite competitors but not you - your biggest opportunities

Competitor Citation Frequency

How often competitors appear in AI responses

Vector Embedding Map

Visual representation of your brand's semantic position

Legend

Page Scores

Strong (≥50%)

Medium (≥35%)

Weak (<35%)

Click to Explore

Topic circles → breakdown

Page dots → details

Competitors

Competitor nodes (ring)

Dashed line = relation

ℹ First 10 pages get deep analysis

Content Themes & Gaps ?

Selected Page ?

Select a page

CLUSTER AFFINITIES ?

Competitor Brands ?

Query Fanout Analysis ?

How user prompts expand into multiple search intents

Top Cited Sources ?

Critical Issues ?

Problems affecting your AI visibility

Keyword Cannibalization ?

Pages competing for the same keywords

Backlink Health ?

External links pointing to your site

Content Lifecycle ?

Page-by-page health assessment

OMC Cost & GEO Command

Read-only rollup of metered spend (cost_ledger) and GEO measurement (geo_prompt_results / geo_citations). All figures are live from the database — no sample data.

Rolling spend

By model

By brand

Latest metered calls

GEO measurement

Visibility / sentiment / share-of-answer by model

Citation frequency (per domain)

Prompt results & the citations cited for each

Instruments

Config-driven measurements (e.g. the Disintermediation Index). Each instrument is defined as a config row — a new metric needs no deploy — and its headline score is computed over a measurement scan. Index = share of AI answers that route away from the brand (competitor or self-served).
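As a rough illustration of the index definition above, the sketch below computes the share of answers that route away from the brand. The function name and the `routesTo` field with its three values are hypothetical; the actual instrument is defined by its config row, which is not shown here.

```javascript
// Hypothetical shape: each answer records where it routes the user.
// routesTo is assumed to be 'brand', 'competitor', or 'self_served'.
function disintermediationIndex(answers) {
  if (answers.length === 0) return 0;
  const routedAway = answers.filter(
    (a) => a.routesTo === 'competitor' || a.routesTo === 'self_served'
  ).length;
  // Share of answers (0 to 1) that do not route to the brand.
  return routedAway / answers.length;
}
```

For a scan of four answers where one routes to a competitor and one is self-served, the index is 0.5.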

Refresh Schedule

Citation Half-Life → Refresh-by-date + tax-seasonality urgency over citation_decay_metrics. Refresh-by = first citation date + half-life (fitted, fallback empirical) — when a citation's freshness has decayed to ~50%. Urgency (0–100) ranks overdue refreshes, lifted when the refresh date lands in or just before the US tax filing window (Jan 1 – Apr 15). Real half-life accrues over weeks; rows marked SAMPLE are labeled placeholders and are not real measurements.
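A minimal sketch of the refresh-by calculation described above. The urgency formula itself is not given here, so only the date arithmetic and the tax-window check are shown. The function names and the 30-day lead used for "just before" are assumptions.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Refresh-by = first citation date + half-life (in days).
// firstCitationIso is a 'YYYY-MM-DD' string, treated as UTC.
function refreshByDate(firstCitationIso, halfLifeDays) {
  const first = new Date(firstCitationIso + 'T00:00:00Z');
  return new Date(first.getTime() + halfLifeDays * MS_PER_DAY);
}

// True when the date lands inside the US tax filing window (Jan 1 to Apr 15)
// or within leadDays before the next Jan 1 (the lead is an assumed value).
function inOrNearTaxWindow(date, leadDays = 30) {
  const year = date.getUTCFullYear();
  const windowStart = Date.UTC(year, 0, 1);
  const windowEnd = Date.UTC(year, 3, 15, 23, 59, 59);
  const nextWindowStart = Date.UTC(year + 1, 0, 1);
  const t = date.getTime();
  if (t >= windowStart && t <= windowEnd) return true;
  return t >= nextWindowStart - leadDays * MS_PER_DAY && t < nextWindowStart;
}
```

A citation first seen on 2025-01-01 with a 45-day half-life gets a refresh-by date of 2025-02-15, which falls inside the filing window and would therefore receive the urgency lift.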

Simulate SYNTHETIC

Pre-publication AI Focus Group: pre-test a proposed claim/headline against this brand's SYNTHETIC personas (an AI focus group, not real customers) — see modeled per-persona reactions + an overall sentiment/themes read before you publish. Then run the Beam preflight claim-check to gate the same content against the brand's ground-truth facts (AI-accuracy / hallucination risk). Uses the Brand ID entered above.

Story Mode

The HERO SCENARIO for one brand: a guided diagnose → leak → freshness → simulate → cost narrative assembled from the metrics we already compute. Every figure is live from the database or carries a SAMPLE / SYNTHETIC badge — nothing is fabricated.

Strategy Hub

AI-powered insights combining your data with cutting-edge marketing intelligence

Data-Backed 0 data points

Knowledge Real-time 2026

Confidence

Sources Platform + AI

AI Visibility

Content

Authority

Competitive

Growth

Competitive Intelligence

How you compare in the AI landscape

Cross-Channel Intelligence

Unified view joining AI analysis with organic search & analytics data

Action Queue

AI-generated fixes ready to preview and apply

Run an analysis to see competitor data

Competitor intelligence will appear here after your first analysis.

Keyword Research & Search Trends ?

Research keywords, analyze page content, and track search trends

Keyword Explorer

Suggestions will appear here

Search Trend

Research a keyword to see trends

Avg Interest

Trend

Results

Page Keyword Performance

Extract keywords from your crawled pages and check their organic rankings

Pages Analyzed ?

Keywords Found ?

Ranking Top 10 ?

Ranking Top 20 ?

Not Ranking ?

Keyword Opportunities

Click "Analyze Page Keywords" to discover opportunities

Requires a website analysis first • Extracts keywords from your crawled pages and checks their Google rankings

Generate AI Prompts from Keywords

Test how AI responds to prompts based on your keywords • See if your brand appears in AI answers

Select Keywords to Test:

Run a website analysis first to see extracted keywords here

Consumer Intent Types:

Awareness: "What is..." · Consideration: "Best..." · Decision: "Buy..."

Configure Monitor

Industry Vertical

Product Category

Products to Track

Comma-separated list

Key Competitors

Query Templates

"Best [category] for [use case]"

"[brand] vs [competitor] [product]"

"[category] under $[price]"

"Should I buy [A] or [B]?"

Price Range (optional)

Site Traffic

Search performance, site analytics, and AI bot crawl monitoring

Bot Crawl Tracking

Monitor human visitors and AI systems crawling your website

Set up AI bot tracking

Track how often AI systems like GPTBot, ClaudeBot, and PerplexityBot crawl your website — it reveals how visible your content is to AI training and search systems.

Bot visits

AI crawler requests

AI providers ?

crawling your site

Pages crawled ?

unique URLs indexed

Bot traffic share

of all requests are bots

AI crawlers

by provider

Where they crawl from

geography & edge

Most-crawled pages

Recent crawl activity

Humans arriving from AI

ChatGPT, Perplexity…

Crawl trend

Human traffic

Real visitors recorded by your tracking pixel

Audience overview

Top human pages

Recent human visits

Top referrers

Pages & Analysis Mode

0 pages tracked for analysis

Analysis Mode

Tracked Pages

Same pages every time for consistent comparison

Auto Discovery

Automatically discover new pages (up to 20)

Hybrid

Tracked pages + auto-discover more

Current: When you click "Update", these exact 0 pages will be re-analyzed for consistent comparison over time.

Managing Your Tracked Pages

Add or remove pages below. Click "Save Changes" to persist updates, then "Update" in the header to re-analyze.

Synthetic Focus Group ?

Data-grounded consumer personas for concept testing and message validation

Recommended Tests Based on your data

Brand Personas

Create New Focus Group Test

Test Type

Session Title

Stimulus to Test (Product concept, message, ad copy, etc.)

Select Personas for Discussion

Recent Sessions

No sessions yet. Create your first focus group test above!

AI Pages AI-Optimized

Clean, semantic HTML pages optimized for AI crawlers with ~80%+ token reduction

What are AI Pages?

When AI tools like ChatGPT, Claude, or Perplexity visit your website, they read the entire page — ads, navigation menus, footers, JavaScript, CSS, cookie banners, and all. This wastes their limited reading capacity (called "tokens") on content that isn't useful, which means your key messages can get buried or missed entirely.

AI pages are simplified, AI-friendly versions of your real pages. They contain only the important content — your headings, key facts, FAQs, and structured data — in clean HTML that AI crawlers can read quickly and cite accurately. Regular visitors never see them; only AI bots get the optimized version.

~80%+

Token Reduction

Less noise, more signal

Zero Code

No JS, no CSS

Pure semantic HTML

Auto Schema

JSON-LD generated

AI understands your content

Pages Generated

Avg Token Reduction (est.)

Higher = better for AI

Avg AI Page Tokens (est.)

Target: under 2,000

Schema Types

Structured data for AI

Page Comparison

Before vs. after — see how much each page was slimmed down for AI. Click "Preview" to see exactly what an AI crawler will read.

URL	Original (est.)	AI Page (est.)	Reduction	GEO	Action

llms.txt A machine-readable index of your site — like a sitemap, but designed for AI models

Download Deployment Package

Everything you need to make your site AI-ready, in one ZIP file

shadow-pages/

Your optimized HTML files

worker/index.js

Cloudflare Worker script

llms.txt

AI-readable site index

robots.txt

AI crawler permissions

SETUP-GUIDE.md

Step-by-step instructions

Cloudflare Deployment

Deploy AI pages directly to Cloudflare Workers — no manual setup needed

Setup Guide

Follow these steps to deploy your AI pages — no coding experience needed

Download the ZIP file

Click the "Download ZIP" button above. A file called ai-pages-yourdomain.zip will save to your Downloads folder.

Double-click the ZIP file to unzip it. You'll see a folder with all the files listed above.

Create a free Cloudflare account

If you don't have one already, go to cloudflare.com and sign up for a free account. Cloudflare Workers has a generous free tier (100,000 requests/day) that covers most sites.

If your website already uses Cloudflare, skip this step — just log in to your existing account.

Add your website to Cloudflare (if not already)

In the Cloudflare dashboard, click "Add a site" and enter your domain name. Select the Free plan.

Cloudflare will ask you to update your domain's nameservers at your domain registrar (the company you bought your domain from, like GoDaddy, Namecheap, etc.). Follow their instructions to point your nameservers to Cloudflare. This usually takes 5-30 minutes to take effect.

Already using Cloudflare?

If your site already runs through Cloudflare, skip this step entirely. Just make sure you're logged in.

Create a Cloudflare Worker

A "Worker" is a small program that runs on Cloudflare's servers. It decides whether a visitor is an AI bot and, if so, serves the AI page instead of your normal page.

In the Cloudflare dashboard, click "Workers & Pages" in the left sidebar
Click "Create", then click "Create Worker"
Give it a name like "ai-pages" and click "Deploy"
After it deploys, click "Edit code"
Select all the default code in the editor and delete it
Open the worker/index.js file from your downloaded ZIP
Copy the entire contents and paste it into the Cloudflare editor
Click "Deploy" in the top right

Connect the Worker to your website

This tells Cloudflare to run your Worker when someone visits your website.

Go back to your Worker's settings page
Click the "Settings" tab, then "Domains & Routes"
Click "Add" then "Route"
Enter your domain with a wildcard: yourdomain.com/*
Select your zone (your domain) from the dropdown
Click "Add Route"

Don't worry — this is safe

Regular visitors to your site won't notice any change. The Worker only intercepts requests from AI bots and serves them the optimized version. Everyone else gets your normal website.
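The bot check the Worker performs can be pictured roughly as follows. This is a simplified sketch, not the contents of worker/index.js: the pattern list and the function names are assumptions, and the shipped script may match more crawlers or use different logic.

```javascript
// Assumed list of User-Agent substrings for AI crawlers.
const AI_BOT_PATTERNS = ['GPTBot', 'ClaudeBot', 'PerplexityBot'];

// Case-insensitive substring match against the User-Agent header.
function isAiBot(userAgent) {
  if (!userAgent) return false;
  const ua = userAgent.toLowerCase();
  return AI_BOT_PATTERNS.some((p) => ua.includes(p.toLowerCase()));
}

// Decide which variant to serve, given a plain object of lowercase header names.
function chooseVariant(headers) {
  return isAiBot(headers['user-agent']) ? 'ai-page' : 'origin';
}
```

A request with a GPTBot User-Agent resolves to the AI page, while a normal browser User-Agent falls through to the origin site unchanged.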

Upload llms.txt to your website (optional but recommended)

The llms.txt file is like a sitemap designed specifically for AI. Some AI systems look for this file when crawling your site.

How to upload: Use your website's file manager (cPanel, FTP, Netlify dashboard, etc.) to upload the llms.txt file to your website's root folder — the same folder where your homepage lives. It should be accessible at yourdomain.com/llms.txt.

Note: The Worker handles this too

The Cloudflare Worker already serves llms.txt automatically for all visitors. This step just adds a backup in case someone bypasses the Worker.

Update your robots.txt (optional)

robots.txt is a file that tells search engines and AI bots what they're allowed to read on your site. If your current robots.txt blocks AI crawlers, they won't see your AI pages.

Open the robots.txt file from the ZIP and add its contents to your existing robots.txt file (usually at yourdomain.com/robots.txt). This ensures AI crawlers like GPTBot, ClaudeBot, and PerplexityBot are explicitly allowed.

Verify it's working

To confirm your AI pages are being served to AI crawlers, you can test with a simple command. Open a terminal (Mac: Terminal app; Windows: Command Prompt) and run:

curl -H "User-Agent: GPTBot" https://yourdomain.com/

If it returns your clean AI page HTML (not your normal website), it's working. You can also check the response headers for X-Shadow-Served: true.

Not comfortable with the terminal? Use Project Harbor's Bot Tracking tab to monitor when AI crawlers visit your site. After deploying, you should see visits being served with AI-optimized content.

Common Questions

Will this affect how my website looks to normal visitors?

No. The Cloudflare Worker only changes what AI crawlers see. Regular visitors (people using browsers) always see your normal website, completely unchanged.

Is this "cloaking"? Will Google penalize my site?

AI pages serve the same content as your real pages, just in a cleaner format. This is content optimization, not deceptive cloaking. The AI pages don't add false claims or unrelated keywords — they simply present your existing content in a way AI can read more efficiently. Google's regular search bot (Googlebot) sees your normal pages through standard search indexing.

What does "token reduction" mean?

"Tokens" are how AI models measure text. Every word, piece of HTML code, and even spaces count as tokens. A typical web page might be 50,000+ tokens, but an AI model may only use the first few thousand when forming answers. By reducing your page to under 2,000 tokens of pure content, AI models can read and understand all of your key information instead of a fraction of it.

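To make the token reduction figure concrete, here is a rough estimate using a heuristic of about 4 characters per token. This is an approximation rather than a real tokenizer, and the function names are hypothetical.

```javascript
// Rough heuristic: ~4 characters per token. Real tokenizers vary by model.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Percentage reduction in estimated tokens from the original page to the AI page.
function tokenReductionPercent(originalHtml, aiPageHtml) {
  const before = estimateTokens(originalHtml);
  const after = estimateTokens(aiPageHtml);
  if (before === 0) return 0;
  return Math.round((1 - after / before) * 100);
}
```

Under this heuristic, a 200,000-character page (about 50,000 tokens) slimmed to 8,000 characters (about 2,000 tokens) is a 96% reduction.
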
What is JSON-LD / Schema markup?

JSON-LD is a way to add structured data to a webpage so that AI and search engines can understand what the page is about without having to "read" the whole thing. For example, it can explicitly tell an AI "this page is about a Product called X, made by Company Y, priced at Z." AI pages automatically generate this structured data based on your content.
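As an illustration of the "Product called X, made by Company Y, priced at Z" example, the sketch below builds a schema.org Product object and wraps it in a JSON-LD script tag. The helper names are hypothetical, and the generated AI pages may emit different schema types or fields.

```javascript
// Build a minimal schema.org Product description.
function buildProductJsonLd({ name, brand, price, currency }) {
  return {
    '@context': 'https://schema.org',
    '@type': 'Product',
    name,
    brand: { '@type': 'Brand', name: brand },
    offers: { '@type': 'Offer', price: String(price), priceCurrency: currency },
  };
}

// Serialize for embedding in HTML. "<" is escaped so that page content
// cannot close the script element early.
function toScriptTag(jsonLd) {
  const json = JSON.stringify(jsonLd).replace(/</g, '\\u003c');
  return `<script type="application/ld+json">${json}</script>`;
}
```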

Do I need to regenerate AI pages when I update my website?

Yes. AI pages are a snapshot of your content. When you make significant changes to your website, run a new analysis in Project Harbor, come back to this tab, and click "Regenerate AI Pages." Then re-deploy the updated Worker code to Cloudflare.

Is Cloudflare the only option?

The Worker script is designed for Cloudflare, but the concept works with any CDN or server that can inspect User-Agent headers. If you use Vercel, Netlify, or Nginx, a developer can adapt the Worker logic into middleware for those platforms. The shadow HTML files themselves work anywhere.

Try a specific query to see how AI responds and whether your brand appears.

Industry Monitor

Track brand visibility across AI platforms by industry

Brand Safety Hub

Monitor AI model accuracy against your verified brand facts. Track how models represent your brand, identify issues, and get actionable recommendations.

Brand Name *

Domain

AI Demand Intelligence

Analyze how consumers search for your brand in AI contexts. Track Google Trends interest, Reddit discussions, rising topics, and AI model accuracy — every metric traced to a named public data source.

Brand Name *

Domain

Your Content

Select a project and switch to the Content Studio tab to view your content.

Start writing your content here...

AI Topic Suggestions

Click "Suggest Topics" to generate gap-driven content ideas.

Content Calendar

Sun

Mon

Tue

Wed

Thu

Fri

Sat

SERP Position Tracking

Run keyword research or SERP profiling to start tracking positions.

Agents

Automate GEO workflows with composable, multi-step agents

0 active 0 runs --%

Build an Agent with AI

Describe what you want to automate in plain language. The AI will design a multi-step agent for you, selecting the right steps, configuring parameters, and setting up triggers — all from your description.

Weekly brand monitoring with competitor comparison

Full GEO audit with content briefs

Reddit intelligence pipeline with alerts

Steps

Add your first step

Pick a step from the palette

Manual Scheduled

Monitoring

Weekly Brand Monitor

5 steps · ~8 min

Crawls ground truth weekly, compares against competitors, calculates scores, generates a report, and dispatches alerts on score changes.

ground_truth_crawl competitor_comparison calculate_scores generate_report dispatch_alerts

Audit

Full GEO Audit

6 steps · ~15 min

End-to-end GEO readiness assessment: crawl, build query universe, score readiness, detect attributions, generate recommendations, and produce a comprehensive report.

ground_truth_crawl query_universe_build readiness_score detect_attributions generate_recommendations generate_report

Content

Content Gap Filler

4 steps · ~10 min

Identifies content gaps by scoring existing pages, uses Aurora to suggest missing topics, and generates content briefs for each gap in batch.

aurora_geo_score aurora_suggest_topics generate_content_briefs_batch send_notification

Intelligence

Competitor Intel

4 steps · ~12 min

Runs a full comparison batch against your competitor set, detects attributions for each, generates a comparative report, and alerts on significant ranking changes.

comparison_batch detect_attributions generate_report dispatch_alerts

Social

Reddit Listener

3 steps · ~5 min

Monitors Reddit discussions in your category subreddits, detects any brand attributions or competitor mentions, and sends notifications for actionable threads.

reddit_pipeline detect_attributions send_notification

Optimization

Remediation Pipeline

5 steps · ~20 min

Runs preflight checks, scores readiness, generates targeted recommendations, applies automated remediation, and sends a completion notification.

preflight_check readiness_score generate_recommendations apply_remediation send_notification

Monitoring

DADU Monitor

7 steps · ~15-25 min

Detection-Analysis-Delivery-Update pattern: detect changes in AI visibility, analyze root causes, deliver alerts, and update baselines for continuous monitoring.

ground_truth_crawl comparison_batch calculate_scores detect_attributions generate_recommendations generate_report dispatch_alerts

Monitoring

Citation Accuracy Monitor

6 steps · ~20-30 min

Track AI model citations of your brand, verify accuracy against ground truth facts, and alert when inaccuracies appear.

fetch_active_facts comparison_batch detect_attributions calculate_scores generate_report dispatch_alerts

Optimization

Content Remediation Pipeline

8 steps · ~30-45 min

End-to-end content fix pipeline: identify weak pages, generate optimized content briefs, apply fixes via GitHub PR or CMS, and verify improvements.

ground_truth_crawl readiness_score aurora_geo_score generate_recommendations generate_content_briefs_batch preflight_check apply_remediation send_notification

Knowledge Base Documents

Select a brand to view knowledge base documents.

Execution Dashboard

Total Runs

Success Rate

Estimated Cost

Active Agents

Runs Timeline

Completed Failed Cancelled

Agent Performance

Agent	Runs	Success Rate	Avg Duration	Avg Cost	Last Run

Recent Failures

My Agents 0

No agents yet

Create your first agent using Chat, the visual Builder, or pick a Template above.

Intent Architecture ?

Systematic prompt battery generation across intent dimensions

Generate Prompt Battery

Brand Name *

Category *

Competitors (comma-separated)

Consumer Profile

Research Objective

Intent Dimensions

Saved Batteries

No saved batteries yet. Generate one above and it will be saved automatically.

What is Generative Engine Optimization (GEO)?

When users ask AI assistants like ChatGPT, Perplexity, or Google Gemini questions, these AI models don't search the web like traditional search engines. Instead, they understand the meaning of content through vector embeddings — mathematical representations of concepts in 3,072-dimensional space.

GEO is the practice of optimizing your content so AI models understand, recommend, and cite your brand when users ask relevant questions. Project Harbor is a full-stack platform that measures and improves your AI visibility across every major LLM.

The Science Behind AI Understanding

Large Language Models (LLMs) like GPT-4 process text through a transformer architecture that converts words into high-dimensional vectors. These vectors capture semantic meaning — words with similar meanings have similar vectors. When you ask ChatGPT "What's the best electric shaver?", it converts your question into a vector and finds content whose vectors are semantically close.

Key difference: Google ranks pages. AI understands concepts. A page can rank #1 on Google but be invisible to AI if its semantic embeddings don't align with user intent. This platform tests both — and bridges the gap.

Source: Cosine similarity computation in lib/shared/vector-math.js. Embedding config in lib/shared/config.js.

II. Data Collection

1 Website Crawling

We scan your website's pages, extracting titles, headings, meta descriptions, and body content. This is the raw material AI models use to understand what your site is about.

Crawling Implementation

We use fetch() with custom headers to request your pages, then parse HTML using Cheerio (a Node.js HTML parser). We extract:

<title> tag — primary page identifier
<h1> through <h3> — content structure
<meta name="description"> — summary content
Body text — full content stripped of HTML/scripts (limited to 50,000 chars raw, 10,000 processed)
Schema.org JSON-LD — structured data

// Crawl configuration (lib/shared/config.js)
MAX_PAGES: 100         // max pages per analysis
MAX_DEPTH: 3           // link-following depth
TIMEOUT: 30000ms       // per-page timeout
DELAY: 1000ms          // politeness delay between requests
BODY_TEXT_LIMIT: 10000 // processed body char limit
RAW_BODY_LIMIT: 50000  // raw body char limit

Edge cases: Pages that return non-200 status codes are skipped. JavaScript-rendered content is handled via ScraperAPI fallback. Duplicate URLs are deduplicated by canonical URL.

2 Vector Embedding Generation

We convert your page content into vector embeddings using OpenAI's text-embedding-3-large model. Each page becomes a point in 3,072-dimensional "meaning space" — similar concepts cluster together, different concepts are far apart.

Embedding Pipeline

Model: text-embedding-3-large — 3,072 dimensions, max 8,191 tokens per request. Config: lib/shared/config.js.

Chunking: Long pages are split at sentence boundaries via chunkTextForEmbedding() in lib/lighthouse/utils.js. Each chunk ≤ 8,000 tokens (~32,000 chars at ~4 chars/token). Chunks are embedded separately, then averaged.
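The chunk-then-average step can be sketched as below. This is not the chunkTextForEmbedding() implementation from lib/lighthouse/utils.js: the function names, the sentence-splitting regex, and the character-based limit are assumptions made for illustration.

```javascript
const MAX_CHARS = 32000; // ~8,000 tokens at ~4 chars/token

// Split text into chunks that end on sentence boundaries.
// A single sentence longer than maxChars becomes its own oversized chunk.
function chunkAtSentences(text, maxChars = MAX_CHARS) {
  const sentences = text.match(/[^.!?]+[.!?]*\s*/g) || [];
  const chunks = [];
  let current = '';
  for (const sentence of sentences) {
    if (current && current.length + sentence.length > maxChars) {
      chunks.push(current.trim());
      current = '';
    }
    current += sentence;
  }
  if (current.trim()) chunks.push(current.trim());
  return chunks;
}

// Element-wise mean of equally sized embedding vectors.
function averageVectors(vectors) {
  if (vectors.length === 0) return [];
  const dim = vectors[0].length;
  const out = new Array(dim).fill(0);
  for (const v of vectors) {
    for (let i = 0; i < dim; i++) out[i] += v[i] / vectors.length;
  }
  return out;
}
```

Each chunk would be sent to the embedding model separately, and averageVectors() would then combine the per-chunk results into one vector for the page.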

Composite page embedding — rather than embedding raw HTML, we generate separate embeddings per content component and combine via weighted average:

// Weighted composite embedding (lib/shared/config.js)
composite = title    × 0.20  // 20% — primary page identifier
          + h1       × 0.15  // 15% — main heading signal
          + meta     × 0.15  // 15% — search description
          + headings × 0.15  // 15% — content structure (H2-H4)
          + bodyText × 0.35  // 35% — full content depth

// Then L2-normalize to unit length
normalized[i] = composite[i] / √(Σ composite²)
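A runnable sketch of the weighted composite and L2 normalization above, using the listed weights. The function name, the shape of the `parts` object, and the choice to skip missing components are assumptions.

```javascript
const WEIGHTS = { title: 0.20, h1: 0.15, meta: 0.15, headings: 0.15, bodyText: 0.35 };

// parts maps component name -> embedding vector (all of the same dimension).
function compositeEmbedding(parts, weights = WEIGHTS) {
  const first = Object.values(parts).find(Boolean);
  if (!first) return [];
  const dim = first.length;
  const composite = new Array(dim).fill(0);
  for (const [key, weight] of Object.entries(weights)) {
    const vec = parts[key];
    if (!vec) continue; // assumption: a missing component contributes nothing
    for (let i = 0; i < dim; i++) composite[i] += weight * vec[i];
  }
  // L2-normalize to unit length; leave an all-zero vector untouched.
  const norm = Math.sqrt(composite.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? composite : composite.map((x) => x / norm);
}
```

Because the result is unit length, cosine similarity against another unit vector reduces to a dot product.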

Retry logic: All embedding API calls use exponential backoff (lib/shared/retry.js). Retryable status codes: 429, 500, 502, 503, 529. Formula: delay = min(baseDelay × 2^attempt, 30000ms) + 10% jitter. Max 3 retries. Batch size: 100 embeddings per request.
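The backoff formula above can be sketched as follows. This is not the code in lib/shared/retry.js: the base delay of 1,000 ms, the additive reading of "10% jitter", and the `err.status` property are assumptions.

```javascript
const RETRYABLE = new Set([429, 500, 502, 503, 529]);
const MAX_RETRIES = 3;
const MAX_DELAY_MS = 30000;

// delay = min(baseDelay * 2^attempt, 30000) plus up to 10% random jitter.
function backoffDelay(attempt, baseDelayMs = 1000, random = Math.random) {
  const capped = Math.min(baseDelayMs * 2 ** attempt, MAX_DELAY_MS);
  return capped + capped * 0.1 * random();
}

// Retry fn on retryable status codes, up to MAX_RETRIES times.
// Assumes errors thrown by fn carry an HTTP status in err.status.
async function withRetry(fn, baseDelayMs = 1000) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= MAX_RETRIES || !RETRYABLE.has(err.status)) throw err;
      const delay = backoffDelay(attempt, baseDelayMs);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

With a 1,000 ms base, the waits before jitter are 1 s, 2 s, and 4 s, and the 30 s cap only matters for larger base delays or retry counts.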

3 Topic Alignment Analysis

We generate topic clusters relevant to your industry and measure how well each page aligns using cosine similarity, which measures how closely two vectors point in the same direction (the cosine of the angle between them).

Cosine Similarity — `lib/shared/vector-math.js`

// Exact implementation
function cosineSimilarity(a, b) {
  if (!a || !b || a.length !== b.length) return 0;
  let dotProduct = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dotProduct += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const magnitude = Math.sqrt(normA) * Math.sqrt(normB);
  return magnitude === 0 ? 0 : dotProduct / magnitude;
}
// Range: -1 (opposite) to 0 (unrelated) to 1 (identical)

Topic clusters are generated dynamically using GPT-4o-mini based on your industry. Each cluster is embedded and compared against each page embedding. First 10 pages receive full deep semantic analysis (concept extraction, semantic signals, detailed affinities). Remaining pages receive alignment scores only.

Thresholds: ≥60% = strong alignment (green), 35-60% = moderate (yellow), <35% = weak (red).

4 Real Query Harvesting

Instead of guessing what consumers search for, we harvest real queries from Google Autocomplete, People Also Ask, Related Searches, and Google Trends. These are the actual questions your customers type every day.

Query Pipeline

The pipeline starts with seed terms from your industry, top page keywords, and competitors. For each seed, we query Google for real consumer data:

Autocomplete: What Google suggests as you type
People Also Ask: Real questions from search results
Related Searches: What other users searched
Google Trends: Trending queries with real search volume

Queries that appear in more than one source are boosted as high-signal. Duplicates and near-duplicates are merged. An LLM classifies each query by intent (discovery, comparison, review) and search stage (awareness, consideration, decision), creates branded variants, and fills gaps if a stage is underrepresented. The LLM never invents core queries; those come from real search data. Result: ~100 prompts split into unbranded and branded.
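The merge-and-boost step might look roughly like this. The function names, the input shape, and the normalization rule are assumptions; near-duplicate detection here is limited to case, punctuation, and whitespace differences, and the real pipeline may use something stronger.

```javascript
// Lowercase, strip punctuation, and collapse whitespace so trivial variants match.
function normalizeQuery(query) {
  return query
    .toLowerCase()
    .replace(/[^\p{L}\p{N}\s]/gu, '')
    .replace(/\s+/g, ' ')
    .trim();
}

// harvested: array of { query, source } records from the different sources.
function mergeQueries(harvested) {
  const merged = new Map();
  for (const { query, source } of harvested) {
    const key = normalizeQuery(query);
    if (!key) continue;
    if (!merged.has(key)) merged.set(key, { query, sources: new Set() });
    merged.get(key).sources.add(source);
  }
  return [...merged.values()]
    .map((entry) => ({
      query: entry.query,
      sources: [...entry.sources],
      highSignal: entry.sources.size > 1, // seen in more than one source
    }))
    .sort((a, b) => b.sources.length - a.sources.length);
}
```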

III. AI Visibility Testing

5 AI Visibility Testing

We ask up to 12 AI models (free + premium tiers) the real consumer queries from Step 4 — both branded and unbranded. We track whether AI mentions your brand, which competitors it recommends, and what sources it cites.

LLM Testing Matrix

Tiered model testing — free and premium variants per provider:

Provider	Free Tier	Premium Tier	API
OpenAI	`gpt-5-mini`	`gpt-5.4`	Responses API + web_search
Anthropic	`claude-haiku-4-5`	`claude-sonnet-4-5`	Messages API + web search
Google	`gemini-2.0-flash`	`gemini-3.1-pro`	AI Studio + grounding
Perplexity	`sonar`	`sonar-pro`	Perplexity API (citation-focused)
Google	Google AI Overview		SERPapi extraction
Meta	`llama-4-maverick`		OpenRouter + web search
Mistral AI	`mistral-medium-3.1`		OpenRouter
Cohere	`command-r`		OpenRouter (retrieval-focused)

Total: up to 12 model slots (4 providers × 2 tiers + AI Overview + 3 OpenRouter). Actual count depends on configured API keys.

Brand detection: Case-insensitive string matching of brand name (extracted from domain) in the full response text. Domain variations (www., country prefixes) are stripped.
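A simplified sketch of the brand check described above. The function names are hypothetical, and only the www. prefix is stripped here; handling of country prefixes is left out because the exact rules are not specified.

```javascript
// Derive a brand name from a domain or URL: drop the scheme, path, and "www.",
// then take the first label of the hostname.
function brandFromDomain(domain) {
  const host = domain
    .toLowerCase()
    .replace(/^https?:\/\//, '')
    .split('/')[0]
    .replace(/^www\./, '');
  return host.split('.')[0];
}

// Case-insensitive substring match over the full response text.
function mentionsBrand(responseText, brand) {
  return responseText.toLowerCase().includes(brand.toLowerCase());
}
```

Plain substring matching is simple and fast, but a short brand name that is also a common word can produce false positives.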

Competitor extraction: extractCompetitorsAndCitations() parses response annotations for url_citation URLs. extractCompetitorMentions() uses regex patterns and NLP heuristics to identify company names, filtering out generic terms ("Premium Choice", "Best Value"), product attributes, and the analyzed brand itself.

6 Competitive Intelligence

We identify which competitors AI recommends most often and which citation sources AI trusts. This reveals who's winning the AI visibility battle and why.

Competitor & Citation Extraction

We use regex patterns and NLP heuristics to identify company names in AI responses. Filtering removes generic terms, product attributes, and the analyzed brand. A whitelist of known brands improves detection accuracy. Mentions are aggregated across all prompts.

Citations are extracted from the API's url_citation annotations. Citation gaps = sources that appeared with competitor mentions but not your brand — these are PR and outreach opportunities.

10 Multi-LLM Cross-Platform Testing

We test your brand visibility across up to 12 AI models simultaneously (tiered: free + premium per provider). Different AIs have different training data, biases, and search capabilities — your brand may be visible on ChatGPT but invisible on Gemini.

Parallel Execution Architecture

// All LLMs tested in parallel via Promise.allSettled()
// Tiered providers (free + premium per provider):
searchAI(prompt, { model: 'gpt-5-mini' })       // OpenAI free
searchAI(prompt, { model: 'gpt-5.4' })           // OpenAI premium
searchClaude(prompt, { model: 'claude-haiku-4-5' })  // Anthropic free
searchClaude(prompt, { model: 'claude-sonnet-4-5' }) // Anthropic premium
searchGemini(prompt, { model: 'gemini-2.0-flash' }) // Google free
searchGemini(prompt, { model: 'gemini-3.1-pro' })   // Google premium
searchPerplexity(prompt, { model: 'sonar' })     // Perplexity free
searchPerplexity(prompt, { model: 'sonar-pro' }) // Perplexity premium
searchGoogleAIOverview(prompt)                   // SERPapi
searchOpenRouter('llama-4-maverick', prompt)      // Meta via OpenRouter
searchOpenRouter('mistral-medium-3.1', prompt)    // Mistral via OpenRouter
searchOpenRouter('command-r', prompt)             // Cohere via OpenRouter

Per-prompt visibility: (LLMsWithBrand / totalLLMsTested) × 100. Overall visibility: Average of per-prompt visibility across all prompts (two-level averaging).
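The two-level averaging can be sketched as below. The function names and the input shape (one array of booleans per prompt, one boolean per LLM tested) are assumptions.

```javascript
// mentionsPerLlm: array of booleans, true if that LLM mentioned the brand.
function promptVisibility(mentionsPerLlm) {
  if (mentionsPerLlm.length === 0) return 0;
  const withBrand = mentionsPerLlm.filter(Boolean).length;
  return (withBrand / mentionsPerLlm.length) * 100;
}

// prompts: array of per-prompt boolean arrays.
// Averages the per-prompt percentages instead of pooling all responses.
function overallVisibility(prompts) {
  if (prompts.length === 0) return 0;
  const total = prompts.reduce((sum, p) => sum + promptVisibility(p), 0);
  return total / prompts.length;
}
```

Averaging per prompt gives every prompt equal weight even when some prompts were tested on fewer models. For example, one prompt with 2 of 2 mentions and another with 0 of 4 yields 50%, whereas pooling all six responses would give about 33%.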

Cross-platform consistency: Standard deviation of per-platform visibility scores. σ ≤ 8 = Grade A, 9-15 = Grade B, 16-22 = Grade C, > 22 = Grade D.
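A sketch of the consistency grade, assuming the population standard deviation and treating the grade boundaries as upper limits (for example, a σ of 8.5 is graded B). Both choices, and the function names, are assumptions.

```javascript
// Population standard deviation of per-platform visibility scores.
function stdDev(values) {
  if (values.length === 0) return 0;
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  const variance = values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / values.length;
  return Math.sqrt(variance);
}

// Map the spread across platforms to a letter grade.
function consistencyGrade(platformScores) {
  const sigma = stdDev(platformScores);
  if (sigma <= 8) return 'A';
  if (sigma <= 15) return 'B';
  if (sigma <= 22) return 'C';
  return 'D';
}
```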

IV. Content Analysis & Scoring

7 GEO Content Scoring (LLM Citability)

We analyze your page structure for factors that make AI more likely to cite you. Research shows 72.4% of ChatGPT-cited posts include answer capsules, one of the content patterns we detect.

GEO Score Sub-Components (0-100)

GEO Score = answerCapsules + sectionStructure + originalData + technicalSEO

Answer Capsules (0-25):
  capsules = count of sentences 80-200 chars after H2/H3 headings
  capsuleScore = (capsules ≥ 3) ? 10 : capsules × 3
  firstParaBonus = +8 if first paragraph is 80-160 chars
  questionScore = (questionHeadings ≥ 3) ? 7 : questionHeadings × 2
  score = min(25, capsuleScore + firstParaBonus + questionScore)

Section Structure (0-25):
  Optimal paragraph: 100-200 words (sweet spot for AI citation)
  ratio = optimalParagraphs / totalParagraphs
  avgOptimal = (100 ≤ avgParagraphLength ≤ 200)   // average paragraph length in words
  lengthBonus = avgOptimal ? 10 : max(0, 10 - |avgParagraphLength - 150| / 15)
  score = min(25, ratio × 15 + lengthBonus)

Original Data (0-25):
  Regex count of percentages, years, dollar amounts, statistics
  rawScore = min(10, dataPointCount × 0.8)
  densityBonus = (dataPerThousandWords ≥ 5) ? 8 : dataPerThousandWords × 1.5
  studyBonus = (studies > 0) ? 4 : 0
  comparisonBonus = (comparisons > 0) ? 3 : 0
  score = min(25, rawScore + densityBonus + studyBonus + comparisonBonus)

Technical SEO (0-25):
  Schema.org JSON-LD present:     +10
  FAQPage schema type:            +5
  HowTo schema type:              +5
  List items (≥ 5 items ? 5 : n): up to +5
  Table present (≥ 1):            +5
  score = min(25, sum of above)

Grade: A (80-100) = highly citable | B (60-79) = good | C (40-59) = moderate | D/F (<40) = needs restructuring
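The four sub-scores can be expressed as plain functions over pre-computed counts. This is an illustrative sketch: the function and field names are hypothetical, and the HTML parsing that produces the counts is out of scope here.

```javascript
// Each sub-score is capped at 25, as in the formulas above.
const clamp25 = (x) => Math.min(25, x);

function answerCapsuleScore({ capsules, firstParaChars, questionHeadings }) {
  const capsuleScore = capsules >= 3 ? 10 : capsules * 3;
  const firstParaBonus = firstParaChars >= 80 && firstParaChars <= 160 ? 8 : 0;
  const questionScore = questionHeadings >= 3 ? 7 : questionHeadings * 2;
  return clamp25(capsuleScore + firstParaBonus + questionScore);
}

function sectionStructureScore({ optimalParagraphs, totalParagraphs, avgParagraphWords }) {
  const ratio = totalParagraphs === 0 ? 0 : optimalParagraphs / totalParagraphs;
  const avgOptimal = avgParagraphWords >= 100 && avgParagraphWords <= 200;
  const lengthBonus = avgOptimal
    ? 10
    : Math.max(0, 10 - Math.abs(avgParagraphWords - 150) / 15);
  return clamp25(ratio * 15 + lengthBonus);
}

function originalDataScore({ dataPoints, wordCount, studies, comparisons }) {
  const density = wordCount === 0 ? 0 : (dataPoints / wordCount) * 1000;
  const rawScore = Math.min(10, dataPoints * 0.8);
  const densityBonus = density >= 5 ? 8 : density * 1.5;
  return clamp25(
    rawScore + densityBonus + (studies > 0 ? 4 : 0) + (comparisons > 0 ? 3 : 0)
  );
}

function technicalSeoScore({ hasJsonLd, hasFaqPage, hasHowTo, listItems, tables }) {
  const listScore = listItems >= 5 ? 5 : listItems;
  return clamp25(
    (hasJsonLd ? 10 : 0) + (hasFaqPage ? 5 : 0) + (hasHowTo ? 5 : 0) +
    listScore + (tables >= 1 ? 5 : 0)
  );
}

function geoGrade(score) {
  if (score >= 80) return 'A';
  if (score >= 60) return 'B';
  if (score >= 40) return 'C';
  return 'D/F';
}
```

Note that the Technical SEO items sum to 30 before the cap, so a page can miss one 5-point item and still reach 25.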

8 Citation Source Mapping

We aggregate all citations across all prompts to show which third-party sources AI trusts. We identify citation gaps — sources that cite competitors but not you.

Citation Gap Analysis

// Citation gap = source with competitor mentions but not your brand
const citationGaps = topSources.filter(source =>
  source.competitorsMentioned.length > 0 &&
  !source.includesYourBrand
);
// Categories: News, Reviews, Educational, Forums
// Authority: .edu, .gov, and major news domains carry more weight

Action: Target citation gap sources for guest posts, press releases, or product reviews. Getting featured there makes it more likely that AI will cite you too.

9 Answer Capsule Generator

For pages missing good answer capsules, we use AI to generate 120-150 character summaries optimized for citation. Copy-paste these after your H2 headings.

Why 120-150 Characters?

Research on ChatGPT citation patterns: 72.4% of cited posts have concise answer summaries immediately after headings. Optimal: 120-150 chars — long enough for detail, short enough to cite verbatim. Direct answers (not questions or lead-ins) perform best. Generated via GPT-4o-mini with page context.

V. Commerce Intelligence

11 AI Commerce Intelligence

Track how AI systems recommend your products in shopping-intent queries. Measure AI Share of Shelf, recommendation position, merchant citations, and brand sentiment.

Commerce Formulas

// Share of Shelf (overall)
shareOfShelf = (brandFoundQueries / totalQueries) × 100

// Per-platform Share of Shelf
perLLM = (llm.brandFound / llm.totalQueries) × 100

// Thresholds:
shareOfShelf < 30%  → CRITICAL (missing from most queries)
shareOfShelf 30-59% → ACTION NEEDED (grow to 70%+)
shareOfShelf ≥ 60%  → STRONG visibility

Sentiment Scoring — count-based lexicon matching in a context window around each brand mention (lighthouse-server.js):

// Context window: 150 chars before to 300 chars after each brand mention

// Positive words (21):
"excellent", "best", "highly recommended", "top pick", "outstanding",
"great", "fantastic", "superior", "premium", "exceptional",
"impressive", "reliable", "trusted", "favorite", "leading",
"innovative", "high-quality", "winner", "recommend", "worth", "value"

// Negative words (16):
"poor", "avoid", "disappointing", "issues", "problems", "complaints",
"overpriced", "not recommended", "inferior", "unreliable",
"cheap quality", "skip", "pass", "drawback", "downside", "cons"

// Scoring (count-based, not weighted):
let delta = positiveCount - negativeCount

if (positiveCount > negativeCount + 1)  // clearly positive
  score = min(100, 60 + delta × 8)
else if (negativeCount > positiveCount + 1)  // clearly negative
  score = max(0, 40 - |delta| × 8)           // delta is negative here, so use its magnitude
else                                    // neutral / mixed
  score = 50 + delta × 5

// Labels: positive | negative | neutral

Limitation: Lexicon-based approach — sarcasm, context-dependent language, and negation modifiers (e.g., "not great") may not be captured accurately.
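The count-based scoring can be sketched like this. It is a simplified illustration, not the code in lighthouse-server.js: the helper names are hypothetical, the word lists are truncated, only the first brand mention is scored, matching is by raw substring, and measuring the 300-character tail from the end of the mention is an assumption. The negative branch subtracts the magnitude of delta so that more negative words lower the score.

```javascript
// Truncated, non-overlapping sample lexicons (the full lists are longer).
const POSITIVE = ['excellent', 'outstanding', 'trusted', 'fantastic'];
const NEGATIVE = ['poor', 'avoid', 'disappointing', 'overpriced'];

// Count occurrences of every lexicon entry in the text (substring match).
function countMatches(text, lexicon) {
  return lexicon.reduce((n, term) => n + text.split(term).length - 1, 0);
}

function scoreFromCounts(positiveCount, negativeCount) {
  const delta = positiveCount - negativeCount;
  if (positiveCount > negativeCount + 1) return Math.min(100, 60 + delta * 8);
  // delta is negative in this branch, so subtract its magnitude.
  if (negativeCount > positiveCount + 1) {
    return Math.max(0, 40 - Math.abs(delta) * 8);
  }
  return 50 + delta * 5; // neutral or mixed
}

function brandSentiment(text, brand) {
  const lower = text.toLowerCase();
  const at = lower.indexOf(brand.toLowerCase());
  if (at === -1) return 50; // neutral default when the brand is not mentioned
  // Window: 150 chars before the mention to 300 chars after it.
  const windowText = lower.slice(Math.max(0, at - 150), at + brand.length + 300);
  return scoreFromCounts(
    countMatches(windowText, POSITIVE),
    countMatches(windowText, NEGATIVE)
  );
}
```

Substring matching is the reason the real lists need care: an entry such as "recommend" also matches inside "not recommended".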

Battlecards: Head-to-head comparison queries. Position-based winner detection (who appears first in text). Displayed as a comparison view — not a standalone calculated metric.

Average Rank: Product position parsed via numbered lists, bold patterns, bullet patterns. Capped at 10.

Merchant Distribution: Regex extraction from LLM text matching patterns like "buy at [Merchant]", "available on [Merchant]". Deduplicated, capped at 10 merchants.

VI. Bot & Traffic Tracking

12 Site Traffic & AI Bot Tracking

Monitor which AI crawlers visit your website. See visits from GPTBot, ClaudeBot, PerplexityBot, QwenBot, Google-Extended, and more. Compare human vs. bot traffic.

Bot Detection

// AI bots detected via user-agent patterns
{ pattern: /GPTBot/i,         provider: 'openai',     type: 'training' }
{ pattern: /ChatGPT-User/i,   provider: 'openai',     type: 'browsing' }
{ pattern: /ClaudeBot/i,       provider: 'anthropic',  type: 'training' }
{ pattern: /PerplexityBot/i,   provider: 'perplexity', type: 'search' }
{ pattern: /Google-Extended/i, provider: 'google',     type: 'training' }
// + QwenBot, DeepSeekBot, and others

Human tracking: 1x1 transparent pixel on page load records visits, referrers, and device info. Bot tracking: Server-side user-agent analysis. Key metrics: Human pageviews, bot visits by provider, bot traffic %, top crawled pages, top referrers.
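Server-side user-agent classification might look like the sketch below. The pattern table copies the entries listed above; `classifyUserAgent` and `botTrafficPct` are hypothetical names, and the real table contains more bots.

```javascript
// Pattern table from the Bot Detection list (abbreviated).
const AI_BOTS = [
  { pattern: /GPTBot/i, provider: 'openai', type: 'training' },
  { pattern: /ChatGPT-User/i, provider: 'openai', type: 'browsing' },
  { pattern: /ClaudeBot/i, provider: 'anthropic', type: 'training' },
  { pattern: /PerplexityBot/i, provider: 'perplexity', type: 'search' },
  { pattern: /Google-Extended/i, provider: 'google', type: 'training' },
];

// Returns the first matching bot entry, or isBot: false for everything else.
function classifyUserAgent(userAgent) {
  const match = AI_BOTS.find((bot) => bot.pattern.test(userAgent || ''));
  return match
    ? { isBot: true, provider: match.provider, type: match.type }
    : { isBot: false };
}

// Bot traffic share across a list of request user-agent strings.
function botTrafficPct(userAgents) {
  if (userAgents.length === 0) return 0;
  const bots = userAgents.filter((ua) => classifyUserAgent(ua).isBot).length;
  return (bots / userAgents.length) * 100;
}
```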

VII. Response Analysis

13 Sentiment Analysis

We analyze how positively or negatively AI discusses your brand. When AI recommends a competitor with glowing language but describes you neutrally, that's a sentiment gap.

Sentiment Scoring — Visibility Context (0-100)

See Commerce Intelligence (Step 11) for the full word lists. The same count-based lexicon algorithm is used: a context window of 150 chars before to 300 chars after each brand mention, positive/negative word counting, and scoring via the 60 + delta×8 / 40 - |delta|×8 / 50 + delta×5 formula, where delta = positiveCount - negativeCount.

// Default: 50 (neutral) when no brand mentions found

// Display thresholds (dashboard rendering):
score ≥ 75  → Very Positive (green)
score 60-74 → Positive      (green)
score 45-59 → Neutral       (yellow)
score 30-44 → Negative      (orange)
score < 30  → Very Negative (red)

Sentiment Scoring — Focus Group Context

The Consumer Focus Group uses a separate, more nuanced sentiment pipeline (lib/focus-group/ingestion/sentiment.js) with fractional word weights and square-root normalization:

// Fractional weights (0.3 to 0.9 per word):
Positive: "love" (0.8), "excellent" (0.8), "highly recommend" (0.9),
  "best" (0.7), "perfect" (0.8), "game changer" (0.7), "quality" (0.4)...
Negative: "worst" (-0.9), "hate" (-0.8), "terrible" (-0.8),
  "don't recommend" (-0.8), "waste" (-0.7), "useless" (-0.7)...

// Normalization:
score = totalScore / √matchCount     // dampen high match counts
score = clamp(score, -1, +1)

// Labels (threshold ±0.2):
score > 0.2         → positive
score < -0.2        → negative
|score| < 0.2 AND matchCount > 2 → mixed
otherwise           → neutral

// Also detects: emotions (8 types) and intents (8 types)

Limitation: Both sentiment systems are lexicon-based. Sarcasm, double negation, and context-dependent language may not be captured.
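The weighted Focus Group variant can be sketched as follows. This is an illustration under assumptions, not the code in lib/focus-group/ingestion/sentiment.js: the weight table holds only the single-word examples listed above, multi-word phrases such as "highly recommend" are not handled, and returning neutral for zero matches is an assumption.

```javascript
// Sample weights taken from the examples above (single words only).
const WEIGHTS = new Map([
  ['love', 0.8], ['excellent', 0.8], ['best', 0.7], ['perfect', 0.8], ['quality', 0.4],
  ['worst', -0.9], ['hate', -0.8], ['terrible', -0.8], ['waste', -0.7], ['useless', -0.7],
]);

function focusGroupSentiment(text) {
  const tokens = text.toLowerCase().match(/[a-z']+/g) || [];
  let totalScore = 0;
  let matchCount = 0;
  for (const token of tokens) {
    if (WEIGHTS.has(token)) {
      totalScore += WEIGHTS.get(token);
      matchCount += 1;
    }
  }
  if (matchCount === 0) return { score: 0, label: 'neutral', matchCount };
  // Square-root normalization dampens texts with many matches, then clamp to [-1, 1].
  const score = Math.max(-1, Math.min(1, totalScore / Math.sqrt(matchCount)));
  let label = 'neutral';
  if (score > 0.2) label = 'positive';
  else if (score < -0.2) label = 'negative';
  else if (Math.abs(score) < 0.2 && matchCount > 2) label = 'mixed';
  return { score, label, matchCount };
}
```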

14 Customer Journey Funnel

Every citation is classified into a buyer journey stage — Awareness, Consideration, Conversion, or Community — showing where your brand appears (and disappears) in the decision path.

Two-Priority Classification

Priority 1 — URL pattern matching (most reliable):

Awareness:     Wikipedia, blogs, news, guides, tutorials,
               "what-is" pages, educational sites

Consideration: CNET, Wirecutter, PCMag, RTINGS, Tom's Guide,
               TechRadar, comparison pages, "vs" content,
               review sites, "best-of" roundups

Conversion:    Amazon, Best Buy, Walmart, Target, Newegg,
               /buy, /shop, /pricing, /checkout, /cart URLs,
               retailer domains, product pages

Community:     Reddit, Quora, Stack Overflow, GitHub discussions,
               forums, Discord, community boards

Priority 2 — Prompt text patterns (fallback): If URL doesn't match, classify based on prompt text ("best X vs Y" → Consideration, "buy X" → Conversion). Default: Awareness.
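The two-priority classification can be sketched as a URL rule table with a prompt-text fallback. The rule tables below are hypothetical and heavily abbreviated, and the order in which rules are tried is an assumption.

```javascript
// Priority 1: URL patterns (abbreviated samples of each stage's list).
const URL_RULES = [
  { stage: 'Conversion', pattern: /amazon\.|bestbuy\.|walmart\.|\/(buy|shop|pricing|checkout|cart)(\/|$)/i },
  { stage: 'Community', pattern: /reddit\.com|quora\.com|stackoverflow\.com/i },
  { stage: 'Consideration', pattern: /cnet\.com|pcmag\.com|rtings\.com|-vs-|\/best-/i },
  { stage: 'Awareness', pattern: /wikipedia\.org|\/blog\/|what-is/i },
];

// Priority 2: prompt text patterns, used only when no URL rule matches.
const PROMPT_RULES = [
  { stage: 'Conversion', pattern: /\bbuy\b/i },
  { stage: 'Consideration', pattern: /\bvs\b|\bbest\b/i },
];

function classifyStage(url, prompt) {
  const byUrl = URL_RULES.find((rule) => rule.pattern.test(url));
  if (byUrl) return byUrl.stage;
  const byPrompt = PROMPT_RULES.find((rule) => rule.pattern.test(prompt));
  return byPrompt ? byPrompt.stage : 'Awareness'; // documented default
}
```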

Deduplication: Same domain across prompts is merged — counts summed, unique LLMs and URLs collected. Sorting: Brand first → Competitor → Other, then by citation count.

VIII. Content Optimization

15 Content Gap Detection

We identify concepts AI expects in your industry that your website doesn't cover. These missing concepts are why AI overlooks you.

Gap Severity Algorithm

// GPT-4o-mini identifies missing concepts per page
// Concepts are: deduplicated by name, frequency-ranked,
// importance-labeled (high/medium/low)

highGaps ≥ 3  → "Weak"     (significant content gaps)
highGaps  1-2 → "Moderate" (some gaps to address)
highGaps    0 → "Strong"   (comprehensive coverage)

// Top 10 gaps shown, sorted by frequency
// Topic gaps: clusters with zero aligned pages

16 Technical SEO Audit

We audit every crawled page for structural issues that prevent AI from parsing your content — missing schema markup, weak meta descriptions, and poor heading hierarchy.

Audit Checks & Thresholds

// Per-page technical SEO scoring (part of GEO Score)
score ≥ 18/25 → Good       (schema and structure present)
score   12-17 → Needs work (partial coverage)
score <  12   → Weak       (priority fix needed)

// Site-wide audit verdict
0 weak pages     → "Strong"   (all pages well-structured)
≤ 30% weak pages → "Moderate" (some pages need attention)
> 30% weak pages → "Weak"     (widespread technical gaps)

// Checks: JSON-LD schema, meta descriptions (flag <120 chars),
// lists & tables, heading hierarchy (H1→H2→H3)

17 AI-Generated Recommendations

Based on all detected gaps, GPT-4o-mini generates specific, actionable recommendations — each with a priority level, estimated effort, and the exact page to modify.

Recommendation Pipeline

Top 20 content gaps fed into GPT-4o-mini with full context (gap severity, AI response snippet, topics, best matching page, similarity score). Output: structured JSON with issue, impact, action steps, implementation guide, effort estimate, priority, and suggested page. Rate-limited: 200ms between calls. Sorted by gap severity (highest-impact first).

18 Consumer Demand Prioritization

Not all prompts matter equally. We analyze relative consumer interest using DataForSEO search volume data, then assign demand tiers — High, Medium, Low, or Niche.

Percentile-Based Tier Assignment

Data source: DataForSEO keywords/data endpoint for search volume signals per prompt.

// Percentile-based tiers (within analysis batch)
volumes = prompts.map(p => searchVolume(p))
p75, p50, p25 = percentiles(nonZeroVolumes)

volume ≥ p75  → HIGH DEMAND   (score: 3)
volume ≥ p50  → MED DEMAND    (score: 2)
volume >  0   → LOW DEMAND    (score: 1)
volume == 0   → NICHE         (score: 0)

// Usage: demand badges on prompt cards, smart sorting
// (high-demand gaps surface first), demand-weighted
// visibility summary metric

Why relative tiers? Raw volumes vary wildly by industry. Percentile tiers show which prompts matter most relative to your market.
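A sketch of the percentile-based tier assignment. The function names are hypothetical, and the linear-interpolation percentile method is an assumption, since the text does not say which percentile definition is used.

```javascript
// Linear-interpolation percentile over an ascending-sorted array (assumed method).
function percentile(sortedValues, p) {
  if (sortedValues.length === 0) return 0;
  const idx = (p / 100) * (sortedValues.length - 1);
  const lo = Math.floor(idx);
  const hi = Math.ceil(idx);
  return sortedValues[lo] + (sortedValues[hi] - sortedValues[lo]) * (idx - lo);
}

// prompts: [{ text, volume }]; percentiles are taken over non-zero volumes only.
function assignDemandTiers(prompts) {
  const nonZero = prompts
    .map((p) => p.volume)
    .filter((v) => v > 0)
    .sort((a, b) => a - b);
  const p75 = percentile(nonZero, 75);
  const p50 = percentile(nonZero, 50);
  return prompts.map((p) => {
    if (p.volume === 0) return { ...p, tier: 'NICHE', score: 0 };
    if (p.volume >= p75) return { ...p, tier: 'HIGH', score: 3 };
    if (p.volume >= p50) return { ...p, tier: 'MEDIUM', score: 2 };
    return { ...p, tier: 'LOW', score: 1 };
  });
}
```

Because the cut-offs come from the batch itself, the same prompt can land in different tiers in two analyses with different prompt sets.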

IX. Brand Safety Hub

19 Ground Truth Fact Extraction

We extract verified facts from your brand's website using three parallel extractors, each with different confidence levels based on data reliability.

Three-Tier Extraction Pipeline

// 1. Schema.org Extractor (lib/ground-truth/ingestion/schema-extractor.js)
//    Parses JSON-LD structured data
//    Confidence: 0.85 base, +0.10 for high-signal fields
//    (name, url, price, address), +0.05 for medium-signal
//    (description, logo, brand). Capped at 0.95.

// 2. Meta Tag Extractor (lib/ground-truth/ingestion/meta-extractor.js)
//    Reads <meta>, Open Graph (og:*), Twitter Card tags
//    Confidence: 0.70 base meta, 0.75 OG/Twitter, 0.80 canonical

// 3. LLM Extractor (lib/ground-truth/ingestion/llm-extractor.js)
//    Claude/GPT extracts facts from page body text
//    Confidence clamped: 0.40 (inferred) to 0.60 (very explicit)
//    Lowest tier — LLM-inferred facts are less reliable

Deduplication: Same (category, fact_key) keeps highest-confidence source. 11 valid categories: identity, product, pricing, claim, policy, location, leadership, competitive, contact, technical, visual.
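The deduplication rule and the Schema.org confidence calculation can be sketched as below. The function names and the fact object shape (`category`, `fact_key`, `confidence`, `source`) are assumptions made for illustration.

```javascript
// Keep the highest-confidence fact for each (category, fact_key) pair.
function dedupeFacts(facts) {
  const best = new Map();
  for (const fact of facts) {
    const key = `${fact.category}::${fact.fact_key}`;
    const current = best.get(key);
    if (!current || fact.confidence > current.confidence) best.set(key, fact);
  }
  return [...best.values()];
}

// Schema.org extractor confidence: 0.85 base plus a field-signal bonus, capped at 0.95.
function schemaConfidence(field) {
  const HIGH_SIGNAL = ['name', 'url', 'price', 'address'];
  const MEDIUM_SIGNAL = ['description', 'logo', 'brand'];
  const bonus = HIGH_SIGNAL.includes(field)
    ? 0.10
    : MEDIUM_SIGNAL.includes(field) ? 0.05 : 0;
  return Math.min(0.95, 0.85 + bonus);
}
```

With these confidence bands, a Schema.org fact always outranks an LLM-extracted fact for the same key, since the LLM tier is clamped to 0.60 at most.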

20 LLM Accuracy Testing

We test how accurately each LLM represents your ground truth facts — detecting hallucinations, outdated info, and factual errors across 5 query framings per fact.

Two-Pass Evaluation — `lib/ground-truth/comparison/evaluator.js`

Query generation (lib/ground-truth/comparison/query-generator.js): 5 framings per fact — direct, comparative, recommendation, negative, conversational. Temperature 0.7 for variety.

// Pass 1: LLM Judge (temperature 0)
// Classifies response as: correct, partially_correct, incorrect,
// hallucinated, not_mentioned, or refused

// Pass 2: Embedding Similarity
similarity = cosineSimilarity(embed(groundTruth), embed(extractedAnswer))

// Overall Accuracy
accuracy = ((correct + 0.5 × partial) / (total - refused)) × 100

// Confidence scoring
correct AND similarity > 0.8 → 0.95
correct alone              → 0.85
similarity > 0.9           → 0.70
else                       → 0.50

// Severity classification
hallucinated                       → critical
incorrect AND confused_with_competitor → high
incorrect                          → medium
partially_correct / not_mentioned  → low
refused / correct                  → info

// Issue types: outdated_info, wrong_number, wrong_name,
// confused_with_competitor, fabricated_detail, missing_context,
// oversimplified, contradictory

Models tested: GPT-4o (500ms rate limit), Claude Sonnet 4.5 (1000ms), Gemini 2.0 Flash (500ms), Perplexity Sonar (1000ms).
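The accuracy, confidence, and severity rules can be written as three small functions. This is a sketch with hypothetical names; it takes the judge verdict and the similarity value as inputs and does not call any model.

```javascript
// Refused responses are removed from the denominator.
function overallAccuracy({ correct, partial, total, refused }) {
  const answered = total - refused;
  return answered <= 0 ? 0 : ((correct + 0.5 * partial) / answered) * 100;
}

// Combine the judge verdict (pass 1) with embedding similarity (pass 2).
function evaluationConfidence(verdict, similarity) {
  if (verdict === 'correct' && similarity > 0.8) return 0.95;
  if (verdict === 'correct') return 0.85;
  if (similarity > 0.9) return 0.70;
  return 0.50;
}

function severity(verdict, issueTypes = []) {
  if (verdict === 'hallucinated') return 'critical';
  if (verdict === 'incorrect') {
    return issueTypes.includes('confused_with_competitor') ? 'high' : 'medium';
  }
  if (verdict === 'partially_correct' || verdict === 'not_mentioned') return 'low';
  return 'info'; // refused or correct
}
```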

21 AI Readiness Score

A composite score measuring how well your website is prepared for AI consumption — structured data, content quality, fact coverage, freshness, and crawlability.

Two Distinct AI Readiness Scores

1. Brand-Level Readiness — appears in the Brand Safety Hub. Measures overall AI preparedness across your brand's fact base:

// lib/ground-truth/scoring/readiness-scorer.js
readiness = 0.20 × schema_markup       // JSON-LD score (0-100)
          + 0.20 × content_structure   // context(40%) + diversity(30%) + FAQ(30%)
          + 0.20 × fact_coverage       // coveredCategories / 11 × 100
          + 0.15 × freshness           // factFreshness(70%) + sourceFreshness(30%)
          + 0.10 × crawlability        // robots(20) + AI access(30) + sitemap(30) + URLs(20)
          + 0.15 × query_coverage      // coveredQueries / totalQueries × 100

// Content structure sub-scores:
//   Context: (factsWithContext / totalFacts) × 40
//   Diversity: min(30, uniqueSubcategories / totalFacts × 100)
//   FAQ: min(30, faqFactCount × 10)

// Freshness decay:
//   Fact freshness: linear decay over 90 days
//   Source freshness: linear decay over 30 days

// Crawlability breakdown:
//   robots.txt present:    +20 pts
//   AI crawlers allowed:   +30 pts
//   Sitemap found:         +30 pts
//   URL count (scale/50):  +20 pts max

2. Page-Level Readiness — appears per-page in the Pages tab. Measures individual page AI readiness:

// lighthouse-server.js — calculateAIReadinessScore()
readiness = 0.40 × geoScore            // GEO content score (0-100)
          + 0.30 × alignment           // avgAssociationScore × 100
          + 0.15 × technicalHealth     // 100 baseline, deductions for:
              // no schema (-30), weak meta (-20), no H1 (-20),
              // thin content <300 words (-15), slow >3s (-10),
              // redirect chains >2 (-5), broken links (-15)
          + 0.15 × linkAuthority       // 50 baseline, +10/inlink (max 30),
              // -40 if orphan, +20 if hub

// Grade: A (80+), B (60-79), C (40-59), D (20-39), F (<20)
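Both readiness scores are weighted sums, which the sketch below makes concrete. The function and field names are hypothetical. Three choices are assumptions: "linear decay" is read as a straight line from 100 to 0 over the stated horizon, and both technical health and link authority are clamped to the 0-100 range.

```javascript
// Brand-level: linear decay from 100 at age 0 to 0 at the horizon (assumed shape).
const linearDecay = (ageDays, horizonDays) =>
  Math.max(0, 100 * (1 - ageDays / horizonDays));

function freshness(factAgeDays, sourceAgeDays) {
  return 0.7 * linearDecay(factAgeDays, 90) + 0.3 * linearDecay(sourceAgeDays, 30);
}

function brandReadiness(s) {
  return 0.20 * s.schemaMarkup + 0.20 * s.contentStructure + 0.20 * s.factCoverage +
    0.15 * s.freshness + 0.10 * s.crawlability + 0.15 * s.queryCoverage;
}

// Page-level: start at 100 and apply the listed deductions.
function technicalHealth(page) {
  let score = 100;
  if (!page.hasSchema) score -= 30;
  if (page.weakMeta) score -= 20;
  if (!page.hasH1) score -= 20;
  if (page.wordCount < 300) score -= 15;
  if (page.loadSeconds > 3) score -= 10;
  if (page.redirectChain > 2) score -= 5;
  if (page.brokenLinks > 0) score -= 15;
  return Math.max(0, score); // deductions total 115, so clamp at 0
}

function linkAuthority({ inlinks, isOrphan, isHub }) {
  let score = 50 + Math.min(30, inlinks * 10);
  if (isOrphan) score -= 40;
  if (isHub) score += 20;
  return Math.max(0, Math.min(100, score));
}

function pageReadiness({ geoScore, avgAssociationScore, technical, authority }) {
  const score = 0.40 * geoScore + 0.30 * (avgAssociationScore * 100) +
    0.15 * technical + 0.15 * authority;
  const grade =
    score >= 80 ? 'A' : score >= 60 ? 'B' : score >= 40 ? 'C' : score >= 20 ? 'D' : 'F';
  return { score, grade };
}
```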

22 Prompt Sensitivity & Hallucination Analysis

We measure how much LLM accuracy varies by question framing and identify systematic hallucination patterns across models and fact categories.

Sensitivity & Pattern Analysis

// Prompt Sensitivity Score (0-100)
// Variance of accuracy across 5 query framings per fact
sensitivity = min(avgVariance / 0.25, 1) × 100
// 0 = consistent across all framings
// 100 = highly framing-dependent

// Hallucination Pattern Analysis
// Groups hallucinated responses by category and issue type
// Identifies which fact categories and error types are
// most common per model

// Causal Attribution
// Compares 7-day accuracy windows before/after fact changes
// Measures impact of content updates on LLM accuracy

// Benchmarks (percentile-based)
// Min sample: 5 brands for comparison
// Segmented by: industry, company size, model
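A sketch of the sensitivity score. It assumes that per-framing accuracy is expressed on a 0-1 scale and that the variance is the population variance; both are assumptions, chosen because 0.25 is the largest variance a 0-1 value can have, which matches the normalization constant above.

```javascript
// Population variance of a list of numbers.
function populationVariance(values) {
  const mean = values.reduce((a, b) => a + b, 0) / values.length;
  return values.reduce((acc, v) => acc + (v - mean) ** 2, 0) / values.length;
}

// accuracyByFact: one array per fact, holding accuracy (0-1) for each framing.
function promptSensitivity(accuracyByFact) {
  if (accuracyByFact.length === 0) return 0;
  const avgVariance =
    accuracyByFact.map(populationVariance).reduce((a, b) => a + b, 0) /
    accuracyByFact.length;
  return Math.min(avgVariance / 0.25, 1) * 100;
}
```

For example, a fact answered correctly under four framings and incorrectly under one has variance 0.16, which maps to a sensitivity of about 64.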

X. Reddit Intelligence

23 Reddit Intelligence

We discover, scrape, and analyze Reddit discussions about your brand — extracting sentiment, competitor mentions, topic relevance, and consumer recommendations.

Three-Stage Pipeline

1. Discovery (lib/reddit-intel/discovery.js):

// SerpAPI searches with 6 query variants:
site:reddit.com "{brand}"
site:reddit.com "{brand}" review
site:reddit.com "{brand}" vs
site:reddit.com "{brand}" alternative
site:reddit.com {domain}
site:reddit.com "best {category}" recommendation

// 20 results per query, 500ms between queries, max 2 retries

2. Scraping (lib/reddit-intel/scraper.js): Reddit JSON endpoint .json?limit=200&sort=top. Recursive comment flattening (max depth 10). Filters AutoModerator and deleted content.

3. Analysis (lib/reddit-intel/analyzer.js): Claude Haiku 4.5 (claude-haiku-4-5-20251001, max 2000 tokens) extracts:

Brand mentions with sentiment (positive/negative/neutral/mixed)
Domain mentions with context
Topics with relevance scores (0-1)
Recommendations with upvote counts

Scoring: Sentiment breakdown by count, competitor comparison by mention volume, recommendations grouped case-insensitive and sorted by count.
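The recursive comment flattening from the scraping stage could look like the sketch below. It assumes the usual shape of Reddit's JSON listings (comments have `kind: 't1'`, and `replies` is either an empty string or a nested listing); `flattenComments` is a hypothetical name, not the function in lib/reddit-intel/scraper.js.

```javascript
// Flatten a Reddit comment tree into a list, skipping AutoModerator and deleted content.
function flattenComments(children, depth = 0, maxDepth = 10, out = []) {
  if (depth > maxDepth) return out;
  for (const child of children) {
    if (child.kind !== 't1') continue; // skip "load more" stubs and other kinds
    const { author, body, ups, replies } = child.data;
    const removed =
      body === '[deleted]' || body === '[removed]' || author === '[deleted]';
    if (author !== 'AutoModerator' && !removed) {
      out.push({ author, body, ups, depth });
    }
    // `replies` is an empty string when a comment has no replies.
    if (replies && replies.data) {
      flattenComments(replies.data.children, depth + 1, maxDepth, out);
    }
  }
  return out;
}
```

Replies under a filtered comment are still visited, so a useful answer beneath a deleted parent is not lost; whether the platform does the same is not stated.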

XI. Consumer Focus Group

24 Focus Group — Data Ingestion

We ingest consumer voices from Reddit, reviews, and surveys. Each response is quality-scored, embedded, and sentiment-analyzed.

Ingestion Pipeline

// Quality score per response (lib/focus-group/ingestion/)
quality = length    × 0.25  // longer = more detail
        + engagement × 0.25  // upvotes, replies
        + specificity × 0.30 // concrete details vs generic
        + recency    × 0.20  // newer = more relevant

// Embeddings: text-embedding-3-large (3072 dims)
// Sentiment: Claude API (positive/negative/neutral/mixed)
// Rule-based fallback (lib/focus-group/ingestion/sentiment.js):
//   Fractional weights per word (0.3 to 0.9)
//   e.g., "highly recommend" +0.9, "worst" -0.9
//   Normalized: score / √matchCount, label threshold ±0.2
//   Also detects 8 emotion types + 8 intent types
//   See Step 13 for full details on both sentiment systems

25 Focus Group — Persona Generation

We cluster consumer responses using K-means++ on embedding vectors, then generate distinct personas from each cluster — complete with values, motivations, and communication style.

K-means++ Clustering — `lib/focus-group/personas/clustering.js`

// K-means++ initialization:
//   First centroid: random selection
//   Remaining k-1: probability ∝ distance² (spreads centers)

// Iterative assignment/centroid update
//   Distance: Euclidean (√Σ(a[i]-b[i])²)
//   Convergence tolerance: 0.0001
//   Max iterations: 100

// Optimal K selection (k=3 to k=8):
combinedScore = silhouetteScore - 0.02 × (k - minK)
// Slight parsimony penalty for more clusters

// Silhouette score per point:
s(i) = (b - a) / max(a, b)
// a = avg distance to own cluster
// b = min avg distance to any other cluster
// Range: -1 (misclassified) to +1 (well-clustered)

// Cluster quality:
tightness  = exp(-avgDistance / 2)      // normalized 0-1
sizeScore  = min(1, clusterSize / 20)   // maxes at 20+
confidence = tightness × 0.6 + sizeScore × 0.4
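The silhouette score, the parsimony-adjusted selection score, and the cluster confidence can be sketched as follows. The function names are hypothetical; the sketch assumes at least two clusters and assigns s(i) = 0 to points in singleton clusters, which is the common convention but is not stated above.

```javascript
const euclidean = (a, b) =>
  Math.sqrt(a.reduce((sum, v, i) => sum + (v - b[i]) ** 2, 0));

// Mean silhouette over all points; requires at least two distinct labels.
function meanSilhouette(points, labels) {
  const clusterIds = [...new Set(labels)];
  // Average distance from point i to the other members of one cluster.
  const avgDist = (i, clusterId) => {
    let sum = 0;
    let count = 0;
    points.forEach((p, j) => {
      if (j !== i && labels[j] === clusterId) {
        sum += euclidean(points[i], p);
        count += 1;
      }
    });
    return count === 0 ? null : sum / count;
  };
  let total = 0;
  points.forEach((_, i) => {
    const a = avgDist(i, labels[i]);
    if (a === null) return; // singleton cluster contributes 0
    const b = Math.min(
      ...clusterIds.filter((c) => c !== labels[i]).map((c) => avgDist(i, c))
    );
    const denom = Math.max(a, b);
    total += denom === 0 ? 0 : (b - a) / denom;
  });
  return total / points.length;
}

// Parsimony penalty: each cluster beyond minK costs 0.02.
const combinedScore = (silhouette, k, minK = 3) => silhouette - 0.02 * (k - minK);

function clusterConfidence(avgDistance, clusterSize) {
  const tightness = Math.exp(-avgDistance / 2);
  const sizeScore = Math.min(1, clusterSize / 20);
  return tightness * 0.6 + sizeScore * 0.4;
}
```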

26 Focus Group — Discussion Simulation

Each persona participates in a 5-round AI-moderated discussion, reacting to your brand naturally based on their cluster's values, frustrations, and communication style.

Discussion Engine — `lib/focus-group/discussion/engine.js`

// Model: claude-haiku-4-5-20251001 (fast, cost-effective)
// Max tokens per response: 300

// 5 structured rounds per template:
1. First Impressions — gut reaction
2. Deep Dive — specific appeals vs concerns
3. Personal Relevance — would you realistically use this?
4. Competitive Context — how does it compare?
5. Final Verdict — decision scale + what would change mind

// Each persona's system prompt includes:
//   values, motivations, frustrations,
//   communication style, authentic voice samples
//   from their cluster

// Interaction: each persona sees previous responses
// for natural conversation flow

// Delays: 500ms between responses, 1000ms between rounds,
// 800ms between personas

27 Focus Group — Validation

We validate persona traits against real consumer data to ensure the AI-generated personas accurately represent actual consumer segments.

Validation Thresholds — `lib/focus-group/validation/question-generator.js`

// Per-trait validation:
percentage = (validated / total) × 100
≥ 70% → confirmed    (trait matches real data)
40-69% → partial     (some evidence)
< 40% → weak        (insufficient support)

// Overall score: average of trait percentages

// Confidence levels:
50+ responses AND 70%+ score → high
30+ responses AND 50%+ score → medium
otherwise                    → low

XII. Prediction Markets

28 Prediction Markets

We ingest prediction markets from Kalshi and Polymarket, embed contract titles for semantic matching to your brand, and measure forecast calibration using Brier score decomposition.

Brier Score Calibration — `lib/prediction-markets/calibration/brier.js`

// Brier Score: mean of (forecast - outcome)² over all resolved forecasts
// 0 = perfect, 0.25 = coin-flip, 1 = maximally wrong

// Brier Decomposition:
Brier = Reliability - Resolution + Uncertainty

// k indexes the probability bins; n_k = number of forecasts in bin k
Reliability = Σ_k (n_k / n) × (avgForecast_k - avgOutcome_k)²
  // Lower = better calibrated (forecasts match outcomes)

Resolution  = Σ_k (n_k / n) × (avgOutcome_k - baseRate)²
  // Higher = better discrimination (separates events)

Uncertainty = baseRate × (1 - baseRate)
  // Fixed: inherent outcome variance

// Calibration: 10 probability bins (deciles)

// Platform comparison thresholds:
delta < 0.01  → similarly calibrated
delta 0.01-0.05 → slight advantage
delta > 0.05  → significant advantage

// Rate limits: Kalshi 50ms, Polymarket 34ms
// Relevance threshold: 0.3 cosine similarity to brand

Time horizons: days (≤7), weeks (≤30), months (≤90), quarters (≤270), years (>270).
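The Brier decomposition over 10 probability bins can be sketched as below. `brierDecomposition` is a hypothetical name, not the export of lib/prediction-markets/calibration/brier.js. The identity Brier = Reliability - Resolution + Uncertainty holds exactly only when all forecasts inside a bin are equal; with mixed forecasts in a bin it is an approximation.

```javascript
// forecasts: probabilities in [0, 1]; outcomes: 0 or 1 for each resolved contract.
function brierDecomposition(forecasts, outcomes, binCount = 10) {
  const n = forecasts.length;
  const baseRate = outcomes.reduce((a, b) => a + b, 0) / n;
  const bins = Array.from({ length: binCount }, () => ({ n: 0, f: 0, o: 0 }));
  forecasts.forEach((f, i) => {
    // A forecast of exactly 1.0 goes into the last bin.
    const k = Math.min(binCount - 1, Math.floor(f * binCount));
    bins[k].n += 1;
    bins[k].f += f;
    bins[k].o += outcomes[i];
  });
  let reliability = 0;
  let resolution = 0;
  for (const bin of bins) {
    if (bin.n === 0) continue;
    const avgForecast = bin.f / bin.n;
    const avgOutcome = bin.o / bin.n;
    reliability += (bin.n / n) * (avgForecast - avgOutcome) ** 2;
    resolution += (bin.n / n) * (avgOutcome - baseRate) ** 2;
  }
  const uncertainty = baseRate * (1 - baseRate);
  const brier =
    forecasts.reduce((sum, f, i) => sum + (f - outcomes[i]) ** 2, 0) / n;
  return { brier, reliability, resolution, uncertainty };
}
```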

XIII. AI Pages (Shadow Sites)

29 AI Pages (Shadow Sites)

We generate AI-optimized shadow versions of your pages — restructured for maximum LLM citability with answer capsules, key facts, FAQ schema, and rich structured data. Served only to AI crawlers.

Shadow Page Generation — `scripts/lighthouse-server.js`

// Content components generated per page:
1. Answer Capsule (≤150 chars)
   Priority: pre-extracted → potential capsule → first sentence

2. Key Facts Section
   Regex: sentences with %, large numbers, study refs
   Filter: 20-200 chars, limit 5 facts

3. Content Sections (H2/H3 hierarchy)
   Ceiling allocation across headings
   Paragraph truncation at sentence boundary (500 chars)

4. Expert Insights (300 char summary + quick wins)
5. Competitive Landscape (top 5 competitors)
6. Common Questions (top 5 queries)
7. Trusted Sources (top 5 citation sources)
8. FAQ Schema (schema.org/Question markup)
9. Related Pages Navigation (top 15 internal links)

// Schema types auto-detected:
Organization (always), Article, Product, HowTo,
FAQPage, WebPage, BreadcrumbList

// Token reduction: reductionPct = (1 - shadow/original) × 100

// llms.txt: Markdown sitemap for AI models

// Deployment: Cloudflare Worker serves shadow HTML to
// AI bots (GPTBot, ClaudeBot, PerplexityBot, etc.)
// and original content to human visitors
// robots meta: "noindex, nofollow" (AI-only)
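Two of the helpers implied above, sentence-boundary truncation and the token reduction percentage, can be sketched as follows. The names are hypothetical, and treating ". ", "! ", and "? " as the only sentence boundaries is a simplifying assumption.

```javascript
// Cut text to at most maxChars, ending at the last complete sentence when one exists.
function truncateAtSentence(text, maxChars = 500) {
  if (text.length <= maxChars) return text;
  const slice = text.slice(0, maxChars);
  const lastEnd = Math.max(
    slice.lastIndexOf('. '),
    slice.lastIndexOf('! '),
    slice.lastIndexOf('? ')
  );
  // No boundary found: fall back to a hard cut.
  return lastEnd === -1 ? slice.trimEnd() : slice.slice(0, lastEnd + 1);
}

// reductionPct = (1 - shadow / original) × 100
function reductionPct(shadowTokens, originalTokens) {
  return originalTokens === 0 ? 0 : (1 - shadowTokens / originalTokens) * 100;
}
```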

XIV. Metrics Reference

GEO Score (0-100)

Content citability score. answerCapsules(0-25) + sectionStructure(0-25) + originalData(0-25) + technicalSEO(0-25). Grades: A (80+), B (60-79), C (40-59), D/F (<40).

Topic Alignment Score

cosineSimilarity(pageEmbedding, topicEmbedding) × 100. ≥60% strong (green), 35-59% moderate (yellow), <35% weak (red).

Brand Visibility (%)

Two-level cross-LLM averaging. Per prompt: (LLMsWithBrand / totalLLMs) × 100. Overall: avg(perPromptVisibility) across all prompts. Case-insensitive brand name detection. Tracked separately for branded vs unbranded queries.

Multi-LLM Visibility (%)

(LLMsWithBrand / totalLLMs) × 100 per prompt, then averaged. Up to 12 models: GPT-5 Mini/GPT-5.4, Claude Haiku/Sonnet 4.5, Gemini 2.0 Flash/3.1 Pro, Perplexity Sonar/Sonar Pro, Google AI Overview, Llama 4 Maverick, Mistral Medium, Command R.

Share of Voice (0-100%)

brandMentions / (brandMentions + allCompetitorMentions) × 100. Competitive positioning — even 100% visibility doesn't help if 4 competitors appear alongside you (SOV = 20%).

Discovery Gap (percentage points)

brandedVisibility% − unbrandedVisibility%. >15 pts = major gap (AI only finds you by name). 1-15 pts = modest. ≤0 pts = strong organic discovery.
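Share of Voice and Discovery Gap reduce to two short functions. The names are hypothetical, and labeling any gap between 0 and 15 points as modest is an assumption about the band edges.

```javascript
// competitorMentions: array of mention counts, one per competitor.
function shareOfVoice(brandMentions, competitorMentions) {
  const total = brandMentions + competitorMentions.reduce((a, b) => a + b, 0);
  return total === 0 ? 0 : (brandMentions / total) * 100;
}

// Gap in percentage points between branded and unbranded visibility.
function discoveryGap(brandedVisibility, unbrandedVisibility) {
  const gap = brandedVisibility - unbrandedVisibility;
  const label = gap > 15 ? 'major' : gap > 0 ? 'modest' : 'strong';
  return { gap, label };
}
```

With one brand mention and four competitors mentioned once each, `shareOfVoice(1, [1, 1, 1, 1])` gives about 20, the case described under Share of Voice.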

Sentiment Score (0-100)

Count-based lexicon matching in a brand context window (150 chars before to 300 chars after each mention). With delta = positiveCount - negativeCount: positive > negative+1 → 60 + delta×8; negative > positive+1 → 40 - |delta|×8; else → 50 + delta×5. Display: ≥75 very positive, ≥60 positive, ≥45 neutral, ≥30 negative, <30 very negative.

Strategy Readiness Score (0-100)

visibility(25%) + geoScore(20%) + sentiment(15%) + alignment(15%) + citations(15%) + sov(10%). ≥60 strong, 35-59 moderate, <35 weak.

Cross-Platform Consistency Grade

Population standard deviation of per-platform visibility. σ = √(Σ(vis - mean)² / n). σ ≤ 8 = A, 9-15 = B, 16-22 = C, >22 = D.

Ground Truth Accuracy (0-100)

((correct + 0.5 × partial) / (total - refused)) × 100. Two-pass evaluation: LLM judge + embedding similarity. Severity: critical (hallucinated) → info (correct).

AI Readiness Score (0-100) — Two Variants

Brand-level: schema(20%) + content(20%) + coverage(20%) + freshness(15%) + crawlability(10%) + queries(15%). Page-level: GEO(40%) + alignment(30%) + technical(15%) + linkAuthority(15%). See Step 21 for full breakdowns.

Prompt Sensitivity (0-100)

min(avgVariance / 0.25, 1) × 100. 0 = consistent across all question framings, 100 = highly framing-dependent.

Brier Score (0-1)

Mean of (forecast - outcome)² over resolved forecasts. 0 = perfect prediction, 0.25 = coin-flip, 1 = maximally wrong. Decomposed into reliability, resolution, uncertainty.

Focus Group Persona Confidence (0-1)

tightness × 0.6 + sizeScore × 0.4. Tightness = exp(-avgDistance/2). Size = min(1, clusterSize/20).

Commerce Share of Shelf (0-100%)

(brandFoundQueries / totalQueries) × 100. <30% critical, 30-59% action needed, ≥60% strong.

90-Day Projections

Tiered additions capped at 100: Visibility <40 → +20, <70 → +15, else → +8. GEO <40 → +18, <60 → +12, else → +8. Sentiment <50 → +12, else → +8. SOV <20 → +10, else → +5. These are directional estimates based on observed improvement ranges after implementing recommended optimizations — not statistical forecasts.
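The tiered additions can be written as a single function. `project90Days` is a hypothetical name; the tiers and the cap at 100 come from the rule above.

```javascript
// Apply the tiered uplift to each metric and cap the result at 100.
function project90Days({ visibility, geo, sentiment, sov }) {
  const cap = (x) => Math.min(100, x);
  return {
    visibility: cap(visibility + (visibility < 40 ? 20 : visibility < 70 ? 15 : 8)),
    geo: cap(geo + (geo < 40 ? 18 : geo < 60 ? 12 : 8)),
    sentiment: cap(sentiment + (sentiment < 50 ? 12 : 8)),
    sov: cap(sov + (sov < 20 ? 10 : 5)),
  };
}
```

Lower starting scores receive larger additions, so the projection assumes the biggest gains come from fixing the weakest areas.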

Why This Matters

AI is becoming the new search. Millions of users now ask ChatGPT, Perplexity, and Gemini for recommendations instead of Googling.

AI gives ONE answer, not ten links. If you're not in that answer, you're invisible.

Early movers win. Brands optimizing for AI visibility now will dominate as AI adoption accelerates.

XV. Technical Stack

Embeddings: OpenAI text-embedding-3-large (3072 dims)

AI Testing: GPT-5 Mini / GPT-5.4 with web_search tool

Multi-LLM: ChatGPT (2 tiers), Claude (2), Gemini (2), Perplexity (2), Google AI Overview, Llama 4 Maverick, Mistral Medium, Command R

Analysis: GPT-4o-mini for concept extraction

Ground Truth: Claude Sonnet 4.5 / GPT-4o for fact extraction & evaluation

Focus Group: K-means++ on 3072-dim embeddings, Claude Haiku for roleplay

Prediction Markets: Kalshi + Polymarket APIs, Brier score calibration

Reddit: SerpAPI discovery, Reddit JSON API, Claude Haiku 4.5 analysis

Similarity: Cosine similarity on embedding vectors

Database: Supabase (PostgreSQL) with pgvector for embeddings

Backend: Node.js + Express on Railway

Crawling: Cheerio HTML parser + ScraperAPI

Data Pipeline Overview

Crawl (extract page content, schema, structure) → Embed (text-embedding-3-large, 3072 dims) → Query (harvest real consumer queries from Google) → Test (ask up to 12 LLMs each query) → Score (GEO, visibility, sentiment, citations, readiness) → Recommend (AI-generated action items, gap analysis, projections).

Limitations & Assumptions

Snapshot in time: LLM responses are non-deterministic. Results reflect behavior at analysis time and may differ on re-test.
Lexicon-based sentiment: Sarcasm, negation modifiers, and context-dependent language may not be captured accurately.
Projections are estimates: 90-day projections are directional based on observed improvement ranges, not statistical forecasts.
Brand detection: Simple string matching — may miss abbreviated or misspelled brand names.
API availability: Actual LLM count depends on configured API keys; some providers may be unavailable.

Built with full transparency. All methods described above are exactly how this platform operates.
Data reflects real-time AI model behavior as of your analysis date.

Methodology v3.0 — Last updated March 2026 — 29 steps, 16 metrics, up to 12 LLMs


Shopping Signal Detection

Detect commerce intent and shopping surfaces in AI responses

Analyze Response Text

LLM Response Text

Product Category (optional)

Extra Merchants (comma-separated)

Custom Product Names (comma-separated, optional)

Surface Type Reference

Product Cards: score 60+, 2+ merchants, 2+ prices; rich product listing surface

Shopping Guide: score 60+; strong commerce signals without rich product cards

Soft Commerce: score 35-59; moderate commerce presence in response

Informational+: score 15-34; primarily informational with some commerce hints

Pure Informational: score 0-14; no meaningful commerce signals

Customer Journey Fanout

Full-funnel brand visibility across the customer decision journey

Generate Query Fanout

Brand Name *

Product Category *

Competitors (comma-separated)

Industry (optional)

Journey Stages

Awareness

Customer discovers the category. Queries: "best X", "X buying guide", "how to choose X"

Consideration

Customer evaluates options. Queries: "X vs Y", "X reviews", "top rated X"

Decision

Customer ready to buy. Queries: "best X to buy", "where to buy X", "X deals"

Conversation Journey Analysis

Multi-turn brand tracking across simulated AI conversations

Create Journey

Trajectory Classifications

Strengthening: Brand presence increases through the conversation

Stable: Consistent presence across all turns

Weakening: Brand presence decreases through the conversation

Filtered Out: Brand appeared but was dropped before the final turn

Never Appeared: Brand was never mentioned in any turn

Unified AI Visibility Dashboard

Cross-module brand visibility intelligence


Citation Decay & Time-to-Citation

Track how long your citations persist across AI models

Decay Curves

Click data points to drill into URLs

Refresh Queue

URLs past 75% of their fitted half-life

Declining

URLs with >40% drop in 7-day average

Durable

URLs above 80% of peak after 60 days

Slow Movers

URLs with TTC > 30 days
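The four bucket definitions above can be expressed as independent threshold checks on per-URL metrics. This is a rough sketch under stated assumptions: `bucketUrl` and every field name on the input object are hypothetical, `ageDays` is assumed to mean days since the URL was first cited, and a URL is allowed to fall into more than one bucket. The thresholds themselves (75%, 40%, 80% after 60 days, 30 days) come from the text.

```javascript
// Hypothetical per-URL bucketing using the thresholds listed above.
// Assumed fields: ageDays, halfLifeDays, prev7dAvg, curr7dAvg,
// currentRate, peakRate, ttcDays.
function bucketUrl(u) {
  const buckets = [];
  // Refresh Queue: past 75% of the fitted half-life.
  if (u.ageDays > 0.75 * u.halfLifeDays) buckets.push("Refresh Queue");
  // Declining: more than a 40% drop in the 7-day average.
  if (u.prev7dAvg > 0 && (u.prev7dAvg - u.curr7dAvg) / u.prev7dAvg > 0.4) {
    buckets.push("Declining");
  }
  // Durable: still above 80% of peak after 60 days.
  if (u.ageDays >= 60 && u.currentRate > 0.8 * u.peakRate) buckets.push("Durable");
  // Slow Movers: time-to-citation (TTC) over 30 days.
  if (u.ttcDays > 30) buckets.push("Slow Movers");
  return buckets;
}

console.log(
  bucketUrl({
    ageDays: 50, halfLifeDays: 60,
    prev7dAvg: 10, curr7dAvg: 5,
    currentRate: 0.2, peakRate: 0.5,
    ttcDays: 35,
  })
);
```

Returning a list rather than a single label reflects that the definitions are not mutually exclusive: a URL can be both a slow mover and past its refresh threshold.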

Median Time-to-Citation (days)

Median Citation Half-Life (days)

Total URLs Tracked

URLs

URL	TTC	Half-Life	Peak Rate	Window	Action

Unsaved Analysis

Free Analyses Used Up

Out of Credits

Analysis Setup

Your Brand

Pages & Links

Your Audience

Competitors

Evaluation Focus

Review & Launch

Your Projects

AI Visibility Score

Competitor Analysis

Expert Insights

Actionable Fixes

Citation Decision Journey

Strategy Hub

Competitive Intelligence

Cross-Channel Intelligence

Action Queue

Strategy Brief

Keyword Explorer

Search Trend

People Also Ask

Page Keyword Performance

Generate AI Prompts from Keywords

Trending Searches

Configure Monitor

Query Templates

Recommendations

Add Page to Track

Choose Your Report Template

Recommended Tests: Based on your data

Brand Personas

Create New Focus Group Test

Recent Sessions

Focus Group Results

No AI Pages Yet

Industry Monitor

Cross-Industry Comparison

Brand Safety Hub

Brand Safety Preflight

Brand Safety Reports

Alert Rules

Alert History

AI Demand Intelligence

Key Takeaways

Search Interest Over Time

Query Comparison

Rising Topics

Sentiment Breakdown

Recent Reddit Threads

AI Model Ranking for Your Brand

Your Content

Content Scores

SEO Metadata

SERP Competitors

GEO Suggestions

Platform Insights

AI Topic Suggestions

Content Calendar

SERP Position Tracking

Agents

Build an Agent with AI

Knowledge Base Documents

Add Document

What is Generative Engine Optimization (GEO)?

The Science Behind AI Understanding

II. Data Collection

Crawling Implementation

Embedding Pipeline

Cosine Similarity — lib/shared/vector-math.js

Query Pipeline

III. AI Visibility Testing

LLM Testing Matrix

Competitor & Citation Extraction

Parallel Execution Architecture

Cosine Similarity — `lib/shared/vector-math.js`

Two-Pass Evaluation — `lib/ground-truth/comparison/evaluator.js`

K-means++ Clustering — `lib/focus-group/personas/clustering.js`

Discussion Engine — `lib/focus-group/discussion/engine.js`

Validation Thresholds — `lib/focus-group/validation/question-generator.js`

Brier Score Calibration — `lib/prediction-markets/calibration/brier.js`

Shadow Page Generation — `scripts/lighthouse-server.js`
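Of the techniques named in the list above, cosine similarity is the one the page describes in use: Topic Alignment is calculated from the cosine similarity between page embeddings and topic embeddings. The sketch below shows the standard formula only. It is not the contents of `lib/shared/vector-math.js`, which are not shown here; the function name and the choice to return 0 for a zero-length vector are assumptions.

```javascript
// Standard cosine similarity between two equal-length numeric vectors.
// Illustrative only; not taken from lib/shared/vector-math.js.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Assumption: treat a zero vector as having no similarity to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy three-dimensional "embeddings" for a page and a topic.
const pageEmbedding = [0.2, 0.7, 0.1];
const topicEmbedding = [0.25, 0.6, 0.15];
console.log(cosineSimilarity(pageEmbedding, topicEmbedding));
```

The result ranges from -1 to 1, with values closer to 1 indicating that the two vectors point in nearly the same direction.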