Documentation

Learn how to use Web Reveal to its full potential

Getting Started

Web Reveal offers two ways to scan websites for technology stacks:

1. Chrome Extension

Install the extension from the Chrome Web Store
Click the extension icon while browsing any website
Instantly see the detected technologies
Access scan history and advanced features

2. Web Scanner

Visit our website at webreveal.io
Enter any URL in the scanner
Click "Scan" to analyze the website
View categorized results instantly

Chrome Extension Guide

Installation

Visit the Chrome Web Store
Search for "Web Reveal"
Click "Add to Chrome"
Confirm the installation

Using the Extension

Navigate to any website you want to analyze
Click the Web Reveal icon in your browser toolbar
The extension will automatically detect technologies
Results are organized by category (CMS, Frameworks, Analytics, etc.)
Click on any technology for more details

Features

Real-time detection as you browse
Detailed technology information
Version detection when available
Export scan results

Web Scanner Guide

How to Scan

Enter the complete URL (including https://)
Click the "Scan" button
Wait for the analysis to complete (usually 2-5 seconds)
Review the categorized results
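
The "complete URL" requirement in the first step can be checked before a scan is submitted. The following is a minimal Python sketch, not Web Reveal's own validation: the function name normalize_scan_url and the choice to default a bare host to https are assumptions made for illustration.

```python
from urllib.parse import urlparse


def normalize_scan_url(raw: str) -> str:
    """Return a URL with an explicit http(s) scheme and a host, or raise ValueError."""
    candidate = raw.strip()
    # Assumption: a bare host such as "example.com" should default to https.
    if "://" not in candidate:
        candidate = "https://" + candidate
    parsed = urlparse(candidate)
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {parsed.scheme!r}")
    if not parsed.netloc:
        raise ValueError("URL has no host")
    return candidate
```

Calling normalize_scan_url("example.com") yields a URL that already carries the protocol, which is the form the scanner asks for.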

Understanding Results

Technologies are organized into categories:

CMS: Content Management Systems (WordPress, Shopify, etc.)
Frontend: JavaScript frameworks (React, Vue, Angular)
Backend: Server technologies (Node.js, PHP, etc.)
Hosting: Hosting providers and CDNs
Analytics: Tracking and analytics tools
Libraries: JavaScript libraries and dependencies

Scan History

All scans are automatically saved
Access your scan history from the dashboard
Search and filter previous scans
Export scan data as CSV

Tips & Best Practices

Always use the full URL including protocol (https://)
Some technologies may not be detected if a site uses heavy obfuscation
Results are more accurate on production sites than development environments
CDN-hosted libraries are automatically detected

API & Detection Methods for LLMs

Web Reveal exposes a simple JSON API that LLMs, agents, and automated pipelines can use to query the technology stack of any public website. All responses are structured and machine-readable.

Scan Endpoint

// POST a URL to receive a structured tech-stack report

POST https://webreveal.io/api/scan

Content-Type: application/json

{
  "url": "https://example.com"
}
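
The request above could be issued from Python roughly as follows. This is a sketch using only the standard library. The endpoint and JSON body come from the example above, while the helper names build_scan_request and scan and the 30-second timeout are assumptions.

```python
import json
import urllib.request

SCAN_ENDPOINT = "https://webreveal.io/api/scan"  # endpoint as documented above


def build_scan_request(target_url: str) -> urllib.request.Request:
    """Build the POST request without sending it."""
    body = json.dumps({"url": target_url}).encode("utf-8")
    return urllib.request.Request(
        SCAN_ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def scan(target_url: str, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response (performs network I/O)."""
    request = build_scan_request(target_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from the network call makes the payload easy to inspect or log before anything is sent.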

Response Structure

{
  "success": true,
  "totalFound": 14,
  "reportUrl": "/scan/example.com.html",
  "results": {
    "Frameworks": [
      { "name": "react", "version": "18.2.0" }
    ],
    "Analytics": [
      { "name": "google-analytics" }
    ],
    "CMS": [],
    "Hosting": [
      { "name": "vercel" }
    ]
  }
}
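
One way to consume this structure is to flatten the per-category arrays into rows. The sketch below assumes only the fields shown in the sample response; the function name flatten_results is hypothetical.

```python
import json
from typing import List, Optional, Tuple

# The sample response shown above, reproduced as a string.
SAMPLE_RESPONSE = """
{
  "success": true,
  "totalFound": 14,
  "reportUrl": "/scan/example.com.html",
  "results": {
    "Frameworks": [{ "name": "react", "version": "18.2.0" }],
    "Analytics": [{ "name": "google-analytics" }],
    "CMS": [],
    "Hosting": [{ "name": "vercel" }]
  }
}
"""


def flatten_results(response: dict) -> List[Tuple[str, str, Optional[str]]]:
    """Return (category, name, version) rows; version is None when absent."""
    rows = []
    for category, technologies in response.get("results", {}).items():
        for tech in technologies:
            rows.append((category, tech["name"], tech.get("version")))
    return rows


rows = flatten_results(json.loads(SAMPLE_RESPONSE))
```

Empty categories such as "CMS" simply contribute no rows, so callers do not need a special case for them.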

Detection Methods

Web Reveal uses five complementary detection strategies. Each method is applied independently and results are merged, so a technology can be confirmed by multiple signals:

1. HTML & DOM Analysis

Inspects the raw HTML of the page for generator meta tags, data attributes, class names, and script src paths that uniquely identify frameworks, CMS platforms, and libraries.

LLM note: Matched against patterns[] regex array per technology entry.
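
As an illustration of one of these signals, the generator meta tag can be read with Python's standard HTML parser. This is a sketch of the general technique, not Web Reveal's implementation; the class and function names are hypothetical.

```python
from html.parser import HTMLParser
from typing import List


class GeneratorMetaParser(HTMLParser):
    """Collect the content attribute of every <meta name="generator"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.generators: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser lowercases attribute names; values may be None.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "generator":
            self.generators.append(attr_map.get("content", ""))


def find_generators(html: str) -> List[str]:
    parser = GeneratorMetaParser()
    parser.feed(html)
    return parser.generators
```

A page containing a tag such as meta name="generator" content="WordPress 6.4" would yield that content string, which a signature pattern could then match.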

2. HTTP Header Inspection

Reads response headers such as X-Powered-By, Server, X-Generator, and caching headers to identify server-side frameworks, hosting providers, and CDNs.

LLM note: Mapped via headers{} key in signature entries.
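
Header matching might look like the sketch below. The exact shape of the headers{} key is not documented here, so the mapping of header name to regex, the three sample signatures, and the function name match_headers are all assumptions.

```python
import re
from typing import Dict, List

# Hypothetical signatures: header name (lower case) mapped to a regex on its value.
HEADER_SIGNATURES = [
    {"name": "php", "headers": {"x-powered-by": r"php"}},
    {"name": "nginx", "headers": {"server": r"nginx"}},
    {"name": "express", "headers": {"x-powered-by": r"express"}},
]


def match_headers(response_headers: Dict[str, str]) -> List[str]:
    """Return the names of signatures whose header pattern matches."""
    # HTTP header names are case-insensitive, so compare in lower case.
    lowered = {name.lower(): value for name, value in response_headers.items()}
    found = []
    for signature in HEADER_SIGNATURES:
        for header, pattern in signature["headers"].items():
            value = lowered.get(header)
            if value is not None and re.search(pattern, value, re.IGNORECASE):
                found.append(signature["name"])
                break
    return found
```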

3. JavaScript Global Detection

Scans inline scripts and external JS bundles for global variable names and module patterns (e.g. window.React, __NEXT_DATA__) that are emitted by popular libraries and frameworks.

LLM note: Matched against globals[] array per signature entry.
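
A static approximation of this check searches script text for each global name as a whole identifier. The sketch below is an assumption about how such matching could work, with a hypothetical function name. It does not execute any JavaScript.

```python
import re
from typing import Iterable, List


def find_globals(script_text: str, global_names: Iterable[str]) -> List[str]:
    """Return the global names that appear as whole identifiers in the text."""
    hits = []
    for name in global_names:
        # Reject matches embedded in a longer identifier (e.g. "ReactDOM").
        pattern = r"(?<![\w$])" + re.escape(name) + r"(?![\w$])"
        if re.search(pattern, script_text):
            hits.append(name)
    return hits
```

The lookarounds matter: without them, a name such as React would also match inside ReactDOM and could produce false positives.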

4. CSS Framework Fingerprinting

Detects utility-class naming conventions, CSS custom property namespaces, and stylesheet hrefs that are characteristic of Tailwind, Bootstrap, Bulma, and other CSS systems.

LLM note: Included in the patterns[] array targeting stylesheet URLs and inline class lists.
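
The stylesheet-href part of this method can be sketched as follows. The pattern table is illustrative only and is not taken from Web Reveal's detection library; the class and function names are hypothetical.

```python
import re
from html.parser import HTMLParser
from typing import Dict, List

# Illustrative filename patterns; real signatures may differ.
STYLESHEET_PATTERNS: Dict[str, str] = {
    "bootstrap": r"bootstrap(\.min)?\.css",
    "bulma": r"bulma(\.min)?\.css",
    "tailwind": r"tailwind(css)?(\.min)?\.css",
}


class StylesheetCollector(HTMLParser):
    """Collect href values from <link rel="stylesheet"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if "stylesheet" in attr_map.get("rel", "").lower().split():
            self.hrefs.append(attr_map.get("href", ""))


def detect_css_frameworks(html: str) -> List[str]:
    collector = StylesheetCollector()
    collector.feed(html)
    return [
        name
        for name, pattern in STYLESHEET_PATTERNS.items()
        if any(re.search(pattern, href, re.IGNORECASE) for href in collector.hrefs)
    ]
```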

5. DNS & Infrastructure Lookup

Performs DNS resolution to identify nameservers, CNAME targets, and MX records that indicate hosting platforms (Cloudflare, AWS, Render, Vercel) and email providers even when no HTML clues are present.

LLM note: Results appear under Hosting and Networking categories.
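
Once DNS records have been resolved, mapping them to providers is a suffix comparison. The sketch below takes already-resolved nameserver or CNAME targets as input, because Python's standard library does not query those record types directly. The suffix table is a small illustrative assumption, not Web Reveal's data.

```python
from typing import Dict, List

# Illustrative suffix table; a real detection library would be far larger.
PROVIDER_SUFFIXES: Dict[str, str] = {
    "ns.cloudflare.com": "cloudflare",
    "vercel-dns.com": "vercel",
    "onrender.com": "render",
    "cloudfront.net": "aws",
    "amazonaws.com": "aws",
}


def classify_dns_targets(targets: List[str]) -> List[str]:
    """Map resolved NS/CNAME hostnames to provider slugs, without duplicates."""
    providers: List[str] = []
    for target in targets:
        host = target.rstrip(".").lower()  # drop the trailing root dot
        for suffix, provider in PROVIDER_SUFFIXES.items():
            # Match on a label boundary so "notvercel-dns.com" is not a hit.
            if host == suffix or host.endswith("." + suffix):
                if provider not in providers:
                    providers.append(provider)
    return providers
```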

Technology Signature Schema

Each technology in the detection library follows this schema. LLMs can use this to understand why a technology was or was not detected:

{
  "name": "react",           // Unique slug identifier
  "displayName": "React",    // Human-readable name
  "subcategory": "Frameworks", // Category grouping
  "globals": ["React", "__REACT_DEVTOOLS_GLOBAL_HOOK__"],
  "patterns": [
    "react(?![a-zA-Z0-9-_])", // Regex matched against HTML/scripts
    "react\\.min\\.js"
  ]
}
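
To see why a signature did or did not fire, each globals and patterns entry can be tested separately against the page text. The sketch below reuses the React entry above. It assumes the patterns behave the same under Python's re module as in Web Reveal's own regex engine, which is not confirmed here, and explain_match is a hypothetical helper.

```python
import re
from typing import Dict, List

REACT_SIGNATURE = {
    "name": "react",
    "displayName": "React",
    "subcategory": "Frameworks",
    "globals": ["React", "__REACT_DEVTOOLS_GLOBAL_HOOK__"],
    "patterns": [r"react(?![a-zA-Z0-9-_])", r"react\.min\.js"],
}


def explain_match(signature: dict, page_text: str) -> Dict[str, List[str]]:
    """Report which globals and patterns of a signature match the page text."""
    matched_globals = [
        name
        for name in signature.get("globals", [])
        # Whole-identifier match, so "React" does not fire on "ReactDOM".
        if re.search(r"(?<![\w$])" + re.escape(name) + r"(?![\w$])", page_text)
    ]
    matched_patterns = [
        pattern
        for pattern in signature.get("patterns", [])
        if re.search(pattern, page_text)
    ]
    return {"globals": matched_globals, "patterns": matched_patterns}
```

An empty result for both keys suggests the technology left no recognizable trace in the fetched text, which is one reason a detection can be missing.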

Rate Limits & Usage

Free (unauthenticated): 20 scans per hour per IP address.
Authenticated Pro users: unlimited scans via credit-based billing.
Responses are cached per domain; rescanning a domain returns the latest detected stack for it.
The reportUrl field, when present, links to a permanent, publicly accessible HTML report at https://webreveal.io/scan/<domain>.html.
Every successful response from the public API includes X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset (UTC epoch) headers so agents can track quota usage. A Retry-After header (seconds) is included on 429 responses.
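
An agent could use these headers to decide how long to wait before its next scan. The sketch below assumes X-RateLimit-Reset is expressed in epoch seconds and that Retry-After is a number of seconds, as described above. The function name seconds_until_allowed is hypothetical.

```python
import time
from typing import Mapping, Optional


def seconds_until_allowed(
    status: int, headers: Mapping[str, str], now: Optional[float] = None
) -> float:
    """Return how many seconds to wait before sending the next request."""
    lowered = {name.lower(): value for name, value in headers.items()}
    # On 429, the server states the wait directly.
    if status == 429 and "retry-after" in lowered:
        return max(0.0, float(lowered["retry-after"]))
    # Assumption: a missing Remaining header means quota is still available.
    remaining = int(lowered.get("x-ratelimit-remaining", "1"))
    if remaining > 0:
        return 0.0
    reset_epoch = float(lowered.get("x-ratelimit-reset", "0"))
    current = time.time() if now is None else now
    return max(0.0, reset_epoch - current)
```

The optional now parameter exists only so the calculation can be exercised with a fixed clock.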

LLM Integration Tips

Parse the results object by category key. Each value is an array of detected technology objects with name, an optional version, and evidence.
Use totalFound to quickly gauge detection depth. Values below 5 may indicate that the site is blocking automated requests or serving a minimal response.
All detections are verified with multi-signal analysis, so only technologies with strong evidence are included in results.
For agent pipelines, prefer the permanent report URL (reportUrl) as a stable reference rather than re-scanning on every invocation.
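
The tips above can be combined into a small post-processing step. The following sketch assumes the relative reportUrl resolves against https://webreveal.io, as stated under Rate Limits & Usage, and uses the threshold of 5 from the tip on totalFound. The function name summarize_scan and the shape of its return value are hypothetical.

```python
from urllib.parse import urljoin

BASE_URL = "https://webreveal.io"
LOW_YIELD_THRESHOLD = 5  # from the totalFound tip above


def summarize_scan(response: dict) -> dict:
    """Derive a stable report link and a low-yield flag from a scan response."""
    report_path = response.get("reportUrl")
    return {
        # reportUrl is optional, so keep None when it is absent.
        "report": urljoin(BASE_URL, report_path) if report_path else None,
        "possibly_blocked": response.get("totalFound", 0) < LOW_YIELD_THRESHOLD,
        # Only categories that actually contain detections.
        "categories": sorted(
            key for key, value in response.get("results", {}).items() if value
        ),
    }
```

Storing the resolved report link lets a pipeline refer back to a scan without spending quota on a rescan.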

FAQ

How accurate are the results?

Most detections achieve 97% accuracy. Results are most reliable when sites expose standard tags, headers, or well-known script signatures.

What do you scan?

We look at the public page: HTML, meta tags, linked scripts, and key response headers. No login or authenticated areas are touched.

Does this crawl the whole site?

No. We scan the page you submit. Deep crawling is not performed to keep scans fast and respectful.

Why might something be missed?

Some technologies load conditionally, hide behind CDNs, or obfuscate assets. In those cases we may show partial matches or none.

Is it safe to run?

Yes. Scans are read-only and do not execute arbitrary code. They fetch the page like a normal browser request.

Can I export results?

You can copy results from the web scanner. For bulk exports, use the upcoming data portal or the Chrome extension.
