Documentation

Learn how to use Web Reveal to its full potential

Getting Started

Web Reveal offers two ways to scan websites for technology stacks:

1. Chrome Extension

  • Install the extension from the Chrome Web Store
  • Click the extension icon while browsing any website
  • Instantly see the detected technologies
  • Access scan history and advanced features

2. Web Scanner

  • Visit our website at webreveal.io
  • Enter any URL in the scanner
  • Click "Scan" to analyze the website
  • View categorized results instantly

Chrome Extension Guide

Installation

  1. Visit the Chrome Web Store
  2. Search for "Web Reveal"
  3. Click "Add to Chrome"
  4. Confirm the installation

Using the Extension

  • Navigate to any website you want to analyze
  • Click the Web Reveal icon in your browser toolbar
  • The extension will automatically detect technologies
  • Results are organized by category (CMS, Frameworks, Analytics, etc.)
  • Click on any technology for more details

Features

  • Real-time detection as you browse
  • Detailed technology information
  • Version detection when available
  • Export scan results

Web Scanner Guide

How to Scan

  1. Enter the complete URL (including https://)
  2. Click the "Scan" button
  3. Wait for the analysis to complete (usually 2-5 seconds)
  4. Review the categorized results

Understanding Results

Technologies are organized into categories:

  • CMS: Content Management Systems (WordPress, Shopify, etc.)
  • Frontend: JavaScript frameworks (React, Vue, Angular)
  • Backend: Server technologies (Node.js, PHP, etc.)
  • Hosting: Hosting providers and CDNs
  • Analytics: Tracking and analytics tools
  • Libraries: JavaScript libraries and dependencies

Scan History

  • All scans are automatically saved
  • Access your scan history from the dashboard
  • Search and filter previous scans
  • Export scan data as CSV

Tips & Best Practices

  • Always use the full URL including protocol (https://)
  • Some technologies may not be detected if a site uses heavy obfuscation
  • Results are more accurate on production sites than development environments
  • CDN-hosted libraries are automatically detected
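
As a rough illustration of the first tip, a client could check that a URL carries a protocol before submitting it. This is a minimal Python sketch; the function name is hypothetical, and accepting plain http:// alongside https:// is an assumption, since this page only mentions https://.

```python
from urllib.parse import urlsplit


def is_scannable_url(url: str) -> bool:
    """Return True when the URL has an explicit http(s) scheme and a host."""
    parts = urlsplit(url.strip())
    # Assumption: both http and https are acceptable to the scanner.
    return parts.scheme in ("http", "https") and bool(parts.netloc)


print(is_scannable_url("https://example.com"))
print(is_scannable_url("example.com"))
```

A bare domain such as example.com fails the check because it has no scheme, which is the case the tip warns about.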

API & Detection Methods for LLMs

Web Reveal exposes a simple JSON API that LLMs, agents, and automated pipelines can use to query the technology stack of any public website. All responses are structured and machine-readable.

Scan Endpoint

// POST a URL to receive a structured tech-stack report
POST https://webreveal.io/api/scan
Content-Type: application/json

{
  "url": "https://example.com"
}
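
The request above could be issued from Python with the standard library alone. This is a minimal sketch: the helper names build_scan_request and scan and the 30-second timeout are illustrative assumptions, while the endpoint, method, header, and body shape are taken from this page.

```python
import json
import urllib.request

SCAN_ENDPOINT = "https://webreveal.io/api/scan"  # endpoint as documented on this page


def build_scan_request(target_url: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for a scan."""
    body = json.dumps({"url": target_url}).encode("utf-8")
    return urllib.request.Request(
        SCAN_ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def scan(target_url: str, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response.

    The timeout value is an assumption, not a documented limit.
    """
    request = build_scan_request(target_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from sending keeps the network call in one place, which makes the request shape easy to inspect or test offline.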

Response Structure

{
  "success": true,
  "totalFound": 14,
  "reportUrl": "/scan/example.com.html",
  "results": {
    "Frameworks": [
      { "name": "react", "version": "18.2.0" }
    ],
    "Analytics": [
      { "name": "google-analytics" }
    ],
    "CMS": [],
    "Hosting": [
      { "name": "vercel" }
    ]
  }
}
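
One way to consume this structure is to flatten the per-category arrays into rows. The sketch below is illustrative Python; the function name is hypothetical, and the sample dictionary simply mirrors the abbreviated response shown above.

```python
from typing import Iterator, Optional, Tuple


def iter_technologies(response: dict) -> Iterator[Tuple[str, str, Optional[str]]]:
    """Yield (category, name, version) for every detected technology."""
    for category, entries in response.get("results", {}).items():
        for entry in entries:
            # "version" is optional, so fall back to None when it is absent.
            yield category, entry["name"], entry.get("version")


sample = {
    "success": True,
    "totalFound": 14,
    "reportUrl": "/scan/example.com.html",
    "results": {
        "Frameworks": [{"name": "react", "version": "18.2.0"}],
        "Analytics": [{"name": "google-analytics"}],
        "CMS": [],
        "Hosting": [{"name": "vercel"}],
    },
}

rows = list(iter_technologies(sample))
for category, name, version in rows:
    print(category, name, version or "(no version)")
```

Empty categories such as CMS contribute no rows, so callers do not need a special case for them.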

Detection Methods

Web Reveal uses five complementary detection strategies. Each method is applied independently, and the results are merged, so a technology can be confirmed by multiple signals:

1. HTML & DOM Analysis

Inspects the raw HTML of the page for generator meta tags, data attributes, class names, and script src paths that uniquely identify frameworks, CMS platforms, and libraries.

LLM note: Matched against patterns[] regex array per technology entry.
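
To make the idea concrete, here is a minimal Python sketch of matching a list of regex patterns against raw HTML. The two WordPress patterns and the case-insensitive matching are illustrative assumptions, not entries or behavior taken from Web Reveal's detection library.

```python
import re
from typing import List

# Hypothetical patterns for illustration: a generator meta tag and a script path.
WORDPRESS_PATTERNS = [
    r'<meta[^>]+name="generator"[^>]+content="WordPress',
    r"wp-content/",
]


def matched_patterns(html: str, patterns: List[str]) -> List[str]:
    """Return the patterns that occur somewhere in the HTML."""
    return [p for p in patterns if re.search(p, html, re.IGNORECASE)]


html = (
    '<html><head><meta name="generator" content="WordPress 6.4"></head>'
    '<body><script src="/wp-content/themes/x/app.js"></script></body></html>'
)
print(matched_patterns(html, WORDPRESS_PATTERNS))
```

Returning the matching patterns, rather than a bare boolean, keeps the evidence for a detection available to the caller.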

2. HTTP Header Inspection

Reads response headers such as X-Powered-By, Server, X-Generator, and caching headers to identify server-side frameworks, hosting providers, and CDNs.

LLM note: Mapped via headers{} key in signature entries.
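
A header check of this kind could look like the Python sketch below. The shape of the headers{} key is not specified on this page, so the sketch assumes it maps a header name to a regex for the header value; the function name and the Express example are likewise hypothetical.

```python
import re
from typing import Mapping


def match_headers(
    response_headers: Mapping[str, str],
    signature_headers: Mapping[str, str],
) -> bool:
    """Return True if any signature header matches a response header value."""
    # HTTP header names are case-insensitive, so compare them in lowercase.
    lowered = {name.lower(): value for name, value in response_headers.items()}
    for name, value_pattern in signature_headers.items():
        value = lowered.get(name.lower())
        if value is not None and re.search(value_pattern, value, re.IGNORECASE):
            return True
    return False


express_signature = {"X-Powered-By": "express"}  # assumed signature shape
print(match_headers({"x-powered-by": "Express", "Server": "nginx"}, express_signature))
```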

3. JavaScript Global Detection

Scans inline scripts and external JS bundles for global variable names and module patterns (e.g. window.React, __NEXT_DATA__) that are emitted by popular libraries and frameworks.

LLM note: Matched against globals[] array per signature entry.
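
As a sketch of how global names might be located in script text, the Python below searches for each name as a whole identifier, so that React does not match inside ReactDOM. The whole-identifier rule and the function name are assumptions for illustration; the page does not describe how Web Reveal matches globals internally.

```python
import re
from typing import List


def find_globals(script_text: str, global_names: List[str]) -> List[str]:
    """Return the global names that appear as whole identifiers in the script."""
    found = []
    for name in global_names:
        # Reject matches that are part of a longer JavaScript identifier.
        pattern = r"(?<![\w$])" + re.escape(name) + r"(?![\w$])"
        if re.search(pattern, script_text):
            found.append(name)
    return found


script = 'window.React = factory(); var __NEXT_DATA__ = {"page": "/"};'
print(find_globals(script, ["React", "__NEXT_DATA__", "Vue"]))
```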

4. CSS Framework Fingerprinting

Detects utility-class naming conventions, CSS custom property namespaces, and stylesheet hrefs that are characteristic of Tailwind, Bootstrap, Bulma, and other CSS systems.

LLM note: Included in the patterns[] array targeting stylesheet URLs and inline class lists.
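
Class-name fingerprinting can be approximated by counting class tokens that follow a framework's naming convention. The Python sketch below is a rough heuristic under stated assumptions: the two regexes (Tailwind-style responsive prefixes such as md:, and Bootstrap-style col-md-6 or btn-primary names) are illustrative and are not Web Reveal's actual patterns.

```python
import re
from typing import Dict

CLASS_ATTR = re.compile(r'class="([^"]*)"')

# Illustrative heuristics only; real signatures would be broader.
FRAMEWORK_CLASS_PATTERNS = {
    "tailwind": re.compile(r"^(?:sm|md|lg|xl|2xl):[a-z0-9-]+$"),
    "bootstrap": re.compile(r"^(?:col-(?:sm|md|lg|xl)-\d{1,2}|btn-(?:primary|secondary))$"),
}


def count_class_hits(html: str) -> Dict[str, int]:
    """Count class tokens that look characteristic of each CSS framework."""
    tokens = [tok for attr in CLASS_ATTR.findall(html) for tok in attr.split()]
    return {
        name: sum(1 for tok in tokens if pattern.match(tok))
        for name, pattern in FRAMEWORK_CLASS_PATTERNS.items()
    }


html = '<div class="md:px-4 lg:flex py-2"><a class="btn btn-primary col-md-6">x</a></div>'
print(count_class_hits(html))
```

A single hit is weak evidence, so a real detector would likely require several hits or a second signal before reporting a framework.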

5. DNS & Infrastructure Lookup

Performs DNS resolution to identify nameservers, CNAME targets, and MX records that indicate hosting platforms (Cloudflare, AWS, Render, Vercel) and email providers even when no HTML clues are present.

LLM note: Results appear under Hosting and Networking categories.
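
Once a CNAME target or nameserver has been resolved, mapping it to a provider can be a simple suffix lookup. The Python sketch below shows only that classification step and performs no DNS queries; the suffix table and function name are illustrative assumptions, not Web Reveal's actual mapping.

```python
from typing import Optional

# Hypothetical suffix table for illustration.
PROVIDER_SUFFIXES = {
    "vercel-dns.com": "vercel",
    "cloudflare.com": "cloudflare",
    "amazonaws.com": "aws",
    "onrender.com": "render",
}


def provider_for(hostname: str) -> Optional[str]:
    """Map a resolved CNAME target or nameserver to a provider slug."""
    host = hostname.rstrip(".").lower()  # drop the trailing dot of a fully qualified name
    for suffix, provider in PROVIDER_SUFFIXES.items():
        # Match on a label boundary so "notcloudflare.com" is not a hit.
        if host == suffix or host.endswith("." + suffix):
            return provider
    return None


print(provider_for("cname.vercel-dns.com."))
```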

Technology Signature Schema

Each technology in the detection library follows this schema. LLMs can use this to understand why a technology was or was not detected:

{
  "name": "react",           // Unique slug identifier
  "displayName": "React",    // Human-readable name
  "subcategory": "Frameworks", // Category grouping
  "globals": ["React", "__REACT_DEVTOOLS_GLOBAL_HOOK__"],
  "patterns": [
    "react(?![a-zA-Z0-9-_])", // Regex matched against HTML/scripts
    "react\\.min\\.js"
  ]
}
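
To reason about why a signature did or did not fire, a client could load an entry into a small structure and report which signals matched. This Python sketch is an assumption-laden illustration: the Signature class and its explain method are hypothetical, and globals are checked by plain substring search for brevity.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Signature:
    name: str
    display_name: str
    subcategory: str
    globals: List[str] = field(default_factory=list)
    patterns: List[re.Pattern] = field(default_factory=list)

    @classmethod
    def from_dict(cls, raw: dict) -> "Signature":
        """Build a Signature from a dictionary shaped like the schema above."""
        return cls(
            name=raw["name"],
            display_name=raw.get("displayName", raw["name"]),
            subcategory=raw.get("subcategory", ""),
            globals=list(raw.get("globals", [])),
            patterns=[re.compile(p) for p in raw.get("patterns", [])],
        )

    def explain(self, text: str) -> Dict[str, List[str]]:
        """Report which patterns and globals were found in the text."""
        return {
            "patterns": [p.pattern for p in self.patterns if p.search(text)],
            "globals": [g for g in self.globals if g in text],
        }


RAW = {
    "name": "react",
    "displayName": "React",
    "subcategory": "Frameworks",
    "globals": ["React", "__REACT_DEVTOOLS_GLOBAL_HOOK__"],
    "patterns": [r"react(?![a-zA-Z0-9-_])", r"react\.min\.js"],
}

signature = Signature.from_dict(RAW)
print(signature.explain('<script src="/static/react.min.js"></script>'))
```

An empty result for both keys means no signal from this entry was present in the text, which is the "why was it not detected" case.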

Rate Limits & Usage

  • Free (unauthenticated): 20 scans per hour per IP address.
  • Authenticated Pro users: unlimited scans via credit-based billing.
  • Responses are cached per domain: rescanning a domain returns the most recently detected stack for that domain.
  • The reportUrl field, when present, links to a permanent, publicly accessible HTML report at https://webreveal.io/scan/<domain>.html.
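
Unauthenticated clients may want to stay under the hourly limit on their own side. The Python sketch below tracks requests in a sliding one-hour window; whether the server counts with a sliding or a fixed window is not stated on this page, so treat this as a conservative assumption. The class name and the injectable clock are hypothetical.

```python
import time
from collections import deque
from typing import Callable, Deque


class HourlyBudget:
    """Client-side guard that allows at most `limit` scans per window."""

    def __init__(
        self,
        limit: int = 20,
        window_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        self._stamps: Deque[float] = deque()

    def try_acquire(self) -> bool:
        """Record a scan and return True, or return False if the budget is spent."""
        now = self._clock()
        # Forget scans that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self._window:
            self._stamps.popleft()
        if len(self._stamps) >= self._limit:
            return False
        self._stamps.append(now)
        return True


budget = HourlyBudget()
if budget.try_acquire():
    print("ok to scan")
```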

LLM Integration Tips

  • Parse the results object by category key. Each value is an array of detected technology objects with a name, an optional version, and evidence (the sample response above shows only name and version).
  • Use totalFound to gauge detection depth: a value below 5 may indicate that the site is blocking automated requests or serving a minimal response.
  • All detections are verified with multi-signal analysis; only technologies with strong evidence are included in results.
  • For agent pipelines, prefer the permanent report URL (reportUrl) as a stable reference rather than re-scanning on every invocation.
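
Two of these tips can be combined into a small post-processing step: resolving the relative reportUrl against the site origin and flagging low totalFound values. This is a minimal Python sketch; the function name and output keys are hypothetical, and the threshold of 5 is the one quoted in the tip above.

```python
from typing import Optional
from urllib.parse import urljoin

BASE_URL = "https://webreveal.io"
LOW_DETECTION_THRESHOLD = 5  # threshold quoted in the tips above


def summarize(response: dict) -> dict:
    """Derive a stable report link and a 'possibly blocked' hint from a scan response."""
    report_path: Optional[str] = response.get("reportUrl")
    return {
        # reportUrl is relative in the sample response, so resolve it against the origin.
        "report_url": urljoin(BASE_URL, report_path) if report_path else None,
        "possibly_blocked": response.get("totalFound", 0) < LOW_DETECTION_THRESHOLD,
    }


print(summarize({"totalFound": 14, "reportUrl": "/scan/example.com.html"}))
```

The low-count flag is only a hint, since a small static site can legitimately expose fewer than five technologies.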

FAQ

How accurate are the results?

Most detections achieve 97% accuracy. Results are most reliable when sites expose standard tags, headers, or well-known script signatures.

What do you scan?

We look at the public page (HTML, meta tags, linked scripts, and key response headers) and at public DNS records for the domain. No login or authenticated areas are touched.

Does this crawl the whole site?

No. We scan only the page you submit. To keep scans fast and respectful, we do not crawl deeper into the site.

Why might something be missed?

Some technologies load conditionally, hide behind CDNs, or obfuscate assets. In those cases we may show partial matches or none.

Is it safe to run?

Yes. Scans are read-only and do not execute arbitrary code. They fetch the page like a normal browser request.

Can I export results?

You can copy results from the web scanner. For bulk exports, use the upcoming data portal or the Chrome extension.