Documentation
Learn how to use Web Reveal to its full potential
Getting Started
Web Reveal offers two ways to scan websites for technology stacks:
1. Chrome Extension
- Install the extension from the Chrome Web Store
- Click the extension icon while browsing any website
- Instantly see the detected technologies
- Access scan history and advanced features
2. Web Scanner
- Visit our website at webreveal.com
- Enter any URL in the scanner
- Click "Scan" to analyze the website
- View categorized results instantly
Chrome Extension Guide
Installation
- Visit the Chrome Web Store
- Search for "Web Reveal"
- Click "Add to Chrome"
- Confirm the installation
Using the Extension
- Navigate to any website you want to analyze
- Click the Web Reveal icon in your browser toolbar
- The extension will automatically detect technologies
- Results are organized by category (CMS, Frameworks, Analytics, etc.)
- Click on any technology for more details
Features
- Real-time detection as you browse
- Detailed technology information
- Version detection when available
- Export scan results
Web Scanner Guide
How to Scan
- Enter the complete URL (including https://)
- Click the "Scan" button
- Wait for the analysis to complete (usually 2-5 seconds)
- Review the categorized results
Understanding Results
Technologies are organized into categories:
- CMS: Content Management Systems (WordPress, Shopify, etc.)
- Frontend: JavaScript frameworks (React, Vue, Angular)
- Backend: Server technologies (Node.js, PHP, etc.)
- Hosting: Hosting providers and CDNs
- Analytics: Tracking and analytics tools
- Libraries: JavaScript libraries and dependencies
Scan History
- All scans are automatically saved
- Access your scan history from the dashboard
- Search and filter previous scans
- Export scan data as CSV
Tips & Best Practices
- Always use the full URL including protocol (https://)
- Some technologies may not be detected if a site uses heavy obfuscation
- Results are more accurate on production sites than development environments
- CDN-hosted libraries are automatically detected
API & Detection Methods for LLMs
Web Reveal exposes a simple JSON API that LLMs, agents, and automated pipelines can use to query the technology stack of any public website. All responses are structured and machine-readable.
Scan Endpoint
{
  "url": "https://example.com"
}
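As a minimal sketch, the request body can be built and validated before it is sent. The helper name `build_scan_payload` is hypothetical, and the endpoint path and HTTP method are not stated here, so the example stops at producing the JSON body. It also applies the guide's advice to always submit the full URL including the protocol.

```python
import json
from urllib.parse import urlparse


def build_scan_payload(target: str) -> str:
    """Return the JSON request body for a scan of `target`.

    Raises ValueError when the URL lacks an http(s) scheme or a host,
    mirroring the advice to submit the full URL including the protocol.
    """
    parsed = urlparse(target.strip())
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(
            f"expected a full URL such as https://example.com, got {target!r}"
        )
    return json.dumps({"url": parsed.geturl()})


payload = build_scan_payload("https://example.com")
```

Rejecting scheme-less input on the client side avoids spending a rate-limited scan on a request that is likely to fail.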
Response Structure
{
  "success": true,
  "totalFound": 14,
  "reportUrl": "/scan/example.com.html",
  "results": {
    "Frameworks": [
      { "name": "react", "version": "18.2.0" }
    ],
    "Analytics": [
      { "name": "google-analytics" }
    ],
    "CMS": [],
    "Hosting": [
      { "name": "vercel" }
    ]
  }
}
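One way to consume this structure, sketched in Python: walk the results object by category and flatten it into rows. The function name `flatten_results` is hypothetical. The sample is the response shown above, which lists fewer entries than its totalFound value and is presumably abridged.

```python
import json
from typing import Any, Optional

# The sample response from this section, copied verbatim.
SAMPLE_RESPONSE = """
{
  "success": true,
  "totalFound": 14,
  "reportUrl": "/scan/example.com.html",
  "results": {
    "Frameworks": [{"name": "react", "version": "18.2.0"}],
    "Analytics": [{"name": "google-analytics"}],
    "CMS": [],
    "Hosting": [{"name": "vercel"}]
  }
}
"""


def flatten_results(response: dict[str, Any]) -> list[tuple[str, str, Optional[str]]]:
    """Turn the per-category results object into (category, name, version) rows."""
    rows = []
    for category, technologies in response.get("results", {}).items():
        for tech in technologies:
            # "version" is optional, so fall back to None when it is absent.
            rows.append((category, tech["name"], tech.get("version")))
    return rows


rows = flatten_results(json.loads(SAMPLE_RESPONSE))
```

Empty categories such as CMS simply contribute no rows, so callers do not need a special case for them.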
Detection Methods
Web Reveal uses five complementary detection strategies. Each method is applied independently and results are merged, so a technology can be confirmed by multiple signals:
1. HTML & DOM Analysis
Inspects the raw HTML of the page for generator meta tags, data attributes, class names, and script src paths that uniquely identify frameworks, CMS platforms, and libraries.
LLM note: Matched against patterns[] regex array per technology entry.
2. HTTP Header Inspection
Reads response headers such as X-Powered-By, Server, X-Generator, and caching headers to identify server-side frameworks, hosting providers, and CDNs.
LLM note: Mapped via headers{} key in signature entries.
3. JavaScript Global Detection
Scans inline scripts and external JS bundles for global variable names and module patterns (e.g. window.React, __NEXT_DATA__) that are emitted by popular libraries and frameworks.
LLM note: Matched against globals[] array per signature entry.
4. CSS Framework Fingerprinting
Detects utility-class naming conventions, CSS custom property namespaces, and stylesheet hrefs that are characteristic of Tailwind, Bootstrap, Bulma, and other CSS systems.
LLM note: Included in the patterns[] array targeting stylesheet URLs and inline class lists.
5. DNS & Infrastructure Lookup
Performs DNS resolution to identify nameservers, CNAME targets, and MX records that indicate hosting platforms (Cloudflare, AWS, Render, Vercel) and email providers even when no HTML clues are present.
LLM note: Results appear under Hosting and Networking categories.
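The merge step described at the start of this section can be pictured with a short Python sketch. It is an illustration only, not Web Reveal's actual implementation: the method labels (`html`, `js_globals`, `dns`) and the function name `merge_detections` are assumptions made for the example.

```python
from collections import defaultdict


def merge_detections(per_method: dict[str, set[str]]) -> dict[str, list[str]]:
    """Map each technology slug to the sorted list of methods that reported it."""
    merged = defaultdict(set)
    for method, technologies in per_method.items():
        for tech in technologies:
            merged[tech].add(method)
    # Sort for stable output; more methods means more independent signals.
    return {tech: sorted(methods) for tech, methods in sorted(merged.items())}


# Hypothetical outputs from three of the five methods.
signals = merge_detections({
    "html": {"react", "tailwind"},
    "js_globals": {"react"},
    "dns": {"vercel"},
})
```

Here `react` is reported by two methods, which is the sense in which a technology can be confirmed by multiple signals.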
Technology Signature Schema
Each technology in the detection library follows this schema. LLMs can use this to understand why a technology was or was not detected:
{
  "name": "react",                 // Unique slug identifier
  "displayName": "React",          // Human-readable name
  "subcategory": "Frameworks",     // Category grouping
  "globals": ["React", "__REACT_DEVTOOLS_GLOBAL_HOOK__"],
  "patterns": [
    "react(?![a-zA-Z0-9-_])",      // Regex matched against HTML/scripts
    "react\\.min\\.js"
  ]
}
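To make the schema concrete, the sketch below checks a page against the React signature shown above. It assumes case-sensitive regex search over the raw HTML and a precomputed set of global names; the real matcher may differ, and `match_signature` is a hypothetical helper.

```python
import re

# The signature from this section, written as a Python dict.
REACT_SIGNATURE = {
    "name": "react",
    "displayName": "React",
    "subcategory": "Frameworks",
    "globals": ["React", "__REACT_DEVTOOLS_GLOBAL_HOOK__"],
    "patterns": [r"react(?![a-zA-Z0-9-_])", r"react\.min\.js"],
}


def match_signature(signature: dict, html: str, page_globals: set[str]) -> list[str]:
    """Return evidence strings for a signature; an empty list means no match."""
    evidence = []
    for name in signature.get("globals", []):
        if name in page_globals:
            evidence.append(f"global:{name}")
    for pattern in signature.get("patterns", []):
        if re.search(pattern, html):
            evidence.append(f"pattern:{pattern}")
    return evidence


hit = match_signature(
    REACT_SIGNATURE, '<script src="/static/react.min.js"></script>', {"React"}
)
miss = match_signature(REACT_SIGNATURE, '<div class="reactive-widget">', set())
```

The negative lookahead in the first pattern is what keeps a class name like `reactive-widget` from counting as React, which helps explain why a technology was not detected.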
Rate Limits & Usage
- Free (unauthenticated): 20 scans per hour per IP address.
- Authenticated Pro users: unlimited scans via credit-based billing.
- Responses are cached per domain — rescanning a domain always returns the latest detected stack.
- The reportUrl field, when present, links to a permanent, publicly accessible HTML report at https://webreveal.io/scan/<domain>.html.
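A client can track its own usage to stay under the free-tier limit. The sketch below uses a sliding one-hour window; whether the server counts with a sliding or a fixed window is not stated, so treat this as a conservative assumption. The class name `ScanBudget` is hypothetical.

```python
import time
from collections import deque
from typing import Callable


class ScanBudget:
    """Sliding-window counter for staying under a scans-per-window limit."""

    def __init__(
        self,
        limit: int = 20,
        window_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without waiting
        self._timestamps = deque()

    def try_acquire(self) -> bool:
        """Record a scan and return True if one is allowed right now."""
        now = self._clock()
        # Drop scans that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.limit:
            return False
        self._timestamps.append(now)
        return True
```

Call `try_acquire()` before each scan and skip or delay the request when it returns False.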
LLM Integration Tips
- Parse the results object by category key. Each value is an array of detected technology objects with name, optional version, and evidence.
- Use totalFound to quickly gauge detection depth. Values below 5 indicate the site may be blocking automated requests or serving a minimal response.
- All detections are verified with multi-signal analysis. Only technologies with strong evidence are included in results.
- For agent pipelines, prefer the permanent report URL (reportUrl) as a stable reference rather than re-scanning on every invocation.
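These tips can be combined into a small triage step, sketched here in Python. The helper `summarize_scan` and its output keys are hypothetical. The base host is taken from the report URL pattern in Rate Limits & Usage, and the threshold of 5 comes from the totalFound tip.

```python
from urllib.parse import urljoin

REPORT_BASE = "https://webreveal.io"  # host from the report URL pattern in Rate Limits & Usage
LOW_DETECTION_THRESHOLD = 5           # below this, the scan may have been blocked


def summarize_scan(response: dict) -> dict:
    """Reduce a scan response to the fields an agent usually needs first."""
    report_path = response.get("reportUrl")
    total = response.get("totalFound", 0)
    return {
        "ok": bool(response.get("success")),
        "possibly_blocked": total < LOW_DETECTION_THRESHOLD,
        # reportUrl is only present sometimes, so keep None when it is missing.
        "report_url": urljoin(REPORT_BASE, report_path) if report_path else None,
    }


summary = summarize_scan(
    {"success": True, "totalFound": 14, "reportUrl": "/scan/example.com.html"}
)
```

An agent can store `report_url` and reuse it on later invocations instead of spending another scan.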
FAQ
How accurate are the results?
Most detections achieve 97% accuracy. Results are most reliable when sites expose standard tags, headers, or well-known script signatures.
What do you scan?
We look at the public page: HTML, meta tags, linked scripts, and key response headers. No login or authenticated areas are touched.
Does this crawl the whole site?
No. We scan the page you submit. Deep crawling is not performed to keep scans fast and respectful.
Why might something be missed?
Some technologies load conditionally, hide behind CDNs, or obfuscate assets. In those cases we may show partial matches or none.
Is it safe to run?
Yes. Scans are read-only and do not execute arbitrary code. They fetch the page like a normal browser request.
Can I export results?
You can copy results from the web scanner. For bulk exports, use the upcoming data portal or the Chrome extension.