Scrubnet
The Internet's Clean Layer for AI Systems
What is Scrubnet?
Scrubnet is a machine-readable layer of the web designed from the ground up for AI agents and LLMs. It serves optimised, structured data with no UX bloat. Just clean, fast, and purposeful content.
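As a rough illustration, here is what consuming that layer could look like. The endpoint path and field names below are hypothetical, not a published Scrubnet API:

    import json
    from urllib.request import urlopen

    # Hypothetical sketch: fetch one structured record from a Scrubnet-style
    # endpoint. The path and record schema are assumptions for illustration.
    with urlopen("https://scrubnet.org/v1/records/example") as resp:
        record = json.load(resp)

    print(record["timestamp"])  # every data point is timestamped
    print(record["source"])     # and traceable back to its origin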
Why Now?
- Humans no longer consume most web content. Bots do.
- Traditional websites are bloated, redundant, and slow for bots.
- Scrubnet provides direct access to structured, validated data optimised for crawling, indexing, and training.
Who It's For
- LLM platforms and AI agents looking for faster, cleaner access to knowledge.
- Brands seeking visibility in AI-powered discovery systems.
- Researchers and engineers building the next generation of AI infrastructure.
Our Principles
- Neutral by design: Scrubnet is independent and unaffiliated with any AI platform.
- Machine-first: Built for bots, not browsers.
- Transparency: Every data point is timestamped, traceable, and documented.
The Future We See
As AI replaces traditional search, Scrubnet becomes the structured foundation beneath it: a frictionless, signal-rich web layer tuned for intelligent systems. We're not just adapting to change. We're building what comes next.
Allowed Bots
Scrubnet is designed for trustworthy AI agents and search crawlers. The following bots, among others, are explicitly allowed to access our structured data endpoints:
- Googlebot: Google Search and Discover
- Google-Extended: AI training exclusion support
- GPTBot: OpenAI's web crawler
- ClaudeBot: Anthropic's crawler for Claude
- PerplexityBot: Perplexity AI's research assistant bot
- bingbot: Microsoft's Bing search crawler
- BingPreview: Bing's page preview bot
- CCBot: Common Crawl archive bot
- DuckDuckBot: DuckDuckGo's search engine crawler
- Applebot: Apple's Siri and Spotlight crawler
Only trustworthy agents, verified against their official IP ranges, are allowed to access protected data paths.
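Scrubnet's exact verification pipeline isn't described here, but a standard technique for this is forward-confirmed reverse DNS, which Google and Microsoft both document for verifying their crawlers. A minimal sketch, with an illustrative agent-to-domain table:

    import socket

    # Reverse-DNS suffixes published by the crawler operators. The entries
    # here are illustrative; a real table would cover every allowed bot.
    ALLOWED_SUFFIXES = {
        "Googlebot": (".googlebot.com", ".google.com"),
        "bingbot": (".search.msn.com",),
    }

    def verify_crawler(ip: str, claimed_agent: str) -> bool:
        suffixes = ALLOWED_SUFFIXES.get(claimed_agent)
        if not suffixes:
            return False
        try:
            host, _, _ = socket.gethostbyaddr(ip)  # reverse DNS lookup
        except socket.herror:
            return False
        if not host.endswith(suffixes):
            return False
        # Forward-confirm: the name must resolve back to the original IP,
        # otherwise the PTR record could simply be spoofed.
        try:
            forward_ips = {info[4][0] for info in socket.getaddrinfo(host, None)}
        except socket.gaierror:
            return False
        return ip in forward_ips

Some operators, OpenAI's GPTBot among them, publish static IP ranges instead, which can be matched directly.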
Meet ScrubberDuck
ScrubberDuck is our lightweight web crawler, designed to extract clean, structured data from public pages to help large language models (LLMs) access better content.
It quietly visits websites, avoids unnecessary load, and respects all robots.txt rules.
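ScrubberDuck's internals aren't published here, but that robots.txt check is standard and easy to sketch with Python's built-in parser; the function below is illustrative:

    from urllib import robotparser
    from urllib.parse import urlsplit

    USER_AGENT = "ScrubberDuck"

    def may_fetch(url: str) -> bool:
        # A polite crawler downloads and parses the site's robots.txt before
        # requesting any page, then honours its Allow/Disallow rules.
        parts = urlsplit(url)
        rp = robotparser.RobotFileParser()
        rp.set_url(f"{parts.scheme}://{parts.netloc}/robots.txt")
        rp.read()
        return rp.can_fetch(USER_AGENT, url)

    print(may_fetch("https://example.com/some/page"))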

If you see ScrubberDuck in your logs, rest assured: it's just cleaning up the noise for the future of the web. Thanks for letting us pass through.
User-Agent: ScrubberDuck/1.0 (+https://scrubnet.org)
Cleaning web noise since 2025.
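Site owners who prefer to keep ScrubberDuck out of part of their site can use ordinary robots.txt rules; assuming the product token above, a directive like this would be honoured (the path is a placeholder):

    User-agent: ScrubberDuck
    Disallow: /private/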
Get Involved
Want your data included? Or are you building an AI that needs structured feeds? Reach out at [email protected]