
What is Scrubnet?
Scrubnet is a machine-readable layer of the web designed from the ground up for AI agents and LLMs. It hosts data in optimised, structured formats with no UX bloat. Just clean, fast, and purposeful content.
Why Now?
- 👁️ Humans no longer consume most web content. Bots do.
- 📦 Traditional websites are bloated, redundant, and slow for bots.
- ⚡ Scrubnet provides direct access to structured, validated data optimised for crawling, indexing, and training.
Who It's For
- 🤖 LLM platforms and AI agents looking for faster access to verified knowledge.
- 🏢 Brands seeking visibility in AI-powered discovery systems.
- 📊 Researchers and engineers building the next generation of AI infrastructure.
Our Principles
- 🛡️ Neutral by design: Scrubnet is independent and unaffiliated with any AI platform.
- ⚙️ Machine-first: Built for bots, not browsers.
- 🔍 Transparency: Every data point is timestamped, traceable, and documented.
The Future We See
As AI replaces traditional search, Scrubnet becomes the structured foundation beneath it: a frictionless, signal-rich web layer tuned for intelligent systems. We're not just adapting to change. We're building what comes next.
Meet ScrubberDuck
ScrubberDuck is our lightweight web crawler, designed to extract clean, structured data from public pages to help large language models (LLMs) access better content.
It quietly visits websites, avoids unnecessary load, and respects all robots.txt rules.

If you see ScrubberDuck in your logs, it means you’ve asked to join us. It’s optimising your content for the future of the web. Thanks for letting us pass through.
User-Agent: ScrubberDuck/1.0 (+https://scrubnet.org) Clean web noise since 2025
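Site owners who want to see how their robots.txt rules would apply to this crawler can test them locally. The sketch below uses Python's standard `urllib.robotparser`. The robots.txt content, the `example.com` URLs, and the `is_allowed` helper are hypothetical and only for illustration, and it assumes the crawler is matched on the product token `ScrubberDuck` from the User-Agent string above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration; not taken from any real site.
ROBOTS_TXT = """\
User-agent: ScrubberDuck
Disallow: /private/

User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/products/1", "/private/data"):
        url = "https://example.com" + path
        print(path, is_allowed(ROBOTS_TXT, "ScrubberDuck", url))
```

With these example rules, ScrubberDuck may fetch everything except paths under `/private/`, while any crawler without its own rule group falls back to the `*` group and is blocked.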
We Are Planning Ahead
ScrubberDuck is just the beginning. It represents our first bridge between brands and Scrubnet, a lightweight crawler that ensures clean, structured data enters our network.
But our vision goes further. We’re planning on building APIs that will connect directly with brand CRMs, providing an ever-fresh and authenticated feed of content. We also plan to release plugins for common website platforms like Shopify and WordPress, which will optimise the content at the source and automatically send the clean files to Scrubnet.
On top of this, a dedicated portal on Scrubnet will let brands manage their data directly inside our platform. This direct integration will become essential in the future when websites fade into the background, and brands need only one thing: a reliable way to deliver their information straight into the AI systems people use every day.
Allowed Bots
Scrubnet is designed for trustworthy AI agents and search crawlers like:
- Googlebot – Google Search and Discover
- Google-Extended – Google's robots.txt token for opting content out of AI training
- GPTBot – OpenAI's web crawler
- ClaudeBot – Anthropic's crawler for Claude
- PerplexityBot – Perplexity AI's research assistant bot
- bingbot – Microsoft's Bing search crawler
- BingPreview – Bing's page preview bot
- CCBot – Common Crawl archive bot
- DuckDuckBot – DuckDuckGo's search engine crawler
- Applebot – Apple’s Siri and Spotlight crawler
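One way a server might recognise these crawlers is to look for their product tokens in the incoming User-Agent header. The sketch below is an assumption about how such a check could look, not a description of how Scrubnet does it. The `match_allowed_bot` function and the `ALLOWED_BOT_TOKENS` tuple are hypothetical names. Google-Extended is left out because it is a robots.txt control token and does not appear as a User-Agent string.

```python
from typing import Optional

# Product tokens from the list above (Google-Extended omitted, see note).
ALLOWED_BOT_TOKENS = (
    "Googlebot",
    "GPTBot",
    "ClaudeBot",
    "PerplexityBot",
    "bingbot",
    "BingPreview",
    "CCBot",
    "DuckDuckBot",
    "Applebot",
)


def match_allowed_bot(user_agent: str) -> Optional[str]:
    """Return the first listed token found in user_agent, or None."""
    ua = user_agent.lower()
    for token in ALLOWED_BOT_TOKENS:
        # Case-insensitive substring match on the product token.
        if token.lower() in ua:
            return token
    return None


if __name__ == "__main__":
    sample = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    print(match_allowed_bot(sample))
```

User-Agent headers are easy to spoof, so a match like this only identifies what a client claims to be. Confirming a crawler's identity needs a separate check, such as the verification methods each operator documents.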
Get Involved
Be part of the new web as an early adopter. Whether you’re a brand wanting your data included or building an AI that needs structured feeds, we’d love to hear from you.
Reach out at contact@scrubnet.org.
