Rendered link.
This is a skeleton for an idea I've had recently. I fully expect it to need revisions and expansion before it's production ready, so please feel free to propose changes.
An alternative solution I was considering was to advertise a plain CRAWLER token; bots that detect it could then execute a CRAWLER <name> command and get back a response saying whether that specific crawler is allowed on the network. I'm not sure if that's overengineering things though.
Problem
It's very hard to find IRC channels because there's no useful, comprehensive database of channels. A few exist (e.g. netsplit) but they rely on admins manually adding their networks, which isn't great.
It's possible to crawl the entire address space for networks (and IRCStats currently does this) to collect data, but many IRC admins have historically resisted making that information public for privacy reasons.
Solution
This specification adds a way for networks to declare that they are okay with bots crawling them. It also allows them to specify how often they'd like to be crawled. Networks with privacy concerns can opt out of scanning by not making the declaration.
I've put a WIP module with support for this on the InspIRCd Testnet (testnet.inspircd.org).
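To make the opt-in idea a bit more concrete, here is a rough sketch of how a crawler might check for such a declaration before scanning. Everything specific in it is an assumption for illustration only: it assumes the declaration arrives as an ISUPPORT (005) token, that the token is named CRAWLER, and that an optional value gives a minimum crawl interval in seconds. The actual token name and value format are whatever the spec ends up defining.

```python
from typing import Dict, Optional, Tuple


def parse_isupport(line: str) -> Dict[str, Optional[str]]:
    """Parse the tokens of a single RPL_ISUPPORT (005) line into a dict."""
    parts = line.split()
    # Drop the optional ":server.name" prefix.
    if parts and parts[0].startswith(":"):
        parts = parts[1:]
    if len(parts) < 2 or parts[0] != "005":
        return {}
    tokens: Dict[str, Optional[str]] = {}
    for part in parts[2:]:  # skip "005" and the target nick
        if part.startswith(":"):
            break  # start of the trailing human-readable text
        if part.startswith("-"):
            continue  # negated token, ignored in this sketch
        key, sep, value = part.partition("=")
        tokens[key] = value if sep else None
    return tokens


def crawl_policy(tokens: Dict[str, Optional[str]]) -> Tuple[bool, Optional[int]]:
    """Return (allowed, min_interval_seconds) for a hypothetical CRAWLER token.

    No token means the network has not opted in, so the crawler stays away.
    """
    if "CRAWLER" not in tokens:
        return (False, None)
    value = tokens["CRAWLER"]
    if value is None or not value.isdigit():
        return (True, None)  # opted in, no usable interval given
    return (True, int(value))


# Hypothetical 005 line; the CRAWLER=86400 form is an assumption, not the spec.
example = (
    ":irc.example.org 005 crawlerbot CHANTYPES=# CRAWLER=86400 "
    "NETWORK=ExampleNet :are supported by this server"
)
policy = crawl_policy(parse_isupport(example))
```

The important design point is the default: a network that advertises nothing is treated as not crawlable, which is what lets privacy-conscious networks opt out by doing nothing.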