7.8 Internet Robots and Crawlers

The use of internet robots, crawlers and spiders has the potential to artificially inflate usage statistics. Only genuine, user-driven usage should be reported in COUNTER reports. Activity that is not initiated by human users (that is, activity initiated by autonomous systems without immediate human oversight or control) SHOULD NOT be included in COUNTER reports. This rule is a SHOULD NOT rather than a MUST NOT because rapidly changing technology and user behavior make it difficult to define activity that is not initiated by human users.

7.8.1 Traditional Bots and Crawlers

For compliance with the COUNTER Code of Practice, activity generated by traditional bots and crawlers MUST be excluded from all COUNTER reports. In the GitHub COUNTER Bots Repository, COUNTER provides a non-comprehensive list of user agent values that represent crawlers and bots that MUST be excluded. Any transaction with a user agent matching one on the list MUST NOT be included in COUNTER reports.
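As an illustration of the matching rule above, the following is a minimal sketch of how a report provider might filter transactions by user agent. It assumes that list entries are applied as case-insensitive regular expressions and that each transaction is a dictionary with a user_agent key. The sample patterns and the function names (compile_patterns, is_listed_bot, exclude_listed_bots) are hypothetical and are not taken from the bots repository. How to treat a missing user agent is also an assumption here; this sketch leaves such transactions in place.

```python
import re

# Hypothetical sample patterns for illustration only.
# A real implementation would load the current list from the bots repository.
SAMPLE_BOT_PATTERNS = [r"bot", r"crawl", r"spider", r"^curl/"]


def compile_patterns(patterns):
    """Compile each pattern once as a case-insensitive regular expression (assumed semantics)."""
    return [re.compile(p, re.IGNORECASE) for p in patterns]


def is_listed_bot(user_agent, compiled):
    """Return True if the user agent matches any compiled pattern."""
    if not user_agent:
        # Assumption: a missing or empty user agent is not treated as a list match.
        return False
    return any(p.search(user_agent) for p in compiled)


def exclude_listed_bots(transactions, compiled):
    """Keep only the transactions whose user agent does not match the list."""
    return [
        t for t in transactions
        if not is_listed_bot(t.get("user_agent", ""), compiled)
    ]


compiled = compile_patterns(SAMPLE_BOT_PATTERNS)
transactions = [
    {"item": "a", "user_agent": "Mozilla/5.0 (compatible; Googlebot/2.1)"},
    {"item": "b", "user_agent": "Mozilla/5.0 (X11; Linux x86_64) Firefox/126.0"},
    {"item": "c", "user_agent": "curl/8.4.0"},
]
kept = exclude_listed_bots(transactions, compiled)
print([t["item"] for t in kept])  # prints ['b']
```

Compiling the patterns once and reusing them avoids recompiling for every transaction, which matters when filtering large usage logs.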

The bots repository will be updated no more frequently than once per quarter, and no less frequently than once per year. Any new additions SHOULD be excluded as soon as possible, and MUST be excluded from the subsequent fix, feature or breaking release of the Code of Practice, as described under the continuous maintenance process in Section 12. Usage data from before the report provider implemented the updated bots repository does not need to be reprocessed to exclude the new additions, unless reprocessing would remove significant bot usage.
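To help decide whether reprocessing is worthwhile, a report provider could estimate how much historical usage the new additions would remove. The sketch below is one possible approach, not a COUNTER requirement: it computes the share of historical user agents that match newly added patterns. The function name share_matching_new_patterns, the pattern examplebot, and the case-insensitive regular expression matching are all assumptions. The Code of Practice does not define what counts as "significant", so the sketch returns the share and leaves that judgment to the report provider.

```python
import re


def share_matching_new_patterns(user_agents, new_patterns):
    """Return the fraction of user agents that match any newly added pattern."""
    # Assumption: patterns are applied as case-insensitive regular expressions.
    compiled = [re.compile(p, re.IGNORECASE) for p in new_patterns]
    total = 0
    matched = 0
    for user_agent in user_agents:
        total += 1
        if user_agent and any(p.search(user_agent) for p in compiled):
            matched += 1
    # Avoid division by zero when there is no historical usage to inspect.
    return matched / total if total else 0.0


# Hypothetical historical user agents, one per recorded transaction.
historical = [
    "Mozilla/5.0 (X11; Linux x86_64) Firefox/126.0",
    "ExampleBot/1.0 (+https://example.org/bot)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 14_4) Safari/605.1.15",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/125.0",
]
share = share_matching_new_patterns(historical, [r"examplebot"])
print(f"{share:.1%}")  # prints 25.0%
```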

Report providers SHOULD alert COUNTER to new bots and crawlers to add to the list by raising an issue in the bots repository.

Usage by other bots and crawlers that can be identified by the report provider MUST NOT be included in COUNTER reports.

Traditional bot and crawler activity is often detected by commercial web application firewall solutions (e.g. Cloudflare, AWS WAF, F5, Scamalytics, Barracuda Networks) or by open source alternatives. COUNTER supports the use of such solutions but does not require report providers to implement them, provided bot and crawler usage can still be effectively excluded from usage reports.

Note that the main Code of Practice takes precedence in the case of any conflicts between it and the bots repository.

7.8.2 Generative and Agentic Artificial Intelligence

Release 5.1 of the COUNTER Code of Practice was developed prior to the public availability of generative and agentic artificial intelligence. In 2025 COUNTER started developing best practice guidelines that will draw on the existing extensions protocol outlined in Section 11 to facilitate reporting on AI usage. Please refer to the Best Practice Guidance on the COUNTER website for details.