Files
anubis-mirror/data/crawlers/_allow-good.yaml
Marielle Volz 24857f430f feat(data): add Citoid to good bots list (#1524)
* Add Wikimedia Foundation citoid services file

Wikimedia Foundation runs a service called citoid which retrieves citation metadata from urls in order to create formatted citations. 

This file contains the ip ranges allocated to the WMF (https://wikitech.wikimedia.org/wiki/IP_and_AS_allocations) from which the services make requests, as well as regex for the User-Agents from both services used to generate citations (citoid, and Zotero's translation-server which citoid makes requests to as well in order to generate the metadata).

Signed-off-by: Marielle Volz <marielle.volz@gmail.com>

* Add Wikimedia Citoid crawler to allowed list

Signed-off-by: Marielle Volz <marielle.volz@gmail.com>

* chore: update spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Marielle Volz <marielle.volz@gmail.com>
Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Xe Iaso <me@xeiaso.net>
2026-03-20 11:13:26 +00:00

13 lines
505 B
YAML

- import: (data)/crawlers/googlebot.yaml
- import: (data)/crawlers/applebot.yaml
- import: (data)/crawlers/bingbot.yaml
- import: (data)/crawlers/duckduckbot.yaml
- import: (data)/crawlers/qwantbot.yaml
- import: (data)/crawlers/internet-archive.yaml
- import: (data)/crawlers/kagibot.yaml
- import: (data)/crawlers/marginalia.yaml
- import: (data)/crawlers/mojeekbot.yaml
- import: (data)/crawlers/commoncrawl.yaml
- import: (data)/crawlers/wikimedia-citoid.yaml
- import: (data)/crawlers/yandexbot.yaml