mirror of
https://github.com/TecharoHQ/anubis.git
synced 2026-04-05 16:28:17 +00:00
* Add Wikimedia Foundation citoid services file

  Wikimedia Foundation runs a service called citoid, which retrieves citation metadata from URLs in order to create formatted citations. This file contains the IP ranges allocated to the WMF (https://wikitech.wikimedia.org/wiki/IP_and_AS_allocations) from which the services make requests, as well as regexes for the User-Agents of both services used to generate citations (citoid, and Zotero's translation-server, which citoid also makes requests to in order to generate the metadata).

  Signed-off-by: Marielle Volz <marielle.volz@gmail.com>

* Add Wikimedia Citoid crawler to allowed list

  Signed-off-by: Marielle Volz <marielle.volz@gmail.com>

* chore: update spelling

  Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Marielle Volz <marielle.volz@gmail.com>
Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Xe Iaso <me@xeiaso.net>
13 lines
505 B
YAML
- import: (data)/crawlers/googlebot.yaml
- import: (data)/crawlers/applebot.yaml
- import: (data)/crawlers/bingbot.yaml
- import: (data)/crawlers/duckduckbot.yaml
- import: (data)/crawlers/qwantbot.yaml
- import: (data)/crawlers/internet-archive.yaml
- import: (data)/crawlers/kagibot.yaml
- import: (data)/crawlers/marginalia.yaml
- import: (data)/crawlers/mojeekbot.yaml
- import: (data)/crawlers/commoncrawl.yaml
- import: (data)/crawlers/wikimedia-citoid.yaml
- import: (data)/crawlers/yandexbot.yaml
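
The commit message describes the imported citoid file as pairing WMF IP ranges with User-Agent regexes. As a rough sketch of what such an imported crawler file might look like, the fragment below uses the rule fields commonly seen in Anubis bot policies (`name`, `user_agent_regex`, `action`, `remote_addresses`). The rule name, the regex, and the address ranges are hypothetical placeholders (the ranges are reserved documentation blocks), not the actual contents of `wikimedia-citoid.yaml`.

```yaml
# Hypothetical sketch only: not the real wikimedia-citoid.yaml.
# The rule name and regex are illustrative assumptions.
- name: example-citation-service
  # Assumed User-Agent pattern; the real file defines its own regexes.
  user_agent_regex: ExampleCitationBot/
  action: ALLOW
  # Placeholder ranges from the reserved documentation blocks
  # (RFC 5737 / RFC 3849), standing in for the operator's published allocations.
  remote_addresses:
    - 192.0.2.0/24
    - 2001:db8::/32
```

Pairing the User-Agent match with an address allowlist means a client that merely spoofs the User-Agent from an unrelated network would not match the rule.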