3 Commits

Author SHA1 Message Date
Benjamin Bouvier f21706eb12 feat(data): add Meta's web indexer used for AI purposes (#1573)
This indexer is documented at
https://developers.facebook.com/docs/sharing/webmasters/web-crawlers. I
saw it crawl my entire Forgejo instance, so I suggest blocking it
widely.

Signed-off-by: Benjamin Bouvier <benjamin@bouvier.cc>
2026-04-21 16:56:23 -04:00
Timon de Groot 57c0b2b22c Add IP mapped Perplexity user agents (#1393)
Perplexity has some proper documentation available for their crawlers,
with published IP addresses: https://docs.perplexity.ai/guides/bots.

Signed-off-by: Timon de Groot <timon.degroot@team.blue>
2026-01-15 19:57:31 -05:00
Corry Haines de7dbfe6d6 Split up AI filtering files (#592)
* Split up AI filtering files

Create aggressive, moderate, and permissive policies so administrators can choose their AI/LLM stance.

The aggressive policy matches the existing default in Anubis.

Remove the `Google-Extended` flag from `ai-robots-txt.yaml`, as it doesn't appear in requests.

Rename `ai-robots-txt.yaml` to `ai-catchall.yaml`, as the file is no longer a copy of the source repo/file.

* chore: spelling

* chore: fix embeds

* chore: fix data includes

* chore: fix file name typo

* chore: Ignore READMEs in configs

* chore(lib/policy/config): go tool goimports -w

Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Xe Iaso <me@xeiaso.net>
2025-06-01 20:21:18 +00:00