mirror of
https://github.com/TecharoHQ/anubis.git
synced 2026-04-05 16:28:17 +00:00
Compare commits
154 Commits
Xe/docker-
...
Xe/promise
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
b07e93289e | ||
|
|
3e1aaa6273 | ||
|
|
dce7ed2405 | ||
|
|
03758405d3 | ||
|
|
eb78ccc30c | ||
|
|
4156f84020 | ||
|
|
76dcd21582 | ||
|
|
6b639cd911 | ||
|
|
a0aba2d74a | ||
|
|
b485499125 | ||
|
|
300720f030 | ||
|
|
d6298adc6d | ||
|
|
1a9d8fb0cf | ||
|
|
36e25ff5f3 | ||
|
|
c59b7179c3 | ||
|
|
59515ed669 | ||
|
|
4d6b578f93 | ||
|
|
2915c1d209 | ||
|
|
68b653b099 | ||
|
|
509a4f3ce8 | ||
|
|
5c4d8480e6 | ||
|
|
132b2ed853 | ||
|
|
d28991ce8d | ||
|
|
0fd4bb81b8 | ||
|
|
603c68fd54 | ||
|
|
c8f2eb1185 | ||
|
|
f6b94dca98 | ||
|
|
6d8b98eb3d | ||
|
|
b9d8275234 | ||
|
|
c2cc1df172 | ||
|
|
735b2ceb14 | ||
|
|
2cb57fc247 | ||
|
|
61ce581f36 | ||
|
|
3f6750ac7d | ||
|
|
25d75b352a | ||
|
|
de17823bc7 | ||
|
|
29622e605d | ||
|
|
9fa1795db7 | ||
|
|
fbf69680f5 | ||
|
|
c74de19532 | ||
|
|
6dc726013a | ||
|
|
02304e8f3c | ||
|
|
607c9791d8 | ||
|
|
6b67be86a1 | ||
|
|
e02f017153 | ||
|
|
66b39f64af | ||
|
|
944fd25924 | ||
|
|
fa3fbfb0a5 | ||
|
|
3c739c1305 | ||
|
|
cc56baa5c7 | ||
|
|
053d29e0b6 | ||
|
|
a668095c22 | ||
|
|
1c4a1aec4a | ||
|
|
5b8b6d1c94 | ||
|
|
0cb6ef76e1 | ||
|
|
a900e98b8b | ||
|
|
e79cd93b61 | ||
|
|
d17fc6a174 | ||
|
|
95768cb70f | ||
|
|
ca61b8a05f | ||
|
|
1ea1157cd7 | ||
|
|
44ae5f2e2b | ||
|
|
ea2e76c6ee | ||
|
|
4ea0add50d | ||
|
|
289c802a0b | ||
|
|
543b942be1 | ||
|
|
edbe1dcfd6 | ||
|
|
94db16c0df | ||
|
|
c2f46907a1 | ||
|
|
6fa5b8e4e0 | ||
|
|
f98750b038 | ||
|
|
7d0c58d1a8 | ||
|
|
e870ede120 | ||
|
|
592d1e3dfc | ||
|
|
f6254b4b98 | ||
|
|
d19026d693 | ||
|
|
7b72c790ab | ||
|
|
719a1409ca | ||
|
|
890f21bf47 | ||
|
|
93bfe910d8 | ||
|
|
19d8de784b | ||
|
|
dff2176beb | ||
|
|
506d8817d5 | ||
|
|
d0fae02d05 | ||
|
|
845095c3f6 | ||
|
|
2f1e78cc6c | ||
|
|
7c0996448a | ||
|
|
d7a758f805 | ||
|
|
c121896f9c | ||
|
|
888b7d6e77 | ||
|
|
0e43138324 | ||
|
|
c981c23f7e | ||
|
|
9f0c5e974e | ||
|
|
292c470ada | ||
|
|
12453fdc00 | ||
|
|
f5b3bf81bc | ||
|
|
1820649987 | ||
|
|
14eeeb56d6 | ||
|
|
d9e0fbe905 | ||
|
|
6aa17532da | ||
|
|
b1edf84a7c | ||
|
|
d47a3406db | ||
|
|
ff5991b5cf | ||
|
|
19f78f37ad | ||
|
|
b0b0a5c08a | ||
|
|
261306dc63 | ||
|
|
3520421757 | ||
|
|
ad5430612f | ||
|
|
c2423d0688 | ||
|
|
a1b7d2ccda | ||
|
|
7cf6ac5de6 | ||
|
|
59f5b07281 | ||
|
|
1562f88c35 | ||
|
|
15bd9b6a44 | ||
|
|
1ca531b930 | ||
|
|
f9259299b9 | ||
|
|
16a4e04027 | ||
|
|
8c79870edb | ||
|
|
060b10ea2d | ||
|
|
4c74934e9f | ||
|
|
5870f7072c | ||
|
|
3c1d95d61e | ||
|
|
ab801a3597 | ||
|
|
ecc716940e | ||
|
|
4948036f39 | ||
|
|
7aa732c700 | ||
|
|
226cf36bf7 | ||
|
|
1d5fa49eb0 | ||
|
|
97c1d4f353 | ||
|
|
244f1c505a | ||
|
|
ae4d3b0ce5 | ||
|
|
e60c43cdd2 | ||
|
|
b2b2679bae | ||
|
|
e2b46fc5e7 | ||
|
|
3437e575d4 | ||
|
|
ae064be710 | ||
|
|
e3826df3ab | ||
|
|
823d1be5d1 | ||
|
|
0c6a820372 | ||
|
|
81f6380dd4 | ||
|
|
e5455c02d8 | ||
|
|
1d8033d69e | ||
|
|
e0781e4560 | ||
|
|
7a195f1595 | ||
|
|
2904ff974b | ||
|
|
3b3080d497 | ||
|
|
60ba8e9557 | ||
|
|
14c80483a9 | ||
|
|
d1452b6d39 | ||
|
|
5e95da6b6c | ||
|
|
988fc0941b | ||
|
|
f5140ae57b | ||
|
|
bbdee34f37 | ||
|
|
6e2eeb9e65 |
@@ -9,4 +9,4 @@ exclude_dir = ["var", "vendor", "docs", "node_modules"]
|
||||
|
||||
[logger]
|
||||
time = true
|
||||
# to change flags at runtime, prepend with -- e.g. $ air -- --target http://localhost:3000 --difficulty 20 --use-remote-address
|
||||
# to change flags at runtime, prepend with -- e.g. $ air -- --target http://localhost:3000 --difficulty 20 --use-remote-address
|
||||
|
||||
12
.devcontainer/Dockerfile
Normal file
12
.devcontainer/Dockerfile
Normal file
@@ -0,0 +1,12 @@
|
||||
FROM ghcr.io/xe/devcontainer-base/pre/go
|
||||
|
||||
WORKDIR /app
|
||||
|
||||
COPY go.mod go.sum package.json package-lock.json ./
|
||||
RUN apt-get update \
|
||||
&& apt-get -y install zstd brotli redis \
|
||||
&& mkdir -p /home/vscode/.local/share/fish \
|
||||
&& chown -R vscode:vscode /home/vscode/.local/share/fish \
|
||||
&& chown -R vscode:vscode /go
|
||||
|
||||
CMD ["/usr/bin/sleep", "infinity"]
|
||||
13
.devcontainer/README.md
Normal file
13
.devcontainer/README.md
Normal file
@@ -0,0 +1,13 @@
|
||||
# Anubis Dev Container
|
||||
|
||||
Anubis offers a [development container](https://containers.dev/) image in order to make it easier to contribute to the project. This image is based on [Xe/devcontainer-base/go](https://github.com/Xe/devcontainer-base/tree/main/src/go), which is based on Debian Bookworm with the following customizations:
|
||||
|
||||
- [Fish](https://fishshell.com/) as the shell complete with a custom theme
|
||||
- [Go](https://go.dev) at the most recent stable version
|
||||
- [Node.js](https://nodejs.org/en) at the most recent stable version
|
||||
- [Atuin](https://atuin.sh/) to sync shell history between your host OS and the development container
|
||||
- [Docker](https://docker.com) to manage and build Anubis container images from inside the development container
|
||||
- [Ko](https://ko.build/) to build production-ready Anubis container images
|
||||
- [Neovim](https://neovim.io/) for use with Git
|
||||
|
||||
This development container is tested and known to work with [Visual Studio Code](https://code.visualstudio.com/). If you run into problems with it outside of VS Code, please file an issue and let us know what editor you are using.
|
||||
28
.devcontainer/devcontainer.json
Normal file
28
.devcontainer/devcontainer.json
Normal file
@@ -0,0 +1,28 @@
|
||||
// For format details, see https://aka.ms/devcontainer.json. For config options, see the
|
||||
// README at: https://github.com/devcontainers/templates/tree/main/src/debian
|
||||
{
|
||||
"name": "Dev",
|
||||
"dockerComposeFile": [
|
||||
"./docker-compose.yaml"
|
||||
],
|
||||
"service": "workspace",
|
||||
"workspaceFolder": "/workspace/anubis",
|
||||
"postStartCommand": "bash ./.devcontainer/poststart.sh",
|
||||
"features": {
|
||||
"ghcr.io/xe/devcontainer-features/ko:1.1.0": {},
|
||||
"ghcr.io/devcontainers/features/github-cli:1": {}
|
||||
},
|
||||
"initializeCommand": "mkdir -p ${localEnv:HOME}${localEnv:USERPROFILE}/.local/share/atuin",
|
||||
"customizations": {
|
||||
"vscode": {
|
||||
"extensions": [
|
||||
"esbenp.prettier-vscode",
|
||||
"ms-azuretools.vscode-containers",
|
||||
"golang.go",
|
||||
"unifiedjs.vscode-mdx",
|
||||
"a-h.templ",
|
||||
"redhat.vscode-yaml"
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
26
.devcontainer/docker-compose.yaml
Normal file
26
.devcontainer/docker-compose.yaml
Normal file
@@ -0,0 +1,26 @@
|
||||
services:
|
||||
playwright:
|
||||
image: mcr.microsoft.com/playwright:v1.52.0-noble
|
||||
init: true
|
||||
network_mode: service:workspace
|
||||
command:
|
||||
- /bin/sh
|
||||
- -c
|
||||
- npx -y playwright@1.52.0 run-server --port 9001 --host 0.0.0.0
|
||||
|
||||
valkey:
|
||||
image: valkey/valkey:8
|
||||
pull_policy: always
|
||||
|
||||
# VS Code workspace service
|
||||
workspace:
|
||||
image: ghcr.io/techarohq/anubis/devcontainer
|
||||
build:
|
||||
context: ..
|
||||
dockerfile: .devcontainer/Dockerfile
|
||||
volumes:
|
||||
- ../:/workspace/anubis:cached
|
||||
environment:
|
||||
VALKEY_URL: redis://valkey:6379/0
|
||||
#entrypoint: ["/usr/bin/sleep", "infinity"]
|
||||
user: vscode
|
||||
9
.devcontainer/poststart.sh
Normal file
9
.devcontainer/poststart.sh
Normal file
@@ -0,0 +1,9 @@
|
||||
#!/usr/bin/env bash
|
||||
|
||||
pwd
|
||||
|
||||
npm ci &
|
||||
go mod download &
|
||||
go install ./utils/cmd/... &
|
||||
|
||||
wait
|
||||
2
.gitattributes
vendored
2
.gitattributes
vendored
@@ -1 +1 @@
|
||||
web/index_templ.go linguist-generated
|
||||
**/*_templ.go linguist-generated=true
|
||||
|
||||
3
.github/actions/spelling/allow.txt
vendored
3
.github/actions/spelling/allow.txt
vendored
@@ -2,4 +2,5 @@ github
|
||||
https
|
||||
ssh
|
||||
ubuntu
|
||||
workarounds
|
||||
workarounds
|
||||
rjack
|
||||
7
.github/actions/spelling/excludes.txt
vendored
7
.github/actions/spelling/excludes.txt
vendored
@@ -83,6 +83,13 @@
|
||||
^\Q.github/FUNDING.yml\E$
|
||||
^\Q.github/workflows/spelling.yml\E$
|
||||
^data/crawlers/
|
||||
^docs/blog/tags\.yml$
|
||||
^docs/docs/user/known-instances.md$
|
||||
^docs/manifest/.*$
|
||||
^docs/static/\.nojekyll$
|
||||
^lib/policy/config/testdata/bad/unparseable\.json$
|
||||
ignore$
|
||||
robots.txt
|
||||
^lib/localization/locales/.*\.json$
|
||||
^lib/localization/.*_test.go$
|
||||
^test/.*$
|
||||
|
||||
99
.github/actions/spelling/expect.txt
vendored
99
.github/actions/spelling/expect.txt
vendored
@@ -1,5 +1,4 @@
|
||||
acs
|
||||
aeacus
|
||||
Aibrew
|
||||
alrest
|
||||
amazonbot
|
||||
@@ -8,52 +7,67 @@ anubis
|
||||
anubistest
|
||||
Applebot
|
||||
archlinux
|
||||
asnc
|
||||
asnchecker
|
||||
asns
|
||||
aspirational
|
||||
atuin
|
||||
azuretools
|
||||
badregexes
|
||||
bbolt
|
||||
bdba
|
||||
berr
|
||||
bingbot
|
||||
bitcoin
|
||||
blogging
|
||||
Bitcoin
|
||||
bitrate
|
||||
Bluesky
|
||||
blueskybot
|
||||
boi
|
||||
botnet
|
||||
botstopper
|
||||
BPort
|
||||
Brightbot
|
||||
broked
|
||||
byteslice
|
||||
Bytespider
|
||||
cachebuster
|
||||
cachediptoasn
|
||||
Caddyfile
|
||||
caninetools
|
||||
Cardyb
|
||||
celchecker
|
||||
CELPHASE
|
||||
celphase
|
||||
cerr
|
||||
certresolver
|
||||
cespare
|
||||
CGNAT
|
||||
cgr
|
||||
chainguard
|
||||
chall
|
||||
challengemozilla
|
||||
challengetest
|
||||
checkpath
|
||||
checkresult
|
||||
chen
|
||||
chibi
|
||||
cidranger
|
||||
ckie
|
||||
ckies
|
||||
cloudflare
|
||||
Codespaces
|
||||
confd
|
||||
connnection
|
||||
containerbuild
|
||||
coreutils
|
||||
Cotoyogi
|
||||
CRDs
|
||||
Cromite
|
||||
crt
|
||||
Cscript
|
||||
daemonizing
|
||||
DDOS
|
||||
Debian
|
||||
debrpm
|
||||
decaymap
|
||||
decompiling
|
||||
devcontainers
|
||||
Diffbot
|
||||
discordapp
|
||||
discordbot
|
||||
@@ -61,13 +75,17 @@ distros
|
||||
dnf
|
||||
dnsbl
|
||||
dnserr
|
||||
domainhere
|
||||
dracula
|
||||
dronebl
|
||||
droneblresponse
|
||||
dropin
|
||||
duckduckbot
|
||||
eerror
|
||||
ellenjoe
|
||||
emacs
|
||||
enbyware
|
||||
etld
|
||||
everyones
|
||||
evilbot
|
||||
evilsite
|
||||
@@ -79,6 +97,7 @@ facebookgo
|
||||
Factset
|
||||
fastcgi
|
||||
fediverse
|
||||
ffprobe
|
||||
finfos
|
||||
Firecrawl
|
||||
flagenv
|
||||
@@ -86,46 +105,61 @@ Fordola
|
||||
forgejo
|
||||
fsys
|
||||
fullchain
|
||||
gaissmai
|
||||
Galvus
|
||||
geoip
|
||||
geoipchecker
|
||||
gha
|
||||
gipc
|
||||
gitea
|
||||
godotenv
|
||||
goland
|
||||
gomod
|
||||
goodbot
|
||||
googlebot
|
||||
gopsutil
|
||||
govulncheck
|
||||
goyaml
|
||||
GPG
|
||||
GPT
|
||||
gptbot
|
||||
grpcprom
|
||||
grw
|
||||
Hashcash
|
||||
hashrate
|
||||
headermap
|
||||
healthcheck
|
||||
hebis
|
||||
healthz
|
||||
hec
|
||||
hmc
|
||||
hostable
|
||||
htmlc
|
||||
htmx
|
||||
httpdebug
|
||||
Huawei
|
||||
hypertext
|
||||
iaskspider
|
||||
iat
|
||||
ifm
|
||||
Imagesift
|
||||
imgproxy
|
||||
impressum
|
||||
inp
|
||||
internets
|
||||
IPTo
|
||||
iptoasn
|
||||
iss
|
||||
isset
|
||||
ivh
|
||||
Jenomis
|
||||
JGit
|
||||
joho
|
||||
journalctl
|
||||
jshelter
|
||||
JWTs
|
||||
kagi
|
||||
kagibot
|
||||
keikaku
|
||||
Keyfunc
|
||||
keypair
|
||||
KHTML
|
||||
kinda
|
||||
@@ -138,13 +172,13 @@ lgbt
|
||||
licend
|
||||
licstart
|
||||
lightpanda
|
||||
LIMSA
|
||||
limsa
|
||||
Linting
|
||||
linuxbrew
|
||||
LLU
|
||||
loadbalancer
|
||||
lol
|
||||
LOMINSA
|
||||
lominsa
|
||||
maintainership
|
||||
malware
|
||||
mcr
|
||||
@@ -152,14 +186,16 @@ memes
|
||||
metarefresh
|
||||
metrix
|
||||
mimi
|
||||
minica
|
||||
Minfilia
|
||||
mistralai
|
||||
Mojeek
|
||||
mojeekbot
|
||||
mozilla
|
||||
nbf
|
||||
nepeat
|
||||
netsurf
|
||||
nginx
|
||||
nicksnyder
|
||||
nobots
|
||||
NONINFRINGEMENT
|
||||
nosleep
|
||||
@@ -167,9 +203,10 @@ OCOB
|
||||
ogtags
|
||||
omgili
|
||||
omgilibot
|
||||
onionservice
|
||||
openai
|
||||
opengraph
|
||||
openrc
|
||||
oswald
|
||||
pag
|
||||
palemoon
|
||||
Pangu
|
||||
@@ -184,43 +221,48 @@ pipefail
|
||||
pki
|
||||
podkova
|
||||
podman
|
||||
poststart
|
||||
prebaked
|
||||
privkey
|
||||
promauto
|
||||
promhttp
|
||||
proofofwork
|
||||
publicsuffix
|
||||
pwcmd
|
||||
pwuser
|
||||
qualys
|
||||
qwant
|
||||
qwantbot
|
||||
rac
|
||||
rawler
|
||||
rcvar
|
||||
rdb
|
||||
redhat
|
||||
redir
|
||||
redirectscheme
|
||||
relayd
|
||||
refactors
|
||||
reputational
|
||||
reqmeta
|
||||
risc
|
||||
ruleset
|
||||
runlevels
|
||||
RUnlock
|
||||
runtimedir
|
||||
sas
|
||||
sasl
|
||||
Scumm
|
||||
searchbot
|
||||
searx
|
||||
sebest
|
||||
secretplans
|
||||
selfsigned
|
||||
Semrush
|
||||
Seo
|
||||
setsebool
|
||||
shellcheck
|
||||
shirou
|
||||
Sidetrade
|
||||
simprint
|
||||
sitemap
|
||||
sls
|
||||
sni
|
||||
Sourceware
|
||||
Spambot
|
||||
sparkline
|
||||
spyderbot
|
||||
@@ -228,31 +270,43 @@ srv
|
||||
stackoverflow
|
||||
startprecmd
|
||||
stoppostcmd
|
||||
storetest
|
||||
subgrid
|
||||
subr
|
||||
subrequest
|
||||
SVCNAME
|
||||
tagline
|
||||
tarballs
|
||||
tarrif
|
||||
tbn
|
||||
tbr
|
||||
techaro
|
||||
techarohq
|
||||
templ
|
||||
templruntime
|
||||
testarea
|
||||
Thancred
|
||||
thoth
|
||||
thothmock
|
||||
Tik
|
||||
Timpibot
|
||||
torproject
|
||||
traefik
|
||||
uberspace
|
||||
unixhttpd
|
||||
Unbreak
|
||||
unbreakdocker
|
||||
unifiedjs
|
||||
unmarshal
|
||||
unparseable
|
||||
uvx
|
||||
UXP
|
||||
valkey
|
||||
Varis
|
||||
Velen
|
||||
vendored
|
||||
vhosts
|
||||
videotest
|
||||
VKE
|
||||
Vultr
|
||||
waitloop
|
||||
weblate
|
||||
webmaster
|
||||
@@ -261,11 +315,11 @@ websecure
|
||||
websites
|
||||
Webzio
|
||||
wildbase
|
||||
withthothmock
|
||||
wordpress
|
||||
Workaround
|
||||
workdir
|
||||
wpbot
|
||||
xcaddy
|
||||
Xeact
|
||||
xeiaso
|
||||
xeserv
|
||||
@@ -274,6 +328,7 @@ xess
|
||||
xff
|
||||
XForwarded
|
||||
XNG
|
||||
XOB
|
||||
XReal
|
||||
yae
|
||||
YAMLTo
|
||||
@@ -281,6 +336,8 @@ yeet
|
||||
yeetfile
|
||||
yourdomain
|
||||
yoursite
|
||||
yyz
|
||||
Zenos
|
||||
zizmor
|
||||
zombocom
|
||||
zos
|
||||
|
||||
@@ -273,14 +273,6 @@
|
||||
# Most people only have two hands. Reword.
|
||||
\b(?i)on the third hand\b
|
||||
|
||||
# Should be `Open Graph`
|
||||
# unless talking about a specific Open Graph implementation:
|
||||
# - Java
|
||||
# - Node
|
||||
# - Py
|
||||
# - Ruby
|
||||
\bOpenGraph\b
|
||||
|
||||
# Should be `OpenShift`
|
||||
\bOpenshift\b
|
||||
|
||||
|
||||
2
.github/actions/spelling/patterns.txt
vendored
2
.github/actions/spelling/patterns.txt
vendored
@@ -131,4 +131,4 @@ go install(?:\s+[a-z]+\.[-@\w/.]+)+
|
||||
|
||||
# hit-count: 1 file-count: 1
|
||||
# microsoft
|
||||
\b(?:https?://|)(?:(?:(?:blogs|download\.visualstudio|docs|msdn2?|research)\.|)microsoft|blogs\.msdn)\.co(?:m|\.\w\w)/[-_a-zA-Z0-9()=./%]*
|
||||
\b(?:https?://|)(?:(?:(?:blogs|download\.visualstudio|docs|msdn2?|research)\.|)microsoft|blogs\.msdn)\.co(?:m|\.\w\w)/[-_a-zA-Z0-9()=./%]*
|
||||
|
||||
2
.github/workflows/docker-pr.yml
vendored
2
.github/workflows/docker-pr.yml
vendored
@@ -22,7 +22,7 @@ jobs:
|
||||
persist-credentials: false
|
||||
|
||||
- name: Set up Homebrew
|
||||
uses: Homebrew/actions/setup-homebrew@master
|
||||
uses: Homebrew/actions/setup-homebrew@main
|
||||
|
||||
- name: Setup Homebrew cellar cache
|
||||
uses: actions/cache@5a3ec84eff668545956fd18022155c47e93e2684 # v4.2.3
|
||||
|
||||
11
.github/workflows/docker.yml
vendored
11
.github/workflows/docker.yml
vendored
@@ -3,8 +3,8 @@ name: Docker image builds
|
||||
on:
|
||||
workflow_dispatch:
|
||||
push:
|
||||
branches: [ "main" ]
|
||||
tags: [ "v*" ]
|
||||
branches: ["main"]
|
||||
tags: ["v*"]
|
||||
|
||||
env:
|
||||
DOCKER_METADATA_SET_OUTPUT_ENV: "true"
|
||||
@@ -32,7 +32,7 @@ jobs:
|
||||
echo "IMAGE=ghcr.io/${GITHUB_REPOSITORY,,}" >> $GITHUB_ENV
|
||||
|
||||
- name: Set up Homebrew
|
||||
uses: Homebrew/actions/setup-homebrew@master
|
||||
uses: Homebrew/actions/setup-homebrew@main
|
||||
|
||||
- name: Setup Homebrew cellar cache
|
||||
uses: actions/cache@5a3ec84eff668545956fd18022155c47e93e2684 # v4.2.3
|
||||
@@ -55,7 +55,7 @@ jobs:
|
||||
run: |
|
||||
brew bundle
|
||||
|
||||
- name: Log into registry
|
||||
- name: Log into registry
|
||||
uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0
|
||||
with:
|
||||
registry: ghcr.io
|
||||
@@ -77,9 +77,8 @@ jobs:
|
||||
DOCKER_REPO: ${{ env.IMAGE }}
|
||||
SLOG_LEVEL: debug
|
||||
|
||||
|
||||
- name: Generate artifact attestation
|
||||
uses: actions/attest-build-provenance@db473fddc028af60658334401dc6fa3ffd8669fd # v2.3.0
|
||||
uses: actions/attest-build-provenance@e8998f949152b193b063cb0ec769d69d929409be # v2.4.0
|
||||
with:
|
||||
subject-name: ${{ env.IMAGE }}
|
||||
subject-digest: ${{ steps.build.outputs.digest }}
|
||||
|
||||
13
.github/workflows/docs-deploy.yml
vendored
13
.github/workflows/docs-deploy.yml
vendored
@@ -22,7 +22,7 @@ jobs:
|
||||
persist-credentials: false
|
||||
|
||||
- name: Set up Docker Buildx
|
||||
uses: docker/setup-buildx-action@b5ca514318bd6ebac0fb2aedd5d36ec1b5c232a2 # v3.10.0
|
||||
uses: docker/setup-buildx-action@e468171a9de216ec08956ac3ada2f0791b6bd435 # v3.11.1
|
||||
|
||||
- name: Log into registry
|
||||
uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0
|
||||
@@ -36,6 +36,9 @@ jobs:
|
||||
uses: docker/metadata-action@902fa8ec7d6ecbf8d84d538b9b233a880e428804 # v5.7.0
|
||||
with:
|
||||
images: ghcr.io/techarohq/anubis/docs
|
||||
tags: |
|
||||
type=sha,enable=true,priority=100,prefix=,suffix=,format=long
|
||||
main
|
||||
|
||||
- name: Build and push
|
||||
id: build
|
||||
@@ -49,15 +52,15 @@ jobs:
|
||||
platforms: linux/amd64
|
||||
push: true
|
||||
|
||||
- name: Apply k8s manifests to aeacus
|
||||
uses: actions-hub/kubectl@f632a31512a74cb35940627c49c20f67723cbaaf # v1.33.1
|
||||
- name: Apply k8s manifests to limsa lominsa
|
||||
uses: actions-hub/kubectl@b5b19eeb6a0ffde16637e398f8b96ef01eb8fdb7 # v1.33.3
|
||||
env:
|
||||
KUBE_CONFIG: ${{ secrets.LIMSA_LOMINSA_KUBECONFIG }}
|
||||
with:
|
||||
args: apply -k docs/manifest
|
||||
|
||||
- name: Apply k8s manifests to aeacus
|
||||
uses: actions-hub/kubectl@f632a31512a74cb35940627c49c20f67723cbaaf # v1.33.1
|
||||
- name: Apply k8s manifests to limsa lominsa
|
||||
uses: actions-hub/kubectl@b5b19eeb6a0ffde16637e398f8b96ef01eb8fdb7 # v1.33.3
|
||||
env:
|
||||
KUBE_CONFIG: ${{ secrets.LIMSA_LOMINSA_KUBECONFIG }}
|
||||
with:
|
||||
|
||||
9
.github/workflows/docs-test.yml
vendored
9
.github/workflows/docs-test.yml
vendored
@@ -2,7 +2,7 @@ name: Docs test build
|
||||
|
||||
on:
|
||||
pull_request:
|
||||
branches: [ "main" ]
|
||||
branches: ["main"]
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
@@ -18,13 +18,16 @@ jobs:
|
||||
persist-credentials: false
|
||||
|
||||
- name: Set up Docker Buildx
|
||||
uses: docker/setup-buildx-action@b5ca514318bd6ebac0fb2aedd5d36ec1b5c232a2 # v3.10.0
|
||||
uses: docker/setup-buildx-action@e468171a9de216ec08956ac3ada2f0791b6bd435 # v3.11.1
|
||||
|
||||
- name: Docker meta
|
||||
id: meta
|
||||
uses: docker/metadata-action@902fa8ec7d6ecbf8d84d538b9b233a880e428804 # v5.7.0
|
||||
with:
|
||||
images: ghcr.io/${{ github.repository }}/docs
|
||||
images: ghcr.io/techarohq/anubis/docs
|
||||
tags: |
|
||||
type=sha,enable=true,priority=100,prefix=,suffix=,format=long
|
||||
main
|
||||
|
||||
- name: Build and push
|
||||
id: build
|
||||
|
||||
4
.github/workflows/go.yml
vendored
4
.github/workflows/go.yml
vendored
@@ -25,7 +25,7 @@ jobs:
|
||||
sudo apt-get install -y build-essential
|
||||
|
||||
- name: Set up Homebrew
|
||||
uses: Homebrew/actions/setup-homebrew@master
|
||||
uses: Homebrew/actions/setup-homebrew@main
|
||||
|
||||
- name: Setup Homebrew cellar cache
|
||||
uses: actions/cache@5a3ec84eff668545956fd18022155c47e93e2684 # v4.2.3
|
||||
@@ -82,7 +82,7 @@ jobs:
|
||||
run: npm run test
|
||||
|
||||
- name: Lint with staticcheck
|
||||
uses: dominikh/staticcheck-action@fe1dd0c3658873b46f8c9bb3291096a617310ca6 # v1.3.1
|
||||
uses: dominikh/staticcheck-action@024238d2898c874f26d723e7d0ff4308c35589a2 # v1.4.0
|
||||
with:
|
||||
version: "latest"
|
||||
|
||||
|
||||
117
.github/workflows/package-builds-stable.yml
vendored
117
.github/workflows/package-builds-stable.yml
vendored
@@ -1,8 +1,9 @@
|
||||
name: Package builds (stable)
|
||||
|
||||
on:
|
||||
release:
|
||||
types: [published]
|
||||
workflow_dispatch:
|
||||
# release:
|
||||
# types: [published]
|
||||
|
||||
permissions:
|
||||
contents: write
|
||||
@@ -13,67 +14,67 @@ jobs:
|
||||
#runs-on: alrest-techarohq
|
||||
runs-on: ubuntu-24.04
|
||||
steps:
|
||||
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
|
||||
with:
|
||||
persist-credentials: false
|
||||
fetch-tags: true
|
||||
fetch-depth: 0
|
||||
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
|
||||
with:
|
||||
persist-credentials: false
|
||||
fetch-tags: true
|
||||
fetch-depth: 0
|
||||
|
||||
- name: build essential
|
||||
run: |
|
||||
sudo apt-get update
|
||||
sudo apt-get install -y build-essential
|
||||
- name: build essential
|
||||
run: |
|
||||
sudo apt-get update
|
||||
sudo apt-get install -y build-essential
|
||||
|
||||
- name: Set up Homebrew
|
||||
uses: Homebrew/actions/setup-homebrew@master
|
||||
- name: Set up Homebrew
|
||||
uses: Homebrew/actions/setup-homebrew@main
|
||||
|
||||
- name: Setup Homebrew cellar cache
|
||||
uses: actions/cache@5a3ec84eff668545956fd18022155c47e93e2684 # v4.2.3
|
||||
with:
|
||||
path: |
|
||||
/home/linuxbrew/.linuxbrew/Cellar
|
||||
/home/linuxbrew/.linuxbrew/bin
|
||||
/home/linuxbrew/.linuxbrew/etc
|
||||
/home/linuxbrew/.linuxbrew/include
|
||||
/home/linuxbrew/.linuxbrew/lib
|
||||
/home/linuxbrew/.linuxbrew/opt
|
||||
/home/linuxbrew/.linuxbrew/sbin
|
||||
/home/linuxbrew/.linuxbrew/share
|
||||
/home/linuxbrew/.linuxbrew/var
|
||||
key: ${{ runner.os }}-go-homebrew-cellar-${{ hashFiles('go.sum') }}
|
||||
restore-keys: |
|
||||
${{ runner.os }}-go-homebrew-cellar-
|
||||
- name: Setup Homebrew cellar cache
|
||||
uses: actions/cache@5a3ec84eff668545956fd18022155c47e93e2684 # v4.2.3
|
||||
with:
|
||||
path: |
|
||||
/home/linuxbrew/.linuxbrew/Cellar
|
||||
/home/linuxbrew/.linuxbrew/bin
|
||||
/home/linuxbrew/.linuxbrew/etc
|
||||
/home/linuxbrew/.linuxbrew/include
|
||||
/home/linuxbrew/.linuxbrew/lib
|
||||
/home/linuxbrew/.linuxbrew/opt
|
||||
/home/linuxbrew/.linuxbrew/sbin
|
||||
/home/linuxbrew/.linuxbrew/share
|
||||
/home/linuxbrew/.linuxbrew/var
|
||||
key: ${{ runner.os }}-go-homebrew-cellar-${{ hashFiles('go.sum') }}
|
||||
restore-keys: |
|
||||
${{ runner.os }}-go-homebrew-cellar-
|
||||
|
||||
- name: Install Brew dependencies
|
||||
run: |
|
||||
brew bundle
|
||||
- name: Install Brew dependencies
|
||||
run: |
|
||||
brew bundle
|
||||
|
||||
- name: Setup Golang caches
|
||||
uses: actions/cache@5a3ec84eff668545956fd18022155c47e93e2684 # v4.2.3
|
||||
with:
|
||||
path: |
|
||||
~/.cache/go-build
|
||||
~/go/pkg/mod
|
||||
key: ${{ runner.os }}-golang-${{ hashFiles('**/go.sum') }}
|
||||
restore-keys: |
|
||||
${{ runner.os }}-golang-
|
||||
- name: Setup Golang caches
|
||||
uses: actions/cache@5a3ec84eff668545956fd18022155c47e93e2684 # v4.2.3
|
||||
with:
|
||||
path: |
|
||||
~/.cache/go-build
|
||||
~/go/pkg/mod
|
||||
key: ${{ runner.os }}-golang-${{ hashFiles('**/go.sum') }}
|
||||
restore-keys: |
|
||||
${{ runner.os }}-golang-
|
||||
|
||||
- name: install node deps
|
||||
run: |
|
||||
npm ci
|
||||
- name: install node deps
|
||||
run: |
|
||||
npm ci
|
||||
|
||||
- name: Build Packages
|
||||
run: |
|
||||
go tool yeet
|
||||
- name: Build Packages
|
||||
run: |
|
||||
go tool yeet
|
||||
|
||||
- name: Upload released artifacts
|
||||
env:
|
||||
GITHUB_TOKEN: ${{ github.TOKEN }}
|
||||
RELEASE_VERSION: ${{github.event.release.tag_name}}
|
||||
shell: bash
|
||||
run: |
|
||||
RELEASE="${RELEASE_VERSION}"
|
||||
cd var
|
||||
for file in *; do
|
||||
gh release upload $RELEASE $file
|
||||
done
|
||||
- name: Upload released artifacts
|
||||
env:
|
||||
GITHUB_TOKEN: ${{ github.TOKEN }}
|
||||
RELEASE_VERSION: ${{github.event.release.tag_name}}
|
||||
shell: bash
|
||||
run: |
|
||||
RELEASE="${RELEASE_VERSION}"
|
||||
cd var
|
||||
for file in *; do
|
||||
gh release upload $RELEASE $file
|
||||
done
|
||||
|
||||
@@ -27,7 +27,7 @@ jobs:
|
||||
sudo apt-get install -y build-essential
|
||||
|
||||
- name: Set up Homebrew
|
||||
uses: Homebrew/actions/setup-homebrew@master
|
||||
uses: Homebrew/actions/setup-homebrew@main
|
||||
|
||||
- name: Setup Homebrew cellar cache
|
||||
uses: actions/cache@5a3ec84eff668545956fd18022155c47e93e2684 # v4.2.3
|
||||
|
||||
45
.github/workflows/smoke-tests.yml
vendored
Normal file
45
.github/workflows/smoke-tests.yml
vendored
Normal file
@@ -0,0 +1,45 @@
|
||||
name: Smoke tests
|
||||
|
||||
on:
|
||||
push:
|
||||
branches: ["main"]
|
||||
pull_request:
|
||||
branches: ["main"]
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
smoke-test:
|
||||
strategy:
|
||||
matrix:
|
||||
test:
|
||||
- git-clone
|
||||
- git-push
|
||||
- healthcheck
|
||||
- i18n
|
||||
runs-on: ubuntu-24.04
|
||||
steps:
|
||||
- name: Checkout code
|
||||
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
|
||||
with:
|
||||
persist-credentials: false
|
||||
|
||||
- uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020 # v4.4.0
|
||||
with:
|
||||
node-version: latest
|
||||
|
||||
- uses: actions/setup-go@d35c59abb061a4a6fb18e82ac0862c26744d6ab5 # v5.5.0
|
||||
with:
|
||||
go-version: stable
|
||||
|
||||
- uses: ko-build/setup-ko@d006021bd0c28d1ce33a07e7943d48b079944c8d # v0.9
|
||||
|
||||
- name: Install utils
|
||||
run: |
|
||||
go install ./utils/cmd/...
|
||||
|
||||
- name: Run test
|
||||
run: |
|
||||
cd test/${{ matrix.test }}
|
||||
backoff-retry --try-count 10 ./test.sh
|
||||
37
.github/workflows/ssh-ci-runner-cron.yml
vendored
Normal file
37
.github/workflows/ssh-ci-runner-cron.yml
vendored
Normal file
@@ -0,0 +1,37 @@
|
||||
name: Regenerate ssh ci runner image
|
||||
|
||||
on:
|
||||
# pull_request:
|
||||
# branches: ["main"]
|
||||
schedule:
|
||||
- cron: "0 0 1,8,15,22 * *"
|
||||
workflow_dispatch:
|
||||
|
||||
permissions:
|
||||
pull-requests: write
|
||||
contents: write
|
||||
packages: write
|
||||
|
||||
jobs:
|
||||
ssh-ci-rebuild:
|
||||
if: github.repository == 'TecharoHQ/anubis'
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- name: Checkout code
|
||||
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
|
||||
with:
|
||||
fetch-tags: true
|
||||
fetch-depth: 0
|
||||
persist-credentials: false
|
||||
- name: Log into registry
|
||||
uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0
|
||||
with:
|
||||
registry: ghcr.io
|
||||
username: ${{ github.repository_owner }}
|
||||
password: ${{ secrets.GITHUB_TOKEN }}
|
||||
- name: Set up Docker Buildx
|
||||
uses: docker/setup-buildx-action@e468171a9de216ec08956ac3ada2f0791b6bd435 # v3.11.1
|
||||
- name: Build and push
|
||||
run: |
|
||||
cd ./test/ssh-ci
|
||||
docker buildx bake --push
|
||||
43
.github/workflows/ssh-ci.yml
vendored
Normal file
43
.github/workflows/ssh-ci.yml
vendored
Normal file
@@ -0,0 +1,43 @@
|
||||
name: SSH CI
|
||||
|
||||
on:
|
||||
push:
|
||||
branches: ["main"]
|
||||
# pull_request:
|
||||
# branches: ["main"]
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
|
||||
jobs:
|
||||
ssh:
|
||||
if: github.repository == 'TecharoHQ/anubis'
|
||||
runs-on: ubuntu-24.04
|
||||
strategy:
|
||||
matrix:
|
||||
host:
|
||||
- ubuntu@riscv64.techaro.lol
|
||||
- ci@ppc64le.techaro.lol
|
||||
steps:
|
||||
- name: Checkout code
|
||||
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
|
||||
with:
|
||||
fetch-tags: true
|
||||
fetch-depth: 0
|
||||
persist-credentials: false
|
||||
|
||||
- name: Install CI target SSH key
|
||||
uses: shimataro/ssh-key-action@d4fffb50872869abe2d9a9098a6d9c5aa7d16be4 # v2.7.0
|
||||
with:
|
||||
key: ${{ secrets.CI_SSH_KEY }}
|
||||
name: id_rsa
|
||||
known_hosts: ${{ secrets.CI_SSH_KNOWN_HOSTS }}
|
||||
|
||||
- uses: actions/setup-go@d35c59abb061a4a6fb18e82ac0862c26744d6ab5 # v5.5.0
|
||||
with:
|
||||
go-version: stable
|
||||
|
||||
- name: Run CI
|
||||
run: go run ./utils/cmd/backoff-retry bash test/ssh-ci/rigging.sh ${{ matrix.host }}
|
||||
env:
|
||||
GITHUB_RUN_ID: ${{ github.run_id }}
|
||||
4
.github/workflows/zizmor.yml
vendored
4
.github/workflows/zizmor.yml
vendored
@@ -21,7 +21,7 @@ jobs:
|
||||
persist-credentials: false
|
||||
|
||||
- name: Install the latest version of uv
|
||||
uses: astral-sh/setup-uv@f0ec1fc3b38f5e7cd731bb6ce540c5af426746bb # v6.1.0
|
||||
uses: astral-sh/setup-uv@7edac99f961f18b581bbd960d59d049f04c0002f # v6.4.1
|
||||
|
||||
- name: Run zizmor 🌈
|
||||
run: uvx zizmor --format sarif . > results.sarif
|
||||
@@ -29,7 +29,7 @@ jobs:
|
||||
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
|
||||
|
||||
- name: Upload SARIF file
|
||||
uses: github/codeql-action/upload-sarif@fca7ace96b7d713c7035871441bd52efbe39e27e # v3.28.19
|
||||
uses: github/codeql-action/upload-sarif@181d5eefc20863364f96762470ba6f862bdef56b # v3.29.2
|
||||
with:
|
||||
sarif_file: results.sarif
|
||||
category: zizmor
|
||||
|
||||
2
.gitignore
vendored
2
.gitignore
vendored
@@ -20,3 +20,5 @@ node_modules
|
||||
|
||||
# how does this get here
|
||||
doc/VERSION
|
||||
|
||||
web/static/locales/*.json
|
||||
10
.vscode/extensions.json
vendored
Normal file
10
.vscode/extensions.json
vendored
Normal file
@@ -0,0 +1,10 @@
|
||||
{
|
||||
"recommendations": [
|
||||
"esbenp.prettier-vscode",
|
||||
"ms-azuretools.vscode-containers",
|
||||
"golang.go",
|
||||
"unifiedjs.vscode-mdx",
|
||||
"a-h.templ",
|
||||
"redhat.vscode-yaml"
|
||||
]
|
||||
}
|
||||
27
.vscode/launch.json
vendored
Normal file
27
.vscode/launch.json
vendored
Normal file
@@ -0,0 +1,27 @@
|
||||
{
|
||||
// Use IntelliSense to learn about possible attributes.
|
||||
// Hover to view descriptions of existing attributes.
|
||||
// For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
|
||||
"version": "0.2.0",
|
||||
"configurations": [
|
||||
{
|
||||
"name": "Launch Package",
|
||||
"type": "go",
|
||||
"request": "launch",
|
||||
"mode": "auto",
|
||||
"program": "${fileDirname}"
|
||||
},
|
||||
{
|
||||
"name": "Anubis [dev]",
|
||||
"command": "npm run dev",
|
||||
"request": "launch",
|
||||
"type": "node-terminal"
|
||||
},
|
||||
{
|
||||
"name": "Start Docs",
|
||||
"command": "cd docs && npm ci && npm run start",
|
||||
"request": "launch",
|
||||
"type": "node-terminal"
|
||||
}
|
||||
]
|
||||
}
|
||||
19
.vscode/settings.json
vendored
19
.vscode/settings.json
vendored
@@ -11,5 +11,24 @@
|
||||
"zig": false,
|
||||
"javascript": false,
|
||||
"properties": false
|
||||
},
|
||||
"[markdown]": {
|
||||
"editor.wordWrap": "wordWrapColumn",
|
||||
"editor.wordWrapColumn": 80,
|
||||
"editor.wordBasedSuggestions": "off"
|
||||
},
|
||||
"[mdx]": {
|
||||
"editor.wordWrap": "wordWrapColumn",
|
||||
"editor.wordWrapColumn": 80,
|
||||
"editor.wordBasedSuggestions": "off"
|
||||
},
|
||||
"[nunjucks]": {
|
||||
"editor.wordWrap": "wordWrapColumn",
|
||||
"editor.wordWrapColumn": 80,
|
||||
"editor.wordBasedSuggestions": "off"
|
||||
},
|
||||
"cSpell.enabledFileTypes": {
|
||||
"mdx": true,
|
||||
"md": true
|
||||
}
|
||||
}
|
||||
|
||||
2
Makefile
2
Makefile
@@ -18,6 +18,7 @@ assets: deps
|
||||
|
||||
build: assets
|
||||
$(GO) build -o ./var/anubis ./cmd/anubis
|
||||
$(GO) build -o ./var/robots2policy ./cmd/robots2policy
|
||||
@echo "Anubis is now built to ./var/anubis"
|
||||
|
||||
lint: assets
|
||||
@@ -27,6 +28,7 @@ lint: assets
|
||||
|
||||
prebaked-build:
|
||||
$(GO) build -o ./var/anubis -ldflags "-X 'github.com/TecharoHQ/anubis.Version=$(VERSION)'" ./cmd/anubis
|
||||
$(GO) build -o ./var/robots2policy -ldflags "-X 'github.com/TecharoHQ/anubis.Version=$(VERSION)'" ./cmd/robots2policy
|
||||
|
||||
test: assets
|
||||
$(GO) test ./...
|
||||
|
||||
15
README.md
15
README.md
@@ -9,6 +9,7 @@
|
||||

|
||||

|
||||

|
||||
[](https://github.com/sponsors/Xe)
|
||||
|
||||
## Sponsors
|
||||
|
||||
@@ -40,6 +41,20 @@ Anubis is brought to you by sponsors and donors like:
|
||||
<a href="https://wildbase.xyz/">
|
||||
<img src="./docs/static/img/sponsors/wildbase-logo.webp" alt="Wildbase" height="64">
|
||||
</a>
|
||||
<a href="https://emma.pet">
|
||||
<img
|
||||
src="./docs/static/img/sponsors/nepeat-logo.webp"
|
||||
alt="Cat eyes over the word Emma in a serif font"
|
||||
height="64"
|
||||
/>
|
||||
</a>
|
||||
<a href="https://fabulous.systems/">
|
||||
<img
|
||||
src="./docs/static/img/sponsors/fabulous-systems.webp"
|
||||
alt="Cat eyes over the word Emma in a serif font"
|
||||
height="64"
|
||||
/>
|
||||
</a>
|
||||
|
||||
## Overview
|
||||
|
||||
|
||||
13
anubis.go
13
anubis.go
@@ -11,12 +11,11 @@ var Version = "devel"
|
||||
|
||||
// CookieName is the name of the cookie that Anubis uses in order to validate
|
||||
// access.
|
||||
const CookieName = "techaro.lol-anubis-auth"
|
||||
var CookieName = "techaro.lol-anubis-auth"
|
||||
|
||||
// WithDomainCookieName is the name that is prepended to the per-domain cookie used when COOKIE_DOMAIN is set.
|
||||
const WithDomainCookieName = "techaro.lol-anubis-auth-for-"
|
||||
|
||||
const TestCookieName = "techaro.lol-anubis-cookie-test-if-you-block-this-anubis-wont-work"
|
||||
// TestCookieName is the name of the cookie that Anubis uses in order to check
|
||||
// if cookies are enabled on the client's browser.
|
||||
var TestCookieName = "techaro.lol-anubis-cookie-verification"
|
||||
|
||||
// CookieDefaultExpirationTime is the amount of time before the cookie/JWT expires.
|
||||
const CookieDefaultExpirationTime = 7 * 24 * time.Hour
|
||||
@@ -33,3 +32,7 @@ const APIPrefix = "/.within.website/x/cmd/anubis/api/"
|
||||
// DefaultDifficulty is the default "difficulty" (number of leading zeroes)
|
||||
// that must be met by the client in order to pass the challenge.
|
||||
const DefaultDifficulty = 4
|
||||
|
||||
// ForcedLanguage is the language being used instead of the one of the request's Accept-Language header
|
||||
// if being set.
|
||||
var ForcedLanguage = ""
|
||||
|
||||
@@ -30,12 +30,15 @@ import (
|
||||
"github.com/TecharoHQ/anubis"
|
||||
"github.com/TecharoHQ/anubis/data"
|
||||
"github.com/TecharoHQ/anubis/internal"
|
||||
"github.com/TecharoHQ/anubis/internal/thoth"
|
||||
libanubis "github.com/TecharoHQ/anubis/lib"
|
||||
botPolicy "github.com/TecharoHQ/anubis/lib/policy"
|
||||
"github.com/TecharoHQ/anubis/lib/policy/config"
|
||||
"github.com/TecharoHQ/anubis/web"
|
||||
"github.com/facebookgo/flagenv"
|
||||
_ "github.com/joho/godotenv/autoload"
|
||||
"github.com/prometheus/client_golang/prometheus/promhttp"
|
||||
healthv1 "google.golang.org/grpc/health/grpc_health_v1"
|
||||
)
|
||||
|
||||
var (
|
||||
@@ -44,8 +47,13 @@ var (
|
||||
bindNetwork = flag.String("bind-network", "tcp", "network family to bind HTTP to, e.g. unix, tcp")
|
||||
challengeDifficulty = flag.Int("difficulty", anubis.DefaultDifficulty, "difficulty of the challenge")
|
||||
cookieDomain = flag.String("cookie-domain", "", "if set, the top-level domain that the Anubis cookie will be valid for")
|
||||
cookieDynamicDomain = flag.Bool("cookie-dynamic-domain", false, "if set, automatically set the cookie Domain value based on the request domain")
|
||||
cookieExpiration = flag.Duration("cookie-expiration-time", anubis.CookieDefaultExpirationTime, "The amount of time the authorization cookie is valid for")
|
||||
cookiePrefix = flag.String("cookie-prefix", "techaro.lol-anubis", "prefix for browser cookies created by Anubis")
|
||||
cookiePartitioned = flag.Bool("cookie-partitioned", false, "if true, sets the partitioned flag on Anubis cookies, enabling CHIPS support")
|
||||
forcedLanguage = flag.String("forced-language", "", "if set, this language is being used instead of the one from the request's Accept-Language header")
|
||||
hs512Secret = flag.String("hs512-secret", "", "secret used to sign JWTs, uses ed25519 if not set")
|
||||
cookieSecure = flag.Bool("cookie-secure", true, "if true, sets the secure flag on Anubis cookies")
|
||||
ed25519PrivateKeyHex = flag.String("ed25519-private-key-hex", "", "private key used to sign JWTs, if not set a random one will be assigned")
|
||||
ed25519PrivateKeyHexFile = flag.String("ed25519-private-key-hex-file", "", "file name containing value for ed25519-private-key-hex")
|
||||
metricsBind = flag.String("metrics-bind", ":9090", "network address to bind metrics to")
|
||||
@@ -55,6 +63,7 @@ var (
|
||||
policyFname = flag.String("policy-fname", "", "full path to anubis policy document (defaults to a sensible built-in policy)")
|
||||
redirectDomains = flag.String("redirect-domains", "", "list of domains separated by commas which anubis is allowed to redirect to. Leaving this unset allows any domain.")
|
||||
slogLevel = flag.String("slog-level", "INFO", "logging level (see https://pkg.go.dev/log/slog#hdr-Levels)")
|
||||
stripBasePrefix = flag.Bool("strip-base-prefix", false, "if true, strips the base prefix from requests forwarded to the target server")
|
||||
target = flag.String("target", "http://localhost:3923", "target to reverse proxy to, set to an empty string to disable proxying when only using auth request")
|
||||
targetSNI = flag.String("target-sni", "", "if set, the value of the TLS handshake hostname when forwarding requests to the target")
|
||||
targetHost = flag.String("target-host", "", "if set, the value of the Host header when forwarding requests to the target")
|
||||
@@ -69,6 +78,10 @@ var (
|
||||
webmasterEmail = flag.String("webmaster-email", "", "if set, displays webmaster's email on the reject page for appeals")
|
||||
versionFlag = flag.Bool("version", false, "print Anubis version")
|
||||
xffStripPrivate = flag.Bool("xff-strip-private", true, "if set, strip private addresses from X-Forwarded-For")
|
||||
|
||||
thothInsecure = flag.Bool("thoth-insecure", false, "if set, connect to Thoth over plain HTTP/2, don't enable this unless support told you to")
|
||||
thothURL = flag.String("thoth-url", "", "if set, URL for Thoth, the IP reputation database for Anubis")
|
||||
thothToken = flag.String("thoth-token", "", "if set, API token for Thoth, the IP reputation database for Anubis")
|
||||
)
|
||||
|
||||
func keyFromHex(value string) (ed25519.PrivateKey, error) {
|
||||
@@ -85,7 +98,7 @@ func keyFromHex(value string) (ed25519.PrivateKey, error) {
|
||||
}
|
||||
|
||||
func doHealthCheck() error {
|
||||
resp, err := http.Get("http://localhost" + *metricsBind + anubis.BasePrefix + "/metrics")
|
||||
resp, err := http.Get("http://localhost" + *metricsBind + "/healthz")
|
||||
if err != nil {
|
||||
return fmt.Errorf("failed to fetch metrics: %w", err)
|
||||
}
|
||||
@@ -98,8 +111,41 @@ func doHealthCheck() error {
|
||||
return nil
|
||||
}
|
||||
|
||||
// parseBindNetFromAddr determine bind network and address based on the given network and address.
|
||||
func parseBindNetFromAddr(address string) (string, string) {
|
||||
defaultScheme := "http://"
|
||||
if !strings.Contains(address, "://") {
|
||||
if strings.HasPrefix(address, ":") {
|
||||
address = defaultScheme + "localhost" + address
|
||||
} else {
|
||||
address = defaultScheme + address
|
||||
}
|
||||
}
|
||||
|
||||
bindUri, err := url.Parse(address)
|
||||
if err != nil {
|
||||
log.Fatal(fmt.Errorf("failed to parse bind URL: %w", err))
|
||||
}
|
||||
|
||||
switch bindUri.Scheme {
|
||||
case "unix":
|
||||
return "unix", bindUri.Path
|
||||
case "tcp", "http", "https":
|
||||
return "tcp", bindUri.Host
|
||||
default:
|
||||
log.Fatal(fmt.Errorf("unsupported network scheme %s in address %s", bindUri.Scheme, address))
|
||||
}
|
||||
return "", address
|
||||
}
|
||||
|
||||
func setupListener(network string, address string) (net.Listener, string) {
|
||||
formattedAddress := ""
|
||||
|
||||
if network == "" {
|
||||
// keep compatibility
|
||||
network, address = parseBindNetFromAddr(address)
|
||||
}
|
||||
|
||||
switch network {
|
||||
case "unix":
|
||||
formattedAddress = "unix:" + address
|
||||
@@ -186,20 +232,6 @@ func makeReverseProxy(target string, targetSNI string, targetHost string, insecu
|
||||
return rp, nil
|
||||
}
|
||||
|
||||
func startDecayMapCleanup(ctx context.Context, s *libanubis.Server) {
|
||||
ticker := time.NewTicker(1 * time.Hour)
|
||||
defer ticker.Stop()
|
||||
|
||||
for {
|
||||
select {
|
||||
case <-ticker.C:
|
||||
s.CleanupDecayMap()
|
||||
case <-ctx.Done():
|
||||
return
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func main() {
|
||||
flagenv.Parse()
|
||||
flag.Parse()
|
||||
@@ -210,6 +242,15 @@ func main() {
|
||||
}
|
||||
|
||||
internal.InitSlog(*slogLevel)
|
||||
internal.SetHealth("anubis", healthv1.HealthCheckResponse_NOT_SERVING)
|
||||
|
||||
if *healthcheck {
|
||||
log.Println("running healthcheck")
|
||||
if err := doHealthCheck(); err != nil {
|
||||
log.Fatal(err)
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
if *extractResources != "" {
|
||||
if err := extractEmbedFS(data.BotPolicies, ".", *extractResources); err != nil {
|
||||
@@ -222,6 +263,17 @@ func main() {
|
||||
return
|
||||
}
|
||||
|
||||
// install signal handler
|
||||
ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
|
||||
defer stop()
|
||||
|
||||
wg := new(sync.WaitGroup)
|
||||
|
||||
if *metricsBind != "" {
|
||||
wg.Add(1)
|
||||
go metricsServer(ctx, wg.Done)
|
||||
}
|
||||
|
||||
var rp http.Handler
|
||||
// when using anubis via Systemd and environment variables, then it is not possible to set targe to an empty string but only to space
|
||||
if strings.TrimSpace(*target) != "" {
|
||||
@@ -232,7 +284,27 @@ func main() {
|
||||
}
|
||||
}
|
||||
|
||||
policy, err := libanubis.LoadPoliciesOrDefault(*policyFname, *challengeDifficulty)
|
||||
if *cookieDomain != "" && *cookieDynamicDomain {
|
||||
log.Fatalf("you can't set COOKIE_DOMAIN and COOKIE_DYNAMIC_DOMAIN at the same time")
|
||||
}
|
||||
|
||||
// Thoth configuration
|
||||
switch {
|
||||
case *thothURL != "" && *thothToken == "":
|
||||
slog.Warn("THOTH_URL is set but no THOTH_TOKEN is set")
|
||||
case *thothURL == "" && *thothToken != "":
|
||||
slog.Warn("THOTH_TOKEN is set but no THOTH_URL is set")
|
||||
case *thothURL != "" && *thothToken != "":
|
||||
slog.Debug("connecting to Thoth")
|
||||
thothClient, err := thoth.New(ctx, *thothURL, *thothToken, *thothInsecure)
|
||||
if err != nil {
|
||||
log.Fatalf("can't dial thoth at %s: %v", *thothURL, err)
|
||||
}
|
||||
|
||||
ctx = thoth.With(ctx, thothClient)
|
||||
}
|
||||
|
||||
policy, err := libanubis.LoadPoliciesOrDefault(ctx, *policyFname, *challengeDifficulty)
|
||||
if err != nil {
|
||||
log.Fatalf("can't parse policy file: %v", err)
|
||||
}
|
||||
@@ -260,12 +332,20 @@ func main() {
|
||||
} else if strings.HasSuffix(*basePrefix, "/") {
|
||||
log.Fatalf("[misconfiguration] base-prefix must not end with a slash")
|
||||
}
|
||||
if *stripBasePrefix && *basePrefix == "" {
|
||||
log.Fatalf("[misconfiguration] strip-base-prefix is set to true, but base-prefix is not set, " +
|
||||
"this may result in unexpected behavior")
|
||||
}
|
||||
|
||||
var priv ed25519.PrivateKey
|
||||
if *ed25519PrivateKeyHex != "" && *ed25519PrivateKeyHexFile != "" {
|
||||
var ed25519Priv ed25519.PrivateKey
|
||||
if *hs512Secret != "" && (*ed25519PrivateKeyHex != "" || *ed25519PrivateKeyHexFile != "") {
|
||||
log.Fatal("do not specify both HS512 and ED25519 secrets")
|
||||
} else if *hs512Secret != "" {
|
||||
ed25519Priv = ed25519.PrivateKey(*hs512Secret)
|
||||
} else if *ed25519PrivateKeyHex != "" && *ed25519PrivateKeyHexFile != "" {
|
||||
log.Fatal("do not specify both ED25519_PRIVATE_KEY_HEX and ED25519_PRIVATE_KEY_HEX_FILE")
|
||||
} else if *ed25519PrivateKeyHex != "" {
|
||||
priv, err = keyFromHex(*ed25519PrivateKeyHex)
|
||||
ed25519Priv, err = keyFromHex(*ed25519PrivateKeyHex)
|
||||
if err != nil {
|
||||
log.Fatalf("failed to parse and validate ED25519_PRIVATE_KEY_HEX: %v", err)
|
||||
}
|
||||
@@ -275,12 +355,12 @@ func main() {
|
||||
log.Fatalf("failed to read ED25519_PRIVATE_KEY_HEX_FILE %s: %v", *ed25519PrivateKeyHexFile, err)
|
||||
}
|
||||
|
||||
priv, err = keyFromHex(string(bytes.TrimSpace(hexFile)))
|
||||
ed25519Priv, err = keyFromHex(string(bytes.TrimSpace(hexFile)))
|
||||
if err != nil {
|
||||
log.Fatalf("failed to parse and validate content of ED25519_PRIVATE_KEY_HEX_FILE: %v", err)
|
||||
}
|
||||
} else {
|
||||
_, priv, err = ed25519.GenerateKey(rand.Reader)
|
||||
_, ed25519Priv, err = ed25519.GenerateKey(rand.Reader)
|
||||
if err != nil {
|
||||
log.Fatalf("failed to generate ed25519 key: %v", err)
|
||||
}
|
||||
@@ -302,42 +382,47 @@ func main() {
|
||||
slog.Warn("REDIRECT_DOMAINS is not set, Anubis will only redirect to the same domain a request is coming from, see https://anubis.techaro.lol/docs/admin/configuration/redirect-domains")
|
||||
}
|
||||
|
||||
anubis.CookieName = *cookiePrefix + "-auth"
|
||||
anubis.TestCookieName = *cookiePrefix + "-cookie-verification"
|
||||
anubis.ForcedLanguage = *forcedLanguage
|
||||
|
||||
// If OpenGraph configuration values are not set in the config file, use the
|
||||
// values from flags / envvars.
|
||||
if !policy.OpenGraph.Enabled {
|
||||
policy.OpenGraph.Enabled = *ogPassthrough
|
||||
policy.OpenGraph.ConsiderHost = *ogCacheConsiderHost
|
||||
policy.OpenGraph.TimeToLive = *ogTimeToLive
|
||||
policy.OpenGraph.Override = map[string]string{}
|
||||
}
|
||||
|
||||
s, err := libanubis.New(libanubis.Options{
|
||||
BasePrefix: *basePrefix,
|
||||
Next: rp,
|
||||
Policy: policy,
|
||||
ServeRobotsTXT: *robotsTxt,
|
||||
PrivateKey: priv,
|
||||
CookieDomain: *cookieDomain,
|
||||
CookieExpiration: *cookieExpiration,
|
||||
CookiePartitioned: *cookiePartitioned,
|
||||
OGPassthrough: *ogPassthrough,
|
||||
OGTimeToLive: *ogTimeToLive,
|
||||
RedirectDomains: redirectDomainsList,
|
||||
Target: *target,
|
||||
WebmasterEmail: *webmasterEmail,
|
||||
OGCacheConsidersHost: *ogCacheConsiderHost,
|
||||
BasePrefix: *basePrefix,
|
||||
StripBasePrefix: *stripBasePrefix,
|
||||
Next: rp,
|
||||
Policy: policy,
|
||||
ServeRobotsTXT: *robotsTxt,
|
||||
ED25519PrivateKey: ed25519Priv,
|
||||
HS512Secret: []byte(*hs512Secret),
|
||||
CookieDomain: *cookieDomain,
|
||||
CookieDynamicDomain: *cookieDynamicDomain,
|
||||
CookieExpiration: *cookieExpiration,
|
||||
CookiePartitioned: *cookiePartitioned,
|
||||
RedirectDomains: redirectDomainsList,
|
||||
Target: *target,
|
||||
WebmasterEmail: *webmasterEmail,
|
||||
OpenGraph: policy.OpenGraph,
|
||||
CookieSecure: *cookieSecure,
|
||||
})
|
||||
if err != nil {
|
||||
log.Fatalf("can't construct libanubis.Server: %v", err)
|
||||
}
|
||||
|
||||
wg := new(sync.WaitGroup)
|
||||
// install signal handler
|
||||
ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
|
||||
defer stop()
|
||||
|
||||
if *metricsBind != "" {
|
||||
wg.Add(1)
|
||||
go metricsServer(ctx, wg.Done)
|
||||
}
|
||||
go startDecayMapCleanup(ctx, s)
|
||||
|
||||
var h http.Handler
|
||||
h = s
|
||||
h = internal.RemoteXRealIP(*useRemoteAddress, *bindNetwork, h)
|
||||
h = internal.XForwardedForToXRealIP(h)
|
||||
h = internal.XForwardedForUpdate(*xffStripPrivate, h)
|
||||
h = internal.JA4H(h)
|
||||
|
||||
srv := http.Server{Handler: h, ErrorLog: internal.GetFilteredHTTPLogger()}
|
||||
listener, listenerUrl := setupListener(*bindNetwork, *bind)
|
||||
@@ -366,6 +451,8 @@ func main() {
|
||||
}
|
||||
}()
|
||||
|
||||
internal.SetHealth("anubis", healthv1.HealthCheckResponse_SERVING)
|
||||
|
||||
if err := srv.Serve(listener); !errors.Is(err, http.ErrServerClosed) {
|
||||
log.Fatal(err)
|
||||
}
|
||||
@@ -376,20 +463,30 @@ func metricsServer(ctx context.Context, done func()) {
|
||||
defer done()
|
||||
|
||||
mux := http.NewServeMux()
|
||||
mux.Handle(anubis.BasePrefix+"/metrics", promhttp.Handler())
|
||||
mux.Handle("/metrics", promhttp.Handler())
|
||||
mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
|
||||
st, ok := internal.GetHealth("anubis")
|
||||
if !ok {
|
||||
slog.Error("health service anubis does not exist, file a bug")
|
||||
}
|
||||
|
||||
switch st {
|
||||
case healthv1.HealthCheckResponse_NOT_SERVING:
|
||||
http.Error(w, "NOT OK", http.StatusInternalServerError)
|
||||
return
|
||||
case healthv1.HealthCheckResponse_SERVING:
|
||||
fmt.Fprintln(w, "OK")
|
||||
return
|
||||
default:
|
||||
http.Error(w, "UNKNOWN", http.StatusFailedDependency)
|
||||
return
|
||||
}
|
||||
})
|
||||
|
||||
srv := http.Server{Handler: mux, ErrorLog: internal.GetFilteredHTTPLogger()}
|
||||
listener, metricsUrl := setupListener(*metricsBindNetwork, *metricsBind)
|
||||
slog.Debug("listening for metrics", "url", metricsUrl)
|
||||
|
||||
if *healthcheck {
|
||||
log.Println("running healthcheck")
|
||||
if err := doHealthCheck(); err != nil {
|
||||
log.Fatal(err)
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
go func() {
|
||||
<-ctx.Done()
|
||||
c, cancel := context.WithTimeout(context.Background(), 5*time.Second)
|
||||
|
||||
78
cmd/robots2policy/batch/batch_process.go
Normal file
78
cmd/robots2policy/batch/batch_process.go
Normal file
@@ -0,0 +1,78 @@
|
||||
/*
|
||||
Batch process robots.txt files from archives like https://github.com/nrjones8/robots-dot-txt-archive-bot/tree/master/data/cleaned
|
||||
into Anubis CEL policies. Usage: go run batch_process.go <directory with robots.txt files>
|
||||
*/
|
||||
package main
|
||||
|
||||
import (
|
||||
"fmt"
|
||||
"io/fs"
|
||||
"log"
|
||||
"os"
|
||||
"os/exec"
|
||||
"path/filepath"
|
||||
"strings"
|
||||
)
|
||||
|
||||
func main() {
|
||||
if len(os.Args) < 2 {
|
||||
fmt.Println("Usage: go run batch_process.go <cleaned_directory>")
|
||||
fmt.Println("Example: go run batch_process.go ./cleaned")
|
||||
os.Exit(1)
|
||||
}
|
||||
|
||||
cleanedDir := os.Args[1]
|
||||
outputDir := "generated_policies"
|
||||
|
||||
// Create output directory
|
||||
if err := os.MkdirAll(outputDir, 0755); err != nil {
|
||||
log.Fatalf("Failed to create output directory: %v", err)
|
||||
}
|
||||
|
||||
count := 0
|
||||
err := filepath.WalkDir(cleanedDir, func(path string, d fs.DirEntry, err error) error {
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
// Skip directories
|
||||
if d.IsDir() {
|
||||
return nil
|
||||
}
|
||||
|
||||
// Generate policy name from file path
|
||||
relPath, _ := filepath.Rel(cleanedDir, path)
|
||||
policyName := strings.ReplaceAll(relPath, "/", "-")
|
||||
policyName = strings.TrimSuffix(policyName, "-robots.txt")
|
||||
policyName = strings.ReplaceAll(policyName, ".", "-")
|
||||
|
||||
outputFile := filepath.Join(outputDir, policyName+".yaml")
|
||||
|
||||
cmd := exec.Command("go", "run", "main.go",
|
||||
"-input", path,
|
||||
"-output", outputFile,
|
||||
"-name", policyName,
|
||||
"-format", "yaml")
|
||||
|
||||
if err := cmd.Run(); err != nil {
|
||||
fmt.Printf("Warning: Failed to process %s: %v\n", path, err)
|
||||
return nil // Continue processing other files
|
||||
}
|
||||
|
||||
count++
|
||||
if count%100 == 0 {
|
||||
fmt.Printf("Processed %d files...\n", count)
|
||||
} else if count%10 == 0 {
|
||||
fmt.Print(".")
|
||||
}
|
||||
|
||||
return nil
|
||||
})
|
||||
|
||||
if err != nil {
|
||||
log.Fatalf("Error walking directory: %v", err)
|
||||
}
|
||||
|
||||
fmt.Printf("Successfully processed %d robots.txt files\n", count)
|
||||
fmt.Printf("Generated policies saved to: %s/\n", outputDir)
|
||||
}
|
||||
313
cmd/robots2policy/main.go
Normal file
313
cmd/robots2policy/main.go
Normal file
@@ -0,0 +1,313 @@
|
||||
package main
|
||||
|
||||
import (
|
||||
"bufio"
|
||||
"encoding/json"
|
||||
"flag"
|
||||
"fmt"
|
||||
"io"
|
||||
"log"
|
||||
"net/http"
|
||||
"os"
|
||||
"regexp"
|
||||
"strings"
|
||||
|
||||
"github.com/TecharoHQ/anubis/lib/policy/config"
|
||||
|
||||
"sigs.k8s.io/yaml"
|
||||
)
|
||||
|
||||
var (
|
||||
inputFile = flag.String("input", "", "path to robots.txt file (use - for stdin)")
|
||||
outputFile = flag.String("output", "", "output file path (use - for stdout, defaults to stdout)")
|
||||
outputFormat = flag.String("format", "yaml", "output format: yaml or json")
|
||||
baseAction = flag.String("action", "CHALLENGE", "default action for disallowed paths: ALLOW, DENY, CHALLENGE, WEIGH")
|
||||
crawlDelay = flag.Int("crawl-delay-weight", 0, "if > 0, add weight adjustment for crawl-delay (difficulty adjustment)")
|
||||
policyName = flag.String("name", "robots-txt-policy", "name for the generated policy")
|
||||
userAgentDeny = flag.String("deny-user-agents", "DENY", "action for specifically blocked user agents: DENY, CHALLENGE")
|
||||
helpFlag = flag.Bool("help", false, "show help")
|
||||
)
|
||||
|
||||
type RobotsRule struct {
|
||||
UserAgent string
|
||||
Disallows []string
|
||||
Allows []string
|
||||
CrawlDelay int
|
||||
IsBlacklist bool // true if this is a specifically denied user agent
|
||||
}
|
||||
|
||||
type AnubisRule struct {
|
||||
Expression *config.ExpressionOrList `yaml:"expression,omitempty" json:"expression,omitempty"`
|
||||
Challenge *config.ChallengeRules `yaml:"challenge,omitempty" json:"challenge,omitempty"`
|
||||
Weight *config.Weight `yaml:"weight,omitempty" json:"weight,omitempty"`
|
||||
Name string `yaml:"name" json:"name"`
|
||||
Action string `yaml:"action" json:"action"`
|
||||
}
|
||||
|
||||
func init() {
|
||||
flag.Usage = func() {
|
||||
fmt.Fprintf(os.Stderr, "Usage of %s:\n", os.Args[0])
|
||||
fmt.Fprintf(os.Stderr, "%s [options] -input <robots.txt>\n\n", os.Args[0])
|
||||
flag.PrintDefaults()
|
||||
fmt.Fprintln(os.Stderr, "\nExamples:")
|
||||
fmt.Fprintln(os.Stderr, " # Convert local robots.txt file")
|
||||
fmt.Fprintln(os.Stderr, " robots2policy -input robots.txt -output policy.yaml")
|
||||
fmt.Fprintln(os.Stderr, "")
|
||||
fmt.Fprintln(os.Stderr, " # Convert from URL")
|
||||
fmt.Fprintln(os.Stderr, " robots2policy -input https://example.com/robots.txt -format json")
|
||||
fmt.Fprintln(os.Stderr, "")
|
||||
fmt.Fprintln(os.Stderr, " # Read from stdin, write to stdout")
|
||||
fmt.Fprintln(os.Stderr, " curl https://example.com/robots.txt | robots2policy -input -")
|
||||
os.Exit(2)
|
||||
}
|
||||
}
|
||||
|
||||
func main() {
|
||||
flag.Parse()
|
||||
|
||||
if len(flag.Args()) > 0 || *helpFlag || *inputFile == "" {
|
||||
flag.Usage()
|
||||
}
|
||||
|
||||
// Read robots.txt
|
||||
var input io.Reader
|
||||
if *inputFile == "-" {
|
||||
input = os.Stdin
|
||||
} else if strings.HasPrefix(*inputFile, "http://") || strings.HasPrefix(*inputFile, "https://") {
|
||||
resp, err := http.Get(*inputFile)
|
||||
if err != nil {
|
||||
log.Fatalf("failed to fetch robots.txt from URL: %v", err)
|
||||
}
|
||||
defer resp.Body.Close()
|
||||
input = resp.Body
|
||||
} else {
|
||||
file, err := os.Open(*inputFile)
|
||||
if err != nil {
|
||||
log.Fatalf("failed to open input file: %v", err)
|
||||
}
|
||||
defer file.Close()
|
||||
input = file
|
||||
}
|
||||
|
||||
// Parse robots.txt
|
||||
rules, err := parseRobotsTxt(input)
|
||||
if err != nil {
|
||||
log.Fatalf("failed to parse robots.txt: %v", err)
|
||||
}
|
||||
|
||||
// Convert to Anubis rules
|
||||
anubisRules := convertToAnubisRules(rules)
|
||||
|
||||
// Check if any rules were generated
|
||||
if len(anubisRules) == 0 {
|
||||
log.Fatal("no valid rules generated from robots.txt - file may be empty or contain no disallow directives")
|
||||
}
|
||||
|
||||
// Generate output
|
||||
var output []byte
|
||||
switch strings.ToLower(*outputFormat) {
|
||||
case "yaml":
|
||||
output, err = yaml.Marshal(anubisRules)
|
||||
case "json":
|
||||
output, err = json.MarshalIndent(anubisRules, "", " ")
|
||||
default:
|
||||
log.Fatalf("unsupported output format: %s (use yaml or json)", *outputFormat)
|
||||
}
|
||||
|
||||
if err != nil {
|
||||
log.Fatalf("failed to marshal output: %v", err)
|
||||
}
|
||||
|
||||
// Write output
|
||||
if *outputFile == "" || *outputFile == "-" {
|
||||
fmt.Print(string(output))
|
||||
} else {
|
||||
err = os.WriteFile(*outputFile, output, 0644)
|
||||
if err != nil {
|
||||
log.Fatalf("failed to write output file: %v", err)
|
||||
}
|
||||
fmt.Printf("Generated Anubis policy written to %s\n", *outputFile)
|
||||
}
|
||||
}
|
||||
|
||||
func parseRobotsTxt(input io.Reader) ([]RobotsRule, error) {
|
||||
scanner := bufio.NewScanner(input)
|
||||
var rules []RobotsRule
|
||||
var currentRule *RobotsRule
|
||||
|
||||
for scanner.Scan() {
|
||||
line := strings.TrimSpace(scanner.Text())
|
||||
|
||||
// Skip empty lines and comments
|
||||
if line == "" || strings.HasPrefix(line, "#") {
|
||||
continue
|
||||
}
|
||||
|
||||
// Split on first colon
|
||||
parts := strings.SplitN(line, ":", 2)
|
||||
if len(parts) != 2 {
|
||||
continue
|
||||
}
|
||||
|
||||
directive := strings.TrimSpace(strings.ToLower(parts[0]))
|
||||
value := strings.TrimSpace(parts[1])
|
||||
|
||||
switch directive {
|
||||
case "user-agent":
|
||||
// Start a new rule section
|
||||
if currentRule != nil {
|
||||
rules = append(rules, *currentRule)
|
||||
}
|
||||
currentRule = &RobotsRule{
|
||||
UserAgent: value,
|
||||
Disallows: make([]string, 0),
|
||||
Allows: make([]string, 0),
|
||||
}
|
||||
|
||||
case "disallow":
|
||||
if currentRule != nil && value != "" {
|
||||
currentRule.Disallows = append(currentRule.Disallows, value)
|
||||
}
|
||||
|
||||
case "allow":
|
||||
if currentRule != nil && value != "" {
|
||||
currentRule.Allows = append(currentRule.Allows, value)
|
||||
}
|
||||
|
||||
case "crawl-delay":
|
||||
if currentRule != nil {
|
||||
if delay, err := parseIntSafe(value); err == nil {
|
||||
currentRule.CrawlDelay = delay
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Don't forget the last rule
|
||||
if currentRule != nil {
|
||||
rules = append(rules, *currentRule)
|
||||
}
|
||||
|
||||
// Mark blacklisted user agents (those with "Disallow: /")
|
||||
for i := range rules {
|
||||
for _, disallow := range rules[i].Disallows {
|
||||
if disallow == "/" {
|
||||
rules[i].IsBlacklist = true
|
||||
break
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return rules, scanner.Err()
|
||||
}
|
||||
|
||||
func parseIntSafe(s string) (int, error) {
|
||||
var result int
|
||||
_, err := fmt.Sscanf(s, "%d", &result)
|
||||
return result, err
|
||||
}
|
||||
|
||||
func convertToAnubisRules(robotsRules []RobotsRule) []AnubisRule {
|
||||
var anubisRules []AnubisRule
|
||||
ruleCounter := 0
|
||||
|
||||
for _, robotsRule := range robotsRules {
|
||||
userAgent := robotsRule.UserAgent
|
||||
|
||||
// Handle crawl delay as weight adjustment (do this first before any continues)
|
||||
if robotsRule.CrawlDelay > 0 && *crawlDelay > 0 {
|
||||
ruleCounter++
|
||||
rule := AnubisRule{
|
||||
Name: fmt.Sprintf("%s-crawl-delay-%d", *policyName, ruleCounter),
|
||||
Action: "WEIGH",
|
||||
Weight: &config.Weight{Adjust: *crawlDelay},
|
||||
}
|
||||
|
||||
if userAgent == "*" {
|
||||
rule.Expression = &config.ExpressionOrList{
|
||||
All: []string{"true"}, // Always applies
|
||||
}
|
||||
} else {
|
||||
rule.Expression = &config.ExpressionOrList{
|
||||
All: []string{fmt.Sprintf("userAgent.contains(%q)", userAgent)},
|
||||
}
|
||||
}
|
||||
|
||||
anubisRules = append(anubisRules, rule)
|
||||
}
|
||||
|
||||
// Handle blacklisted user agents (complete deny/challenge)
|
||||
if robotsRule.IsBlacklist {
|
||||
ruleCounter++
|
||||
rule := AnubisRule{
|
||||
Name: fmt.Sprintf("%s-blacklist-%d", *policyName, ruleCounter),
|
||||
Action: *userAgentDeny,
|
||||
}
|
||||
|
||||
if userAgent == "*" {
|
||||
// This would block everything - convert to a weight adjustment instead
|
||||
rule.Name = fmt.Sprintf("%s-global-restriction-%d", *policyName, ruleCounter)
|
||||
rule.Action = "WEIGH"
|
||||
rule.Weight = &config.Weight{Adjust: 20} // Increase difficulty significantly
|
||||
rule.Expression = &config.ExpressionOrList{
|
||||
All: []string{"true"}, // Always applies
|
||||
}
|
||||
} else {
|
||||
rule.Expression = &config.ExpressionOrList{
|
||||
All: []string{fmt.Sprintf("userAgent.contains(%q)", userAgent)},
|
||||
}
|
||||
}
|
||||
anubisRules = append(anubisRules, rule)
|
||||
continue
|
||||
}
|
||||
|
||||
// Handle specific disallow rules
|
||||
for _, disallow := range robotsRule.Disallows {
|
||||
if disallow == "/" {
|
||||
continue // Already handled as blacklist above
|
||||
}
|
||||
|
||||
ruleCounter++
|
||||
rule := AnubisRule{
|
||||
Name: fmt.Sprintf("%s-disallow-%d", *policyName, ruleCounter),
|
||||
Action: *baseAction,
|
||||
}
|
||||
|
||||
// Build CEL expression
|
||||
var conditions []string
|
||||
|
||||
// Add user agent condition if not wildcard
|
||||
if userAgent != "*" {
|
||||
conditions = append(conditions, fmt.Sprintf("userAgent.contains(%q)", userAgent))
|
||||
}
|
||||
|
||||
// Add path condition
|
||||
pathCondition := buildPathCondition(disallow)
|
||||
conditions = append(conditions, pathCondition)
|
||||
|
||||
rule.Expression = &config.ExpressionOrList{
|
||||
All: conditions,
|
||||
}
|
||||
|
||||
anubisRules = append(anubisRules, rule)
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
return anubisRules
|
||||
}
|
||||
|
||||
func buildPathCondition(robotsPath string) string {
|
||||
// Handle wildcards in robots.txt paths
|
||||
if strings.Contains(robotsPath, "*") || strings.Contains(robotsPath, "?") {
|
||||
// Convert robots.txt wildcards to regex
|
||||
regex := regexp.QuoteMeta(robotsPath)
|
||||
regex = strings.ReplaceAll(regex, `\*`, `.*`) // * becomes .*
|
||||
regex = strings.ReplaceAll(regex, `\?`, `.`) // ? becomes .
|
||||
regex = "^" + regex
|
||||
return fmt.Sprintf("path.matches(%q)", regex)
|
||||
}
|
||||
|
||||
// Simple prefix match for most cases
|
||||
return fmt.Sprintf("path.startsWith(%q)", robotsPath)
|
||||
}
|
||||
418
cmd/robots2policy/robots2policy_test.go
Normal file
418
cmd/robots2policy/robots2policy_test.go
Normal file
@@ -0,0 +1,418 @@
|
||||
package main
|
||||
|
||||
import (
|
||||
"encoding/json"
|
||||
"fmt"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"reflect"
|
||||
"strings"
|
||||
"testing"
|
||||
|
||||
"gopkg.in/yaml.v3"
|
||||
)
|
||||
|
||||
type TestCase struct {
|
||||
name string
|
||||
robotsFile string
|
||||
expectedFile string
|
||||
options TestOptions
|
||||
}
|
||||
|
||||
type TestOptions struct {
|
||||
format string
|
||||
action string
|
||||
crawlDelayWeight int
|
||||
policyName string
|
||||
deniedAction string
|
||||
}
|
||||
|
||||
func TestDataFileConversion(t *testing.T) {
|
||||
|
||||
testCases := []TestCase{
|
||||
{
|
||||
name: "simple_default",
|
||||
robotsFile: "simple.robots.txt",
|
||||
expectedFile: "simple.yaml",
|
||||
options: TestOptions{format: "yaml"},
|
||||
},
|
||||
{
|
||||
name: "simple_json",
|
||||
robotsFile: "simple.robots.txt",
|
||||
expectedFile: "simple.json",
|
||||
options: TestOptions{format: "json"},
|
||||
},
|
||||
{
|
||||
name: "simple_deny_action",
|
||||
robotsFile: "simple.robots.txt",
|
||||
expectedFile: "deny-action.yaml",
|
||||
options: TestOptions{format: "yaml", action: "DENY"},
|
||||
},
|
||||
{
|
||||
name: "simple_custom_name",
|
||||
robotsFile: "simple.robots.txt",
|
||||
expectedFile: "custom-name.yaml",
|
||||
options: TestOptions{format: "yaml", policyName: "my-custom-policy"},
|
||||
},
|
||||
{
|
||||
name: "blacklist_with_crawl_delay",
|
||||
robotsFile: "blacklist.robots.txt",
|
||||
expectedFile: "blacklist.yaml",
|
||||
options: TestOptions{format: "yaml", crawlDelayWeight: 3},
|
||||
},
|
||||
{
|
||||
name: "wildcards",
|
||||
robotsFile: "wildcards.robots.txt",
|
||||
expectedFile: "wildcards.yaml",
|
||||
options: TestOptions{format: "yaml"},
|
||||
},
|
||||
{
|
||||
name: "empty_file",
|
||||
robotsFile: "empty.robots.txt",
|
||||
expectedFile: "empty.yaml",
|
||||
options: TestOptions{format: "yaml"},
|
||||
},
|
||||
{
|
||||
name: "complex_scenario",
|
||||
robotsFile: "complex.robots.txt",
|
||||
expectedFile: "complex.yaml",
|
||||
options: TestOptions{format: "yaml", crawlDelayWeight: 5},
|
||||
},
|
||||
}
|
||||
|
||||
for _, tc := range testCases {
|
||||
t.Run(tc.name, func(t *testing.T) {
|
||||
robotsPath := filepath.Join("testdata", tc.robotsFile)
|
||||
expectedPath := filepath.Join("testdata", tc.expectedFile)
|
||||
|
||||
// Read robots.txt input
|
||||
robotsFile, err := os.Open(robotsPath)
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to open robots file %s: %v", robotsPath, err)
|
||||
}
|
||||
defer robotsFile.Close()
|
||||
|
||||
// Parse robots.txt
|
||||
rules, err := parseRobotsTxt(robotsFile)
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to parse robots.txt: %v", err)
|
||||
}
|
||||
|
||||
// Set test options
|
||||
oldFormat := *outputFormat
|
||||
oldAction := *baseAction
|
||||
oldCrawlDelay := *crawlDelay
|
||||
oldPolicyName := *policyName
|
||||
oldDeniedAction := *userAgentDeny
|
||||
|
||||
if tc.options.format != "" {
|
||||
*outputFormat = tc.options.format
|
||||
}
|
||||
if tc.options.action != "" {
|
||||
*baseAction = tc.options.action
|
||||
}
|
||||
if tc.options.crawlDelayWeight > 0 {
|
||||
*crawlDelay = tc.options.crawlDelayWeight
|
||||
}
|
||||
if tc.options.policyName != "" {
|
||||
*policyName = tc.options.policyName
|
||||
}
|
||||
if tc.options.deniedAction != "" {
|
||||
*userAgentDeny = tc.options.deniedAction
|
||||
}
|
||||
|
||||
// Restore options after test
|
||||
defer func() {
|
||||
*outputFormat = oldFormat
|
||||
*baseAction = oldAction
|
||||
*crawlDelay = oldCrawlDelay
|
||||
*policyName = oldPolicyName
|
||||
*userAgentDeny = oldDeniedAction
|
||||
}()
|
||||
|
||||
// Convert to Anubis rules
|
||||
anubisRules := convertToAnubisRules(rules)
|
||||
|
||||
// Generate output
|
||||
var actualOutput []byte
|
||||
switch strings.ToLower(*outputFormat) {
|
||||
case "yaml":
|
||||
actualOutput, err = yaml.Marshal(anubisRules)
|
||||
case "json":
|
||||
actualOutput, err = json.MarshalIndent(anubisRules, "", " ")
|
||||
}
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to marshal output: %v", err)
|
||||
}
|
||||
|
||||
// Read expected output
|
||||
expectedOutput, err := os.ReadFile(expectedPath)
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to read expected file %s: %v", expectedPath, err)
|
||||
}
|
||||
|
||||
if strings.ToLower(*outputFormat) == "yaml" {
|
||||
var actualData []interface{}
|
||||
var expectedData []interface{}
|
||||
|
||||
err = yaml.Unmarshal(actualOutput, &actualData)
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to unmarshal actual output: %v", err)
|
||||
}
|
||||
|
||||
err = yaml.Unmarshal(expectedOutput, &expectedData)
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to unmarshal expected output: %v", err)
|
||||
}
|
||||
|
||||
// Compare data structures
|
||||
if !compareData(actualData, expectedData) {
|
||||
actualStr := strings.TrimSpace(string(actualOutput))
|
||||
expectedStr := strings.TrimSpace(string(expectedOutput))
|
||||
t.Errorf("Output mismatch for %s\nExpected:\n%s\n\nActual:\n%s", tc.name, expectedStr, actualStr)
|
||||
}
|
||||
} else {
|
||||
var actualData []interface{}
|
||||
var expectedData []interface{}
|
||||
|
||||
err = json.Unmarshal(actualOutput, &actualData)
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to unmarshal actual JSON output: %v", err)
|
||||
}
|
||||
|
||||
err = json.Unmarshal(expectedOutput, &expectedData)
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to unmarshal expected JSON output: %v", err)
|
||||
}
|
||||
|
||||
// Compare data structures
|
||||
if !compareData(actualData, expectedData) {
|
||||
actualStr := strings.TrimSpace(string(actualOutput))
|
||||
expectedStr := strings.TrimSpace(string(expectedOutput))
|
||||
t.Errorf("Output mismatch for %s\nExpected:\n%s\n\nActual:\n%s", tc.name, expectedStr, actualStr)
|
||||
}
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func TestCaseInsensitiveParsing(t *testing.T) {
|
||||
robotsTxt := `User-Agent: *
|
||||
Disallow: /admin
|
||||
Crawl-Delay: 10
|
||||
|
||||
User-agent: TestBot
|
||||
disallow: /test
|
||||
crawl-delay: 5
|
||||
|
||||
USER-AGENT: UpperBot
|
||||
DISALLOW: /upper
|
||||
CRAWL-DELAY: 20`
|
||||
|
||||
reader := strings.NewReader(robotsTxt)
|
||||
rules, err := parseRobotsTxt(reader)
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to parse case-insensitive robots.txt: %v", err)
|
||||
}
|
||||
|
||||
expectedRules := 3
|
||||
if len(rules) != expectedRules {
|
||||
t.Errorf("Expected %d rules, got %d", expectedRules, len(rules))
|
||||
}
|
||||
|
||||
// Check that all crawl delays were parsed
|
||||
for i, rule := range rules {
|
||||
expectedDelays := []int{10, 5, 20}
|
||||
if rule.CrawlDelay != expectedDelays[i] {
|
||||
t.Errorf("Rule %d: expected crawl delay %d, got %d", i, expectedDelays[i], rule.CrawlDelay)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestVariousOutputFormats(t *testing.T) {
|
||||
robotsTxt := `User-agent: *
|
||||
Disallow: /admin`
|
||||
|
||||
reader := strings.NewReader(robotsTxt)
|
||||
rules, err := parseRobotsTxt(reader)
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to parse robots.txt: %v", err)
|
||||
}
|
||||
|
||||
oldPolicyName := *policyName
|
||||
*policyName = "test-policy"
|
||||
defer func() { *policyName = oldPolicyName }()
|
||||
|
||||
anubisRules := convertToAnubisRules(rules)
|
||||
|
||||
// Test YAML output
|
||||
yamlOutput, err := yaml.Marshal(anubisRules)
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to marshal YAML: %v", err)
|
||||
}
|
||||
|
||||
if !strings.Contains(string(yamlOutput), "name: test-policy-disallow-1") {
|
||||
t.Errorf("YAML output doesn't contain expected rule name")
|
||||
}
|
||||
|
||||
// Test JSON output
|
||||
jsonOutput, err := json.MarshalIndent(anubisRules, "", " ")
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to marshal JSON: %v", err)
|
||||
}
|
||||
|
||||
if !strings.Contains(string(jsonOutput), `"name": "test-policy-disallow-1"`) {
|
||||
t.Errorf("JSON output doesn't contain expected rule name")
|
||||
}
|
||||
}
|
||||
|
||||
func TestDifferentActions(t *testing.T) {
|
||||
robotsTxt := `User-agent: *
|
||||
Disallow: /admin`
|
||||
|
||||
testActions := []string{"ALLOW", "DENY", "CHALLENGE", "WEIGH"}
|
||||
|
||||
for _, action := range testActions {
|
||||
t.Run("action_"+action, func(t *testing.T) {
|
||||
reader := strings.NewReader(robotsTxt)
|
||||
rules, err := parseRobotsTxt(reader)
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to parse robots.txt: %v", err)
|
||||
}
|
||||
|
||||
oldAction := *baseAction
|
||||
*baseAction = action
|
||||
defer func() { *baseAction = oldAction }()
|
||||
|
||||
anubisRules := convertToAnubisRules(rules)
|
||||
|
||||
if len(anubisRules) != 1 {
|
||||
t.Fatalf("Expected 1 rule, got %d", len(anubisRules))
|
||||
}
|
||||
|
||||
if anubisRules[0].Action != action {
|
||||
t.Errorf("Expected action %s, got %s", action, anubisRules[0].Action)
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func TestPolicyNaming(t *testing.T) {
|
||||
robotsTxt := `User-agent: *
|
||||
Disallow: /admin
|
||||
Disallow: /private
|
||||
|
||||
User-agent: BadBot
|
||||
Disallow: /`
|
||||
|
||||
testNames := []string{"custom-policy", "my-rules", "site-protection"}
|
||||
|
||||
for _, name := range testNames {
|
||||
t.Run("name_"+name, func(t *testing.T) {
|
||||
reader := strings.NewReader(robotsTxt)
|
||||
rules, err := parseRobotsTxt(reader)
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to parse robots.txt: %v", err)
|
||||
}
|
||||
|
||||
oldName := *policyName
|
||||
*policyName = name
|
||||
defer func() { *policyName = oldName }()
|
||||
|
||||
anubisRules := convertToAnubisRules(rules)
|
||||
|
||||
// Check that all rule names use the custom prefix
|
||||
for _, rule := range anubisRules {
|
||||
if !strings.HasPrefix(rule.Name, name+"-") {
|
||||
t.Errorf("Rule name %s doesn't start with expected prefix %s-", rule.Name, name)
|
||||
}
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func TestCrawlDelayWeights(t *testing.T) {
|
||||
robotsTxt := `User-agent: *
|
||||
Disallow: /admin
|
||||
Crawl-delay: 10
|
||||
|
||||
User-agent: SlowBot
|
||||
Disallow: /slow
|
||||
Crawl-delay: 60`
|
||||
|
||||
testWeights := []int{1, 5, 10, 25}
|
||||
|
||||
for _, weight := range testWeights {
|
||||
t.Run(fmt.Sprintf("weight_%d", weight), func(t *testing.T) {
|
||||
reader := strings.NewReader(robotsTxt)
|
||||
rules, err := parseRobotsTxt(reader)
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to parse robots.txt: %v", err)
|
||||
}
|
||||
|
||||
oldWeight := *crawlDelay
|
||||
*crawlDelay = weight
|
||||
defer func() { *crawlDelay = oldWeight }()
|
||||
|
||||
anubisRules := convertToAnubisRules(rules)
|
||||
|
||||
// Count weight rules and verify they have correct weight
|
||||
weightRules := 0
|
||||
for _, rule := range anubisRules {
|
||||
if rule.Action == "WEIGH" && rule.Weight != nil {
|
||||
weightRules++
|
||||
if rule.Weight.Adjust != weight {
|
||||
t.Errorf("Expected weight %d, got %d", weight, rule.Weight.Adjust)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
expectedWeightRules := 2 // One for *, one for SlowBot
|
||||
if weightRules != expectedWeightRules {
|
||||
t.Errorf("Expected %d weight rules, got %d", expectedWeightRules, weightRules)
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func TestBlacklistActions(t *testing.T) {
|
||||
robotsTxt := `User-agent: BadBot
|
||||
Disallow: /
|
||||
|
||||
User-agent: SpamBot
|
||||
Disallow: /`
|
||||
|
||||
testActions := []string{"DENY", "CHALLENGE"}
|
||||
|
||||
for _, action := range testActions {
|
||||
t.Run("blacklist_"+action, func(t *testing.T) {
|
||||
reader := strings.NewReader(robotsTxt)
|
||||
rules, err := parseRobotsTxt(reader)
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to parse robots.txt: %v", err)
|
||||
}
|
||||
|
||||
oldAction := *userAgentDeny
|
||||
*userAgentDeny = action
|
||||
defer func() { *userAgentDeny = oldAction }()
|
||||
|
||||
anubisRules := convertToAnubisRules(rules)
|
||||
|
||||
// All rules should be blacklist rules with the specified action
|
||||
for _, rule := range anubisRules {
|
||||
if !strings.Contains(rule.Name, "blacklist") {
|
||||
t.Errorf("Expected blacklist rule, got %s", rule.Name)
|
||||
}
|
||||
if rule.Action != action {
|
||||
t.Errorf("Expected action %s, got %s", action, rule.Action)
|
||||
}
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
// compareData performs a deep comparison of two data structures,
|
||||
// ignoring differences that are semantically equivalent in YAML/JSON
|
||||
func compareData(actual, expected interface{}) bool {
|
||||
return reflect.DeepEqual(actual, expected)
|
||||
}
|
||||
15
cmd/robots2policy/testdata/blacklist.robots.txt
vendored
Normal file
15
cmd/robots2policy/testdata/blacklist.robots.txt
vendored
Normal file
@@ -0,0 +1,15 @@
|
||||
# Test with blacklisted user agents
|
||||
User-agent: *
|
||||
Disallow: /admin
|
||||
Crawl-delay: 10
|
||||
|
||||
User-agent: BadBot
|
||||
Disallow: /
|
||||
|
||||
User-agent: SpamBot
|
||||
Disallow: /
|
||||
Crawl-delay: 60
|
||||
|
||||
User-agent: Googlebot
|
||||
Disallow: /search
|
||||
Crawl-delay: 5
|
||||
30
cmd/robots2policy/testdata/blacklist.yaml
vendored
Normal file
30
cmd/robots2policy/testdata/blacklist.yaml
vendored
Normal file
@@ -0,0 +1,30 @@
|
||||
- action: WEIGH
|
||||
expression: "true"
|
||||
name: robots-txt-policy-crawl-delay-1
|
||||
weight:
|
||||
adjust: 3
|
||||
- action: CHALLENGE
|
||||
expression: path.startsWith("/admin")
|
||||
name: robots-txt-policy-disallow-2
|
||||
- action: DENY
|
||||
expression: userAgent.contains("BadBot")
|
||||
name: robots-txt-policy-blacklist-3
|
||||
- action: WEIGH
|
||||
expression: userAgent.contains("SpamBot")
|
||||
name: robots-txt-policy-crawl-delay-4
|
||||
weight:
|
||||
adjust: 3
|
||||
- action: DENY
|
||||
expression: userAgent.contains("SpamBot")
|
||||
name: robots-txt-policy-blacklist-5
|
||||
- action: WEIGH
|
||||
expression: userAgent.contains("Googlebot")
|
||||
name: robots-txt-policy-crawl-delay-6
|
||||
weight:
|
||||
adjust: 3
|
||||
- action: CHALLENGE
|
||||
expression:
|
||||
all:
|
||||
- userAgent.contains("Googlebot")
|
||||
- path.startsWith("/search")
|
||||
name: robots-txt-policy-disallow-7
|
||||
30
cmd/robots2policy/testdata/complex.robots.txt
vendored
Normal file
30
cmd/robots2policy/testdata/complex.robots.txt
vendored
Normal file
@@ -0,0 +1,30 @@
|
||||
# Complex real-world example
|
||||
User-agent: *
|
||||
Disallow: /admin/
|
||||
Disallow: /private/
|
||||
Disallow: /api/internal/
|
||||
Allow: /api/public/
|
||||
Crawl-delay: 5
|
||||
|
||||
User-agent: Googlebot
|
||||
Disallow: /search/
|
||||
Allow: /api/
|
||||
Crawl-delay: 2
|
||||
|
||||
User-agent: Bingbot
|
||||
Disallow: /search/
|
||||
Disallow: /admin/
|
||||
Crawl-delay: 10
|
||||
|
||||
User-agent: BadBot
|
||||
Disallow: /
|
||||
|
||||
User-agent: SeoBot
|
||||
Disallow: /
|
||||
Crawl-delay: 300
|
||||
|
||||
# Test with various patterns
|
||||
User-agent: TestBot
|
||||
Disallow: /*/admin
|
||||
Disallow: /temp*.html
|
||||
Disallow: /file?.log
|
||||
71
cmd/robots2policy/testdata/complex.yaml
vendored
Normal file
71
cmd/robots2policy/testdata/complex.yaml
vendored
Normal file
@@ -0,0 +1,71 @@
|
||||
- action: WEIGH
|
||||
expression: "true"
|
||||
name: robots-txt-policy-crawl-delay-1
|
||||
weight:
|
||||
adjust: 5
|
||||
- action: CHALLENGE
|
||||
expression: path.startsWith("/admin/")
|
||||
name: robots-txt-policy-disallow-2
|
||||
- action: CHALLENGE
|
||||
expression: path.startsWith("/private/")
|
||||
name: robots-txt-policy-disallow-3
|
||||
- action: CHALLENGE
|
||||
expression: path.startsWith("/api/internal/")
|
||||
name: robots-txt-policy-disallow-4
|
||||
- action: WEIGH
|
||||
expression: userAgent.contains("Googlebot")
|
||||
name: robots-txt-policy-crawl-delay-5
|
||||
weight:
|
||||
adjust: 5
|
||||
- action: CHALLENGE
|
||||
expression:
|
||||
all:
|
||||
- userAgent.contains("Googlebot")
|
||||
- path.startsWith("/search/")
|
||||
name: robots-txt-policy-disallow-6
|
||||
- action: WEIGH
|
||||
expression: userAgent.contains("Bingbot")
|
||||
name: robots-txt-policy-crawl-delay-7
|
||||
weight:
|
||||
adjust: 5
|
||||
- action: CHALLENGE
|
||||
expression:
|
||||
all:
|
||||
- userAgent.contains("Bingbot")
|
||||
- path.startsWith("/search/")
|
||||
name: robots-txt-policy-disallow-8
|
||||
- action: CHALLENGE
|
||||
expression:
|
||||
all:
|
||||
- userAgent.contains("Bingbot")
|
||||
- path.startsWith("/admin/")
|
||||
name: robots-txt-policy-disallow-9
|
||||
- action: DENY
|
||||
expression: userAgent.contains("BadBot")
|
||||
name: robots-txt-policy-blacklist-10
|
||||
- action: WEIGH
|
||||
expression: userAgent.contains("SeoBot")
|
||||
name: robots-txt-policy-crawl-delay-11
|
||||
weight:
|
||||
adjust: 5
|
||||
- action: DENY
|
||||
expression: userAgent.contains("SeoBot")
|
||||
name: robots-txt-policy-blacklist-12
|
||||
- action: CHALLENGE
|
||||
expression:
|
||||
all:
|
||||
- userAgent.contains("TestBot")
|
||||
- path.matches("^/.*/admin")
|
||||
name: robots-txt-policy-disallow-13
|
||||
- action: CHALLENGE
|
||||
expression:
|
||||
all:
|
||||
- userAgent.contains("TestBot")
|
||||
- path.matches("^/temp.*\\.html")
|
||||
name: robots-txt-policy-disallow-14
|
||||
- action: CHALLENGE
|
||||
expression:
|
||||
all:
|
||||
- userAgent.contains("TestBot")
|
||||
- path.matches("^/file.\\.log")
|
||||
name: robots-txt-policy-disallow-15
|
||||
6
cmd/robots2policy/testdata/custom-name.yaml
vendored
Normal file
6
cmd/robots2policy/testdata/custom-name.yaml
vendored
Normal file
@@ -0,0 +1,6 @@
|
||||
- action: CHALLENGE
|
||||
expression: path.startsWith("/admin/")
|
||||
name: my-custom-policy-disallow-1
|
||||
- action: CHALLENGE
|
||||
expression: path.startsWith("/private")
|
||||
name: my-custom-policy-disallow-2
|
||||
6
cmd/robots2policy/testdata/deny-action.yaml
vendored
Normal file
6
cmd/robots2policy/testdata/deny-action.yaml
vendored
Normal file
@@ -0,0 +1,6 @@
|
||||
- action: DENY
|
||||
expression: path.startsWith("/admin/")
|
||||
name: robots-txt-policy-disallow-1
|
||||
- action: DENY
|
||||
expression: path.startsWith("/private")
|
||||
name: robots-txt-policy-disallow-2
|
||||
2
cmd/robots2policy/testdata/empty.robots.txt
vendored
Normal file
2
cmd/robots2policy/testdata/empty.robots.txt
vendored
Normal file
@@ -0,0 +1,2 @@
|
||||
# Empty robots.txt (comments only)
|
||||
# No actual rules
|
||||
1
cmd/robots2policy/testdata/empty.yaml
vendored
Normal file
1
cmd/robots2policy/testdata/empty.yaml
vendored
Normal file
@@ -0,0 +1 @@
|
||||
[]
|
||||
12
cmd/robots2policy/testdata/simple.json
vendored
Normal file
12
cmd/robots2policy/testdata/simple.json
vendored
Normal file
@@ -0,0 +1,12 @@
|
||||
[
|
||||
{
|
||||
"action": "CHALLENGE",
|
||||
"expression": "path.startsWith(\"/admin/\")",
|
||||
"name": "robots-txt-policy-disallow-1"
|
||||
},
|
||||
{
|
||||
"action": "CHALLENGE",
|
||||
"expression": "path.startsWith(\"/private\")",
|
||||
"name": "robots-txt-policy-disallow-2"
|
||||
}
|
||||
]
|
||||
5
cmd/robots2policy/testdata/simple.robots.txt
vendored
Normal file
5
cmd/robots2policy/testdata/simple.robots.txt
vendored
Normal file
@@ -0,0 +1,5 @@
|
||||
# Simple robots.txt test
|
||||
User-agent: *
|
||||
Disallow: /admin/
|
||||
Disallow: /private
|
||||
Allow: /public
|
||||
6
cmd/robots2policy/testdata/simple.yaml
vendored
Normal file
6
cmd/robots2policy/testdata/simple.yaml
vendored
Normal file
@@ -0,0 +1,6 @@
|
||||
- action: CHALLENGE
|
||||
expression: path.startsWith("/admin/")
|
||||
name: robots-txt-policy-disallow-1
|
||||
- action: CHALLENGE
|
||||
expression: path.startsWith("/private")
|
||||
name: robots-txt-policy-disallow-2
|
||||
6
cmd/robots2policy/testdata/wildcards.robots.txt
vendored
Normal file
6
cmd/robots2policy/testdata/wildcards.robots.txt
vendored
Normal file
@@ -0,0 +1,6 @@
|
||||
# Test wildcard patterns
|
||||
User-agent: *
|
||||
Disallow: /search*
|
||||
Disallow: /*/private
|
||||
Disallow: /file?.txt
|
||||
Disallow: /admin/*?action=delete
|
||||
12
cmd/robots2policy/testdata/wildcards.yaml
vendored
Normal file
12
cmd/robots2policy/testdata/wildcards.yaml
vendored
Normal file
@@ -0,0 +1,12 @@
|
||||
- action: CHALLENGE
|
||||
expression: path.matches("^/search.*")
|
||||
name: robots-txt-policy-disallow-1
|
||||
- action: CHALLENGE
|
||||
expression: path.matches("^/.*/private")
|
||||
name: robots-txt-policy-disallow-2
|
||||
- action: CHALLENGE
|
||||
expression: path.matches("^/file.\\.txt")
|
||||
name: robots-txt-policy-disallow-3
|
||||
- action: CHALLENGE
|
||||
expression: path.matches("^/admin/.*.action=delete")
|
||||
name: robots-txt-policy-disallow-4
|
||||
@@ -51,6 +51,48 @@ bots:
|
||||
# report_as: 4 # lie to the operator
|
||||
# algorithm: slow # intentionally waste CPU cycles and time
|
||||
|
||||
# Requires a subscription to Thoth to use, see
|
||||
# https://anubis.techaro.lol/docs/admin/thoth#geoip-based-filtering
|
||||
- name: countries-with-aggressive-scrapers
|
||||
action: WEIGH
|
||||
geoip:
|
||||
countries:
|
||||
- BR
|
||||
- CN
|
||||
weight:
|
||||
adjust: 10
|
||||
|
||||
# Requires a subscription to Thoth to use, see
|
||||
# https://anubis.techaro.lol/docs/admin/thoth#asn-based-filtering
|
||||
- name: aggressive-asns-without-functional-abuse-contact
|
||||
action: WEIGH
|
||||
asns:
|
||||
match:
|
||||
- 13335 # Cloudflare
|
||||
- 136907 # Huawei Cloud
|
||||
- 45102 # Alibaba Cloud
|
||||
weight:
|
||||
adjust: 10
|
||||
|
||||
# ## System load based checks.
|
||||
# # If the system is under high load, add weight.
|
||||
# - name: high-load-average
|
||||
# action: WEIGH
|
||||
# expression: load_1m >= 10.0 # make sure to end the load comparison in a .0
|
||||
# weight:
|
||||
# adjust: 20
|
||||
|
||||
## If your backend service is running on the same operating system as Anubis,
|
||||
## you can uncomment this rule to make the challenge easier when the system is
|
||||
## under low load.
|
||||
##
|
||||
## If it is not, remove weight.
|
||||
# - name: low-load-average
|
||||
# action: WEIGH
|
||||
# expression: load_15m <= 4.0 # make sure to end the load comparison in a .0
|
||||
# weight:
|
||||
# adjust: -10
|
||||
|
||||
# Generic catchall rule
|
||||
- name: generic-browser
|
||||
user_agent_regex: >-
|
||||
@@ -61,6 +103,59 @@ bots:
|
||||
|
||||
dnsbl: false
|
||||
|
||||
# #
|
||||
# impressum:
|
||||
# # Displayed at the bottom of every page rendered by Anubis.
|
||||
# footer: >-
|
||||
# This website is hosted by Zombocom. If you have any complaints or notes
|
||||
# about the service, please contact
|
||||
# <a href="mailto:contact@domainhere.example">contact@domainhere.example</a>
|
||||
# and we will assist you as soon as possible.
|
||||
|
||||
# # The imprint page that will be linked to at the footer of every Anubis page.
|
||||
# page:
|
||||
# # The HTML <title> of the page
|
||||
# title: Imprint and Privacy Policy
|
||||
# # The HTML contents of the page. The exact contents of this page can
|
||||
# # and will vary by locale. Please consult with a lawyer if you are not
|
||||
# # sure what to put here
|
||||
# body: >-
|
||||
# <p>Last updated: June 2025</p>
|
||||
|
||||
# <h2>Information that is gathered from visitors</h2>
|
||||
|
||||
# <p>In common with other websites, log files are stored on the web server saving details such as the visitor's IP address, browser type, referring page and time of visit.</p>
|
||||
|
||||
# <p>Cookies may be used to remember visitor preferences when interacting with the website.</p>
|
||||
|
||||
# <p>Where registration is required, the visitor's email and a username will be stored on the server.</p>
|
||||
|
||||
# <!-- ... -->
|
||||
|
||||
# Open Graph passthrough configuration, see here for more information:
|
||||
# https://anubis.techaro.lol/docs/admin/configuration/open-graph/
|
||||
openGraph:
|
||||
# Enables Open Graph passthrough
|
||||
enabled: false
|
||||
# Enables the use of the HTTP host in the cache key, this enables
|
||||
# caching metadata for multiple http hosts at once.
|
||||
considerHost: false
|
||||
# How long cached OpenGraph metadata should last in memory
|
||||
ttl: 24h
|
||||
# # If set, return these opengraph values instead of looking them up with
|
||||
# # the target service.
|
||||
# #
|
||||
# # Correlates to properties in https://ogp.me/
|
||||
# override:
|
||||
# # og:title is required, it is the title of the website
|
||||
# "og:title": "Techaro Anubis"
|
||||
# "og:description": >-
|
||||
# Anubis is a Web AI Firewall Utility that helps you fight the bots
|
||||
# away so that you can maintain uptime at work!
|
||||
# "description": >-
|
||||
# Anubis is a Web AI Firewall Utility that helps you fight the bots
|
||||
# away so that you can maintain uptime at work!
|
||||
|
||||
# By default, send HTTP 200 back to clients that either get issued a challenge
|
||||
# or a denial. This seems weird, but this is load-bearing due to the fact that
|
||||
# the most aggressive scraper bots seem to really, really, want an HTTP 200 and
|
||||
@@ -68,3 +163,65 @@ dnsbl: false
|
||||
status_codes:
|
||||
CHALLENGE: 200
|
||||
DENY: 200
|
||||
|
||||
# Anubis can store temporary data in one of a few backends. See the storage
|
||||
# backends section of the docs for more information:
|
||||
#
|
||||
# https://anubis.techaro.lol/docs/admin/policies#storage-backends
|
||||
store:
|
||||
backend: memory
|
||||
parameters: {}
|
||||
|
||||
# The weight thresholds for when to trigger individual challenges. Any
|
||||
# CHALLENGE will take precedence over this.
|
||||
#
|
||||
# A threshold has four configuration options:
|
||||
#
|
||||
# - name: the name that is reported down the stack and used for metrics
|
||||
# - expression: A CEL expression with the request weight in the variable
|
||||
# weight
|
||||
# - action: the Anubis action to apply, similar to in a bot policy
|
||||
# - challenge: which challenge to send to the user, similar to in a bot policy
|
||||
#
|
||||
# See https://anubis.techaro.lol/docs/admin/configuration/thresholds for more
|
||||
# information.
|
||||
thresholds:
|
||||
# By default Anubis ships with the following thresholds:
|
||||
- name: minimal-suspicion # This client is likely fine, its soul is lighter than a feather
|
||||
expression: weight <= 0 # a feather weighs zero units
|
||||
action: ALLOW # Allow the traffic through
|
||||
# For clients that had some weight reduced through custom rules, give them a
|
||||
# lightweight challenge.
|
||||
- name: mild-suspicion
|
||||
expression:
|
||||
all:
|
||||
- weight > 0
|
||||
- weight < 10
|
||||
action: CHALLENGE
|
||||
challenge:
|
||||
# https://anubis.techaro.lol/docs/admin/configuration/challenges/metarefresh
|
||||
algorithm: metarefresh
|
||||
difficulty: 1
|
||||
report_as: 1
|
||||
# For clients that are browser-like but have either gained points from custom rules or
|
||||
# report as a standard browser.
|
||||
- name: moderate-suspicion
|
||||
expression:
|
||||
all:
|
||||
- weight >= 10
|
||||
- weight < 20
|
||||
action: CHALLENGE
|
||||
challenge:
|
||||
# https://anubis.techaro.lol/docs/admin/configuration/challenges/proof-of-work
|
||||
algorithm: fast
|
||||
difficulty: 2 # two leading zeros, very fast for most clients
|
||||
report_as: 2
|
||||
# For clients that are browser like and have gained many points from custom rules
|
||||
- name: extreme-suspicion
|
||||
expression: weight >= 20
|
||||
action: CHALLENGE
|
||||
challenge:
|
||||
# https://anubis.techaro.lol/docs/admin/configuration/challenges/proof-of-work
|
||||
algorithm: fast
|
||||
difficulty: 4
|
||||
report_as: 4
|
||||
|
||||
@@ -7,5 +7,5 @@
|
||||
# Warning: May contain user agents that _must_ be blocked in robots.txt, or the opt-out will have no effect.
|
||||
- name: "ai-catchall"
|
||||
user_agent_regex: >-
|
||||
AI2Bot|Ai2Bot-Dolma|aiHitBot|Amazonbot|anthropic-ai|Brightbot 1.0|Bytespider|CCBot|Claude-Web|cohere-ai|cohere-training-data-crawler|Cotoyogi|Crawlspace|Diffbot|DuckAssistBot|FacebookBot|Factset_spyderbot|FirecrawlAgent|FriendlyCrawler|Google-CloudVertexBot|GoogleOther|GoogleOther-Image|GoogleOther-Video|iaskspider/2.0|ICC-Crawler|ImagesiftBot|img2dataset|imgproxy|ISSCyberRiskCrawler|Kangaroo Bot|meta-externalagent|Meta-ExternalAgent|meta-externalfetcher|Meta-ExternalFetcher|NovaAct|omgili|omgilibot|Operator|PanguBot|Perplexity-User|PerplexityBot|PetalBot|QualifiedBot|Scrapy|SemrushBot-OCOB|SemrushBot-SWA|Sidetrade indexer bot|TikTokSpider|Timpibot|VelenPublicWebCrawler|Webzio-Extended|wpbot|YouBot
|
||||
AI2Bot|Ai2Bot-Dolma|aiHitBot|Amazonbot|anthropic-ai|Brightbot 1.0|Bytespider|Claude-Web|cohere-ai|cohere-training-data-crawler|Cotoyogi|Crawlspace|Diffbot|DuckAssistBot|FacebookBot|Factset_spyderbot|FirecrawlAgent|FriendlyCrawler|Google-CloudVertexBot|GoogleOther|GoogleOther-Image|GoogleOther-Video|iaskspider/2.0|ICC-Crawler|ImagesiftBot|img2dataset|imgproxy|ISSCyberRiskCrawler|Kangaroo Bot|meta-externalagent|Meta-ExternalAgent|meta-externalfetcher|Meta-ExternalFetcher|NovaAct|omgili|omgilibot|Operator|PanguBot|Perplexity-User|PerplexityBot|PetalBot|QualifiedBot|Scrapy|SemrushBot-OCOB|SemrushBot-SWA|Sidetrade indexer bot|TikTokSpider|Timpibot|VelenPublicWebCrawler|Webzio-Extended|wpbot|YouBot
|
||||
action: DENY
|
||||
|
||||
@@ -1,6 +1,8 @@
|
||||
# Warning: Contains user agents that _must_ be blocked in robots.txt, or the opt-out will have no effect.
|
||||
# Note: Blocks human-directed/non-training user agents
|
||||
#
|
||||
# CCBot is allowed because if Common Crawl is allowed, then scrapers don't need to scrape to get the data.
|
||||
- name: "ai-robots-txt"
|
||||
user_agent_regex: >-
|
||||
AI2Bot|Ai2Bot-Dolma|aiHitBot|Amazonbot|Andibot|anthropic-ai|Applebot|Applebot-Extended|bedrockbot|Brightbot 1.0|Bytespider|CCBot|ChatGPT-User|Claude-SearchBot|Claude-User|Claude-Web|ClaudeBot|cohere-ai|cohere-training-data-crawler|Cotoyogi|Crawlspace|Diffbot|DuckAssistBot|FacebookBot|Factset_spyderbot|FirecrawlAgent|FriendlyCrawler|Google-CloudVertexBot|Google-Extended|GoogleOther|GoogleOther-Image|GoogleOther-Video|GPTBot|iaskspider/2.0|ICC-Crawler|ImagesiftBot|img2dataset|ISSCyberRiskCrawler|Kangaroo Bot|meta-externalagent|Meta-ExternalAgent|meta-externalfetcher|Meta-ExternalFetcher|MistralAI-User/1.0|NovaAct|OAI-SearchBot|omgili|omgilibot|Operator|PanguBot|Panscient|panscient.com|Perplexity-User|PerplexityBot|PetalBot|PhindBot|QualifiedBot|QuillBot|quillbot.com|SBIntuitionsBot|Scrapy|SemrushBot-OCOB|SemrushBot-SWA|Sidetrade indexer bot|TikTokSpider|Timpibot|VelenPublicWebCrawler|Webzio-Extended|wpbot|YandexAdditional|YandexAdditionalBot|YouBot
|
||||
AI2Bot|Ai2Bot-Dolma|aiHitBot|Amazonbot|Andibot|anthropic-ai|Applebot|Applebot-Extended|bedrockbot|Brightbot 1.0|Bytespider|ChatGPT-User|Claude-SearchBot|Claude-User|Claude-Web|ClaudeBot|cohere-ai|cohere-training-data-crawler|Cotoyogi|Crawlspace|Diffbot|DuckAssistBot|EchoboxBot|FacebookBot|facebookexternalhit|Factset_spyderbot|FirecrawlAgent|FriendlyCrawler|Google-CloudVertexBot|Google-Extended|GoogleOther|GoogleOther-Image|GoogleOther-Video|GPTBot|iaskspider/2.0|ICC-Crawler|ImagesiftBot|img2dataset|ISSCyberRiskCrawler|Kangaroo Bot|meta-externalagent|Meta-ExternalAgent|meta-externalfetcher|Meta-ExternalFetcher|MistralAI-User/1.0|MyCentralAIScraperBot|NovaAct|OAI-SearchBot|omgili|omgilibot|Operator|PanguBot|Panscient|panscient.com|Perplexity-User|PerplexityBot|PetalBot|PhindBot|Poseidon Research Crawler|QualifiedBot|QuillBot|quillbot.com|SBIntuitionsBot|Scrapy|SemrushBot|SemrushBot-BA|SemrushBot-CT|SemrushBot-OCOB|SemrushBot-SI|SemrushBot-SWA|Sidetrade indexer bot|TikTokSpider|Timpibot|VelenPublicWebCrawler|Webzio-Extended|wpbot|YandexAdditional|YandexAdditionalBot|YouBot
|
||||
action: DENY
|
||||
|
||||
@@ -6,4 +6,5 @@
|
||||
- import: (data)/crawlers/internet-archive.yaml
|
||||
- import: (data)/crawlers/kagibot.yaml
|
||||
- import: (data)/crawlers/marginalia.yaml
|
||||
- import: (data)/crawlers/mojeekbot.yaml
|
||||
- import: (data)/crawlers/mojeekbot.yaml
|
||||
- import: (data)/crawlers/commoncrawl.yaml
|
||||
|
||||
12
data/crawlers/commoncrawl.yaml
Normal file
12
data/crawlers/commoncrawl.yaml
Normal file
@@ -0,0 +1,12 @@
|
||||
- name: common-crawl
|
||||
user_agent_regex: CCBot
|
||||
action: ALLOW
|
||||
# https://index.commoncrawl.org/ccbot.json
|
||||
remote_addresses:
|
||||
[
|
||||
"2600:1f28:365:80b0::/60",
|
||||
"18.97.9.168/29",
|
||||
"18.97.14.80/29",
|
||||
"18.97.14.88/30",
|
||||
"98.85.178.216/32",
|
||||
]
|
||||
223
data/services/uptime-robot.yaml
Normal file
223
data/services/uptime-robot.yaml
Normal file
@@ -0,0 +1,223 @@
|
||||
- name: uptime-robot
|
||||
user_agent_regex: UptimeRobot
|
||||
action: ALLOW
|
||||
# https://api.uptimerobot.com/meta/ips
|
||||
remote_addresses: [
|
||||
"3.12.251.153/32",
|
||||
"3.20.63.178/32",
|
||||
"3.77.67.4/32",
|
||||
"3.79.134.69/32",
|
||||
"3.105.133.239/32",
|
||||
"3.105.190.221/32",
|
||||
"3.133.226.214/32",
|
||||
"3.149.57.90/32",
|
||||
"3.212.128.62/32",
|
||||
"5.161.61.238/32",
|
||||
"5.161.73.160/32",
|
||||
"5.161.75.7/32",
|
||||
"5.161.113.195/32",
|
||||
"5.161.117.52/32",
|
||||
"5.161.177.47/32",
|
||||
"5.161.194.92/32",
|
||||
"5.161.215.244/32",
|
||||
"5.223.43.32/32",
|
||||
"5.223.53.147/32",
|
||||
"5.223.57.22/32",
|
||||
"18.116.205.62/32",
|
||||
"18.180.208.214/32",
|
||||
"18.192.166.72/32",
|
||||
"18.193.252.127/32",
|
||||
"24.144.78.39/32",
|
||||
"24.144.78.185/32",
|
||||
"34.198.201.66/32",
|
||||
"45.55.123.175/32",
|
||||
"45.55.127.146/32",
|
||||
"49.13.24.81/32",
|
||||
"49.13.130.29/32",
|
||||
"49.13.134.145/32",
|
||||
"49.13.164.148/32",
|
||||
"49.13.167.123/32",
|
||||
"52.15.147.27/32",
|
||||
"52.22.236.30/32",
|
||||
"52.28.162.93/32",
|
||||
"52.59.43.236/32",
|
||||
"52.87.72.16/32",
|
||||
"54.64.67.106/32",
|
||||
"54.79.28.129/32",
|
||||
"54.87.112.51/32",
|
||||
"54.167.223.174/32",
|
||||
"54.249.170.27/32",
|
||||
"63.178.84.147/32",
|
||||
"64.225.81.248/32",
|
||||
"64.225.82.147/32",
|
||||
"69.162.124.227/32",
|
||||
"69.162.124.235/32",
|
||||
"69.162.124.238/32",
|
||||
"78.46.190.63/32",
|
||||
"78.46.215.1/32",
|
||||
"78.47.98.55/32",
|
||||
"78.47.173.76/32",
|
||||
"88.99.80.227/32",
|
||||
"91.99.101.207/32",
|
||||
"128.140.41.193/32",
|
||||
"128.140.106.114/32",
|
||||
"129.212.132.140/32",
|
||||
"134.199.240.137/32",
|
||||
"138.197.53.117/32",
|
||||
"138.197.53.138/32",
|
||||
"138.197.54.143/32",
|
||||
"138.197.54.247/32",
|
||||
"138.197.63.92/32",
|
||||
"139.59.50.44/32",
|
||||
"142.132.180.39/32",
|
||||
"143.198.249.237/32",
|
||||
"143.198.250.89/32",
|
||||
"143.244.196.21/32",
|
||||
"143.244.196.211/32",
|
||||
"143.244.221.177/32",
|
||||
"144.126.251.21/32",
|
||||
"146.190.9.187/32",
|
||||
"152.42.149.135/32",
|
||||
"157.90.155.240/32",
|
||||
"157.90.156.63/32",
|
||||
"159.69.158.189/32",
|
||||
"159.223.243.219/32",
|
||||
"161.35.247.201/32",
|
||||
"167.99.18.52/32",
|
||||
"167.235.143.113/32",
|
||||
"168.119.53.160/32",
|
||||
"168.119.96.239/32",
|
||||
"168.119.123.75/32",
|
||||
"170.64.250.64/32",
|
||||
"170.64.250.132/32",
|
||||
"170.64.250.235/32",
|
||||
"178.156.181.172/32",
|
||||
"178.156.184.20/32",
|
||||
"178.156.185.127/32",
|
||||
"178.156.185.231/32",
|
||||
"178.156.187.238/32",
|
||||
"178.156.189.113/32",
|
||||
"178.156.189.249/32",
|
||||
"188.166.201.79/32",
|
||||
"206.189.241.133/32",
|
||||
"209.38.49.1/32",
|
||||
"209.38.49.206/32",
|
||||
"209.38.49.226/32",
|
||||
"209.38.51.43/32",
|
||||
"209.38.53.7/32",
|
||||
"209.38.124.252/32",
|
||||
"216.144.248.18/31",
|
||||
"216.144.248.21/32",
|
||||
"216.144.248.22/31",
|
||||
"216.144.248.24/30",
|
||||
"216.144.248.28/31",
|
||||
"216.144.248.30/32",
|
||||
"216.245.221.83/32",
|
||||
"2400:6180:10:200::56a0:b000/128",
|
||||
"2400:6180:10:200::56a0:c000/128",
|
||||
"2400:6180:10:200::56a0:e000/128",
|
||||
"2400:6180:100:d0::94b6:4001/128",
|
||||
"2400:6180:100:d0::94b6:5001/128",
|
||||
"2400:6180:100:d0::94b6:7001/128",
|
||||
"2406:da14:94d:8601:9d0d:7754:bedf:e4f5/128",
|
||||
"2406:da14:94d:8601:b325:ff58:2bba:7934/128",
|
||||
"2406:da14:94d:8601:db4b:c5ac:2cbe:9a79/128",
|
||||
"2406:da1c:9c8:dc02:7ae1:f2ea:ab91:2fde/128",
|
||||
"2406:da1c:9c8:dc02:7db9:f38b:7b9f:402e/128",
|
||||
"2406:da1c:9c8:dc02:82b2:f0fd:ee96:579/128",
|
||||
"2600:1f16:775:3a00:ac3:c5eb:7081:942e/128",
|
||||
"2600:1f16:775:3a00:37bf:6026:e54a:f03a/128",
|
||||
"2600:1f16:775:3a00:3f24:5bb0:95d7:5a6b/128",
|
||||
"2600:1f16:775:3a00:8c2c:2ba6:778f:5be5/128",
|
||||
"2600:1f16:775:3a00:91ac:3120:ff38:92b5/128",
|
||||
"2600:1f16:775:3a00:dbbe:36b0:3c45:da32/128",
|
||||
"2600:1f18:179:f900:71:af9a:ade7:d772/128",
|
||||
"2600:1f18:179:f900:2406:9399:4ae6:c5d3/128",
|
||||
"2600:1f18:179:f900:4696:7729:7bb3:f52f/128",
|
||||
"2600:1f18:179:f900:4b7d:d1cc:2d10:211/128",
|
||||
"2600:1f18:179:f900:5c68:91b6:5d75:5d7/128",
|
||||
"2600:1f18:179:f900:e8dd:eed1:a6c:183b/128",
|
||||
"2604:a880:800:14:0:1:68ba:d000/128",
|
||||
"2604:a880:800:14:0:1:68ba:e000/128",
|
||||
"2604:a880:800:14:0:1:68bb:0/128",
|
||||
"2604:a880:800:14:0:1:68bb:1000/128",
|
||||
"2604:a880:800:14:0:1:68bb:3000/128",
|
||||
"2604:a880:800:14:0:1:68bb:4000/128",
|
||||
"2604:a880:800:14:0:1:68bb:5000/128",
|
||||
"2604:a880:800:14:0:1:68bb:6000/128",
|
||||
"2604:a880:800:14:0:1:68bb:7000/128",
|
||||
"2604:a880:800:14:0:1:68bb:a000/128",
|
||||
"2604:a880:800:14:0:1:68bb:b000/128",
|
||||
"2604:a880:800:14:0:1:68bb:c000/128",
|
||||
"2604:a880:800:14:0:1:68bb:d000/128",
|
||||
"2604:a880:800:14:0:1:68bb:e000/128",
|
||||
"2604:a880:800:14:0:1:68bb:f000/128",
|
||||
"2607:ff68:107::4/128",
|
||||
"2607:ff68:107::14/128",
|
||||
"2607:ff68:107::33/128",
|
||||
"2607:ff68:107::48/127",
|
||||
"2607:ff68:107::50/125",
|
||||
"2607:ff68:107::58/127",
|
||||
"2607:ff68:107::60/128",
|
||||
"2a01:4f8:c0c:83fa::1/128",
|
||||
"2a01:4f8:c17:42e4::1/128",
|
||||
"2a01:4f8:c2c:9fc6::1/128",
|
||||
"2a01:4f8:c2c:beae::1/128",
|
||||
"2a01:4f8:1c1a:3d53::1/128",
|
||||
"2a01:4f8:1c1b:4ef4::1/128",
|
||||
"2a01:4f8:1c1b:5b5a::1/128",
|
||||
"2a01:4f8:1c1b:7ecc::1/128",
|
||||
"2a01:4f8:1c1c:11aa::1/128",
|
||||
"2a01:4f8:1c1c:5353::1/128",
|
||||
"2a01:4f8:1c1c:7240::1/128",
|
||||
"2a01:4f8:1c1c:a98a::1/128",
|
||||
"2a01:4f8:c012:c60e::1/128",
|
||||
"2a01:4f8:c013:c18::1/128",
|
||||
"2a01:4f8:c013:34c0::1/128",
|
||||
"2a01:4f8:c013:3b0f::1/128",
|
||||
"2a01:4f8:c013:3c52::1/128",
|
||||
"2a01:4f8:c013:3c53::1/128",
|
||||
"2a01:4f8:c013:3c54::1/128",
|
||||
"2a01:4f8:c013:3c55::1/128",
|
||||
"2a01:4f8:c013:3c56::1/128",
|
||||
"2a01:4ff:f0:bfd::1/128",
|
||||
"2a01:4ff:f0:2219::1/128",
|
||||
"2a01:4ff:f0:3e03::1/128",
|
||||
"2a01:4ff:f0:5f80::1/128",
|
||||
"2a01:4ff:f0:7fad::1/128",
|
||||
"2a01:4ff:f0:9c5f::1/128",
|
||||
"2a01:4ff:f0:b2f2::1/128",
|
||||
"2a01:4ff:f0:b6f1::1/128",
|
||||
"2a01:4ff:f0:d283::1/128",
|
||||
"2a01:4ff:f0:d3cd::1/128",
|
||||
"2a01:4ff:f0:e516::1/128",
|
||||
"2a01:4ff:f0:e9cf::1/128",
|
||||
"2a01:4ff:f0:eccb::1/128",
|
||||
"2a01:4ff:f0:efd1::1/128",
|
||||
"2a01:4ff:f0:fdc7::1/128",
|
||||
"2a01:4ff:2f0:193c::1/128",
|
||||
"2a01:4ff:2f0:27de::1/128",
|
||||
"2a01:4ff:2f0:3b3a::1/128",
|
||||
"2a03:b0c0:2:f0::bd91:f001/128",
|
||||
"2a03:b0c0:2:f0::bd92:1/128",
|
||||
"2a03:b0c0:2:f0::bd92:1001/128",
|
||||
"2a03:b0c0:2:f0::bd92:2001/128",
|
||||
"2a03:b0c0:2:f0::bd92:4001/128",
|
||||
"2a03:b0c0:2:f0::bd92:5001/128",
|
||||
"2a03:b0c0:2:f0::bd92:6001/128",
|
||||
"2a03:b0c0:2:f0::bd92:7001/128",
|
||||
"2a03:b0c0:2:f0::bd92:8001/128",
|
||||
"2a03:b0c0:2:f0::bd92:9001/128",
|
||||
"2a03:b0c0:2:f0::bd92:a001/128",
|
||||
"2a03:b0c0:2:f0::bd92:b001/128",
|
||||
"2a03:b0c0:2:f0::bd92:c001/128",
|
||||
"2a03:b0c0:2:f0::bd92:e001/128",
|
||||
"2a03:b0c0:2:f0::bd92:f001/128",
|
||||
"2a05:d014:1815:3400:6d:9235:c1c0:96ad/128",
|
||||
"2a05:d014:1815:3400:654f:bd37:724c:212b/128",
|
||||
"2a05:d014:1815:3400:90b4:4ef9:5631:b170/128",
|
||||
"2a05:d014:1815:3400:9779:d8e9:100a:9642/128",
|
||||
"2a05:d014:1815:3400:af29:e95e:64ff:df81/128",
|
||||
"2a05:d014:1815:3400:c7d6:f7f3:6cc1:30d1/128",
|
||||
"2a05:d014:1815:3400:d784:e5dd:8e0:67cb/128",
|
||||
]
|
||||
@@ -48,6 +48,26 @@ func (m *Impl[K, V]) expire(key K) bool {
|
||||
return true
|
||||
}
|
||||
|
||||
// Delete a value from the DecayMap by key.
|
||||
//
|
||||
// If the value does not exist, return false. Return true after
|
||||
// deletion.
|
||||
func (m *Impl[K, V]) Delete(key K) bool {
|
||||
m.lock.RLock()
|
||||
_, ok := m.data[key]
|
||||
m.lock.RUnlock()
|
||||
|
||||
if !ok {
|
||||
return false
|
||||
}
|
||||
|
||||
m.lock.Lock()
|
||||
delete(m.data, key)
|
||||
m.lock.Unlock()
|
||||
|
||||
return true
|
||||
}
|
||||
|
||||
// Get gets a value from the DecayMap by key.
|
||||
//
|
||||
// If a value has expired, forcibly delete it if it was not updated.
|
||||
|
||||
@@ -19,5 +19,3 @@ npm-debug.log*
|
||||
yarn-debug.log*
|
||||
yarn-error.log*
|
||||
|
||||
# Kubernetes manifests
|
||||
/manifest
|
||||
@@ -5,6 +5,7 @@ COPY . .
|
||||
|
||||
RUN npm ci && npm run build
|
||||
|
||||
FROM docker.io/library/nginx:alpine
|
||||
COPY --from=build /app/build /usr/share/nginx/html
|
||||
FROM ghcr.io/xe/nginx-micro
|
||||
COPY --from=build /app/build /www
|
||||
COPY ./manifest/cfg/nginx/nginx.conf /conf
|
||||
LABEL org.opencontainers.image.source="https://github.com/TecharoHQ/anubis"
|
||||
14
docs/blog/2025-06-16-welcome/index.mdx
Normal file
14
docs/blog/2025-06-16-welcome/index.mdx
Normal file
@@ -0,0 +1,14 @@
|
||||
---
|
||||
slug: welcome
|
||||
title: Welcome to the Anubis blog!
|
||||
authors: [xe]
|
||||
tags: [intro]
|
||||
---
|
||||
|
||||
Hello, world!
|
||||
|
||||
At Techaro, we've been working on making Anubis even better, and in the process we want to share what we've done, how it works, and signal boost cool things the community has done. As things happen, we'll blog about them so that you can learn from our struggles.
|
||||
|
||||
More details to come soon!
|
||||
|
||||
{/* truncate */}
|
||||
248
docs/blog/2025-06-27-release-1.20.0/index.mdx
Normal file
248
docs/blog/2025-06-27-release-1.20.0/index.mdx
Normal file
@@ -0,0 +1,248 @@
|
||||
---
|
||||
slug: release/v1.20.0
|
||||
title: Anubis v1.20.0 is now available!
|
||||
authors: [xe]
|
||||
tags: [release]
|
||||
image: sunburst.webp
|
||||
---
|
||||
|
||||

|
||||
|
||||
Hey all!
|
||||
|
||||
Today we released [Anubis v1.20.0: Thancred Waters](https://github.com/TecharoHQ/anubis/releases/tag/v1.20.0). This adds a lot of new and exciting features to Anubis, including but not limited to the `WEIGH` action, custom weight thresholds, Imprint/impressum support, and a no-JS challenge. Here's what you need to know so you can protect your websites in new and exciting ways!
|
||||
|
||||
{/* truncate */}
|
||||
|
||||
## Sponsoring the product
|
||||
|
||||
If you rely on Anubis to keep your website safe, please consider sponsoring the project on [GitHub Sponsors](https://github.com/sponsors/Xe) or [Patreon](https://patreon.com/cadey). Funding helps pay hosting bills and offset the time spent on making this project the best it can be. Every little bit helps and when enough money is raised, [I can make Anubis my full-time job](https://github.com/TecharoHQ/anubis/discussions/278).
|
||||
|
||||
I am waiting to hear back from NLNet on if Anubis was selected for funding or not. Let's hope it is!
|
||||
|
||||
## Deprecation warning: `DEFAULT_DIFFICULTY`
|
||||
|
||||
Anubis v1.20.0 is the last version to support the `DEFAULT_DIFFICULTY` flag in the exact way it currently does. In future versions, this will be ineffectual and you should use the [custom threshold system](/docs/admin/configuration/thresholds) instead.
|
||||
|
||||
If this becomes an imposition in practice, this will be reverted.
|
||||
|
||||
## Chrome won't show "invalid response" after "Success!"
|
||||
|
||||
There were a bunch of smaller fixes in Anubis this time around, but the biggest one was finally squashing the ["invalid response" after "Success!" issue](https://github.com/TecharoHQ/anubis/issues/564) that had been plaguing Chrome users. This was a really annoying issue to track down but it was discovered while we were working on better end-to-end / functional testing: [Chrome randomizes the `Accept-Language` header](https://github.com/explainers-by-googlers/reduce-accept-language) so that websites can't do fingerprinting as easily.
|
||||
|
||||
When Anubis issues a challenge, it grabs [information that the browser sends to the user](/docs/design/how-anubis-works#challenge-format) to create a challenge string. Anubis doesn't store these challenge strings anywhere, and when a solution is being checked it calculates the challenge string from the request. This means that they'd get a challenge on one end, compute the response for that challenge, and then the server would validate that against a different challenge. This server-side validation would fail, leading to the user seeing "invalid response" after the client reported success.
|
||||
|
||||
I suspect this was why Vanadium and Cromite were having sporadic issues as well.
|
||||
|
||||
## New Features
|
||||
|
||||
The biggest feature in Anubis is the "weight" subsystem. This allows administrators to make custom rules that change the suspicion level of a request without having to take immediate action. As an example, consider the self-hostable git forge [Gitea](https://about.gitea.com/). When you load a page in Gitea, it creates a session cookie that your browser sends with every request. Weight allows you to mark a request that includes a Gitea session token as _less_ suspicious:
|
||||
|
||||
```yaml
|
||||
- name: gitea-session-token
|
||||
action: WEIGH
|
||||
expression:
|
||||
all:
|
||||
# Check if the request has a Cookie header
|
||||
- '"Cookie" in headers'
|
||||
# Check if the request's Cookie header contains the Gitea session token
|
||||
- headers["Cookie"].contains("i_love_gitea=")
|
||||
# Remove 5 weight points
|
||||
weight:
|
||||
adjust: -5
|
||||
```
|
||||
|
||||
This is different from the past where you could only allow every request with a Gitea session token, meaning that the invention of lying would allow malicious clients to bypass protection.
|
||||
|
||||
Weight is added and removed whenever a `WEIGH` rule is encountered. When all rules are processed and the request doesn't match any `ALLOW`, `CHALLENGE`, or `DENY` rules, Anubis uses [weight thresholds](/docs/admin/configuration/thresholds) to figure out how to handle that request. Thresholds are defined in the [policy file](/docs/admin/policies) alongside your bot rules:
|
||||
|
||||
```yaml
|
||||
thresholds:
|
||||
- name: minimal-suspicion # This client is likely fine, its soul is lighter than a feather
|
||||
expression: weight <= 0 # a feather weighs zero units
|
||||
action: ALLOW # Allow the traffic through
|
||||
# For clients that had some weight reduced through custom rules, give them a
|
||||
# lightweight challenge.
|
||||
- name: mild-suspicion
|
||||
expression:
|
||||
all:
|
||||
- weight > 0
|
||||
- weight < 10
|
||||
action: CHALLENGE
|
||||
challenge:
|
||||
# https://anubis.techaro.lol/docs/admin/configuration/challenges/metarefresh
|
||||
algorithm: metarefresh
|
||||
difficulty: 1
|
||||
report_as: 1
|
||||
# For clients that are browser-like but have either gained points from custom rules or
|
||||
# report as a standard browser.
|
||||
- name: moderate-suspicion
|
||||
expression:
|
||||
all:
|
||||
- weight >= 10
|
||||
- weight < 20
|
||||
action: CHALLENGE
|
||||
challenge:
|
||||
# https://anubis.techaro.lol/docs/admin/configuration/challenges/proof-of-work
|
||||
algorithm: fast
|
||||
difficulty: 2 # two leading zeros, very fast for most clients
|
||||
report_as: 2
|
||||
# For clients that are browser like and have gained many points from custom rules
|
||||
- name: extreme-suspicion
|
||||
expression: weight >= 20
|
||||
action: CHALLENGE
|
||||
challenge:
|
||||
# https://anubis.techaro.lol/docs/admin/configuration/challenges/proof-of-work
|
||||
algorithm: fast
|
||||
difficulty: 4
|
||||
report_as: 4
|
||||
```
|
||||
|
||||
:::note
|
||||
|
||||
If you don't have thresholds defined in your Anubis policy file, Anubis will default to the "legacy" behaviour where browser-like clients get a challenge at the default difficulty.
|
||||
|
||||
:::
|
||||
|
||||
This lets most clients through if they pass a simple [proof of work challenge](/docs/admin/configuration/challenges/proof-of-work), but any clients that are less suspicious (like ones with a Gitea session token) are given the lightweight [Meta Refresh](/docs/admin/configuration/challenges/metarefresh) challenge instead.
|
||||
|
||||
Threshold expressions are like [Bot rule expressions](/docs/admin/configuration/expressions), but there's only one input: the request's weight. If no thresholds match, the request is allowed through.
|
||||
|
||||
### Imprint/Impressum Support
|
||||
|
||||
European countries like Germany [require an imprint/impressum](https://www.ionos.com/digitalguide/websites/digital-law/a-case-for-thinking-global-germanys-impressum-laws/) to be present in the footer of their website. This allows users to contact someone on the team behind a website in case they run into issues. This also must generally have a separate page where users can view an extended imprint with other information like a privacy policy or a copyright notice.
|
||||
|
||||
Anubis v1.20.0 and later [has support for showing imprints](/docs/admin/configuration/impressum). You can configure two kinds of imprints:
|
||||
|
||||
1. An imprint that is shown in the footer of every Anubis page.
|
||||
2. An extended imprint / privacy policy that is shown when users click on the "Imprint" link. For example, [here's the imprint for the website you're looking at right now](https://anubis.techaro.lol/.within.website/x/cmd/anubis/api/imprint).
|
||||
|
||||
Imprints are configured in [the policy file](/docs/admin/policies/):
|
||||
|
||||
```yaml
|
||||
impressum:
|
||||
# Displayed at the bottom of every page rendered by Anubis.
|
||||
footer: >-
|
||||
This website is hosted by Zombocom. If you have any complaints or notes
|
||||
about the service, please contact
|
||||
<a href="mailto:contact@zombocom.example">contact@zombocom.example</a> and
|
||||
we will assist you as soon as possible.
|
||||
|
||||
# The imprint page that will be linked to at the footer of every Anubis page.
|
||||
page:
|
||||
# The HTML <title> of the page
|
||||
title: Imprint and Privacy Policy
|
||||
# The HTML contents of the page. The exact contents of this page can
|
||||
# and will vary by locale. Please consult with a lawyer if you are not
|
||||
# sure what to put here.
|
||||
body: >-
|
||||
<p>Last updated: June 2025</p>
|
||||
|
||||
<h2>Information that is gathered from visitors</h2>
|
||||
|
||||
<p>In common with other websites, log files are stored on the web server
|
||||
saving details such as the visitor's IP address, browser type, referring
|
||||
page and time of visit.</p>
|
||||
|
||||
<p>Cookies may be used to remember visitor preferences when interacting
|
||||
with the website.</p>
|
||||
|
||||
<p>Where registration is required, the visitor's email and a username
|
||||
will be stored on the server.</p>
|
||||
|
||||
<!-- ... -->
|
||||
```
|
||||
|
||||
If this is insufficient, please [file an issue](https://github.com/TecharoHQ/anubis/issues/new) with a link to the relevant legislation for your country so that this feature can be amended and improved.
|
||||
|
||||
### No-JS Challenge
|
||||
|
||||
One of the first issues in Anubis before it was moved to the [TecharoHQ org](https://github.com/TecharoHQ) was a request [to support challenging browsers without using JavaScript](https://github.com/Xe/x/issues/651). This is a pretty challenging thing to do without rethinking how Anubis works from a fundamentally low level, and with v1.20.0, [Anubis finally has support for running without client-side JavaScript](https://github.com/TecharoHQ/anubis/issues/95) thanks to the [Meta Refresh](/docs/admin/configuration/challenges/metarefresh) challenge.
|
||||
|
||||
When Anubis decides it needs to send a challenge to your browser, it sends a challenge page. Historically, this challenge page is [an HTML template](https://github.com/TecharoHQ/anubis/blob/main/web/index.templ) that kicks off some JavaScript, reads the challenge information out of the page body, and then solves it as fast as possible in order to let users see the website they want to visit.
|
||||
|
||||
In v1.20.0, Anubis has a challenge registry to hold [different client challenge implementations](/docs/category/challenges). This allows us to implement anything we want as long as it can render a page to show a challenge and then check if the result is correct. This is going to be used to implement a WebAssembly-based proof of work option (one that will be way more efficient than the existing browser JS version), but as a proof of concept I implemented a simple challenge using [HTML `<meta refresh>`](https://en.wikipedia.org/wiki/Meta_refresh).
|
||||
|
||||
In my testing, this has worked with every browser I have thrown it at (including CLI browsers, the browser embedded in emacs, etc.). The default configuration of Anubis does use the [meta refresh challenge](/docs/admin/configuration/challenges/metarefresh) for [clients with a very low suspicion](/docs/admin/configuration/thresholds), but by default clients will be sent an [easy proof of work challenge](/docs/admin/configuration/challenges/proof-of-work).
|
||||
|
||||
If the false positive rate of this challenge turns out to not be very high in practice, the meta refresh challenge will be enabled by default for browsers in future versions of Anubis.
|
||||
|
||||
### `robots2policy`
|
||||
|
||||
Anubis was created because crawler bots don't respect [`robots.txt` files](https://www.robotstxt.org/). Administrators have been working on refining and crafting their `robots.txt` files for years, and one common comment is that people don't know where to start crafting their own rules.
|
||||
|
||||
Anubis now ships with a [`robots2policy` tool](/docs/admin/robots2policy) that lets you convert your `robots.txt` file to an Anubis policy.
|
||||
|
||||
```text
|
||||
robots2policy -input https://github.com/robots.txt
|
||||
```
|
||||
|
||||
:::note
|
||||
|
||||
If you installed Anubis from [an OS package](/docs/admin/native-install), you may need to run `anubis-robots2policy` instead of `robots2policy`.
|
||||
|
||||
:::
|
||||
|
||||
We hope that this will help you get started with Anubis faster. We are working on a version of this that will run in the documentation via WebAssembly.
|
||||
|
||||
### Open Graph configuration is being moved to the policy file
|
||||
|
||||
Anubis supports reading [Open Graph tags](/docs/admin/configuration/open-graph) from target services and returning them in challenge pages. This makes the right metadata show up when linking services protected by Anubis in chat applications or on social media.
|
||||
|
||||
In order to test the migration of all of the configuration to the policy file, Open Graph configuration has been moved to the policy file. For more information, please read [the Open Graph configuration options](/docs/admin/configuration/open-graph#configuration-options).
|
||||
|
||||
You can also set default Open Graph tags:
|
||||
|
||||
```yaml
|
||||
openGraph:
|
||||
enabled: true
|
||||
ttl: 24h
|
||||
# If set, return these opengraph values instead of looking them up with
|
||||
# the target service.
|
||||
#
|
||||
# Correlates to properties in https://ogp.me/
|
||||
override:
|
||||
# og:title is required, it is the title of the website
|
||||
"og:title": "Techaro Anubis"
|
||||
"og:description": >-
|
||||
Anubis is a Web AI Firewall Utility that helps you fight the bots
|
||||
away so that you can maintain uptime at work!
|
||||
"description": >-
|
||||
Anubis is a Web AI Firewall Utility that helps you fight the bots
|
||||
away so that you can maintain uptime at work!
|
||||
```
|
||||
|
||||
## Improvements and optimizations
|
||||
|
||||
One of the biggest improvements we've made in v1.20.0 is replacing [SHA-256 with xxhash](https://github.com/TecharoHQ/anubis/pull/676). Anubis uses hashes all over the place to help with identifying clients, matching against rules when allowing traffic through, in error messages sent to users, and more. Historically these have been done with [SHA-256](https://en.wikipedia.org/wiki/SHA-2), however this has been having a mild performance impact in real-world use. As a result, we now use [xxhash](https://xxhash.com/) when possible. This makes policy matching 3x faster in some scenarios and reduces memory usage across the board.
|
||||
|
||||
Anubis now uses [bart](https://pkg.go.dev/github.com/gaissmai/bart) for doing IP address matching when you specify addresses in a `remote_address` check configuration or when you are matching against [advanced checks](/docs/admin/thoth). This uses the same kind of IP address routing configuration that your OS kernel does, making it very fast to query information about IP addresses. This makes IP address range matches anywhere from 3-14 times faster depending on the number of addresses it needs to match against. For more information and benchmarks, check out [@JasonLovesDoggo](https://github.com/JasonLovesDoggo)'s PR: [perf: replace cidranger with bart for significant performance improvements #675](https://github.com/TecharoHQ/anubis/pull/675).
|
||||
|
||||
## What's up next?
|
||||
|
||||
v1.21.0 is already shaping up to be a massive improvement as Anubis adds [internationalization](https://en.wikipedia.org/wiki/Internationalization) support, allowing your users to see its messages in the language they're most comfortable with.
|
||||
|
||||
So far Anubis supports the following languages:
|
||||
|
||||
- English (Simplified and Traditional)
|
||||
- French
|
||||
- Portugese (Brazil)
|
||||
- Spanish
|
||||
|
||||
If you want to contribute translations, please [file an issue](https://github.com/TecharoHQ/anubis/issues/new) with your language of choice or submit a pull request to [the `lib/localization/locales` folder](https://github.com/TecharoHQ/anubis/tree/main/lib/localization/locales). We are about to introduce features to the translation stack, so you may want to hold off a hot minute, but we welcome any and all contributions to making Anubis useful to a global audience.
|
||||
|
||||
Other things we plan to do:
|
||||
|
||||
- Move configuration to the policy file
|
||||
- Support reloading the policy file at runtime without having to restart Anubis
|
||||
- Detecting if a client is "brand new"
|
||||
- A [Valkey](https://valkey.io/)-backed store for sharing information between instances of Anubis
|
||||
- Augmenting No-JS support in the paid product
|
||||
- TLS fingerprinting
|
||||
- Automated testing improvements in CI (FreeBSD CI support, better automated integration/functional testing, etc.)
|
||||
|
||||
## Conclusion
|
||||
|
||||
I hope that these features let you get the same Anubis power you've come to know and love and increases the things you can do with it! I've been really excited to ship [thresholds](/docs/admin/configuration/thresholds) and the cloud-based services for Anubis.
|
||||
|
||||
If you run into any problems, please [file an issue](https://github.com/TecharoHQ/anubis/issues/new). Otherwise, have a good day and get back to making your communities great.
|
||||
BIN
docs/blog/2025-06-27-release-1.20.0/sunburst.webp
Normal file
BIN
docs/blog/2025-06-27-release-1.20.0/sunburst.webp
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 9.2 KiB |
105
docs/blog/2025-07-09-incident-report/index.mdx
Normal file
105
docs/blog/2025-07-09-incident-report/index.mdx
Normal file
@@ -0,0 +1,105 @@
|
||||
---
|
||||
slug: incident/TI-20250709-0001
|
||||
title: "TI-20250709-0001: IPv4 traffic failures for Techaro services"
|
||||
authors: [xe]
|
||||
tags: [incident]
|
||||
image: ./window-portal.jpg
|
||||
---
|
||||
|
||||

|
||||
|
||||
Techaro services were down for IPv4 traffic on July 9th, 2025. This blogpost is a report of what happened, what actions were taken to resolve the situation, and what actions are being done in the near future to prevent this problem. Enjoy this incident report!
|
||||
|
||||
{/* truncate */}
|
||||
|
||||
:::note
|
||||
|
||||
In other companies, this kind of documentation would be kept internal. At Techaro, we believe that you deserve radical candor and the truth. As such, we are proving our lofty words with actions by publishing details about how things go wrong publicly.
|
||||
|
||||
Everything past this point follows my standard incident root cause meeting template.
|
||||
|
||||
:::
|
||||
|
||||
This incident report will focus on the services affected, timeline of what happened at which stage of the incident, where we got lucky, the root cause analysis, and what action items are being planned or taken to prevent this from happening in the future.
|
||||
|
||||
## Timeline
|
||||
|
||||
All events take place on July 9th, 2025.
|
||||
|
||||
| Time (UTC) | Description |
|
||||
| :--------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||
| 12:32 | Uptime Kuma reports that another unrelated website on the same cluster was timing out. |
|
||||
| 12:33 | Uptime Kuma reports that Thoth's production endpoint is failing gRPC health checks. |
|
||||
| 12:35 | Investigation begins, [announcement made on Xe's Bluesky](https://bsky.app/profile/xeiaso.net/post/3ltjtdczpwc2x) due to the impact including their personal blog. |
|
||||
| 12:39 | `nginx-ingress` logs on the production cluster show IPv6 traffic but an abrupt cutoff in IPv4 traffic around 12:32 UTC. Ticket is opened with the hosting provider. |
|
||||
| 12:41 | IPv4 traffic resumes long enough for Uptime Kuma to report uptime, but then immediately fails again. |
|
||||
| 12:46 | IPv4 traffic resumes long enough for Uptime Kuma to report uptime, but then immediately fails again. (repeat instances of this have been scrubbed, but it happened about every 5-10 minutes) |
|
||||
| 12:48 | First reply from the hosting provider. |
|
||||
| 12:57 | Reply to hosting provider, ask to reboot the load balancer. |
|
||||
| 13:00 | Incident responder because busy due to a meeting under the belief that the downtime was out of their control and that uptime monitoring software would let them know if it came back up. |
|
||||
| 13:20 | Incident responder ended meeting and went back to monitoring downtime and preparing this document. |
|
||||
| 13:34 | IPv4 traffic starts to show up in the `ingress-nginx` logs. |
|
||||
| 13:35 | All services start to report healthy. Incident status changes to monitoring. |
|
||||
| 13:48 | Incident closed. |
|
||||
| 14:07 | Incident re-opened. Issues seem to be manifesting as BGP issues in the upstream provider. |
|
||||
| 14:10 | IPv4 traffic resumes and then stops. |
|
||||
| 14:18 | IPv4 traffic resumes again. Incident status changes to monitoring. |
|
||||
| 14:40 | Incident closed. |
|
||||
|
||||
## Services affected
|
||||
|
||||
| Service name | User impact |
|
||||
| :-------------------------------------------------- | :----------------- |
|
||||
| [Anubis Docs](https://anubis.techaro.lol) (IPv4) | Connection timeout |
|
||||
| [Anubis Docs](https://anubis.techaro.lol) (IPv6) | None |
|
||||
| [Thoth](/docs/admin/thoth/) (IPv4) | Connection timeout |
|
||||
| [Thoth](/docs/admin/thoth/) (IPv6) | None |
|
||||
| Other websites colocated on the same cluster (IPv4) | Connection timeout |
|
||||
| Other websites colocated on the same cluster (IPv6) | None |
|
||||
|
||||
## Root cause analysis
|
||||
|
||||
In simplify server management, Techaro runs a [Kubernetes](https://kubernetes.io/) cluster on [Vultr VKE](https://www.vultr.com/kubernetes/) (Vultr Kubernetes Engine). When you do this, Vultr needs to provision a [load balancer](https://docs.vultr.com/how-to-use-a-vultr-load-balancer-with-vke) to bridge the gap between the outside world and the Kubernetes world, kinda like this:
|
||||
|
||||
```mermaid
|
||||
---
|
||||
title: Overall architecture
|
||||
---
|
||||
|
||||
flowchart LR
|
||||
UT(User Traffic)
|
||||
subgraph Provider Infrastructure
|
||||
LB[Load Balancer]
|
||||
end
|
||||
subgraph Kubernetes
|
||||
IN(ingress-nginx)
|
||||
TH(Thoth)
|
||||
AN(Anubis Docs)
|
||||
OS(Other sites)
|
||||
|
||||
IN --> TH
|
||||
IN --> AN
|
||||
IN --> OS
|
||||
end
|
||||
|
||||
UT --> LB --> IN
|
||||
```
|
||||
|
||||
Techaro controls everything inside the Kubernetes side of that diagram. Anything else is out of our control. That load balancer is routed to the public internet via [Border Gateway Protocol (BGP)](https://en.wikipedia.org/wiki/Border_Gateway_Protocol).
|
||||
|
||||
If there is an interruption with the BGP sessions in the upstream provider, this can manifest as things either not working or inconsistently working. This is made more difficult by the fact that the IPv4 and IPv6 internets are technically separate networks. With this in mind, it's very possible to have IPv4 traffic fail but not IPv6 traffic.
|
||||
|
||||
The root cause is that the hosting provider we use for production services had flapping IPv4 BGP sessions in its Toronto region. When this happens all we can do is open a ticket and wait for it to come back up.
|
||||
|
||||
## Where we got lucky
|
||||
|
||||
The Uptime Kuma instance that caught this incident runs on an IPv4-only network. If it was dual stack, this would not have been caught as quickly.
|
||||
|
||||
The `ingress-nginx` logs print IP addresses of remote clients to the log feed. If this was not the case, it would be much more difficult to find this error.
|
||||
|
||||
## Action items
|
||||
|
||||
- A single instance of downtime like this is not enough reason to move providers. Moving providers because of this is thus out of scope.
|
||||
- Techaro needs a status page hosted on a different cloud provider than is used for the production cluster (`TecharoHQ/TODO#6`).
|
||||
- Health checks for IPv4 and IPv6 traffic need to be created (`TecharoHQ/TODO#7`).
|
||||
- Remove the requirement for [Anubis to pass Thoth health checks before it can start if Thoth is enabled](https://github.com/TecharoHQ/anubis/pull/794).
|
||||
BIN
docs/blog/2025-07-09-incident-report/window-portal.jpg
Normal file
BIN
docs/blog/2025-07-09-incident-report/window-portal.jpg
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 30 KiB |
9
docs/blog/authors.yml
Normal file
9
docs/blog/authors.yml
Normal file
@@ -0,0 +1,9 @@
|
||||
xe:
|
||||
name: Xe Iaso
|
||||
title: CEO @ Techaro
|
||||
url: https://github.com/Xe
|
||||
image_url: https://github.com/Xe.png
|
||||
email: xe@techaro.lol
|
||||
page: true
|
||||
socials:
|
||||
github: Xe
|
||||
@@ -11,15 +11,283 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
|
||||
|
||||
## [Unreleased]
|
||||
|
||||
<!-- This changes the project to: -->
|
||||
|
||||
- Expired records are now properly removed from bbolt databases ([#848](https://github.com/TecharoHQ/anubis/pull/848)).
|
||||
- Fix hanging on service restart ([#853](https://github.com/TecharoHQ/anubis/issues/853))
|
||||
|
||||
### Added
|
||||
|
||||
Anubis now supports these new languages:
|
||||
|
||||
- [Czech](https://github.com/TecharoHQ/anubis/pull/849)
|
||||
|
||||
Anubis now supports the [`missingHeader`](./admin/configuration/expressions.mdx#missingHeader) to assert the absence of headers in requests.
|
||||
|
||||
### Fixes
|
||||
|
||||
#### Fix potential memory leak when discovering a solution
|
||||
|
||||
In some cases, the parallel solution finder in Anubis could cause all of the worker promises to leak due to the fact the promises were being improperly terminated. This was fixed by having Anubis debounce worker termination instead of allowing it to potentially recurse infinitely.
|
||||
|
||||
## v1.21.0: Minfilia Warde
|
||||
|
||||
> Please, be at ease. You are among friends here.
|
||||
|
||||
In this release, Anubis becomes internationalized, gains the ability to use system load as input to issuing challenges, finally fixes the "invalid response" after "success" bug, and more! Please read these notes before upgrading as the changes are big enough that administrators should take action to ensure that the upgrade goes smoothly.
|
||||
|
||||
### Big ticket changes
|
||||
|
||||
The biggest change is that the ["invalid response" after "success" bug](https://github.com/TecharoHQ/anubis/issues/564) is now finally fixed for good by totally rewriting how Anubis' challenge issuance flow works. Instead of generating challenge strings from request metadata (under the assumption that the values being compared against are stable), Anubis now generates random data for each challenge. This data is stored in the active [storage backend](./admin/policies.mdx#storage-backends) for up to 30 minutes. This also fixes [#746](https://github.com/TecharoHQ/anubis/issues/746) and other similar instances of this issue.
|
||||
|
||||
In order to reduce confusion, the "Success" interstitial that shows up when you pass a proof of work challenge has been removed.
|
||||
|
||||
#### Storage
|
||||
|
||||
Anubis now is able to store things persistently [in memory](./admin/policies.mdx#memory), [on the disk](./admin/policies.mdx#bbolt), or [in Valkey](./admin/policies.mdx#valkey) (this includes other compatible software). By default Anubis uses the in-memory backend. If you have an environment with mutable storage (even if it is temporary), be sure to configure the [`bbolt`](./admin/policies.mdx#bbolt) storage backend.
|
||||
|
||||
#### Localization
|
||||
|
||||
Anubis now supports localized responses. Locales can be added in [lib/localization/locales/](https://github.com/TecharoHQ/anubis/tree/main/lib/localization/locales). This release includes support for the following languages:
|
||||
|
||||
- [Brazilian Portugese](https://github.com/TecharoHQ/anubis/pull/726)
|
||||
- [Chinese (Simplified)](https://github.com/TecharoHQ/anubis/pull/774)
|
||||
- [Chinese (Traditional)](https://github.com/TecharoHQ/anubis/pull/759)
|
||||
- English
|
||||
- [Estonian](https://github.com/TecharoHQ/anubis/pull/783)
|
||||
- [Filipino](https://github.com/TecharoHQ/anubis/pull/775)
|
||||
- [French](https://github.com/TecharoHQ/anubis/pull/716)
|
||||
- [German](https://github.com/TecharoHQ/anubis/pull/741)
|
||||
- [Icelandic](https://github.com/TecharoHQ/anubis/pull/780)
|
||||
- [Italian](https://github.com/TecharoHQ/anubis/pull/778)
|
||||
- [Japanese](https://github.com/TecharoHQ/anubis/pull/772)
|
||||
- [Spanish](https://github.com/TecharoHQ/anubis/pull/716)
|
||||
- [Turkish](https://github.com/TecharoHQ/anubis/pull/751)
|
||||
|
||||
If facts or local regulations demand, you can set Anubis default language with the `FORCED_LANGUAGE` environment variable or the `--forced-language` command line argument:
|
||||
|
||||
```sh
|
||||
FORCED_LANGUAGE=de
|
||||
```
|
||||
|
||||
#### Load average
|
||||
|
||||
Anubis can dynamically take action [based on the system load average](./admin/configuration/expressions.mdx#using-the-system-load-average), allowing you to write rules like this:
|
||||
|
||||
```yaml
|
||||
## System load based checks.
|
||||
# If the system is under high load for the last minute, add weight.
|
||||
- name: high-load-average
|
||||
action: WEIGH
|
||||
expression: load_1m >= 10.0 # make sure to end the load comparison in a .0
|
||||
weight:
|
||||
adjust: 20
|
||||
|
||||
# If it is not for the last 15 minutes, remove weight.
|
||||
- name: low-load-average
|
||||
action: WEIGH
|
||||
expression: load_15m <= 4.0 # make sure to end the load comparison in a .0
|
||||
weight:
|
||||
adjust: -10
|
||||
```
|
||||
|
||||
Something to keep in mind about system load average is that it is not aware of the number of cores the system has. If you have a 16 core system that has 16 processes running but none of them is hogging the CPU, then you will get a load average below 16. If you are in doubt, make your "high load" metric at least two times the number of CPU cores and your "low load" metric at least half of the number of CPU cores. For example:
|
||||
|
||||
| Kind | Core count | Load threshold |
|
||||
| --------: | :--------- | :------------- |
|
||||
| high load | 4 | `8.0` |
|
||||
| low load | 4 | `2.0` |
|
||||
| high load | 16 | `32.0` |
|
||||
| low load | 16 | `8` |
|
||||
|
||||
Also keep in mind that this does not account for other kinds of latency like I/O latency. A system can have its web applications unresponsive due to high latency from a MySQL server but still have that web application server report a load near or at zero.
|
||||
|
||||
### Other features and fixes
|
||||
|
||||
There are a bunch of other assorted features and fixes too:
|
||||
|
||||
- Add `COOKIE_SECURE` option to set the cookie [Secure flag](https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/Cookies#block_access_to_your_cookies)
|
||||
- Sets cookie defaults to use [SameSite: None](https://web.dev/articles/samesite-cookies-explained)
|
||||
- Determine the `BIND_NETWORK`/`--bind-network` value from the bind address ([#677](https://github.com/TecharoHQ/anubis/issues/677)).
|
||||
- Implement a [development container](https://containers.dev/) manifest to make contributions easier.
|
||||
- Fix dynamic cookie domains functionality ([#731](https://github.com/TecharoHQ/anubis/pull/731))
|
||||
- Add option for custom cookie prefix ([#732](https://github.com/TecharoHQ/anubis/pull/732))
|
||||
- Make the [Open Graph](./admin/configuration/open-graph.mdx) subsystem and DNSBL subsystem use [storage backends](./admin/policies.mdx#storage-backends) instead of storing everything in memory by default.
|
||||
- Allow [Common Crawl](https://commoncrawl.org/) by default so scrapers have less incentive to scrape
|
||||
- The [bbolt storage backend](./admin/policies.mdx#bbolt) now runs its cleanup every hour instead of every five minutes.
|
||||
- Don't block Anubis starting up if [Thoth](./admin/thoth.mdx) health checks fail.
|
||||
- A race condition involving [opening two challenge pages at once in different tabs](https://github.com/TecharoHQ/anubis/issues/832) causing one of them to fail has been fixed.
|
||||
- The "Try again" button on the error page has been fixed. Previously it meant "try the solution again" instead of "try the challenge again".
|
||||
- In certain cases, a user could be stuck with a test cookie that is invalid, locking them out of the service for up to half an hour. This has been fixed with better validation of this case and clearing the cookie.
|
||||
- Start exposing JA4H fingerprints for later use in CEL expressions.
|
||||
- Add `/healthz` route for use in platform-based health checks.
|
||||
|
||||
### Potentially breaking changes
|
||||
|
||||
We try to introduce breaking changes as much as possible, but these are the changes that may be relevant for you as an administrator:
|
||||
|
||||
#### Challenge format change
|
||||
|
||||
Previously Anubis did no accounting for challenges that it issued. This means that if Anubis restarted during a client, the client would be able to proceed once Anubis came back online.
|
||||
|
||||
During the upgrade to v1.21.0 and when v1.21.0 (or later) restarts with the [in-memory storage backend](./admin/policies.mdx#memory), you may see a higher rate of failed challenges than normal. If this persists beyond a few minutes, [open an issue](https://github.com/TecharoHQ/anubis/issues/new).
|
||||
|
||||
If you are using the in-memory storage backend, please consider using [a different storage backend](./admin/policies.mdx#storage-backends).
|
||||
|
||||
#### Systemd service changes
|
||||
|
||||
The following potentially breaking change applies to native installs with systemd only:
|
||||
|
||||
Each instance of systemd service template now has a unique `RuntimeDirectory`, as opposed to each instance of the service sharing a `RuntimeDirectory`. This change was made to avoid [the `RuntimeDirectory` getting nuked any time one of the Anubis instances restarts](https://github.com/TecharoHQ/anubis/issues/748).
|
||||
|
||||
If you configured Anubis' unix sockets to listen on `/run/anubis/foo.sock` for instance `anubis@foo`, you will need to configure Anubis to listen on `/run/anubis/foo/foo.sock` and additionally configure your HTTP load balancer as appropriate.
|
||||
|
||||
If you need the legacy behaviour, install this [systemd unit dropin](https://www.flatcar.org/docs/latest/setup/systemd/drop-in-units/):
|
||||
|
||||
```systemd
|
||||
# /etc/systemd/system/anubis@.service.d/50-runtimedir.conf
|
||||
[Service]
|
||||
RuntimeDirectory=anubis
|
||||
```
|
||||
|
||||
Just keep in mind that this will cause problems when Anubis restarts.
|
||||
|
||||
## v1.20.0: Thancred Waters
|
||||
|
||||
The big ticket items are as follows:
|
||||
|
||||
- Implement a no-JS challenge method: [`metarefresh`](./admin/configuration/challenges/metarefresh.mdx) ([#95](https://github.com/TecharoHQ/anubis/issues/95))
|
||||
- Implement request "weight", allowing administrators to customize the behaviour of Anubis based on specific criteria
|
||||
- Implement GeoIP and ASN based checks via [Thoth](https://anubis.techaro.lol/docs/admin/thoth) ([#206](https://github.com/TecharoHQ/anubis/issues/206))
|
||||
- Add [custom weight thresholds](./admin/configuration/thresholds.mdx) via CEL ([#688](https://github.com/TecharoHQ/anubis/pull/688))
|
||||
- Move Open Graph configuration [to the policy file](./admin/configuration/open-graph.mdx)
|
||||
- Enable support for Open Graph metadata to be returned by default instead of doing lookups against the target
|
||||
- Add `robots2policy` CLI utility to convert robots.txt files to Anubis challenge policies using CEL expressions ([#409](https://github.com/TecharoHQ/anubis/issues/409))
|
||||
- Refactor challenge presentation logic to use a challenge registry
|
||||
- Allow challenge implementations to register HTTP routes
|
||||
- [Imprint/Impressum support](./admin/configuration/impressum.mdx) ([#362](https://github.com/TecharoHQ/anubis/issues/362))
|
||||
- Fix "invalid response" after "Success!" in Chromium ([#564](https://github.com/TecharoHQ/anubis/issues/564))
|
||||
|
||||
A lot of performance improvements have been made:
|
||||
|
||||
- Replace internal SHA256 hashing with xxhash for 4-6x performance improvement in policy evaluation and cache operations
|
||||
- Optimized the OGTags subsystem with reduced allocations and runtime per request by up to 66%
|
||||
- Replace cidranger with bart for IP range checking, improving IP matching performance by 3-20x with zero heap
|
||||
allocations
|
||||
|
||||
And some cleanups/refactors were added:
|
||||
|
||||
- Fix OpenGraph passthrough ([#717](https://github.com/TecharoHQ/anubis/issues/717))
|
||||
- Remove the unused `/test-error` endpoint and update the testing endpoint `/make-challenge` to only be enabled in
|
||||
development
|
||||
- Add `--xff-strip-private` flag/envvar to toggle skipping X-Forwarded-For private addresses or not
|
||||
- Requests can have their weight be adjusted, if a request weighs zero or less than it is allowed through
|
||||
- Refactor challenge presentation logic to use a challenge registry
|
||||
- Allow challenge implementations to register HTTP routes
|
||||
- Implement a no-JS challenge method: [`metarefresh`](./admin/configuration/challenges/metarefresh.mdx) ([#95](https://github.com/TecharoHQ/anubis/issues/95))
|
||||
- Bump AI-robots.txt to version 1.34
|
||||
- Bump AI-robots.txt to version 1.37
|
||||
- Make progress bar styling more compatible (UXP, etc)
|
||||
- Add `--strip-base-prefix` flag/envvar to strip the base prefix from request paths when forwarding to target servers
|
||||
- Fix an off-by-one in the default threshold config
|
||||
- Add functionality for HS512 JWT algorithm
|
||||
- Add support for dynamic cookie domains with the `--cookie-dynamic-domain`/`COOKIE_DYNAMIC_DOMAIN` flag/envvar
|
||||
|
||||
Request weight is one of the biggest ticket features in Anubis. This enables Anubis to be much closer to a Web Application Firewall and when combined with custom thresholds allows administrators to have Anubis take advanced reactions. For more information about request weight, see [the request weight section](./admin/policies.mdx#request-weight) of the policy file documentation.
|
||||
|
||||
TL;DR when you have one or more WEIGHT rules like this:
|
||||
|
||||
```yaml
|
||||
bots:
|
||||
- name: gitea-session-token
|
||||
action: WEIGH
|
||||
expression:
|
||||
all:
|
||||
- '"Cookie" in headers'
|
||||
- headers["Cookie"].contains("i_love_gitea=")
|
||||
# Remove 5 weight points
|
||||
weight:
|
||||
adjust: -5
|
||||
```
|
||||
|
||||
You can configure custom thresholds like this:
|
||||
|
||||
```yaml
|
||||
thresholds:
|
||||
- name: minimal-suspicion # This client is likely fine, its soul is lighter than a feather
|
||||
expression: weight < 0 # a feather weighs zero units
|
||||
action: ALLOW # Allow the traffic through
|
||||
|
||||
# For clients that had some weight reduced through custom rules, give them a
|
||||
# lightweight challenge.
|
||||
- name: mild-suspicion
|
||||
expression:
|
||||
all:
|
||||
- weight >= 0
|
||||
- weight < 10
|
||||
action: CHALLENGE
|
||||
challenge:
|
||||
# https://anubis.techaro.lol/docs/admin/configuration/challenges/metarefresh
|
||||
algorithm: metarefresh
|
||||
difficulty: 1
|
||||
report_as: 1
|
||||
|
||||
# For clients that are browser-like but have either gained points from custom
|
||||
# rules or report as a standard browser.
|
||||
- name: moderate-suspicion
|
||||
expression:
|
||||
all:
|
||||
- weight >= 10
|
||||
- weight < 20
|
||||
action: CHALLENGE
|
||||
challenge:
|
||||
# https://anubis.techaro.lol/docs/admin/configuration/challenges/proof-of-work
|
||||
algorithm: fast
|
||||
difficulty: 2 # two leading zeros, very fast for most clients
|
||||
report_as: 2
|
||||
|
||||
# For clients that are browser like and have gained many points from custom
|
||||
# rules
|
||||
- name: extreme-suspicion
|
||||
expression: weight >= 20
|
||||
action: CHALLENGE
|
||||
challenge:
|
||||
# https://anubis.techaro.lol/docs/admin/configuration/challenges/proof-of-work
|
||||
algorithm: fast
|
||||
difficulty: 4
|
||||
report_as: 4
|
||||
```
|
||||
|
||||
These thresholds apply when no other `ALLOW`, `DENY`, or `CHALLENGE` rule matches the request. `WEIGHT` rules add and remove request weight as needed:
|
||||
|
||||
```yaml
|
||||
bots:
|
||||
- name: gitea-session-token
|
||||
action: WEIGH
|
||||
expression:
|
||||
all:
|
||||
- '"Cookie" in headers'
|
||||
- headers["Cookie"].contains("i_love_gitea=")
|
||||
# Remove 5 weight points
|
||||
weight:
|
||||
adjust: -5
|
||||
|
||||
- name: bot-like-user-agent
|
||||
action: WEIGH
|
||||
expression: '"Bot" in userAgent'
|
||||
# Add 5 weight points
|
||||
weight:
|
||||
adjust: 5
|
||||
```
|
||||
|
||||
Of note: the default "generic browser" rule assigns 10 weight points:
|
||||
|
||||
```yaml
|
||||
# Generic catchall rule
|
||||
- name: generic-browser
|
||||
user_agent_regex: >-
|
||||
Mozilla|Opera
|
||||
action: WEIGH
|
||||
weight:
|
||||
adjust: 10
|
||||
```
|
||||
|
||||
Adjust this as you see fit.
|
||||
|
||||
## v1.19.1: Jenomis cen Lexentale - Echo 1
|
||||
|
||||
@@ -155,7 +423,6 @@ Other changes:
|
||||
- Moved all CSS inline to the Xess package, changed colors to be CSS variables
|
||||
- Set or append to `X-Forwarded-For` header unless the remote connects over a loopback address [#328](https://github.com/TecharoHQ/anubis/issues/328)
|
||||
- Fixed mojeekbot user agent regex
|
||||
- Added support for running anubis behind a base path (e.g. `/myapp`)
|
||||
- Reduce Anubis' paranoia with user cookies ([#365](https://github.com/TecharoHQ/anubis/pull/365))
|
||||
- Added support for Open Graph passthrough while using unix sockets
|
||||
- The Open Graph subsystem now passes the HTTP `HOST` header through to the origin
|
||||
|
||||
215
docs/docs/admin/botstopper.mdx
Normal file
215
docs/docs/admin/botstopper.mdx
Normal file
@@ -0,0 +1,215 @@
|
||||
---
|
||||
title: "Commercial support and an unbranded version"
|
||||
---
|
||||
|
||||
If you want to use Anubis but organizational policies prevent you from using the branding that the open source project ships, we offer a commercial version of Anubis named BotStopper. BotStopper builds off of the open source core of Anubis and offers organizations more control over the branding, including but not limited to:
|
||||
|
||||
- Custom images for different states of the challenge process (in process, success, failure)
|
||||
- Custom CSS and fonts
|
||||
- Custom titles for the challenge and error pages
|
||||
- "Anubis" replaced with "BotStopper" across the UI
|
||||
- A private bug tracker for issues
|
||||
|
||||
In the near future this will expand to:
|
||||
|
||||
- A private challenge implementation that does advanced fingerprinting to check if the client is a genuine browser or not
|
||||
- Advanced fingerprinting via [Thoth-based advanced checks](./thoth.mdx)
|
||||
|
||||
In order to sign up for BotStopper, please do one of the following:
|
||||
|
||||
- Sign up [on GitHub Sponsors](https://github.com/sponsors/Xe) at the $50 per month tier or higher
|
||||
- Email [sales@techaro.lol](mailto:sales@techaro.lol) with your requirements for invoicing, please note that custom invoicing will cost more than using GitHub Sponsors for understandable overhead reasons
|
||||
|
||||
## Installation
|
||||
|
||||
Install BotStopper like you would Anubis, but replace the image reference. EG:
|
||||
|
||||
```diff
|
||||
-ghcr.io/techarohq/anubis:latest
|
||||
+ghcr.io/techarohq/botstopper/anubis:latest
|
||||
```
|
||||
|
||||
### Binary packages
|
||||
|
||||
Binary packages are available [in the GitHub Releases page](https://github.com/TecharoHQ/botstopper/releases), the main difference is that the package name is `techaro-botstopper`, the systemd service is `techaro-botstopper@your-instance.service`, the binary is `/usr/bin/botstopper`, and the configuration is in `/etc/techaro-botstopper`. All other instructions in the [native package install guide](./native-install.mdx) apply.
|
||||
|
||||
### Docker / Podman
|
||||
|
||||
In order to pull the BotStopper image, you need to [authenticate with GitHub's Container Registry](https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry#authenticating-to-the-container-registry).
|
||||
|
||||
```text
|
||||
docker login ghcr.io -u your-username --password-stdin
|
||||
```
|
||||
|
||||
Then you can use the image as normal.
|
||||
|
||||
### Kubernetes
|
||||
|
||||
If you are using Kubernetes, you will need to create an image pull secret:
|
||||
|
||||
```text
|
||||
kubectl create secret docker-registry \
|
||||
techarohq-botstopper \
|
||||
--docker-server ghcr.io \
|
||||
--docker-username your-username \
|
||||
--docker-password your-access-token \
|
||||
--docker-email your@email.address
|
||||
```
|
||||
|
||||
Then attach it to your Deployment:
|
||||
|
||||
```diff
|
||||
spec:
|
||||
securityContext:
|
||||
fsGroup: 1000
|
||||
+ imagePullSecrets:
|
||||
+ - name: techarohq-botstopper
|
||||
```
|
||||
|
||||
## Configuration
|
||||
|
||||
### Docker compose
|
||||
|
||||
Follow [the upstream Docker compose directions](https://anubis.techaro.lol/docs/admin/environments/docker-compose) with the following additional options:
|
||||
|
||||
```diff
|
||||
anubis:
|
||||
image: ghcr.io/techarohq/botstopper/anubis:latest
|
||||
environment:
|
||||
BIND: ":8080"
|
||||
DIFFICULTY: "4"
|
||||
METRICS_BIND: ":9090"
|
||||
SERVE_ROBOTS_TXT: "true"
|
||||
TARGET: "http://nginx"
|
||||
OG_PASSTHROUGH: "true"
|
||||
OG_EXPIRY_TIME: "24h"
|
||||
|
||||
+ # botstopper config here
|
||||
+ CHALLENGE_TITLE: "Doing math for your connnection!"
|
||||
+ ERROR_TITLE: "Something went wrong!"
|
||||
+ OVERLAY_FOLDER: /assets
|
||||
+ volumes:
|
||||
+ - "./your_folder:/assets"
|
||||
```
|
||||
|
||||
#### Example
|
||||
|
||||
There is an example in [docker-compose.yaml](https://github.com/TecharoHQ/botstopper/blob/main/docker-compose.yaml). Start the example with `docker compose up`:
|
||||
|
||||
```text
|
||||
docker compose up -d
|
||||
```
|
||||
|
||||
And then open [https://botstopper.local.cetacean.club:8443](https://botstopper.local.cetacean.club:8443) in your browser.
|
||||
|
||||
> [!NOTE]
|
||||
> This uses locally signed sacrificial TLS certificates stored in `./demo/pki`. Your browser will rightly reject these. Here is what the example looks like:
|
||||
>
|
||||
> 
|
||||
|
||||
## Custom images and CSS
|
||||
|
||||
Anubis uses an internal filesystem that contains CSS, JavaScript, and images. The BotStopper variant of Anubis lets you specify an overlay folder with the environment variable `OVERLAY_FOLDER`. The contents of this folder will be overlaid on top of Anubis' internal filesystem, allowing you to easily customize the images and CSS.
|
||||
|
||||
Your directory tree should look like this, assuming your data is in `./your_folder`:
|
||||
|
||||
```text
|
||||
./your_folder
|
||||
└── static
|
||||
├── css
|
||||
│ └── custom.css
|
||||
└── img
|
||||
├── happy.webp
|
||||
├── pensive.webp
|
||||
└── reject.webp
|
||||
```
|
||||
|
||||
For an example directory tree using some off-the-shelf images the Tango icon set, see the [testdata](https://github.com/TecharoHQ/botstopper/tree/main/testdata/static/img) folder.
|
||||
|
||||
### Custom CSS
|
||||
|
||||
CSS customization is done mainly with CSS variables. View [the example custom CSS file](https://github.com/TecharoHQ/botstopper/blob/main/testdata/static/css/custom.css) for more information about what can be customized.
|
||||
|
||||
### Custom fonts
|
||||
|
||||
If you want to add custom fonts, copy the `woff2` files alongside your `custom.css` file and then include them with the [`@font-face` CSS at-rule](https://developer.mozilla.org/en-US/docs/Web/CSS/@font-face):
|
||||
|
||||
```css
|
||||
@font-face {
|
||||
font-family: "Oswald";
|
||||
font-style: normal;
|
||||
font-weight: 200 900;
|
||||
font-display: swap;
|
||||
src: url("./fonts/oswald.woff2") format("woff2");
|
||||
}
|
||||
```
|
||||
|
||||
Then adjust your CSS variables accordingly:
|
||||
|
||||
```css
|
||||
:root {
|
||||
--body-sans-font: Oswald, sans-serif;
|
||||
--body-preformatted-font: monospace;
|
||||
--body-title-font: serif;
|
||||
}
|
||||
```
|
||||
|
||||
To convert `.ttf` fonts to [Web-optimized woff2 fonts](https://www.w3.org/TR/WOFF2/), use the `woff2_compress` command from the `woff2` or `woff2-tools` package:
|
||||
|
||||
```console
|
||||
$ woff2_compress oswald.ttf
|
||||
Processing oswald.ttf => oswald.woff2
|
||||
Compressed 159517 to 70469.
|
||||
```
|
||||
|
||||
Then you can import and use it as normal.
|
||||
|
||||
### Customizing images
|
||||
|
||||
Anubis uses three images to visually communicate the state of the program. These are:
|
||||
|
||||
| Image name | Intended message | Example |
|
||||
| :------------- | :----------------------------------------------- | :-------------------------------- |
|
||||
| `happy.webp` | You have passed validation, all is good |  |
|
||||
| `pensive.webp` | Checking is running, hold steady until it's done |  |
|
||||
| `reject.webp` | Something went wrong, this is a terminal state |  |
|
||||
|
||||
To make your own images at the optimal quality, use the following ffmpeg command:
|
||||
|
||||
```text
|
||||
ffmpeg -i /path/to/image -vf scale=-1:384 happy.webp
|
||||
```
|
||||
|
||||
`ffprobe` should report something like this on the generated images:
|
||||
|
||||
```text
|
||||
Input #0, webp_pipe, from 'happy.webp':
|
||||
Duration: N/A, bitrate: N/A
|
||||
Stream #0:0: Video: webp, none, 25 fps, 25 tbr, 25 tbn
|
||||
```
|
||||
|
||||
In testing 384 by 384 pixels gives the best balance between filesize, quality, and clarity.
|
||||
|
||||
```text
|
||||
$ du -hs *
|
||||
4.0K happy.webp
|
||||
12K pensive.webp
|
||||
8.0K reject.webp
|
||||
```
|
||||
|
||||
## Customizing messages
|
||||
|
||||
You can customize messages using the following environment variables:
|
||||
|
||||
| Message | Environment variable | Default |
|
||||
| :------------------- | :------------------- | :----------------------------------------- |
|
||||
| Challenge page title | `CHALLENGE_TITLE` | `Ensuring the security of your connection` |
|
||||
| Error page title | `ERROR_TITLE` | `Error` |
|
||||
|
||||
For example:
|
||||
|
||||
```sh
|
||||
# /etc/techaro-botstopper/gitea.env
|
||||
CHALLENGE_TITLE="Wait a moment please!"
|
||||
ERROR_TITLE="Client error"
|
||||
```
|
||||
@@ -77,7 +77,7 @@ For example, consider this rule:
|
||||
|
||||
For this rule, if a request comes in from `8.8.8.8` or `1.1.1.1`, Anubis will deny the request and return an error page.
|
||||
|
||||
#### `all` blocks
|
||||
### `all` blocks
|
||||
|
||||
An `all` block that contains a list of expressions. If all expressions in the list return `true`, then the action specified in the rule will be taken. If any of the expressions in the list returns `false`, Anubis will move on to the next rule.
|
||||
|
||||
@@ -99,15 +99,18 @@ For this rule, if a request comes in matching [the signature of the `go get` com
|
||||
|
||||
Anubis exposes the following variables to expressions:
|
||||
|
||||
| Name | Type | Explanation | Example |
|
||||
| :-------------- | :-------------------- | :---------------------------------------------------------------------------------------------------------------------------------------- | :----------------------------------------------------------- |
|
||||
| `headers` | `map[string, string]` | The [headers](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers) of the request being processed. | `{"User-Agent": "Mozilla/5.0 Gecko/20100101 Firefox/137.0"}` |
|
||||
| `host` | `string` | The [HTTP hostname](https://web.dev/articles/url-parts#host) the request is targeted to. | `anubis.techaro.lol` |
|
||||
| `method` | `string` | The [HTTP method](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Methods) in the request being processed. | `GET`, `POST`, `DELETE`, etc. |
|
||||
| `path` | `string` | The [path](https://web.dev/articles/url-parts#pathname) of the request being processed. | `/`, `/api/memes/create` |
|
||||
| `query` | `map[string, string]` | The [query parameters](https://web.dev/articles/url-parts#query) of the request being processed. | `?foo=bar` -> `{"foo": "bar"}` |
|
||||
| `remoteAddress` | `string` | The IP address of the client. | `1.1.1.1` |
|
||||
| `userAgent` | `string` | The [`User-Agent`](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/User-Agent) string in the request being processed. | `Mozilla/5.0 Gecko/20100101 Firefox/137.0` |
|
||||
| Name | Type | Explanation | Example |
|
||||
| :-------------- | :-------------------- | :-------------------------------------------------------------------------------------------------------------------------------------------- | :----------------------------------------------------------- |
|
||||
| `headers` | `map[string, string]` | The [headers](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers) of the request being processed. | `{"User-Agent": "Mozilla/5.0 Gecko/20100101 Firefox/137.0"}` |
|
||||
| `host` | `string` | The [HTTP hostname](https://web.dev/articles/url-parts#host) the request is targeted to. | `anubis.techaro.lol` |
|
||||
| `load_1m` | `double` | The current system load average over the last one minute. This is useful for making [load-based checks](#using-the-system-load-average). |
|
||||
| `load_5m` | `double` | The current system load average over the last five minutes. This is useful for making [load-based checks](#using-the-system-load-average). |
|
||||
| `load_15m` | `double` | The current system load average over the last fifteen minutes. This is useful for making [load-based checks](#using-the-system-load-average). |
|
||||
| `method` | `string` | The [HTTP method](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Methods) in the request being processed. | `GET`, `POST`, `DELETE`, etc. |
|
||||
| `path` | `string` | The [path](https://web.dev/articles/url-parts#pathname) of the request being processed. | `/`, `/api/memes/create` |
|
||||
| `query` | `map[string, string]` | The [query parameters](https://web.dev/articles/url-parts#query) of the request being processed. | `?foo=bar` -> `{"foo": "bar"}` |
|
||||
| `remoteAddress` | `string` | The IP address of the client. | `1.1.1.1` |
|
||||
| `userAgent` | `string` | The [`User-Agent`](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/User-Agent) string in the request being processed. | `Mozilla/5.0 Gecko/20100101 Firefox/137.0` |
|
||||
|
||||
Of note: in many languages when you look up a key in a map and there is nothing there, the language will return some "falsy" value like `undefined` in JavaScript, `None` in Python, or the zero value of the type in Go. In CEL, if you try to look up a value that does not exist, execution of the expression will fail and Anubis will return an error.
|
||||
|
||||
@@ -120,7 +123,7 @@ In order to avoid this, make sure the header or query parameter you are testing
|
||||
- 'path == "/index.php"'
|
||||
- '"title" in query'
|
||||
- '"action" in query'
|
||||
- 'query["action"] == "history"
|
||||
- 'query["action"] == "history"'
|
||||
```
|
||||
|
||||
This rule throws a challenge if and only if all of the following conditions are true:
|
||||
@@ -141,12 +144,74 @@ X-Real-Ip: 8.8.8.8
|
||||
|
||||
Anubis would return a challenge because all of those conditions are true.
|
||||
|
||||
### Using the system load average
|
||||
|
||||
In Unix-like systems (such as Linux), every process on the system has to wait its turn to be able to run. This means that as more processes on the system are running, they need to wait longer to be able to execute. The [load average](<https://en.wikipedia.org/wiki/Load_(computing)>) represents the number of processes that want to be able to run but can't run yet. This metric isn't the most reliable to identify a cause, but is great at helping to identify symptoms.
|
||||
|
||||
Anubis lets you use the system load average as an input to expressions so that you can make dynamic rules like "when the system is under a low amount of load, dial back the protection, but when it's under a lot of load, crank it up to the mix". This lets you get all of the blocking features of Anubis in the background but only really expose Anubis to users when the system is actively being attacked.
|
||||
|
||||
This is best combined with the [weight](../policies.mdx#request-weight) and [threshold](./thresholds.mdx) systems so that you can have Anubis dynamically respond to attacks. Consider these rules in the default configuration file:
|
||||
|
||||
```yaml
|
||||
## System load based checks.
|
||||
# If the system is under high load for the last minute, add weight.
|
||||
- name: high-load-average
|
||||
action: WEIGH
|
||||
expression: load_1m >= 10.0 # make sure to end the load comparison in a .0
|
||||
weight:
|
||||
adjust: 20
|
||||
|
||||
# If it is not for the last 15 minutes, remove weight.
|
||||
- name: low-load-average
|
||||
action: WEIGH
|
||||
expression: load_15m <= 4.0 # make sure to end the load comparison in a .0
|
||||
weight:
|
||||
adjust: -10
|
||||
```
|
||||
|
||||
This combination of rules makes Anubis dynamically react to the system load and only kick in when the system is under attack.
|
||||
|
||||
Something to keep in mind about system load average is that it is not aware of the number of cores the system has. If you have a 16 core system that has 16 processes running but none of them is hogging the CPU, then you will get a load average below 16. If you are in doubt, make your "high load" metric at least two times the number of CPU cores and your "low load" metric at least half of the number of CPU cores. For example:
|
||||
|
||||
| Kind | Core count | Load threshold |
|
||||
| --------: | :--------- | :------------- |
|
||||
| high load | 4 | `8.0` |
|
||||
| low load | 4 | `2.0` |
|
||||
| high load | 16 | `32.0` |
|
||||
| low load | 16 | `8` |
|
||||
|
||||
Also keep in mind that this does not account for other kinds of latency like I/O latency. A system can have its web applications unresponsive due to high latency from a MySQL server but still have that web application server report a load near or at zero.
|
||||
|
||||
## Functions exposed to Anubis expressions
|
||||
|
||||
Anubis expressions can be augmented with the following functions:
|
||||
|
||||
### `missingHeader`
|
||||
|
||||
Available in `bot` expressions.
|
||||
|
||||
```ts
|
||||
function missingHeader(headers: Record<string, string>, key: string) bool
|
||||
```
|
||||
|
||||
`missingHeader` returns `true` if the request does not contain a header. This is useful when you are trying to assert behavior such as:
|
||||
|
||||
```yaml
|
||||
# Adds weight to old versions of Chrome
|
||||
- name: old-chrome
|
||||
action: WEIGH
|
||||
weight:
|
||||
adjust: 10
|
||||
expression:
|
||||
all:
|
||||
- userAgent.matches("Chrome/[1-9][0-9]?\\.0\\.0\\.0")
|
||||
- missingHeader(headers, "Sec-Ch-Ua")
|
||||
```
|
||||
|
||||
### `randInt`
|
||||
|
||||
Available in all expressions.
|
||||
|
||||
```ts
|
||||
function randInt(n: int): int;
|
||||
```
|
||||
|
||||
70
docs/docs/admin/configuration/impressum.mdx
Normal file
70
docs/docs/admin/configuration/impressum.mdx
Normal file
@@ -0,0 +1,70 @@
|
||||
# Imprint / Impressum configuration
|
||||
|
||||
Some jurisdictions (such as the European Union and specifically Germany) [must have contact information freely available](https://www.privacycompany.eu/blog/the-imprint-requirement-a-must-have-for-companies-from-outside-germany) on an imprint/impressum page. Anubis supports creating an Anubis-specific imprint page for your organization with the `impressum` block in your bot policy file. For example:
|
||||
|
||||
```yaml
|
||||
impressum:
|
||||
# Displayed at the bottom of every page rendered by Anubis.
|
||||
footer: >-
|
||||
This website is hosted by Techaro. If you have any complaints or notes
|
||||
about the service, please contact
|
||||
<a href="mailto:contact@techaro.lol">contact@techaro.lol</a> and we
|
||||
will assist you as soon as possible.
|
||||
|
||||
# The imprint page that will be linked to at the footer of every Anubis page.
|
||||
page:
|
||||
# The HTML <title> of the page
|
||||
title: Imprint and Privacy Policy
|
||||
# The HTML contents of the page. The exact contents of this page can
|
||||
# and will vary by locale. Please consult with a lawyer if you are not
|
||||
# sure what to put here
|
||||
body: >-
|
||||
<p>Last updated: June 2025</p>
|
||||
|
||||
<h2>Information that is gathered from visitors</h2>
|
||||
|
||||
<p>In common with other websites, log files are stored on the web server saving details such as the visitor's IP address, browser type, referring page and time of visit.</p>
|
||||
|
||||
<p>Cookies may be used to remember visitor preferences when interacting with the website.</p>
|
||||
|
||||
<p>Where registration is required, the visitor's email and a username will be stored on the server.</p>
|
||||
|
||||
<!-- ... -->
|
||||
```
|
||||
|
||||
If you are subscribed to and using [advanced classification features](../thoth.mdx), be sure to disclose the following:
|
||||
|
||||
```html
|
||||
<h2>Techaro Anubis</h2>
|
||||
|
||||
<p>
|
||||
This website uses a service called
|
||||
<a href="https://anubis.techaro.lol">Anubis</a> by
|
||||
<a href="https://techaro.lol">Techaro</a> to filter malicious traffic. Anubis
|
||||
requires the use of browser cookies to ensure that web clients are running
|
||||
conformant software. Anubis also may report the following data to Techaro to
|
||||
improve service quality:
|
||||
</p>
|
||||
|
||||
<ul>
|
||||
<li>
|
||||
IP address (for purposes of matching against geo-location and BGP autonomous
|
||||
systems numbers), which is stored in-memory and not persisted to disk.
|
||||
</li>
|
||||
<li>
|
||||
Unique browser fingerprints (such as HTTP request fingerprints and
|
||||
encryption system fingerprints), which may be stored on Techaro's side for a
|
||||
period of up to one month.
|
||||
</li>
|
||||
<li>
|
||||
HTTP request metadata that may include things such as the User-Agent header
|
||||
and other identifiers.
|
||||
</li>
|
||||
</ul>
|
||||
|
||||
<p>
|
||||
This data is processed and stored for the legitimate interest of combatting
|
||||
abusive web clients. This data is encrypted at rest as much as possible and is
|
||||
only decrypted in memory for the purposes of fulfilling requests.
|
||||
</p>
|
||||
```
|
||||
@@ -9,12 +9,45 @@ This page provides detailed information on how to configure [Open Graph tag](htt
|
||||
|
||||
## Configuration Options
|
||||
|
||||
Open Graph settings are configured in the `openGraph` section of the [Policy File](../policies.mdx).
|
||||
|
||||
```yaml
|
||||
openGraph:
|
||||
# Enables Open Graph passthrough
|
||||
enabled: true
|
||||
# Enables the use of the HTTP host in the cache key, this enables
|
||||
# caching metadata for multiple http hosts at once.
|
||||
considerHost: true
|
||||
# How long cached OpenGraph metadata should last in memory
|
||||
ttl: 24h
|
||||
# If set, return these opengraph values instead of looking them up with
|
||||
# the target service.
|
||||
#
|
||||
# Correlates to properties in https://ogp.me/
|
||||
override:
|
||||
# og:title is required, it is the title of the website
|
||||
"og:title": "Techaro Anubis"
|
||||
"og:description": >-
|
||||
Anubis is a Web AI Firewall Utility that helps you fight the bots
|
||||
away so that you can maintain uptime at work!
|
||||
"description": >-
|
||||
Anubis is a Web AI Firewall Utility that helps you fight the bots
|
||||
away so that you can maintain uptime at work!
|
||||
```
|
||||
|
||||
<details>
|
||||
<summary>Configuration flags / envvars (old)</summary>
|
||||
|
||||
Open Graph passthrough used to be configured with configuration flags / environment variables. Reference to these settings are maintained for backwards compatibility's sake.
|
||||
|
||||
| Name | Description | Type | Default | Example |
|
||||
| ------------------------ | --------------------------------------------------------- | -------- | ------- | ----------------------------- |
|
||||
| `OG_PASSTHROUGH` | Enables or disables the Open Graph tag passthrough system | Boolean | `true` | `OG_PASSTHROUGH=true` |
|
||||
| `OG_EXPIRY_TIME` | Configurable cache expiration time for Open Graph tags | Duration | `24h` | `OG_EXPIRY_TIME=1h` |
|
||||
| `OG_CACHE_CONSIDER_HOST` | Enables or disables the use of the host in the cache key | Boolean | `false` | `OG_CACHE_CONSIDER_HOST=true` |
|
||||
|
||||
</details>
|
||||
|
||||
## Usage
|
||||
|
||||
To configure Open Graph tags, you can set the following environment variables, environment file or as flags in your Anubis configuration:
|
||||
|
||||
140
docs/docs/admin/configuration/thresholds.mdx
Normal file
140
docs/docs/admin/configuration/thresholds.mdx
Normal file
@@ -0,0 +1,140 @@
|
||||
# Weight Threshold Configuration
|
||||
|
||||
Anubis offers the ability to assign "weight" to requests. This is a custom level of suspicion that rules can add to or remove from. For example, here's how you assign 10 weight points to anything that might be a browser:
|
||||
|
||||
```yaml
|
||||
# botPolicies.yaml
|
||||
|
||||
bots:
|
||||
- name: generic-browser
|
||||
user_agent_regex: >-
|
||||
Mozilla|Opera
|
||||
action: WEIGH
|
||||
weight:
|
||||
adjust: 10
|
||||
```
|
||||
|
||||
Thresholds let you take this per-request weight value and take actions in response to it. Thresholds are defined alongside your bot configuration in `botPolicies.yaml`.
|
||||
|
||||
:::note
|
||||
|
||||
Thresholds DO NOT apply when a request matches a bot rule with the CHALLENGE action. Thresholds only apply when requests don't match any terminal bot rules.
|
||||
|
||||
:::
|
||||
|
||||
```yaml
|
||||
# botPolicies.yaml
|
||||
|
||||
bots: ...
|
||||
|
||||
thresholds:
|
||||
- name: minimal-suspicion
|
||||
expression: weight < 0
|
||||
action: ALLOW
|
||||
|
||||
- name: mild-suspicion
|
||||
expression:
|
||||
all:
|
||||
- weight >= 0
|
||||
- weight < 10
|
||||
action: CHALLENGE
|
||||
challenge:
|
||||
algorithm: metarefresh
|
||||
difficulty: 1
|
||||
report_as: 1
|
||||
|
||||
- name: moderate-suspicion
|
||||
expression:
|
||||
all:
|
||||
- weight >= 10
|
||||
- weight < 20
|
||||
action: CHALLENGE
|
||||
challenge:
|
||||
algorithm: fast
|
||||
difficulty: 2
|
||||
report_as: 2
|
||||
|
||||
- name: extreme-suspicion
|
||||
expression: weight >= 20
|
||||
action: CHALLENGE
|
||||
challenge:
|
||||
algorithm: fast
|
||||
difficulty: 4
|
||||
report_as: 4
|
||||
```
|
||||
|
||||
This defines a suite of 4 thresholds:
|
||||
|
||||
1. If the request weight is less than zero, allow it through.
|
||||
2. If the request weight is greater than or equal to zero, but less than ten: give it [a very lightweight challenge](./challenges/metarefresh.mdx).
|
||||
3. If the request weight is greater than or equal to ten, but less than twenty: give it [a slightly heavier challenge](./challenges/proof-of-work.mdx).
|
||||
4. Otherwise, give it [the heaviest challenge](./challenges/proof-of-work.mdx).
|
||||
|
||||
Thresholds can be configured with the following options:
|
||||
|
||||
<table>
|
||||
<thead>
|
||||
<tr>
|
||||
<th>Name</th>
|
||||
<th>Description</th>
|
||||
<th>Example</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr>
|
||||
<td>`name`</td>
|
||||
<td>The human-readable name for this threshold.</td>
|
||||
<td>
|
||||
|
||||
```yaml
|
||||
name: extreme-suspicion
|
||||
```
|
||||
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>`expression`</td>
|
||||
<td>A [CEL](https://cel.dev/) expression taking the request weight and returning true or false</td>
|
||||
<td>
|
||||
|
||||
To check if the request weight is less than zero:
|
||||
|
||||
```yaml
|
||||
expression: weight < 0
|
||||
```
|
||||
|
||||
To check if it's between 0 and 10 (inclusive):
|
||||
|
||||
```yaml
|
||||
expression:
|
||||
all:
|
||||
- weight >= 0
|
||||
- weight < 10
|
||||
```
|
||||
|
||||
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>`action`</td>
|
||||
<td>The Anubis action to apply: `ALLOW`, `CHALLENGE`, or `DENY`</td>
|
||||
<td>
|
||||
|
||||
```yaml
|
||||
action: ALLOW
|
||||
```
|
||||
|
||||
If you set the CHALLENGE action, you must set challenge details:
|
||||
|
||||
```yaml
|
||||
action: CHALLENGE
|
||||
challenge:
|
||||
algorithm: metarefresh
|
||||
difficulty: 1
|
||||
report_as: 1
|
||||
```
|
||||
|
||||
</td>
|
||||
</tr>
|
||||
|
||||
</tbody>
|
||||
</table>
|
||||
@@ -30,31 +30,10 @@ Effectively you have one trip through Apache to do TLS termination, a detour thr
|
||||
|
||||
:::note
|
||||
|
||||
These examples assume that you are using a setup where your nginx configuration is made up of a bunch of files in `/etc/httpd/conf.d/*.conf`. This is not true for all deployments of Apache. If you are not in such an environment, append these snippets to your `/etc/httpd/conf/httpd.conf` file.
|
||||
These examples assume that you are using a setup where your Apache configuration is made up of a bunch of files in `/etc/httpd/conf.d/*.conf`. This is not true for all deployments of Apache. If you are not in such an environment, append these snippets to your `/etc/httpd/conf/httpd.conf` file.
|
||||
|
||||
:::
|
||||
|
||||
## Dependencies
|
||||
|
||||
Install the following dependencies for proxying HTTP:
|
||||
|
||||
<Tabs>
|
||||
<TabItem value="rpm" label="Red Hat / RPM" default>
|
||||
|
||||
```text
|
||||
dnf -y install mod_proxy_html
|
||||
```
|
||||
|
||||
</TabItem>
|
||||
<TabItem value="deb" label="Debian / Ubuntu / apt">
|
||||
|
||||
```text
|
||||
apt-get install -y libapache2-mod-proxy-html libxml2-dev
|
||||
```
|
||||
|
||||
</TabItem>
|
||||
</Tabs>
|
||||
|
||||
## Configuration
|
||||
|
||||
Assuming you are protecting `anubistest.techaro.lol`, you need the following server configuration blocks:
|
||||
@@ -77,6 +56,7 @@ Assuming you are protecting `anubistest.techaro.lol`, you need the following ser
|
||||
</VirtualHost>
|
||||
|
||||
# HTTPS listener that forwards to Anubis
|
||||
<IfModule mod_proxy.c>
|
||||
<VirtualHost *:443>
|
||||
ServerAdmin your@email.here
|
||||
ServerName anubistest.techaro.lol
|
||||
|
||||
@@ -4,7 +4,7 @@ Docker compose is typically used in concert with other load balancers such as [A
|
||||
|
||||
```yaml
|
||||
services:
|
||||
anubis-nginx:
|
||||
anubis:
|
||||
image: ghcr.io/techarohq/anubis:latest
|
||||
environment:
|
||||
BIND: ":8080"
|
||||
@@ -15,10 +15,17 @@ services:
|
||||
POLICY_FNAME: "/data/cfg/botPolicy.yaml"
|
||||
OG_PASSTHROUGH: "true"
|
||||
OG_EXPIRY_TIME: "24h"
|
||||
healthcheck:
|
||||
test: ["CMD", "anubis", "--healthcheck"]
|
||||
interval: 5s
|
||||
timeout: 30s
|
||||
retries: 5
|
||||
start_period: 500ms
|
||||
ports:
|
||||
- 8080:8080
|
||||
volumes:
|
||||
- "./botPolicy.yaml:/data/cfg/botPolicy.yaml:ro"
|
||||
|
||||
nginx:
|
||||
image: nginx
|
||||
volumes:
|
||||
|
||||
@@ -4,9 +4,6 @@ title: Setting up Anubis
|
||||
|
||||
import RandomKey from "@site/src/components/RandomKey";
|
||||
|
||||
import Tabs from "@theme/Tabs";
|
||||
import TabItem from "@theme/TabItem";
|
||||
|
||||
Anubis is meant to sit between your reverse proxy (such as Nginx or Caddy) and your target service. One instance of Anubis must be used per service you are protecting.
|
||||
|
||||
<center>
|
||||
@@ -45,30 +42,45 @@ Anubis has very minimal system requirements. I suspect that 128Mi of ram may be
|
||||
|
||||
For more detailed information on installing Anubis with native packages, please read [the native install directions](./native-install.mdx).
|
||||
|
||||
## Environment variables
|
||||
## Configuration
|
||||
|
||||
Anubis is configurable via environment variables and [the policy file](./policies.mdx). Most settings are currently exposed with environment variables but they are being slowly moved over to the policy file.
|
||||
|
||||
### Configuration via the policy file
|
||||
|
||||
Currently the following settings are configurable via the policy file:
|
||||
|
||||
- [Bot policies](./policies.mdx)
|
||||
- [Open Graph passthrough](./configuration/open-graph.mdx)
|
||||
- [Weight thresholds](./configuration/thresholds.mdx)
|
||||
|
||||
### Environment variables
|
||||
|
||||
Anubis uses these environment variables for configuration:
|
||||
|
||||
| Environment Variable | Default value | Explanation |
|
||||
| :----------------------------- | :---------------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
|
||||
| `BASE_PREFIX` | unset | If set, adds a global prefix to all Anubis endpoints. For example, setting this to `/myapp` would make Anubis accessible at `/myapp/` instead of `/`. This is useful when running Anubis behind a reverse proxy that routes based on path prefixes. |
|
||||
| `BASE_PREFIX` | unset | If set, adds a global prefix to all Anubis endpoints (everything starting with `/.within.website/x/anubis/`). For example, setting this to `/myapp` would make Anubis accessible at `/myapp/` instead of `/`. This is useful when running Anubis behind a reverse proxy that routes based on path prefixes. |
|
||||
| `BIND` | `:8923` | The network address that Anubis listens on. For `unix`, set this to a path: `/run/anubis/instance.sock` |
|
||||
| `BIND_NETWORK` | `tcp` | The address family that Anubis listens on. Accepts `tcp`, `unix` and anything Go's [`net.Listen`](https://pkg.go.dev/net#Listen) supports. |
|
||||
| `COOKIE_DOMAIN` | unset | The domain the Anubis challenge pass cookie should be set to. This should be set to the domain you bought from your registrar (EG: `techaro.lol` if your webapp is running on `anubis.techaro.lol`). See this [stackoverflow explanation of cookies](https://stackoverflow.com/a/1063760) for more information.<br/><br/>Note that unlike `REDIRECT_DOMAINS`, you should never include a port number in this variable. |
|
||||
| `COOKIE_DYNAMIC_DOMAIN` | false | If set to true, automatically set cookie domain fields based on the hostname of the request. EG: if you are making a request to `anubis.techaro.lol`, the Anubis cookie will be valid for any subdomain of `techaro.lol`. |
|
||||
| `COOKIE_EXPIRATION_TIME` | `168h` | The amount of time the authorization cookie is valid for. |
|
||||
| `COOKIE_PARTITIONED` | `false` | If set to `true`, enables the [partitioned (CHIPS) flag](https://developers.google.com/privacy-sandbox/cookies/chips), meaning that Anubis inside an iframe has a different set of cookies than the domain hosting the iframe. |
|
||||
| `COOKIE_SECURE` | `true` | If set to `true`, enables the [Secure flag](https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/Cookies#block_access_to_your_cookies), meaning that the cookies will only be transmitted over HTTPS. If Anubis is used in an unsecure context (plain HTTP), this will be need to be set to false |
|
||||
| `DIFFICULTY` | `4` | The difficulty of the challenge, or the number of leading zeroes that must be in successful responses. |
|
||||
| `ED25519_PRIVATE_KEY_HEX` | unset | The hex-encoded ed25519 private key used to sign Anubis responses. If this is not set, Anubis will generate one for you. This should be exactly 64 characters long. See below for details. |
|
||||
| `ED25519_PRIVATE_KEY_HEX` | unset | The hex-encoded ed25519 private key used to sign Anubis responses. If this is not set, Anubis will generate one for you. This should be exactly 64 characters long. When running multiple instances on the same base domain, the key must be the same across all instances. See below for details. |
|
||||
| `ED25519_PRIVATE_KEY_HEX_FILE` | unset | Path to a file containing the hex-encoded ed25519 private key. Only one of this or its sister option may be set. |
|
||||
| `METRICS_BIND` | `:9090` | The network address that Anubis serves Prometheus metrics on. See `BIND` for more information. |
|
||||
| `METRICS_BIND_NETWORK` | `tcp` | The address family that the Anubis metrics server listens on. See `BIND_NETWORK` for more information. |
|
||||
| `OG_EXPIRY_TIME` | `24h` | The expiration time for the Open Graph tag cache. |
|
||||
| `OG_PASSTHROUGH` | `false` | If set to `true`, Anubis will enable Open Graph tag passthrough. |
|
||||
| `OG_CACHE_CONSIDER_HOST` | `false` | If set to `true`, Anubis will consider the host in the Open Graph tag cache key. |
|
||||
| `OG_EXPIRY_TIME` | `24h` | The expiration time for the Open Graph tag cache. Prefer using [the policy file](./configuration/open-graph.mdx) to configure the Open Graph subsystem. |
|
||||
| `OG_PASSTHROUGH` | `false` | If set to `true`, Anubis will enable Open Graph tag passthrough. Prefer using [the policy file](./configuration/open-graph.mdx) to configure the Open Graph subsystem. |
|
||||
| `OG_CACHE_CONSIDER_HOST` | `false` | If set to `true`, Anubis will consider the host in the Open Graph tag cache key. Prefer using [the policy file](./configuration/open-graph.mdx) to configure the Open Graph subsystem. |
|
||||
| `POLICY_FNAME` | unset | The file containing [bot policy configuration](./policies.mdx). See the bot policy documentation for more details. If unset, the default bot policy configuration is used. |
|
||||
| `REDIRECT_DOMAINS` | unset | If set, restrict the domains that Anubis can redirect to when passing a challenge.<br/><br/>If this is unset, Anubis may redirect to any domain which could cause security issues in the unlikely case that an attacker passes a challenge for your browser and then tricks you into clicking a link to your domain.<br/><br/>Note that if you are hosting Anubis on a non-standard port (`https://example:com:8443`, `http://www.example.net:8080`, etc.), you must also include the port number here. |
|
||||
| `SERVE_ROBOTS_TXT` | `false` | If set `true`, Anubis will serve a default `robots.txt` file that disallows all known AI scrapers by name and then additionally disallows every scraper. This is useful if facts and circumstances make it difficult to change the underlying service to serve such a `robots.txt` file. |
|
||||
| `SOCKET_MODE` | `0770` | _Only used when at least one of the `*_BIND_NETWORK` variables are set to `unix`._ The socket mode (permissions) for Unix domain sockets. |
|
||||
| `STRIP_BASE_PREFIX` | `false` | If set to `true`, strips the base prefix from request paths when forwarding to the target server. This is useful when your target service expects to receive requests without the base prefix. For example, with `BASE_PREFIX=/foo` and `STRIP_BASE_PREFIX=true`, a request to `/foo/bar` would be forwarded to the target as `/bar`. |
|
||||
| `TARGET` | `http://localhost:3923` | The URL of the service that Anubis should forward valid requests to. Supports Unix domain sockets, set this to a URI like so: `unix:///path/to/socket.sock`. |
|
||||
| `USE_REMOTE_ADDRESS` | unset | If set to `true`, Anubis will take the client's IP from the network socket. For production deployments, it is expected that a reverse proxy is used in front of Anubis, which pass the IP using headers, instead. |
|
||||
| `WEBMASTER_EMAIL` | unset | If set, shows a contact email address when rendering error pages. This email address will be how users can get in contact with administrators. |
|
||||
@@ -83,11 +95,12 @@ If you don't know or understand what these settings mean, ignore them. These are
|
||||
|
||||
:::
|
||||
|
||||
| Environment Variable | Default value | Explanation |
|
||||
| :---------------------------- | :------------ | :-------------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||
| `TARGET_SNI` | unset | If set, overrides the TLS handshake hostname in requests forwarded to `TARGET`. |
|
||||
| `TARGET_HOST` | unset | If set, overrides the Host header in requests forwarded to `TARGET`. |
|
||||
| `TARGET_INSECURE_SKIP_VERIFY` | `false` | If `true`, skip TLS certificate validation for targets that listen over `https`. If your backend does not listen over `https`, ignore this setting. |
|
||||
| Environment Variable | Default value | Explanation |
|
||||
| :---------------------------- | :------------ | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||
| `TARGET_SNI` | unset | If set, overrides the TLS handshake hostname in requests forwarded to `TARGET`. |
|
||||
| `TARGET_HOST` | unset | If set, overrides the Host header in requests forwarded to `TARGET`. |
|
||||
| `TARGET_INSECURE_SKIP_VERIFY` | `false` | If `true`, skip TLS certificate validation for targets that listen over `https`. If your backend does not listen over `https`, ignore this setting. |
|
||||
| `HS512_SECRET` | unset | Secret string for JWT HS512 algorithm. If this is not set, Anubis will use ED25519 as defined via the variables above. The longer the better; 128 chars should suffice. |
|
||||
|
||||
</details>
|
||||
|
||||
@@ -129,6 +142,22 @@ With corresponding Anubis configuration:
|
||||
BASE_PREFIX=/myapp
|
||||
```
|
||||
|
||||
#### Stripping Base Prefix
|
||||
|
||||
If your target service doesn't expect to receive the base prefix in request paths, you can use the `STRIP_BASE_PREFIX` option:
|
||||
|
||||
```
|
||||
BASE_PREFIX=/myapp
|
||||
STRIP_BASE_PREFIX=true
|
||||
```
|
||||
|
||||
With this configuration:
|
||||
|
||||
- A request to `/myapp/api/users` would be forwarded to your target service as `/api/users`
|
||||
- A request to `/myapp/` would be forwarded as `/`
|
||||
|
||||
This is particularly useful when working with applications that weren't designed to handle path prefixes. However, note that if your target application generates absolute redirects or links (like `/login` instead of `./login`), these may break the subpath routing since they won't include the base prefix.
|
||||
|
||||
### Key generation
|
||||
|
||||
To generate an ed25519 private key, you can use this command:
|
||||
|
||||
@@ -137,7 +137,7 @@ Test to make sure it's running with `curl`:
|
||||
curl http://localhost:8240/metrics
|
||||
```
|
||||
|
||||
Then set up your reverse proxy (Nginx, Caddy, etc.) to point to the Anubis port. Anubis will then reverse proxy all requests that meet the policies in `/etc/anubis/gitea.botPolicies.json` to the target service.
|
||||
Then set up your reverse proxy (Nginx, Caddy, etc.) to point to the Anubis port. Anubis will then reverse proxy all requests that meet the policies in `/etc/anubis/gitea.botPolicies.yaml` to the target service.
|
||||
|
||||
For more details on particular reverse proxies, see here:
|
||||
|
||||
|
||||
@@ -233,6 +233,124 @@ remote_addresses:
|
||||
</TabItem>
|
||||
</Tabs>
|
||||
|
||||
## Imprint / Impressum support
|
||||
|
||||
Anubis has support for showing imprint / impressum information. This is defined in the `impressum` block of your configuration. See [Imprint / Impressum configuration](./configuration/impressum.mdx) for more information.
|
||||
|
||||
## Storage backends
|
||||
|
||||
Anubis needs to store temporary data in order to determine if a user is legitimate or not. Administrators should choose a storage backend based on their infrastructure needs. Each backend has its own advantages and disadvantages.
|
||||
|
||||
Anubis offers the following storage backends:
|
||||
|
||||
- [`memory`](#memory) -- A simple in-memory hashmap
|
||||
- [`bbolt`](#bbolt) -- An on-disk key/value store backed by [bbolt](https://github.com/etcd-io/bbolt), an embedded key/value database for Go programs
|
||||
- [`valkey`](#valkey) -- A remote in-memory key/value database backed by [Valkey](https://valkey.io/) (or another database compatible with the [RESP](https://redis.io/docs/latest/develop/reference/protocol-spec/) protocol)
|
||||
|
||||
If no storage backend is set in the policy file, Anubis will use the [`memory`](#memory) backend by default. This is equivalent to the following in the policy file:
|
||||
|
||||
```yaml
|
||||
store:
|
||||
backend: memory
|
||||
parameters: {}
|
||||
```
|
||||
|
||||
### `memory`
|
||||
|
||||
The memory backend is an in-memory cache. This backend works best if you don't use multiple instances of Anubis or don't have mutable storage in the environment you're running Anubis in.
|
||||
|
||||
| Should I use this backend? | Yes/no |
|
||||
| :------------------------------------------------------------ | :----- |
|
||||
| Are you running only one instance of Anubis for this service? | ✅ Yes |
|
||||
| Does your service get a lot of traffic? | 🚫 No |
|
||||
| Do you want to store data persistently when Anubis restarts? | 🚫 No |
|
||||
| Do you run Anubis without mutable filesystem storage? | ✅ Yes |
|
||||
|
||||
The biggest downside is that there is not currently a limit to how much data can be stored in memory. This will be addressed at a later time.
|
||||
|
||||
:::warning
|
||||
|
||||
The in-memory backend exists mostly for validation, testing, and to ensure that the default configuration of Anubis works as expected. Do not use this persistently in production.
|
||||
|
||||
:::
|
||||
|
||||
#### Configuration
|
||||
|
||||
The memory backend does not require any configuration to use.
|
||||
|
||||
### `bbolt`
|
||||
|
||||
An on-disk storage layer powered by [bbolt](https://github.com/etcd-io/bbolt), a high performance embedded key/value database used by containerd, etcd, Kubernetes, and NATS. This backend works best if you're running Anubis on a single host and get a lot of traffic.
|
||||
|
||||
| Should I use this backend? | Yes/no |
|
||||
| :------------------------------------------------------------ | :----- |
|
||||
| Are you running only one instance of Anubis for this service? | ✅ Yes |
|
||||
| Does your service get a lot of traffic? | ✅ Yes |
|
||||
| Do you want to store data persistently when Anubis restarts? | ✅ Yes |
|
||||
| Do you run Anubis without mutable filesystem storage? | 🚫 No |
|
||||
|
||||
When Anubis opens a bbolt database, it takes an exclusive lock on that database. Other instances of Anubis or other tools cannot view the bbolt database while it is locked by another instance of Anubis. If you run multiple instances of Anubis for different services, give each its own `bbolt` configuration.
|
||||
|
||||
#### Configuration
|
||||
|
||||
The `bbolt` backend takes the following configuration options:
|
||||
|
||||
| Name | Type | Example | Description |
|
||||
| :----- | :--- | :----------------- | :--------------------------------------------------------------------------------------------------------------------------- |
|
||||
| `path` | path | `/data/anubis.bdb` | The filesystem path for the Anubis bbolt database. Anubis requires write access to the folder containing the bbolt database. |
|
||||
|
||||
Example:
|
||||
|
||||
If you have persistent storage mounted to `/data`, then your store configuration could look like this:
|
||||
|
||||
```yaml
|
||||
store:
|
||||
backend: bbolt
|
||||
parameters:
|
||||
path: /data/anubis.bdb
|
||||
```
|
||||
|
||||
### `valkey`
|
||||
|
||||
[Valkey](https://valkey.io/) is an in-memory key/value store that clients access over the network. This allows multiple instances of Anubis to share information and does not require each instance of Anubis to have persistent filesystem storage.
|
||||
|
||||
:::note
|
||||
|
||||
You can also use [Redis](http://redis.io/) with Anubis.
|
||||
|
||||
:::
|
||||
|
||||
This backend is ideal if you are running multiple instances of Anubis in a worker pool (eg: Kubernetes Deployments with a copy of Anubis in each Pod).
|
||||
|
||||
| Should I use this backend? | Yes/no |
|
||||
| :------------------------------------------------------------ | :----- |
|
||||
| Are you running only one instance of Anubis for this service? | 🚫 No |
|
||||
| Does your service get a lot of traffic? | ✅ Yes |
|
||||
| Do you want to store data persistently when Anubis restarts? | ✅ Yes |
|
||||
| Do you run Anubis without mutable filesystem storage? | ✅ Yes |
|
||||
| Do you have Redis or Valkey installed? | ✅ Yes |
|
||||
|
||||
#### Configuration
|
||||
|
||||
The `valkey` backend takes the following configuration options:
|
||||
|
||||
| Name | Type | Example | Description |
|
||||
| :---- | :----- | :---------------------- | :----------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||
| `url` | string | `redis://valkey:6379/0` | The URL for the instance of Redis or Valkey that Anubis should store data in. This is in the same format as `REDIS_URL` in many cloud providers. |
|
||||
|
||||
Example:
|
||||
|
||||
If you have an instance of Valkey running with the hostname `valkey.int.techaro.lol`, then your store configuration could look like this:
|
||||
|
||||
```yaml
|
||||
store:
|
||||
backend: valkey
|
||||
parameters:
|
||||
url: "redis://valkey.int.techaro.lol:6379/0"
|
||||
```
|
||||
|
||||
This would have the Valkey client connect to host `valkey.int.techaro.lol` on port `6379` with database `0` (the default database).
|
||||
|
||||
## Risk calculation for downstream services
|
||||
|
||||
In case your service needs it for risk calculation reasons, Anubis exposes information about the rules that any requests match using a few headers:
|
||||
@@ -261,17 +379,11 @@ Anubis rules can also add or remove "weight" from requests, allowing administrat
|
||||
adjust: -5
|
||||
```
|
||||
|
||||
This would remove five weight points from the request, making Anubis present the [Meta Refresh challenge](./configuration/challenges/metarefresh.mdx).
|
||||
This would remove five weight points from the request, which would make Anubis present the [Meta Refresh challenge](./configuration/challenges/metarefresh.mdx) in the default configuration.
|
||||
|
||||
### Weight Thresholds
|
||||
|
||||
Weight thresholds and challenge associations will be configurable with CEL expressions in the configuration file in an upcoming patch, for now here's how Anubis configures the weight thresholds:
|
||||
|
||||
| Weight Expression | Action |
|
||||
| -----------------------------------------------------: | :------------------------------------------------------------------------------------------------------------------------------------- |
|
||||
| `weight < 0` (weight is less than 0) | Allow the request through. |
|
||||
| `weight < 10` (weight is less than 10) | Challenge the client with the [Meta Refresh challenge](./configuration/challenges/metarefresh.mdx) at the default difficulty level. |
|
||||
| `weight >= 10` (weight is greater than or equal to 10) | Challenge the client with the [Proof of Work challenge](./configuration/challenges/proof-of-work.mdx) at the default difficulty level. |
|
||||
For more information on configuring weight thresholds, see [Weight Threshold Configuration](./configuration/thresholds.mdx)
|
||||
|
||||
### Advice
|
||||
|
||||
|
||||
84
docs/docs/admin/robots2policy.mdx
Normal file
84
docs/docs/admin/robots2policy.mdx
Normal file
@@ -0,0 +1,84 @@
|
||||
---
|
||||
title: robots2policy CLI Tool
|
||||
sidebar_position: 50
|
||||
---
|
||||
|
||||
The `robots2policy` tool converts robots.txt files into Anubis challenge policies. It reads robots.txt rules and generates equivalent CEL expressions for path matching and user-agent filtering.
|
||||
|
||||
## Installation
|
||||
|
||||
Install directly with Go:
|
||||
|
||||
```bash
|
||||
go install github.com/TecharoHQ/anubis/cmd/robots2policy@latest
|
||||
```
|
||||
## Usage
|
||||
|
||||
Basic conversion from URL:
|
||||
|
||||
```bash
|
||||
robots2policy -input https://www.example.com/robots.txt
|
||||
```
|
||||
|
||||
Convert local file to YAML:
|
||||
|
||||
```bash
|
||||
robots2policy -input robots.txt -output policy.yaml
|
||||
```
|
||||
|
||||
Convert with custom settings:
|
||||
|
||||
```bash
|
||||
robots2policy -input robots.txt -action DENY -format json
|
||||
```
|
||||
|
||||
## Options
|
||||
|
||||
| Flag | Description | Default |
|
||||
|-----------------------|--------------------------------------------------------------------|---------------------|
|
||||
| `-input` | robots.txt file path or URL (use `-` for stdin) | *required* |
|
||||
| `-output` | Output file (use `-` for stdout) | stdout |
|
||||
| `-format` | Output format: `yaml` or `json` | `yaml` |
|
||||
| `-action` | Action for disallowed paths: `ALLOW`, `DENY`, `CHALLENGE`, `WEIGH` | `CHALLENGE` |
|
||||
| `-name` | Policy name prefix | `robots-txt-policy` |
|
||||
| `-crawl-delay-weight` | Weight adjustment for crawl-delay rules | `3` |
|
||||
| `-deny-user-agents` | Action for blacklisted user agents | `DENY` |
|
||||
|
||||
## Example
|
||||
|
||||
Input robots.txt:
|
||||
```txt
|
||||
User-agent: *
|
||||
Disallow: /admin/
|
||||
Disallow: /private
|
||||
|
||||
User-agent: BadBot
|
||||
Disallow: /
|
||||
```
|
||||
|
||||
Generated policy:
|
||||
```yaml
|
||||
- name: robots-txt-policy-disallow-1
|
||||
action: CHALLENGE
|
||||
expression:
|
||||
single: path.startsWith("/admin/")
|
||||
- name: robots-txt-policy-disallow-2
|
||||
action: CHALLENGE
|
||||
expression:
|
||||
single: path.startsWith("/private")
|
||||
- name: robots-txt-policy-blacklist-3
|
||||
action: DENY
|
||||
expression:
|
||||
single: userAgent.contains("BadBot")
|
||||
```
|
||||
|
||||
## Using the Generated Policy
|
||||
|
||||
Save the output and import it in your main policy file:
|
||||
|
||||
```yaml
|
||||
import:
|
||||
- path: "./robots-policy.yaml"
|
||||
```
|
||||
|
||||
The tool handles wildcard patterns, user-agent specific rules, and blacklisted bots automatically.
|
||||
81
docs/docs/admin/thoth.mdx
Normal file
81
docs/docs/admin/thoth.mdx
Normal file
@@ -0,0 +1,81 @@
|
||||
# Thoth-based advanced checks
|
||||
|
||||
Status: Beta
|
||||
|
||||
Anubis instances are normally isolated. Each Anubis instance has its own configuration and exists in roughly its own world without any long term memory between requests. As threats, workarounds, and AI scraper toolchains evolve, administrators will need a way to get more up to date information faster than Anubis' release cycle.
|
||||
|
||||
Thus, Thoth is being created. Thoth is the reputation database for Anubis. Thoth feeds information to Anubis so that it can make better decisions about which traffic is innocuous and which traffic is suspicious.
|
||||
|
||||
:::note
|
||||
|
||||
Thoth is hosted by [Techaro](https://techaro.lol). Thoth is a paid service. Thoth is opt-in and requires manual intervention (including payment) to use. The code that powers Thoth is currently closed source.
|
||||
|
||||
To get access to Thoth, please subscribe [on GitHub Sponsors](https://github.com/sponsors/Xe) and [email Xe](mailto:xe@techaro.lol). This will be self-service soon.
|
||||
|
||||
:::
|
||||
|
||||
## Implementation
|
||||
|
||||
Thoth is a web service that listens over [gRPC](https://grpc.io/). Thoth's API is documented in protocol buffer definitions in the GitHub repo [TecharoHQ/thoth-proto](https://github.com/TecharoHQ/thoth-proto).
|
||||
|
||||
Thoth is designed to be _informative_, not _authoritative_. Thoth cannot and will not arbitrarily block requests, origins, or other traffic. Thoth is there to inform Anubis and influence the weight of requests so that upstream resources can be protected. Additionally, Anubis aggressively caches data from Thoth such that over time Anubis will not need to request data very often. This makes the fast path for repeat visitors even faster and reduces the amount of data that Thoth is exposed to.
|
||||
|
||||
## Thoth features
|
||||
|
||||
Thoth is currently in active development. Currently, Thoth provides the following features to Anubis:
|
||||
|
||||
- BGP Autonomous System (ASN) based filtering
|
||||
- GeoIP location based filtering
|
||||
|
||||
### ASN-based filtering
|
||||
|
||||
When companies link their backbone infrastructure to the Internet, they do so via a [BGP Autonomous System](<https://en.wikipedia.org/wiki/Autonomous_system_(Internet)>), denoted by a number (the Autonomous System Number or ASN). Every IP address on the Internet is owned by an ASN with a 1:1 lookup that does not change very frequently.
|
||||
|
||||
Anubis uses Thoth to match IP addresses to BGP Autonomous Systems so that you can either issue arbitrary challenges to individual internet service providers (such as Cloudflare or Huawei Cloud) or, at the administrator's explicit instruction, block them altogether. For example, here's how you add 10 weight points to requests from Cloudflare, Huawei Cloud, and Alibaba Cloud:
|
||||
|
||||
```yaml
|
||||
- name: aggressive-asns-without-functional-abuse-contact
|
||||
action: WEIGH
|
||||
asns:
|
||||
match:
|
||||
- 13335 # Cloudflare
|
||||
- 136907 # Huawei Cloud
|
||||
- 45102 # Alibaba Cloud
|
||||
weight:
|
||||
adjust: 10
|
||||
```
|
||||
|
||||
You can look up details for [AS13335](https://bgp.tools/as/13335) or any of these other top offenders on [bgp.tools](https://bgp.tools).
|
||||
|
||||
### GeoIP-based filtering
|
||||
|
||||
In extreme cases, an administrator may have to take action against an entire country. This is not an ideal circumstance, but sometimes reality forces their hands and the administrators just want to sleep at night.
|
||||
|
||||
Anubis uses Thoth to look up the geographic location registered to an IP address. This lookup is not the best and will get better with time, but you ship what you can so you can make it better for next time.
|
||||
|
||||
For example, to add 10 weight points to requests from Brazil and China:
|
||||
|
||||
```yaml
|
||||
- name: countries-with-aggressive-scrapers
|
||||
action: WEIGH
|
||||
geoip:
|
||||
countries:
|
||||
- BR
|
||||
- CN
|
||||
weight:
|
||||
adjust: 10
|
||||
```
|
||||
|
||||
Use this with care.
|
||||
|
||||
## Work-in-progress features
|
||||
|
||||
This section is a bit aspirational and is where Thoth will end up rather than things you can use today.
|
||||
|
||||
In general, a lot of Thoth features are focused on taking the same Anubis you know and love and making it better, smarter, and less paranoid. These include:
|
||||
|
||||
- Private rulesets for advanced patterns, current known exploits, and other recognition tactics that need to be kept cloak and dagger for operational security reasons
|
||||
- Private challenge implementations via WebAssembly, including advanced browser detection logic
|
||||
- Reputation querying so that Thoth can arbitrarily influence the weight of requests based on the net aggregate pass rate so that the most common browsers can get through with no challenge issued at all
|
||||
- APIs for trusted administrators to report abusive request fingerprints so that Anubis can react to threats as they evolve
|
||||
- A way for Anubis to periodically report the pass rate per ASN and other fingerprints so that methodology can be improved
|
||||
@@ -107,7 +107,6 @@ This ensures that the token has enough metadata to prove that the token is valid
|
||||
Challenges are formed by taking some user request metadata and using that to generate a SHA-256 checksum. The following request headers are used:
|
||||
|
||||
- `Accept-Encoding`: The content encodings that the requestor supports, such as gzip.
|
||||
- `Accept-Language`: The language that the requestor would prefer the server respond in, such as English.
|
||||
- `X-Real-Ip`: The IP address of the requestor, as set by a reverse proxy server.
|
||||
- `User-Agent`: The user agent string of the requestor.
|
||||
- The current time in UTC rounded to the nearest week.
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
title: Why does Anubis use Proof-of-Work?
|
||||
---
|
||||
|
||||
Anubis uses a [proof of work](https://en.wikipedia.org/wiki/Proof_of_work) in order to validate that clients are genuine. The reason Anubis does this was inspired by [Hashcash](https://en.wikipedia.org/wiki/Hashcash), a suggestion from the early 2000's about extending the email protocol to avoid spam. The idea is that genuine people sending emails will have to do a small math problem that is expensive to compute, but easy to verify such as hashing a string with a given number of leading zeroes. This will have basically no impact on individuals sending a few emails a week, but the company churning out industrial quantities of advertising will be required to do prohibitively expensive computation. This is also how Bitcoin's consensus algorithm works.
|
||||
Anubis uses [proof of work](https://en.wikipedia.org/wiki/Proof_of_work) in order to validate that clients are genuine. The reason Anubis does this was inspired by [Hashcash](https://en.wikipedia.org/wiki/Hashcash), a suggestion from the early 2000's about extending the email protocol to avoid spam. The idea is that genuine people sending emails will have to do a small math problem that is expensive to compute, but easy to verify such as hashing a string with a given number of leading zeroes. This will have basically no impact on individuals sending a few emails a week, but the company churning out industrial quantities of advertising will be required to do prohibitively expensive computation. This is also how Bitcoin's consensus algorithm works.
|
||||
|
||||
## How Anubis' proof of work scheme works
|
||||
|
||||
@@ -21,16 +21,3 @@ const hash = await sha256(`${challenge}${nonce}`);
|
||||
In order to pass a challenge, the `hash` has to have the right number of leading zeros (the "difficulty"). When a client requests to pass the challenge, they include the nonce they used. The server then only has to do one sha256 operation: the one that confirms that the challenge (generated from request metadata) and the nonce (provided by the client) match the difficulty number of leading zeroes.
|
||||
|
||||
Ultimately, this is a hack whose real purpose is to give a "good enough" placeholder solution so that more time can be spent on fingerprinting and identifying headless browsers (EG via how they do font rendering) so that the challenge proof of work page doesn't need to be presented to known legitimate users.
|
||||
|
||||
## Challenge format
|
||||
|
||||
Anubis generates challenges based on browser metadata, including but not limited to the following:
|
||||
|
||||
- The contents of your [`Accept-Language` header](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Accept-Language)
|
||||
- The IP address of your client
|
||||
- Your browser's [`User-Agent` string](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/User-Agent)
|
||||
- The date of the current week, rooted on Sundays
|
||||
- Anubis' ed25519 public signing key for [JSON web tokens](https://jwt.io/) (JWTs)
|
||||
- The challenge difficulty
|
||||
|
||||
This is intended to be a random value that is difficult for attackers to forge and guess, but also deterministic enough that it will naturally reset itself.
|
||||
|
||||
@@ -58,7 +58,9 @@ This will build all static assets (CSS, JavaScript) for distribution.
|
||||
make build
|
||||
```
|
||||
|
||||
From this point it is up to you to make sure that `./var/anubis` ends up in the right place. You may want to consult the `./run` folder for useful files such as a systemd unit and `anubis.env.default` file.
|
||||
From this point it is up to you to make sure that `./var/anubis` and `./var/robots2policy` end up in
|
||||
the right place. You may want to consult the `./run` folder for useful files such as a systemd unit
|
||||
and `anubis.env.default` file.
|
||||
|
||||
## "Pre-baked" tarball
|
||||
|
||||
@@ -75,7 +77,7 @@ When using this tarball, all you need to do is build `./cmd/anubis`:
|
||||
make prebaked-build
|
||||
```
|
||||
|
||||
Anubis will be built to `./var/anubis`.
|
||||
Anubis will be built to `./var/anubis` and the robots2policy tool to `./var/robots2policy`.
|
||||
|
||||
## Development dependencies
|
||||
|
||||
|
||||
@@ -2,6 +2,10 @@
|
||||
title: Local development
|
||||
---
|
||||
|
||||
If you use an editor with [Development containers](https://containers.dev) support, load this repo's [devcontainer configuration](https://github.com/TecharoHQ/anubis/tree/main/.devcontainer). Skip to [Running Anubis locally](#running-anubis-locally) if you are using the devcontainer.
|
||||
|
||||
This enables you to contribute from [GitHub Codespaces](https://github.com/features/codespaces) or other web-based editors.
|
||||
|
||||
:::note
|
||||
|
||||
TL;DR: `npm ci && npm run dev`
|
||||
|
||||
@@ -14,6 +14,7 @@ title: Anubis
|
||||

|
||||

|
||||

|
||||
[](https://github.com/sponsors/Xe)
|
||||
|
||||
## Sponsors
|
||||
|
||||
@@ -57,6 +58,20 @@ Anubis is brought to you by sponsors and donors like:
|
||||
<a href="https://wildbase.xyz/">
|
||||
<img src="/img/sponsors/wildbase-logo.webp" alt="Wildbase" height="64" />
|
||||
</a>
|
||||
<a href="https://emma.pet">
|
||||
<img
|
||||
src="/img/sponsors/nepeat-logo.webp"
|
||||
alt="Cat eyes over the word Emma in a serif font"
|
||||
height="64"
|
||||
/>
|
||||
</a>
|
||||
<a href="https://fabulous.systems/">
|
||||
<img
|
||||
src="/img/sponsors/fabulous-systems.webp"
|
||||
alt="Cat eyes over the word Emma in a serif font"
|
||||
height="64"
|
||||
/>
|
||||
</a>
|
||||
|
||||
## Overview
|
||||
|
||||
|
||||
@@ -21,8 +21,4 @@ If you use a browser extension such as [JShelter](https://jshelter.org/), you wi
|
||||
|
||||
## Does Anubis mine Bitcoin?
|
||||
|
||||
No. Anubis does not mine Bitcoin.
|
||||
|
||||
In order to mine bitcoin, you need to download a copy of the blockchain (so you have the state required to do mining) and also broadcast your mined blocks to the network should you reach a hash with the right number of leading zeroes. You also need to continuously read for newly broadcasted transactions so you can batch them into a block. This requires gigabytes of data to be transferred from the server to the client.
|
||||
|
||||
Anubis transfers two digit numbers of kilobytes from the server to the client (which you can independently verify with your browser's Developer Tools feature). This is orders of magnitude below what is required to mine Bitcoin.
|
||||
No. Anubis does not mine Bitcoin or any other cryptocurrency.
|
||||
|
||||
@@ -41,13 +41,37 @@ This page contains a non-exhaustive list with all websites using Anubis.
|
||||
- https://minihoot.site
|
||||
- https://catgirl.click/
|
||||
- https://wiki.dolphin-emu.org/
|
||||
- https://squirreljme.cc/
|
||||
- https://gitlab.postmarketos.org/
|
||||
- https://wiki.koha-community.org/
|
||||
- https://extensions.typo3.org/
|
||||
- https://ebird.org/
|
||||
- https://fabulous.systems/
|
||||
- https://coinhoards.org/
|
||||
- https://pluralpedia.org/
|
||||
- https://git.aya.so/
|
||||
- https://marginalia-search.com/
|
||||
- https://repositorio.ufrn.br/home/
|
||||
- https://mozillazine.org/
|
||||
- https://clew.se/
|
||||
- https://tumfatig.net/
|
||||
- https://rpmfusion.org/
|
||||
- https://wiki.freepascal.org/
|
||||
- https://azurlane.koumakan.jp/
|
||||
- <details>
|
||||
<summary>FreeCAD</summary>
|
||||
- https://forum.freecad.org/
|
||||
- https://wiki.freecad.org/
|
||||
</details>
|
||||
- <details>
|
||||
<summary>ReactOS</summary>
|
||||
- https://reactos.org/forum
|
||||
- https://reactos.org/wiki
|
||||
- https://git.reactos.org
|
||||
</details>
|
||||
- <details>
|
||||
<summary>ScummVM</summary>
|
||||
- https://bugs.scummvm.org/
|
||||
- https://forums.scummvm.org/
|
||||
- https://wiki.scummvm.org/
|
||||
</details>
|
||||
@@ -71,3 +95,21 @@ This page contains a non-exhaustive list with all websites using Anubis.
|
||||
- https://karla.hds.hebis.de/
|
||||
- and many more (see https://www.hebis.de/dienste/hebis-discovery-system/)
|
||||
</details>
|
||||
- <details>
|
||||
<summary>Duke University</summary>
|
||||
- https://repository.duke.edu/
|
||||
- https://archives.lib.duke.edu/
|
||||
- https://find.library.duke.edu/
|
||||
- https://nicholas.duke.edu/
|
||||
</details>
|
||||
- <details>
|
||||
<summary>Forschungszentrum Jülich</summary>
|
||||
- https://juser.fz-juelich.de/
|
||||
</details>
|
||||
- <details>
|
||||
<summary>archlinux32.org</summary>
|
||||
- https://www.archlinux32.org/packages/
|
||||
- https://bbs.archlinux32.org/
|
||||
- https://bugs.archlinux32.org/
|
||||
</details>
|
||||
|
||||
|
||||
@@ -6,7 +6,7 @@ import type * as Preset from '@docusaurus/preset-classic';
|
||||
|
||||
const config: Config = {
|
||||
title: 'Anubis',
|
||||
tagline: 'Weigh the soul of incoming HTTP requests using proof-of-work to stop AI crawlers',
|
||||
tagline: 'Weigh the soul of incoming HTTP requests to protect your website!',
|
||||
favicon: 'img/favicon.ico',
|
||||
|
||||
// Set the production url of your site here
|
||||
@@ -40,28 +40,21 @@ const config: Config = {
|
||||
[
|
||||
'classic',
|
||||
{
|
||||
blog: {
|
||||
showReadingTime: true,
|
||||
feedOptions: {
|
||||
type: ['rss', 'atom', "json"],
|
||||
xslt: true,
|
||||
},
|
||||
editUrl: 'https://github.com/TecharoHQ/anubis/tree/main/docs/',
|
||||
onInlineTags: 'warn',
|
||||
onInlineAuthors: 'warn',
|
||||
onUntruncatedBlogPosts: 'throw',
|
||||
},
|
||||
docs: {
|
||||
sidebarPath: './sidebars.ts',
|
||||
// Please change this to your repo.
|
||||
// Remove this to remove the "edit this page" links.
|
||||
editUrl:
|
||||
'https://github.com/TecharoHQ/anubis/tree/main/docs/',
|
||||
editUrl: 'https://github.com/TecharoHQ/anubis/tree/main/docs/',
|
||||
},
|
||||
// blog: {
|
||||
// showReadingTime: true,
|
||||
// feedOptions: {
|
||||
// type: ['rss', 'atom', "json"],
|
||||
// xslt: true,
|
||||
// },
|
||||
// // Please change this to your repo.
|
||||
// // Remove this to remove the "edit this page" links.
|
||||
// editUrl:
|
||||
// 'https://github.com/facebook/docusaurus/tree/main/packages/create-docusaurus/templates/shared/',
|
||||
// // Useful options to enforce blogging best practices
|
||||
// onInlineTags: 'warn',
|
||||
// onInlineAuthors: 'warn',
|
||||
// onUntruncatedBlogPosts: 'warn',
|
||||
// },
|
||||
theme: {
|
||||
customCss: './src/css/custom.css',
|
||||
},
|
||||
@@ -74,7 +67,7 @@ const config: Config = {
|
||||
respectPrefersColorScheme: true,
|
||||
},
|
||||
// Replace with your project's social card
|
||||
image: 'img/docusaurus-social-card.jpg',
|
||||
image: 'img/social-card.jpg',
|
||||
navbar: {
|
||||
title: 'Anubis',
|
||||
logo: {
|
||||
@@ -82,18 +75,28 @@ const config: Config = {
|
||||
src: 'img/favicon.webp',
|
||||
},
|
||||
items: [
|
||||
{ to: '/blog', label: 'Blog', position: 'left' },
|
||||
{
|
||||
type: 'docSidebar',
|
||||
sidebarId: 'tutorialSidebar',
|
||||
position: 'left',
|
||||
label: 'Tutorial',
|
||||
label: 'Docs',
|
||||
},
|
||||
{
|
||||
to: '/docs/admin/botstopper',
|
||||
label: "Unbranded Version",
|
||||
position: "left"
|
||||
},
|
||||
// { to: '/blog', label: 'Blog', position: 'left' },
|
||||
{
|
||||
href: 'https://github.com/TecharoHQ/anubis',
|
||||
label: 'GitHub',
|
||||
position: 'right',
|
||||
},
|
||||
{
|
||||
href: 'https://github.com/sponsors/Xe',
|
||||
label: "Sponsor the Project",
|
||||
position: 'right'
|
||||
},
|
||||
],
|
||||
},
|
||||
footer: {
|
||||
@@ -128,10 +131,18 @@ const config: Config = {
|
||||
{
|
||||
title: 'More',
|
||||
items: [
|
||||
{
|
||||
label: 'Blog',
|
||||
to: '/blog',
|
||||
},
|
||||
{
|
||||
label: 'GitHub',
|
||||
href: 'https://github.com/TecharoHQ/anubis',
|
||||
},
|
||||
{
|
||||
label: 'Status',
|
||||
href: 'https://techarohq.github.io/status/'
|
||||
},
|
||||
],
|
||||
},
|
||||
],
|
||||
|
||||
19
docs/fly.toml
Normal file
19
docs/fly.toml
Normal file
@@ -0,0 +1,19 @@
|
||||
app = 'anubis-docs'
|
||||
primary_region = 'yyz'
|
||||
|
||||
[build]
|
||||
image = "ghcr.io/techarohq/anubis/docs:main"
|
||||
|
||||
[http_service]
|
||||
internal_port = 80
|
||||
force_https = true
|
||||
auto_stop_machines = true
|
||||
auto_start_machines = true
|
||||
min_machines_running = 0
|
||||
processes = ['app']
|
||||
|
||||
[[vm]]
|
||||
cpu_kind = 'shared'
|
||||
cpus = 1
|
||||
memory_mb = 256
|
||||
|
||||
6
docs/manifest/1password.yaml
Normal file
6
docs/manifest/1password.yaml
Normal file
@@ -0,0 +1,6 @@
|
||||
apiVersion: onepassword.com/v1
|
||||
kind: OnePasswordItem
|
||||
metadata:
|
||||
name: anubis-docs-thoth
|
||||
spec:
|
||||
itemPath: "vaults/lc5zo4zjz3if3mkeuhufjmgmui/items/pwguumqcmtxvqbeb7y4gj7l36i"
|
||||
@@ -11,6 +11,7 @@
|
||||
## /usr/share/docs/anubis/data or in the tarball you extracted Anubis from.
|
||||
|
||||
bots:
|
||||
- import: (data)/crawlers/commoncrawl.yaml
|
||||
# Pathological bots to deny
|
||||
- # This correlates to data/bots/deny-pathological.yaml in the source tree
|
||||
# https://github.com/TecharoHQ/anubis/blob/main/data/bots/deny-pathological.yaml
|
||||
@@ -51,6 +52,13 @@ bots:
|
||||
# report_as: 4 # lie to the operator
|
||||
# algorithm: slow # intentionally waste CPU cycles and time
|
||||
|
||||
- name: rss-feed-blog
|
||||
action: ALLOW
|
||||
expression:
|
||||
any:
|
||||
- path.startsWith("/blog/atom.")
|
||||
- path.startsWith("/blog/rss.")
|
||||
|
||||
# Generic catchall rule
|
||||
- name: generic-browser
|
||||
user_agent_regex: >-
|
||||
@@ -63,6 +71,55 @@ bots:
|
||||
|
||||
dnsbl: false
|
||||
|
||||
impressum:
|
||||
footer: |
|
||||
This website is hosted by Techaro. If you have any complaints or notes about the service, please contact <a href="mailto:contact@techaro.lol">contact@techaro.lol</a> and we will assist you as soon as possible.
|
||||
|
||||
page:
|
||||
title: Privacy Policy
|
||||
body: |
|
||||
<p>Last updated: June 2025</p>
|
||||
|
||||
<h2>Information that is gathered from visitors</h2>
|
||||
|
||||
<p>In common with other websites, log files are stored on the web server saving details such as the visitor's IP address, browser type, referring page and time of visit.</p>
|
||||
|
||||
<p>Cookies may be used to remember visitor preferences when interacting with the website.</p>
|
||||
|
||||
<p>Where registration is required, the visitor's email and a username will be stored on the server.</p>
|
||||
|
||||
<h2>How the Information is used</h2>
|
||||
|
||||
<p>The information is used to enhance the vistor's experience when using the website to display personalised content and possibly advertising.</p>
|
||||
|
||||
<p>E-mail addresses will not be sold, rented or leased to 3rd parties.</p>
|
||||
|
||||
<p>E-mail may be sent to inform you of news of our services or offers by us or our affiliates.</p>
|
||||
|
||||
<h2>Visitor Options</h2>
|
||||
|
||||
<p>If you have subscribed to one of our services, you may unsubscribe by following the instructions which are included in e-mail that you receive.</p>
|
||||
|
||||
<p>You may be able to block cookies via your browser settings but this may prevent you from access to certain features of the website.</p>
|
||||
|
||||
<h2>Cookies</h2>
|
||||
|
||||
<p>Cookies are small digital signature files that are stored by your web browser that allow your preferences to be recorded when visiting the website. Also they may be used to track your return visits to the website.</p>
|
||||
|
||||
<p>3rd party advertising companies may also use cookies for tracking purposes.</p>
|
||||
|
||||
<h2>Techaro Anubis</h2>
|
||||
|
||||
<p>This website uses a service called <a href="https://anubis.techaro.lol">Anubis</a> to filter malicious traffic. Anubis requires the use of browser cookies to ensure that web clients are running conformant software. Anubis also may report the following data to Techaro to improve service quality:</p>
|
||||
|
||||
<ul>
|
||||
<li>IP address (for purposes of matching against geo-location and BGP autonomous systems numbers), which is stored in-memory and not persisted to disk.</li>
|
||||
<li>Unique browser fingerprints (such as HTTP request fingerprints and encryption system fingerprints), which may be stored on Techaro's side for a period of up to one month.</li>
|
||||
<li>HTTP request metadata that may include things such as the User-Agent header and other identifiers.</li>
|
||||
</ul>
|
||||
|
||||
<p>This data is processed and stored for the legitimate interest of combatting abusive web clients. This data is encrypted at rest as much as possible and is only decrypted in memory for the purposes of fulfilling requests.</p>
|
||||
|
||||
# By default, send HTTP 200 back to clients that either get issued a challenge
|
||||
# or a denial. This seems weird, but this is load-bearing due to the fact that
|
||||
# the most aggressive scraper bots seem to really, really, want an HTTP 200 and
|
||||
@@ -70,3 +127,8 @@ dnsbl: false
|
||||
status_codes:
|
||||
CHALLENGE: 200
|
||||
DENY: 200
|
||||
|
||||
store:
|
||||
backend: bbolt
|
||||
parameters:
|
||||
path: /xe/data/anubis/data.bdb
|
||||
|
||||
99
docs/manifest/cfg/nginx/mime.types
Normal file
99
docs/manifest/cfg/nginx/mime.types
Normal file
@@ -0,0 +1,99 @@
|
||||
|
||||
types {
|
||||
text/html html htm shtml;
|
||||
text/css css;
|
||||
text/xml xml;
|
||||
image/gif gif;
|
||||
image/jpeg jpeg jpg;
|
||||
application/javascript js;
|
||||
application/atom+xml atom;
|
||||
application/rss+xml rss;
|
||||
|
||||
text/mathml mml;
|
||||
text/plain txt;
|
||||
text/vnd.sun.j2me.app-descriptor jad;
|
||||
text/vnd.wap.wml wml;
|
||||
text/x-component htc;
|
||||
|
||||
image/avif avif;
|
||||
image/png png;
|
||||
image/svg+xml svg svgz;
|
||||
image/tiff tif tiff;
|
||||
image/vnd.wap.wbmp wbmp;
|
||||
image/webp webp;
|
||||
image/x-icon ico;
|
||||
image/x-jng jng;
|
||||
image/x-ms-bmp bmp;
|
||||
|
||||
font/woff woff;
|
||||
font/woff2 woff2;
|
||||
|
||||
application/java-archive jar war ear;
|
||||
application/json json;
|
||||
application/mac-binhex40 hqx;
|
||||
application/msword doc;
|
||||
application/pdf pdf;
|
||||
application/postscript ps eps ai;
|
||||
application/rtf rtf;
|
||||
application/vnd.apple.mpegurl m3u8;
|
||||
application/vnd.google-earth.kml+xml kml;
|
||||
application/vnd.google-earth.kmz kmz;
|
||||
application/vnd.ms-excel xls;
|
||||
application/vnd.ms-fontobject eot;
|
||||
application/vnd.ms-powerpoint ppt;
|
||||
application/vnd.oasis.opendocument.graphics odg;
|
||||
application/vnd.oasis.opendocument.presentation odp;
|
||||
application/vnd.oasis.opendocument.spreadsheet ods;
|
||||
application/vnd.oasis.opendocument.text odt;
|
||||
application/vnd.openxmlformats-officedocument.presentationml.presentation
|
||||
pptx;
|
||||
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
|
||||
xlsx;
|
||||
application/vnd.openxmlformats-officedocument.wordprocessingml.document
|
||||
docx;
|
||||
application/vnd.wap.wmlc wmlc;
|
||||
application/wasm wasm;
|
||||
application/x-7z-compressed 7z;
|
||||
application/x-cocoa cco;
|
||||
application/x-java-archive-diff jardiff;
|
||||
application/x-java-jnlp-file jnlp;
|
||||
application/x-makeself run;
|
||||
application/x-perl pl pm;
|
||||
application/x-pilot prc pdb;
|
||||
application/x-rar-compressed rar;
|
||||
application/x-redhat-package-manager rpm;
|
||||
application/x-sea sea;
|
||||
application/x-shockwave-flash swf;
|
||||
application/x-stuffit sit;
|
||||
application/x-tcl tcl tk;
|
||||
application/x-x509-ca-cert der pem crt;
|
||||
application/x-xpinstall xpi;
|
||||
application/xhtml+xml xhtml;
|
||||
application/xspf+xml xspf;
|
||||
application/zip zip;
|
||||
|
||||
application/octet-stream bin exe dll;
|
||||
application/octet-stream deb;
|
||||
application/octet-stream dmg;
|
||||
application/octet-stream iso img;
|
||||
application/octet-stream msi msp msm;
|
||||
|
||||
audio/midi mid midi kar;
|
||||
audio/mpeg mp3;
|
||||
audio/ogg ogg;
|
||||
audio/x-m4a m4a;
|
||||
audio/x-realaudio ra;
|
||||
|
||||
video/3gpp 3gpp 3gp;
|
||||
video/mp2t ts;
|
||||
video/mp4 mp4;
|
||||
video/mpeg mpeg mpg;
|
||||
video/quicktime mov;
|
||||
video/webm webm;
|
||||
video/x-flv flv;
|
||||
video/x-m4v m4v;
|
||||
video/x-mng mng;
|
||||
video/x-ms-asf asx asf;
|
||||
video/x-ms-wmv wmv;
|
||||
video/x-msvideo avi;
|
||||
}
|
||||
31
docs/manifest/cfg/nginx/nginx.conf
Normal file
31
docs/manifest/cfg/nginx/nginx.conf
Normal file
@@ -0,0 +1,31 @@
|
||||
user nginx;
|
||||
worker_processes 2;
|
||||
error_log /dev/stdout warn;
|
||||
pid /nginx.pid;
|
||||
|
||||
events {
|
||||
worker_connections 1024;
|
||||
}
|
||||
|
||||
http {
|
||||
include mime.types;
|
||||
default_type application/octet-stream;
|
||||
access_log /dev/stdout;
|
||||
|
||||
sendfile on;
|
||||
keepalive_timeout 65;
|
||||
|
||||
server {
|
||||
listen 80 default_server;
|
||||
server_name _;
|
||||
|
||||
error_page 404 /404.html;
|
||||
|
||||
root /www;
|
||||
index index.html;
|
||||
|
||||
location / {
|
||||
try_files $uri $uri/ =404;
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -15,6 +15,11 @@ spec:
|
||||
- name: anubis
|
||||
configMap:
|
||||
name: anubis-cfg
|
||||
- name: nginx
|
||||
configMap:
|
||||
name: nginx-cfg
|
||||
- name: temporary-data
|
||||
emptyDir: {}
|
||||
containers:
|
||||
- name: anubis-docs
|
||||
image: ghcr.io/techarohq/anubis/docs:main
|
||||
@@ -26,8 +31,23 @@ spec:
|
||||
requests:
|
||||
cpu: 250m
|
||||
memory: 128Mi
|
||||
volumeMounts:
|
||||
- name: nginx
|
||||
mountPath: /conf
|
||||
ports:
|
||||
- containerPort: 80
|
||||
readinessProbe:
|
||||
httpGet:
|
||||
path: /
|
||||
port: 80
|
||||
initialDelaySeconds: 1
|
||||
periodSeconds: 10
|
||||
livenessProbe:
|
||||
httpGet:
|
||||
path: /
|
||||
port: 80
|
||||
initialDelaySeconds: 10
|
||||
periodSeconds: 20
|
||||
- name: anubis
|
||||
image: ghcr.io/techarohq/anubis:main
|
||||
imagePullPolicy: Always
|
||||
@@ -38,6 +58,8 @@ spec:
|
||||
value: "4"
|
||||
- name: "METRICS_BIND"
|
||||
value: ":9090"
|
||||
- name: "OG_PASSTHROUGH"
|
||||
value: "true"
|
||||
- name: "POLICY_FNAME"
|
||||
value: "/xe/cfg/anubis/botPolicies.yaml"
|
||||
- name: "SERVE_ROBOTS_TXT"
|
||||
@@ -49,6 +71,8 @@ spec:
|
||||
volumeMounts:
|
||||
- name: anubis
|
||||
mountPath: /xe/cfg/anubis
|
||||
- name: temporary-data
|
||||
mountPath: /xe/data/anubis
|
||||
resources:
|
||||
limits:
|
||||
cpu: 500m
|
||||
@@ -66,3 +90,18 @@ spec:
|
||||
- ALL
|
||||
seccompProfile:
|
||||
type: RuntimeDefault
|
||||
envFrom:
|
||||
- secretRef:
|
||||
name: anubis-docs-thoth
|
||||
readinessProbe:
|
||||
httpGet:
|
||||
path: /healthz
|
||||
port: 9090
|
||||
initialDelaySeconds: 1
|
||||
periodSeconds: 10
|
||||
livenessProbe:
|
||||
httpGet:
|
||||
path: /healthz
|
||||
port: 9090
|
||||
initialDelaySeconds: 10
|
||||
periodSeconds: 20
|
||||
|
||||
@@ -1,7 +1,9 @@
|
||||
resources:
|
||||
- 1password.yaml
|
||||
- deployment.yaml
|
||||
- ingress.yaml
|
||||
- onionservice.yaml
|
||||
- poddisruptionbudget.yaml
|
||||
- service.yaml
|
||||
|
||||
configMapGenerator:
|
||||
@@ -9,3 +11,8 @@ configMapGenerator:
|
||||
behavior: create
|
||||
files:
|
||||
- ./cfg/anubis/botPolicies.yaml
|
||||
- name: nginx-cfg
|
||||
behavior: create
|
||||
files:
|
||||
- ./cfg/nginx/mime.types
|
||||
- ./cfg/nginx/nginx.conf
|
||||
|
||||
9
docs/manifest/poddisruptionbudget.yaml
Normal file
9
docs/manifest/poddisruptionbudget.yaml
Normal file
@@ -0,0 +1,9 @@
|
||||
apiVersion: policy/v1
|
||||
kind: PodDisruptionBudget
|
||||
metadata:
|
||||
name: anubis-docs
|
||||
spec:
|
||||
minAvailable: 1
|
||||
selector:
|
||||
matchLabels:
|
||||
app: anubis-docs
|
||||
3085
docs/package-lock.json
generated
3085
docs/package-lock.json
generated
File diff suppressed because it is too large
Load Diff
@@ -4,7 +4,7 @@
|
||||
"private": true,
|
||||
"scripts": {
|
||||
"docusaurus": "docusaurus",
|
||||
"start": "docusaurus start",
|
||||
"start": "docusaurus start --host 0.0.0.0",
|
||||
"build": "docusaurus build",
|
||||
"swizzle": "docusaurus swizzle",
|
||||
"deploy": "echo 'use CI' && exit 1",
|
||||
@@ -15,9 +15,9 @@
|
||||
"typecheck": "tsc"
|
||||
},
|
||||
"dependencies": {
|
||||
"@docusaurus/core": "3.7.0",
|
||||
"@docusaurus/preset-classic": "3.7.0",
|
||||
"@docusaurus/theme-mermaid": "^3.7.0",
|
||||
"@docusaurus/core": "^3.8.1",
|
||||
"@docusaurus/preset-classic": "^3.8.1",
|
||||
"@docusaurus/theme-mermaid": "^3.8.1",
|
||||
"@mdx-js/react": "^3.0.0",
|
||||
"clsx": "^2.0.0",
|
||||
"prism-react-renderer": "^2.3.0",
|
||||
@@ -25,9 +25,9 @@
|
||||
"react-dom": "^19.0.0"
|
||||
},
|
||||
"devDependencies": {
|
||||
"@docusaurus/module-type-aliases": "3.7.0",
|
||||
"@docusaurus/tsconfig": "3.7.0",
|
||||
"@docusaurus/types": "3.7.0",
|
||||
"@docusaurus/module-type-aliases": "^3.8.1",
|
||||
"@docusaurus/tsconfig": "^3.8.1",
|
||||
"@docusaurus/types": "^3.8.1",
|
||||
"typescript": "~5.6.2"
|
||||
},
|
||||
"browserslist": {
|
||||
|
||||
@@ -5,49 +5,50 @@ import styles from "./styles.module.css";
|
||||
|
||||
type FeatureItem = {
|
||||
title: string;
|
||||
Svg: React.ComponentType<React.ComponentProps<"svg">>;
|
||||
imageURL: string;
|
||||
description: ReactNode;
|
||||
};
|
||||
|
||||
const FeatureList: FeatureItem[] = [
|
||||
{
|
||||
title: "Easy to Use",
|
||||
Svg: require("@site/static/img/undraw_docusaurus_mountain.svg").default,
|
||||
imageURL: require("@site/static/img/anubis/happy.webp").default,
|
||||
description: (
|
||||
<>
|
||||
Anubis is easy to set up, lightweight, and helps get rid of the lowest
|
||||
hanging fruit so you can sleep at night.
|
||||
Anubis sits in the background and weighs the risk of incoming requests.
|
||||
If it asks a client to complete a challenge, no user interaction is
|
||||
required.
|
||||
</>
|
||||
),
|
||||
},
|
||||
{
|
||||
title: "Lightweight",
|
||||
Svg: require("@site/static/img/undraw_docusaurus_tree.svg").default,
|
||||
imageURL: require("@site/static/img/anubis/pensive.webp").default,
|
||||
description: (
|
||||
<>
|
||||
Anubis is efficient and as lightweight as possible, blocking the worst
|
||||
of the bots on the internet and makes it easy to protect what you host
|
||||
online.
|
||||
Anubis is so lightweight you'll forget it's there until you look at your
|
||||
hosting bill. On average it uses less than 128 MB of ram.
|
||||
</>
|
||||
),
|
||||
},
|
||||
{
|
||||
title: "Multi-threaded",
|
||||
Svg: require("@site/static/img/undraw_docusaurus_react.svg").default,
|
||||
title: "Block the scrapers",
|
||||
imageURL: require("@site/static/img/anubis/reject.webp").default,
|
||||
description: (
|
||||
<>
|
||||
Anubis uses a multi-threaded proof of work check to ensure that users
|
||||
browsers are up to date and support modern standards.
|
||||
Anubis uses a combination of heuristics to identify and block bots
|
||||
before they take your website down. You can customize the rules with{" "}
|
||||
<a href="/docs/admin/policies">your own policies</a>.
|
||||
</>
|
||||
),
|
||||
},
|
||||
];
|
||||
|
||||
function Feature({ title, Svg, description }: FeatureItem) {
|
||||
function Feature({ title, description, imageURL }: FeatureItem) {
|
||||
return (
|
||||
<div className={clsx("col col--4")}>
|
||||
<div className="text--center">
|
||||
<Svg className={styles.featureSvg} role="img" />
|
||||
<img src={imageURL} className={styles.featureSvg} role="img" />
|
||||
</div>
|
||||
<div className="text--center padding-horiz--md">
|
||||
<Heading as="h3">{title}</Heading>
|
||||
|
||||
@@ -31,19 +31,12 @@ export default function Home(): ReactNode {
|
||||
const { siteConfig } = useDocusaurusContext();
|
||||
return (
|
||||
<Layout
|
||||
title={`Anubis: self hostable scraper defense software`}
|
||||
description="Weigh the soul of incoming HTTP requests using proof-of-work to stop AI crawlers"
|
||||
title={`Anubis: Web AI Firewall Utility`}
|
||||
description="Weigh the soul of incoming HTTP requests to protect your website!"
|
||||
>
|
||||
<HomepageHeader />
|
||||
<main>
|
||||
<HomepageFeatures />
|
||||
|
||||
<center>
|
||||
<p>
|
||||
This is all placeholder text. It will be fixed. Give me time. I am
|
||||
one person and my project has unexpectedly gone viral.
|
||||
</p>
|
||||
</center>
|
||||
</main>
|
||||
</Layout>
|
||||
);
|
||||
|
||||
BIN
docs/static/img/anubis/happy.webp
vendored
Normal file
BIN
docs/static/img/anubis/happy.webp
vendored
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 30 KiB |
BIN
docs/static/img/anubis/pensive.webp
vendored
Normal file
BIN
docs/static/img/anubis/pensive.webp
vendored
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 28 KiB |
BIN
docs/static/img/anubis/reject.webp
vendored
Normal file
BIN
docs/static/img/anubis/reject.webp
vendored
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 26 KiB |
BIN
docs/static/img/botstopper/example-screenshot.webp
vendored
Normal file
BIN
docs/static/img/botstopper/example-screenshot.webp
vendored
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 29 KiB |
Some files were not shown because too many files have changed in this diff Show More
Reference in New Issue
Block a user