Compare commits


4 Commits

381 changed files with 2833 additions and 9049 deletions

View File

@@ -2,7 +2,9 @@
 // README at: https://github.com/devcontainers/templates/tree/main/src/debian
 {
   "name": "Dev",
-  "dockerComposeFile": ["./docker-compose.yaml"],
+  "dockerComposeFile": [
+    "./docker-compose.yaml"
+  ],
   "service": "workspace",
   "workspaceFolder": "/workspace/anubis",
   "postStartCommand": "bash ./.devcontainer/poststart.sh",

View File

@@ -58,3 +58,4 @@ body:
     attributes:
       label: Additional context
       description: Add any other context about the problem here.
+

View File

@@ -1,6 +1,6 @@
 name: Feature request
 description: Suggest an idea for this project
-title: "[Feature request] "
+title: '[Feature request] '
 body:
   - type: textarea

View File

@@ -1,17 +1,17 @@
 # check-spelling/check-spelling configuration
 
-| File | Purpose | Format | Info |
-| ---- | ---- | ---- | ---- |
+File | Purpose | Format | Info
+-|-|-|-
 | [dictionary.txt](dictionary.txt) | Replacement dictionary (creating this file will override the default dictionary) | one word per line | [dictionary](https://github.com/check-spelling/check-spelling/wiki/Configuration#dictionary) |
 | [allow.txt](allow.txt) | Add words to the dictionary | one word per line (only letters and `'`s allowed) | [allow](https://github.com/check-spelling/check-spelling/wiki/Configuration#allow) |
 | [reject.txt](reject.txt) | Remove words from the dictionary (after allow) | grep pattern matching whole dictionary words | [reject](https://github.com/check-spelling/check-spelling/wiki/Configuration-Examples%3A-reject) |
 | [excludes.txt](excludes.txt) | Files to ignore entirely | perl regular expression | [excludes](https://github.com/check-spelling/check-spelling/wiki/Configuration-Examples%3A-excludes) |
 | [only.txt](only.txt) | Only check matching files (applied after excludes) | perl regular expression | [only](https://github.com/check-spelling/check-spelling/wiki/Configuration-Examples%3A-only) |
 | [patterns.txt](patterns.txt) | Patterns to ignore from checked lines | perl regular expression (order matters, first match wins) | [patterns](https://github.com/check-spelling/check-spelling/wiki/Configuration-Examples%3A-patterns) |
 | [candidate.patterns](candidate.patterns) | Patterns that might be worth adding to [patterns.txt](patterns.txt) | perl regular expression with optional comment block introductions (all matches will be suggested) | [candidates](https://github.com/check-spelling/check-spelling/wiki/Feature:-Suggest-patterns) |
 | [line_forbidden.patterns](line_forbidden.patterns) | Patterns to flag in checked lines | perl regular expression (order matters, first match wins) | [patterns](https://github.com/check-spelling/check-spelling/wiki/Configuration-Examples%3A-patterns) |
 | [expect.txt](expect.txt) | Expected words that aren't in the dictionary | one word per line (sorted, alphabetically) | [expect](https://github.com/check-spelling/check-spelling/wiki/Configuration#expect) |
 | [advice.md](advice.md) | Supplement for GitHub comment when unrecognized words are found | GitHub Markdown | [advice](https://github.com/check-spelling/check-spelling/wiki/Configuration-Examples%3A-advice) |
 
 Note: you can replace any of these files with a directory by the same name (minus the suffix)
 and then include multiple files inside that directory (with that suffix) to merge multiple files together.

View File

@@ -2,27 +2,30 @@
 <details><summary>If the flagged items are :exploding_head: false positives</summary>
 
 If items relate to a ...
 
-- binary file (or some other file you wouldn't want to check at all).
+* binary file (or some other file you wouldn't want to check at all).
 
   Please add a file path to the `excludes.txt` file matching the containing file.
 
-  File paths are Perl 5 Regular Expressions - you can [test](https://www.regexplanet.com/advanced/perl/) yours before committing to verify it will match your files.
+  File paths are Perl 5 Regular Expressions - you can [test](
+  https://www.regexplanet.com/advanced/perl/) yours before committing to verify it will match your files.
 
-  `^` refers to the file's path from the root of the repository, so `^README\.md$` would exclude [README.md](../tree/HEAD/README.md) (on whichever branch you're using).
+  `^` refers to the file's path from the root of the repository, so `^README\.md$` would exclude [README.md](
+  ../tree/HEAD/README.md) (on whichever branch you're using).
 
-- well-formed pattern.
+* well-formed pattern.
 
-  If you can write a [pattern](https://github.com/check-spelling/check-spelling/wiki/Configuration-Examples:-patterns) that would match it,
+  If you can write a [pattern](
+  https://github.com/check-spelling/check-spelling/wiki/Configuration-Examples:-patterns
+  ) that would match it,
   try adding it to the `patterns.txt` file.
 
-  Patterns are Perl 5 Regular Expressions - you can [test](https://www.regexplanet.com/advanced/perl/) yours before committing to verify it will match your lines.
+  Patterns are Perl 5 Regular Expressions - you can [test](
+  https://www.regexplanet.com/advanced/perl/) yours before committing to verify it will match your lines.
 
   Note that patterns can't match multiline strings.
 </details>
 <!-- adoption information-->
 :steam_locomotive: If you're seeing this message and your PR is from a branch that doesn't have check-spelling,
 please merge to your PR's base branch to get the version configured for your repository.

View File

@@ -9,20 +9,3 @@ xeact
 ABee
 tencent
 maintnotifications
-azurediamond
-cooldown
-verifyfcrdns
-Spintax
-spintax
-clampip
-pseudoprofound
-reimagining
-iocaine
-admins
-fout
-iplist
-NArg
-blocklists
-rififi
-prolocation
-Prolocation

View File

@@ -87,14 +87,10 @@
 ^docs/docs/user/known-instances.md$
 ^docs/manifest/.*$
 ^docs/static/\.nojekyll$
-^internal/glob/glob_test.go$
-^internal/honeypot/naive/affirmations\.txt$
-^internal/honeypot/naive/spintext\.txt$
-^internal/honeypot/naive/titles\.txt$
-^lib/config/testdata/bad/unparseable\.json$
-^lib/localization/.*_test.go$
-^lib/localization/locales/.*\.json$
 ^lib/policy/config/testdata/bad/unparseable\.json$
-^test/.*$
+^internal/glob/glob_test.go$
 ignore$
 robots.txt
+^lib/localization/locales/.*\.json$
+^lib/localization/.*_test.go$
+^test/.*$

View File

@@ -2,12 +2,10 @@ acs
 Actorified
 actorifiedstore
 actorify
-agentic
 Aibrew
 alibaba
 alrest
 amazonbot
-anexia
 anthro
 anubis
 anubistest
@@ -15,7 +13,6 @@ apnic
 APNICRANDNETAU
 Applebot
 archlinux
-arpa
 asnc
 asnchecker
 asns
@@ -63,11 +60,10 @@ checkresult
 chibi
 cidranger
 ckie
-CLAUDE
 cloudflare
-cloudsolutions
 Codespaces
 confd
-connnection
 containerbuild
 containerregistry
 coreutils
@@ -76,9 +72,7 @@ Cromite
 crt
 Cscript
 daemonizing
-databento
 dayjob
-dco
 DDOS
 Debian
 debrpm
@@ -91,7 +85,6 @@ distros
 dnf
 dnsbl
 dnserr
-DNSTTL
 domainhere
 dracula
 dronebl
@@ -113,13 +106,9 @@ externalfetcher
 extldflags
 facebookgo
 Factset
-fahedouch
 fastcgi
-FCr
-fcrdns
 fediverse
 ffprobe
-fhdr
 financials
 finfos
 Firecrawl
@@ -138,9 +127,7 @@ GHSA
 Ghz
 gipc
 gitea
-GLM
 godotenv
-goimports
 goland
 gomod
 goodbot
@@ -157,7 +144,6 @@ grw
 gzw
 Hashcash
 hashrate
-hdr
 headermap
 healthcheck
 healthz
@@ -167,10 +153,10 @@ Hetzner
 hmc
 homelab
 hostable
-HSTS
 htmlc
 htmx
 httpdebug
+Huawei
 huawei
 hypertext
 iaskspider
@@ -208,14 +194,12 @@ lcj
 ldflags
 letsencrypt
 Lexentale
-lfc
 lgbt
 licend
 licstart
 lightpanda
 limsa
 Linting
-listor
 LLU
 loadbalancer
 lol
@@ -233,16 +217,11 @@ mnt
 Mojeek
 mojeekbot
 mozilla
-myclient
-mymaster
-mypass
-myuser
 nbf
 nepeat
 netsurf
 nginx
 nicksnyder
-nikandfor
 nobots
 NONINFRINGEMENT
 nosleep
@@ -254,18 +233,15 @@ oklch
 omgili
 omgilibot
 openai
-opendns
 opengraph
 openrc
 oswald
 pag
-pagegen
 palemoon
 Pangu
 parseable
 passthrough
 Patreon
-perplexitybot
 pgrep
 phrik
 pidfile
@@ -291,13 +267,12 @@ qwantbot
 rac
 rawler
 rcvar
-rdb
 redhat
 redir
 redirectscheme
 refactors
-remoteip
 reputational
+Rhul
 risc
 ruleset
 runlevels
@@ -317,16 +292,13 @@ Seo
 setsebool
 shellcheck
 shirou
-shoneypot
 shopt
 Sidetrade
 simprint
 sitemap
 sls
 sni
-snipster
 Spambot
-spammer
 sparkline
 spyderbot
 srv
@@ -334,8 +306,6 @@ stackoverflow
 startprecmd
 stoppostcmd
 storetest
-srcip
-strcmp
 subgrid
 subr
 subrequest
@@ -348,7 +318,6 @@ tbn
 tbr
 techaro
 techarohq
-telegrambot
 templ
 templruntime
 testarea
@@ -360,14 +329,12 @@ Timpibot
 TLog
 traefik
 trunc
-txn
 uberspace
 Unbreak
 unbreakdocker
 unifiedjs
 unmarshal
 unparseable
-updown
 uvx
 UXP
 valkey
@@ -375,12 +342,10 @@ Varis
 Velen
 vendored
 vhosts
-vkbot
 VKE
 vnd
 VPS
 Vultr
-WAIFU
 weblate
 webmaster
 webpage
@@ -392,6 +357,7 @@ wildbase
 withthothmock
 wolfbeast
 wordpress
+Workaround
 workaround
 workdir
 wpbot
@@ -406,7 +372,6 @@ XNG
 XOB
 XOriginal
 XReal
-Y'shtola
 yae
 YAMLTo
 Yda
@@ -418,4 +383,3 @@ Zenos
 zizmor
 zombocom
 zos
-zst

View File

@@ -8,8 +8,6 @@ updates:
       github-actions:
         patterns:
           - "*"
-    cooldown:
-      default-days: 7
 
   - package-ecosystem: gomod
     directory: /
@@ -19,8 +17,6 @@ updates:
       gomod:
         patterns:
           - "*"
-    cooldown:
-      default-days: 7
 
   - package-ecosystem: npm
     directory: /
@@ -30,5 +26,3 @@ updates:
       npm:
         patterns:
           - "*"
-    cooldown:
-      default-days: 7

View File

@@ -13,7 +13,7 @@ jobs:
   asset_verification:
     runs-on: ubuntu-24.04
    steps:
-      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
        with:
          persist-credentials: false
@@ -22,12 +22,13 @@ jobs:
          sudo apt-get update
          sudo apt-get install -y build-essential
-      - uses: actions/setup-node@6044e13b5dc448c55e2357c09f80417699197238 # v6.2.0
+      - uses: actions/setup-node@2028fbc5c25fe9cf00d9f06a71cc4710d4507903 # v6.0.0
        with:
-          node-version: "24.11.0"
+          node-version: latest
-      - uses: actions/setup-go@7a3fe6cf4cb3a834922a1244abfce67bcef6a0c5 # v6.2.0
+      - uses: actions/setup-go@44694675825211faa026b3c33043df3e48a5fa00 # v6.0.0
        with:
-          go-version: "1.25.4"
+          go-version: stable
       - name: install node deps
        run: |

View File

@@ -1,9 +0,0 @@
-name: DCO Check
-
-on: [pull_request]
-
-jobs:
-  dco_check:
-    runs-on: ubuntu-latest
-    steps:
-      - uses: tisonkun/actions-dco@f1024cd563550b5632e754df11b7d30b73be54a5 # v1.1

View File

@@ -15,7 +15,7 @@ jobs:
    runs-on: ubuntu-24.04
    steps:
      - name: Checkout code
-        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+        uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
        with:
          fetch-tags: true
          fetch-depth: 0
@@ -26,18 +26,19 @@ jobs:
          sudo apt-get update
          sudo apt-get install -y build-essential
-      - uses: actions/setup-node@6044e13b5dc448c55e2357c09f80417699197238 # v6.2.0
+      - uses: actions/setup-node@2028fbc5c25fe9cf00d9f06a71cc4710d4507903 # v6.0.0
        with:
-          node-version: "24.11.0"
+          node-version: latest
-      - uses: actions/setup-go@7a3fe6cf4cb3a834922a1244abfce67bcef6a0c5 # v6.2.0
+      - uses: actions/setup-go@44694675825211faa026b3c33043df3e48a5fa00 # v6.0.0
        with:
-          go-version: "1.25.4"
+          go-version: stable
      - uses: ko-build/setup-ko@d006021bd0c28d1ce33a07e7943d48b079944c8d # v0.9
      - name: Docker meta
        id: meta
-        uses: docker/metadata-action@c299e40c65443455700f0fdfc63efafe5b349051 # v5.10.0
+        uses: docker/metadata-action@318604b99e75e41977312d83839a89be02ca4893 # v5.9.0
        with:
          images: ghcr.io/${{ github.repository }}

View File

@@ -21,7 +21,7 @@ jobs:
    runs-on: ubuntu-24.04
    steps:
      - name: Checkout code
-        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+        uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
        with:
          fetch-tags: true
          fetch-depth: 0
@@ -36,17 +36,18 @@ jobs:
        run: |
          echo "IMAGE=ghcr.io/${GITHUB_REPOSITORY,,}" >> $GITHUB_ENV
-      - uses: actions/setup-node@6044e13b5dc448c55e2357c09f80417699197238 # v6.2.0
+      - uses: actions/setup-node@2028fbc5c25fe9cf00d9f06a71cc4710d4507903 # v6.0.0
        with:
-          node-version: "24.11.0"
+          node-version: latest
-      - uses: actions/setup-go@7a3fe6cf4cb3a834922a1244abfce67bcef6a0c5 # v6.2.0
+      - uses: actions/setup-go@44694675825211faa026b3c33043df3e48a5fa00 # v6.0.0
        with:
-          go-version: "1.25.4"
+          go-version: stable
      - uses: ko-build/setup-ko@d006021bd0c28d1ce33a07e7943d48b079944c8d # v0.9
      - name: Log into registry
-        uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # v3.7.0
+        uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # v3.6.0
        with:
          registry: ghcr.io
          username: ${{ github.repository_owner }}
@@ -54,7 +55,7 @@ jobs:
      - name: Docker meta
        id: meta
-        uses: docker/metadata-action@c299e40c65443455700f0fdfc63efafe5b349051 # v5.10.0
+        uses: docker/metadata-action@318604b99e75e41977312d83839a89be02ca4893 # v5.9.0
        with:
          images: ${{ env.IMAGE }}
@@ -68,7 +69,7 @@ jobs:
          SLOG_LEVEL: debug
      - name: Generate artifact attestation
-        uses: actions/attest-build-provenance@96278af6caaf10aea03fd8d33a09a777ca52d62f # v3.2.0
+        uses: actions/attest-build-provenance@977bb373ede98d70efdf65b84cb5f73e068dcc2a # v3.0.0
        with:
          subject-name: ${{ env.IMAGE }}
          subject-digest: ${{ steps.build.outputs.digest }}

View File

@@ -17,15 +17,15 @@ jobs:
    runs-on: ubuntu-24.04
    steps:
-      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
        with:
          persist-credentials: false
      - name: Set up Docker Buildx
-        uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3.12.0
+        uses: docker/setup-buildx-action@e468171a9de216ec08956ac3ada2f0791b6bd435 # v3.11.1
      - name: Log into registry
-        uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # v3.7.0
+        uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # v3.6.0
        with:
          registry: ghcr.io
          username: techarohq
@@ -33,7 +33,7 @@ jobs:
      - name: Docker meta
        id: meta
-        uses: docker/metadata-action@c299e40c65443455700f0fdfc63efafe5b349051 # v5.10.0
+        uses: docker/metadata-action@318604b99e75e41977312d83839a89be02ca4893 # v5.9.0
        with:
          images: ghcr.io/techarohq/anubis/docs
          tags: |
@@ -53,14 +53,14 @@ jobs:
          push: true
      - name: Apply k8s manifests to limsa lominsa
-        uses: actions-hub/kubectl@3ece3793e7a9fe94effe257d03ac834c815ea87d # v1.35.1
+        uses: actions-hub/kubectl@f14933a23bc8c582b5aa7d108defd8e2cb9fa86d # v1.34.1
        env:
          KUBE_CONFIG: ${{ secrets.LIMSA_LOMINSA_KUBECONFIG }}
        with:
          args: apply -k docs/manifest
      - name: Apply k8s manifests to limsa lominsa
-        uses: actions-hub/kubectl@3ece3793e7a9fe94effe257d03ac834c815ea87d # v1.35.1
+        uses: actions-hub/kubectl@f14933a23bc8c582b5aa7d108defd8e2cb9fa86d # v1.34.1
        env:
          KUBE_CONFIG: ${{ secrets.LIMSA_LOMINSA_KUBECONFIG }}
        with:

View File

@@ -13,16 +13,16 @@ jobs:
    runs-on: ubuntu-24.04
    steps:
-      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
        with:
          persist-credentials: false
      - name: Set up Docker Buildx
-        uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3.12.0
+        uses: docker/setup-buildx-action@e468171a9de216ec08956ac3ada2f0791b6bd435 # v3.11.1
      - name: Docker meta
        id: meta
-        uses: docker/metadata-action@c299e40c65443455700f0fdfc63efafe5b349051 # v5.10.0
+        uses: docker/metadata-action@318604b99e75e41977312d83839a89be02ca4893 # v5.9.0
        with:
          images: ghcr.io/techarohq/anubis/docs
          tags: |

View File

@@ -1,76 +0,0 @@
-name: Go Mod Tidy Check
-
-on:
-  push:
-    branches: ["main"]
-  pull_request:
-    branches: ["main"]
-
-permissions:
-  contents: read
-
-jobs:
-  go_mod_tidy_check:
-    runs-on: ubuntu-24.04
-    steps:
-      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
-        with:
-          persist-credentials: false
-      - uses: actions/setup-go@7a3fe6cf4cb3a834922a1244abfce67bcef6a0c5 # v6.2.0
-        with:
-          go-version: "1.25.4"
-      - name: Check go.mod and go.sum in main directory
-        run: |
-          # Store original file state
-          cp go.mod go.mod.orig
-          cp go.sum go.sum.orig
-
-          # Run go mod tidy
-          go mod tidy
-
-          # Check if files changed
-          if ! diff -q go.mod.orig go.mod > /dev/null 2>&1; then
-            echo "ERROR: go.mod in main directory has changed after running 'go mod tidy'"
-            echo "Please run 'go mod tidy' locally and commit the changes"
-            diff go.mod.orig go.mod
-            exit 1
-          fi
-
-          if ! diff -q go.sum.orig go.sum > /dev/null 2>&1; then
-            echo "ERROR: go.sum in main directory has changed after running 'go mod tidy'"
-            echo "Please run 'go mod tidy' locally and commit the changes"
-            diff go.sum.orig go.sum
-            exit 1
-          fi
-
-          echo "SUCCESS: go.mod and go.sum in main directory are tidy"
-      - name: Check go.mod and go.sum in test directory
-        run: |
-          cd test
-
-          # Store original file state
-          cp go.mod go.mod.orig
-          cp go.sum go.sum.orig
-
-          # Run go mod tidy
-          go mod tidy
-
-          # Check if files changed
-          if ! diff -q go.mod.orig go.mod > /dev/null 2>&1; then
-            echo "ERROR: go.mod in test directory has changed after running 'go mod tidy'"
-            echo "Please run 'go mod tidy' locally and commit the changes"
-            diff go.mod.orig go.mod
-            exit 1
-          fi
-
-          if ! diff -q go.sum.orig go.sum > /dev/null 2>&1; then
-            echo "ERROR: go.sum in test directory has changed after running 'go mod tidy'"
-            echo "Please run 'go mod tidy' locally and commit the changes"
-            diff go.sum.orig go.sum
-            exit 1
-          fi
-
-          echo "SUCCESS: go.mod and go.sum in test directory are tidy"

View File

@@ -15,7 +15,7 @@ jobs:
    #runs-on: alrest-techarohq
    runs-on: ubuntu-24.04
    steps:
-      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
        with:
          persist-credentials: false
@@ -24,15 +24,16 @@ jobs:
          sudo apt-get update
          sudo apt-get install -y build-essential
-      - uses: actions/setup-node@6044e13b5dc448c55e2357c09f80417699197238 # v6.2.0
+      - uses: actions/setup-node@2028fbc5c25fe9cf00d9f06a71cc4710d4507903 # v6.0.0
        with:
-          node-version: "24.11.0"
+          node-version: latest
-      - uses: actions/setup-go@7a3fe6cf4cb3a834922a1244abfce67bcef6a0c5 # v6.2.0
+      - uses: actions/setup-go@44694675825211faa026b3c33043df3e48a5fa00 # v6.0.0
        with:
-          go-version: "1.25.4"
+          go-version: stable
      - name: Cache playwright binaries
-        uses: actions/cache@cdf6c1fa76f9f475f3d7449005a359c84ca0f306 # v5.0.3
+        uses: actions/cache@0057852bfaa89a56745cba8c7296529d2fc39830 # v4.3.0
        id: playwright-cache
        with:
          path: |

View File

@@ -1,19 +0,0 @@
-name: "Lint PR"
-
-on:
-  pull_request_target:
-    types:
-      - opened
-      - edited
-      - synchronize
-
-jobs:
-  lint_pr_title:
-    name: Validate PR title
-    runs-on: ubuntu-latest
-    permissions:
-      pull-requests: read
-    steps:
-      - uses: amannn/action-semantic-pull-request@48f256284bd46cdaab1048c3721360e808335d50 # v6.1.1
-        env:
-          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

View File

@@ -14,7 +14,7 @@ jobs:
    #runs-on: alrest-techarohq
    runs-on: ubuntu-24.04
    steps:
-      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
        with:
          persist-credentials: false
          fetch-tags: true
@@ -25,12 +25,13 @@ jobs:
          sudo apt-get update
          sudo apt-get install -y build-essential
-      - uses: actions/setup-node@6044e13b5dc448c55e2357c09f80417699197238 # v6.2.0
+      - uses: actions/setup-node@2028fbc5c25fe9cf00d9f06a71cc4710d4507903 # v6.0.0
        with:
-          node-version: "24.11.0"
+          node-version: latest
-      - uses: actions/setup-go@7a3fe6cf4cb3a834922a1244abfce67bcef6a0c5 # v6.2.0
+      - uses: actions/setup-go@44694675825211faa026b3c33043df3e48a5fa00 # v6.0.0
        with:
-          go-version: "1.25.4"
+          go-version: stable
      - name: install node deps
        run: |

View File

@@ -15,7 +15,7 @@ jobs:
    #runs-on: alrest-techarohq
    runs-on: ubuntu-24.04
    steps:
-      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
        with:
          persist-credentials: false
          fetch-tags: true
@@ -26,12 +26,13 @@ jobs:
          sudo apt-get update
          sudo apt-get install -y build-essential
-      - uses: actions/setup-node@6044e13b5dc448c55e2357c09f80417699197238 # v6.2.0
+      - uses: actions/setup-node@2028fbc5c25fe9cf00d9f06a71cc4710d4507903 # v6.0.0
        with:
-          node-version: "24.11.0"
+          node-version: latest
-      - uses: actions/setup-go@7a3fe6cf4cb3a834922a1244abfce67bcef6a0c5 # v6.2.0
+      - uses: actions/setup-go@44694675825211faa026b3c33043df3e48a5fa00 # v6.0.0
        with:
-          go-version: "1.25.4"
+          go-version: stable
      - name: install node deps
        run: |
@@ -41,7 +42,7 @@ jobs:
        run: |
          go tool yeet
-      - uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
+      - uses: actions/upload-artifact@330a01c490aca151604b8cf639adc76d48f6c5d4 # v5.0.0
        with:
          name: packages
          path: var/*

View File

@@ -22,24 +22,23 @@ jobs:
          - git-push
          - healthcheck
          - i18n
-          - log-file
-          - nginx
          - palemoon/amd64
          #- palemoon/i386
          - robots_txt
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
-        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+        uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
        with:
          persist-credentials: false
-      - uses: actions/setup-node@6044e13b5dc448c55e2357c09f80417699197238 # v6.2.0
+      - uses: actions/setup-node@2028fbc5c25fe9cf00d9f06a71cc4710d4507903 # v6.0.0
        with:
-          node-version: "24.11.0"
+          node-version: latest
-      - uses: actions/setup-go@7a3fe6cf4cb3a834922a1244abfce67bcef6a0c5 # v6.2.0
+      - uses: actions/setup-go@44694675825211faa026b3c33043df3e48a5fa00 # v6.0.0
        with:
-          go-version: "1.25.4"
+          go-version: stable
      - uses: ko-build/setup-ko@d006021bd0c28d1ce33a07e7943d48b079944c8d # v0.9
@@ -57,7 +56,7 @@ jobs:
        run: echo "ARTIFACT_NAME=${{ matrix.test }}" | sed 's|/|-|g' >> $GITHUB_ENV
      - name: Upload artifact
-        uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f
+        uses: actions/upload-artifact@330a01c490aca151604b8cf639adc76d48f6c5d4
        if: always()
        with:
          name: ${{ env.ARTIFACT_NAME }}

View File

@@ -59,16 +59,16 @@ name: Check Spelling
 on:
   push:
     branches:
-      - "**"
+      - '**'
     tags-ignore:
-      - "**"
+      - '**'
   pull_request:
     branches:
-      - "**"
+      - '**'
     types:
-      - "opened"
-      - "reopened"
-      - "synchronize"
+      - 'opened'
+      - 'reopened'
+      - 'synchronize'
 jobs:
   spelling:

View File

@@ -18,19 +18,19 @@ jobs:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
-        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+        uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
        with:
          fetch-tags: true
          fetch-depth: 0
          persist-credentials: false
      - name: Log into registry
-        uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # v3.7.0
+        uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # v3.6.0
        with:
          registry: ghcr.io
          username: ${{ github.repository_owner }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Set up Docker Buildx
-        uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3.12.0
+        uses: docker/setup-buildx-action@e468171a9de216ec08956ac3ada2f0791b6bd435 # v3.11.1
      - name: Build and push
        run: |
          cd ./test/ssh-ci

View File

@@ -22,7 +22,7 @@ jobs:
          - aarch64-16k
    steps:
      - name: Checkout code
-        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+        uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
        with:
          fetch-tags: true
          fetch-depth: 0
@@ -35,9 +35,9 @@ jobs:
          name: id_rsa
          known_hosts: ${{ secrets.CI_SSH_KNOWN_HOSTS }}
-      - uses: actions/setup-go@7a3fe6cf4cb3a834922a1244abfce67bcef6a0c5 # v6.2.0
+      - uses: actions/setup-go@44694675825211faa026b3c33043df3e48a5fa00 # v6.0.0
        with:
-          go-version: "1.25.4"
+          go-version: stable
      - name: Run CI
        run: go run ./utils/cmd/backoff-retry bash test/ssh-ci/rigging.sh ${{ matrix.host }}

View File

@@ -3,10 +3,10 @@ name: zizmor
 on:
   push:
     paths:
-      - ".github/workflows/*.ya?ml"
+      - '.github/workflows/*.ya?ml'
   pull_request:
     paths:
-      - ".github/workflows/*.ya?ml"
+      - '.github/workflows/*.ya?ml'
 jobs:
   zizmor:
@@ -16,12 +16,12 @@ jobs:
      security-events: write
    steps:
      - name: Checkout repository
-        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+        uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
        with:
          persist-credentials: false
      - name: Install the latest version of uv
-        uses: astral-sh/setup-uv@eac588ad8def6316056a12d4907a9d4d84ff7a3b # v7.3.0
+        uses: astral-sh/setup-uv@85856786d1ce8acfbcc2f13a5f3fbd6b938f9f41 # v7.1.2
      - name: Run zizmor 🌈
        run: uvx zizmor --format sarif . > results.sarif
@@ -29,7 +29,7 @@ jobs:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      - name: Upload SARIF file
-        uses: github/codeql-action/upload-sarif@5d4e8d1aca955e8d8589aabd499c5cae939e33c7 # v4.31.9
+        uses: github/codeql-action/upload-sarif@0499de31b99561a6d14a36a5f662c2a54f91beee # v4.31.2
        with:
          sarif_file: results.sarif
          category: zizmor

View File

@@ -1,8 +0,0 @@
npx --no-install commitlint --edit "$1"
# Check if commit message contains Signed-off-by line
if ! grep -q "^Signed-off-by:" "$1"; then
echo "Commit message must contain a 'Signed-off-by:' line."
echo "Please use 'git commit --signoff' or add a Signed-off-by line to your commit message."
exit 1
fi

View File

@@ -1,2 +0,0 @@
-npm run lint
-npm run test

View File

@@ -1,4 +0,0 @@
-lib/config/testdata/bad/*
-*.inc
-AGENTS.md
-CLAUDE.md

View File

@@ -1,75 +0,0 @@
-# Agent instructions
-
-Primary agent documentation is in `CONTRIBUTING.md`. You MUST read this file before proceeding.
-
-## Useful Commands
-
-```shell
-npm ci # install node dependencies
-npm run assets # build JS/CSS (required before any Go build/test)
-npm run build # assets + go build -> ./var/anubis
-npm run dev # assets + run locally with --use-remote-address
-```
-
-## Testing
-
-```shell
-npm run test
-```
-
-## Linting
-
-```shell
-go vet ./...
-go tool staticcheck ./...
-go tool govulncheck ./...
-```
-
-## Commit Messages
-
-Commit messages follow the [**Conventional Commits**](https://www.conventionalcommits.org/en/v1.0.0/) format:
-
-```text
-<type>[optional scope]: <description>
-
-[optional body]
-
-[optional footer(s)]
-```
-
-**Types**: `feat`, `fix`, `docs`, `style`, `refactor`, `perf`, `test`, `build`, `ci`, `chore`, `revert`
-
-- Add `!` after type/scope for breaking changes or include `BREAKING CHANGE:` in the footer.
-- Keep descriptions concise, imperative, lowercase, and without a trailing period.
-- Reference issues/PRs in the footer when applicable.
-- **ALL git commits MUST be made with `--signoff`.** This is mandatory.
-
-### Attribution Requirements
-
-AI agents must disclose what tool and model they are using in the "Assisted-by" commit footer:
-
-```text
-Assisted-by: [Model Name] via [Tool Name]
-```
-
-Example:
-
-```text
-Assisted-by: GLM 4.6 via Claude Code
-```
-
-## PR Checklist
-
-- Add description of changes to `[Unreleased]` in `docs/docs/CHANGELOG.md`.
-- Add test cases for bug fixes and behavior changes.
-- Run integration tests: `npm run test:integration`.
-- All commits must have verified (signed) signatures.
-
-## Key Conventions
-
-- **Security-first**: This is security software. Code reviews are strict. Always add tests for bug fixes. Consider adversarial inputs.
-- **Configuration**: YAML-based policy files. Config structs validate via `Valid() error` methods returning sentinel errors.
-- **Store interface**: `lib/store.Interface` abstracts key-value storage.
-- **Environment variables**: Parsed from flags via `flagenv`. Use `.env` files locally (loaded by `godotenv/autoload`). Never commit `.env` files.
-- **Assets must be built first**: JS/CSS assets are embedded into the Go binary. Always run `npm run assets` before `go test` or `go build`.
-- **CEL expressions**: Policy rules support CEL (Common Expression Language) expressions for advanced matching. See `lib/policy/expressions/`.

View File

@@ -1,2 +0,0 @@
-@AGENTS.md
-@CONTRIBUTING.md

View File

@@ -1,144 +0,0 @@
-# Contributing to Anubis
-
-Anubis is a Web AI Firewall Utility (WAIFU) written in Go. It uses sha256 proof-of-work challenges to protect upstream HTTP resources from scraper bots. This is security software -- correctness matters.
-
-## Build & Run
-
-Prerequisites: Go 1.24+, Node.js (any supported version), esbuild, gzip, zstd, brotli. Install all with `brew bundle` if you are using Homebrew.
-
-```shell
-npm ci # install node dependencies
-npm run assets # build JS/CSS (required before any Go build/test)
-npm run build # assets + go build -> ./var/anubis
-npm run dev # assets + run locally with --use-remote-address
-```
-
-## Testing
-
-```shell
-# Run all unit tests (assets must be built first)
-npm run test # or: make test
-
-# Run a single test by name
-go test -run TestClampIP ./internal/
-
-# Run a single test file's package
-go test ./lib/config/
-
-# Run tests with verbose output
-go test -v -run TestBotValid ./lib/config/
-```
-
-### Smoke tests
-
-The `tests` folder contains "smoke tests" that are intended to set up Anubis in production-adjacent settings and test it against real infrastructure tools. A smoke test is a folder with `test.sh` that sets up infrastructure, validates the behaviour, and then tears it down. Smoke tests are run in GitHub actions with `.github/workflows/smoke-tests.yaml`.
-
-## Linting
-
-```shell
-go vet ./...
-go tool staticcheck ./...
-go tool govulncheck ./...
-```
-
-## Code Generation
-
-The project uses `go generate` for templ templates and stringer. Always run `npm run generate` (or `make assets`) before building or testing. Generated files include:
-
-- `web/*.templ` -> templ-generated Go code
-- `web/static/` -> bundled/minified JS and CSS (with .gz, .zst, .br variants)
-
-## Project Layout
-
-Important folders:
-
-- `cmd/anubis`: Main entrypoint for the project. This is the program that runs on servers.
-- `lib/*`: The core library for Anubis and all of its features. This is internal code that is made public for ease of downstream consumption. No API stability is guaranteed. Use at your own risk.
-- `internal/*`: Actual internal code that is private to the implementation of Anubis. If you need to use a package in this, please copy it out and manually vendor it in your own project.
-- `test/*`: Smoke tests (see dedicated section for details).
-- `web`: Frontend HTML templates.
-- `xess`: Frontend CSS framework and build logic.
-
-## Code Style
-
-### Go
-
-This project follows the idioms of the Go standard library. Generally follow the patterns that upstream Go uses, including:
-
-- Prefer packages from the standard library unless there is no other option.
-- Use package import aliases only when package names collide.
-- Use `goimports` to format code. Run with `npm run format`.
-- Use sentinel errors as package-level variables prefixed with `Err` (such as `ErrBotMustHaveName`). Wrap with `fmt.Errorf("package: small message giving context: %w", err)`.
-- Use `log/slog` for structured logging. Pass loggers as arguments to functions. Use `lg.With` to preload with context. Prefer using `slog.Debug` unless you absolutely need to report messages to users; some users have magical thinking about log verbosity.
-- Name PublicFunctionsAndTypes in PascalCase. Name privateFunctionsAndTypes in camelCase.
-- Acronyms stay uppercase (`URL`, `HTTP`, `IP`, `DNS`, etc.)
-- Enumerations should use strong types with validation logic for parsing remote input.
-- Be conservative in what you send but liberal in what you accept.
-- Anything reading configuration values should use both `json` and `yaml` struct tags. Use pointer values for optional configuration values.
-- Use [table-driven tests](https://go.dev/wiki/TableDrivenTests) when writing test code.
-- Use [`t.Helper()`](https://pkg.go.dev/testing#T.Helper) in helper code (setup/teardown scaffolding).
-- Use [`t.Cleanup()`](https://pkg.go.dev/testing#T.Cleanup) to tear down per-test or per-suite scaffolding.
-- Use [`errors.Is`](https://pkg.go.dev/errors#Is) for validating function results against sentinel errors.
-- Prefer same-package tests over black-box tests (`_test` packages).
-
-### JavaScript / TypeScript
-
-- Source lives in `web/js/`. Built with esbuild, bundled and minified.
-- Uses Preact (not React).
-- No linter config. Keep functions small. Use `const` by default.
-
-### Templ Templates
-
-Anubis uses [Templ](https://templ.guide) for generating HTML on the server.
-
-- `.templ` files in `web/` generate Go code. Run `go generate ./...` (or `npm run assets`) after modifying them.
-- Templates receive typed Go parameters. Keep logic in Go, not templates.
-
-## Commit Messages
-
-Commit messages follow the [**Conventional Commits**](https://www.conventionalcommits.org/en/v1.0.0/) format:
-
-```text
-<type>[optional scope]: <description>
-
-[optional body]
-
-[optional footer(s)]
-```
-
-**Types**: `feat`, `fix`, `docs`, `style`, `refactor`, `perf`, `test`, `build`, `ci`, `chore`, `revert`
-
-- Add `!` after type/scope for breaking changes or include `BREAKING CHANGE:` in the footer.
-- Keep descriptions concise, imperative, lowercase, and without a trailing period.
-- Reference issues/PRs in the footer when applicable.
-- **ALL git commits MUST be made with `--signoff`.** This is mandatory.
-
-### Attribution Requirements
-
-AI agents must disclose what tool and model they are using in the "Assisted-by" commit footer:
-
-```text
-Assisted-by: [Model Name] via [Tool Name]
-```
-
-Example:
-
-```text
-Assisted-by: GLM 4.6 via Claude Code
-```
-
-## PR Checklist
-
-- Add description of changes to `[Unreleased]` in `docs/docs/CHANGELOG.md`.
-- Add test cases for bug fixes and behavior changes.
-- Run integration tests: `npm run test:integration`.
-- All commits must have verified (signed) signatures.
-
-## Key Conventions
-
-- **Security-first**: This is security software. Code reviews are strict. Always add tests for bug fixes. Consider adversarial inputs.
-- **Configuration**: YAML-based policy files. Config structs validate via `Valid() error` methods returning sentinel errors.
-- **Store interface**: `lib/store.Interface` abstracts key-value storage.
-- **Environment variables**: Parsed from flags via `flagenv`. Use `.env` files locally (loaded by `godotenv/autoload`). Never commit `.env` files.
-- **Assets must be built first**: JS/CSS assets are embedded into the Go binary. Always run `npm run assets` before `go test` or `go build`.
-- **CEL expressions**: Policy rules support CEL (Common Expression Language) expressions for advanced matching. See `lib/policy/expressions/`.
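The Go conventions this deleted guide describes (sentinel `Err*` variables, `Valid() error` methods, table-driven tests checked with `errors.Is`) fit together roughly as in the minimal sketch below. The `Bot` type and its `Valid` method here are illustrative stand-ins, not Anubis's real definitions; only the name `ErrBotMustHaveName` comes from the text above.

```go
package config

import (
	"errors"
	"fmt"
	"testing"
)

// Sentinel error as a package-level variable, prefixed with Err.
var ErrBotMustHaveName = errors.New("config: bot must have a name")

// Bot is a hypothetical config struct; per the guide, config structs carry
// both json and yaml tags.
type Bot struct {
	Name string `json:"name" yaml:"name"`
}

// Valid wraps the sentinel with fmt.Errorf and %w so callers can unwrap it.
func (b Bot) Valid() error {
	if b.Name == "" {
		return fmt.Errorf("config: invalid bot rule: %w", ErrBotMustHaveName)
	}
	return nil
}

// TestBotValid is a table-driven test that checks results against the
// sentinel via errors.Is, as the guide recommends.
func TestBotValid(t *testing.T) {
	for _, tc := range []struct {
		name    string
		bot     Bot
		wantErr error
	}{
		{name: "named bot", bot: Bot{Name: "googlebot"}, wantErr: nil},
		{name: "missing name", bot: Bot{}, wantErr: ErrBotMustHaveName},
	} {
		t.Run(tc.name, func(t *testing.T) {
			if err := tc.bot.Valid(); !errors.Is(err, tc.wantErr) {
				t.Errorf("Valid() = %v, want %v", err, tc.wantErr)
			}
		})
	}
}
```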

View File

@@ -20,21 +20,12 @@ Anubis is brought to you by sponsors and donors like:
 <a href="https://www.raptorcs.com/content/base/products.html">
   <img src="./docs/static/img/sponsors/raptor-computing-logo.webp" alt="Raptor Computing Systems" height=64 />
 </a>
-<a href="https://databento.com/?utm_source=anubis&utm_medium=sponsor&utm_campaign=anubis">
-  <img src="./docs/static/img/sponsors/databento-logo.webp" alt="Databento" height="64" />
-</a>
 
 ### Gold Tier
 
 <a href="https://distrust.co?utm_campaign=github&utm_medium=referral&utm_content=anubis">
   <img src="./docs/static/img/sponsors/distrust-logo.webp" alt="Distrust" height="64">
 </a>
-<a href="https://about.gitea.com?utm_campaign=github&utm_medium=referral&utm_content=anubis">
-  <img src="./docs/static/img/sponsors/gitea-logo.webp" alt="Gitea" height="64">
-</a>
-<a href="https://prolocation.net?utm_campaign=github&utm_medium=referral&utm_content=anubis">
-  <img src="./docs/static/img/sponsors/prolocation-logo.svg" alt="Prolocation" height="64">
-</a>
 <a href="https://terminaltrove.com/?utm_campaign=github&utm_medium=referral&utm_content=anubis&utm_source=abgh">
   <img src="./docs/static/img/sponsors/terminal-trove.webp" alt="Terminal Trove" height="64">
 </a>
@@ -64,9 +55,6 @@ Anubis is brought to you by sponsors and donors like:
     height="64"
   />
 </a>
-<a href="https://www.anexia.com/">
-  <img src="./docs/static/img/sponsors/anexia-cloudsolutions-logo.webp" alt="ANEXIA Cloud Solutions" height="64">
-</a>
 
 ## Overview

View File

@@ -1 +1 @@
-1.25.0
+1.23.1

View File

@@ -31,8 +31,8 @@ import (
 	"github.com/TecharoHQ/anubis/data"
 	"github.com/TecharoHQ/anubis/internal"
 	libanubis "github.com/TecharoHQ/anubis/lib"
-	"github.com/TecharoHQ/anubis/lib/config"
 	botPolicy "github.com/TecharoHQ/anubis/lib/policy"
+	"github.com/TecharoHQ/anubis/lib/policy/config"
 	"github.com/TecharoHQ/anubis/lib/thoth"
 	"github.com/TecharoHQ/anubis/web"
 	"github.com/facebookgo/flagenv"
@@ -273,11 +273,9 @@
 		return
 	}
 
+	internal.InitSlog(*slogLevel)
 	internal.SetHealth("anubis", healthv1.HealthCheckResponse_NOT_SERVING)
-	lg := internal.InitSlog(*slogLevel, os.Stderr)
-	lg.Info("starting up Anubis")
 
 	if *healthcheck {
 		log.Println("running healthcheck")
 		if err := doHealthCheck(); err != nil {
@@ -305,7 +303,7 @@
 	if *metricsBind != "" {
 		wg.Add(1)
-		go metricsServer(ctx, *lg.With("subsystem", "metrics"), wg.Done)
+		go metricsServer(ctx, wg.Done)
 	}
 
 	var rp http.Handler
@@ -325,11 +323,11 @@
 	// Thoth configuration
 	switch {
 	case *thothURL != "" && *thothToken == "":
-		lg.Warn("THOTH_URL is set but no THOTH_TOKEN is set")
+		slog.Warn("THOTH_URL is set but no THOTH_TOKEN is set")
 	case *thothURL == "" && *thothToken != "":
-		lg.Warn("THOTH_TOKEN is set but no THOTH_URL is set")
+		slog.Warn("THOTH_TOKEN is set but no THOTH_URL is set")
 	case *thothURL != "" && *thothToken != "":
-		lg.Debug("connecting to Thoth")
+		slog.Debug("connecting to Thoth")
 		thothClient, err := thoth.New(ctx, *thothURL, *thothToken, *thothInsecure)
 		if err != nil {
 			log.Fatalf("can't dial thoth at %s: %v", *thothURL, err)
@@ -338,19 +336,15 @@
 		ctx = thoth.With(ctx, thothClient)
 	}
 
-	lg.Info("loading policy file", "fname", *policyFname)
-	policy, err := libanubis.LoadPoliciesOrDefault(ctx, *policyFname, *challengeDifficulty, *slogLevel)
+	policy, err := libanubis.LoadPoliciesOrDefault(ctx, *policyFname, *challengeDifficulty)
 	if err != nil {
 		log.Fatalf("can't parse policy file: %v", err)
 	}
 
-	lg = policy.Logger
-	lg.Debug("swapped to new logger")
-	slog.SetDefault(lg)
-
 	// Warn if persistent storage is used without a configured signing key
 	if policy.Store.IsPersistent() {
 		if *hs512Secret == "" && *ed25519PrivateKeyHex == "" && *ed25519PrivateKeyHexFile == "" {
-			lg.Warn("[misconfiguration] persistent storage backend is configured, but no private key is set. " +
+			slog.Warn("[misconfiguration] persistent storage backend is configured, but no private key is set. " +
 				"Challenges will be invalidated when Anubis restarts. " +
 				"Set HS512_SECRET, ED25519_PRIVATE_KEY_HEX, or ED25519_PRIVATE_KEY_HEX_FILE to ensure challenges survive service restarts. " +
 				"See: https://anubis.techaro.lol/docs/admin/installation#key-generation")
@@ -413,7 +407,7 @@
 		log.Fatalf("failed to generate ed25519 key: %v", err)
 	}
 
-	lg.Warn("generating random key, Anubis will have strange behavior when multiple instances are behind the same load balancer target, for more information: see https://anubis.techaro.lol/docs/admin/installation#key-generation")
+	slog.Warn("generating random key, Anubis will have strange behavior when multiple instances are behind the same load balancer target, for more information: see https://anubis.techaro.lol/docs/admin/installation#key-generation")
 	}
 
 	var redirectDomainsList []string
@@ -427,7 +421,7 @@
 			redirectDomainsList = append(redirectDomainsList, strings.TrimSpace(domain))
 		}
 	} else {
-		lg.Warn("REDIRECT_DOMAINS is not set, Anubis will only redirect to the same domain a request is coming from, see https://anubis.techaro.lol/docs/admin/configuration/redirect-domains")
+		slog.Warn("REDIRECT_DOMAINS is not set, Anubis will only redirect to the same domain a request is coming from, see https://anubis.techaro.lol/docs/admin/configuration/redirect-domains")
 	}
 
 	anubis.CookieName = *cookiePrefix + "-auth"
@@ -449,9 +443,6 @@
 		StripBasePrefix: *stripBasePrefix,
 		Next: rp,
 		Policy: policy,
-		TargetHost: *targetHost,
-		TargetSNI: *targetSNI,
-		TargetInsecureSkipVerify: *targetInsecureSkipVerify,
 		ServeRobotsTXT: *robotsTxt,
 		ED25519PrivateKey: ed25519Priv,
 		HS512Secret: []byte(*hs512Secret),
@@ -467,7 +458,6 @@
 		CookieSameSite: parseSameSite(*cookieSameSite),
 		PublicUrl: *publicUrl,
 		JWTRestrictionHeader: *jwtRestrictionHeader,
-		Logger: policy.Logger.With("subsystem", "anubis"),
 		DifficultyInJWT: *difficultyInJWT,
 	})
 	if err != nil {
@@ -484,7 +474,7 @@
 	srv := http.Server{Handler: h, ErrorLog: internal.GetFilteredHTTPLogger()}
 	listener, listenerUrl := setupListener(*bindNetwork, *bind)
-	lg.Info(
+	slog.Info(
 		"listening",
 		"url", listenerUrl,
 		"difficulty", *challengeDifficulty,
@@ -518,7 +508,7 @@
 	wg.Wait()
 }
 
-func metricsServer(ctx context.Context, lg slog.Logger, done func()) {
+func metricsServer(ctx context.Context, done func()) {
 	defer done()
 
 	mux := http.NewServeMux()
@@ -544,7 +534,7 @@ func metricsServer(ctx context.Context, lg slog.Logger, done func()) {
 	srv := http.Server{Handler: mux, ErrorLog: internal.GetFilteredHTTPLogger()}
 	listener, metricsUrl := setupListener(*metricsBindNetwork, *metricsBind)
-	lg.Debug("listening for metrics", "url", metricsUrl)
+	slog.Debug("listening for metrics", "url", metricsUrl)
 
 	go func() {
 		<-ctx.Done()
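For readers comparing the two sides of this diff: the removed (-) code follows the injected-logger pattern, building one `*slog.Logger` up front and handing subsystems scoped children via `With`, while the added (+) code falls back to the package-level `slog` functions. Below is a rough, self-contained sketch of the removed pattern; the `initSlog` helper here is an assumption for illustration, since the real one lives in this repository's `internal` package.

```go
package main

import (
	"log/slog"
	"os"
)

// initSlog mirrors what an InitSlog helper plausibly does: build a handler at
// the requested level and return a logger for callers to pass around.
func initSlog(level slog.Level, w *os.File) *slog.Logger {
	return slog.New(slog.NewJSONHandler(w, &slog.HandlerOptions{Level: level}))
}

// metricsServer receives a logger instead of reaching for the global one.
func metricsServer(lg *slog.Logger) {
	lg.Debug("listening for metrics", "url", "http://localhost:9090")
}

func main() {
	lg := initSlog(slog.LevelInfo, os.Stderr)
	lg.Info("starting up")

	// Each subsystem gets a child logger preloaded with context via With.
	metricsServer(lg.With("subsystem", "metrics"))
}
```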

View File

@@ -28,7 +28,7 @@ func main() {
 	flagenv.Parse()
 	flag.Parse()
 
-	slog.SetDefault(internal.InitSlog(*slogLevel, os.Stderr))
+	internal.InitSlog(*slogLevel)
 
 	koDockerRepo := strings.TrimSuffix(*dockerRepo, "/"+filepath.Base(*dockerRepo))
@@ -159,8 +159,5 @@ func run(command string) (string, error) {
 }
 
 func setOutput(key, val string) {
-	github_output := os.Getenv("GITHUB_OUTPUT")
-	f, _ := os.OpenFile(github_output, os.O_WRONLY|os.O_APPEND|os.O_CREATE, 0644)
-	fmt.Fprintf(f, "%s=%s\n", key, val)
-	f.Close()
+	fmt.Printf("::set-output name=%s::%s\n", key, val)
 }

View File

@@ -12,7 +12,7 @@ import (
 	"regexp"
 	"strings"
 
-	"github.com/TecharoHQ/anubis/lib/config"
+	"github.com/TecharoHQ/anubis/lib/policy/config"
 	"sigs.k8s.io/yaml"
 )

View File

@@ -22,9 +22,9 @@ type TestCase struct {
 type TestOptions struct {
 	format string
 	action string
-	crawlDelayWeight int
 	policyName string
 	deniedAction string
+	crawlDelayWeight int
 }
 
 func TestDataFileConversion(t *testing.T) {

View File

@@ -3,6 +3,5 @@
 - name: qualys-ssl-labs
   action: ALLOW
   remote_addresses:
-    - 69.67.183.0/24
+    - 64.41.200.0/24
     - 2600:C02:1020:4202::/64
-    - 2602:fdaa:c6:2::/64

View File

@@ -51,6 +51,7 @@ bots:
   #   action: CHALLENGE
   #   challenge:
   #     difficulty: 16 # impossible
+  #     report_as: 4 # lie to the operator
   #     algorithm: slow # intentionally waste CPU cycles and time
 
 # Requires a subscription to Thoth to use, see
@@ -95,6 +96,50 @@ bots:
   #     weight:
   #       adjust: -10
 
+  # Assert behaviour that only genuine browsers display. This ensures that Chrome
+  # or Firefox versions
+  - name: realistic-browser-catchall
+    expression:
+      all:
+        - '"User-Agent" in headers'
+        - '( userAgent.contains("Firefox") ) || ( userAgent.contains("Chrome") ) || ( userAgent.contains("Safari") )'
+        - '"Accept" in headers'
+        - '"Sec-Fetch-Dest" in headers'
+        - '"Sec-Fetch-Mode" in headers'
+        - '"Sec-Fetch-Site" in headers'
+        - '"Accept-Encoding" in headers'
+        - '( headers["Accept-Encoding"].contains("zstd") || headers["Accept-Encoding"].contains("br") )'
+        - '"Accept-Language" in headers'
+    action: WEIGH
+    weight:
+      adjust: -10
+
+  # The Upgrade-Insecure-Requests header is typically sent by browsers, but not always
+  - name: upgrade-insecure-requests
+    expression: '"Upgrade-Insecure-Requests" in headers'
+    action: WEIGH
+    weight:
+      adjust: -2
+
+  # Chrome should behave like Chrome
+  - name: chrome-is-proper
+    expression:
+      all:
+        - userAgent.contains("Chrome")
+        - '"Sec-Ch-Ua" in headers'
+        - 'headers["Sec-Ch-Ua"].contains("Chromium")'
+        - '"Sec-Ch-Ua-Mobile" in headers'
+        - '"Sec-Ch-Ua-Platform" in headers'
+    action: WEIGH
+    weight:
+      adjust: -5
+
+  - name: should-have-accept
+    expression: '!("Accept" in headers)'
+    action: WEIGH
+    weight:
+      adjust: 5
+
   # Generic catchall rule
   - name: generic-browser
     user_agent_regex: >-
@@ -204,6 +249,7 @@ thresholds:
     # https://anubis.techaro.lol/docs/admin/configuration/challenges/metarefresh
     algorithm: metarefresh
     difficulty: 1
+    report_as: 1
 
   # For clients that are browser-like but have either gained points from custom rules or
   # report as a standard browser.
   - name: moderate-suspicion
@@ -216,6 +262,7 @@
     # https://anubis.techaro.lol/docs/admin/configuration/challenges/proof-of-work
     algorithm: fast
     difficulty: 2 # two leading zeros, very fast for most clients
+    report_as: 2
 
   - name: mild-proof-of-work
     expression:
       all:
@@ -226,6 +273,7 @@
     # https://anubis.techaro.lol/docs/admin/configuration/challenges/proof-of-work
     algorithm: fast
     difficulty: 4
+    report_as: 4
 
   # For clients that are browser like and have gained many points from custom rules
   - name: extreme-suspicion
     expression: weight >= 30
@@ -234,3 +282,4 @@
     # https://anubis.techaro.lol/docs/admin/configuration/challenges/proof-of-work
     algorithm: fast
     difficulty: 6
+    report_as: 6

View File

@@ -4,5 +4,5 @@
# - Claude-User: No published IP allowlist # - Claude-User: No published IP allowlist
- name: "ai-clients" - name: "ai-clients"
user_agent_regex: >- user_agent_regex: >-
ChatGPT-User|Claude-User|MistralAI-User|Perplexity-User ChatGPT-User|Claude-User|MistralAI-User
action: DENY action: DENY

View File

@@ -4,4 +4,7 @@
user_agent_regex: MistralAI-User/.+; \+https\://docs\.mistral\.ai/robots user_agent_regex: MistralAI-User/.+; \+https\://docs\.mistral\.ai/robots
action: ALLOW action: ALLOW
# https://mistral.ai/mistralai-user-ips.json # https://mistral.ai/mistralai-user-ips.json
remote_addresses: ["20.240.160.161/32", "20.240.160.1/32"] remote_addresses: [
"20.240.160.161/32",
"20.240.160.1/32",
]

View File

@@ -5,8 +5,7 @@
action: ALLOW action: ALLOW
# https://openai.com/chatgpt-user.json # https://openai.com/chatgpt-user.json
# curl 'https://openai.com/chatgpt-user.json' | jq '.prefixes.[].ipv4Prefix' | sed 's/$/,/' # curl 'https://openai.com/chatgpt-user.json' | jq '.prefixes.[].ipv4Prefix' | sed 's/$/,/'
remote_addresses: remote_addresses: [
[
"13.65.138.112/28", "13.65.138.112/28",
"23.98.179.16/28", "23.98.179.16/28",
"13.65.138.96/28", "13.65.138.96/28",

View File

@@ -1,8 +0,0 @@
# Acts on behalf of user requests
# https://docs.perplexity.ai/guides/bots
- name: perplexity-user
user_agent_regex: Perplexity-User/.+; \+https\://perplexity\.ai/perplexity-user
action: ALLOW
# https://www.perplexity.com/perplexity-user.json
remote_addresses:
["44.208.221.197/32", "34.193.163.52/32", "18.97.21.0/30", "18.97.43.80/29"]

View File

@@ -1,6 +0,0 @@
- name: telegrambot
action: ALLOW
expression:
all:
- userAgent.matches("TelegramBot")
- verifyFCrDNS(remoteAddress, "ptr\\.telegram\\.org$")

View File

@@ -1,6 +0,0 @@
- name: vkbot
action: ALLOW
expression:
all:
- userAgent.matches("vkShare[^+]+\\+http\\://vk\\.com/dev/Share")
- verifyFCrDNS(remoteAddress, "^snipster\\d+\\.go\\.mail\\.ru$")

View File

@@ -1,55 +0,0 @@
# Assert behaviour that only genuine browsers display. This ensures that modern Chrome
# or Firefox versions will get through without a challenge.
#
# These rules have been known to be bypassed by some of the worst automated scrapers.
# Use at your own risk.
- name: realistic-browser-catchall
expression:
all:
- '"User-Agent" in headers'
- '( userAgent.contains("Firefox") ) || ( userAgent.contains("Chrome") ) || ( userAgent.contains("Safari") )'
- '"Accept" in headers'
- '"Sec-Fetch-Dest" in headers'
- '"Sec-Fetch-Mode" in headers'
- '"Sec-Fetch-Site" in headers'
- '"Accept-Encoding" in headers'
- '( headers["Accept-Encoding"].contains("zstd") || headers["Accept-Encoding"].contains("br") )'
- '"Accept-Language" in headers'
action: WEIGH
weight:
adjust: -10
# The Upgrade-Insecure-Requests header is typically sent by browsers, but not always
- name: upgrade-insecure-requests
expression: '"Upgrade-Insecure-Requests" in headers'
action: WEIGH
weight:
adjust: -2
# Chrome should behave like Chrome
- name: chrome-is-proper
expression:
all:
- userAgent.contains("Chrome")
- '"Sec-Ch-Ua" in headers'
- 'headers["Sec-Ch-Ua"].contains("Chromium")'
- '"Sec-Ch-Ua-Mobile" in headers'
- '"Sec-Ch-Ua-Platform" in headers'
action: WEIGH
weight:
adjust: -5
- name: should-have-accept
expression: '!("Accept" in headers)'
action: WEIGH
weight:
adjust: 5
# Generic catchall rule
- name: generic-browser
user_agent_regex: >-
Mozilla|Opera
action: WEIGH
weight:
adjust: 10

View File

@@ -8,4 +8,3 @@
- import: (data)/crawlers/marginalia.yaml - import: (data)/crawlers/marginalia.yaml
- import: (data)/crawlers/mojeekbot.yaml - import: (data)/crawlers/mojeekbot.yaml
- import: (data)/crawlers/commoncrawl.yaml - import: (data)/crawlers/commoncrawl.yaml
- import: (data)/crawlers/yandexbot.yaml

View File

@@ -4,5 +4,5 @@
# - Claude-SearchBot: No published IP allowlist # - Claude-SearchBot: No published IP allowlist
- name: "ai-crawlers-search" - name: "ai-crawlers-search"
user_agent_regex: >- user_agent_regex: >-
OAI-SearchBot|Claude-SearchBot|PerplexityBot OAI-SearchBot|Claude-SearchBot
action: DENY action: DENY

View File

@@ -4,8 +4,7 @@
user_agent_regex: Applebot user_agent_regex: Applebot
action: ALLOW action: ALLOW
# https://search.developer.apple.com/applebot.json # https://search.developer.apple.com/applebot.json
remote_addresses: remote_addresses: [
[
"17.241.208.160/27", "17.241.208.160/27",
"17.241.193.160/27", "17.241.193.160/27",
"17.241.200.160/27", "17.241.200.160/27",

View File

@@ -2,8 +2,7 @@
user_agent_regex: \+http\://www\.bing\.com/bingbot\.htm user_agent_regex: \+http\://www\.bing\.com/bingbot\.htm
action: ALLOW action: ALLOW
# https://www.bing.com/toolbox/bingbot.json # https://www.bing.com/toolbox/bingbot.json
remote_addresses: remote_addresses: [
[
"157.55.39.0/24", "157.55.39.0/24",
"207.46.13.0/24", "207.46.13.0/24",
"40.77.167.0/24", "40.77.167.0/24",
@@ -31,5 +30,5 @@
"20.74.197.0/28", "20.74.197.0/28",
"20.15.133.160/27", "20.15.133.160/27",
"40.77.177.0/24", "40.77.177.0/24",
"40.77.178.0/23", "40.77.178.0/23"
] ]

View File

@@ -2,8 +2,7 @@
user_agent_regex: DuckDuckBot/1\.1; \(\+http\://duckduckgo\.com/duckduckbot\.html\) user_agent_regex: DuckDuckBot/1\.1; \(\+http\://duckduckgo\.com/duckduckbot\.html\)
action: ALLOW action: ALLOW
# https://duckduckgo.com/duckduckgo-help-pages/results/duckduckbot # https://duckduckgo.com/duckduckgo-help-pages/results/duckduckbot
remote_addresses: remote_addresses: [
[
"57.152.72.128/32", "57.152.72.128/32",
"51.8.253.152/32", "51.8.253.152/32",
"40.80.242.63/32", "40.80.242.63/32",
@@ -272,5 +271,5 @@
"4.213.46.14/32", "4.213.46.14/32",
"172.169.17.165/32", "172.169.17.165/32",
"51.8.71.117/32", "51.8.71.117/32",
"20.3.1.178/32", "20.3.1.178/32"
] ]

View File

@@ -2,8 +2,7 @@
user_agent_regex: \+http\://www\.google\.com/bot\.html user_agent_regex: \+http\://www\.google\.com/bot\.html
action: ALLOW action: ALLOW
# https://developers.google.com/static/search/apis/ipranges/googlebot.json # https://developers.google.com/static/search/apis/ipranges/googlebot.json
remote_addresses: remote_addresses: [
[
"2001:4860:4801:10::/64", "2001:4860:4801:10::/64",
"2001:4860:4801:11::/64", "2001:4860:4801:11::/64",
"2001:4860:4801:12::/64", "2001:4860:4801:12::/64",
@@ -260,5 +259,5 @@
"66.249.79.224/27", "66.249.79.224/27",
"66.249.79.32/27", "66.249.79.32/27",
"66.249.79.64/27", "66.249.79.64/27",
"66.249.79.96/27", "66.249.79.96/27"
] ]

View File

@@ -1,4 +1,8 @@
- name: internet-archive - name: internet-archive
action: ALLOW action: ALLOW
# https://ipinfo.io/AS7941 # https://ipinfo.io/AS7941
remote_addresses: ["207.241.224.0/20", "208.70.24.0/21", "2620:0:9c0::/48"] remote_addresses: [
"207.241.224.0/20",
"208.70.24.0/21",
"2620:0:9c0::/48"
]

View File

@@ -2,10 +2,9 @@
user_agent_regex: \+https\://kagi\.com/bot user_agent_regex: \+https\://kagi\.com/bot
action: ALLOW action: ALLOW
# https://kagi.com/bot # https://kagi.com/bot
remote_addresses: remote_addresses: [
[
"216.18.205.234/32", "216.18.205.234/32",
"35.212.27.76/32", "35.212.27.76/32",
"104.254.65.50/32", "104.254.65.50/32",
"209.151.156.194/32", "209.151.156.194/32"
] ]

View File

@@ -2,11 +2,10 @@
user_agent_regex: search\.marginalia\.nu user_agent_regex: search\.marginalia\.nu
action: ALLOW action: ALLOW
# Received directly over email # Received directly over email
remote_addresses: remote_addresses: [
[
"193.183.0.162/31", "193.183.0.162/31",
"193.183.0.164/30", "193.183.0.164/30",
"193.183.0.168/30", "193.183.0.168/30",
"193.183.0.172/31", "193.183.0.172/31",
"193.183.0.174/32", "193.183.0.174/32"
] ]

View File

@@ -4,8 +4,7 @@
user_agent_regex: GPTBot/1\.1; \+https\://openai\.com/gptbot user_agent_regex: GPTBot/1\.1; \+https\://openai\.com/gptbot
action: ALLOW action: ALLOW
# https://openai.com/gptbot.json # https://openai.com/gptbot.json
remote_addresses: remote_addresses: [
[
"52.230.152.0/24", "52.230.152.0/24",
"20.171.206.0/24", "20.171.206.0/24",
"20.171.207.0/24", "20.171.207.0/24",

View File

@@ -4,11 +4,10 @@
user_agent_regex: OAI-SearchBot/1\.0; \+https\://openai\.com/searchbot user_agent_regex: OAI-SearchBot/1\.0; \+https\://openai\.com/searchbot
action: ALLOW action: ALLOW
# https://openai.com/searchbot.json # https://openai.com/searchbot.json
remote_addresses: remote_addresses: [
[
"20.42.10.176/28", "20.42.10.176/28",
"172.203.190.128/28", "172.203.190.128/28",
"104.210.140.128/28", "104.210.140.128/28",
"51.8.102.0/24", "51.8.102.0/24",
"135.234.64.0/24", "135.234.64.0/24"
] ]

View File

@@ -1,17 +0,0 @@
# Indexing for search, does not collect training data
# https://docs.perplexity.ai/guides/bots
- name: perplexitybot
user_agent_regex: PerplexityBot/.+; \+https\://perplexity\.ai/perplexitybot
action: ALLOW
# https://www.perplexity.com/perplexitybot.json
remote_addresses:
[
"107.20.236.150/32",
"3.224.62.45/32",
"18.210.92.235/32",
"3.222.232.239/32",
"3.211.124.183/32",
"3.231.139.107/32",
"18.97.1.228/30",
"18.97.9.96/29",
]

View File

@@ -1,6 +0,0 @@
- name: yandexbot
action: ALLOW
expression:
all:
- userAgent.matches("\\+http\\://yandex\\.com/bots")
- verifyFCrDNS(remoteAddress, "^.*\\.yandex\\.(ru|com|net)$")

View File

@@ -3,7 +3,5 @@
- import: (data)/bots/ai-catchall.yaml - import: (data)/bots/ai-catchall.yaml
- import: (data)/crawlers/ai-training.yaml - import: (data)/crawlers/ai-training.yaml
- import: (data)/crawlers/openai-searchbot.yaml - import: (data)/crawlers/openai-searchbot.yaml
- import: (data)/crawlers/perplexitybot.yaml
- import: (data)/clients/openai-chatgpt-user.yaml - import: (data)/clients/openai-chatgpt-user.yaml
- import: (data)/clients/mistral-mistralai-user.yaml - import: (data)/clients/mistral-mistralai-user.yaml
- import: (data)/clients/perplexity-user.yaml

View File

@@ -2,7 +2,5 @@
- import: (data)/bots/ai-catchall.yaml - import: (data)/bots/ai-catchall.yaml
- import: (data)/crawlers/openai-searchbot.yaml - import: (data)/crawlers/openai-searchbot.yaml
- import: (data)/crawlers/openai-gptbot.yaml - import: (data)/crawlers/openai-gptbot.yaml
- import: (data)/crawlers/perplexitybot.yaml
- import: (data)/clients/openai-chatgpt-user.yaml - import: (data)/clients/openai-chatgpt-user.yaml
- import: (data)/clients/mistral-mistralai-user.yaml - import: (data)/clients/mistral-mistralai-user.yaml
- import: (data)/clients/perplexity-user.yaml

View File

@@ -35,6 +35,7 @@
# action: CHALLENGE # action: CHALLENGE
# challenge: # challenge:
# difficulty: 16 # impossible # difficulty: 16 # impossible
# report_as: 4 # lie to the operator
# algorithm: slow # intentionally waste CPU cycles and time # algorithm: slow # intentionally waste CPU cycles and time
# Requires a subscription to Thoth to use, see # Requires a subscription to Thoth to use, see
@@ -79,6 +80,50 @@
# weight: # weight:
# adjust: -10 # adjust: -10
# Assert behaviour that only genuine browsers display. This ensures that Chrome
# or Firefox versions will get through without a challenge.
- name: realistic-browser-catchall
expression:
all:
- '"User-Agent" in headers'
- '( userAgent.contains("Firefox") ) || ( userAgent.contains("Chrome") ) || ( userAgent.contains("Safari") )'
- '"Accept" in headers'
- '"Sec-Fetch-Dest" in headers'
- '"Sec-Fetch-Mode" in headers'
- '"Sec-Fetch-Site" in headers'
- '"Accept-Encoding" in headers'
- '( headers["Accept-Encoding"].contains("zstd") || headers["Accept-Encoding"].contains("br") )'
- '"Accept-Language" in headers'
action: WEIGH
weight:
adjust: -10
# The Upgrade-Insecure-Requests header is typically sent by browsers, but not always
- name: upgrade-insecure-requests
expression: '"Upgrade-Insecure-Requests" in headers'
action: WEIGH
weight:
adjust: -2
# Chrome should behave like Chrome
- name: chrome-is-proper
expression:
all:
- userAgent.contains("Chrome")
- '"Sec-Ch-Ua" in headers'
- 'headers["Sec-Ch-Ua"].contains("Chromium")'
- '"Sec-Ch-Ua-Mobile" in headers'
- '"Sec-Ch-Ua-Platform" in headers'
action: WEIGH
weight:
adjust: -5
- name: should-have-accept
expression: '!("Accept" in headers)'
action: WEIGH
weight:
adjust: 5
# Generic catchall rule # Generic catchall rule
- name: generic-browser - name: generic-browser
user_agent_regex: >- user_agent_regex: >-

View File

@@ -1,2 +0,0 @@
- import: (data)/clients/telegram-preview.yaml
- import: (data)/clients/vk-preview.yaml

View File

@@ -1,26 +0,0 @@
# https://updown.io/about
- name: updown
user_agent_regex: updown.io
action: ALLOW
remote_addresses: [
"45.32.74.41/32",
"104.238.136.194/32",
"192.99.37.47/32",
"91.121.222.175/32",
"104.238.159.87/32",
"102.212.60.78/32",
"135.181.102.135/32",
"45.32.107.181/32",
"45.76.104.117/32",
"45.63.29.207/32",
"2001:19f0:6001:2c6::1/128",
"2001:19f0:9002:11a::1/128",
"2607:5300:60:4c2f::1/128",
"2001:41d0:2:85af::1/128",
"2001:19f0:6c01:145::1/128",
"2c0f:c40:4003:4::2/128",
"2a01:4f9:c010:d5f9::1/128",
"2001:19f0:4400:402e::1/128",
"2001:19f0:7001:45a::1/128",
"2001:19f0:5801:1d8::1/128"
]

View File

@@ -2,8 +2,7 @@
user_agent_regex: UptimeRobot user_agent_regex: UptimeRobot
action: ALLOW action: ALLOW
# https://api.uptimerobot.com/meta/ips # https://api.uptimerobot.com/meta/ips
remote_addresses: remote_addresses: [
[
"3.12.251.153/32", "3.12.251.153/32",
"3.20.63.178/32", "3.20.63.178/32",
"3.77.67.4/32", "3.77.67.4/32",

View File

@@ -13,13 +13,13 @@ func Zilch[T any]() T {
// Impl is a lazy key->value map. It's a wrapper around a map and a mutex. If values exceed their time-to-live, they are pruned at Get time. // Impl is a lazy key->value map. It's a wrapper around a map and a mutex. If values exceed their time-to-live, they are pruned at Get time.
type Impl[K comparable, V any] struct { type Impl[K comparable, V any] struct {
data map[K]decayMapEntry[V] data map[K]decayMapEntry[V]
lock sync.RWMutex
// deleteCh receives decay-deletion requests from readers. // deleteCh receives decay-deletion requests from readers.
deleteCh chan deleteReq[K] deleteCh chan deleteReq[K]
// stopCh stops the background cleanup worker. // stopCh stops the background cleanup worker.
stopCh chan struct{} stopCh chan struct{}
wg sync.WaitGroup wg sync.WaitGroup
lock sync.RWMutex
} }
type decayMapEntry[V any] struct { type decayMapEntry[V any] struct {
@@ -146,7 +146,7 @@ func (m *Impl[K, V]) Close() {
func (m *Impl[K, V]) cleanupWorker() { func (m *Impl[K, V]) cleanupWorker() {
defer m.wg.Done() defer m.wg.Done()
batch := make([]deleteReq[K], 0, 64) batch := make([]deleteReq[K], 0, 64)
ticker := time.NewTicker(500 * time.Millisecond) ticker := time.NewTicker(10 * time.Millisecond)
defer ticker.Stop() defer ticker.Stop()
flush := func() { flush := func() {

View File

@@ -32,7 +32,7 @@ func TestImpl(t *testing.T) {
// Deletion of expired entries after Get is deferred to a background worker. // Deletion of expired entries after Get is deferred to a background worker.
// Assert it eventually disappears from the map. // Assert it eventually disappears from the map.
deadline := time.Now().Add(700 * time.Millisecond) deadline := time.Now().Add(200 * time.Millisecond)
for time.Now().Before(deadline) { for time.Now().Before(deadline) {
if dm.Len() == 0 { if dm.Len() == 0 {
break break

View File

@@ -226,7 +226,7 @@ So far Anubis supports the following languages:
- English (Simplified and Traditional) - English (Simplified and Traditional)
- French - French
- Portuguese (Brazil) - Portuguese (Brazil)
- Spanish - Spanish
If you want to contribute translations, please [file an issue](https://github.com/TecharoHQ/anubis/issues/new) with your language of choice or submit a pull request to [the `lib/localization/locales` folder](https://github.com/TecharoHQ/anubis/tree/main/lib/localization/locales). We are about to introduce features to the translation stack, so you may want to hold off a hot minute, but we welcome any and all contributions to making Anubis useful to a global audience. If you want to contribute translations, please [file an issue](https://github.com/TecharoHQ/anubis/issues/new) with your language of choice or submit a pull request to [the `lib/localization/locales` folder](https://github.com/TecharoHQ/anubis/tree/main/lib/localization/locales). We are about to introduce features to the translation stack, so you may want to hold off a hot minute, but we welcome any and all contributions to making Anubis useful to a global audience.

View File

@@ -69,7 +69,7 @@ I am waiting to hear back from NLNet on if Anubis was selected for funding or no
Anubis now supports localized responses. Locales can be added in [lib/localization/locales/](https://github.com/TecharoHQ/anubis/tree/main/lib/localization/locales). This release includes support for the following languages: Anubis now supports localized responses. Locales can be added in [lib/localization/locales/](https://github.com/TecharoHQ/anubis/tree/main/lib/localization/locales). This release includes support for the following languages:
- [Brazilian Portuguese](https://github.com/TecharoHQ/anubis/pull/726) - [Brazilian Portuguese](https://github.com/TecharoHQ/anubis/pull/726)
- [Chinese (Simplified)](https://github.com/TecharoHQ/anubis/pull/774) - [Chinese (Simplified)](https://github.com/TecharoHQ/anubis/pull/774)
- [Chinese (Traditional)](https://github.com/TecharoHQ/anubis/pull/759) - [Chinese (Traditional)](https://github.com/TecharoHQ/anubis/pull/759)
- [Czech](https://github.com/TecharoHQ/anubis/pull/849) - [Czech](https://github.com/TecharoHQ/anubis/pull/849)

View File

@@ -1,16 +1,14 @@
import React, { useState, useEffect, useMemo } from "react"; import React, { useState, useEffect, useMemo } from 'react';
import styles from "./styles.module.css"; import styles from './styles.module.css';
// A helper function to perform SHA-256 hashing. // A helper function to perform SHA-256 hashing.
// It takes a string, encodes it, hashes it, and returns a hex string. // It takes a string, encodes it, hashes it, and returns a hex string.
async function sha256(message) { async function sha256(message) {
try { try {
const msgBuffer = new TextEncoder().encode(message); const msgBuffer = new TextEncoder().encode(message);
const hashBuffer = await crypto.subtle.digest("SHA-256", msgBuffer); const hashBuffer = await crypto.subtle.digest('SHA-256', msgBuffer);
const hashArray = Array.from(new Uint8Array(hashBuffer)); const hashArray = Array.from(new Uint8Array(hashBuffer));
const hashHex = hashArray const hashHex = hashArray.map(b => b.toString(16).padStart(2, '0')).join('');
.map((b) => b.toString(16).padStart(2, "0"))
.join("");
return hashHex; return hashHex;
} catch (error) { } catch (error) {
console.error("Hashing failed:", error); console.error("Hashing failed:", error);
@@ -23,42 +21,21 @@ const generateRandomHex = (bytes = 16) => {
const buffer = new Uint8Array(bytes); const buffer = new Uint8Array(bytes);
crypto.getRandomValues(buffer); crypto.getRandomValues(buffer);
return Array.from(buffer) return Array.from(buffer)
.map((byte) => byte.toString(16).padStart(2, "0")) .map(byte => byte.toString(16).padStart(2, '0'))
.join(""); .join('');
}; };
// Icon components for better visual feedback // Icon components for better visual feedback
const CheckIcon = () => ( const CheckIcon = () => (
<svg <svg xmlns="http://www.w3.org/2000/svg" className={styles.iconGreen} fill="none" viewBox="0 0 24 24" stroke="currentColor">
xmlns="http://www.w3.org/2000/svg" <path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M9 12l2 2 4-4m6 2a9 9 0 11-18 0 9 9 0 0118 0z" />
className={styles.iconGreen}
fill="none"
viewBox="0 0 24 24"
stroke="currentColor"
>
<path
strokeLinecap="round"
strokeLinejoin="round"
strokeWidth={2}
d="M9 12l2 2 4-4m6 2a9 9 0 11-18 0 9 9 0 0118 0z"
/>
</svg> </svg>
); );
const XCircleIcon = () => ( const XCircleIcon = () => (
<svg <svg xmlns="http://www.w3.org/2000/svg" className={styles.iconRed} fill="none" viewBox="0 0 24 24" stroke="currentColor">
xmlns="http://www.w3.org/2000/svg" <path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M10 14l2-2m0 0l2-2m-2 2l-2-2m2 2l2 2m7-2a9 9 0 11-18 0 9 9 0 0118 0z" />
className={styles.iconRed}
fill="none"
viewBox="0 0 24 24"
stroke="currentColor"
>
<path
strokeLinecap="round"
strokeLinejoin="round"
strokeWidth={2}
d="M10 14l2-2m0 0l2-2m-2 2l-2-2m2 2l2 2m7-2a9 9 0 11-18 0 9 9 0 0118 0z"
/>
</svg> </svg>
); );
@@ -69,7 +46,7 @@ export default function App() {
// State for the nonce, which is the variable we can change // State for the nonce, which is the variable we can change
const [nonce, setNonce] = useState(0); const [nonce, setNonce] = useState(0);
// State to store the resulting hash // State to store the resulting hash
const [hash, setHash] = useState(""); const [hash, setHash] = useState('');
// A flag to indicate if the current hash is the "winning" one // A flag to indicate if the current hash is the "winning" one
const [isMining, setIsMining] = useState(false); const [isMining, setIsMining] = useState(false);
const [isFound, setIsFound] = useState(false); const [isFound, setIsFound] = useState(false);
@@ -78,10 +55,7 @@ export default function App() {
const difficulty = "00"; const difficulty = "00";
// Memoize the combined data to avoid recalculating on every render // Memoize the combined data to avoid recalculating on every render
const combinedData = useMemo( const combinedData = useMemo(() => `${challenge}${nonce}`, [challenge, nonce]);
() => `${challenge}${nonce}`,
[challenge, nonce],
);
// This effect hook recalculates the hash whenever the combinedData changes. // This effect hook recalculates the hash whenever the combinedData changes.
useEffect(() => { useEffect(() => {
@@ -94,9 +68,7 @@ export default function App() {
} }
}; };
calculateHash(); calculateHash();
return () => { return () => { isMounted = false; };
isMounted = false;
};
}, [combinedData, difficulty]); }, [combinedData, difficulty]);
// This effect handles the automatic mining process // This effect handles the automatic mining process
@@ -121,7 +93,7 @@ export default function App() {
// Update the UI periodically to avoid freezing the browser // Update the UI periodically to avoid freezing the browser
if (miningNonce % 100 === 0) { if (miningNonce % 100 === 0) {
setNonce(miningNonce); setNonce(miningNonce);
await new Promise((resolve) => setTimeout(resolve, 0)); // Yield to the browser await new Promise(resolve => setTimeout(resolve, 0)); // Yield to the browser
} }
} }
}; };
@@ -130,27 +102,28 @@ export default function App() {
return () => { return () => {
continueMining = false; continueMining = false;
}; }
}, [isMining, challenge, nonce, difficulty]); }, [isMining, challenge, nonce, difficulty]);
const handleMineClick = () => { const handleMineClick = () => {
setIsMining(true); setIsMining(true);
}; }
const handleStopClick = () => { const handleStopClick = () => {
setIsMining(false); setIsMining(false);
}; }
const handleResetClick = () => { const handleResetClick = () => {
setIsMining(false); setIsMining(false);
setNonce(0); setNonce(0);
}; }
const handleNewChallengeClick = () => { const handleNewChallengeClick = () => {
setIsMining(false); setIsMining(false);
setChallenge(generateRandomHex(16)); setChallenge(generateRandomHex(16));
setNonce(0); setNonce(0);
}; }
// Helper to render the hash with colored leading characters // Helper to render the hash with colored leading characters
const renderHash = () => { const renderHash = () => {
@@ -180,46 +153,12 @@ export default function App() {
<div className={styles.block}> <div className={styles.block}>
<h2 className={styles.blockTitle}>2. Nonce</h2> <h2 className={styles.blockTitle}>2. Nonce</h2>
<div className={styles.nonceControls}> <div className={styles.nonceControls}>
<button <button onClick={() => setNonce(n => n - 1)} disabled={isMining} className={styles.nonceButton}>
onClick={() => setNonce((n) => n - 1)} <svg xmlns="http://www.w3.org/2000/svg" className={styles.iconSmall} fill="none" viewBox="0 0 24 24" stroke="currentColor"><path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M20 12H4" /></svg>
disabled={isMining}
className={styles.nonceButton}
>
<svg
xmlns="http://www.w3.org/2000/svg"
className={styles.iconSmall}
fill="none"
viewBox="0 0 24 24"
stroke="currentColor"
>
<path
strokeLinecap="round"
strokeLinejoin="round"
strokeWidth={2}
d="M20 12H4"
/>
</svg>
</button> </button>
<span className={styles.nonceValue}>{nonce}</span> <span className={styles.nonceValue}>{nonce}</span>
<button <button onClick={() => setNonce(n => n + 1)} disabled={isMining} className={styles.nonceButton}>
onClick={() => setNonce((n) => n + 1)} <svg xmlns="http://www.w3.org/2000/svg" className={styles.iconSmall} fill="none" viewBox="0 0 24 24" stroke="currentColor"><path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M12 4v16m8-8H4" /></svg>
disabled={isMining}
className={styles.nonceButton}
>
<svg
xmlns="http://www.w3.org/2000/svg"
className={styles.iconSmall}
fill="none"
viewBox="0 0 24 24"
stroke="currentColor"
>
<path
strokeLinecap="round"
strokeLinejoin="round"
strokeWidth={2}
d="M12 4v16m8-8H4"
/>
</svg>
</button> </button>
</div> </div>
</div> </div>
@@ -233,26 +172,13 @@ export default function App() {
{/* Arrow pointing down */} {/* Arrow pointing down */}
<div className={styles.arrowContainer}> <div className={styles.arrowContainer}>
<svg <svg xmlns="http://www.w3.org/2000/svg" className={styles.iconGray} fill="none" viewBox="0 0 24 24" stroke="currentColor">
xmlns="http://www.w3.org/2000/svg" <path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M19 14l-7 7m0 0l-7-7m7 7V3" />
className={styles.iconGray}
fill="none"
viewBox="0 0 24 24"
stroke="currentColor"
>
<path
strokeLinecap="round"
strokeLinejoin="round"
strokeWidth={2}
d="M19 14l-7 7m0 0l-7-7m7 7V3"
/>
</svg> </svg>
</div> </div>
{/* Hash Output Block */} {/* Hash Output Block */}
<div <div className={`${styles.hashContainer} ${isFound ? styles.hashContainerSuccess : styles.hashContainerError}`}>
className={`${styles.hashContainer} ${isFound ? styles.hashContainerSuccess : styles.hashContainerError}`}
>
<div className={styles.hashContent}> <div className={styles.hashContent}>
<div className={styles.hashText}> <div className={styles.hashText}>
<h2 className={styles.blockTitle}>4. Resulting Hash (SHA-256)</h2> <h2 className={styles.blockTitle}>4. Resulting Hash (SHA-256)</h2>
@@ -267,30 +193,18 @@ export default function App() {
{/* Mining Controls */} {/* Mining Controls */}
<div className={styles.buttonContainer}> <div className={styles.buttonContainer}>
{!isMining ? ( {!isMining ? (
<button <button onClick={handleMineClick} className={`${styles.button} ${styles.buttonCyan}`}>
onClick={handleMineClick}
className={`${styles.button} ${styles.buttonCyan}`}
>
Auto-Mine Auto-Mine
</button> </button>
) : ( ) : (
<button <button onClick={handleStopClick} className={`${styles.button} ${styles.buttonYellow}`}>
onClick={handleStopClick}
className={`${styles.button} ${styles.buttonYellow}`}
>
Stop Mining Stop Mining
</button> </button>
)} )}
<button <button onClick={handleNewChallengeClick} className={`${styles.button} ${styles.buttonIndigo}`}>
onClick={handleNewChallengeClick}
className={`${styles.button} ${styles.buttonIndigo}`}
>
New Challenge New Challenge
</button> </button>
<button <button onClick={handleResetClick} className={`${styles.button} ${styles.buttonGray}`}>
onClick={handleResetClick}
className={`${styles.button} ${styles.buttonGray}`}
>
Reset Nonce Reset Nonce
</button> </button>
</div> </div>

View File

@@ -48,9 +48,7 @@
background-color: rgb(31 41 55); background-color: rgb(31 41 55);
padding: 1.5rem; padding: 1.5rem;
border-radius: 0.5rem; border-radius: 0.5rem;
box-shadow: box-shadow: 0 10px 15px -3px rgb(0 0 0 / 0.1), 0 4px 6px -4px rgb(0 0 0 / 0.1);
0 10px 15px -3px rgb(0 0 0 / 0.1),
0 4px 6px -4px rgb(0 0 0 / 0.1);
height: 100%; height: 100%;
display: flex; display: flex;
flex-direction: column; flex-direction: column;
@@ -160,9 +158,7 @@
.hashContainer { .hashContainer {
padding: 1.5rem; padding: 1.5rem;
border-radius: 0.5rem; border-radius: 0.5rem;
box-shadow: box-shadow: 0 10px 15px -3px rgb(0 0 0 / 0.1), 0 4px 6px -4px rgb(0 0 0 / 0.1);
0 10px 15px -3px rgb(0 0 0 / 0.1),
0 4px 6px -4px rgb(0 0 0 / 0.1);
transition: all 300ms; transition: all 300ms;
border: 2px solid; border: 2px solid;
} }

View File

@@ -12,42 +12,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased] ## [Unreleased]
<!-- This changes the project to: --> <!-- This changes the project to: -->
- Fix CEL internal errors when iterating `headers`/`query` map wrappers by implementing map iterators for `HTTPHeaders` and `URLValues` ([#1465](https://github.com/TecharoHQ/anubis/pull/1465)).
## v1.25.0: Necron
Hey all,
I'm sure you've all been aware that things have been slowing down a little with Anubis development, and I want to apologize for that. A lot has been going on in my life lately (my blog will have a post out on Friday with more information), and as a result I haven't really had the energy to work on Anubis in publicly visible ways. There are things going on behind the scenes, but nothing is really shippable yet, sorry!
I've also been feeling some burnout in the wake of perennial waves of anger directed towards me. I'm handling it, I'll be fine, I've just had a lot going on in my life and it's been rough.
I've been missing the sense of wanderlust and discovery that comes with the artistic way I playfully develop software. I suspect that some of the stresses I've been through (setting up a complicated surgery in a country whose language you aren't fluent in is kind of an experience) have been sapping my energy. I'm gonna try to mess with things on my break, but realistically I'm probably just gonna be either watching Stargate SG-1 or doing unreasonable amounts of ocean fishing in Final Fantasy 14. Normally I'd love to keep the details about my medical state fairly private, but I'm more of a public figure now than I was this time last year so I don't really get the invisibility I'm used to for this.
I've also had a fair amount of negativity directed at me for simply being much more visible than the anonymous threat actors running the scrapers that are ruining everything, which, though understandable, has not helped.
Anyways, it all worked out and I'm about to be in the hospital for a week, so if things go really badly with this release please downgrade to the last version and/or upgrade to the main branch when the fix PR is inevitably merged. I hoped to have time to tame GPG and set up full release automation in the Anubis repo, but that didn't work out this time and that's okay.
If I can challenge you all to do something, go out there and try to actually create something new somehow. Combine ideas you've never mixed before. Be creative, be human, make something purely for yourself to scratch an itch that you've always had yet never gotten around to actually mending.
At the very least, try to be an example of how you want other people to act, even when you're in a situation where software written by someone else is configured to require a user agent to execute javascript to access a webpage.
Be well,
Xe
PS: if you're well-versed in FFXIV lore, the release title should give you an idea of the kind of stuff I've been going through mentally.
- Add iplist2rule tool that lets admins turn an IP address blocklist into an Anubis ruleset.
- Add Polish locale ([#1292](https://github.com/TecharoHQ/anubis/pull/1309))
- Fix honeypot and imprint links missing `BASE_PREFIX` when deployed behind a path prefix ([#1402](https://github.com/TecharoHQ/anubis/issues/1402))
- Add ANEXIA Sponsor logo to docs ([#1409](https://github.com/TecharoHQ/anubis/pull/1409))
- Improve idle performance in memory storage
- Add HAProxy Configurations to Docs ([#1424](https://github.com/TecharoHQ/anubis/pull/1424))
## v1.24.0: Y'shtola Rhul
Anubis is back and better than ever! Lots of minor fixes with some big ones interspersed.
- Fix panic when validating challenges after privacy-mode browsers strip headers and the follow-up request matches an `ALLOW` threshold. - Fix panic when validating challenges after privacy-mode browsers strip headers and the follow-up request matches an `ALLOW` threshold.
- Expose WEIGHT rule matches as Prometheus metrics. - Expose WEIGHT rule matches as Prometheus metrics.
@@ -57,99 +21,6 @@ Anubis is back and better than ever! Lots of minor fixes with some big ones inte
- Allow Renovate as an OCI registry client. - Allow Renovate as an OCI registry client.
- Properly handle 4in6 addresses so that IP matching works with those addresses. - Properly handle 4in6 addresses so that IP matching works with those addresses.
- Add support for simple Valkey/Redis cluster mode - Add support for simple Valkey/Redis cluster mode
- Open Graph passthrough now reuses the configured target Host/SNI/TLS settings, so metadata fetches succeed when the upstream certificate differs from the public domain. ([1283](https://github.com/TecharoHQ/anubis/pull/1283))
- Stabilize the CVE-2025-24369 regression test by always submitting an invalid proof instead of relying on random POW failures.
- Refine the check that ensures the presence of the Accept header to avoid breaking docker clients.
- Removed rules intended to reward actual browsers due to abuse in the wild.
### Dataset poisoning
Anubis has the ability to engage in [dataset poisoning attacks](https://www.anthropic.com/research/small-samples-poison) using the [dataset poisoning subsystem](./admin/honeypot/overview.mdx). This allows every Anubis instance to be a honeypot to attract and flag abusive scrapers so that no administrator action is required to ban them.
There is much more information about this feature in [the dataset poisoning subsystem documentation](./admin/honeypot/overview.mdx). Administrators that are interested in learning how this feature works should consult that documentation.
### Deprecate `report_as` in challenge configuration
Previously Anubis let you lie to users about the difficulty of a challenge to interfere with operators of malicious scrapers as a psychological attack:
```yaml
bots:
# Punish any bot with "bot" in the user-agent string
# This is known to have a high false-positive rate, use at your own risk
- name: generic-bot-catchall
user_agent_regex: (?i:bot|crawler)
action: CHALLENGE
challenge:
difficulty: 16 # impossible
report_as: 4 # lie to the operator
algorithm: slow # intentionally waste CPU cycles and time
```
This has turned out to be a bad idea because it has caused massive user experience problems and has been removed. If you are using this setting, you will get a warning in your logs like this:
```json
{
"time": "2025-11-25T23:10:31.092201549-05:00",
"level": "WARN",
"source": {
"function": "github.com/TecharoHQ/anubis/lib/policy.ParseConfig",
"file": "/home/xe/code/TecharoHQ/anubis/lib/policy/policy.go",
"line": 201
},
"msg": "use of deprecated report_as setting detected, please remove this from your policy file when possible",
"at": "config-validate",
"name": "mild-suspicion"
}
```
To remove this warning, remove this setting from your policy file.
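For instance, a minimal sketch of the earlier example with the deprecated key dropped (everything else unchanged) looks like this:
```yaml
bots:
  # Same catchall rule as above, minus the deprecated report_as key
  - name: generic-bot-catchall
    user_agent_regex: (?i:bot|crawler)
    action: CHALLENGE
    challenge:
      difficulty: 16 # impossible
      algorithm: slow # intentionally waste CPU cycles and time
```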
### Logging customization
Anubis now supports the ability to log to multiple backends ("sinks"). This allows you to have Anubis [log to a file](./admin/policies.mdx#file-sink) instead of just logging to standard out. You can also customize the [logging level](./admin/policies.mdx#log-levels) in the policy file:
```yaml
logging:
level: "warn" # much less verbose logging
sink: file # log to a file
parameters:
file: "./var/anubis.log"
maxBackups: 3 # keep at least 3 old copies
maxBytes: 67108864 # each file can have up to 64 Mi of logs
maxAge: 7 # rotate files out every n days
oldFileTimeFormat: 2006-01-02T15-04-05 # RFC 3339-ish
compress: true # gzip-compress old log files
useLocalTime: false # timezone for rotated files is UTC
```
Additionally, information about [how Anubis uses each logging level](./admin/policies.mdx#log-levels) has been added to the documentation.
### DNS Features
- CEL expressions for:
- FCrDNS checks
- Forward DNS queries
- Reverse DNS queries
- `arpaReverseIP` to transform IPv4/6 addresses into ARPA reverse IP notation.
- `regexSafe` to escape regex special characters (useful for including `remoteAddress` or headers in regular expressions).
- DNS cache and other optimizations to minimize unnecessary DNS queries.
The DNS cache TTL can be changed in the bots config like this:
```yaml
dns_ttl:
forward: 600
reverse: 600
```
The default value for both forward and reverse queries is 300 seconds.
The `verifyFCrDNS` CEL function has two overloads:
- `(addr)`
Simply verifies that the remote side has PTR records pointing to the target address.
- `(addr, ptrPattern)`
Verifies that the remote side refers to a specific domain and that this domain points to the target IP.
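As a minimal sketch (mirroring the expression documentation), the one-argument form can back a rule that rejects clients failing FCrDNS:
```yaml
bots:
  # Deny clients whose PTR records do not resolve back to their address
  - name: require-fcrdns
    action: DENY
    expression: "!verifyFCrDNS(remoteAddress)"
```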
## v1.23.1: Lyse Hext - Echo 1 ## v1.23.1: Lyse Hext - Echo 1

View File

@@ -51,8 +51,9 @@ If you are using Kubernetes, you will need to create an image pull secret:
kubectl create secret docker-registry \ kubectl create secret docker-registry \
techarohq-botstopper \ techarohq-botstopper \
--docker-server ghcr.io \ --docker-server ghcr.io \
--docker-username any-username \ --docker-username your-username \
--docker-password <your-access-token> \ --docker-password your-access-token \
--docker-email your@email.address
``` ```
Then attach it to your Deployment: Then attach it to your Deployment:
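A minimal sketch of that attachment (standard Kubernetes fields; the secret name matches the command above):
```yaml
spec:
  template:
    spec:
      imagePullSecrets:
        # Must match the name given to `kubectl create secret`
        - name: techarohq-botstopper
```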
@@ -84,7 +85,7 @@ Follow [the upstream Docker compose directions](https://anubis.techaro.lol/docs/
OG_EXPIRY_TIME: "24h" OG_EXPIRY_TIME: "24h"
+ # botstopper config here + # botstopper config here
+ CHALLENGE_TITLE: "Doing math for your connection!" + CHALLENGE_TITLE: "Doing math for your connection!"
+ ERROR_TITLE: "Something went wrong!" + ERROR_TITLE: "Something went wrong!"
+ OVERLAY_FOLDER: /assets + OVERLAY_FOLDER: /assets
+ volumes: + volumes:

View File

@@ -12,6 +12,7 @@ To use it in your Anubis configuration:
action: CHALLENGE action: CHALLENGE
challenge: challenge:
difficulty: 1 # Number of seconds to wait before refreshing the page difficulty: 1 # Number of seconds to wait before refreshing the page
report_as: 4 # Unused by this challenge method
algorithm: metarefresh # Specify a non-JS challenge method algorithm: metarefresh # Specify a non-JS challenge method
``` ```

View File

@@ -12,6 +12,7 @@ To use it in your Anubis configuration:
action: CHALLENGE action: CHALLENGE
challenge: challenge:
difficulty: 1 # Number of seconds to wait before refreshing the page difficulty: 1 # Number of seconds to wait before refreshing the page
report_as: 4 # Unused by this challenge method
algorithm: preact algorithm: preact
``` ```

View File

@@ -233,27 +233,6 @@ This is best applied when doing explicit block rules, eg:
It seems counter-intuitive to allow known bad clients through sometimes, but this allows you to confuse attackers by making Anubis' behavior random. Adjust the thresholds and numbers as facts and circumstances demand. It seems counter-intuitive to allow known bad clients through sometimes, but this allows you to confuse attackers by making Anubis' behavior random. Adjust the thresholds and numbers as facts and circumstances demand.
### `regexSafe`
Available in `bot` expressions.
```ts
function regexSafe(input: string): string;
```
`regexSafe` takes a string and escapes it for safe use inside of a regular expression. This is useful when you are creating regular expressions from headers or variables such as `remoteAddress`.
| Input | Output |
| :------------------------- | :-------------- |
| `regexSafe("1.2.3.4")` | `1\\.2\\.3\\.4` |
| `regexSafe("techaro.lol")` | `techaro\\.lol` |
| `regexSafe("star*")` | `star\\*` |
| `regexSafe("plus+")` | `plus\\+` |
| `regexSafe("{braces}")` | `\\{braces\\}` |
| `regexSafe("start^")` | `start\\^` |
| `regexSafe("back\\slash")` | `back\\\\slash` |
| `regexSafe("dash-dash")` | `dash\\-dash` |
### `segments` ### `segments`
Available in `bot` expressions. Available in `bot` expressions.
@@ -287,99 +266,6 @@ This is useful if you want to write rules that allow requests that have no query
- size(segments(path)) < 2 - size(segments(path)) < 2
``` ```
### DNS Functions
Anubis can also perform DNS lookups as a part of its expression evaluation. This can be useful for doing things like checking for a valid [Forward-confirmed reverse DNS (FCrDNS)](https://en.wikipedia.org/wiki/Forward-confirmed_reverse_DNS) record.
#### `arpaReverseIP`
Available in `bot` expressions.
```ts
function arpaReverseIP(ip: string): string;
```
`arpaReverseIP` takes an IP address and returns its value in [ARPA notation](https://www.ietf.org/rfc/rfc2317.html). This can be useful when matching PTR record patterns.
| Input | Output |
| :----------------------------- | :---------------------------------------------------------------- |
| `arpaReverseIP("1.2.3.4")` | `4.3.2.1` |
| `arpaReverseIP("2001:db8::1")` | `1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.8.b.d.0.1.0.0.2` |
#### `lookupHost`
Available in `bot` expressions.
```ts
function lookupHost(host: string): string[];
```
`lookupHost` performs a DNS lookup for the given hostname and returns a list of IP addresses.
```yaml
- name: cloudflare-ip-in-host-header
action: DENY
expression: '"104.16.0.0" in lookupHost(headers["Host"])'
```
#### `reverseDNS`
Available in `bot` expressions.
```ts
function reverseDNS(ip: string): string[];
```
`reverseDNS` takes an IP address and returns the DNS names associated with it. This is useful when you want to check PTR records of an IP address.
```yaml
- name: allow-googlebot
action: ALLOW
expression: 'reverseDNS(remoteAddress).endsWith(".googlebot.com")'
```
:::warning
Do not use this for validating the legitimacy of an IP address. It is possible for DNS records to be out of date or otherwise manipulated. Use [`verifyFCrDNS`](#verifyfcrdns) instead for a more reliable result.
:::
#### `verifyFCrDNS`
Available in `bot` expressions.
```ts
function verifyFCrDNS(ip: string): bool;
function verifyFCrDNS(ip: string, pattern: string): bool;
```
`verifyFCrDNS` checks if the reverse DNS of an IP address matches its forward DNS. This is a common technique to filter out spam and bot traffic. `verifyFCrDNS` comes in two forms:
- `verifyFCrDNS(remoteAddress)` will check that the reverse DNS of the remote address resolves back to the remote address. If there are no PTR records, it returns true.
- `verifyFCrDNS(remoteAddress, pattern)` will check that the reverse DNS of the remote address matches the pattern and that the name resolves back to the remote address.
This is best used in rules like this:
```yaml
- name: require-fcrdns-for-post
action: DENY
expression:
all:
- method == "POST"
- "!verifyFCrDNS(remoteAddress)"
```
Here is another example that allows requests from Telegram:
```yaml
- name: telegrambot
action: ALLOW
expression:
all:
- userAgent.matches("TelegramBot")
- verifyFCrDNS(remoteAddress, "ptr\\.telegram\\.org$")
```
## Life advice ## Life advice
Expressions are very powerful. This is a benefit and a burden. If you are not careful with your expression targeting, you will be liable to get yourself into trouble. If you are at all in doubt, throw a `CHALLENGE` over a `DENY`. Legitimate users can easily work around a `CHALLENGE` result with a [proof of work challenge](../../design/why-proof-of-work.mdx). Bots are less likely to be able to do this. Expressions are very powerful. This is a benefit and a burden. If you are not careful with your expression targeting, you will be liable to get yourself into trouble. If you are at all in doubt, throw a `CHALLENGE` over a `DENY`. Legitimate users can easily work around a `CHALLENGE` result with a [proof of work challenge](../../design/why-proof-of-work.mdx). Bots are less likely to be able to do this.

View File

@@ -13,8 +13,6 @@ bots:
- # This correlates to data/bots/ai-catchall.yaml in the source tree - # This correlates to data/bots/ai-catchall.yaml in the source tree
import: (data)/bots/ai-catchall.yaml import: (data)/bots/ai-catchall.yaml
- import: (data)/bots/cloudflare-workers.yaml - import: (data)/bots/cloudflare-workers.yaml
# Import all the rules in the default configuration
- import: (data)/meta/default-config.yaml
``` ```
Of note, a bot rule can either have inline bot configuration or import a bot config snippet. You cannot do both in a single bot rule. Of note, a bot rule can either have inline bot configuration or import a bot config snippet. You cannot do both in a single bot rule.
@@ -37,33 +35,6 @@ config.BotOrImport: rule definition is invalid, you must set either bot rules or
Paths can either be prefixed with `(data)` to import from [the data folder in the Anubis source tree](https://github.com/TecharoHQ/anubis/tree/main/data) or from anywhere on the filesystem. If you don't have access to the Anubis source tree, check /usr/share/docs/anubis/data or the tarball you extracted Anubis from. Paths can either be prefixed with `(data)` to import from [the data folder in the Anubis source tree](https://github.com/TecharoHQ/anubis/tree/main/data) or from anywhere on the filesystem. If you don't have access to the Anubis source tree, check /usr/share/docs/anubis/data or the tarball you extracted Anubis from.
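For example, both styles can be mixed in a single policy file (the filesystem path below is illustrative):
```yaml
bots:
  # Import a snippet bundled with the Anubis source tree
  - import: (data)/crawlers/mojeekbot.yaml
  # Import a snippet from anywhere on the local filesystem
  - import: /etc/anubis/my-rules.yaml
```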
## Importing the default configuration
If you want to base your configuration off of the default configuration, import `(data)/meta/default-config.yaml`:
```yaml
bots:
- import: (data)/meta/default-config.yaml
# Write your rules here
```
This will keep your configuration up to date as Anubis adapts to emerging threats.
## How do I exempt most modern browsers from Anubis challenges?
If you want to exempt most modern browsers from Anubis challenges, import `(data)/common/acts-like-browser.yaml`:
```yaml
bots:
- import: (data)/meta/default-config.yaml
- import: (data)/common/acts-like-browser.yaml
# Write your rules here
```
These rules will allow traffic that "looks like" it's from a modern copy of Edge, Safari, Chrome, or Firefox. These rules used to be enabled by default; however, user reports have suggested that AI scraper bots have adapted to conform to these rules so that they can scrape without regard for the infrastructure they are attacking.
Use these rules at your own risk.
## Importing from imports ## Importing from imports
You can also import from an imported file in case you want to import an entire folder of rules at once. You can also import from an imported file in case you want to import an entire folder of rules at once.
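A minimal sketch, assuming an illustrative local file that itself contains nothing but imports:
```yaml
# Top-level policy file
bots:
  - import: /etc/anubis/local-rules.yaml # illustrative path

# /etc/anubis/local-rules.yaml: only further imports, so a single
# import statement pulls in the whole set at once.
- import: (data)/crawlers/marginalia.yaml
- import: (data)/crawlers/commoncrawl.yaml
```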

View File

@@ -156,68 +156,3 @@ server {
``` ```
</details> </details>
## Caddy
Anubis can be used with the [`forward_auth`](https://caddyserver.com/docs/caddyfile/directives/forward_auth) directive in Caddy.
First, the `TARGET` environment variable in Anubis must be set to a space, e.g.:
<Tabs>
<TabItem value="env-file" label="Environment file" default>
```shell
# anubis.env
TARGET=" "
# ...
```
</TabItem>
<TabItem value="docker-compose" label="Docker Compose">
```yaml
services:
anubis-caddy:
image: ghcr.io/techarohq/anubis:latest
environment:
TARGET: " "
# ...
```
</TabItem>
<TabItem value="k8s" label="Kubernetes">
Inside your Deployment, StatefulSet, or Pod:
```yaml
- name: anubis
image: ghcr.io/techarohq/anubis:latest
env:
- name: TARGET
value: " "
# ...
```
</TabItem>
</Tabs>
Then configure the necessary directives in your site block:
```caddy
route {
# Assumption: Anubis is running in the same network namespace as
# caddy on localhost TCP port 8923
reverse_proxy /.within.website/* 127.0.0.1:8923
forward_auth 127.0.0.1:8923 {
uri /.within.website/x/cmd/anubis/api/check
trusted_proxies private_ranges
@unauthorized status 401
handle_response @unauthorized {
redir * /.within.website/?redir={uri} 307
}
}
}
```
If you want to use this for multiple sites, you can create a [snippet](https://caddyserver.com/docs/caddyfile/concepts#snippets) and import it in multiple site blocks.

View File

@@ -41,6 +41,7 @@ thresholds:
challenge: challenge:
algorithm: metarefresh algorithm: metarefresh
difficulty: 1 difficulty: 1
report_as: 1
- name: moderate-suspicion - name: moderate-suspicion
expression: expression:
@@ -51,6 +52,7 @@ thresholds:
challenge: challenge:
algorithm: fast algorithm: fast
difficulty: 2 difficulty: 2
report_as: 2
- name: extreme-suspicion - name: extreme-suspicion
expression: weight >= 20 expression: weight >= 20
@@ -58,6 +60,7 @@ thresholds:
challenge: challenge:
algorithm: fast algorithm: fast
difficulty: 4 difficulty: 4
report_as: 4
``` ```
This defines a suite of 4 thresholds: This defines a suite of 4 thresholds:
@@ -127,6 +130,7 @@ action: CHALLENGE
challenge: challenge:
algorithm: metarefresh algorithm: metarefresh
difficulty: 1 difficulty: 1
report_as: 1
``` ```
</td> </td>

View File

@@ -92,11 +92,6 @@ Assuming you are protecting `anubistest.techaro.lol`, you need the following ser
DocumentRoot /var/www/anubistest.techaro.lol DocumentRoot /var/www/anubistest.techaro.lol
ErrorLog /var/log/httpd/anubistest.techaro.lol_error.log ErrorLog /var/log/httpd/anubistest.techaro.lol_error.log
CustomLog /var/log/httpd/anubistest.techaro.lol_access.log combined CustomLog /var/log/httpd/anubistest.techaro.lol_access.log combined
# Pass the remote IP to the proxied application instead of 127.0.0.1
# This requires mod_remoteip
RemoteIPHeader X-Real-IP
RemoteIPTrustedProxy 127.0.0.1/32
</VirtualHost> </VirtualHost>
``` ```

View File

@@ -1,99 +0,0 @@
# HAProxy
import CodeBlock from "@theme/CodeBlock";
To use Anubis with HAProxy, you have two variants:
- simple - stick Anubis between HAProxy and your application backend
  - perfect if you only have a single application in general
- advanced - force the Anubis challenge by default and let HAProxy route to the application backend once the challenge has been passed
  - useful for complex setups
  - routing can be done in HAProxy
  - define ACLs in HAProxy for domains, paths, etc. that require or are exempt from Anubis
  - HAProxy 3.0 recommended
## Simple Variant
```mermaid
---
title: HAProxy with simple config
---
flowchart LR
T(User Traffic)
HAProxy(HAProxy Port 80/443)
Anubis
Application
T --> HAProxy
HAProxy --> Anubis
Anubis --> |Happy Traffic| Application
```
Your Anubis env file configuration may look like this:
import simpleAnubis from "!!raw-loader!./haproxy/simple-config.env";
<CodeBlock language="bash">{simpleAnubis}</CodeBlock>
The important part is that `TARGET` points to your actual application. If Anubis and HAProxy are on the same machine, a UNIX socket can be used between them.
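As a hedged Docker Compose sketch of the same layout (service and image names are illustrative; the variable names match the sample env file):
```yaml
services:
  anubis:
    image: ghcr.io/techarohq/anubis:latest
    environment:
      BIND: /run/anubis/default.sock # HAProxy connects to this socket
      BIND_NETWORK: unix
      TARGET: http://myapp:3000 # the actual application
    volumes:
      - /run/anubis:/run/anubis
  myapp:
    image: example/myapp:latest # illustrative application image
```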
Your frontend and backend configuration of HAProxy may look like the following:
import simpleHAProxy from "!!raw-loader!./haproxy/simple-haproxy.cfg";
<CodeBlock language="bash">{simpleHAProxy}</CodeBlock>
This simply enables SSL offloading, sets some useful and required headers and routes to Anubis directly.
## Advanced Variant
Because HAProxy can decode JWTs, we are able to verify the Anubis token directly in HAProxy and route the traffic to the specific backends ourselves.
In this example, three applications sit behind one HAProxy frontend. Only App1 and App2 are secured via Anubis; App3 is open for everyone. The path `/excluded/path` can also be accessed by anyone.
```mermaid
---
title: HAProxy with advanced config
---
flowchart LR
T(User Traffic)
HAProxy(HAProxy Port 80/443)
B1(App1)
B2(App2)
B3(App3)
Anubis
T --> HAProxy
HAProxy --> |Traffic for App1 and App2 without valid challenge| Anubis
HAProxy --> |app1.example.com| B1
HAProxy --> |app2.example.com| B2
HAProxy --> |app3.example.com| B3
```
:::note
For improved JWT decoding performance, it's recommended to use HAProxy version 3.0 or above.
:::
Your Anubis env file configuration may look like this:
import advancedAnubis from "!!raw-loader!./haproxy/advanced-config.env";
<CodeBlock language="bash">{advancedAnubis}</CodeBlock>
It's important to use `HS512_SECRET`, which HAProxy understands. Please replace `<SECRET-HERE>` with your own secret string (an alphanumeric string of 128 characters is recommended).
You can set Anubis to force a challenge for every request using the following policy file:
import advancedAnubisPolicy from "!!raw-loader!./haproxy/advanced-config-policy.yml";
<CodeBlock language="yaml">{advancedAnubisPolicy}</CodeBlock>
The HAProxy config file may look like this:
import advancedHAProxy from "!!raw-loader!./haproxy/advanced-haproxy.cfg";
<CodeBlock language="haproxy">{advancedHAProxy}</CodeBlock>
Please replace `<SECRET-HERE>` with the same secret from the Anubis config.

View File

@@ -1,15 +0,0 @@
# /etc/anubis/challenge-any.yml
bots:
- name: any
action: CHALLENGE
user_agent_regex: .*
status_codes:
CHALLENGE: 403
DENY: 403
thresholds: []
dnsbl: false

View File

@@ -1,11 +0,0 @@
# /etc/anubis/default.env
BIND=/run/anubis/default.sock
BIND_NETWORK=unix
DIFFICULTY=4
METRICS_BIND=:9090
# target is irrelevant here, backend routing happens in HAProxy
TARGET=http://0.0.0.0
HS512_SECRET=<SECRET-HERE>
COOKIE_DYNAMIC_DOMAIN=True
POLICY_FNAME=/etc/anubis/challenge-any.yml

View File

@@ -1,59 +0,0 @@
# /etc/haproxy/haproxy.cfg
frontend FE-multiple-applications
mode http
bind :80
# ssl offloading on port 443 using a certificate from /etc/haproxy/ssl/ directory
bind :443 ssl crt /etc/haproxy/ssl/ alpn h2,http/1.1 ssl-min-ver TLSv1.2 no-tls-tickets
# set X-Real-IP header required for Anubis
http-request set-header X-Real-IP "%[src]"
# redirect HTTP to HTTPS
http-request redirect scheme https code 301 unless { ssl_fc }
# add HSTS header
http-response set-header Strict-Transport-Security "max-age=31536000; includeSubDomains; preload"
# only force Anubis challenge for app1 and app2
acl acl_anubis_required hdr(host) -i "app1.example.com"
acl acl_anubis_required hdr(host) -i "app2.example.com"
# exclude Anubis for a specific path
acl acl_anubis_ignore path /excluded/path
# use Anubis if auth cookie not found
use_backend BE-anubis if acl_anubis_required !acl_anubis_ignore !{ req.cook(techaro.lol-anubis-auth) -m found }
# get payload of the JWT such as algorithm, expire time, restrictions
http-request set-var(txn.anubis_jwt_alg) req.cook(techaro.lol-anubis-auth),jwt_header_query('$.alg') if acl_anubis_required !acl_anubis_ignore
http-request set-var(txn.anubis_jwt_exp) cook(techaro.lol-anubis-auth),jwt_payload_query('$.exp','int') if acl_anubis_required !acl_anubis_ignore
http-request set-var(txn.anubis_jwt_res) cook(techaro.lol-anubis-auth),jwt_payload_query('$.restriction') if acl_anubis_required !acl_anubis_ignore
http-request set-var(txn.srcip) req.fhdr(X-Real-IP) if acl_anubis_required !acl_anubis_ignore
http-request set-var(txn.now) date() if acl_anubis_required !acl_anubis_ignore
# use Anubis if JWT has wrong algorithm, is expired, restrictions don't match or isn't signed with the correct key
use_backend BE-anubis if acl_anubis_required !acl_anubis_ignore !{ var(txn.anubis_jwt_alg) -m str HS512 }
use_backend BE-anubis if acl_anubis_required !acl_anubis_ignore { var(txn.anubis_jwt_exp),sub(txn.now) -m int lt 0 }
use_backend BE-anubis if acl_anubis_required !acl_anubis_ignore !{ var(txn.srcip),digest(sha256),hex,lower,strcmp(txn.anubis_jwt_res) eq 0 }
use_backend BE-anubis if acl_anubis_required !acl_anubis_ignore !{ cook(techaro.lol-anubis-auth),jwt_verify(txn.anubis_jwt_alg,"<SECRET-HERE>") -m int 1 }
# custom routing in HAProxy
use_backend BE-app1 if { hdr(host) -i "app1.example.com" }
use_backend BE-app2 if { hdr(host) -i "app2.example.com" }
use_backend BE-app3 if { hdr(host) -i "app3.example.com" }
backend BE-app1
mode http
server app1-server 127.0.0.1:3000
backend BE-app2
mode http
server app2-server 127.0.0.1:4000
backend BE-app3
mode http
server app3-server 127.0.0.1:5000
backend BE-anubis
mode http
server anubis /run/anubis/default.sock

View File

@@ -1,10 +0,0 @@
# /etc/anubis/default.env
BIND=/run/anubis/default.sock
BIND_NETWORK=unix
SOCKET_MODE=0666
DIFFICULTY=4
METRICS_BIND=:9090
COOKIE_DYNAMIC_DOMAIN=true
# address and port of the actual application
TARGET=http://localhost:3000

View File

@@ -1,22 +0,0 @@
# /etc/haproxy/haproxy.cfg
frontend FE-application
mode http
bind :80
# ssl offloading on port 443 using a certificate from /etc/haproxy/ssl/ directory
bind :443 ssl crt /etc/haproxy/ssl/ alpn h2,http/1.1 ssl-min-ver TLSv1.2 no-tls-tickets
# set X-Real-IP header required for Anubis
http-request set-header X-Real-IP "%[src]"
# redirect HTTP to HTTPS
http-request redirect scheme https code 301 unless { ssl_fc }
# add HSTS header
http-response set-header Strict-Transport-Security "max-age=31536000; includeSubDomains; preload"
# route to Anubis backend by default
default_backend BE-anubis-application
backend BE-anubis-application
mode http
server anubis /run/anubis/default.sock

View File

@@ -1,9 +1,5 @@
# Kubernetes # Kubernetes
:::note
Leave the `PUBLIC_URL` environment variable unset in this sidecar/standalone setup. Setting it here makes redirect construction fail (`redir=null`).
:::
When setting up Anubis in Kubernetes, you want to make sure that you thread requests through Anubis kinda like this: When setting up Anubis in Kubernetes, you want to make sure that you thread requests through Anubis kinda like this:
```mermaid ```mermaid

View File

@@ -1,7 +1,5 @@
# Nginx # Nginx
import CodeBlock from "@theme/CodeBlock";
Anubis is intended to be a filter proxy. The way to integrate this with nginx is to break your configuration up into two parts: TLS termination and then HTTP routing. Consider this diagram: Anubis is intended to be a filter proxy. The way to integrate this with nginx is to break your configuration up into two parts: TLS termination and then HTTP routing. Consider this diagram:
```mermaid ```mermaid
@@ -38,26 +36,108 @@ These examples assume that you are using a setup where your nginx configuration
Assuming that we are protecting `anubistest.techaro.lol`, here's what the server configuration file would look like: Assuming that we are protecting `anubistest.techaro.lol`, here's what the server configuration file would look like:
import anubisTest from "!!raw-loader!./nginx/server-anubistest-techaro-lol.conf"; ```nginx
# /etc/nginx/conf.d/server-anubistest-techaro-lol.conf
<CodeBlock language="nginx">{anubisTest}</CodeBlock> # HTTP - Redirect all HTTP traffic to HTTPS
server {
listen 80;
listen [::]:80;
server_name anubistest.techaro.lol;
location / {
return 301 https://$host$request_uri;
}
}
# TLS termination server, this will listen over TLS (https) and then
# proxy all traffic to the target via Anubis.
server {
# Listen on TCP port 443 with TLS (https) and HTTP/2
listen 443 ssl;
listen [::]:443 ssl;
http2 on;
location / {
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Http-Version $server_protocol;
proxy_pass http://anubis;
}
server_name anubistest.techaro.lol;
ssl_certificate /path/to/your/certs/anubistest.techaro.lol.crt;
ssl_certificate_key /path/to/your/certs/anubistest.techaro.lol.key;
}
# Backend server, this is where your webapp should actually live.
server {
listen unix:/run/nginx/nginx.sock;
server_name anubistest.techaro.lol;
root "/srv/http/anubistest.techaro.lol";
index index.html;
# Get the visiting IP from the TLS termination server
set_real_ip_from unix:;
real_ip_header X-Real-IP;
# Your normal configuration can go here
# location .php { fastcgi...} etc.
}
```
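After dropping this file into `conf.d`, validate the configuration and reload nginx:

```bash
# check syntax, then reload without dropping connections
sudo nginx -t && sudo systemctl reload nginx
```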
:::tip :::tip
You can copy the `location /` block into a separate file named something like `conf-anubis.inc` and then include it inline to other `server` blocks: You can copy the `location /` block into a separate file named something like `conf-anubis.inc` and then include it inline to other `server` blocks:
import anubisInclude from "!!raw-loader!./nginx/conf-anubis.inc"; ```nginx
# /etc/nginx/conf.d/conf-anubis.inc
<CodeBlock language="nginx">{anubisInclude}</CodeBlock> # Forward to anubis
location / {
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_pass http://anubis;
}
```
Then in a server block: Then in a server block:
<details> <details>
<summary>Full nginx config</summary> <summary>Full nginx config</summary>
import mimiTecharoLol from "!!raw-loader!./nginx/server-mimi-techaro-lol.conf"; ```nginx
# /etc/nginx/conf.d/server-mimi-techaro-lol.conf
<CodeBlock language="nginx">{mimiTecharoLol}</CodeBlock> server {
# Listen on 443 with SSL
listen 443 ssl;
listen [::]:443 ssl;
http2 on;
# Slipstream via Anubis
include "conf-anubis.inc";
server_name mimi.techaro.lol;
ssl_certificate /path/to/your/certs/mimi.techaro.lol.crt;
ssl_certificate_key /path/to/your/certs/mimi.techaro.lol.key;
}
server {
listen unix:/run/nginx/nginx.sock;
server_name mimi.techaro.lol;
port_in_redirect off;
root "/srv/http/mimi.techaro.lol";
index index.html;
# Your normal configuration can go here
# location .php { fastcgi...} etc.
}
```
</details> </details>
@@ -65,9 +145,24 @@ import mimiTecharoLol from "!!raw-loader!./nginx/server-mimi-techaro-lol.conf";
Create an upstream for Anubis. Create an upstream for Anubis.
import anubisUpstream from "!!raw-loader!./nginx/upstream-anubis.conf"; ```nginx
# /etc/nginx/conf.d/upstream-anubis.conf
<CodeBlock language="nginx">{anubisUpstream}</CodeBlock> upstream anubis {
# Make sure this matches the values you set for `BIND` and `BIND_NETWORK`.
# If this does not match, your services will not be protected by Anubis.
# Try anubis first over a UNIX socket
server unix:/run/anubis/nginx.sock;
#server 127.0.0.1:8923;
# Optional: fall back to serving the websites directly. This allows your
# websites to be resilient against Anubis failing, at the risk of exposing
# them to the raw internet without protection. This is a tradeoff and can
# be worth it in some edge cases.
#server unix:/run/nginx.sock backup;
}
```
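The upstream only works if Anubis is actually listening at that path; here is a minimal sketch of a matching environment file, using a hypothetical `/etc/anubis/nginx.env` instance name:

```bash
# /etc/anubis/nginx.env (hypothetical instance; must match the upstream above)
BIND=/run/anubis/nginx.sock
BIND_NETWORK=unix
SOCKET_MODE=0666
# point TARGET at whatever your backend actually listens on
TARGET=http://localhost:3000
```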
This can be repeated for multiple sites. Anubis does not care about the HTTP `Host` header and will happily cope with multiple websites via the same instance. This can be repeated for multiple sites. Anubis does not care about the HTTP `Host` header and will happily cope with multiple websites via the same instance.

View File

@@ -1,2 +0,0 @@
# /etc/nginx/conf-anubis.inc
# Forward to anubis
location / {
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_pass http://anubis;
}

View File

@@ -1,50 +0,0 @@
# /etc/nginx/conf.d/server-anubistest-techaro-lol.conf
# HTTP - Redirect all HTTP traffic to HTTPS
server {
listen 80;
listen [::]:80;
server_name anubistest.techaro.lol;
location / {
return 301 https://$host$request_uri;
}
}
# TLS termination server, this will listen over TLS (https) and then
# proxy all traffic to the target via Anubis.
server {
# Listen on TCP port 443 with TLS (https) and HTTP/2
listen 443 ssl;
listen [::]:443 ssl;
http2 on;
location / {
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Http-Version $server_protocol;
proxy_pass http://anubis;
}
server_name anubistest.techaro.lol;
ssl_certificate /path/to/your/certs/anubistest.techaro.lol.crt;
ssl_certificate_key /path/to/your/certs/anubistest.techaro.lol.key;
}
# Backend server, this is where your webapp should actually live.
server {
listen unix:/run/nginx/nginx.sock;
server_name anubistest.techaro.lol;
root "/srv/http/anubistest.techaro.lol";
index index.html;
# Get the visiting IP from the TLS termination server
set_real_ip_from unix:;
real_ip_header X-Real-IP;
# Your normal configuration can go here
# location .php { fastcgi...} etc.
}

View File

@@ -1,29 +0,0 @@
# /etc/nginx/conf.d/server-mimi-techaro-lol.conf
server {
# Listen on 443 with SSL
listen 443 ssl;
listen [::]:443 ssl;
http2 on;
# Slipstream via Anubis
include "conf-anubis.inc";
server_name mimi.techaro.lol;
ssl_certificate /path/to/your/certs/mimi.techaro.lol.crt;
ssl_certificate_key /path/to/your/certs/mimi.techaro.lol.key;
}
server {
listen unix:/run/nginx/nginx.sock;
server_name mimi.techaro.lol;
port_in_redirect off;
root "/srv/http/mimi.techaro.lol";
index index.html;
# Your normal configuration can go here
# location .php { fastcgi...} etc.
}

View File

@@ -1,16 +0,0 @@
# /etc/nginx/conf.d/upstream-anubis.conf
upstream anubis {
# Make sure this matches the values you set for `BIND` and `BIND_NETWORK`.
# If this does not match, your services will not be protected by Anubis.
# Try anubis first over a UNIX socket
server unix:/run/anubis/nginx.sock;
#server 127.0.0.1:8923;
# Optional: fall back to serving the websites directly. This allows your
# websites to be resilient against Anubis failing, at the risk of exposing
# them to the raw internet without protection. This is a tradeoff and can
# be worth it in some edge cases.
#server unix:/run/nginx.sock backup;
}

View File

@@ -75,7 +75,7 @@ services:
# Telling Anubis, where to listen for Traefik # Telling Anubis, where to listen for Traefik
- BIND=:8080 - BIND=:8080
# Telling Anubis to do redirect — ensure there is a space after '=' # Telling Anubis to do redirect — ensure there is a space after '='
- "TARGET= " - 'TARGET= '
# Specifies which domains Anubis is allowed to redirect to. # Specifies which domains Anubis is allowed to redirect to.
- REDIRECT_DOMAINS=example.com - REDIRECT_DOMAINS=example.com
# Should be the full external URL for Anubis (including scheme) # Should be the full external URL for Anubis (including scheme)

View File

@@ -1,8 +0,0 @@
{
"label": "Honeypot",
"position": 40,
"link": {
"type": "generated-index",
"description": "Honeypot features in Anubis, allowing Anubis to passively detect malicious crawlers."
}
}

View File

@@ -1,40 +0,0 @@
---
title: Dataset poisoning
---
Anubis offers the ability to participate in [dataset poisoning](https://www.anthropic.com/research/small-samples-poison) attacks similar to what [iocaine](https://iocaine.madhouse-project.org/) and other similar tools offer. Currently this is in a preview state where a lot of details are hard-coded in order to test the viability of this approach.
In essence, when Anubis challenge and error pages are rendered, they include a small bit of HTML that browsers ignore but scrapers interpret as a link to ingest. This creates a small forest of recursive nothing pages designed according to the following principles:
- These pages are _cheap_ to render, rendering in at most ten milliseconds on decently specced hardware.
- These pages are _vacuous_, meaning that they essentially are devoid of content such that a human would find it odd and click away, but a scraper would not be able to know that and would continue through the forest.
- These pages are _fairly large_ so that scrapers don't think that the pages are error pages or are otherwise devoid of content.
- These pages are _fully self-contained_ so that they load fast without incurring additional load from resource fetches.
In this limited preview state, Anubis generates pages using [spintax](https://outboundly.ai/blogs/what-is-spintax-and-how-to-use-it/). Spintax is a syntax used to create different variants of utterances for marketing messages and email spam that evade word filtering. In its current form, Anubis' dataset poisoning uses AI-generated spintax that produces vapid LinkedIn posts with some western occultism thrown in for good measure. This results in utterances like the following:
> There's a moment when visionaries are being called to realize that the work can't be reduced to optimization, but about resonance. We don't transform products by grinding endlessly, we do it by holding the vision. Because meaning can't be forced, it unfolds over time when culture are in integrity. This moment represents a fundamental reimagining in how we think about work. This isn't a framework, it's a lived truth that requires courage. When we get honest, we activate nonlinear growth that don't show up in dashboards, but redefine success anyway.
To a human, it should be fairly transparent that this is pseudoprofound anti-content and a signal to click away.
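For illustration, spintax expands brace-delimited alternatives into one of the listed variants; a hypothetical sample (not Anubis' actual template) might look like this:

```txt
{There's a moment when|Sooner or later,} {visionaries|builders} are {called|destined} to realize that the work {can't be reduced to|is more than} {optimization|metrics}.
```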
## Plans
Future versions of this feature will allow for more customization. In the near future this will be configurable via the following mechanisms:
- WebAssembly logic for customizing how the poisoning data is generated (with examples including the existing spintax method).
- Weight thresholds and logic for how they are interpreted by Anubis.
- Other configuration settings as facts and circumstances dictate.
## Implementation notes
In its current implementation, the Anubis dataset poisoning feature has the following flaws that may hinder production deployments:
- All Anubis instances use the same method for generating dataset poisoning information. This may be easy for malicious actors to detect and ignore.
- Anubis dataset poisoning routes are under the `/.within.website/x/cmd/anubis` URL hierarchy. This may be easy for malicious actors to detect and ignore.
Right now Anubis assigns 30 weight points if the following criteria are met:
- A client's User-Agent has been observed in the dataset poisoning maze at least 25 times.
- The network-clamped IP address (/24 for IPv4 and /48 for IPv6) has been observed in the dataset poisoning maze at least 25 times.
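These weight points feed into Anubis' normal weight-threshold evaluation. As a minimal sketch, assuming the `thresholds` policy syntax with expressions over `weight`, a rule that challenges such clients might look roughly like this:

```yaml
thresholds:
  - name: poisoning-maze-repeat-visitor # hypothetical rule name
    expression: weight >= 30
    action: CHALLENGE
    challenge:
      algorithm: fast
      difficulty: 4
```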
Additionally, when any given client has been observed by both User-Agent and network-clamped IP address, Anubis will emit warning log lines so that administrative action can be taken, up to and including [filing abuse reports with the network owner](/blog/2025/file-abuse-reports).

View File

@@ -94,7 +94,7 @@ Anubis uses these environment variables for configuration:
| `OG_CACHE_CONSIDER_HOST` | `false` | If set to `true`, Anubis will consider the host in the Open Graph tag cache key. Prefer using [the policy file](./configuration/open-graph.mdx) to configure the Open Graph subsystem. | | `OG_CACHE_CONSIDER_HOST` | `false` | If set to `true`, Anubis will consider the host in the Open Graph tag cache key. Prefer using [the policy file](./configuration/open-graph.mdx) to configure the Open Graph subsystem. |
| `OVERLAY_FOLDER` | unset | <EO /> If set, treat the given path as an [overlay folder](./botstopper.mdx#custom-images-and-css), allowing you to customize CSS, fonts, images, and add other assets to BotStopper deployments. | | `OVERLAY_FOLDER` | unset | <EO /> If set, treat the given path as an [overlay folder](./botstopper.mdx#custom-images-and-css), allowing you to customize CSS, fonts, images, and add other assets to BotStopper deployments. |
| `POLICY_FNAME` | unset | The file containing [bot policy configuration](./policies.mdx). See the bot policy documentation for more details. If unset, the default bot policy configuration is used. | | `POLICY_FNAME` | unset | The file containing [bot policy configuration](./policies.mdx). See the bot policy documentation for more details. If unset, the default bot policy configuration is used. |
| `PUBLIC_URL` | unset | The externally accessible URL for this Anubis instance, used for constructing redirect URLs (e.g., for Traefik forwardAuth). Leave it unset when Anubis terminates traffic directly (sidecar/standalone deployments) or redirect building will fail with `redir=null`. | | `PUBLIC_URL` | unset | The externally accessible URL for this Anubis instance, used for constructing redirect URLs (e.g., for Traefik forwardAuth). |
| `REDIRECT_DOMAINS` | unset | Comma-separated list of domain names that Anubis should allow redirects to when passing a challenge. See [Redirect Domain Configuration](./configuration/redirect-domains) for more details. | | `REDIRECT_DOMAINS` | unset | Comma-separated list of domain names that Anubis should allow redirects to when passing a challenge. See [Redirect Domain Configuration](./configuration/redirect-domains) for more details. |
| `SERVE_ROBOTS_TXT` | `false` | If set `true`, Anubis will serve a default `robots.txt` file that disallows all known AI scrapers by name and then additionally disallows every scraper. This is useful if facts and circumstances make it difficult to change the underlying service to serve such a `robots.txt` file. | | `SERVE_ROBOTS_TXT` | `false` | If set `true`, Anubis will serve a default `robots.txt` file that disallows all known AI scrapers by name and then additionally disallows every scraper. This is useful if facts and circumstances make it difficult to change the underlying service to serve such a `robots.txt` file. |
| `SLOG_LEVEL` | `INFO` | The log level for structured logging. Valid values are `DEBUG`, `INFO`, `WARN`, and `ERROR`. Set to `DEBUG` to see all requests, evaluations, and detailed diagnostic information. | | `SLOG_LEVEL` | `INFO` | The log level for structured logging. Valid values are `DEBUG`, `INFO`, `WARN`, and `ERROR`. Set to `DEBUG` to see all requests, evaluations, and detailed diagnostic information. |
@@ -203,7 +203,6 @@ To get Anubis filtering your traffic, you need to make sure it's added to your H
- [Kubernetes](./environments/kubernetes.mdx) - [Kubernetes](./environments/kubernetes.mdx)
- [Nginx](./environments/nginx.mdx) - [Nginx](./environments/nginx.mdx)
- [Traefik](./environments/traefik.mdx) - [Traefik](./environments/traefik.mdx)
- [HAProxy](./environments/haproxy.mdx)
:::note :::note

View File

@@ -1,50 +0,0 @@
---
title: iplist2rule CLI tool
---
The `iplist2rule` tool converts IP blocklists into Anubis challenge policies. It reads common IP blocklist formats and generates the appropriate Anubis policy file for IP address filtering.
## Installation
Install directly with Go:
```bash
go install github.com/TecharoHQ/anubis/utils/cmd/iplist2rule@latest
```
## Usage
Basic conversion from URL:
```bash
iplist2rule https://raw.githubusercontent.com/7c/torfilter/refs/heads/main/lists/txt/torfilter-1m-flat.txt filter-tor.yaml
```
Explicitly allow every IP address on a list:
```bash
iplist2rule --action ALLOW https://raw.githubusercontent.com/7c/torfilter/refs/heads/main/lists/txt/torfilter-1m-flat.txt filter-tor.yaml
```
Add weight to requests matching IP addresses on a list:
```bash
iplist2rule --action WEIGH --weight 20 https://raw.githubusercontent.com/7c/torfilter/refs/heads/main/lists/txt/torfilter-1m-flat.txt filter-tor.yaml
```
## Options
| Flag | Description | Default |
| :------------ | :----------------------------------------------------------------------------------------------- | :-------------------------------- |
| `--action` | The Anubis action to take for the IP address in question, must be in ALL CAPS. | `DENY` (forbids traffic) |
| `--rule-name` | The name for the generated Anubis rule; should be in kebab-case. | (not set, inferred from filename) |
| `--weight` | When `--action=WEIGH`, the number of weight points to add to or remove from matching requests. | 0 (not set) |
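These flags compose; for example, denying traffic while naming the generated rule explicitly (`tor-exit-nodes` is an arbitrary example name; the list URL is reused from above):

```bash
iplist2rule --action DENY --rule-name tor-exit-nodes https://raw.githubusercontent.com/7c/torfilter/refs/heads/main/lists/txt/torfilter-1m-flat.txt filter-tor.yaml
```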
## Using the Generated Policy
Save the output and import it in your main policy file:
```yaml
bots:
- import: "./filter-tor.yaml"
```

View File

@@ -143,4 +143,3 @@ For more details on particular reverse proxies, see here:
- [Apache](./environments/apache.mdx) - [Apache](./environments/apache.mdx)
- [Nginx](./environments/nginx.mdx) - [Nginx](./environments/nginx.mdx)
- [HAProxy](./environments/haproxy.mdx)

View File

@@ -84,6 +84,7 @@ This rule has been known to have a high false positive rate in testing. Please u
action: CHALLENGE action: CHALLENGE
challenge: challenge:
difficulty: 16 # impossible difficulty: 16 # impossible
report_as: 4 # lie to the client about the difficulty
algorithm: slow # intentionally waste CPU cycles and time algorithm: slow # intentionally waste CPU cycles and time
``` ```
@@ -92,6 +93,7 @@ Challenges can be configured with these settings:
| Key | Example | Description | | Key | Example | Description |
| :----------- | :------- | :--------------------------------------------------------------------------------------------------------------------------------------------------------------- | | :----------- | :------- | :--------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `difficulty` | `4` | The challenge difficulty (number of leading zeros) for proof-of-work. See [Why does Anubis use Proof-of-Work?](/docs/design/why-proof-of-work) for more details. | | `difficulty` | `4` | The challenge difficulty (number of leading zeros) for proof-of-work. See [Why does Anubis use Proof-of-Work?](/docs/design/why-proof-of-work) for more details. |
| `report_as` | `4` | What difficulty the UI should report to the user. Useful for messing with industrial-scale scraping efforts. |
| `algorithm` | `"fast"` | The challenge method to use. See [the list of challenge methods](./configuration/challenges/) for more information. | | `algorithm` | `"fast"` | The challenge method to use. See [the list of challenge methods](./configuration/challenges/) for more information. |
### Remote IP based filtering ### Remote IP based filtering
@@ -224,8 +226,8 @@ Using this backend will cause a lot of S3 operations, at least one for creating
The `s3api` backend takes the following configuration options: The `s3api` backend takes the following configuration options:
| Name | Type | Example | Description | | Name | Type | Example | Description |
| :----------- | :------ | :------------ | :------------------------------------------------------------------------------------------------------------------------------------------ | | :----------- | :------ | :------------------------------------------------------------------- | :------------------------------------------------------------------------------------------------------------------------------------------ |
| `bucketName` | string | `anubis-data` | (Required) The name of the dedicated bucket for Anubis to store information in. | | `bucketName` | string | The name of the dedicated bucket for Anubis to store information in. |
| `pathStyle` | boolean | `false` | If true, use path-style S3 API operations. Please consult your storage provider's documentation if you don't know what you should put here. | | `pathStyle` | boolean | `false` | If true, use path-style S3 API operations. Please consult your storage provider's documentation if you don't know what you should put here. |
:::note :::note
@@ -277,7 +279,7 @@ store:
:::note :::note
You can also use [Redis](http://redis.io/) with Anubis. You can also use [Redis](http://redis.io/) with Anubis.
::: :::
@@ -289,17 +291,15 @@ This backend is ideal if you are running multiple instances of Anubis in a worke
| Does your service get a lot of traffic? | ✅ Yes | | Does your service get a lot of traffic? | ✅ Yes |
| Do you want to store data persistently when Anubis restarts? | ✅ Yes | | Do you want to store data persistently when Anubis restarts? | ✅ Yes |
| Do you run Anubis without mutable filesystem storage? | ✅ Yes | | Do you run Anubis without mutable filesystem storage? | ✅ Yes |
| Do you have Redis or Valkey installed? | ✅ Yes | | Do you have Redis or Valkey installed? | ✅ Yes |
#### Configuration #### Configuration
The `valkey` backend takes the following configuration options: The `valkey` backend takes the following configuration options:
| Name | Type | Example | Description | | Name | Type | Example | Description |
| :--------- | :----- | :---------------------- | :------------------------------------------------------------------------------------------------------------------------------------------------ | | :---- | :----- | :---------------------- | :----------------------------------------------------------------------------------------------------------------------------------------------- |
| `cluster` | bool | `false` | If true, use [Redis™ Clustering](https://redis.io/topics/cluster-spec) for storing Anubis data. | | `url` | string | `redis://valkey:6379/0` | The URL for the instance of Redis or Valkey that Anubis should store data in. This is in the same format as `REDIS_URL` in many cloud providers. |
| `sentinel` | object | `{}` | See [Redis™ Sentinel docs](#redis-sentinel) for more detail and examples |
| `url` | string | `redis://valkey:6379/0` | The URL for the instance of Redis™ or Valkey that Anubis should store data in. This is in the same format as `REDIS_URL` in many cloud providers. |
Example: Example:
@@ -314,96 +314,6 @@ store:
This would have the Valkey client connect to host `valkey.int.techaro.lol` on port `6379` with database `0` (the default database). This would have the Valkey client connect to host `valkey.int.techaro.lol` on port `6379` with database `0` (the default database).
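The example block itself falls outside this diff's context; given the sentence above and the `store` layout used elsewhere in these docs, it presumably looks something like this:

```yaml
store:
  backend: valkey
  parameters:
    url: "redis://valkey.int.techaro.lol:6379/0"
```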
#### Redis™ Sentinel
If you are using [Redis™ Sentinel](https://redis.io/docs/latest/operate/oss_and_stack/management/sentinel/) for a high availability setup, you need to configure the `sentinel` object. This object takes the following configuration options:
| Name | Type | Example | Description |
| :----------- | :----------------------- | :-------------------- | :-------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `addr` | string or list of string | `10.43.208.130:26379` | (Required) The host and port of the Redis™ Sentinel server. When possible, use DNS names for this. If you have multiple addresses, supply a list of them. |
| `clientName` | string | `Anubis` | The client name reported to Redis™ Sentinel. Set this if you want to track Anubis connections to your Redis™ Sentinel. |
| `masterName` | string | `mymaster` | (Required) The name of the master in the Redis™ Sentinel configuration. This is used to discover where to find client connection hosts/ports. |
| `username` | string | `azurediamond` | The username used to authenticate against the Redis™ Sentinel and Redis™ servers. |
| `password` | string | `hunter2` | The password used to authenticate against the Redis™ Sentinel and Redis™ servers. |
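Putting those options together, here is a hedged sketch of a Sentinel-backed store configuration, assuming `sentinel` nests under the backend's `parameters` and reusing the example values from the table:

```yaml
store:
  backend: valkey
  parameters:
    sentinel:
      addr:
        - "10.43.208.130:26379"
      masterName: mymaster
      clientName: Anubis
```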
## Logging management
Anubis has very verbose logging out of the box. This is intentional and allows administrators to be sure that it is working merely by watching it work in real time. Some administrators may not appreciate this level of logging out of the box. As such, Anubis lets you customize details about how it logs data.
Anubis uses a practice called [structured logging](https://stackify.com/what-is-structured-logging-and-why-developers-need-it/) to emit log messages with key-value pair context. In order to make analyzing large amounts of log messages easier, Anubis encodes all logs in JSON. This allows you to use any tool that can parse JSON to perform analytics or monitor for issues.
Anubis exposes the following logging settings in the policy file:
| Name | Type | Example | Description |
| :----------- | :----------------------- | :-------------- | :--------------------------------------------------------------------------------------------------------------------------------------- |
| `level` | [log level](#log-levels) | `info` | The logging level threshold. Any logs that are at or above this threshold will be drained to the sink. Any other logs will be discarded. |
| `sink` | string | `stdio`, `file` | The sink where the logs drain to as they are being recorded in Anubis. |
| `parameters` | object | | Parameters for the given logging sink. This will vary based on the logging sink of choice. See below for more information. |
Anubis supports the following logging sinks:
1. `file`: logs are emitted to a file that is rotated based on size and age. Old log files are compressed with gzip to save space. This allows for better integration for users who run legacy service managers (OpenRC, FreeBSD's init, etc.).
2. `stdio`: logs are emitted to the standard error stream of the Anubis process. This allows runtimes such as Docker, Podman, Systemd, and Kubernetes to capture logs with their native logging subsystems without any additional configuration.
### Log levels
Anubis uses Go's [standard library `log/slog` package](https://pkg.go.dev/log/slog) to emit structured logs. By default, Anubis logs at the [Info level](https://pkg.go.dev/log/slog#Level), which is fairly verbose out of the box. Here are the possible logging levels in Anubis:
| Log level | Use in Anubis |
| :-------- | :-------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `DEBUG` | The raw unfiltered torrent of doom. Only use this if you are actively working on Anubis or have very good reasons to use it. |
| `INFO` | The default logging level, fairly verbose in order to make it easier for automation to parse. |
| `WARN` | A "more silent" logging level. Much less verbose. Some things that are now at the `info` level need to be moved up to the `warn` level in future patches. |
| `ERROR` | Only log error messages. |
Additionally, you can set a "slightly higher" log level if you need to, such as:
```yaml
logging:
sink: stdio
level: "INFO+1"
```
This isn't currently used by Anubis, but will be in the future for "slightly important" information.
### `file` sink
The `file` sink makes Anubis write its logs to the filesystem and rotate them out when the log file meets certain thresholds. This logging sink takes the following parameters:
| Name | Type | Example | Description |
| :------------- | :-------------- | :-------------------- | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `file` | string | `/var/log/anubis.log` | The file where Anubis logs should be written to. Make sure the user Anubis is running as has write and file creation permissions to this directory. |
| `maxBackups` | number | `3` | The number of old log files that should be maintained when log files are rotated out. |
| `maxBytes` | number of bytes | `67108864` (64Mi) | The maximum size of each log file before it is rotated out. |
| `maxAge` | number of days | `7` | If a log file is more than this many days old, rotate it out. |
| `compress` | boolean | `true` | If true, compress old log files with gzip. This should be set to `true` and is only exposed as an option for dealing with legacy workflows where there is magical thinking about log files at play. |
| `useLocalTime` | boolean | `false` | If true, use the system local time zone to create log filenames instead of UTC. This should almost always be set to `false` and is only exposed for legacy workflows where there is magical thinking about time zones at play. |
```yaml
logging:
sink: file
parameters:
file: "./var/anubis.log"
maxBackups: 3 # keep at least 3 old copies
maxBytes: 67108864 # each file can have up to 64 Mi of logs
maxAge: 7 # rotate files out every n days
compress: true # gzip-compress old log files
useLocalTime: false # timezone for rotated files is UTC
```
When files are rotated out, the old files will be named after the rotation timestamp in [RFC 3339 format](https://www.rfc-editor.org/rfc/rfc3339).
### `stdio` sink
By default, Anubis logs everything to the standard error stream of its process. This requires no configuration:
```yaml
logging:
sink: stdio
```
If you use a service orchestration platform that does not capture the standard error stream of processes, you need to use a different logging sink.
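Under systemd, for example, that standard error stream lands in the journal and can be followed live (assuming the packaged `anubis@default.service` unit name):

```bash
journalctl -u anubis@default.service -f
```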
## Risk calculation for downstream services ## Risk calculation for downstream services
In case your service needs it for risk calculation reasons, Anubis exposes information about the rules that any requests match using a few headers: In case your service needs it for risk calculation reasons, Anubis exposes information about the rules that any requests match using a few headers:

View File

@@ -12,7 +12,6 @@ Install directly with Go:
```bash ```bash
go install github.com/TecharoHQ/anubis/cmd/robots2policy@latest go install github.com/TecharoHQ/anubis/cmd/robots2policy@latest
``` ```
## Usage ## Usage
Basic conversion from URL: Basic conversion from URL:
@@ -36,8 +35,8 @@ robots2policy -input robots.txt -action DENY -format json
## Options ## Options
| Flag | Description | Default | | Flag | Description | Default |
| --------------------- | ------------------------------------------------------------------ | ------------------- | |-----------------------|--------------------------------------------------------------------|---------------------|
| `-input` | robots.txt file path or URL (use `-` for stdin) | _required_ | | `-input` | robots.txt file path or URL (use `-` for stdin) | *required* |
| `-output` | Output file (use `-` for stdout) | stdout | | `-output` | Output file (use `-` for stdout) | stdout |
| `-format` | Output format: `yaml` or `json` | `yaml` | | `-format` | Output format: `yaml` or `json` | `yaml` |
| `-action` | Action for disallowed paths: `ALLOW`, `DENY`, `CHALLENGE`, `WEIGH` | `CHALLENGE` | | `-action` | Action for disallowed paths: `ALLOW`, `DENY`, `CHALLENGE`, `WEIGH` | `CHALLENGE` |
@@ -48,7 +47,6 @@ robots2policy -input robots.txt -action DENY -format json
## Example ## Example
Input robots.txt: Input robots.txt:
```txt ```txt
User-agent: * User-agent: *
Disallow: /admin/ Disallow: /admin/
@@ -59,7 +57,6 @@ Disallow: /
``` ```
Generated policy: Generated policy:
```yaml ```yaml
- name: robots-txt-policy-disallow-1 - name: robots-txt-policy-disallow-1
action: CHALLENGE action: CHALLENGE
@@ -80,8 +77,8 @@ Generated policy:
Save the output and import it in your main policy file: Save the output and import it in your main policy file:
```yaml ```yaml
bots: import:
- import: "./robots-policy.yaml" - path: "./robots-policy.yaml"
``` ```
The tool handles wildcard patterns, user-agent specific rules, and blacklisted bots automatically. The tool handles wildcard patterns, user-agent specific rules, and blacklisted bots automatically.

Some files were not shown because too many files have changed in this diff.