Compare commits

...

34 Commits

Author SHA1 Message Date
Xe Iaso
ec733e93a5 v1.19.1
Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-06-01 17:17:24 -04:00
Xe Iaso
51c384eefd fix(data/bots): bring back ai-robots-txt.yaml
Closes #599

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-06-01 17:15:00 -04:00
Xe Iaso
44d5ec0b6e chore: release version v1.19.0
Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-06-01 16:35:03 -04:00
Xe Iaso
3bc9040a96 chore: bump yeet to v0.6.0
Gives us many nice things like:

* Windows support for yeet (modulo TecharoHQ/yeet#29)
* Removes the dependency on /bin/sh or /bin/bash thanks to
  mvdan.cc/sh/v3
* Checksum-compliant reproducible builds by default

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-06-01 16:33:08 -04:00
Corry Haines
de7dbfe6d6 Split up AI filtering files (#592)
* Split up AI filtering files

Create aggressive/moderate/permissive policies to allow administrators to choose their AI/LLM stance.

Aggressive policy matches existing default in Anubis.

Removes `Google-Extended` flag from `ai-robots-txt.yaml` as it doesn't exist in requests.

Rename `ai-robots-txt.yaml` to `ai-catchall.yaml` as the file is no longer a copy of the source repo/file.

* chore: spelling

* chore: fix embeds

* chore: fix data includes

* chore: fix file name typo

* chore: Ignore READMEs in configs

* chore(lib/policy/config): go tool goimports -w

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Xe Iaso <me@xeiaso.net>
2025-06-01 20:21:18 +00:00
minihoot
77e0bbbce9 add my site to known-instances.md (#595)
Signed-off-by: minihoot <70825884+minihoot@users.noreply.github.com>
2025-06-01 15:28:16 +00:00
Zohiu
b4b5d2f82e docs(known-instances): add catgirl.click (#597)
Signed-off-by: Zohiu <102434603+Zohiu@users.noreply.github.com>
2025-06-01 14:55:02 +00:00
Aleksei
988fff77f1 docs(known-instances): add openwrt.org (#594)
Signed-off-by: Aleksei <Aloki@users.noreply.github.com>
2025-06-01 13:13:52 +00:00
Corry Haines
0d9ebebff6 Opt-in policies for OpenAI and MistralAI bots (#590)
* Define OpenAI bot ALLOW policies

Allows OpenAI bots to be allowlisted at the choice of the Anubis administrator. None are enabled by default.

* Define MistralAI bot ALLOW policy

* chore: spelling
2025-05-31 16:48:57 -04:00
Jason
ba00cdacd2 docs(known-instances): Add Gitea to the known instances list (#591)
Signed-off-by: Jason <57271945+jesentz@users.noreply.github.com>
2025-05-31 14:20:39 -04:00
Corry Haines
68a71c6a99 Add Applebot definition (#589)
* Add Applebot definition

Adds Apple's search indexing bot, and allowlists it by default.

Allowlisted by default because it is equivalent to Googlebot/Bingbot. Remove Applebot from `ai-robots-txt.yaml` for the same reasons.

Remove `Applebot-Extended` from `ai-robots-txt.yaml` as it has no effect.

* chore: spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Xe Iaso <me@xeiaso.net>
2025-05-31 10:18:32 -04:00
Xe Iaso
fbbab5a035 feat(lib): annotate cookies with what rule was passed (#576)
* feat(lib): annotate cookies with what rule was passed

Anubis JWTs now contain a policyRule claim with the cryptographic hash
of the rule that it passed. This is intended to help with a future move
away from proof of work being the default.

Signed-off-by: Xe Iaso <me@xeiaso.net>

* test(lib): fix cookie storage logic

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-05-30 14:42:02 -04:00
Jason Cameron
28ab29389c style(bench): small cleanup (#546)
* fix(bench): await benchmark loop and adjust outline styles in templates

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* refactor: remove unused showContinueBar function and clean up video error handling

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* style: format code for consistency and readability using prettier

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

---------

Signed-off-by: Jason Cameron <git@jasoncameron.dev>
2025-05-30 17:57:56 +00:00
Xe Iaso
497005ce3e fix(lib): only use the first five characters of Accept-Language header values (#588)
For some reason, Google Chrome will randomly send a "full"
Accept-Language header, and other times it will send a "partial"
Accept-Language header. This makes the challenge construction
inconsistent.

This commit fixes this issue by only considering up to the first five
characters of the Accept-Language header when making a challenge string.

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-05-30 13:15:03 -04:00
Xe Iaso
669eb4ba4b fix(web): show Anubis version number on challenge pages (#587)
Closes #565

The page already had the version number embedded into it, but that was
not printed to the page. This prints the version number set at compile
time to the page.

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-05-30 16:23:27 +00:00
Kian Kasad
6c4e739b0b feat(lib): Add anubis_proxied_requests_total metric (#570)
Adds a new counter metric which counts the number of requests proxied to
the upstream target, labeled with the host name of the request.
2025-05-30 12:21:04 -04:00
Xe Iaso
c8635357dc feat(yeetfile): build GOARCH=ppc64le packages (#583)
Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-05-30 00:35:56 -04:00
Xe Iaso
0ed905fd4e fix(internal/test): skip integration tests if SKIP_INTEGRATION is set (#586)
Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-05-30 00:49:53 +00:00
Xe Iaso
cd8a7eb2e2 feat(data): add x-firefox-ai default challenge rule (#580)
Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-05-28 21:08:39 +00:00
Xe Iaso
22c47f40d1 feat(expressions): add randInt function to allow making rules nondeterministic (#578)
This seems counter-intuitive at first glance, but let me cook.

One of the problems with Anubis is that the rule matching is super
deterministic. This means that attackers can figure out what patterns
they are hitting and change things to bypass them.

The randInt function lets you have rulesets behave nondeterministically.
This is a very easy way to hang yourself, but can be great to
psychologically mess with scraper operators. Consider this rule:

```yaml
- name: deny-lightpanda-sometimes
  action: DENY
  expression:
    all:
      - userAgent.matches("LightPanda")
      - randInt(16) >= 4
```

It would match about 75% of the time.

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-05-28 16:36:27 -04:00
Xe Iaso
669671bd46 fix(internal): register mime type for .mjs files (#577)
Closes #575

I'm gonna be totally honest, I'm not sure if this is needed. However,
measure twice, cut once.

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-05-28 13:52:48 +00:00
dependabot[bot]
6c247cdec8 build(deps): bump k8s.io/apimachinery in the gomod group (#524)
Bumps the gomod group with 1 update: [k8s.io/apimachinery](https://github.com/kubernetes/apimachinery).


Updates `k8s.io/apimachinery` from 0.33.0 to 0.33.1
- [Commits](https://github.com/kubernetes/apimachinery/compare/v0.33.0...v0.33.1)

---
updated-dependencies:
- dependency-name: k8s.io/apimachinery
  dependency-version: 0.33.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: gomod
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-28 09:41:25 -04:00
Kian Kasad
eeae28f459 feat(cli): Add --version flag (#572) 2025-05-28 04:16:44 +00:00
jordigh
9ba10262e3 add Weblate to known-instances.md (#571)
Signed-off-by: jordigh <jordigh@octave.org>
2025-05-28 00:03:04 +00:00
dependabot[bot]
a28a3d155a build(deps): bump astral-sh/setup-uv in the github-actions group (#558)
Bumps the github-actions group with 1 update: [astral-sh/setup-uv](https://github.com/astral-sh/setup-uv).


Updates `astral-sh/setup-uv` from 6.0.1 to 6.1.0
- [Release notes](https://github.com/astral-sh/setup-uv/releases)
- [Commits](6b9c6063ab...f0ec1fc3b3)

---
updated-dependencies:
- dependency-name: astral-sh/setup-uv
  dependency-version: 6.1.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: github-actions
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-27 11:12:31 -04:00
Anna
086f43e3ca Create Anubis OpenRC init.d script (#561)
Signed-off-by: Anna @CyberTailor <cyber@sysrq.in>
2025-05-27 01:58:59 +00:00
Xe Iaso
fa1f2355ea v1.19.0-pre1
Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-05-25 14:10:22 -04:00
Xe Iaso
0a56194825 docs(admin): add wordpress docs (#552)
Closes #551

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-05-24 17:00:37 -04:00
Jason Cameron
93e2447ba2 fix(expression): add validation for empty expression list in CEL (#545)
* fix(expression): add validation for empty ExpressionOrList

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* fix(imports): block empty file imports with improved error checking logic

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

* docs(expression): improve validation to error on empty CEL expressions

Signed-off-by: Jason Cameron <git@jasoncameron.dev>

---------

Signed-off-by: Jason Cameron <git@jasoncameron.dev>
2025-05-23 18:14:31 -04:00
Xe Iaso
51f875ff6f docs(native-install): vague gesturing at distribution package managers (#544)
Closes #530

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-05-23 16:51:45 +00:00
Xe Iaso
555a188dc3 fix(lib): record challenges issused over embedded HTML (#543)
Closes #531

This changes `anubis_challenges_issued` to be a vector counter that
records the challenge issuance method.

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-05-23 12:45:41 -04:00
James Renken
6f08bcb481 feat: add TARGET_SNI to allow overriding the TLS handshake hostname when forwarding requests (#529)
* feat: add TARGET_SNI to allow overriding the TLS handshake hostname when forwarding requests

* chore: spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Xe Iaso <me@xeiaso.net>
2025-05-23 16:27:35 +00:00
Dryusdan
11081aac08 Bump AI-robots.txt rules to version 1.31 (#538)
* Bump AI-robots.txt rules to version 1.31

* chore: spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Xe Iaso <me@xeiaso.net>
2025-05-23 16:15:12 +00:00
Nathan Price
c78d830ecb docs/docs/admin/native-install.mdx: correct the path for the default configuration file installation (#535)
Using the native-install instructions, default.env was installed as /etc/anubis rather than /etc/anubis/default.env
2025-05-22 18:34:06 +00:00
61 changed files with 916 additions and 287 deletions

View File

@@ -6,6 +6,7 @@ amazonbot
anthro
anubis
anubistest
Applebot
archlinux
badregexes
berr
@@ -17,7 +18,9 @@ blueskybot
boi
botnet
BPort
Brightbot
broked
Bytespider
cachebuster
Caddyfile
caninetools
@@ -30,14 +33,17 @@ cgr
chainguard
chall
challengemozilla
checkpath
checkresult
chen
chibi
cidranger
ckie
cloudflare
confd
containerbuild
coreutils
Cotoyogi
CRDs
crt
daemonizing
@@ -46,6 +52,7 @@ Debian
debrpm
decaymap
decompiling
Diffbot
discordapp
discordbot
distros
@@ -56,17 +63,22 @@ dracula
dronebl
droneblresponse
duckduckbot
eerror
ellenjoe
enbyware
everyones
evilbot
evilsite
expressionorlist
externalagent
externalfetcher
extldflags
facebookgo
Factset
fastcgi
fediverse
finfos
Firecrawl
flagenv
Fordola
forgejo
@@ -81,6 +93,8 @@ goodbot
googlebot
govulncheck
GPG
GPT
gptbot
grw
Hashcash
hashrate
@@ -92,11 +106,16 @@ hostable
htmx
httpdebug
hypertext
iaskspider
iat
ifm
Imagesift
imgproxy
inp
iss
isset
ivh
Jenomis
JGit
journalctl
jshelter
@@ -108,8 +127,10 @@ keypair
KHTML
kinda
KUBECONFIG
lcj
ldflags
letsencrypt
Lexentale
lgbt
licend
licstart
@@ -125,8 +146,10 @@ maintainership
malware
mcr
memes
metrix
mimi
minica
mistralai
Mojeek
mojeekbot
mozilla
@@ -135,9 +158,15 @@ nginx
nobots
NONINFRINGEMENT
nosleep
OCOB
ogtags
omgili
omgilibot
onionservice
openai
openrc
pag
Pangu
parseable
passthrough
Patreon
@@ -167,20 +196,27 @@ reputational
reqmeta
risc
ruleset
runlevels
RUnlock
sas
sasl
Scumm
searchbot
searx
sebest
secretplans
selfsigned
Semrush
setsebool
shellcheck
Sidetrade
sitemap
sls
sni
Sourceware
Spambot
sparkline
spyderbot
srv
stackoverflow
startprecmd
@@ -188,6 +224,7 @@ stoppostcmd
subgrid
subr
subrequest
SVCNAME
tagline
tarballs
techaro
@@ -195,12 +232,15 @@ techarohq
templ
templruntime
testarea
Tik
Timpibot
torproject
traefik
unixhttpd
unmarshal
uvx
Varis
Velen
vendored
vhosts
videotest
@@ -210,8 +250,11 @@ webmaster
webpage
websecure
websites
workaround
Webzio
wordpress
Workaround
workdir
wpbot
xcaddy
Xeact
xeiaso

View File

@@ -20,9 +20,6 @@
# https://twitter.com/nyttypos/status/1898844061873639490
#\([A-Z][a-z]{2,}(?: [a-z]+){3,}\)\.\s
# Complete sentences shouldn't be in the middle of another sentence as a parenthetical.
(?<!\.)\.\),
# Complete sentences in parentheticals should not have a space before the period.
\s\.\)(?!.*\}\})

View File

@@ -128,3 +128,7 @@ go install(?:\s+[a-z]+\.[-@\w/.]+)+
# ignore long runs of a single character:
\b([A-Za-z])\g{-1}{3,}\b
# hit-count: 1 file-count: 1
# microsoft
\b(?:https?://|)(?:(?:(?:blogs|download\.visualstudio|docs|msdn2?|research)\.|)microsoft|blogs\.msdn)\.co(?:m|\.\w\w)/[-_a-zA-Z0-9()=./%]*

View File

@@ -21,7 +21,7 @@ jobs:
persist-credentials: false
- name: Install the latest version of uv
uses: astral-sh/setup-uv@6b9c6063abd6010835644d4c2e1bef4cf5cd0fca # v6.0.1
uses: astral-sh/setup-uv@f0ec1fc3b38f5e7cd731bb6ce540c5af426746bb # v6.1.0
- name: Run zizmor 🌈
run: uvx zizmor --format sarif . > results.sarif

View File

@@ -1 +1 @@
1.18.0
1.19.1

View File

@@ -56,6 +56,7 @@ var (
redirectDomains = flag.String("redirect-domains", "", "list of domains separated by commas which anubis is allowed to redirect to. Leaving this unset allows any domain.")
slogLevel = flag.String("slog-level", "INFO", "logging level (see https://pkg.go.dev/log/slog#hdr-Levels)")
target = flag.String("target", "http://localhost:3923", "target to reverse proxy to, set to an empty string to disable proxying when only using auth request")
targetSNI = flag.String("target-sni", "", "if set, the value of the TLS handshake hostname when forwarding requests to the target")
targetHost = flag.String("target-host", "", "if set, the value of the Host header when forwarding requests to the target")
targetInsecureSkipVerify = flag.Bool("target-insecure-skip-verify", false, "if true, skips TLS validation for the backend")
healthcheck = flag.Bool("healthcheck", false, "run a health check against Anubis")
@@ -66,6 +67,7 @@ var (
ogCacheConsiderHost = flag.Bool("og-cache-consider-host", false, "enable or disable the use of the host in the Open Graph tag cache")
extractResources = flag.String("extract-resources", "", "if set, extract the static resources to the specified folder")
webmasterEmail = flag.String("webmaster-email", "", "if set, displays webmaster's email on the reject page for appeals")
versionFlag = flag.Bool("version", false, "print Anubis version")
)
func keyFromHex(value string) (ed25519.PrivateKey, error) {
@@ -136,7 +138,7 @@ func setupListener(network string, address string) (net.Listener, string) {
return listener, formattedAddress
}
func makeReverseProxy(target string, targetHost string, insecureSkipVerify bool) (http.Handler, error) {
func makeReverseProxy(target string, targetSNI string, targetHost string, insecureSkipVerify bool) (http.Handler, error) {
targetUri, err := url.Parse(target)
if err != nil {
return nil, fmt.Errorf("failed to parse target URL: %w", err)
@@ -158,10 +160,14 @@ func makeReverseProxy(target string, targetHost string, insecureSkipVerify bool)
transport.RegisterProtocol("unix", libanubis.UnixRoundTripper{Transport: transport})
}
if insecureSkipVerify {
slog.Warn("TARGET_INSECURE_SKIP_VERIFY is set to true, TLS certificate validation will not be performed", "target", target)
transport.TLSClientConfig = &tls.Config{
InsecureSkipVerify: true,
if insecureSkipVerify || targetSNI != "" {
transport.TLSClientConfig = &tls.Config{}
if insecureSkipVerify {
slog.Warn("TARGET_INSECURE_SKIP_VERIFY is set to true, TLS certificate validation will not be performed", "target", target)
transport.TLSClientConfig.InsecureSkipVerify = true
}
if targetSNI != "" {
transport.TLSClientConfig.ServerName = targetSNI
}
}
@@ -197,6 +203,11 @@ func main() {
flagenv.Parse()
flag.Parse()
if *versionFlag {
fmt.Println("Anubis", anubis.Version)
return
}
internal.InitSlog(*slogLevel)
if *extractResources != "" {
@@ -214,7 +225,7 @@ func main() {
// when using anubis via Systemd and environment variables, then it is not possible to set targe to an empty string but only to space
if strings.TrimSpace(*target) != "" {
var err error
rp, err = makeReverseProxy(*target, *targetHost, *targetInsecureSkipVerify)
rp, err = makeReverseProxy(*target, *targetSNI, *targetHost, *targetInsecureSkipVerify)
if err != nil {
log.Fatalf("can't make reverse proxy: %v", err)
}

View File

@@ -4,7 +4,7 @@
"import": "(data)/bots/_deny-pathological.yaml"
},
{
"import": "(data)/bots/ai-robots-txt.yaml"
"import": "(data)/meta/ai-block-aggressive.yaml"
},
{
"import": "(data)/crawlers/_allow-good.yaml"

View File

@@ -11,44 +11,51 @@
## /usr/share/docs/anubis/data or in the tarball you extracted Anubis from.
bots:
# Pathological bots to deny
- # This correlates to data/bots/deny-pathological.yaml in the source tree
# https://github.com/TecharoHQ/anubis/blob/main/data/bots/deny-pathological.yaml
import: (data)/bots/_deny-pathological.yaml
- import: (data)/bots/aggressive-brazilian-scrapers.yaml
# Pathological bots to deny
- # This correlates to data/bots/deny-pathological.yaml in the source tree
# https://github.com/TecharoHQ/anubis/blob/main/data/bots/deny-pathological.yaml
import: (data)/bots/_deny-pathological.yaml
- import: (data)/bots/aggressive-brazilian-scrapers.yaml
# Enforce https://github.com/ai-robots-txt/ai.robots.txt
- import: (data)/bots/ai-robots-txt.yaml
# Aggressively block AI/LLM related bots/agents by default
- import: (data)/meta/ai-block-aggressive.yaml
# Search engine crawlers to allow, defaults to:
# - Google (so they don't try to bypass Anubis)
# - Bing
# - DuckDuckGo
# - Qwant
# - The Internet Archive
# - Kagi
# - Marginalia
# - Mojeek
- import: (data)/crawlers/_allow-good.yaml
# Consider replacing the aggressive AI policy with more selective policies:
# - import: (data)/meta/ai-block-moderate.yaml
# - import: (data)/meta/ai-block-permissive.yaml
# Allow common "keeping the internet working" routes (well-known, favicon, robots.txt)
- import: (data)/common/keep-internet-working.yaml
# Search engine crawlers to allow, defaults to:
# - Google (so they don't try to bypass Anubis)
# - Apple
# - Bing
# - DuckDuckGo
# - Qwant
# - The Internet Archive
# - Kagi
# - Marginalia
# - Mojeek
- import: (data)/crawlers/_allow-good.yaml
# Challenge Firefox AI previews
- import: (data)/clients/x-firefox-ai.yaml
# # Punish any bot with "bot" in the user-agent string
# # This is known to have a high false-positive rate, use at your own risk
# - name: generic-bot-catchall
# user_agent_regex: (?i:bot|crawler)
# action: CHALLENGE
# challenge:
# difficulty: 16 # impossible
# report_as: 4 # lie to the operator
# algorithm: slow # intentionally waste CPU cycles and time
# Allow common "keeping the internet working" routes (well-known, favicon, robots.txt)
- import: (data)/common/keep-internet-working.yaml
# Generic catchall rule
- name: generic-browser
user_agent_regex: >-
Mozilla|Opera
action: CHALLENGE
# # Punish any bot with "bot" in the user-agent string
# # This is known to have a high false-positive rate, use at your own risk
# - name: generic-bot-catchall
# user_agent_regex: (?i:bot|crawler)
# action: CHALLENGE
# challenge:
# difficulty: 16 # impossible
# report_as: 4 # lie to the operator
# algorithm: slow # intentionally waste CPU cycles and time
# Generic catchall rule
- name: generic-browser
user_agent_regex: >-
Mozilla|Opera
action: CHALLENGE
dnsbl: false
@@ -58,4 +65,4 @@ dnsbl: false
# will stop sending requests once they get it.
status_codes:
CHALLENGE: 200
DENY: 200
DENY: 200

View File

@@ -0,0 +1,11 @@
# Extensive list of AI-affiliated agents based on https://github.com/ai-robots-txt/ai.robots.txt
# Add new/undocumented agents here. Where documentation exists, consider moving to dedicated policy files.
# Notes on various agents:
# - Amazonbot: Well documented, but they refuse to state which agent collects training data.
# - anthropic-ai/Claude-Web: Undocumented by Anthropic. Possibly deprecated or hallucinations?
# - Perplexity*: Well documented, but they refuse to state which agent collects training data.
# Warning: May contain user agents that _must_ be blocked in robots.txt, or the opt-out will have no effect.
- name: "ai-catchall"
user_agent_regex: >-
AI2Bot|Ai2Bot-Dolma|aiHitBot|Amazonbot|anthropic-ai|Brightbot 1.0|Bytespider|CCBot|Claude-Web|cohere-ai|cohere-training-data-crawler|Cotoyogi|Crawlspace|Diffbot|DuckAssistBot|FacebookBot|Factset_spyderbot|FirecrawlAgent|FriendlyCrawler|Google-CloudVertexBot|GoogleOther|GoogleOther-Image|GoogleOther-Video|iaskspider/2.0|ICC-Crawler|ImagesiftBot|img2dataset|imgproxy|ISSCyberRiskCrawler|Kangaroo Bot|meta-externalagent|Meta-ExternalAgent|meta-externalfetcher|Meta-ExternalFetcher|NovaAct|omgili|omgilibot|Operator|PanguBot|Perplexity-User|PerplexityBot|PetalBot|QualifiedBot|Scrapy|SemrushBot-OCOB|SemrushBot-SWA|Sidetrade indexer bot|TikTokSpider|Timpibot|VelenPublicWebCrawler|Webzio-Extended|wpbot|YouBot
action: DENY

View File

@@ -1,4 +1,6 @@
# Warning: Contains user agents that _must_ be blocked in robots.txt, or the opt-out will have no effect.
# Note: Blocks human-directed/non-training user agents
- name: "ai-robots-txt"
user_agent_regex: >-
AI2Bot|Ai2Bot-Dolma|aiHitBot|Amazonbot|anthropic-ai|Applebot|Applebot-Extended|Brightbot 1.0|Bytespider|CCBot|ChatGPT-User|Claude-Web|ClaudeBot|cohere-ai|cohere-training-data-crawler|Cotoyogi|Crawlspace|Diffbot|DuckAssistBot|FacebookBot|Factset_spyderbot|FirecrawlAgent|FriendlyCrawler|Google-Extended|GoogleOther|GoogleOther-Image|GoogleOther-Video|GPTBot|iaskspider/2.0|ICC-Crawler|ImagesiftBot|img2dataset|imgproxy|ISSCyberRiskCrawler|Kangaroo Bot|meta-externalagent|Meta-ExternalAgent|meta-externalfetcher|Meta-ExternalFetcher|NovaAct|OAI-SearchBot|omgili|omgilibot|Operator|PanguBot|Perplexity-User|PerplexityBot|PetalBot|QualifiedBot|Scrapy|SemrushBot-OCOB|SemrushBot-SWA|Sidetrade indexer bot|TikTokSpider|Timpibot|VelenPublicWebCrawler|Webzio-Extended|YouBot
AI2Bot|Ai2Bot-Dolma|aiHitBot|Amazonbot|anthropic-ai|Brightbot 1.0|Bytespider|CCBot|ChatGPT-User|Claude-SearchBot|Claude-User|Claude-Web|ClaudeBot|cohere-ai|cohere-training-data-crawler|Cotoyogi|Crawlspace|Diffbot|DuckAssistBot|FacebookBot|Factset_spyderbot|FirecrawlAgent|FriendlyCrawler|Google-CloudVertexBot|Google-Extended|GoogleOther|GoogleOther-Image|GoogleOther-Video|GPTBot|iaskspider/2.0|ICC-Crawler|ImagesiftBot|img2dataset|imgproxy|ISSCyberRiskCrawler|Kangaroo Bot|meta-externalagent|Meta-ExternalAgent|meta-externalfetcher|Meta-ExternalFetcher|MistralAI-User/1.0|NovaAct|OAI-SearchBot|omgili|omgilibot|Operator|PanguBot|Perplexity-User|PerplexityBot|PetalBot|QualifiedBot|Scrapy|SemrushBot-OCOB|SemrushBot-SWA|Sidetrade indexer bot|TikTokSpider|Timpibot|VelenPublicWebCrawler|Webzio-Extended|wpbot|YouBot
action: DENY

8
data/clients/ai.yaml Normal file
View File

@@ -0,0 +1,8 @@
# User agents that act on behalf of humans in AI tools, e.g. searching the web.
# Each entry should have a positive/ALLOW entry created as well, with further documentation.
# Exceptions:
# - Claude-User: No published IP allowlist
- name: "ai-clients"
user_agent_regex: >-
ChatGPT-User|Claude-User|MistralAI-User
action: DENY

View File

@@ -0,0 +1,10 @@
# Acts on behalf of user requests
# https://docs.mistral.ai/robots/
- name: mistral-mistralai-user
user_agent_regex: MistralAI-User/.+; \+https\://docs\.mistral\.ai/robots
action: ALLOW
# https://mistral.ai/mistralai-user-ips.json
remote_addresses: [
"20.240.160.161/32",
"20.240.160.1/32",
]

View File

@@ -0,0 +1,93 @@
# Acts on behalf of user requests
# https://platform.openai.com/docs/bots/overview-of-openai-crawlers
- name: openai-chatgpt-user
user_agent_regex: ChatGPT-User/.+; \+https\://openai\.com/bot
action: ALLOW
# https://openai.com/chatgpt-user.json
# curl 'https://openai.com/chatgpt-user.json' | jq '.prefixes.[].ipv4Prefix' | sed 's/$/,/'
remote_addresses: [
"13.65.138.112/28",
"23.98.179.16/28",
"13.65.138.96/28",
"172.183.222.128/28",
"20.102.212.144/28",
"40.116.73.208/28",
"172.183.143.224/28",
"52.190.190.16/28",
"13.83.237.176/28",
"51.8.155.64/28",
"74.249.86.176/28",
"51.8.155.48/28",
"20.55.229.144/28",
"135.237.131.208/28",
"135.237.133.48/28",
"51.8.155.112/28",
"135.237.133.112/28",
"52.159.249.96/28",
"52.190.137.16/28",
"52.255.111.112/28",
"40.84.181.32/28",
"172.178.141.112/28",
"52.190.142.64/28",
"172.178.140.144/28",
"52.190.137.144/28",
"172.178.141.128/28",
"57.154.187.32/28",
"4.196.118.112/28",
"20.193.50.32/28",
"20.215.188.192/28",
"20.215.214.16/28",
"4.197.22.112/28",
"4.197.115.112/28",
"172.213.21.16/28",
"172.213.11.144/28",
"172.213.12.112/28",
"172.213.21.144/28",
"20.90.7.144/28",
"57.154.175.0/28",
"57.154.174.112/28",
"52.236.94.144/28",
"137.135.191.176/28",
"23.98.186.192/28",
"23.98.186.96/28",
"23.98.186.176/28",
"23.98.186.64/28",
"68.221.67.192/28",
"68.221.67.160/28",
"13.83.167.128/28",
"20.228.106.176/28",
"52.159.227.32/28",
"68.220.57.64/28",
"172.213.21.112/28",
"68.221.67.224/28",
"68.221.75.16/28",
"20.97.189.96/28",
"52.252.113.240/28",
"52.230.163.32/28",
"172.212.159.64/28",
"52.255.111.80/28",
"52.255.111.0/28",
"4.151.241.240/28",
"52.255.111.32/28",
"52.255.111.48/28",
"52.255.111.16/28",
"52.230.164.176/28",
"52.176.139.176/28",
"52.173.234.16/28",
"4.151.71.176/28",
"4.151.119.48/28",
"52.255.109.112/28",
"52.255.109.80/28",
"20.161.75.208/28",
"68.154.28.96/28",
"52.255.109.128/28",
"52.225.75.208/28",
"52.190.139.48/28",
"68.221.67.240/28",
"52.156.77.144/28",
"52.148.129.32/28",
"40.84.221.208/28",
"104.210.139.224/28",
"40.84.221.224/28",
"104.210.139.192/28",
]

View File

@@ -0,0 +1,4 @@
# https://connect.mozilla.org/t5/firefox-labs/try-out-link-previews-in-firefox-labs-138-and-share-your/td-p/92012
- name: x-firefox-ai
action: CHALLENGE
expression: '"X-Firefox-Ai" in headers'

View File

@@ -1,4 +1,5 @@
- import: (data)/crawlers/googlebot.yaml
- import: (data)/crawlers/applebot.yaml
- import: (data)/crawlers/bingbot.yaml
- import: (data)/crawlers/duckduckbot.yaml
- import: (data)/crawlers/qwantbot.yaml

View File

@@ -0,0 +1,8 @@
# User agents that index exclusively for search in for AI systems.
# Each entry should have a positive/ALLOW entry created as well, with further documentation.
# Exceptions:
# - Claude-SearchBot: No published IP allowlist
- name: "ai-crawlers-search"
user_agent_regex: >-
OAI-SearchBot|Claude-SearchBot
action: DENY

View File

@@ -0,0 +1,8 @@
# User agents that crawl for training AI/LLM systems
# Each entry should have a positive/ALLOW entry created as well, with further documentation.
# Exceptions:
# - ClaudeBot: No published IP allowlist
- name: "ai-crawlers-training"
user_agent_regex: >-
GPTBot|ClaudeBot
action: DENY

View File

@@ -0,0 +1,20 @@
# Indexing for search and Siri
# https://support.apple.com/en-us/119829
- name: applebot
user_agent_regex: Applebot
action: ALLOW
# https://search.developer.apple.com/applebot.json
remote_addresses: [
"17.241.208.160/27",
"17.241.193.160/27",
"17.241.200.160/27",
"17.22.237.0/24",
"17.22.245.0/24",
"17.22.253.0/24",
"17.241.75.0/24",
"17.241.219.0/24",
"17.241.227.0/24",
"17.246.15.0/24",
"17.246.19.0/24",
"17.246.23.0/24",
]

View File

@@ -0,0 +1,16 @@
# Collects AI training data
# https://platform.openai.com/docs/bots/overview-of-openai-crawlers
- name: openai-gptbot
user_agent_regex: GPTBot/1\.1; \+https\://openai\.com/gptbot
action: ALLOW
# https://openai.com/gptbot.json
remote_addresses: [
"52.230.152.0/24",
"20.171.206.0/24",
"20.171.207.0/24",
"4.227.36.0/25",
"20.125.66.80/28",
"172.182.204.0/24",
"172.182.214.0/24",
"172.182.215.0/24",
]

View File

@@ -0,0 +1,13 @@
# Indexing for search, does not collect training data
# https://platform.openai.com/docs/bots/overview-of-openai-crawlers
- name: openai-searchbot
user_agent_regex: OAI-SearchBot/1\.0; \+https\://openai\.com/searchbot
action: ALLOW
# https://openai.com/searchbot.json
remote_addresses: [
"20.42.10.176/28",
"172.203.190.128/28",
"104.210.140.128/28",
"51.8.102.0/24",
"135.234.64.0/24"
]

View File

@@ -3,6 +3,6 @@ package data
import "embed"
var (
//go:embed botPolicies.yaml botPolicies.json all:apps all:bots all:clients all:common all:crawlers
//go:embed botPolicies.yaml botPolicies.json all:apps all:bots all:clients all:common all:crawlers all:meta
BotPolicies embed.FS
)

5
data/meta/README.md Normal file
View File

@@ -0,0 +1,5 @@
# meta policies
Contains policies that exclusively reference policies in _multiple_ other data folders.
Akin to "stances" that the administrator can take, with reference to various topics, such as AI/LLM systems.

View File

@@ -0,0 +1,6 @@
# Blocks all AI/LLM associated user agents, regardless of purpose or human agency
# Warning: To completely block some AI/LLM training, such as with Google, you _must_ place flags in robots.txt.
- import: (data)/bots/ai-catchall.yaml
- import: (data)/clients/ai.yaml
- import: (data)/crawlers/ai-search.yaml
- import: (data)/crawlers/ai-training.yaml

View File

@@ -0,0 +1,7 @@
# Blocks all AI/LLM bots used for training or unknown/undocumented purposes.
# Permits user agents with explicitly documented non-training use, and published IP allowlists.
- import: (data)/bots/ai-catchall.yaml
- import: (data)/crawlers/ai-training.yaml
- import: (data)/crawlers/openai-searchbot.yaml
- import: (data)/clients/openai-chatgpt-user.yaml
- import: (data)/clients/mistral-mistralai-user.yaml

View File

@@ -0,0 +1,6 @@
# Permits all well documented AI/LLM user agents with published IP allowlists.
- import: (data)/bots/ai-catchall.yaml
- import: (data)/crawlers/openai-searchbot.yaml
- import: (data)/crawlers/openai-gptbot.yaml
- import: (data)/clients/openai-chatgpt-user.yaml
- import: (data)/clients/mistral-mistralai-user.yaml

View File

@@ -11,22 +11,44 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]
## v1.19.1: Jenomis cen Lexentale - Echo 1
Return `data/bots/ai-robots-txt.yaml` to avoid breaking configs [#599](https://github.com/TecharoHQ/anubis/issues/599)
## v1.19.0: Jenomis cen Lexentale
Mostly a bunch of small features, no big ticket things this time.
- Record if challenges were issued via the API or via embedded JSON in the challenge page HTML ([#531](https://github.com/TecharoHQ/anubis/issues/531))
- Ensure that clients that are shown a challenge support storing cookies
- Imprint the version number into challenge pages
- Encode challenge pages with gzip level 1
- Add PowerPC 64 bit little-endian builds (`GOARCH=ppc64le`)
- Add `check-spelling` for spell checking
- Add `--target-insecure-skip-verify` flag/envvar to allow Anubis to hit a self-signed HTTPS backend
- Minor adjustments to FreeBSD rc.d script to allow for more flexible configuration.
- Added Podman and Docker support for running Playwright tests
- Add a default rule to throw challenges when a request with the `X-Firefox-Ai` header is set.
- Updated the nonce value in the challenge JWT cookie to be a string instead of a number
- Rename cookies in response to user feedback
- Ensure cookie renaming is consistent across configuration options
- Add Bookstack app in data
- Truncate everything but the first five characters of Accept-Language headers when making challenges
- Ensure client JavaScript is served with Content-Type text/javascript.
- Add `--target-host` flag/envvar to allow changing the value of the Host header in requests forwarded to the target service.
- Bump AI-robots.txt to version 1.30 (add QualifiedBot)
- Bump AI-robots.txt to version 1.31
- Add `RuntimeDirectory` to systemd unit settings so native packages can listen over unix sockets
- Added SearXNG instance tracker whitelist policy
- Added Qualys SSL Labs whitelist policy
- Fixed cookie deletion logic ([#520](https://github.com/TecharoHQ/anubis/issues/520), [#522](https://github.com/TecharoHQ/anubis/pull/522))
- Add `--target-sni` flag/envvar to allow changing the value of the TLS handshake hostname in requests forwarded to the target service.
- Fixed CEL expression matching validator to now properly error out when it receives empty expressions
- Added OpenRC init.d script.
- Added `--version` flag.
- Added `anubis_proxied_requests_total` metric to count proxied requests.
- Add `Applebot` as "good" web crawler
- Reorganize AI/LLM crawler blocking into three separate stances, maintaining existing status quo as default.
- Split out AI/LLM user agent blocking policies, adding documentation for each.
## v1.18.0: Varis zos Galvus

View File

@@ -143,7 +143,29 @@ Anubis would return a challenge because all of those conditions are true.
## Functions exposed to Anubis expressions
There are currently no functions from the Anubis runtime exposed to expressions. This will change in the future.
Anubis expressions can be augmented with the following functions:
### `randInt`
```ts
function randInt(n: int): int;
```
randInt returns a randomly selected integer value in the range of `[0,n)`. This is a thin wrapper around [Go's math/rand#Intn](https://pkg.go.dev/math/rand#Intn). Be careful with this as it may cause inconsistent behavior for genuine users.
This is best applied when doing explicit block rules, eg:
```yaml
# Denies LightPanda about 75% of the time on average
- name: deny-lightpanda-sometimes
action: DENY
expression:
all:
- userAgent.matches("LightPanda")
- randInt(16) >= 4
```
It seems counter-intuitive to allow known bad clients through sometimes, but this allows you to confuse attackers by making Anubis' behavior random. Adjust the thresholds and numbers as facts and circumstances demand.
## Life advice

View File

@@ -14,7 +14,7 @@ EG:
{
"bots": [
{
"import": "(data)/bots/ai-robots-txt.yaml"
"import": "(data)/bots/ai-catchall.yaml"
},
{
"import": "(data)/bots/cloudflare-workers.yaml"
@@ -29,8 +29,8 @@ EG:
```yaml
bots:
# Pathological bots to deny
- # This correlates to data/bots/ai-robots-txt.yaml in the source tree
import: (data)/bots/ai-robots-txt.yaml
- # This correlates to data/bots/ai-catchall.yaml in the source tree
import: (data)/bots/ai-catchall.yaml
- import: (data)/bots/cloudflare-workers.yaml
```
@@ -46,7 +46,7 @@ Of note, a bot rule can either have inline bot configuration or import a bot con
{
"bots": [
{
"import": "(data)/bots/ai-robots-txt.yaml",
"import": "(data)/bots/ai-catchall.yaml",
"name": "generic-browser",
"user_agent_regex": "Mozilla|Opera\n",
"action": "CHALLENGE"
@@ -60,7 +60,7 @@ Of note, a bot rule can either have inline bot configuration or import a bot con
```yaml
bots:
- import: (data)/bots/ai-robots-txt.yaml
- import: (data)/bots/ai-catchall.yaml
name: generic-browser
user_agent_regex: >
Mozilla|Opera
@@ -167,7 +167,7 @@ static
├── botPolicies.json
├── botPolicies.yaml
├── bots
│ ├── ai-robots-txt.yaml
│ ├── ai-catchall.yaml
│ ├── cloudflare-workers.yaml
│ ├── headless-browsers.yaml
│ └── us-ai-scraper.yaml

View File

@@ -0,0 +1,39 @@
# Wordpress
Wordpress is the most popular blog engine on the planet.
## Using a multi-site setup with Anubis
If you have a multi-site setup where traffic goes through Anubis like this:
```mermaid
---
title: Apache as tls terminator and HTTP router
---
flowchart LR
T(User Traffic)
subgraph Apache 2
TCP(TCP 80/443)
US(TCP 3001)
end
An(Anubis)
B(Backend)
T --> |TLS termination| TCP
TCP --> |Traffic filtering| An
An --> |Happy traffic| US
US --> |whatever you're doing| B
```
Wordpress may not realize that the underlying connection is being done over HTTPS. This could lead to a redirect loop in the `/wp-admin/` routes. In order to fix this, add the following to your `wp-config.php` file:
```php
if (isset($_SERVER['HTTP_X_FORWARDED_PROTO']) && $_SERVER['HTTP_X_FORWARDED_PROTO'] === 'https') {
$_SERVER['HTTPS'] = 'on';
$_SERVER['SERVER_PORT'] = 443;
}
```
This will make Wordpress think that your connection is over HTTPS instead of plain HTTP.

View File

@@ -84,6 +84,7 @@ If you don't know or understand what these settings mean, ignore them. These are
| Environment Variable | Default value | Explanation |
| :---------------------------- | :------------ | :-------------------------------------------------------------------------------------------------------------------------------------------------- |
| `TARGET_SNI` | unset | If set, overrides the TLS handshake hostname in requests forwarded to `TARGET`. |
| `TARGET_HOST` | unset | If set, overrides the Host header in requests forwarded to `TARGET`. |
| `TARGET_INSECURE_SKIP_VERIFY` | `false` | If `true`, skip TLS certificate validation for targets that listen over `https`. If your backend does not listen over `https`, ignore this setting. |

View File

@@ -49,7 +49,7 @@ sudo install -D ./run/anubis@.service /etc/systemd/system
Install the default configuration file to your system:
```text
sudo install -D ./run/default.env /etc/anubis
sudo install -D ./run/default.env /etc/anubis/default.env
```
</TabItem>
@@ -77,6 +77,13 @@ Install Anubis with `rpm`:
sudo rpm -ivh ./anubis-$VERSION.$ARCH.rpm
```
</TabItem>
<TabItem value="distro" label="Package managers">
Some Linux distributions offer Anubis [as a native package](https://repology.org/project/anubis-anti-crawler/versions). If you want to install Anubis from your distribution's package manager, consult any upstream documentation for how to install the package. It will either be named `anubis`, `www-apps/anubis` or `www/anubis`.
If you use a systemd-flavoured distribution, then follow the setup instructions for Debian or Red Hat Linux.
</TabItem>
</Tabs>

View File

@@ -35,6 +35,11 @@ This page contains a non-exhaustive list with all websites using Anubis.
- https://hofstede.io/
- https://www.indiemag.fr/
- https://reddit.nerdvpn.de/
- https://hosted.weblate.org/
- https://gitea.com/
- https://openwrt.org/
- https://minihoot.site
- https://catgirl.click/
- <details>
<summary>FreeCAD</summary>
- https://forum.freecad.org/

14
go.mod
View File

@@ -12,22 +12,22 @@ require (
github.com/sebest/xff v0.0.0-20210106013422-671bd2870b3a
github.com/yl2chen/cidranger v1.0.2
golang.org/x/net v0.40.0
k8s.io/apimachinery v0.33.0
k8s.io/apimachinery v0.33.1
)
require (
al.essio.dev/pkg/shellescape v1.6.0 // indirect
cel.dev/expr v0.23.1 // indirect
dario.cat/mergo v1.0.1 // indirect
dario.cat/mergo v1.0.2 // indirect
github.com/AlekSi/pointer v1.2.0 // indirect
github.com/BurntSushi/toml v1.4.1-0.20240526193622-a339e1f7089c // indirect
github.com/Masterminds/goutils v1.1.1 // indirect
github.com/Masterminds/semver/v3 v3.3.1 // indirect
github.com/Masterminds/sprig/v3 v3.3.0 // indirect
github.com/Microsoft/go-winio v0.6.2 // indirect
github.com/ProtonMail/go-crypto v1.1.6 // indirect
github.com/ProtonMail/go-crypto v1.2.0 // indirect
github.com/Songmu/gitconfig v0.2.0 // indirect
github.com/TecharoHQ/yeet v0.2.3 // indirect
github.com/TecharoHQ/yeet v0.6.0 // indirect
github.com/a-h/parse v0.0.0-20250122154542-74294addb73e // indirect
github.com/andybalholm/brotli v1.1.0 // indirect
github.com/antlr4-go/antlr/v4 v4.13.0 // indirect
@@ -59,11 +59,11 @@ require (
github.com/goccy/go-yaml v1.12.0 // indirect
github.com/golang/groupcache v0.0.0-20241129210726-2c02b8208cf8 // indirect
github.com/google/pprof v0.0.0-20230207041349-798e818bf904 // indirect
github.com/google/rpmpack v0.6.1-0.20240329070804-c2247cbb881a // indirect
github.com/google/rpmpack v0.6.1-0.20250405124433-758cc6896cbc // indirect
github.com/google/uuid v1.6.0 // indirect
github.com/goreleaser/chglog v0.7.0 // indirect
github.com/goreleaser/fileglob v1.3.0 // indirect
github.com/goreleaser/nfpm/v2 v2.42.0 // indirect
github.com/goreleaser/nfpm/v2 v2.42.1 // indirect
github.com/huandu/xstrings v1.5.0 // indirect
github.com/jbenet/go-context v0.0.0-20150711004518-d14ea06fba99 // indirect
github.com/kevinburke/ssh_config v1.2.0 // indirect
@@ -95,6 +95,7 @@ require (
golang.org/x/sync v0.14.0 // indirect
golang.org/x/sys v0.33.0 // indirect
golang.org/x/telemetry v0.0.0-20240522233618-39ace7a40ae7 // indirect
golang.org/x/term v0.32.0 // indirect
golang.org/x/text v0.25.0 // indirect
golang.org/x/tools v0.32.0 // indirect
golang.org/x/vuln v1.1.4 // indirect
@@ -105,6 +106,7 @@ require (
gopkg.in/warnings.v0 v0.1.2 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
honnef.co/go/tools v0.6.1 // indirect
mvdan.cc/sh/v3 v3.11.0 // indirect
sigs.k8s.io/json v0.0.0-20241010143419-9aa6b5e7a4b3 // indirect
sigs.k8s.io/yaml v1.4.0 // indirect
)

16
go.sum
View File

@@ -4,6 +4,8 @@ cel.dev/expr v0.23.1 h1:K4KOtPCJQjVggkARsjG9RWXP6O4R73aHeJMa/dmCQQg=
cel.dev/expr v0.23.1/go.mod h1:hLPLo1W4QUmuYdA72RBX06QTs6MXw941piREPl3Yfiw=
dario.cat/mergo v1.0.1 h1:Ra4+bf83h2ztPIQYNP99R6m+Y7KfnARDfID+a+vLl4s=
dario.cat/mergo v1.0.1/go.mod h1:uNxQE+84aUszobStD9th8a29P2fMDhsBdgRYvZOxGmk=
dario.cat/mergo v1.0.2 h1:85+piFYR1tMbRrLcDwR18y4UKJ3aH1Tbzi24VRW1TK8=
dario.cat/mergo v1.0.2/go.mod h1:E/hbnu0NxMFBjpMIE34DRGLWqDy0g5FuKDhCb31ngxA=
github.com/AlekSi/pointer v1.2.0 h1:glcy/gc4h8HnG2Z3ZECSzZ1IX1x2JxRVuDzaJwQE0+w=
github.com/AlekSi/pointer v1.2.0/go.mod h1:gZGfd3dpW4vEc/UlyfKKi1roIqcCgwOIvb0tSNSBle0=
github.com/BurntSushi/toml v1.4.1-0.20240526193622-a339e1f7089c h1:pxW6RcqyfI9/kWtOwnv/G+AzdKuy2ZrqINhenH4HyNs=
@@ -22,6 +24,8 @@ github.com/Microsoft/go-winio v0.6.2 h1:F2VQgta7ecxGYO8k3ZZz3RS8fVIXVxONVUPlNERo
github.com/Microsoft/go-winio v0.6.2/go.mod h1:yd8OoFMLzJbo9gZq8j5qaps8bJ9aShtEA8Ipt1oGCvU=
github.com/ProtonMail/go-crypto v1.1.6 h1:ZcV+Ropw6Qn0AX9brlQLAUXfqLBc7Bl+f/DmNxpLfdw=
github.com/ProtonMail/go-crypto v1.1.6/go.mod h1:rA3QumHc/FZ8pAHreoekgiAbzpNsfQAosU5td4SnOrE=
github.com/ProtonMail/go-crypto v1.2.0 h1:+PhXXn4SPGd+qk76TlEePBfOfivE0zkWFenhGhFLzWs=
github.com/ProtonMail/go-crypto v1.2.0/go.mod h1:9whxjD8Rbs29b4XWbB8irEcE8KHMqaR2e7GWU1R+/PE=
github.com/ProtonMail/go-mime v0.0.0-20230322103455-7d82a3887f2f h1:tCbYj7/299ekTTXpdwKYF8eBlsYsDVoggDAuAjoK66k=
github.com/ProtonMail/go-mime v0.0.0-20230322103455-7d82a3887f2f/go.mod h1:gcr0kNtGBqin9zDW9GOHcVntrwnjrK+qdJ06mWYBybw=
github.com/ProtonMail/gopenpgp/v2 v2.7.1 h1:Awsg7MPc2gD3I7IFac2qE3Gdls0lZW8SzrFZ3k1oz0s=
@@ -30,6 +34,8 @@ github.com/Songmu/gitconfig v0.2.0 h1:pX2++u4KUq+K2k/ZCzGXLtkD3ceCqIdi0tDyb+IbSy
github.com/Songmu/gitconfig v0.2.0/go.mod h1:cB5bYJer+pl7W8g6RHFwL/0X6aJROVrYuHlvc7PT+hE=
github.com/TecharoHQ/yeet v0.2.3 h1:Pcsnq5HTnk4Xntlu/FNEidH7x55bIx+f5Mk1hpVIngs=
github.com/TecharoHQ/yeet v0.2.3/go.mod h1:avLiwxZpNY37A/o35XledvdmGnTkm3G7+Oskxca6Z7Y=
github.com/TecharoHQ/yeet v0.6.0 h1:RCBAjr7wIlllsgy0tpvWpLX7jsZgu2tiuBY3RrprcR0=
github.com/TecharoHQ/yeet v0.6.0/go.mod h1:bj2V4Fg8qKQXoiuPZa3HuawrE8g+LsOQv/9q2WyGSsA=
github.com/a-h/parse v0.0.0-20250122154542-74294addb73e h1:HjVbSQHy+dnlS6C3XajZ69NYAb5jbGNfHanvm1+iYlo=
github.com/a-h/parse v0.0.0-20250122154542-74294addb73e/go.mod h1:3mnrkvGpurZ4ZrTDbYU84xhwXW2TjTKShSwjRi2ihfQ=
github.com/a-h/templ v0.3.865 h1:nYn5EWm9EiXaDgWcMQaKiKvrydqgxDUtT1+4zU2C43A=
@@ -143,6 +149,8 @@ github.com/google/renameio v0.1.0 h1:GOZbcHa3HfsPKPlmyPyN2KEohoMXOhdMbHrvbpl2QaA
github.com/google/renameio v0.1.0/go.mod h1:KWCgfxg9yswjAJkECMjeO8J8rahYeXnNhOm40UhjYkI=
github.com/google/rpmpack v0.6.1-0.20240329070804-c2247cbb881a h1:JJBdjSfqSy3mnDT0940ASQFghwcZ4y4cb6ttjAoXqwE=
github.com/google/rpmpack v0.6.1-0.20240329070804-c2247cbb881a/go.mod h1:uqVAUVQLq8UY2hCDfmJ/+rtO3aw7qyhc90rCVEabEfI=
github.com/google/rpmpack v0.6.1-0.20250405124433-758cc6896cbc h1:qES+d3PvR9CN+zARQQH/bNXH0ybzmdjNMHICrBwXD28=
github.com/google/rpmpack v0.6.1-0.20250405124433-758cc6896cbc/go.mod h1:uqVAUVQLq8UY2hCDfmJ/+rtO3aw7qyhc90rCVEabEfI=
github.com/google/shlex v0.0.0-20191202100458-e7afc7fbc510 h1:El6M4kTTCOh6aBiKaUGG7oYTSPP8MxqL4YI3kZKwcP4=
github.com/google/shlex v0.0.0-20191202100458-e7afc7fbc510/go.mod h1:pupxD2MaaD3pAXIBCelhxNneeOaAeabZDe5s4K6zSpQ=
github.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0=
@@ -155,6 +163,8 @@ github.com/goreleaser/fileglob v1.3.0 h1:/X6J7U8lbDpQtBvGcwwPS6OpzkNVlVEsFUVRx9+
github.com/goreleaser/fileglob v1.3.0/go.mod h1:Jx6BoXv3mbYkEzwm9THo7xbr5egkAraxkGorbJb4RxU=
github.com/goreleaser/nfpm/v2 v2.42.0 h1:7BW4WQWyvZDrT0C7SyWop+J8rtqFyTB17Sb2/j/NxMI=
github.com/goreleaser/nfpm/v2 v2.42.0/go.mod h1:DtNL+nKpfB8sMFZp+X7Xu3W64atyZYtTnYe8O925/mg=
github.com/goreleaser/nfpm/v2 v2.42.1 h1:xu2pLRgQuz2ab+YZFoeIzwU/M5jjjCKDGwv1lRbVGvk=
github.com/goreleaser/nfpm/v2 v2.42.1/go.mod h1:dY53KWYKebkOocxgkmpM7SRX0Nv5hU+jEu2kIaM4/LI=
github.com/h2non/parth v0.0.0-20190131123155-b4df798d6542/go.mod h1:Ow0tF8D4Kplbc8s8sSb3V2oUCygFHVp8gC3Dn6U4MNI=
github.com/henvic/httpretty v0.0.6/go.mod h1:X38wLjWXHkXT7r2+uK8LjCMne9rsuNaBLJ+5cU2/Pmo=
github.com/huandu/xstrings v1.5.0 h1:2ag3IFq9ZDANvthTwTiqSSZLjDc+BedvHPAp5tJy2TI=
@@ -371,8 +381,10 @@ gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
honnef.co/go/tools v0.6.1 h1:R094WgE8K4JirYjBaOpz/AvTyUu/3wbmAoskKN/pxTI=
honnef.co/go/tools v0.6.1/go.mod h1:3puzxxljPCe8RGJX7BIy1plGbxEOZni5mR2aXe3/uk4=
k8s.io/apimachinery v0.33.0 h1:1a6kHrJxb2hs4t8EE5wuR/WxKDwGN1FKH3JvDtA0CIQ=
k8s.io/apimachinery v0.33.0/go.mod h1:BHW0YOu7n22fFv/JkYOEfkUYNRN0fj0BlvMFWA7b+SM=
k8s.io/apimachinery v0.33.1 h1:mzqXWV8tW9Rw4VeW9rEkqvnxj59k1ezDUl20tFK/oM4=
k8s.io/apimachinery v0.33.1/go.mod h1:BHW0YOu7n22fFv/JkYOEfkUYNRN0fj0BlvMFWA7b+SM=
mvdan.cc/sh/v3 v3.11.0 h1:q5h+XMDRfUGUedCqFFsjoFjrhwf2Mvtt1rkMvVz0blw=
mvdan.cc/sh/v3 v3.11.0/go.mod h1:LRM+1NjoYCzuq/WZ6y44x14YNAI0NK7FLPeQSaFagGg=
pault.ag/go/debian v0.18.0 h1:nr0iiyOU5QlG1VPnhZLNhnCcHx58kukvBJp+dvaM6CQ=
pault.ag/go/debian v0.18.0/go.mod h1:JFl0XWRCv9hWBrB5MDDZjA5GSEs1X3zcFK/9kCNIUmE=
pault.ag/go/topsort v0.1.1 h1:L0QnhUly6LmTv0e3DEzbN2q6/FGgAcQvaEw65S53Bg4=

7
internal/mimetype.go Normal file
View File

@@ -0,0 +1,7 @@
package internal
import "mime"
func init() {
mime.AddExtensionType(".mjs", "text/javascript")
}

View File

@@ -214,6 +214,11 @@ func TestPlaywrightBrowser(t *testing.T) {
return
}
if os.Getenv("SKIP_INTEGRATION") != "" {
t.Skip("SKIP_INTEGRATION was set")
return
}
startPlaywright(t)
pw := setupPlaywright(t)
@@ -289,6 +294,11 @@ func TestPlaywrightWithBasePrefix(t *testing.T) {
return
}
if os.Getenv("SKIP_INTEGRATION") != "" {
t.Skip("SKIP_INTEGRATION was set")
return
}
t.Skip("NOTE(Xe)\\ these tests require HTTPS support in #364")
startPlaywright(t)

View File

@@ -31,10 +31,10 @@ import (
)
var (
challengesIssued = promauto.NewCounter(prometheus.CounterOpts{
challengesIssued = promauto.NewCounterVec(prometheus.CounterOpts{
Name: "anubis_challenges_issued",
Help: "The total number of challenges issued",
})
}, []string{"method"})
challengesValidated = promauto.NewCounter(prometheus.CounterOpts{
Name: "anubis_challenges_validated",
@@ -56,6 +56,11 @@ var (
Help: "The time taken for a browser to generate a response (milliseconds)",
Buckets: prometheus.ExponentialBucketsRange(1, math.Pow(2, 18), 19),
})
requestsProxied = promauto.NewCounterVec(prometheus.CounterOpts{
Name: "anubis_proxied_requests_total",
Help: "Number of requests proxied through Anubis to upstream targets",
}, []string{"host"})
)
type Server struct {
@@ -71,11 +76,16 @@ type Server struct {
}
func (s *Server) challengeFor(r *http.Request, difficulty int) string {
fp := sha256.Sum256(s.priv.Seed())
fp := sha256.Sum256(s.pub[:])
acceptLanguage := r.Header.Get("Accept-Language")
if len(acceptLanguage) > 5 {
acceptLanguage = acceptLanguage[:5]
}
challengeData := fmt.Sprintf(
"Accept-Language=%s,X-Real-IP=%s,User-Agent=%s,WeekTime=%s,Fingerprint=%x,Difficulty=%d",
r.Header.Get("Accept-Language"),
acceptLanguage,
r.Header.Get("X-Real-Ip"),
r.UserAgent(),
time.Now().UTC().Round(24*7*time.Hour).Format(time.RFC3339),
@@ -157,6 +167,29 @@ func (s *Server) maybeReverseProxy(w http.ResponseWriter, r *http.Request, httpS
return
}
claims, ok := token.Claims.(jwt.MapClaims)
if !ok {
lg.Debug("invalid token claims type", "path", r.URL.Path)
s.ClearCookie(w, s.cookieName, cookiePath)
s.RenderIndex(w, r, rule, httpStatusOnly)
return
}
policyRule, ok := claims["policyRule"].(string)
if !ok {
lg.Debug("policyRule claim is not a string")
s.ClearCookie(w, s.cookieName, cookiePath)
s.RenderIndex(w, r, rule, httpStatusOnly)
return
}
if policyRule != rule.Hash() {
lg.Debug("user originally passed with a different rule, issuing new challenge", "old", policyRule, "new", rule.Name)
s.ClearCookie(w, s.cookieName, cookiePath)
s.RenderIndex(w, r, rule, httpStatusOnly)
return
}
r.Header.Add("X-Anubis-Status", "PASS")
s.ServeHTTPNext(w, r)
}
@@ -226,6 +259,21 @@ func (s *Server) handleDNSBL(w http.ResponseWriter, r *http.Request, ip string,
func (s *Server) MakeChallenge(w http.ResponseWriter, r *http.Request) {
lg := internal.GetRequestLogger(r)
redir := r.FormValue("redir")
if redir == "" {
w.WriteHeader(http.StatusBadRequest)
encoder := json.NewEncoder(w)
lg.Error("invalid invocation of MakeChallenge", "redir", redir)
encoder.Encode(struct {
Error string `json:"error"`
}{
Error: "Invalid invocation of MakeChallenge",
})
return
}
r.URL.Path = redir
encoder := json.NewEncoder(w)
cr, rule, err := s.check(r)
if err != nil {
@@ -260,7 +308,7 @@ func (s *Server) MakeChallenge(w http.ResponseWriter, r *http.Request) {
return
}
lg.Debug("made challenge", "challenge", challenge, "rules", rule.Challenge, "cr", cr)
challengesIssued.Inc()
challengesIssued.WithLabelValues("api").Inc()
}
func (s *Server) PassChallenge(w http.ResponseWriter, r *http.Request) {
@@ -369,15 +417,13 @@ func (s *Server) PassChallenge(w http.ResponseWriter, r *http.Request) {
}
// generate JWT cookie
token := jwt.NewWithClaims(jwt.SigningMethodEdDSA, jwt.MapClaims{
"challenge": challenge,
"nonce": nonceStr,
"response": response,
"iat": time.Now().Unix(),
"nbf": time.Now().Add(-1 * time.Minute).Unix(),
"exp": time.Now().Add(s.opts.CookieExpiration).Unix(),
tokenString, err := s.signJWT(jwt.MapClaims{
"challenge": challenge,
"nonce": nonceStr,
"response": response,
"policyRule": rule.Hash(),
"action": string(cr.Rule),
})
tokenString, err := token.SignedString(s.priv)
if err != nil {
lg.Error("failed to sign JWT", "err", err)
s.ClearCookie(w, s.cookieName, cookiePath)
@@ -433,6 +479,7 @@ func (s *Server) check(r *http.Request) (policy.CheckResult, *policy.Bot, error)
ReportAs: s.policy.DefaultDifficulty,
Algorithm: config.AlgorithmFast,
},
Rules: &policy.CheckerList{},
}, nil
}

View File

@@ -4,10 +4,11 @@ import (
"encoding/json"
"fmt"
"net/http"
"net/http/cookiejar"
"net/http/httptest"
"net/url"
"os"
"strings"
"sync"
"testing"
"time"
@@ -18,6 +19,10 @@ import (
"github.com/TecharoHQ/anubis/lib/policy/config"
)
func init() {
internal.InitSlog("debug")
}
func loadPolicies(t *testing.T, fname string) *policy.ParsedConfig {
t.Helper()
@@ -47,7 +52,16 @@ type challenge struct {
func makeChallenge(t *testing.T, ts *httptest.Server, cli *http.Client) challenge {
t.Helper()
resp, err := cli.Post(ts.URL+"/.within.website/x/cmd/anubis/api/make-challenge", "", nil)
req, err := http.NewRequest(http.MethodPost, ts.URL+"/.within.website/x/cmd/anubis/api/make-challenge", nil)
if err != nil {
t.Fatalf("can't make request: %v", err)
}
q := req.URL.Query()
q.Set("redir", "/")
req.URL.RawQuery = q.Encode()
resp, err := cli.Do(req)
if err != nil {
t.Fatalf("can't request challenge: %v", err)
}
@@ -91,16 +105,48 @@ func handleChallengeZeroDifficulty(t *testing.T, ts *httptest.Server, cli *http.
return resp
}
type loggingCookieJar struct {
t *testing.T
lock sync.Mutex
cookies map[string][]*http.Cookie
}
func (lcj *loggingCookieJar) Cookies(u *url.URL) []*http.Cookie {
lcj.lock.Lock()
defer lcj.lock.Unlock()
// XXX(Xe): This is not RFC compliant in the slightest.
result, ok := lcj.cookies[u.Host]
if !ok {
return nil
}
lcj.t.Logf("requested cookies for %s", u)
for _, ckie := range result {
lcj.t.Logf("get cookie: <- %s", ckie)
}
return result
}
func (lcj *loggingCookieJar) SetCookies(u *url.URL, cookies []*http.Cookie) {
lcj.lock.Lock()
defer lcj.lock.Unlock()
for _, ckie := range cookies {
lcj.t.Logf("set cookie: %s -> %s", u, ckie)
}
// XXX(Xe): This is not RFC compliant in the slightest.
lcj.cookies[u.Host] = append(lcj.cookies[u.Host], cookies...)
}
func httpClient(t *testing.T) *http.Client {
t.Helper()
jar, err := cookiejar.New(nil)
if err != nil {
t.Fatal(err)
}
cli := &http.Client{
Jar: jar,
Jar: &loggingCookieJar{t: t, cookies: map[string][]*http.Cookie{}},
CheckRedirect: func(req *http.Request, via []*http.Request) error {
return http.ErrUseLastResponse
},
@@ -134,8 +180,7 @@ func TestCVE2025_24369(t *testing.T) {
Next: http.NewServeMux(),
Policy: pol,
CookiePartitioned: true,
CookieName: t.Name(),
CookieName: t.Name(),
})
ts := httptest.NewServer(internal.RemoteXRealIP(true, "tcp", srv))
@@ -318,7 +363,7 @@ func TestBasePrefix(t *testing.T) {
}{
{
name: "no prefix",
basePrefix: "",
basePrefix: "/",
path: "/.within.website/x/cmd/anubis/api/make-challenge",
expected: "/.within.website/x/cmd/anubis/api/make-challenge",
},
@@ -353,8 +398,19 @@ func TestBasePrefix(t *testing.T) {
ts := httptest.NewServer(internal.RemoteXRealIP(true, "tcp", srv))
defer ts.Close()
cli := httpClient(t)
req, err := http.NewRequest(http.MethodPost, ts.URL+tc.path, nil)
if err != nil {
t.Fatal(err)
}
q := req.URL.Query()
q.Set("redir", tc.basePrefix)
req.URL.RawQuery = q.Encode()
// Test API endpoint with prefix
resp, err := ts.Client().Post(ts.URL+tc.path, "", nil)
resp, err := cli.Do(req)
if err != nil {
t.Fatalf("can't request challenge: %v", err)
}
@@ -388,7 +444,6 @@ func TestBasePrefix(t *testing.T) {
elapsedTime := 420
redir := "/"
cli := ts.Client()
cli.CheckRedirect = func(req *http.Request, via []*http.Request) error {
return http.ErrUseLastResponse
}
@@ -397,7 +452,7 @@ func TestBasePrefix(t *testing.T) {
passChallengePath := tc.path
passChallengePath = passChallengePath[:strings.LastIndex(passChallengePath, "/")+1] + "pass-challenge"
req, err := http.NewRequest(http.MethodGet, ts.URL+passChallengePath, nil)
req, err = http.NewRequest(http.MethodGet, ts.URL+passChallengePath, nil)
if err != nil {
t.Fatalf("can't make request: %v", err)
}
@@ -406,7 +461,7 @@ func TestBasePrefix(t *testing.T) {
req.AddCookie(ckie)
}
q := req.URL.Query()
q = req.URL.Query()
q.Set("response", calculated)
q.Set("nonce", fmt.Sprint(nonce))
q.Set("redir", redir)
@@ -549,3 +604,31 @@ func TestCloudflareWorkersRule(t *testing.T) {
})
}
}
func TestRuleChange(t *testing.T) {
pol := loadPolicies(t, "testdata/rule_change.yaml")
pol.DefaultDifficulty = 0
ckieExpiration := 10 * time.Minute
srv := spawnAnubis(t, Options{
Next: http.NewServeMux(),
Policy: pol,
CookieDomain: "127.0.0.1",
CookieName: t.Name(),
CookieExpiration: ckieExpiration,
})
ts := httptest.NewServer(internal.RemoteXRealIP(true, "tcp", srv))
defer ts.Close()
cli := httpClient(t)
chall := makeChallenge(t, ts, cli)
resp := handleChallengeZeroDifficulty(t, ts, cli, chall)
if resp.StatusCode != http.StatusFound {
resp.Write(os.Stderr)
t.Errorf("wanted %d, got: %d", http.StatusFound, resp.StatusCode)
}
}

View File

@@ -12,6 +12,7 @@ import (
"github.com/TecharoHQ/anubis/lib/policy"
"github.com/TecharoHQ/anubis/web"
"github.com/a-h/templ"
"github.com/golang-jwt/jwt/v5"
)
func (s *Server) SetCookie(w http.ResponseWriter, name, value, path string) {
@@ -73,6 +74,7 @@ func (s *Server) RenderIndex(w http.ResponseWriter, r *http.Request, rule *polic
s.respondWithError(w, r, "Client Error: Please ensure your browser is up to date and try again later.")
}
challengesIssued.WithLabelValues("embedded").Add(1)
challenge := s.challengeFor(r, rule.Challenge.Difficulty)
var ogTags map[string]string = nil
@@ -146,6 +148,15 @@ func (s *Server) ServeHTTPNext(w http.ResponseWriter, r *http.Request) {
web.Base("You are not a bot!", web.StaticHappy()),
).ServeHTTP(w, r)
} else {
requestsProxied.WithLabelValues(r.Host).Inc()
s.next.ServeHTTP(w, r)
}
}
func (s *Server) signJWT(claims jwt.MapClaims) (string, error) {
claims["iat"] = time.Now().Unix()
claims["nbf"] = time.Now().Add(-1 * time.Minute).Unix()
claims["exp"] = time.Now().Add(s.opts.CookieExpiration).Unix()
return jwt.NewWithClaims(jwt.SigningMethodEdDSA, claims).SignedString(s.priv)
}

View File

@@ -224,7 +224,7 @@ func (is *ImportStatement) open() (fs.File, error) {
func (is *ImportStatement) load() error {
fin, err := is.open()
if err != nil {
return fmt.Errorf("can't open %s: %w", is.Import, err)
return fmt.Errorf("%w: %s: %w", ErrInvalidImportStatement, is.Import, err)
}
defer fin.Close()

View File

@@ -251,6 +251,7 @@ func TestImportStatement(t *testing.T) {
"bots",
"common",
"crawlers",
"meta",
} {
if err := fs.WalkDir(data.BotPolicies, folderName, func(path string, d fs.DirEntry, err error) error {
if err != nil {
@@ -259,6 +260,9 @@ func TestImportStatement(t *testing.T) {
if d.IsDir() {
return nil
}
if d.Name() == "README.md" {
return nil
}
tests = append(tests, testCase{
name: "(data)/" + path,

View File

@@ -54,6 +54,9 @@ func (eol *ExpressionOrList) UnmarshalJSON(data []byte) error {
}
func (eol *ExpressionOrList) Valid() error {
if eol.Expression == "" && len(eol.All) == 0 && len(eol.Any) == 0 {
return ErrExpressionEmpty
}
if len(eol.All) != 0 && len(eol.Any) != 0 {
return ErrExpressionCantHaveBoth
}

View File

@@ -51,6 +51,13 @@ func TestExpressionOrListUnmarshal(t *testing.T) {
}`,
validErr: ErrExpressionCantHaveBoth,
},
{
name: "expression-empty",
inp: `{
"any": []
}`,
validErr: ErrExpressionEmpty,
},
} {
t.Run(tt.name, func(t *testing.T) {
var eol ExpressionOrList

View File

@@ -1,7 +1,7 @@
{
"bots": [
{
"import": "(data)/bots/ai-robots-txt.yaml",
"import": "(data)/bots/ai-catchall.yaml",
"name": "generic-browser",
"user_agent_regex": "Mozilla|Opera\n",
"action": "CHALLENGE"

View File

@@ -1,5 +1,5 @@
bots:
- import: (data)/bots/ai-robots-txt.yaml
- import: (data)/bots/ai-catchall.yaml
name: generic-browser
user_agent_regex: >
Mozilla|Opera

View File

@@ -0,0 +1,8 @@
bots:
- name: total-randomness
action: ALLOW
expression:
all:
- '"Accept" in headers'
- headers["Accept"].contains("text/html")
- randInt(1) == 0

View File

@@ -1,7 +1,11 @@
package expressions
import (
"math/rand/v2"
"github.com/google/cel-go/cel"
"github.com/google/cel-go/common/types"
"github.com/google/cel-go/common/types/ref"
"github.com/google/cel-go/ext"
)
@@ -29,6 +33,20 @@ func NewEnvironment() (*cel.Env, error) {
cel.Variable("headers", cel.MapType(cel.StringType, cel.StringType)),
// Functions exposed to CEL programs:
cel.Function("randInt",
cel.Overload("randInt_int",
[]*cel.Type{cel.IntType},
cel.IntType,
cel.UnaryBinding(func(val ref.Val) ref.Val {
n, ok := val.(types.Int)
if !ok {
return types.ValOrErr(val, "value is not an integer, but is %T", val)
}
return types.Int(rand.IntN(int(n)))
}),
),
),
)
}

12
lib/testdata/rule_change.yaml vendored Normal file
View File

@@ -0,0 +1,12 @@
bots:
- name: old-rule
path_regex: ^/old$
action: CHALLENGE
- name: new-rule
path_regex: ^/new$
action: CHALLENGE
status_codes:
CHALLENGE: 401
DENY: 403

4
package-lock.json generated
View File

@@ -1,12 +1,12 @@
{
"name": "@techaro/anubis",
"version": "1.18.0",
"version": "1.19.1",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "@techaro/anubis",
"version": "1.18.0",
"version": "1.19.1",
"license": "ISC",
"devDependencies": {
"cssnano": "^7.0.7",

View File

@@ -1,6 +1,6 @@
{
"name": "@techaro/anubis",
"version": "1.18.0",
"version": "1.19.1",
"description": "",
"main": "index.js",
"scripts": {
@@ -27,4 +27,4 @@
"postcss-import-url": "^7.2.0",
"postcss-url": "^10.1.3"
}
}
}

24
run/openrc/anubis.confd Normal file
View File

@@ -0,0 +1,24 @@
# The URL of the service that Anubis should forward valid requests to. Supports
# Unix domain sockets.
#ANUBIS_TARGET="http://localhost:3923"
#ANUBIS_TARGET="unix:///path/to/socket"
# The network address that Anubis listens on.
#
# If unset, listen on /run/anubis_${instance}/anubis.sock Unix socket instead.
#ANUBIS_BIND_PORT=":8923"
# The network address that Anubis serves Prometheus metrics on.
#
# If unset, listen on /run/anubis_${instance}/metrix.sock Unix socket instead.
#ANUBIS_METRICS_BIND_PORT=":9090"
# The difficulty of the challenge, or the number of leading zeroes that must be
# in successful responses.
#ANUBIS_DIFFICULTY=4
# Additional command-line options for Anubis.
#ANUBIS_OPTS=""
# Configure the user[:group] Anubis will run as.
#command_user="anubis:anubis"

34
run/openrc/anubis.initd Normal file
View File

@@ -0,0 +1,34 @@
#!/sbin/openrc-run
# shellcheck shell=sh
instance=${RC_SVCNAME#*.}
description="Anubis HTTP defense proxy (instance ${instance})"
supervisor="supervise-daemon"
command="/usr/bin/anubis"
command_args="\
-bind ${ANUBIS_BIND_PORT:-/run/anubis_${instance?}/anubis.sock -bind-network unix} \
-metrics-bind ${ANUBIS_METRICS_BIND_PORT:-/run/anubis_${instance?}/metrics.sock -metrics-bind-network unix} \
-target ${ANUBIS_TARGET:-http://localhost:3923} \
-difficulty ${ANUBIS_DIFFICULTY:-4} \
${ANUBIS_OPTS}
"
command_background=1
pidfile="/run/anubis_${instance?}/anubis.pid"
: "${command_user:=anubis:anubis}"
depend() {
use net firewall
}
start_pre() {
if [ "${instance?}" = "${RC_SVCNAME?}" ]; then
eerror "${RC_SVCNAME?} cannot be started directly. You must create"
eerror "symbolic links to it for the services you want to start"
eerror "and add those to the appropriate runlevels."
return 1
fi
checkpath -d -o "${command_user?}" "/run/anubis_${instance?}"
}

View File

@@ -42,11 +42,9 @@ templ base(title string, body templ.Component, challenge any, ogTags map[string]
border-radius: 1rem;
overflow: hidden;
margin: 1rem 0 2rem;
outline-color: #b16286;
outline-offset: 2px;
outline-style: solid;
outline-width: 4px;
}
outline-offset: 2px;
outline: #b16286 solid 4px;
}
.bar-inner {
background-color: #b16286;
@@ -123,6 +121,7 @@ templ index() {
>JShelter</a> will disable. Please disable JShelter or other such
plugins for this domain.
</p>
<p>This website is running Anubis version <code>{ anubis.Version }</code>.</p>
</details>
<noscript>
<p>
@@ -154,15 +153,15 @@ templ errorPage(message string, mail string) {
}
templ StaticHappy() {
<div class="centered-div">
<img
style="display:none;"
style="width:100%;max-width:256px;"
src={ "/.within.website/x/cmd/anubis/static/img/happy.webp?cacheBuster=" +
<div class="centered-div">
<img
style="display:none;"
style="width:100%;max-width:256px;"
src={ "/.within.website/x/cmd/anubis/static/img/happy.webp?cacheBuster=" +
anubis.Version }
/>
<p>This is just a check endpoint for your reverse proxy to use.</p>
</div>
/>
<p>This is just a check endpoint for your reverse proxy to use.</p>
</div>
}
templ bench() {

145
web/index_templ.go generated
View File

@@ -96,7 +96,7 @@ func base(title string, body templ.Component, challenge any, ogTags map[string]s
return templ_7745c5c3_Err
}
}
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 7, "<style>\n body,\n html {\n height: 100%;\n display: flex;\n justify-content: center;\n align-items: center;\n margin-left: auto;\n margin-right: auto;\n }\n\n .centered-div {\n text-align: center;\n }\n\n #status {\n font-variant-numeric: tabular-nums;\n }\n\n #progress {\n display: none;\n width: min(20rem, 90%);\n height: 2rem;\n border-radius: 1rem;\n overflow: hidden;\n margin: 1rem 0 2rem;\n outline-color: #b16286;\n outline-offset: 2px;\n outline-style: solid;\n outline-width: 4px;\n }\n\n .bar-inner {\n background-color: #b16286;\n height: 100%;\n width: 0;\n transition: width 0.25s ease-in;\n }\n </style>")
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 7, "<style>\n body,\n html {\n height: 100%;\n display: flex;\n justify-content: center;\n align-items: center;\n margin-left: auto;\n margin-right: auto;\n }\n\n .centered-div {\n text-align: center;\n }\n\n #status {\n font-variant-numeric: tabular-nums;\n }\n\n #progress {\n display: none;\n width: min(20rem, 90%);\n height: 2rem;\n border-radius: 1rem;\n overflow: hidden;\n margin: 1rem 0 2rem;\n\t\t\toutline-offset: 2px;\n\t\t\toutline: #b16286 solid 4px;\n\t\t}\n\n .bar-inner {\n background-color: #b16286;\n height: 100%;\n width: 0;\n transition: width 0.25s ease-in;\n }\n </style>")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
@@ -121,7 +121,7 @@ func base(title string, body templ.Component, challenge any, ogTags map[string]s
var templ_7745c5c3_Var6 string
templ_7745c5c3_Var6, templ_7745c5c3_Err = templ.JoinStringErrs(title)
if templ_7745c5c3_Err != nil {
return templ.Error{Err: templ_7745c5c3_Err, FileName: `index.templ`, Line: 67, Col: 49}
return templ.Error{Err: templ_7745c5c3_Err, FileName: `index.templ`, Line: 65, Col: 49}
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(templ.EscapeString(templ_7745c5c3_Var6))
if templ_7745c5c3_Err != nil {
@@ -171,7 +171,7 @@ func index() templ.Component {
var templ_7745c5c3_Var8 string
templ_7745c5c3_Var8, templ_7745c5c3_Err = templ.JoinStringErrs(anubis.BasePrefix + "/.within.website/x/cmd/anubis/static/img/pensive.webp?cacheBuster=" + anubis.Version)
if templ_7745c5c3_Err != nil {
return templ.Error{Err: templ_7745c5c3_Err, FileName: `index.templ`, Line: 87, Col: 165}
return templ.Error{Err: templ_7745c5c3_Err, FileName: `index.templ`, Line: 85, Col: 165}
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(templ.EscapeString(templ_7745c5c3_Var8))
if templ_7745c5c3_Err != nil {
@@ -184,7 +184,7 @@ func index() templ.Component {
var templ_7745c5c3_Var9 string
templ_7745c5c3_Var9, templ_7745c5c3_Err = templ.JoinStringErrs(anubis.BasePrefix + "/.within.website/x/cmd/anubis/static/img/happy.webp?cacheBuster=" + anubis.Version)
if templ_7745c5c3_Err != nil {
return templ.Error{Err: templ_7745c5c3_Err, FileName: `index.templ`, Line: 88, Col: 174}
return templ.Error{Err: templ_7745c5c3_Err, FileName: `index.templ`, Line: 86, Col: 174}
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(templ.EscapeString(templ_7745c5c3_Var9))
if templ_7745c5c3_Err != nil {
@@ -197,13 +197,26 @@ func index() templ.Component {
var templ_7745c5c3_Var10 string
templ_7745c5c3_Var10, templ_7745c5c3_Err = templ.JoinStringErrs(anubis.BasePrefix + "/.within.website/x/cmd/anubis/static/js/main.mjs?cacheBuster=" + anubis.Version)
if templ_7745c5c3_Err != nil {
return templ.Error{Err: templ_7745c5c3_Err, FileName: `index.templ`, Line: 90, Col: 136}
return templ.Error{Err: templ_7745c5c3_Err, FileName: `index.templ`, Line: 88, Col: 136}
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(templ.EscapeString(templ_7745c5c3_Var10))
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 14, "\"></script><div id=\"progress\" role=\"progressbar\" aria-labelledby=\"status\"><div class=\"bar-inner\"></div></div><details><summary>Why am I seeing this?</summary><p>You are seeing this because the administrator of this website has set up <a href=\"https://github.com/TecharoHQ/anubis\">Anubis</a> to protect the server against the scourge of <a href=\"https://thelibre.news/foss-infrastructure-is-under-attack-by-ai-companies/\">AI companies aggressively scraping websites</a>. This can and does cause downtime for the websites, which makes their resources inaccessible for everyone.</p><p>Anubis is a compromise. Anubis uses a <a href=\"https://anubis.techaro.lol/docs/design/why-proof-of-work\">Proof-of-Work</a> scheme in the vein of <a href=\"https://en.wikipedia.org/wiki/Hashcash\">Hashcash</a>, a proposed proof-of-work scheme for reducing email spam. The idea is that at individual scales the additional load is ignorable, but at mass scraper levels it adds up and makes scraping much more expensive.</p><p>Ultimately, this is a hack whose real purpose is to give a \"good enough\" placeholder solution so that more time can be spent on fingerprinting and identifying headless browsers (EG: via how they do font rendering) so that the challenge proof of work page doesn't need to be presented to users that are much more likely to be legitimate.</p><p>Please note that Anubis requires the use of modern JavaScript features that plugins like <a href=\"https://jshelter.org/\">JShelter</a> will disable. Please disable JShelter or other such plugins for this domain.</p></details><noscript><p>Sadly, you must enable JavaScript to get past this challenge. This is required because AI companies have changed the social contract around how website hosting works. A no-JS solution is a work-in-progress.</p></noscript><div id=\"testarea\"></div></div>")
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 14, "\"></script><div id=\"progress\" role=\"progressbar\" aria-labelledby=\"status\"><div class=\"bar-inner\"></div></div><details><summary>Why am I seeing this?</summary><p>You are seeing this because the administrator of this website has set up <a href=\"https://github.com/TecharoHQ/anubis\">Anubis</a> to protect the server against the scourge of <a href=\"https://thelibre.news/foss-infrastructure-is-under-attack-by-ai-companies/\">AI companies aggressively scraping websites</a>. This can and does cause downtime for the websites, which makes their resources inaccessible for everyone.</p><p>Anubis is a compromise. Anubis uses a <a href=\"https://anubis.techaro.lol/docs/design/why-proof-of-work\">Proof-of-Work</a> scheme in the vein of <a href=\"https://en.wikipedia.org/wiki/Hashcash\">Hashcash</a>, a proposed proof-of-work scheme for reducing email spam. The idea is that at individual scales the additional load is ignorable, but at mass scraper levels it adds up and makes scraping much more expensive.</p><p>Ultimately, this is a hack whose real purpose is to give a \"good enough\" placeholder solution so that more time can be spent on fingerprinting and identifying headless browsers (EG: via how they do font rendering) so that the challenge proof of work page doesn't need to be presented to users that are much more likely to be legitimate.</p><p>Please note that Anubis requires the use of modern JavaScript features that plugins like <a href=\"https://jshelter.org/\">JShelter</a> will disable. Please disable JShelter or other such plugins for this domain.</p><p>This website is running Anubis version <code>")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
var templ_7745c5c3_Var11 string
templ_7745c5c3_Var11, templ_7745c5c3_Err = templ.JoinStringErrs(anubis.Version)
if templ_7745c5c3_Err != nil {
return templ.Error{Err: templ_7745c5c3_Err, FileName: `index.templ`, Line: 124, Col: 67}
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(templ.EscapeString(templ_7745c5c3_Var11))
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 15, "</code>.</p></details><noscript><p>Sadly, you must enable JavaScript to get past this challenge. This is required because AI companies have changed the social contract around how website hosting works. A no-JS solution is a work-in-progress.</p></noscript><div id=\"testarea\"></div></div>")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
@@ -227,75 +240,75 @@ func errorPage(message string, mail string) templ.Component {
}()
}
ctx = templ.InitializeContext(ctx)
templ_7745c5c3_Var11 := templ.GetChildren(ctx)
if templ_7745c5c3_Var11 == nil {
templ_7745c5c3_Var11 = templ.NopComponent
templ_7745c5c3_Var12 := templ.GetChildren(ctx)
if templ_7745c5c3_Var12 == nil {
templ_7745c5c3_Var12 = templ.NopComponent
}
ctx = templ.ClearChildren(ctx)
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 15, "<div class=\"centered-div\"><img id=\"image\" alt=\"Sad Anubis\" style=\"width:100%;max-width:256px;\" src=\"")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
var templ_7745c5c3_Var12 string
templ_7745c5c3_Var12, templ_7745c5c3_Err = templ.JoinStringErrs(anubis.BasePrefix + "/.within.website/x/cmd/anubis/static/img/reject.webp?cacheBuster=" + anubis.Version)
if templ_7745c5c3_Err != nil {
return templ.Error{Err: templ_7745c5c3_Err, FileName: `index.templ`, Line: 140, Col: 181}
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(templ.EscapeString(templ_7745c5c3_Var12))
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 16, "\"><p>")
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 16, "<div class=\"centered-div\"><img id=\"image\" alt=\"Sad Anubis\" style=\"width:100%;max-width:256px;\" src=\"")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
var templ_7745c5c3_Var13 string
templ_7745c5c3_Var13, templ_7745c5c3_Err = templ.JoinStringErrs(message)
templ_7745c5c3_Var13, templ_7745c5c3_Err = templ.JoinStringErrs(anubis.BasePrefix + "/.within.website/x/cmd/anubis/static/img/reject.webp?cacheBuster=" + anubis.Version)
if templ_7745c5c3_Err != nil {
return templ.Error{Err: templ_7745c5c3_Err, FileName: `index.templ`, Line: 141, Col: 14}
return templ.Error{Err: templ_7745c5c3_Err, FileName: `index.templ`, Line: 139, Col: 181}
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(templ.EscapeString(templ_7745c5c3_Var13))
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 17, ".</p><button onClick=\"window.location.reload();\">Try again</button> ")
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 17, "\"><p>")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
var templ_7745c5c3_Var14 string
templ_7745c5c3_Var14, templ_7745c5c3_Err = templ.JoinStringErrs(message)
if templ_7745c5c3_Err != nil {
return templ.Error{Err: templ_7745c5c3_Err, FileName: `index.templ`, Line: 140, Col: 14}
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(templ.EscapeString(templ_7745c5c3_Var14))
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 18, ".</p><button onClick=\"window.location.reload();\">Try again</button> ")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
if mail != "" {
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 18, "<p><a href=\"/\">Go home</a> or if you believe you should not be blocked, please contact the webmaster at <a href=\"")
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 19, "<p><a href=\"/\">Go home</a> or if you believe you should not be blocked, please contact the webmaster at <a href=\"")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
var templ_7745c5c3_Var14 templ.SafeURL = "mailto:" + templ.SafeURL(mail)
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(templ.EscapeString(string(templ_7745c5c3_Var14)))
var templ_7745c5c3_Var15 templ.SafeURL = "mailto:" + templ.SafeURL(mail)
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(templ.EscapeString(string(templ_7745c5c3_Var15)))
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 19, "\">")
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 20, "\">")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
var templ_7745c5c3_Var15 string
templ_7745c5c3_Var15, templ_7745c5c3_Err = templ.JoinStringErrs(mail)
var templ_7745c5c3_Var16 string
templ_7745c5c3_Var16, templ_7745c5c3_Err = templ.JoinStringErrs(mail)
if templ_7745c5c3_Err != nil {
return templ.Error{Err: templ_7745c5c3_Err, FileName: `index.templ`, Line: 147, Col: 11}
return templ.Error{Err: templ_7745c5c3_Err, FileName: `index.templ`, Line: 146, Col: 11}
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(templ.EscapeString(templ_7745c5c3_Var15))
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(templ.EscapeString(templ_7745c5c3_Var16))
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 20, "</a></p>")
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 21, "</a></p>")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
} else {
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 21, "<p><a href=\"/\">Go home</a></p>")
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 22, "<p><a href=\"/\">Go home</a></p>")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
}
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 22, "</div>")
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 23, "</div>")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
@@ -319,26 +332,26 @@ func StaticHappy() templ.Component {
}()
}
ctx = templ.InitializeContext(ctx)
templ_7745c5c3_Var16 := templ.GetChildren(ctx)
if templ_7745c5c3_Var16 == nil {
templ_7745c5c3_Var16 = templ.NopComponent
templ_7745c5c3_Var17 := templ.GetChildren(ctx)
if templ_7745c5c3_Var17 == nil {
templ_7745c5c3_Var17 = templ.NopComponent
}
ctx = templ.ClearChildren(ctx)
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 23, "<div class=\"centered-div\"><img style=\"display:none;\" style=\"width:100%;max-width:256px;\" src=\"")
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 24, "<div class=\"centered-div\"><img style=\"display:none;\" style=\"width:100%;max-width:256px;\" src=\"")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
var templ_7745c5c3_Var17 string
templ_7745c5c3_Var17, templ_7745c5c3_Err = templ.JoinStringErrs("/.within.website/x/cmd/anubis/static/img/happy.webp?cacheBuster=" +
var templ_7745c5c3_Var18 string
templ_7745c5c3_Var18, templ_7745c5c3_Err = templ.JoinStringErrs("/.within.website/x/cmd/anubis/static/img/happy.webp?cacheBuster=" +
anubis.Version)
if templ_7745c5c3_Err != nil {
return templ.Error{Err: templ_7745c5c3_Err, FileName: `index.templ`, Line: 162, Col: 18}
return templ.Error{Err: templ_7745c5c3_Err, FileName: `index.templ`, Line: 161, Col: 18}
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(templ.EscapeString(templ_7745c5c3_Var17))
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(templ.EscapeString(templ_7745c5c3_Var18))
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 24, "\"><p>This is just a check endpoint for your reverse proxy to use.</p></div>")
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 25, "\"><p>This is just a check endpoint for your reverse proxy to use.</p></div>")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
@@ -362,38 +375,38 @@ func bench() templ.Component {
}()
}
ctx = templ.InitializeContext(ctx)
templ_7745c5c3_Var18 := templ.GetChildren(ctx)
if templ_7745c5c3_Var18 == nil {
templ_7745c5c3_Var18 = templ.NopComponent
templ_7745c5c3_Var19 := templ.GetChildren(ctx)
if templ_7745c5c3_Var19 == nil {
templ_7745c5c3_Var19 = templ.NopComponent
}
ctx = templ.ClearChildren(ctx)
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 25, "<div style=\"height:20rem;display:flex\"><table style=\"margin-top:1rem;display:grid;grid-template:auto 1fr/auto auto;gap:0 0.5rem\"><thead style=\"border-bottom:1px solid black;padding:0.25rem 0;display:grid;grid-template:1fr/subgrid;grid-column:1/-1\"><tr id=\"table-header\" style=\"display:contents\"><th style=\"width:4.5rem\">Time</th><th style=\"width:4rem\">Iters</th></tr><tr id=\"table-header-compare\" style=\"display:none\"><th style=\"width:4.5rem\">Time A</th><th style=\"width:4rem\">Iters A</th><th style=\"width:4.5rem\">Time B</th><th style=\"width:4rem\">Iters B</th></tr></thead> <tbody id=\"results\" style=\"padding-top:0.25rem;display:grid;grid-template-columns:subgrid;grid-auto-rows:min-content;grid-column:1/-1;row-gap:0.25rem;overflow-y:auto;font-variant-numeric:tabular-nums\"></tbody></table><div class=\"centered-div\"><img id=\"image\" style=\"width:100%;max-width:256px;\" src=\"")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
var templ_7745c5c3_Var19 string
templ_7745c5c3_Var19, templ_7745c5c3_Err = templ.JoinStringErrs(anubis.BasePrefix + "/.within.website/x/cmd/anubis/static/img/pensive.webp?cacheBuster=" + anubis.Version)
if templ_7745c5c3_Err != nil {
return templ.Error{Err: templ_7745c5c3_Err, FileName: `index.templ`, Line: 191, Col: 166}
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(templ.EscapeString(templ_7745c5c3_Var19))
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 26, "\"><p id=\"status\" style=\"max-width:256px\">Loading...</p><script async type=\"module\" src=\"")
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 26, "<div style=\"height:20rem;display:flex\"><table style=\"margin-top:1rem;display:grid;grid-template:auto 1fr/auto auto;gap:0 0.5rem\"><thead style=\"border-bottom:1px solid black;padding:0.25rem 0;display:grid;grid-template:1fr/subgrid;grid-column:1/-1\"><tr id=\"table-header\" style=\"display:contents\"><th style=\"width:4.5rem\">Time</th><th style=\"width:4rem\">Iters</th></tr><tr id=\"table-header-compare\" style=\"display:none\"><th style=\"width:4.5rem\">Time A</th><th style=\"width:4rem\">Iters A</th><th style=\"width:4.5rem\">Time B</th><th style=\"width:4rem\">Iters B</th></tr></thead> <tbody id=\"results\" style=\"padding-top:0.25rem;display:grid;grid-template-columns:subgrid;grid-auto-rows:min-content;grid-column:1/-1;row-gap:0.25rem;overflow-y:auto;font-variant-numeric:tabular-nums\"></tbody></table><div class=\"centered-div\"><img id=\"image\" style=\"width:100%;max-width:256px;\" src=\"")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
var templ_7745c5c3_Var20 string
templ_7745c5c3_Var20, templ_7745c5c3_Err = templ.JoinStringErrs(anubis.BasePrefix + "/.within.website/x/cmd/anubis/static/js/bench.mjs?cacheBuster=" + anubis.Version)
templ_7745c5c3_Var20, templ_7745c5c3_Err = templ.JoinStringErrs(anubis.BasePrefix + "/.within.website/x/cmd/anubis/static/img/pensive.webp?cacheBuster=" + anubis.Version)
if templ_7745c5c3_Err != nil {
return templ.Error{Err: templ_7745c5c3_Err, FileName: `index.templ`, Line: 193, Col: 138}
return templ.Error{Err: templ_7745c5c3_Err, FileName: `index.templ`, Line: 190, Col: 166}
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(templ.EscapeString(templ_7745c5c3_Var20))
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 27, "\"></script><div id=\"sparkline\"></div><noscript><p>Running the benchmark tool requires JavaScript to be enabled.</p></noscript></div></div><form id=\"controls\" style=\"position:fixed;top:0.5rem;right:0.5rem\"><div style=\"display:flex;justify-content:end\"><label for=\"difficulty-input\" style=\"margin-right:0.5rem\">Difficulty:</label> <input id=\"difficulty-input\" type=\"number\" name=\"difficulty\" style=\"width:3rem\"></div><div style=\"margin-top:0.25rem;display:flex;justify-content:end\"><label for=\"algorithm-select\" style=\"margin-right:0.5rem\">Algorithm:</label> <select id=\"algorithm-select\" name=\"algorithm\"></select></div><div style=\"margin-top:0.25rem;display:flex;justify-content:end\"><label for=\"compare-select\" style=\"margin-right:0.5rem\">Compare:</label> <select id=\"compare-select\" name=\"compare\"><option value=\"NONE\">-</option></select></div></form>")
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 27, "\"><p id=\"status\" style=\"max-width:256px\">Loading...</p><script async type=\"module\" src=\"")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
var templ_7745c5c3_Var21 string
templ_7745c5c3_Var21, templ_7745c5c3_Err = templ.JoinStringErrs(anubis.BasePrefix + "/.within.website/x/cmd/anubis/static/js/bench.mjs?cacheBuster=" + anubis.Version)
if templ_7745c5c3_Err != nil {
return templ.Error{Err: templ_7745c5c3_Err, FileName: `index.templ`, Line: 192, Col: 138}
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(templ.EscapeString(templ_7745c5c3_Var21))
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
templ_7745c5c3_Err = templruntime.WriteString(templ_7745c5c3_Buffer, 28, "\"></script><div id=\"sparkline\"></div><noscript><p>Running the benchmark tool requires JavaScript to be enabled.</p></noscript></div></div><form id=\"controls\" style=\"position:fixed;top:0.5rem;right:0.5rem\"><div style=\"display:flex;justify-content:end\"><label for=\"difficulty-input\" style=\"margin-right:0.5rem\">Difficulty:</label> <input id=\"difficulty-input\" type=\"number\" name=\"difficulty\" style=\"width:3rem\"></div><div style=\"margin-top:0.25rem;display:flex;justify-content:end\"><label for=\"algorithm-select\" style=\"margin-right:0.5rem\">Algorithm:</label> <select id=\"algorithm-select\" name=\"algorithm\"></select></div><div style=\"margin-top:0.25rem;display:flex;justify-content:end\"><label for=\"compare-select\" style=\"margin-right:0.5rem\">Compare:</label> <select id=\"compare-select\" name=\"compare\"><option value=\"NONE\">-</option></select></div></form>")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}

View File

@@ -118,7 +118,7 @@ const benchmarkLoop = async (controller) => {
return;
}
benchmarkLoop(controller);
await benchmarkLoop(controller);
};
let controller = null;
@@ -142,11 +142,11 @@ const reset = () => {
controller.abort();
}
controller = new AbortController();
benchmarkLoop(controller);
void benchmarkLoop(controller);
};
setupControls();
difficultyInput.addEventListener("change", reset);
algorithmSelect.addEventListener("change", reset);
compareSelect.addEventListener("change", reset);
reset();
reset();

View File

@@ -1,10 +1,9 @@
import processFast from "./proof-of-work.mjs";
import processSlow from "./proof-of-work-slow.mjs";
import { testVideo } from "./video.mjs";
const algorithms = {
"fast": processFast,
"slow": processSlow,
fast: processFast,
slow: processSlow,
};
// from Xeact
@@ -15,7 +14,9 @@ const u = (url = "", params = {}) => {
};
const imageURL = (mood, cacheBuster, basePrefix) =>
u(`${basePrefix}/.within.website/x/cmd/anubis/static/img/${mood}.webp`, { cacheBuster });
u(`${basePrefix}/.within.website/x/cmd/anubis/static/img/${mood}.webp`, {
cacheBuster,
});
const dependencies = [
{
@@ -35,59 +36,18 @@ const dependencies = [
},
];
function showContinueBar(hash, nonce, t0, t1) {
const barContainer = document.createElement("div");
barContainer.style.marginTop = "1rem";
barContainer.style.width = "100%";
barContainer.style.maxWidth = "32rem";
barContainer.style.background = "#3c3836";
barContainer.style.borderRadius = "4px";
barContainer.style.overflow = "hidden";
barContainer.style.cursor = "pointer";
barContainer.style.height = "2rem";
barContainer.style.marginLeft = "auto";
barContainer.style.marginRight = "auto";
barContainer.title = "Click to continue";
const barInner = document.createElement("div");
barInner.className = "bar-inner";
barInner.style.display = "flex";
barInner.style.alignItems = "center";
barInner.style.justifyContent = "center";
barInner.style.color = "white";
barInner.style.fontWeight = "bold";
barInner.style.height = "100%";
barInner.style.width = "0";
barInner.innerText = "I've finished reading, continue →";
barContainer.appendChild(barInner);
document.body.appendChild(barContainer);
requestAnimationFrame(() => {
barInner.style.width = "100%";
});
barContainer.onclick = () => {
const redir = window.location.href;
window.location.replace(
u("/.within.website/x/cmd/anubis/api/pass-challenge", {
response: hash,
nonce,
redir,
elapsedTime: t1 - t0
})
);
};
}
(async () => {
const status = document.getElementById('status');
const image = document.getElementById('image');
const title = document.getElementById('title');
const progress = document.getElementById('progress');
const anubisVersion = JSON.parse(document.getElementById('anubis_version').textContent);
const basePrefix = JSON.parse(document.getElementById('anubis_base_prefix').textContent);
const details = document.querySelector('details');
const status = document.getElementById("status");
const image = document.getElementById("image");
const title = document.getElementById("title");
const progress = document.getElementById("progress");
const anubisVersion = JSON.parse(
document.getElementById("anubis_version").textContent,
);
const basePrefix = JSON.parse(
document.getElementById("anubis_base_prefix").textContent,
);
const details = document.querySelector("details");
let userReadDetails = false;
if (details) {
@@ -114,20 +74,7 @@ function showContinueBar(hash, nonce, t0, t1) {
return;
}
// const testarea = document.getElementById('testarea');
// const videoWorks = await testVideo(testarea);
// console.log(`videoWorks: ${videoWorks}`);
// if (!videoWorks) {
// title.innerHTML = "Oh no!";
// status.innerHTML = "Checks failed. Please check your browser's settings and try again.";
// image.src = imageURL("reject");
// progress.style.display = "none";
// return;
// }
status.innerHTML = 'Calculating...';
status.innerHTML = "Calculating...";
for (const { value, name, msg } of dependencies) {
if (!value) {
@@ -140,7 +87,9 @@ function showContinueBar(hash, nonce, t0, t1) {
}
}
const { challenge, rules } = JSON.parse(document.getElementById('anubis_challenge').textContent);
const { challenge, rules } = JSON.parse(
document.getElementById("anubis_challenge").textContent,
);
const process = algorithms[rules.algorithm];
if (!process) {
@@ -234,14 +183,13 @@ function showContinueBar(hash, nonce, t0, t1) {
response: hash,
nonce,
redir,
elapsedTime: t1 - t0
elapsedTime: t1 - t0,
}),
);
}
container.onclick = onDetailsExpand;
setTimeout(onDetailsExpand, 30000);
} else {
setTimeout(() => {
const redir = window.location.href;
@@ -250,12 +198,11 @@ function showContinueBar(hash, nonce, t0, t1) {
response: hash,
nonce,
redir,
elapsedTime: t1 - t0
elapsedTime: t1 - t0,
}),
);
}, 250);
}
} catch (err) {
ohNoes({
titleMsg: "Calculation error!",
@@ -263,4 +210,4 @@ function showContinueBar(hash, nonce, t0, t1) {
imageSrc: imageURL("reject", anubisVersion, basePrefix),
});
}
})();
})();

View File

@@ -9,9 +9,9 @@ export default function process(
) {
console.debug("slow algo");
return new Promise((resolve, reject) => {
let webWorkerURL = URL.createObjectURL(new Blob([
'(', processTask(), ')()'
], { type: 'application/javascript' }));
let webWorkerURL = URL.createObjectURL(
new Blob(["(", processTask(), ")()"], { type: "application/javascript" }),
);
let worker = new Worker(webWorkerURL);
const terminate = () => {
@@ -45,7 +45,7 @@ export default function process(
worker.postMessage({
data,
difficulty
difficulty,
});
URL.revokeObjectURL(webWorkerURL);
@@ -56,26 +56,27 @@ function processTask() {
return function () {
const sha256 = (text) => {
const encoded = new TextEncoder().encode(text);
return crypto.subtle.digest("SHA-256", encoded.buffer)
.then((result) =>
Array.from(new Uint8Array(result))
.map((c) => c.toString(16).padStart(2, "0"))
.join(""),
);
return crypto.subtle.digest("SHA-256", encoded.buffer).then((result) =>
Array.from(new Uint8Array(result))
.map((c) => c.toString(16).padStart(2, "0"))
.join(""),
);
};
addEventListener('message', async (event) => {
addEventListener("message", async (event) => {
let data = event.data.data;
let difficulty = event.data.difficulty;
let hash;
let nonce = 0;
do {
if (nonce & 1023 === 0) {
if (nonce & (1023 === 0)) {
postMessage(nonce);
}
hash = await sha256(data + nonce++);
} while (hash.substring(0, difficulty) !== Array(difficulty + 1).join('0'));
} while (
hash.substring(0, difficulty) !== Array(difficulty + 1).join("0")
);
nonce -= 1; // last nonce was post-incremented
@@ -87,4 +88,4 @@ function processTask() {
});
});
}.toString();
}
}

View File

@@ -3,13 +3,13 @@ export default function process(
difficulty = 5,
signal = null,
progressCallback = null,
threads = (navigator.hardwareConcurrency || 1),
threads = navigator.hardwareConcurrency || 1,
) {
console.debug("fast algo");
return new Promise((resolve, reject) => {
let webWorkerURL = URL.createObjectURL(new Blob([
'(', processTask(), ')()'
], { type: 'application/javascript' }));
let webWorkerURL = URL.createObjectURL(
new Blob(["(", processTask(), ")()"], { type: "application/javascript" }),
);
const workers = [];
const terminate = () => {
@@ -71,7 +71,7 @@ function processTask() {
.join("");
}
addEventListener('message', async (event) => {
addEventListener("message", async (event) => {
let data = event.data.data;
let difficulty = event.data.difficulty;
let hash;
@@ -89,7 +89,8 @@ function processTask() {
const byteIndex = Math.floor(j / 2); // which byte we are looking at
const nibbleIndex = j % 2; // which nibble in the byte we are looking at (0 is high, 1 is low)
let nibble = (thisHash[byteIndex] >> (nibbleIndex === 0 ? 4 : 0)) & 0x0F; // Get the nibble
let nibble =
(thisHash[byteIndex] >> (nibbleIndex === 0 ? 4 : 0)) & 0x0f; // Get the nibble
if (nibble !== 0) {
valid = false;
@@ -113,7 +114,7 @@ function processTask() {
// update and they will get behind the others. this is slightly more
// complicated but ensures an even distribution between threads.
if (
nonce > oldNonce | 1023 && // we've wrapped past 1024
(nonce > oldNonce) | 1023 && // we've wrapped past 1024
(nonce >> 10) % threads === threadId // and it's our turn
) {
postMessage(nonce);
@@ -129,4 +130,3 @@ function processTask() {
});
}.toString();
}

View File

@@ -2,15 +2,15 @@ const videoElement = `<video id="videotest" width="0" height="0" src="/.within.w
export const testVideo = async (testarea) => {
testarea.innerHTML = videoElement;
return (await new Promise((resolve) => {
const video = document.getElementById('videotest');
return await new Promise((resolve) => {
const video = document.getElementById("videotest");
video.oncanplay = () => {
testarea.style.display = "none";
resolve(true);
};
video.onerror = (ev) => {
video.onerror = () => {
testarea.style.display = "none";
resolve(false);
};
}));
};
});
};

View File

@@ -9,6 +9,8 @@ User-agent: Brightbot 1.0
User-agent: Bytespider
User-agent: CCBot
User-agent: ChatGPT-User
User-agent: Claude-SearchBot
User-agent: Claude-User
User-agent: Claude-Web
User-agent: ClaudeBot
User-agent: cohere-ai
@@ -21,6 +23,7 @@ User-agent: FacebookBot
User-agent: Factset_spyderbot
User-agent: FirecrawlAgent
User-agent: FriendlyCrawler
User-agent: Google-CloudVertexBot
User-agent: Google-Extended
User-agent: GoogleOther
User-agent: GoogleOther-Image
@@ -37,6 +40,7 @@ User-agent: meta-externalagent
User-agent: Meta-ExternalAgent
User-agent: meta-externalfetcher
User-agent: Meta-ExternalFetcher
User-agent: MistralAI-User/1.0
User-agent: NovaAct
User-agent: OAI-SearchBot
User-agent: omgili
@@ -55,6 +59,7 @@ User-agent: TikTokSpider
User-agent: Timpibot
User-agent: VelenPublicWebCrawler
User-agent: Webzio-Extended
User-agent: wpbot
User-agent: YouBot
Disallow: /

View File

@@ -1,6 +1,11 @@
$`npm run assets`;
["amd64", "arm64", "riscv64"].forEach(goarch => {
[
"amd64",
"arm64",
"ppc64le",
"riscv64",
].forEach(goarch => {
[deb, rpm, tarball].forEach(method => method.build({
name: "anubis",
description: "Anubis weighs the souls of incoming HTTP requests and uses a sha256 proof-of-work challenge in order to protect upstream resources from scraper bots.",
@@ -30,6 +35,7 @@ $`npm run assets`;
$`cp -a data/clients ${doc}/data/clients`;
$`cp -a data/common ${doc}/data/common`;
$`cp -a data/crawlers ${doc}/data/crawlers`;
$`cp -a data/meta ${doc}/data/meta`;
},
}));
});