Compare commits

...

12 Commits

Author SHA1 Message Date
Xe Iaso
a173b8f484 fix(data): add services folder to embedded filesystem
Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-09-07 16:14:37 +00:00
Xe Iaso
7e1b5d9951 fix: demote temporal assurance checks
* fix(challenge): demote temporal assurance to 80% instead of 95%

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(challenge/preact): wait a little longer to be extra safe

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(challenge/metarefresh): wait a little longer to be extra safe

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs(CHANGELOG): add fix notes

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-09-07 16:10:54 +00:00
Xe Iaso
98945fb56f feat(lib/store): add s3api storage backend (#1089)
* feat(lib/store): add s3api storage backend

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs(store/s3api): replace fake S3 API keys with the bee movie script

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs(store/s3api): fix spelling sin

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(store/s3api): remove vestigal experiment

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(store/s3api): support IsPersistent call

Ref #1088

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore(test): go mod tidy

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-09-07 09:24:14 -04:00
Jason Cameron
82099d9e05 fix(robots2policy): handle multiple user agents under one block (#925)
2025-09-06 22:35:19 -04:00
dependabot[bot]
87c2f1e0e6 build(deps): bump the github-actions group across 1 directory with 8 updates (#1071)
Co-authored-by: Jason Cameron <git@jasoncameron.dev>
2025-09-06 22:30:43 -04:00
Jason Cameron
f0199d014f docs: document some missing env vars (#1087)
2025-09-07 01:34:42 +00:00
Jason Cameron
75109f6b73 docs(installation): add SLOG_LEVEL environment variable to configuration (#1086)
* docs(installation): add SLOG_LEVEL environment variable to configuration

* docs(installation): add SLOG_LEVEL environment variable to configuration
2025-09-06 20:59:02 -04:00
Xe Iaso
c43d7ca686 docs(botstopper): add HTML templating support
Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-09-06 23:42:23 +00:00
Xe Iaso
5d5c39e123 chore: v1.22.0
Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-09-06 11:54:36 -04:00
Xe Iaso
d35e47c655 feat: glob matching for redirect domains (#1084)
* feat: glob matching for redirect domains

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs: update CHANGELOG

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-09-06 15:46:18 +00:00
Xe Iaso
48b49a0190 docs(CHANGELOG): add changelog entry for v1.22.0
Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-09-05 22:42:08 +00:00
Xe Iaso
de94139789 test: ensure FORCED_LANGUAGE works (#1083)
Closes #1077
2025-09-05 22:07:17 +00:00
49 changed files with 1250 additions and 140 deletions

View File

@@ -5,4 +5,5 @@ ubuntu
workarounds
rjack
msgbox
xeact
xeact
ABee

View File

@@ -88,6 +88,7 @@
^docs/manifest/.*$
^docs/static/\.nojekyll$
^lib/policy/config/testdata/bad/unparseable\.json$
^internal/glob/glob_test.go$
ignore$
robots.txt
^lib/localization/locales/.*\.json$

View File

@@ -140,6 +140,7 @@ headermap
healthcheck
healthz
hec
Hetzner
hmc
homelab
hostable
@@ -237,7 +238,6 @@ pki
podkova
podman
poststart
poxied
prebaked
privkey
promauto
@@ -250,7 +250,6 @@ pwuser
qualys
qwant
qwantbot
QWEN
rac
rawler
rcvar
@@ -282,7 +281,6 @@ shirou
Sidetrade
simprint
sitemap
Slackware
sls
Smartphone
sni
@@ -360,6 +358,7 @@ XOriginal
XReal
yae
YAMLTo
Yda
yeet
yeetfile
yourdomain

View File

@@ -15,7 +15,7 @@ jobs:
runs-on: ubuntu-24.04
steps:
- name: Checkout code
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
with:
fetch-tags: true
fetch-depth: 0
@@ -25,7 +25,7 @@ jobs:
uses: Homebrew/actions/setup-homebrew@main
- name: Setup Homebrew cellar cache
uses: actions/cache@5a3ec84eff668545956fd18022155c47e93e2684 # v4.2.3
uses: actions/cache@0400d5f644dc74513175e3cd8d07132dd4860809 # v4.2.4
with:
path: |
/home/linuxbrew/.linuxbrew/Cellar
@@ -47,7 +47,7 @@ jobs:
- name: Docker meta
id: meta
uses: docker/metadata-action@902fa8ec7d6ecbf8d84d538b9b233a880e428804 # v5.7.0
uses: docker/metadata-action@c1e51972afc2121e065aed6d45c65596fe445f3f # v5.8.0
with:
images: ghcr.io/${{ github.repository }}

View File

@@ -21,7 +21,7 @@ jobs:
runs-on: ubuntu-24.04
steps:
- name: Checkout code
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
with:
fetch-tags: true
fetch-depth: 0
@@ -35,7 +35,7 @@ jobs:
uses: Homebrew/actions/setup-homebrew@main
- name: Setup Homebrew cellar cache
uses: actions/cache@5a3ec84eff668545956fd18022155c47e93e2684 # v4.2.3
uses: actions/cache@0400d5f644dc74513175e3cd8d07132dd4860809 # v4.2.4
with:
path: |
/home/linuxbrew/.linuxbrew/Cellar
@@ -56,7 +56,7 @@ jobs:
brew bundle
- name: Log into registry
uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0
uses: docker/login-action@184bdaa0721073962dff0199f1fb9940f07167d1 # v3.5.0
with:
registry: ghcr.io
username: ${{ github.repository_owner }}
@@ -64,7 +64,7 @@ jobs:
- name: Docker meta
id: meta
uses: docker/metadata-action@902fa8ec7d6ecbf8d84d538b9b233a880e428804 # v5.7.0
uses: docker/metadata-action@c1e51972afc2121e065aed6d45c65596fe445f3f # v5.8.0
with:
images: ${{ env.IMAGE }}
@@ -78,7 +78,7 @@ jobs:
SLOG_LEVEL: debug
- name: Generate artifact attestation
uses: actions/attest-build-provenance@e8998f949152b193b063cb0ec769d69d929409be # v2.4.0
uses: actions/attest-build-provenance@977bb373ede98d70efdf65b84cb5f73e068dcc2a # v3.0.0
with:
subject-name: ${{ env.IMAGE }}
subject-digest: ${{ steps.build.outputs.digest }}

View File

@@ -17,7 +17,7 @@ jobs:
runs-on: ubuntu-24.04
steps:
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
with:
persist-credentials: false
@@ -25,7 +25,7 @@ jobs:
uses: docker/setup-buildx-action@e468171a9de216ec08956ac3ada2f0791b6bd435 # v3.11.1
- name: Log into registry
uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0
uses: docker/login-action@184bdaa0721073962dff0199f1fb9940f07167d1 # v3.5.0
with:
registry: ghcr.io
username: techarohq
@@ -33,7 +33,7 @@ jobs:
- name: Docker meta
id: meta
uses: docker/metadata-action@902fa8ec7d6ecbf8d84d538b9b233a880e428804 # v5.7.0
uses: docker/metadata-action@c1e51972afc2121e065aed6d45c65596fe445f3f # v5.8.0
with:
images: ghcr.io/techarohq/anubis/docs
tags: |
@@ -53,14 +53,14 @@ jobs:
push: true
- name: Apply k8s manifests to limsa lominsa
uses: actions-hub/kubectl@b5b19eeb6a0ffde16637e398f8b96ef01eb8fdb7 # v1.33.3
uses: actions-hub/kubectl@af345ed727f0268738e65be48422e463cc67c220 # v1.34.0
env:
KUBE_CONFIG: ${{ secrets.LIMSA_LOMINSA_KUBECONFIG }}
with:
args: apply -k docs/manifest
- name: Apply k8s manifests to limsa lominsa
uses: actions-hub/kubectl@b5b19eeb6a0ffde16637e398f8b96ef01eb8fdb7 # v1.33.3
uses: actions-hub/kubectl@af345ed727f0268738e65be48422e463cc67c220 # v1.34.0
env:
KUBE_CONFIG: ${{ secrets.LIMSA_LOMINSA_KUBECONFIG }}
with:

View File

@@ -13,7 +13,7 @@ jobs:
runs-on: ubuntu-24.04
steps:
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
with:
persist-credentials: false
@@ -22,7 +22,7 @@ jobs:
- name: Docker meta
id: meta
uses: docker/metadata-action@902fa8ec7d6ecbf8d84d538b9b233a880e428804 # v5.7.0
uses: docker/metadata-action@c1e51972afc2121e065aed6d45c65596fe445f3f # v5.8.0
with:
images: ghcr.io/techarohq/anubis/docs
tags: |

View File

@@ -15,7 +15,7 @@ jobs:
#runs-on: alrest-techarohq
runs-on: ubuntu-24.04
steps:
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
with:
persist-credentials: false
@@ -28,7 +28,7 @@ jobs:
uses: Homebrew/actions/setup-homebrew@main
- name: Setup Homebrew cellar cache
uses: actions/cache@5a3ec84eff668545956fd18022155c47e93e2684 # v4.2.3
uses: actions/cache@0400d5f644dc74513175e3cd8d07132dd4860809 # v4.2.4
with:
path: |
/home/linuxbrew/.linuxbrew/Cellar
@@ -49,7 +49,7 @@ jobs:
brew bundle
- name: Setup Golang caches
uses: actions/cache@5a3ec84eff668545956fd18022155c47e93e2684 # v4.2.3
uses: actions/cache@0400d5f644dc74513175e3cd8d07132dd4860809 # v4.2.4
with:
path: |
~/.cache/go-build
@@ -59,7 +59,7 @@ jobs:
${{ runner.os }}-golang-
- name: Cache playwright binaries
uses: actions/cache@5a3ec84eff668545956fd18022155c47e93e2684 # v4.2.3
uses: actions/cache@0400d5f644dc74513175e3cd8d07132dd4860809 # v4.2.4
id: playwright-cache
with:
path: |

View File

@@ -14,7 +14,7 @@ jobs:
#runs-on: alrest-techarohq
runs-on: ubuntu-24.04
steps:
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
with:
persist-credentials: false
fetch-tags: true
@@ -29,7 +29,7 @@ jobs:
uses: Homebrew/actions/setup-homebrew@main
- name: Setup Homebrew cellar cache
uses: actions/cache@5a3ec84eff668545956fd18022155c47e93e2684 # v4.2.3
uses: actions/cache@0400d5f644dc74513175e3cd8d07132dd4860809 # v4.2.4
with:
path: |
/home/linuxbrew/.linuxbrew/Cellar
@@ -50,7 +50,7 @@ jobs:
brew bundle
- name: Setup Golang caches
uses: actions/cache@5a3ec84eff668545956fd18022155c47e93e2684 # v4.2.3
uses: actions/cache@0400d5f644dc74513175e3cd8d07132dd4860809 # v4.2.4
with:
path: |
~/.cache/go-build

View File

@@ -15,7 +15,7 @@ jobs:
#runs-on: alrest-techarohq
runs-on: ubuntu-24.04
steps:
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
with:
persist-credentials: false
fetch-tags: true
@@ -30,7 +30,7 @@ jobs:
uses: Homebrew/actions/setup-homebrew@main
- name: Setup Homebrew cellar cache
uses: actions/cache@5a3ec84eff668545956fd18022155c47e93e2684 # v4.2.3
uses: actions/cache@0400d5f644dc74513175e3cd8d07132dd4860809 # v4.2.4
with:
path: |
/home/linuxbrew/.linuxbrew/Cellar
@@ -51,7 +51,7 @@ jobs:
brew bundle
- name: Setup Golang caches
uses: actions/cache@5a3ec84eff668545956fd18022155c47e93e2684 # v4.2.3
uses: actions/cache@0400d5f644dc74513175e3cd8d07132dd4860809 # v4.2.4
with:
path: |
~/.cache/go-build

View File

@@ -14,6 +14,7 @@ jobs:
strategy:
matrix:
test:
- forced-language
- git-clone
- git-push
- healthcheck
@@ -23,7 +24,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
with:
persist-credentials: false

View File

@@ -18,13 +18,13 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
with:
fetch-tags: true
fetch-depth: 0
persist-credentials: false
- name: Log into registry
uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0
uses: docker/login-action@184bdaa0721073962dff0199f1fb9940f07167d1 # v3.5.0
with:
registry: ghcr.io
username: ${{ github.repository_owner }}

View File

@@ -20,7 +20,7 @@ jobs:
- ci@ppc64le.techaro.lol
steps:
- name: Checkout code
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
with:
fetch-tags: true
fetch-depth: 0

View File

@@ -16,12 +16,12 @@ jobs:
security-events: write
steps:
- name: Checkout repository
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
with:
persist-credentials: false
- name: Install the latest version of uv
uses: astral-sh/setup-uv@e92bafb6253dcd438e0484186d7669ea7a8ca1cc # v6.4.3
uses: astral-sh/setup-uv@4959332f0f014c5280e7eac8b70c90cb574c9f9b # v6.6.0
- name: Run zizmor 🌈
run: uvx zizmor --format sarif . > results.sarif
@@ -29,7 +29,7 @@ jobs:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- name: Upload SARIF file
uses: github/codeql-action/upload-sarif@4e828ff8d448a8a6e532957b1811f387a63867e8 # v3.29.4
uses: github/codeql-action/upload-sarif@3c3833e0f8c1c83d449a7478aa59c036a9165498 # v3.29.11
with:
sarif_file: results.sarif
category: zizmor

View File

@@ -1 +1 @@
1.21.3
1.22.0

View File

@@ -29,7 +29,7 @@ var (
)
type RobotsRule struct {
UserAgent string
UserAgents []string
Disallows []string
Allows []string
CrawlDelay int
@@ -130,10 +130,26 @@ func main() {
}
}
func createRuleFromAccumulated(userAgents, disallows, allows []string, crawlDelay int) RobotsRule {
rule := RobotsRule{
UserAgents: make([]string, len(userAgents)),
Disallows: make([]string, len(disallows)),
Allows: make([]string, len(allows)),
CrawlDelay: crawlDelay,
}
copy(rule.UserAgents, userAgents)
copy(rule.Disallows, disallows)
copy(rule.Allows, allows)
return rule
}
func parseRobotsTxt(input io.Reader) ([]RobotsRule, error) {
scanner := bufio.NewScanner(input)
var rules []RobotsRule
var currentRule *RobotsRule
var currentUserAgents []string
var currentDisallows []string
var currentAllows []string
var currentCrawlDelay int
for scanner.Scan() {
line := strings.TrimSpace(scanner.Text())
@@ -154,38 +170,42 @@ func parseRobotsTxt(input io.Reader) ([]RobotsRule, error) {
switch directive {
case "user-agent":
// Start a new rule section
if currentRule != nil {
rules = append(rules, *currentRule)
}
currentRule = &RobotsRule{
UserAgent: value,
Disallows: make([]string, 0),
Allows: make([]string, 0),
// If we have accumulated rules with directives and encounter a new user-agent,
// flush the current rules
if len(currentUserAgents) > 0 && (len(currentDisallows) > 0 || len(currentAllows) > 0 || currentCrawlDelay > 0) {
rule := createRuleFromAccumulated(currentUserAgents, currentDisallows, currentAllows, currentCrawlDelay)
rules = append(rules, rule)
// Reset for next group
currentUserAgents = nil
currentDisallows = nil
currentAllows = nil
currentCrawlDelay = 0
}
currentUserAgents = append(currentUserAgents, value)
case "disallow":
if currentRule != nil && value != "" {
currentRule.Disallows = append(currentRule.Disallows, value)
if len(currentUserAgents) > 0 && value != "" {
currentDisallows = append(currentDisallows, value)
}
case "allow":
if currentRule != nil && value != "" {
currentRule.Allows = append(currentRule.Allows, value)
if len(currentUserAgents) > 0 && value != "" {
currentAllows = append(currentAllows, value)
}
case "crawl-delay":
if currentRule != nil {
if len(currentUserAgents) > 0 {
if delay, err := parseIntSafe(value); err == nil {
currentRule.CrawlDelay = delay
currentCrawlDelay = delay
}
}
}
}
// Don't forget the last rule
if currentRule != nil {
rules = append(rules, *currentRule)
// Don't forget the last group of rules
if len(currentUserAgents) > 0 {
rule := createRuleFromAccumulated(currentUserAgents, currentDisallows, currentAllows, currentCrawlDelay)
rules = append(rules, rule)
}
// Mark blacklisted user agents (those with "Disallow: /")
@@ -211,10 +231,11 @@ func convertToAnubisRules(robotsRules []RobotsRule) []AnubisRule {
var anubisRules []AnubisRule
ruleCounter := 0
// Process each robots rule individually
for _, robotsRule := range robotsRules {
userAgent := robotsRule.UserAgent
userAgents := robotsRule.UserAgents
// Handle crawl delay as weight adjustment (do this first before any continues)
// Handle crawl delay
if robotsRule.CrawlDelay > 0 && *crawlDelay > 0 {
ruleCounter++
rule := AnubisRule{
@@ -223,20 +244,32 @@ func convertToAnubisRules(robotsRules []RobotsRule) []AnubisRule {
Weight: &config.Weight{Adjust: *crawlDelay},
}
if userAgent == "*" {
if len(userAgents) == 1 && userAgents[0] == "*" {
rule.Expression = &config.ExpressionOrList{
All: []string{"true"}, // Always applies
}
} else {
} else if len(userAgents) == 1 {
rule.Expression = &config.ExpressionOrList{
All: []string{fmt.Sprintf("userAgent.contains(%q)", userAgent)},
All: []string{fmt.Sprintf("userAgent.contains(%q)", userAgents[0])},
}
} else {
// Multiple user agents - use any block
var expressions []string
for _, ua := range userAgents {
if ua == "*" {
expressions = append(expressions, "true")
} else {
expressions = append(expressions, fmt.Sprintf("userAgent.contains(%q)", ua))
}
}
rule.Expression = &config.ExpressionOrList{
Any: expressions,
}
}
anubisRules = append(anubisRules, rule)
}
// Handle blacklisted user agents (complete deny/challenge)
// Handle blacklisted user agents
if robotsRule.IsBlacklist {
ruleCounter++
rule := AnubisRule{
@@ -244,21 +277,36 @@ func convertToAnubisRules(robotsRules []RobotsRule) []AnubisRule {
Action: *userAgentDeny,
}
if userAgent == "*" {
// This would block everything - convert to a weight adjustment instead
rule.Name = fmt.Sprintf("%s-global-restriction-%d", *policyName, ruleCounter)
rule.Action = "WEIGH"
rule.Weight = &config.Weight{Adjust: 20} // Increase difficulty significantly
rule.Expression = &config.ExpressionOrList{
All: []string{"true"}, // Always applies
if len(userAgents) == 1 {
userAgent := userAgents[0]
if userAgent == "*" {
// This would block everything - convert to a weight adjustment instead
rule.Name = fmt.Sprintf("%s-global-restriction-%d", *policyName, ruleCounter)
rule.Action = "WEIGH"
rule.Weight = &config.Weight{Adjust: 20} // Increase difficulty significantly
rule.Expression = &config.ExpressionOrList{
All: []string{"true"}, // Always applies
}
} else {
rule.Expression = &config.ExpressionOrList{
All: []string{fmt.Sprintf("userAgent.contains(%q)", userAgent)},
}
}
} else {
// Multiple user agents - use any block
var expressions []string
for _, ua := range userAgents {
if ua == "*" {
expressions = append(expressions, "true")
} else {
expressions = append(expressions, fmt.Sprintf("userAgent.contains(%q)", ua))
}
}
rule.Expression = &config.ExpressionOrList{
All: []string{fmt.Sprintf("userAgent.contains(%q)", userAgent)},
Any: expressions,
}
}
anubisRules = append(anubisRules, rule)
continue
}
// Handle specific disallow rules
@@ -276,9 +324,33 @@ func convertToAnubisRules(robotsRules []RobotsRule) []AnubisRule {
// Build CEL expression
var conditions []string
// Add user agent condition if not wildcard
if userAgent != "*" {
conditions = append(conditions, fmt.Sprintf("userAgent.contains(%q)", userAgent))
// Add user agent conditions
if len(userAgents) == 1 && userAgents[0] == "*" {
// Wildcard user agent - no user agent condition needed
} else if len(userAgents) == 1 {
conditions = append(conditions, fmt.Sprintf("userAgent.contains(%q)", userAgents[0]))
} else {
// For multiple user agents, we need to use a more complex expression
// This is a limitation - we can't easily combine any for user agents with all for path
// So we'll create separate rules for each user agent
for _, ua := range userAgents {
if ua == "*" {
continue // Skip wildcard as it's handled separately
}
ruleCounter++
subRule := AnubisRule{
Name: fmt.Sprintf("%s-disallow-%d", *policyName, ruleCounter),
Action: *baseAction,
Expression: &config.ExpressionOrList{
All: []string{
fmt.Sprintf("userAgent.contains(%q)", ua),
buildPathCondition(disallow),
},
},
}
anubisRules = append(anubisRules, subRule)
}
continue
}
// Add path condition
@@ -291,7 +363,6 @@ func convertToAnubisRules(robotsRules []RobotsRule) []AnubisRule {
anubisRules = append(anubisRules, rule)
}
}
return anubisRules
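
The grouping behaviour this change introduces can be illustrated in isolation. The following is a minimal sketch, not the code from the diff: the `group` type and the `parseGroups` and `uaExpressions` helpers are hypothetical names, only `User-agent` and `Disallow` are handled, and a new group starts only once the current one has collected at least one directive.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// group is a hypothetical, simplified stand-in for RobotsRule.
type group struct {
	UserAgents []string
	Disallows  []string
}

// parseGroups collects consecutive User-agent lines into a single group.
// A new group begins only after the current one has seen a directive.
func parseGroups(input string) []group {
	var groups []group
	var cur group
	scanner := bufio.NewScanner(strings.NewReader(input))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, ok := strings.Cut(line, ":")
		if !ok {
			continue
		}
		value = strings.TrimSpace(value)
		switch strings.ToLower(strings.TrimSpace(key)) {
		case "user-agent":
			// Flush only when the accumulated group already has directives.
			if len(cur.UserAgents) > 0 && len(cur.Disallows) > 0 {
				groups = append(groups, cur)
				cur = group{}
			}
			cur.UserAgents = append(cur.UserAgents, value)
		case "disallow":
			if len(cur.UserAgents) > 0 && value != "" {
				cur.Disallows = append(cur.Disallows, value)
			}
		}
	}
	// Flush the final group.
	if len(cur.UserAgents) > 0 {
		groups = append(groups, cur)
	}
	return groups
}

// uaExpressions builds one condition string per user agent, in the same
// shape as the entries of the any: block in the expected YAML.
func uaExpressions(userAgents []string) []string {
	exprs := make([]string, 0, len(userAgents))
	for _, ua := range userAgents {
		if ua == "*" {
			exprs = append(exprs, "true")
			continue
		}
		exprs = append(exprs, fmt.Sprintf("userAgent.contains(%q)", ua))
	}
	return exprs
}

func main() {
	robots := "User-agent: BadBot\nUser-agent: SpamBot\nDisallow: /\n\nUser-agent: GoodBot\nDisallow: /private\n"
	for _, g := range parseGroups(robots) {
		fmt.Println(uaExpressions(g.UserAgents), g.Disallows)
	}
}
```

With the sample input, `BadBot` and `SpamBot` end up in one group and `GoodBot` in a second one, which mirrors the `consecutive.robots.txt` test data.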

View File

@@ -78,6 +78,12 @@ func TestDataFileConversion(t *testing.T) {
expectedFile: "complex.yaml",
options: TestOptions{format: "yaml", crawlDelayWeight: 5},
},
{
name: "consecutive_user_agents",
robotsFile: "consecutive.robots.txt",
expectedFile: "consecutive.yaml",
options: TestOptions{format: "yaml", crawlDelayWeight: 3},
},
}
for _, tc := range testCases {

View File

@@ -25,6 +25,6 @@
- action: CHALLENGE
expression:
all:
- userAgent.contains("Googlebot")
- path.startsWith("/search")
name: robots-txt-policy-disallow-7
- userAgent.contains("Googlebot")
- path.startsWith("/search")
name: robots-txt-policy-disallow-7

View File

@@ -20,8 +20,8 @@
- action: CHALLENGE
expression:
all:
- userAgent.contains("Googlebot")
- path.startsWith("/search/")
- userAgent.contains("Googlebot")
- path.startsWith("/search/")
name: robots-txt-policy-disallow-6
- action: WEIGH
expression: userAgent.contains("Bingbot")
@@ -31,14 +31,14 @@
- action: CHALLENGE
expression:
all:
- userAgent.contains("Bingbot")
- path.startsWith("/search/")
- userAgent.contains("Bingbot")
- path.startsWith("/search/")
name: robots-txt-policy-disallow-8
- action: CHALLENGE
expression:
all:
- userAgent.contains("Bingbot")
- path.startsWith("/admin/")
- userAgent.contains("Bingbot")
- path.startsWith("/admin/")
name: robots-txt-policy-disallow-9
- action: DENY
expression: userAgent.contains("BadBot")
@@ -54,18 +54,18 @@
- action: CHALLENGE
expression:
all:
- userAgent.contains("TestBot")
- path.matches("^/.*/admin")
- userAgent.contains("TestBot")
- path.matches("^/.*/admin")
name: robots-txt-policy-disallow-13
- action: CHALLENGE
expression:
all:
- userAgent.contains("TestBot")
- path.matches("^/temp.*\\.html")
- userAgent.contains("TestBot")
- path.matches("^/temp.*\\.html")
name: robots-txt-policy-disallow-14
- action: CHALLENGE
expression:
all:
- userAgent.contains("TestBot")
- path.matches("^/file.\\.log")
- userAgent.contains("TestBot")
- path.matches("^/file.\\.log")
name: robots-txt-policy-disallow-15

View File

@@ -0,0 +1,25 @@
# Test consecutive user agents that should be grouped into any: blocks
User-agent: *
Disallow: /admin
Crawl-delay: 10
# Multiple consecutive user agents - should be grouped
User-agent: BadBot
User-agent: SpamBot
User-agent: EvilBot
Disallow: /
# Single user agent - should be separate
User-agent: GoodBot
Disallow: /private
# Multiple consecutive user agents with crawl delay
User-agent: SlowBot1
User-agent: SlowBot2
Crawl-delay: 5
# Multiple consecutive user agents with specific path
User-agent: SearchBot1
User-agent: SearchBot2
User-agent: SearchBot3
Disallow: /search

View File

@@ -0,0 +1,47 @@
- action: WEIGH
expression: "true"
name: robots-txt-policy-crawl-delay-1
weight:
adjust: 3
- action: CHALLENGE
expression: path.startsWith("/admin")
name: robots-txt-policy-disallow-2
- action: DENY
expression:
any:
- userAgent.contains("BadBot")
- userAgent.contains("SpamBot")
- userAgent.contains("EvilBot")
name: robots-txt-policy-blacklist-3
- action: CHALLENGE
expression:
all:
- userAgent.contains("GoodBot")
- path.startsWith("/private")
name: robots-txt-policy-disallow-4
- action: WEIGH
expression:
any:
- userAgent.contains("SlowBot1")
- userAgent.contains("SlowBot2")
name: robots-txt-policy-crawl-delay-5
weight:
adjust: 3
- action: CHALLENGE
expression:
all:
- userAgent.contains("SearchBot1")
- path.startsWith("/search")
name: robots-txt-policy-disallow-7
- action: CHALLENGE
expression:
all:
- userAgent.contains("SearchBot2")
- path.startsWith("/search")
name: robots-txt-policy-disallow-8
- action: CHALLENGE
expression:
all:
- userAgent.contains("SearchBot3")
- path.startsWith("/search")
name: robots-txt-policy-disallow-9

View File

@@ -1,12 +1,12 @@
[
{
"action": "CHALLENGE",
"expression": "path.startsWith(\"/admin/\")",
"name": "robots-txt-policy-disallow-1"
"name": "robots-txt-policy-disallow-1",
"action": "CHALLENGE"
},
{
"action": "CHALLENGE",
"expression": "path.startsWith(\"/private\")",
"name": "robots-txt-policy-disallow-2"
"name": "robots-txt-policy-disallow-2",
"action": "CHALLENGE"
}
]

View File

@@ -3,6 +3,6 @@ package data
import "embed"
var (
//go:embed botPolicies.yaml all:apps all:bots all:clients all:common all:crawlers all:meta
//go:embed botPolicies.yaml all:apps all:bots all:clients all:common all:crawlers all:meta all:services
BotPolicies embed.FS
)

View File

@@ -13,47 +13,76 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
<!-- This changes the project to: -->
- Add a "proof of React" challenge to prove that the client is able to run a simple JSX app.
- Added possibility to disable HTTP keep-alive to support backends not properly
handling it.
- Add a server-side check for the meta-refresh challenge that makes sure clients have waited for at least 95% of the time that they should.
- Added a missing link to the Caddy installation environment in the installation documentation.
- Downstream consumers can change the default [log/slog#Logger](https://pkg.go.dev/log/slog#Logger) instance that Anubis uses by setting `opts.Logger` to your slog instance of choice ([#864](https://github.com/TecharoHQ/anubis/issues/864)).
- The [Thoth client](https://anubis.techaro.lol/docs/admin/thoth) is now public in the repo instead of being an internal package.
- [Custom-AsyncHttpClient](https://github.com/AsyncHttpClient/async-http-client)'s default User-Agent has an increased weight by default ([#852](https://github.com/TecharoHQ/anubis/issues/852)).
- Document missing environment variables in installation guide: `SLOG_LEVEL`, `COOKIE_PREFIX`, `FORCED_LANGUAGE`, and `TARGET_DISABLE_KEEPALIVE` ([#1086](https://github.com/TecharoHQ/anubis/pull/1086))
- Fixed `robots2policy` to properly group consecutive user agents into `any:` instead of only processing the last one ([#925](https://github.com/TecharoHQ/anubis/pull/925))
- Add the [`s3api` storage backend](./admin/policies.mdx#s3api) to allow Anubis to use S3 API compatible object storage as its storage backend.
- Add the `services` folder to the embedded data filesystem.
### Bug Fixes
Sometimes the enhanced temporal assurance in [#1038](https://github.com/TecharoHQ/anubis/pull/1038) and [#1068](https://github.com/TecharoHQ/anubis/pull/1068) could backfire because Chromium and its ilk randomize the amount of time they wait in order to avoid a timing side channel attack. This has been fixed by both increasing the amount of time a client has to wait for the metarefresh and preact challenges as well as making the server side logic more permissive.
## v1.22.0: Yda Hext
> Someone has to make an effort at reconciliation if these conflicts are ever going to end.
In this release, we finally fix the odd number of CPU cores bug, pave the way for lighter weight challenges, make Anubis more adaptable, and more.
### Big ticket items
#### Proof of React challenge
A new ["proof of React"](./admin/configuration/challenges/preact.mdx) has been added. It runs a simple app in React that has several chained hooks. It is much more lightweight than the proof of work check.
#### Smaller features
- The [`segments`](./admin/configuration/expressions.mdx#segments) function was added for splitting a path into its slash-separated segments.
- Added possibility to disable HTTP keep-alive to support backends not properly handling it.
- When issuing a challenge, Anubis stores information about that challenge into the store. That stored information is later used to validate challenge responses. This works around nondeterminism in bot rules. ([#917](https://github.com/TecharoHQ/anubis/issues/917))
- When parsing [Open Graph tags](./admin/configuration/open-graph.mdx), add any URLs found in the responses to a temporary "allow cache" so that social preview images work.
- Proof of work solving has had a complete overhaul and rethink based on feedback from browser engine developers, frontend experts, and overall performance profiling.
- One of the biggest sources of lag in Firefox has been eliminated: the use of WebCrypto. Now whenever Anubis detects the client is using Firefox (or Pale Moon), it will swap over to a pure-JS implementation of SHA-256 for speed.
- Proof of work solving has had a complete overhaul and rethink based on feedback from browser engine developers, frontend experts, and overall performance profiling.
- Optimize the performance of the pure-JS Anubis solver.
- Web Workers are stored as dedicated JavaScript files in `static/js/workers/*.mjs`.
- Pave the way for non-SHA256 solver methods and eventually one that uses WebAssembly (or WebAssembly code compiled to JS for those that disable WebAssembly).
- Legacy JavaScript code has been eliminated.
- When parsing [Open Graph tags](./admin/configuration/open-graph.mdx), add any URLs found in the responses to a temporary "allow cache" so that social preview images work.
- The hard dependency on WebCrypto has been removed, allowing a proof of work challenge to work over plain (unencrypted) HTTP.
- The Anubis version number is put in the footer of every page.
- Add a default block rule for Huawei Cloud.
- Add a default block rule for Alibaba Cloud.
- Added support to use Traefik forwardAuth middleware.
- Add X-Request-URI support so that Subrequest Authentication has path support.
- Added glob matching for `REDIRECT_DOMAINS`. You can pass `*.bugs.techaro.lol` to allow redirecting to anything ending with `.bugs.techaro.lol`. There is a limit of 4 wildcards.
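
To make the `REDIRECT_DOMAINS` glob behaviour concrete, here is a minimal sketch of `*`-only matching with a wildcard cap. It is not the `internal/glob` implementation: `globMatch` and `maxWildcards` are hypothetical names, and treating a pattern with more than four wildcards as a non-match is an assumption about how the limit is enforced.

```go
package main

import (
	"fmt"
	"strings"
)

// maxWildcards mirrors the documented limit of 4 wildcards (assumed name).
const maxWildcards = 4

// globMatch reports whether subject matches pattern, where '*' matches any
// run of characters, including an empty one. Hypothetical helper.
func globMatch(pattern, subject string) bool {
	if strings.Count(pattern, "*") > maxWildcards {
		return false // assumption: over-limit patterns never match
	}
	parts := strings.Split(pattern, "*")
	if len(parts) == 1 {
		return pattern == subject // no wildcard: exact match only
	}
	// The text before the first '*' must be a prefix.
	if !strings.HasPrefix(subject, parts[0]) {
		return false
	}
	subject = subject[len(parts[0]):]
	// Each middle part must appear, in order, in what remains.
	for _, part := range parts[1 : len(parts)-1] {
		idx := strings.Index(subject, part)
		if idx < 0 {
			return false
		}
		subject = subject[idx+len(part):]
	}
	// The text after the last '*' must be a suffix of what remains.
	return strings.HasSuffix(subject, parts[len(parts)-1])
}

func main() {
	fmt.Println(globMatch("*.bugs.techaro.lol", "foo.bugs.techaro.lol")) // true
	fmt.Println(globMatch("*.bugs.techaro.lol", "evil.example"))         // false
}
```

Note that in this sketch `*.bugs.techaro.lol` does not match the bare `bugs.techaro.lol`, because the leading dot is part of the required suffix.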
### Fixes
#### Odd numbers of CPU cores are properly supported
Some phones have an odd number of CPU cores. This caused [interesting issues](https://anubis.techaro.lol/blog/2025/cpu-core-odd). This was fixed by [using `Math.trunc` to convert the number of CPU cores back into an integer](https://github.com/TecharoHQ/anubis/issues/1043).
#### Smaller fixes
- A standard library HTTP server log message about HTTP pipelining not working has been filtered out of Anubis' logs. There is no action that can be taken about it.
- Added a missing link to the Caddy installation environment in the installation documentation.
- Downstream consumers can change the default [log/slog#Logger](https://pkg.go.dev/log/slog#Logger) instance that Anubis uses by setting `opts.Logger` to your slog instance of choice ([#864](https://github.com/TecharoHQ/anubis/issues/864)).
- The [Thoth client](https://anubis.techaro.lol/docs/admin/thoth) is now public in the repo instead of being an internal package.
- [Custom-AsyncHttpClient](https://github.com/AsyncHttpClient/async-http-client)'s default User-Agent has an increased weight by default ([#852](https://github.com/TecharoHQ/anubis/issues/852)).
- Add option for replacing the default explanation text with a custom one ([#747](https://github.com/TecharoHQ/anubis/pull/747))
- The contact email in the LibreJS header has been changed.
- The hard dependency on WebCrypto has been removed, allowing a proof of work challenge to work over plain (unencrypted) HTTP.
- Firefox for Android support has been fixed by embedding the challenge ID into the pass-challenge route. This also fixes some inconsistent issues with other mobile browsers.
- The Anubis version number is put in the footer of every page.
- Prevent the proof of work nonce from being a decimal value by using Math.trunc to coerce it back to an integer if it happens ([#1043](https://github.com/TecharoHQ/anubis/issues/1043)).
- The legacy JSON based policy file example has been removed and all documentation for how to write a policy file in JSON has been deleted. JSON based policy files will still work, but YAML is the superior option for Anubis configuration.
- A standard library HTTP server log message about HTTP pipelining not working has been filtered out of Anubis' logs. There is no action that can be taken about it.
- The default `favicon` pattern in `data/common/keep-internet-working.yaml` has been updated to permit requests for png/gif/jpg/svg files as well as ico.
- The `--cookie-prefix` flag has been fixed so that it is fully respected.
- The default patterns in `data/common/keep-internet-working.yaml` have been updated to appropriately escape the '.' character in the regular expression patterns.
- Add optional restrictions for JWT based on the value of a header ([#697](https://github.com/TecharoHQ/anubis/pull/697))
- The word "hack" has been removed from the translation strings for Anubis due to incidents involving people misunderstanding that word and sending particularly horrible things to the project lead over email.
- Bump AI-robots.txt to version 1.39
- Add a default block rule for Huawei Cloud.
- Add a default block rule for Alibaba Cloud.
- Add X-Request-URI support so that Subrequest Authentication has path support.
- Add better logging when using Subrequest Authentication.
- Two of Slackware's community git repository servers are now proxied by Anubis.
- Added support to use Traefik forwardAuth middleware.
- Inject adversarial input to break AI coding assistants.
### Security-relevant changes
- Add a server-side check for the meta-refresh challenge that makes sure clients have waited for at least 95% of the time that they should.
#### Fix potential double-spend for challenges
Anubis operates by issuing a challenge and having the client present a solution for that challenge. Challenges are identified by a unique UUID, which is stored in the database.
@@ -71,15 +100,11 @@ Thanks to [@taviso](https://github.com/taviso) for reporting this issue.
### Breaking changes
- The "slow" frontend solver has been removed in order to reduce maintenance burden. Any existing uses of it will still work, but issue a warning upon startup asking administrators to upgrade to the "fast" frontend solver.
- The legacy JSON based policy file example has been removed and all documentation for how to write a policy file in JSON has been deleted. JSON based policy files will still work, but YAML is the superior option for Anubis configuration.
### New Locales
- [Lithuanian](https://github.com/TecharoHQ/anubis/pull/972)
### Added
Anubis now supports these new languages:
- Lithuanian [#972](https://github.com/TecharoHQ/anubis/pull/972)
- Vietnamese [#926](https://github.com/TecharoHQ/anubis/pull/926)
## v1.21.3: Minfilia Warde - Echo 3


@@ -197,6 +197,96 @@ $ du -hs *
8.0K reject.webp
```
## Custom HTML templates
If you need to completely control the HTML layout of all Anubis pages, you can customize the entire page with `USE_TEMPLATES=true`. This uses Go's standard library [html/template](https://pkg.go.dev/html/template) package to template HTML responses. In order to use this, you must define the following templates:
| Template path | Usage |
| :----------------------------------------- | :---------------------------------------------- |
| `$OVERLAY_FOLDER/templates/challenge.tmpl` | Challenge pages |
| `$OVERLAY_FOLDER/templates/error.tmpl` | Error pages |
| `$OVERLAY_FOLDER/templates/impressum.tmpl` | [Impressum](./configuration/impressum.mdx) page |
Here are minimal (but working) examples for each template:
<details>
<summary>`challenge.tmpl`</summary>
:::note
You **MUST** include the `{{.Head}}` segment in a `<head>` tag. It contains important information for challenges to execute. If you don't include this, no clients will be able to pass challenges.
:::
```html
<!DOCTYPE html>
<html lang="{{ .Lang }}">
<head>
{{ .Head }}
</head>
<body>
{{ .Body }}
</body>
</html>
```
</details>
<details>
<summary>`error.tmpl`</summary>
```html
<!DOCTYPE html>
<html lang="{{ .Lang }}">
<body>
{{ .Body }}
</body>
</html>
```
</details>
<details>
<summary>`impressum.tmpl`</summary>
```html
<!DOCTYPE html>
<html lang="{{ .Lang }}">
<body>
{{ .Body }}
</body>
</html>
```
</details>
### Template functions
In order to make life easier, the following template functions are defined:
#### `Asset`
Constructs the path for a static asset in the [overlay folder](#custom-images-and-css)'s `static` directory.
```go
func Asset(string) string
```
Usage:
```html
<link rel="stylesheet" href="{{ Asset "css/example.css" }}" />
```
Generates:
```html
<link
rel="stylesheet"
href="/.within.website/x/cmd/anubis/static/css/example.css"
/>
```
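For readers unfamiliar with custom template functions, here is a rough sketch of how a helper like `Asset` could be registered with `html/template`. The `staticPrefix` constant is copied from the generated output above and is an assumption; the real implementation may build the prefix differently (for example, to account for `BASE_PREFIX`):

```go
package main

import (
	"html/template"
	"os"
	"path"
)

// staticPrefix is an assumption taken from the generated output shown above.
const staticPrefix = "/.within.website/x/cmd/anubis/static"

// Asset joins a relative asset path onto the static prefix.
func Asset(name string) string {
	return path.Join(staticPrefix, name)
}

func main() {
	// Funcs must be called before Parse so the parser knows about "Asset".
	tmpl := template.Must(template.New("page").
		Funcs(template.FuncMap{"Asset": Asset}).
		Parse(`<link rel="stylesheet" href="{{ Asset "css/example.css" }}" />` + "\n"))

	if err := tmpl.Execute(os.Stdout, nil); err != nil {
		panic(err)
	}
}
```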
## Customizing messages
You can customize messages using the following environment variables:


@@ -59,7 +59,7 @@ Currently the following settings are configurable via the policy file:
Anubis uses these environment variables for configuration:
| Environment Variable | Default value | Explanation |
| :----------------------------- | :---------------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
|:-------------------------------|:------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `BASE_PREFIX` | unset | If set, adds a global prefix to all Anubis endpoints (everything starting with `/.within.website/x/anubis/`). For example, setting this to `/myapp` would make Anubis accessible at `/myapp/` instead of `/`. This is useful when running Anubis behind a reverse proxy that routes based on path prefixes. |
| `BIND` | `:8923` | The network address that Anubis listens on. For `unix`, set this to a path: `/run/anubis/instance.sock` |
| `BIND_NETWORK` | `tcp` | The address family that Anubis listens on. Accepts `tcp`, `unix` and anything Go's [`net.Listen`](https://pkg.go.dev/net#Listen) supports. |
@@ -67,6 +67,7 @@ Anubis uses these environment variables for configuration:
| `COOKIE_DYNAMIC_DOMAIN` | false | If set to true, automatically set cookie domain fields based on the hostname of the request. EG: if you are making a request to `anubis.techaro.lol`, the Anubis cookie will be valid for any subdomain of `techaro.lol`. |
| `COOKIE_EXPIRATION_TIME` | `168h` | The amount of time the authorization cookie is valid for. |
| `COOKIE_PARTITIONED` | `false` | If set to `true`, enables the [partitioned (CHIPS) flag](https://developers.google.com/privacy-sandbox/cookies/chips), meaning that Anubis inside an iframe has a different set of cookies than the domain hosting the iframe. |
| `COOKIE_PREFIX` | `anubis-cookie` | The prefix used for browser cookies created by Anubis. Useful for customization or avoiding conflicts with other applications. |
| `COOKIE_SECURE` | `true` | If set to `true`, enables the [Secure flag](https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/Cookies#block_access_to_your_cookies), meaning that the cookies will only be transmitted over HTTPS. If Anubis is used in an insecure context (plain HTTP), this will need to be set to `false`. |
| `DIFFICULTY` | `4` | The difficulty of the challenge, or the number of leading zeroes that must be in successful responses. |
| `ED25519_PRIVATE_KEY_HEX` | unset | The hex-encoded ed25519 private key used to sign Anubis responses. If this is not set, Anubis will generate one for you. This should be exactly 64 characters long. When running multiple instances on the same base domain, the key must be the same across all instances. See below for details. |
@@ -81,6 +82,7 @@ Anubis uses these environment variables for configuration:
| `PUBLIC_URL` | unset | The externally accessible URL for this Anubis instance, used for constructing redirect URLs (e.g., for Traefik forwardAuth). |
| `REDIRECT_DOMAINS` | unset | If set, restrict the domains that Anubis can redirect to when passing a challenge.<br/><br/>If this is unset, Anubis may redirect to any domain, which could cause security issues in the unlikely case that an attacker passes a challenge for your browser and then tricks you into clicking a link to your domain.<br/><br/>Note that if you are hosting Anubis on a non-standard port (`https://example.com:8443`, `http://www.example.net:8080`, etc.), you must also include the port number here. |
| `SERVE_ROBOTS_TXT` | `false` | If set `true`, Anubis will serve a default `robots.txt` file that disallows all known AI scrapers by name and then additionally disallows every scraper. This is useful if facts and circumstances make it difficult to change the underlying service to serve such a `robots.txt` file. |
| `SLOG_LEVEL` | `INFO` | The log level for structured logging. Valid values are `DEBUG`, `INFO`, `WARN`, and `ERROR`. Set to `DEBUG` to see all requests, evaluations, and detailed diagnostic information. |
| `SOCKET_MODE` | `0770` | _Only used when at least one of the `*_BIND_NETWORK` variables are set to `unix`._ The socket mode (permissions) for Unix domain sockets. |
| `STRIP_BASE_PREFIX` | `false` | If set to `true`, strips the base prefix from request paths when forwarding to the target server. This is useful when your target service expects to receive requests without the base prefix. For example, with `BASE_PREFIX=/foo` and `STRIP_BASE_PREFIX=true`, a request to `/foo/bar` would be forwarded to the target as `/bar`. |
| `TARGET` | `http://localhost:3923` | The URL of the service that Anubis should forward valid requests to. Supports Unix domain sockets, set this to a URI like so: `unix:///path/to/socket.sock`. |
@@ -100,10 +102,12 @@ If you don't know or understand what these settings mean, ignore them. These are
| Environment Variable | Default value | Explanation |
| :---------------------------- | :------------ | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `TARGET_SNI` | unset | If set, overrides the TLS handshake hostname in requests forwarded to `TARGET`. |
| `FORCED_LANGUAGE` | unset | If set, forces Anubis to display challenge pages in the specified language instead of using the browser's Accept-Language header. Use ISO 639-1 language codes (e.g., `de` for German, `fr` for French). |
| `HS512_SECRET` | unset | Secret string for JWT HS512 algorithm. If this is not set, Anubis will use ED25519 as defined via the variables above. The longer the better; 128 chars should suffice. |
| `TARGET_DISABLE_KEEPALIVE` | `false` | If `true`, disables HTTP keep-alive for connections to the target backend. Useful for backends that don't handle keep-alive properly. |
| `TARGET_HOST` | unset | If set, overrides the Host header in requests forwarded to `TARGET`. |
| `TARGET_INSECURE_SKIP_VERIFY` | `false` | If `true`, skip TLS certificate validation for targets that listen over `https`. If your backend does not listen over `https`, ignore this setting. |
| `HS512_SECRET` | unset | Secret string for JWT HS512 algorithm. If this is not set, Anubis will use ED25519 as defined via the variables above. The longer the better; 128 chars should suffice. |
| `TARGET_SNI` | unset | If set, overrides the TLS handshake hostname in requests forwarded to `TARGET`. |
</details>


@@ -196,6 +196,83 @@ store:
path: /data/anubis.bdb
```
### `s3api`
A network storage layer backed by [object storage](https://en.wikipedia.org/wiki/Object_storage), accessed through the [S3 API](https://docs.aws.amazon.com/AmazonS3/latest/API/Type_API_Reference.html). Any S3-compatible object storage service can be used, such as:
- [AWS S3](https://aws.amazon.com/s3/)
- [Cloudflare R2](https://www.cloudflare.com/developer-platform/products/r2/)
- [Hetzner Object Storage](https://www.hetzner.com/storage/object-storage/)
- [Minio](https://www.min.io/)
- [Tigris](https://www.tigrisdata.com/)
If you are using a cloud platform, they likely provide an S3 compatible object storage service. If not, you may want to choose [one of the fastest options](https://www.tigrisdata.com/blog/benchmark-small-objects/).
| Should I use this backend? | Yes/no |
| :------------------------------------------------------------ | :----- |
| Are you running only one instance of Anubis for this service? | 🚫 No |
| Does your service get a lot of traffic? | ✅ Yes |
| Do you want to store data persistently when Anubis restarts? | ✅ Yes |
| Do you run Anubis without mutable filesystem storage? | ✅ Yes |
:::note
Using this backend will cause a lot of S3 operations: at least one to create each challenge, one to invalidate it, one to update it to prevent double-spends, and one to remove it.
:::
#### Configuration
The `s3api` backend takes the following configuration options:
| Name | Type | Example | Description |
| :----------- | :------ | :------------------------------------------------------------------- | :------------------------------------------------------------------------------------------------------------------------------------------ |
| `bucketName` | string | `techaro-prod-anubis` | The name of the dedicated bucket for Anubis to store information in. |
| `pathStyle` | boolean | `false` | If true, use path-style S3 API operations. Please consult your storage provider's documentation if you don't know what you should put here. |
:::note
You should probably enable a lifecycle expiration rule for buckets containing Anubis data. Here is an example policy:
```json
{
"Rules": [
{
"Status": "Enabled",
"Expiration": {
"Days": 7
}
}
]
}
```
Adjust this as facts and circumstances demand, but 7 days should be enough for anyone.
:::
Example:
Assuming your environment looks like this:
```sh
# All of the following are fake credentials that look like real ones.
AWS_ACCESS_KEY_ID=accordingToAllKnownRulesOfAviation
AWS_SECRET_ACCESS_KEY=thereIsNoWayABeeShouldBeAbleToFly
AWS_REGION=yow
AWS_ENDPOINT_URL_S3=https://yow.s3.probably-not-malware.lol
```
Then your configuration would look like this:
```yaml
store:
backend: s3api
parameters:
bucketName: techaro-prod-anubis
pathStyle: false
```
### `valkey`
[Valkey](https://valkey.io/) is an in-memory key/value store that clients access over the network. This allows multiple instances of Anubis to share information and does not require each instance of Anubis to have persistent filesystem storage.

18
go.mod

@@ -5,6 +5,9 @@ go 1.24.2
require (
github.com/TecharoHQ/thoth-proto v0.4.0
github.com/a-h/templ v0.3.924
github.com/aws/aws-sdk-go-v2 v1.38.3
github.com/aws/aws-sdk-go-v2/config v1.31.6
github.com/aws/aws-sdk-go-v2/service/s3 v1.87.3
github.com/cespare/xxhash/v2 v2.3.0
github.com/facebookgo/flagenv v0.0.0-20160425205200-fcd59fca7456
github.com/gaissmai/bart v0.23.0
@@ -49,6 +52,21 @@ require (
github.com/a-h/parse v0.0.0-20250122154542-74294addb73e // indirect
github.com/andybalholm/brotli v1.2.0 // indirect
github.com/antlr4-go/antlr/v4 v4.13.1 // indirect
github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.7.1 // indirect
github.com/aws/aws-sdk-go-v2/credentials v1.18.10 // indirect
github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.18.6 // indirect
github.com/aws/aws-sdk-go-v2/internal/configsources v1.4.6 // indirect
github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.7.6 // indirect
github.com/aws/aws-sdk-go-v2/internal/ini v1.8.3 // indirect
github.com/aws/aws-sdk-go-v2/internal/v4a v1.4.6 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.13.1 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/checksum v1.8.6 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.13.6 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/s3shared v1.19.6 // indirect
github.com/aws/aws-sdk-go-v2/service/sso v1.29.1 // indirect
github.com/aws/aws-sdk-go-v2/service/ssooidc v1.34.2 // indirect
github.com/aws/aws-sdk-go-v2/service/sts v1.38.2 // indirect
github.com/aws/smithy-go v1.23.0 // indirect
github.com/beorn7/perks v1.0.1 // indirect
github.com/blakesmith/ar v0.0.0-20190502131153-809d4375e1fb // indirect
github.com/cavaliergopher/cpio v1.0.1 // indirect

36
go.sum

@@ -51,6 +51,42 @@ github.com/antlr4-go/antlr/v4 v4.13.1 h1:SqQKkuVZ+zWkMMNkjy5FZe5mr5WURWnlpmOuzYW
github.com/antlr4-go/antlr/v4 v4.13.1/go.mod h1:GKmUxMtwp6ZgGwZSva4eWPC5mS6vUAmOABFgjdkM7Nw=
github.com/armon/go-socks5 v0.0.0-20160902184237-e75332964ef5 h1:0CwZNZbxp69SHPdPJAN/hZIm0C4OItdklCFmMRWYpio=
github.com/armon/go-socks5 v0.0.0-20160902184237-e75332964ef5/go.mod h1:wHh0iHkYZB8zMSxRWpUBQtwG5a7fFgvEO+odwuTv2gs=
github.com/aws/aws-sdk-go-v2 v1.38.3 h1:B6cV4oxnMs45fql4yRH+/Po/YU+597zgWqvDpYMturk=
github.com/aws/aws-sdk-go-v2 v1.38.3/go.mod h1:sDioUELIUO9Znk23YVmIk86/9DOpkbyyVb1i/gUNFXY=
github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.7.1 h1:i8p8P4diljCr60PpJp6qZXNlgX4m2yQFpYk+9ZT+J4E=
github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.7.1/go.mod h1:ddqbooRZYNoJ2dsTwOty16rM+/Aqmk/GOXrK8cg7V00=
github.com/aws/aws-sdk-go-v2/config v1.31.6 h1:a1t8fXY4GT4xjyJExz4knbuoxSCacB5hT/WgtfPyLjo=
github.com/aws/aws-sdk-go-v2/config v1.31.6/go.mod h1:5ByscNi7R+ztvOGzeUaIu49vkMk2soq5NaH5PYe33MQ=
github.com/aws/aws-sdk-go-v2/credentials v1.18.10 h1:xdJnXCouCx8Y0NncgoptztUocIYLKeQxrCgN6x9sdhg=
github.com/aws/aws-sdk-go-v2/credentials v1.18.10/go.mod h1:7tQk08ntj914F/5i9jC4+2HQTAuJirq7m1vZVIhEkWs=
github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.18.6 h1:wbjnrrMnKew78/juW7I2BtKQwa1qlf6EjQgS69uYY14=
github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.18.6/go.mod h1:AtiqqNrDioJXuUgz3+3T0mBWN7Hro2n9wll2zRUc0ww=
github.com/aws/aws-sdk-go-v2/internal/configsources v1.4.6 h1:uF68eJA6+S9iVr9WgX1NaRGyQ/6MdIyc4JNUo6TN1FA=
github.com/aws/aws-sdk-go-v2/internal/configsources v1.4.6/go.mod h1:qlPeVZCGPiobx8wb1ft0GHT5l+dc6ldnwInDFaMvC7Y=
github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.7.6 h1:pa1DEC6JoI0zduhZePp3zmhWvk/xxm4NB8Hy/Tlsgos=
github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.7.6/go.mod h1:gxEjPebnhWGJoaDdtDkA0JX46VRg1wcTHYe63OfX5pE=
github.com/aws/aws-sdk-go-v2/internal/ini v1.8.3 h1:bIqFDwgGXXN1Kpp99pDOdKMTTb5d2KyU5X/BZxjOkRo=
github.com/aws/aws-sdk-go-v2/internal/ini v1.8.3/go.mod h1:H5O/EsxDWyU+LP/V8i5sm8cxoZgc2fdNR9bxlOFrQTo=
github.com/aws/aws-sdk-go-v2/internal/v4a v1.4.6 h1:R0tNFJqfjHL3900cqhXuwQ+1K4G0xc9Yf8EDbFXCKEw=
github.com/aws/aws-sdk-go-v2/internal/v4a v1.4.6/go.mod h1:y/7sDdu+aJvPtGXr4xYosdpq9a6T9Z0jkXfugmti0rI=
github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.13.1 h1:oegbebPEMA/1Jny7kvwejowCaHz1FWZAQ94WXFNCyTM=
github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.13.1/go.mod h1:kemo5Myr9ac0U9JfSjMo9yHLtw+pECEHsFtJ9tqCEI8=
github.com/aws/aws-sdk-go-v2/service/internal/checksum v1.8.6 h1:hncKj/4gR+TPauZgTAsxOxNcvBayhUlYZ6LO/BYiQ30=
github.com/aws/aws-sdk-go-v2/service/internal/checksum v1.8.6/go.mod h1:OiIh45tp6HdJDDJGnja0mw8ihQGz3VGrUflLqSL0SmM=
github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.13.6 h1:LHS1YAIJXJ4K9zS+1d/xa9JAA9sL2QyXIQCQFQW/X08=
github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.13.6/go.mod h1:c9PCiTEuh0wQID5/KqA32J+HAgZxN9tOGXKCiYJjTZI=
github.com/aws/aws-sdk-go-v2/service/internal/s3shared v1.19.6 h1:nEXUSAwyUfLTgnc9cxlDWy637qsq4UWwp3sNAfl0Z3Y=
github.com/aws/aws-sdk-go-v2/service/internal/s3shared v1.19.6/go.mod h1:HGzIULx4Ge3Do2V0FaiYKcyKzOqwrhUZgCI77NisswQ=
github.com/aws/aws-sdk-go-v2/service/s3 v1.87.3 h1:ETkfWcXP2KNPLecaDa++5bsQhCRa5M5sLUJa5DWYIIg=
github.com/aws/aws-sdk-go-v2/service/s3 v1.87.3/go.mod h1:+/3ZTqoYb3Ur7DObD00tarKMLMuKg8iqz5CHEanqTnw=
github.com/aws/aws-sdk-go-v2/service/sso v1.29.1 h1:8OLZnVJPvjnrxEwHFg9hVUof/P4sibH+Ea4KKuqAGSg=
github.com/aws/aws-sdk-go-v2/service/sso v1.29.1/go.mod h1:27M3BpVi0C02UiQh1w9nsBEit6pLhlaH3NHna6WUbDE=
github.com/aws/aws-sdk-go-v2/service/ssooidc v1.34.2 h1:gKWSTnqudpo8dAxqBqZnDoDWCiEh/40FziUjr/mo6uA=
github.com/aws/aws-sdk-go-v2/service/ssooidc v1.34.2/go.mod h1:x7+rkNmRoEN1U13A6JE2fXne9EWyJy54o3n6d4mGaXQ=
github.com/aws/aws-sdk-go-v2/service/sts v1.38.2 h1:YZPjhyaGzhDQEvsffDEcpycq49nl7fiGcfJTIo8BszI=
github.com/aws/aws-sdk-go-v2/service/sts v1.38.2/go.mod h1:2dIN8qhQfv37BdUYGgEC8Q3tteM3zFxTI1MLO2O3J3c=
github.com/aws/smithy-go v1.23.0 h1:8n6I3gXzWJB2DxBDnfxgBaSX6oe0d/t10qGz7OKqMCE=
github.com/aws/smithy-go v1.23.0/go.mod h1:t1ufH5HMublsJYulve2RKmHDC15xu1f26kHCp/HgceI=
github.com/beorn7/perks v1.0.1 h1:VlbKKnNfV8bJzeqoa4cOKqO6bYr3WgKZxO8Z16+hsOM=
github.com/beorn7/perks v1.0.1/go.mod h1:G2ZrVWU2WbWT9wwq4/hrbKbnv/1ERSJQ0ibhJ6rlkpw=
github.com/blakesmith/ar v0.0.0-20190502131153-809d4375e1fb h1:m935MPodAbYS46DG4pJSv7WO+VECIWUQ7OJYSoTrMh4=

61
internal/glob/glob.go Normal file

@@ -0,0 +1,61 @@
package glob
import "strings"
const GLOB = "*"
const maxGlobParts = 5
// Glob will test a string pattern, potentially containing globs, against a
// subject string. The result is a simple true/false, determining whether or
// not the glob pattern matched the subject text.
func Glob(pattern, subj string) bool {
// Empty pattern can only match empty subject
if pattern == "" {
return subj == pattern
}
// If the pattern _is_ a glob, it matches everything
if pattern == GLOB {
return true
}
parts := strings.Split(pattern, GLOB)
if len(parts) > maxGlobParts {
return false // Pattern is too complex, reject it.
}
if len(parts) == 1 {
// No globs in pattern, so test for equality
return subj == pattern
}
leadingGlob := strings.HasPrefix(pattern, GLOB)
trailingGlob := strings.HasSuffix(pattern, GLOB)
end := len(parts) - 1
// Go over the leading parts and ensure they match.
for i := 0; i < end; i++ {
idx := strings.Index(subj, parts[i])
switch i {
case 0:
// Check the first section. Requires special handling.
if !leadingGlob && idx != 0 {
return false
}
default:
// Check that the middle parts match.
if idx < 0 {
return false
}
}
// Trim evaluated text from subj as we loop over the pattern.
subj = subj[idx+len(parts[i]):]
}
// Reached the last section. Requires special handling.
return trailingGlob || strings.HasSuffix(subj, parts[end])
}
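
To make the matcher's behavior concrete for hostname patterns, here is a small standalone sketch. The `Glob` function is a condensed copy of the one above so that the example compiles on its own; the hostnames are made up for illustration:

```go
package main

import (
	"fmt"
	"strings"
)

const maxGlobParts = 5

// Glob is a condensed copy of the matcher above, repeated here only so the
// example is self-contained.
func Glob(pattern, subj string) bool {
	if pattern == "" {
		return subj == ""
	}
	if pattern == "*" {
		return true
	}
	parts := strings.Split(pattern, "*")
	if len(parts) > maxGlobParts {
		return false // too many wildcards, reject outright
	}
	if len(parts) == 1 {
		return subj == pattern
	}
	leading := strings.HasPrefix(pattern, "*")
	trailing := strings.HasSuffix(pattern, "*")
	end := len(parts) - 1
	for i := 0; i < end; i++ {
		idx := strings.Index(subj, parts[i])
		if i == 0 {
			if !leading && idx != 0 {
				return false
			}
		} else if idx < 0 {
			return false
		}
		subj = subj[idx+len(parts[i]):]
	}
	return trailing || strings.HasSuffix(subj, parts[end])
}

func main() {
	fmt.Println(Glob("*.example.com", "git.example.com"))      // true
	fmt.Println(Glob("*.example.com", "example.com"))          // false: the dot is required
	fmt.Println(Glob("*.example.com", "git.example.com:8443")) // false: port is part of the subject
	fmt.Println(Glob("*.Example.com", "git.example.com"))      // false: matching is case-sensitive
	fmt.Println(Glob("a*b*c*d*e*f", "abcdef"))                 // false: 6 parts exceeds the limit
}
```

Two consequences are worth noting: a pattern such as `*.example.com` does not match the bare apex domain, and callers that need case-insensitive matching have to lowercase both arguments first.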

189
internal/glob/glob_test.go Normal file

@@ -0,0 +1,189 @@
package glob
import "testing"
func TestGlob_EqualityAndEmpty(t *testing.T) {
cases := []struct {
name string
pattern string
subj string
want bool
}{
{"exact match", "hello", "hello", true},
{"exact mismatch", "hello", "hell", false},
{"empty pattern and subject", "", "", true},
{"empty pattern with non-empty subject", "", "x", false},
{"pattern star matches empty", "*", "", true},
{"pattern star matches anything", "*", "anything at all", true},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
if got := Glob(tc.pattern, tc.subj); got != tc.want {
t.Fatalf("Glob(%q,%q) = %v, want %v", tc.pattern, tc.subj, got, tc.want)
}
})
}
}
func TestGlob_LeadingAndTrailing(t *testing.T) {
cases := []struct {
name string
pattern string
subj string
want bool
}{
{"prefix match - minimal", "foo*", "foo", true},
{"prefix match - extended", "foo*", "foobar", true},
{"prefix mismatch - not at start", "foo*", "xfoo", false},
{"suffix match - minimal", "*foo", "foo", true},
{"suffix match - extended", "*foo", "xfoo", true},
{"suffix mismatch - not at end", "*foo", "foox", false},
{"contains match", "*foo*", "barfoobaz", true},
{"contains mismatch - missing needle", "*foo*", "f", false},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
if got := Glob(tc.pattern, tc.subj); got != tc.want {
t.Fatalf("Glob(%q,%q) = %v, want %v", tc.pattern, tc.subj, got, tc.want)
}
})
}
}
func TestGlob_MiddleAndOrder(t *testing.T) {
cases := []struct {
name string
pattern string
subj string
want bool
}{
{"middle wildcard basic", "f*o", "fo", true},
{"middle wildcard gap", "f*o", "fZZZo", true},
{"middle wildcard requires start f", "f*o", "xfyo", false},
{"order enforced across parts", "a*b*c*d", "axxbxxcxxd", true},
{"order mismatch fails", "a*b*c*d", "abdc", false},
{"must end with last part when no trailing *", "*foo*bar", "zzfooqqbar", true},
{"failing when trailing chars remain", "*foo*bar", "zzfooqqbarzz", false},
{"first part must start when no leading *", "foo*bar", "zzfooqqbar", false},
{"works with overlapping content", "ab*ba", "ababa", true},
{"needle not found fails", "foo*bar", "foobaz", false},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
if got := Glob(tc.pattern, tc.subj); got != tc.want {
t.Fatalf("Glob(%q,%q) = %v, want %v", tc.pattern, tc.subj, got, tc.want)
}
})
}
}
func TestGlob_ConsecutiveStarsAndEmptyParts(t *testing.T) {
cases := []struct {
name string
pattern string
subj string
want bool
}{
{"double star matches anything", "**", "", true},
{"double star matches anything non-empty", "**", "abc", true},
{"consecutive stars behave like single", "a**b", "ab", true},
{"consecutive stars with gaps", "a**b", "axxxb", true},
{"consecutive stars + trailing star", "a**b*", "axxbzzz", true},
{"consecutive stars still enforce anchors", "a**b", "xaBy", false},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
if got := Glob(tc.pattern, tc.subj); got != tc.want {
t.Fatalf("Glob(%q,%q) = %v, want %v", tc.pattern, tc.subj, got, tc.want)
}
})
}
}
func TestGlob_MaxPartsLimit(t *testing.T) {
// Allowed: up to 4 '*' (5 parts)
allowed := []struct {
pattern string
subj string
want bool
}{
{"a*b*c*d*e", "axxbxxcxxdxxe", true}, // 4 stars -> 5 parts
{"*a*b*c*d", "zzzaaaabbbcccddd", true},
{"a*b*c*d*e", "abcde", true},
{"a*b*c*d*e", "abxdxe", false}, // missing 'c' should fail
}
for _, tc := range allowed {
if got := Glob(tc.pattern, tc.subj); got != tc.want {
t.Fatalf("allowed pattern Glob(%q,%q) = %v, want %v", tc.pattern, tc.subj, got, tc.want)
}
}
// Disallowed: 5 '*' (6 parts) -> always false by complexity check
disallowed := []struct {
pattern string
subj string
}{
{"a*b*c*d*e*f", "aXXbYYcZZdQQeRRf"},
{"*a*b*c*d*e*", "abcdef"},
{"******", "anything"}, // 6 stars -> 7 parts
}
for _, tc := range disallowed {
if got := Glob(tc.pattern, tc.subj); got {
t.Fatalf("disallowed pattern should fail Glob(%q,%q) = %v, want false", tc.pattern, tc.subj, got)
}
}
}
func TestGlob_CaseSensitivity(t *testing.T) {
cases := []struct {
pattern string
subj string
want bool
}{
{"FOO*", "foo", false},
{"*Bar", "bar", false},
{"Foo*Bar", "FooZZZBar", true},
}
for _, tc := range cases {
if got := Glob(tc.pattern, tc.subj); got != tc.want {
t.Fatalf("Glob(%q,%q) = %v, want %v", tc.pattern, tc.subj, got, tc.want)
}
}
}
func TestGlob_EmptySubjectInteractions(t *testing.T) {
cases := []struct {
pattern string
subj string
want bool
}{
{"*a", "", false},
{"a*", "", false},
{"**", "", true},
{"*", "", true},
}
for _, tc := range cases {
if got := Glob(tc.pattern, tc.subj); got != tc.want {
t.Fatalf("Glob(%q,%q) = %v, want %v", tc.pattern, tc.subj, got, tc.want)
}
}
}
func BenchmarkGlob(b *testing.B) {
patterns := []string{
"*", "*foo*", "foo*bar", "a*b*c*d*e", "a**b*", "*needle*end",
}
subjects := []string{
"", "foo", "barfoo", "foobarbaz", "axxbxxcxxdxxe", "zzfooqqbarzz",
"lorem ipsum dolor sit amet, consectetur adipiscing elit",
}
for _, p := range patterns {
for _, s := range subjects {
b.Run(p+"::"+s, func(b *testing.B) {
for i := 0; i < b.N; i++ {
_ = Glob(p, s)
}
})
}
}
}


@@ -11,7 +11,6 @@ import (
"net"
"net/http"
"net/url"
"slices"
"strings"
"time"
@@ -435,7 +434,7 @@ func (s *Server) PassChallenge(w http.ResponseWriter, r *http.Request) {
s.respondWithError(w, r, localizer.T("redirect_not_parseable"))
return
}
if (len(urlParsed.Host) > 0 && len(s.opts.RedirectDomains) != 0 && !slices.Contains(s.opts.RedirectDomains, urlParsed.Host)) || urlParsed.Host != r.URL.Host {
if (len(urlParsed.Host) > 0 && len(s.opts.RedirectDomains) != 0 && !matchRedirectDomain(s.opts.RedirectDomains, urlParsed.Host)) || urlParsed.Host != r.URL.Host {
lg.Debug("domain not allowed", "domain", urlParsed.Host)
s.respondWithError(w, r, localizer.T("redirect_domain_not_allowed"))
return


@@ -43,7 +43,7 @@ func (i *Impl) Issue(r *http.Request, lg *slog.Logger, in *challenge.IssueInput)
}
func (i *Impl) Validate(r *http.Request, lg *slog.Logger, in *challenge.ValidateInput) error {
wantTime := in.Challenge.IssuedAt.Add(time.Duration(in.Rule.Challenge.Difficulty) * 950 * time.Millisecond)
wantTime := in.Challenge.IssuedAt.Add(time.Duration(in.Rule.Challenge.Difficulty) * 800 * time.Millisecond)
if time.Now().Before(wantTime) {
return challenge.NewError("validate", "insufficent time", fmt.Errorf("%w: wanted user to wait until at least %s", challenge.ErrFailed, wantTime.Format(time.RFC3339)))


@@ -13,6 +13,6 @@ templ page(redir string, difficulty int, loc *localization.SimpleLocalizer) {
<img style="display:none;" style="width:100%;max-width:256px;" src={ anubis.BasePrefix + "/.within.website/x/cmd/anubis/static/img/happy.webp?cacheBuster=" + anubis.Version }/>
<p id="status">{ loc.T("loading") }</p>
<p>{ loc.T("connection_security") }</p>
<meta http-equiv="refresh" content={ fmt.Sprintf("%d; url=%s", difficulty, redir) }/>
<meta http-equiv="refresh" content={ fmt.Sprintf("%d; url=%s", difficulty+1, redir) }/>
</div>
}


@@ -32,7 +32,7 @@ const App = () => {
useEffect(() => {
const timer = setTimeout(() => {
setPassed(true);
}, state.difficulty * 100);
}, state.difficulty * 125);
return () => clearTimeout(timer);
}, [challenge]);


@@ -57,7 +57,7 @@ func (i *impl) Issue(r *http.Request, lg *slog.Logger, in *challenge.IssueInput)
}
func (i *impl) Validate(r *http.Request, lg *slog.Logger, in *challenge.ValidateInput) error {
wantTime := in.Challenge.IssuedAt.Add(time.Duration(in.Rule.Challenge.Difficulty) * 95 * time.Millisecond)
wantTime := in.Challenge.IssuedAt.Add(time.Duration(in.Rule.Challenge.Difficulty) * 80 * time.Millisecond)
if time.Now().Before(wantTime) {
return challenge.NewError("validate", "insufficent time", fmt.Errorf("%w: wanted user to wait until at least %s", challenge.ErrFailed, wantTime.Format(time.RFC3339)))


@@ -7,12 +7,12 @@ import (
"net/http"
"net/url"
"regexp"
"slices"
"strings"
"time"
"github.com/TecharoHQ/anubis"
"github.com/TecharoHQ/anubis/internal"
"github.com/TecharoHQ/anubis/internal/glob"
"github.com/TecharoHQ/anubis/lib/challenge"
"github.com/TecharoHQ/anubis/lib/localization"
"github.com/TecharoHQ/anubis/lib/policy"
@@ -24,6 +24,26 @@ import (
var domainMatchRegexp = regexp.MustCompile(`^((xn--)?[a-z0-9]+(-[a-z0-9]+)*\.)+[a-z]{2,}$`)
// matchRedirectDomain returns true if host matches any of the allowed redirect
// domain patterns. Patterns may contain '*' which are matched using the
// internal glob matcher. Matching is case-insensitive on hostnames.
func matchRedirectDomain(allowed []string, host string) bool {
h := strings.ToLower(strings.TrimSpace(host))
for _, pat := range allowed {
p := strings.ToLower(strings.TrimSpace(pat))
if strings.Contains(p, glob.GLOB) {
if glob.Glob(p, h) {
return true
}
continue
}
if p == h {
return true
}
}
return false
}
type CookieOpts struct {
Value string
Host string
@@ -217,8 +237,8 @@ func (s *Server) constructRedirectURL(r *http.Request) (string, error) {
if proto == "" || host == "" || uri == "" {
return "", errors.New(localizer.T("missing_required_forwarded_headers"))
}
// Check if host is allowed in RedirectDomains
if len(s.opts.RedirectDomains) > 0 && !slices.Contains(s.opts.RedirectDomains, host) {
// Check if host is allowed in RedirectDomains (supports '*' via glob)
if len(s.opts.RedirectDomains) > 0 && !matchRedirectDomain(s.opts.RedirectDomains, host) {
lg := internal.GetRequestLogger(s.logger, r)
lg.Debug("domain not allowed", "domain", host)
return "", errors.New(localizer.T("redirect_domain_not_allowed"))
@@ -290,7 +310,7 @@ func (s *Server) ServeHTTPNext(w http.ResponseWriter, r *http.Request) {
hostNotAllowed := len(urlParsed.Host) > 0 &&
len(s.opts.RedirectDomains) != 0 &&
!slices.Contains(s.opts.RedirectDomains, urlParsed.Host)
!matchRedirectDomain(s.opts.RedirectDomains, urlParsed.Host)
hostMismatch := r.URL.Host != "" && urlParsed.Host != r.URL.Host
if hostNotAllowed || hostMismatch {
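
Reviewer note: the matching rule introduced by matchRedirectDomain (trim, lowercase, then glob-or-exact compare) can be tried outside the repository with the sketch below. It is not the project's implementation: matchHost is a hypothetical name, and path.Match from the standard library stands in for the internal glob package, so '?' and '[...]' are also treated as wildcards here, which the real matcher may not do.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// matchHost is a hypothetical stand-in for matchRedirectDomain. It normalizes
// both sides (trim + lowercase) and then asks path.Match whether the pattern
// covers the host. Assumption: path.Match replaces Anubis' internal glob
// matcher, so its extra metacharacters ('?', '[...]') also apply.
func matchHost(allowed []string, host string) bool {
	h := strings.ToLower(strings.TrimSpace(host))
	for _, pat := range allowed {
		p := strings.ToLower(strings.TrimSpace(pat))
		// A malformed pattern yields an error; treat it as "no match".
		if ok, err := path.Match(p, h); err == nil && ok {
			return true
		}
	}
	return false
}

func main() {
	allowed := []string{"example.com", "*.Example.org"}
	for _, host := range []string{"example.com", "API.example.org", "example.org", "evil.test"} {
		fmt.Printf("%-16s allowed=%v\n", host, matchHost(allowed, host))
	}
}
```

In this sketch a pattern such as `*.example.org` covers subdomains but not the bare `example.org`, because the literal dot must still be present; an operator who wants both would list both entries.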

View File

@@ -6,5 +6,6 @@ package all
import (
_ "github.com/TecharoHQ/anubis/lib/store/bbolt"
_ "github.com/TecharoHQ/anubis/lib/store/memory"
_ "github.com/TecharoHQ/anubis/lib/store/s3api"
_ "github.com/TecharoHQ/anubis/lib/store/valkey"
)

107
lib/store/s3api/factory.go Normal file
View File

@@ -0,0 +1,107 @@
package s3api
import (
"context"
"encoding/json"
"errors"
"fmt"
"github.com/TecharoHQ/anubis/lib/store"
awsConfig "github.com/aws/aws-sdk-go-v2/config"
"github.com/aws/aws-sdk-go-v2/service/s3"
)
var (
ErrNoRegion = errors.New("s3api.Config: no region env var name defined")
ErrNoAccessKeyID = errors.New("s3api.Config: no access key id env var name defined")
ErrNoSecretAccessKey = errors.New("s3api.Config: no secret access key env var name defined")
ErrNoBucketName = errors.New("s3api.Config: no bucket name env var name defined")
)
func init() {
store.Register("s3api", Factory{})
}
// S3API is the subset of the AWS S3 client used by this store. It enables mocking in tests.
type S3API interface {
PutObject(ctx context.Context, params *s3.PutObjectInput, optFns ...func(*s3.Options)) (*s3.PutObjectOutput, error)
GetObject(ctx context.Context, params *s3.GetObjectInput, optFns ...func(*s3.Options)) (*s3.GetObjectOutput, error)
DeleteObject(ctx context.Context, params *s3.DeleteObjectInput, optFns ...func(*s3.Options)) (*s3.DeleteObjectOutput, error)
HeadObject(ctx context.Context, params *s3.HeadObjectInput, optFns ...func(*s3.Options)) (*s3.HeadObjectOutput, error)
}
// Factory builds an S3-backed store. It can optionally carry a preconstructed
// S3 client via Client (e.g., a mock in tests); when Client is nil, Build
// creates a client from the default AWS configuration.
type Factory struct {
Client S3API
}
func (f Factory) Build(ctx context.Context, data json.RawMessage) (store.Interface, error) {
var config Config
if err := json.Unmarshal([]byte(data), &config); err != nil {
return nil, fmt.Errorf("%w: %w", store.ErrBadConfig, err)
}
if err := config.Valid(); err != nil {
return nil, fmt.Errorf("%w: %w", store.ErrBadConfig, err)
}
if config.BucketName == "" {
return nil, fmt.Errorf("%w: %s", store.ErrBadConfig, ErrNoBucketName)
}
// If a client was injected (e.g., tests), use it directly.
if f.Client != nil {
return &Store{
s3: f.Client,
bucket: config.BucketName,
}, nil
}
cfg, err := awsConfig.LoadDefaultConfig(ctx)
if err != nil {
return nil, fmt.Errorf("can't load AWS config from environment: %w", err)
}
client := s3.NewFromConfig(cfg, func(o *s3.Options) {
o.UsePathStyle = config.PathStyle
})
return &Store{
s3: client,
bucket: config.BucketName,
}, nil
}
func (Factory) Valid(data json.RawMessage) error {
var config Config
if err := json.Unmarshal([]byte(data), &config); err != nil {
return fmt.Errorf("%w: %w", store.ErrBadConfig, err)
}
if err := config.Valid(); err != nil {
return fmt.Errorf("%w: %w", store.ErrBadConfig, err)
}
return nil
}
type Config struct {
PathStyle bool `json:"pathStyle"`
BucketName string `json:"bucketName"`
}
func (c Config) Valid() error {
var errs []error
if c.BucketName == "" {
errs = append(errs, ErrNoBucketName)
}
if len(errs) != 0 {
return fmt.Errorf("s3api.Config: invalid config: %w", errors.Join(errs...))
}
return nil
}

78
lib/store/s3api/s3api.go Normal file
View File

@@ -0,0 +1,78 @@
package s3api
import (
"bytes"
"context"
"fmt"
"io"
"strconv"
"strings"
"time"
"github.com/TecharoHQ/anubis/lib/store"
"github.com/aws/aws-sdk-go-v2/service/s3"
)
type Store struct {
s3 S3API
bucket string
}
func (s *Store) Delete(ctx context.Context, key string) error {
normKey := strings.ReplaceAll(key, ":", "/")
// Emulate not found by probing first.
if _, err := s.s3.HeadObject(ctx, &s3.HeadObjectInput{Bucket: &s.bucket, Key: &normKey}); err != nil {
return fmt.Errorf("%w: %w", store.ErrNotFound, err)
}
if _, err := s.s3.DeleteObject(ctx, &s3.DeleteObjectInput{Bucket: &s.bucket, Key: &normKey}); err != nil {
return fmt.Errorf("can't delete from s3: %w", err)
}
return nil
}
func (s *Store) Get(ctx context.Context, key string) ([]byte, error) {
normKey := strings.ReplaceAll(key, ":", "/")
out, err := s.s3.GetObject(ctx, &s3.GetObjectInput{
Bucket: &s.bucket,
Key: &normKey,
})
if err != nil {
return nil, fmt.Errorf("%w: %w", store.ErrNotFound, err)
}
defer out.Body.Close()
if msStr, ok := out.Metadata["x-anubis-expiry-ms"]; ok && msStr != "" {
if ms, err := strconv.ParseInt(msStr, 10, 64); err == nil {
if time.Now().UnixMilli() >= ms {
_, _ = s.s3.DeleteObject(ctx, &s3.DeleteObjectInput{Bucket: &s.bucket, Key: &normKey})
return nil, store.ErrNotFound
}
}
}
b, err := io.ReadAll(out.Body)
if err != nil {
return nil, fmt.Errorf("can't read s3 object: %w", err)
}
return b, nil
}
func (s *Store) Set(ctx context.Context, key string, value []byte, expiry time.Duration) error {
normKey := strings.ReplaceAll(key, ":", "/")
// S3 has no native per-object TTL; we store the object with x-anubis-expiry-ms metadata holding the expiry as epoch milliseconds.
var meta map[string]string
if expiry > 0 {
exp := time.Now().Add(expiry).UnixMilli()
meta = map[string]string{"x-anubis-expiry-ms": fmt.Sprintf("%d", exp)}
}
_, err := s.s3.PutObject(ctx, &s3.PutObjectInput{
Bucket: &s.bucket,
Key: &normKey,
Body: bytes.NewReader(value),
Metadata: meta,
})
if err != nil {
return fmt.Errorf("can't put s3 object: %w", err)
}
return nil
}
func (Store) IsPersistent() bool { return true }

View File

@@ -0,0 +1,140 @@
package s3api
import (
"bytes"
"context"
"encoding/json"
"fmt"
"io"
"sync"
"testing"
"time"
"github.com/TecharoHQ/anubis/lib/store/storetest"
"github.com/aws/aws-sdk-go-v2/aws"
"github.com/aws/aws-sdk-go-v2/service/s3"
)
// mockS3 is an in-memory mock of the methods we use.
type mockS3 struct {
mu sync.RWMutex
bucket string
data map[string][]byte
meta map[string]map[string]string
}
func (m *mockS3) PutObject(ctx context.Context, in *s3.PutObjectInput, _ ...func(*s3.Options)) (*s3.PutObjectOutput, error) {
m.mu.Lock()
defer m.mu.Unlock()
if m.data == nil {
m.data = map[string][]byte{}
}
if m.meta == nil {
m.meta = map[string]map[string]string{}
}
b, _ := io.ReadAll(in.Body)
m.data[aws.ToString(in.Key)] = bytes.Clone(b)
if in.Metadata != nil {
m.meta[aws.ToString(in.Key)] = map[string]string{}
for k, v := range in.Metadata {
m.meta[aws.ToString(in.Key)][k] = v
}
}
m.bucket = aws.ToString(in.Bucket)
return &s3.PutObjectOutput{}, nil
}
func (m *mockS3) GetObject(ctx context.Context, in *s3.GetObjectInput, _ ...func(*s3.Options)) (*s3.GetObjectOutput, error) {
m.mu.RLock()
defer m.mu.RUnlock()
b, ok := m.data[aws.ToString(in.Key)]
if !ok {
return nil, fmt.Errorf("not found")
}
out := &s3.GetObjectOutput{Body: io.NopCloser(bytes.NewReader(b))}
if md, ok := m.meta[aws.ToString(in.Key)]; ok {
out.Metadata = md
}
return out, nil
}
func (m *mockS3) DeleteObject(ctx context.Context, in *s3.DeleteObjectInput, _ ...func(*s3.Options)) (*s3.DeleteObjectOutput, error) {
m.mu.Lock()
defer m.mu.Unlock()
delete(m.data, aws.ToString(in.Key))
delete(m.meta, aws.ToString(in.Key))
return &s3.DeleteObjectOutput{}, nil
}
func (m *mockS3) HeadObject(ctx context.Context, in *s3.HeadObjectInput, _ ...func(*s3.Options)) (*s3.HeadObjectOutput, error) {
m.mu.RLock()
defer m.mu.RUnlock()
if _, ok := m.data[aws.ToString(in.Key)]; !ok {
return nil, fmt.Errorf("not found")
}
return &s3.HeadObjectOutput{}, nil
}
func TestImpl(t *testing.T) {
mock := &mockS3{}
f := Factory{Client: mock}
data, _ := json.Marshal(Config{
BucketName: "bucket",
})
storetest.Common(t, f, json.RawMessage(data))
}
func TestKeyNormalization(t *testing.T) {
mock := &mockS3{}
f := Factory{Client: mock}
data, _ := json.Marshal(Config{
BucketName: "anubis",
})
s, err := f.Build(t.Context(), json.RawMessage(data))
if err != nil {
t.Fatal(err)
}
key := "a:b:c"
val := []byte("value")
if err := s.Set(t.Context(), key, val, 0); err != nil {
t.Fatalf("Set failed: %v", err)
}
// Ensure mock saw normalized key
mock.mu.RLock()
_, hasRaw := mock.data["a:b:c"]
got, hasNorm := mock.data["a/b/c"]
mock.mu.RUnlock()
if hasRaw {
t.Fatalf("mock contains raw key with colon; normalization failed")
}
if !hasNorm || !bytes.Equal(got, val) {
t.Fatalf("normalized key missing or wrong value: got=%q", string(got))
}
// Get using colon key should work
out, err := s.Get(t.Context(), key)
if err != nil {
t.Fatalf("Get failed: %v", err)
}
if !bytes.Equal(out, val) {
t.Fatalf("Get returned wrong value: got=%q", string(out))
}
// Delete using colon key should delete normalized object
if err := s.Delete(t.Context(), key); err != nil {
t.Fatalf("Delete failed: %v", err)
}
// Give any async cleanup in tests a tick (not needed for mock, but harmless)
time.Sleep(1 * time.Millisecond)
mock.mu.RLock()
_, exists := mock.data["a/b/c"]
mock.mu.RUnlock()
if exists {
t.Fatalf("normalized key still exists after Delete")
}
}

4
package-lock.json generated
View File

@@ -1,12 +1,12 @@
{
"name": "@techaro/anubis",
-"version": "1.22.0-pre2",
+"version": "1.22.0",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "@techaro/anubis",
-"version": "1.22.0-pre2",
+"version": "1.22.0",
"license": "ISC",
"dependencies": {
"@aws-crypto/sha256-js": "^5.2.0",

View File

@@ -1,6 +1,6 @@
{
"name": "@techaro/anubis",
-"version": "1.22.0-pre2",
+"version": "1.22.0",
"description": "",
"main": "index.js",
"scripts": {

View File

@@ -0,0 +1,8 @@
bots:
- name: challenge
user_agent_regex: CHALLENGE
action: CHALLENGE
status_codes:
CHALLENGE: 200
DENY: 403

View File

@@ -0,0 +1,27 @@
async function getChallengePage() {
return fetch("http://localhost:8923/reqmeta", {
headers: {
"Accept-Language": "en",
"User-Agent": "CHALLENGE",
}
})
.then(resp => {
if (resp.status !== 200) {
throw new Error(`wanted status 200, got status: ${resp.status}`);
}
return resp;
})
.then(resp => resp.text());
}
(async () => {
const page = await getChallengePage();
if (!page.includes(`<html lang="de">`)) {
console.log(page)
throw new Error("force language smoke test failed");
}
console.log("FORCED_LANGUAGE=de caused a page to be rendered in German");
process.exit(0);
})();

23
test/forced-language/test.sh Executable file
View File

@@ -0,0 +1,23 @@
#!/usr/bin/env bash
set -euo pipefail
function cleanup() {
pkill -P $$
}
trap cleanup EXIT SIGINT
# Build static assets
(cd ../.. && npm ci && npm run assets)
go tool anubis --help 2>/dev/null ||:
go run ../cmd/unixhttpd &
FORCED_LANGUAGE=de go tool anubis \
--policy-fname ./anubis.yaml \
--use-remote-address \
--target=unix://$(pwd)/unixhttpd.sock &
backoff-retry node ./test.mjs

2
test/forced-language/var/.gitignore vendored Normal file
View File

@@ -0,0 +1,2 @@
*
!.gitignore

View File

@@ -5,7 +5,7 @@ go 1.24.5
replace github.com/TecharoHQ/anubis => ..
require (
-github.com/TecharoHQ/anubis v1.21.3
+github.com/TecharoHQ/anubis v1.22.0
github.com/docker/docker v28.3.2+incompatible
github.com/facebookgo/flagenv v0.0.0-20160425205200-fcd59fca7456
github.com/google/uuid v1.6.0
@@ -18,6 +18,24 @@ require (
github.com/TecharoHQ/thoth-proto v0.4.0 // indirect
github.com/a-h/templ v0.3.924 // indirect
github.com/antlr4-go/antlr/v4 v4.13.1 // indirect
github.com/aws/aws-sdk-go-v2 v1.38.3 // indirect
github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.7.1 // indirect
github.com/aws/aws-sdk-go-v2/config v1.31.6 // indirect
github.com/aws/aws-sdk-go-v2/credentials v1.18.10 // indirect
github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.18.6 // indirect
github.com/aws/aws-sdk-go-v2/internal/configsources v1.4.6 // indirect
github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.7.6 // indirect
github.com/aws/aws-sdk-go-v2/internal/ini v1.8.3 // indirect
github.com/aws/aws-sdk-go-v2/internal/v4a v1.4.6 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.13.1 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/checksum v1.8.6 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.13.6 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/s3shared v1.19.6 // indirect
github.com/aws/aws-sdk-go-v2/service/s3 v1.87.3 // indirect
github.com/aws/aws-sdk-go-v2/service/sso v1.29.1 // indirect
github.com/aws/aws-sdk-go-v2/service/ssooidc v1.34.2 // indirect
github.com/aws/aws-sdk-go-v2/service/sts v1.38.2 // indirect
github.com/aws/smithy-go v1.23.0 // indirect
github.com/beorn7/perks v1.0.1 // indirect
github.com/cespare/xxhash/v2 v2.3.0 // indirect
github.com/containerd/errdefs v1.0.0 // indirect

View File

@@ -16,6 +16,42 @@ github.com/a-h/templ v0.3.924 h1:t5gZqTneXqvehpNZsgtnlOscnBboNh9aASBH2MgV/0k=
github.com/a-h/templ v0.3.924/go.mod h1:FFAu4dI//ESmEN7PQkJ7E7QfnSEMdcnu7QrAY8Dn334=
github.com/antlr4-go/antlr/v4 v4.13.1 h1:SqQKkuVZ+zWkMMNkjy5FZe5mr5WURWnlpmOuzYWrPrQ=
github.com/antlr4-go/antlr/v4 v4.13.1/go.mod h1:GKmUxMtwp6ZgGwZSva4eWPC5mS6vUAmOABFgjdkM7Nw=
github.com/aws/aws-sdk-go-v2 v1.38.3 h1:B6cV4oxnMs45fql4yRH+/Po/YU+597zgWqvDpYMturk=
github.com/aws/aws-sdk-go-v2 v1.38.3/go.mod h1:sDioUELIUO9Znk23YVmIk86/9DOpkbyyVb1i/gUNFXY=
github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.7.1 h1:i8p8P4diljCr60PpJp6qZXNlgX4m2yQFpYk+9ZT+J4E=
github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.7.1/go.mod h1:ddqbooRZYNoJ2dsTwOty16rM+/Aqmk/GOXrK8cg7V00=
github.com/aws/aws-sdk-go-v2/config v1.31.6 h1:a1t8fXY4GT4xjyJExz4knbuoxSCacB5hT/WgtfPyLjo=
github.com/aws/aws-sdk-go-v2/config v1.31.6/go.mod h1:5ByscNi7R+ztvOGzeUaIu49vkMk2soq5NaH5PYe33MQ=
github.com/aws/aws-sdk-go-v2/credentials v1.18.10 h1:xdJnXCouCx8Y0NncgoptztUocIYLKeQxrCgN6x9sdhg=
github.com/aws/aws-sdk-go-v2/credentials v1.18.10/go.mod h1:7tQk08ntj914F/5i9jC4+2HQTAuJirq7m1vZVIhEkWs=
github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.18.6 h1:wbjnrrMnKew78/juW7I2BtKQwa1qlf6EjQgS69uYY14=
github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.18.6/go.mod h1:AtiqqNrDioJXuUgz3+3T0mBWN7Hro2n9wll2zRUc0ww=
github.com/aws/aws-sdk-go-v2/internal/configsources v1.4.6 h1:uF68eJA6+S9iVr9WgX1NaRGyQ/6MdIyc4JNUo6TN1FA=
github.com/aws/aws-sdk-go-v2/internal/configsources v1.4.6/go.mod h1:qlPeVZCGPiobx8wb1ft0GHT5l+dc6ldnwInDFaMvC7Y=
github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.7.6 h1:pa1DEC6JoI0zduhZePp3zmhWvk/xxm4NB8Hy/Tlsgos=
github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.7.6/go.mod h1:gxEjPebnhWGJoaDdtDkA0JX46VRg1wcTHYe63OfX5pE=
github.com/aws/aws-sdk-go-v2/internal/ini v1.8.3 h1:bIqFDwgGXXN1Kpp99pDOdKMTTb5d2KyU5X/BZxjOkRo=
github.com/aws/aws-sdk-go-v2/internal/ini v1.8.3/go.mod h1:H5O/EsxDWyU+LP/V8i5sm8cxoZgc2fdNR9bxlOFrQTo=
github.com/aws/aws-sdk-go-v2/internal/v4a v1.4.6 h1:R0tNFJqfjHL3900cqhXuwQ+1K4G0xc9Yf8EDbFXCKEw=
github.com/aws/aws-sdk-go-v2/internal/v4a v1.4.6/go.mod h1:y/7sDdu+aJvPtGXr4xYosdpq9a6T9Z0jkXfugmti0rI=
github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.13.1 h1:oegbebPEMA/1Jny7kvwejowCaHz1FWZAQ94WXFNCyTM=
github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.13.1/go.mod h1:kemo5Myr9ac0U9JfSjMo9yHLtw+pECEHsFtJ9tqCEI8=
github.com/aws/aws-sdk-go-v2/service/internal/checksum v1.8.6 h1:hncKj/4gR+TPauZgTAsxOxNcvBayhUlYZ6LO/BYiQ30=
github.com/aws/aws-sdk-go-v2/service/internal/checksum v1.8.6/go.mod h1:OiIh45tp6HdJDDJGnja0mw8ihQGz3VGrUflLqSL0SmM=
github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.13.6 h1:LHS1YAIJXJ4K9zS+1d/xa9JAA9sL2QyXIQCQFQW/X08=
github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.13.6/go.mod h1:c9PCiTEuh0wQID5/KqA32J+HAgZxN9tOGXKCiYJjTZI=
github.com/aws/aws-sdk-go-v2/service/internal/s3shared v1.19.6 h1:nEXUSAwyUfLTgnc9cxlDWy637qsq4UWwp3sNAfl0Z3Y=
github.com/aws/aws-sdk-go-v2/service/internal/s3shared v1.19.6/go.mod h1:HGzIULx4Ge3Do2V0FaiYKcyKzOqwrhUZgCI77NisswQ=
github.com/aws/aws-sdk-go-v2/service/s3 v1.87.3 h1:ETkfWcXP2KNPLecaDa++5bsQhCRa5M5sLUJa5DWYIIg=
github.com/aws/aws-sdk-go-v2/service/s3 v1.87.3/go.mod h1:+/3ZTqoYb3Ur7DObD00tarKMLMuKg8iqz5CHEanqTnw=
github.com/aws/aws-sdk-go-v2/service/sso v1.29.1 h1:8OLZnVJPvjnrxEwHFg9hVUof/P4sibH+Ea4KKuqAGSg=
github.com/aws/aws-sdk-go-v2/service/sso v1.29.1/go.mod h1:27M3BpVi0C02UiQh1w9nsBEit6pLhlaH3NHna6WUbDE=
github.com/aws/aws-sdk-go-v2/service/ssooidc v1.34.2 h1:gKWSTnqudpo8dAxqBqZnDoDWCiEh/40FziUjr/mo6uA=
github.com/aws/aws-sdk-go-v2/service/ssooidc v1.34.2/go.mod h1:x7+rkNmRoEN1U13A6JE2fXne9EWyJy54o3n6d4mGaXQ=
github.com/aws/aws-sdk-go-v2/service/sts v1.38.2 h1:YZPjhyaGzhDQEvsffDEcpycq49nl7fiGcfJTIo8BszI=
github.com/aws/aws-sdk-go-v2/service/sts v1.38.2/go.mod h1:2dIN8qhQfv37BdUYGgEC8Q3tteM3zFxTI1MLO2O3J3c=
github.com/aws/smithy-go v1.23.0 h1:8n6I3gXzWJB2DxBDnfxgBaSX6oe0d/t10qGz7OKqMCE=
github.com/aws/smithy-go v1.23.0/go.mod h1:t1ufH5HMublsJYulve2RKmHDC15xu1f26kHCp/HgceI=
github.com/beorn7/perks v1.0.1 h1:VlbKKnNfV8bJzeqoa4cOKqO6bYr3WgKZxO8Z16+hsOM=
github.com/beorn7/perks v1.0.1/go.mod h1:G2ZrVWU2WbWT9wwq4/hrbKbnv/1ERSJQ0ibhJ6rlkpw=
github.com/bsm/ginkgo/v2 v2.12.0 h1:Ny8MWAHyOepLGlLKYmXG4IEkioBysk6GpaRTLC8zwWs=