Compare commits

8 Commits

Author SHA1 Message Date
Xe Iaso 9d88de5878 chore: spelling
Signed-off-by: Xe Iaso <me@xeiaso.net>
2026-06-06 10:25:42 -04:00
Xe Iaso 3c2e2f5940 chore: spelling
Signed-off-by: Xe Iaso <me@xeiaso.net>
2026-06-06 10:24:12 -04:00
Xe Iaso 5c16bf5592 chore: ban x.ai
Signed-off-by: Xe Iaso <me@xeiaso.net>
2026-06-06 10:22:05 -04:00
Xe Iaso 44d5fa3ce0 chore: use Go stdlib version stamping (#1665)
* chore: use Go stdlib version stamping

Signed-off-by: Xe Iaso <me@xeiaso.net>

* chore: spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
2026-06-04 16:05:37 -04:00
Julien Voisin ef3ea08b79 perf(challenge/proofofwork): stream sha256 into stack buffer in Validate (#1653)
Signed-off-by: jvoisin <julien.voisin@dustri.org>
Co-authored-by: Jason Cameron <git@jasoncameron.dev>
2026-06-03 11:35:28 -04:00
Julien Voisin a08b0f4262 perf: enable uuid randomness pool and minor cleanups (#1652)
cmd/anubis: call uuid.EnableRandPool() at the top of main. The pool
batches crypto/rand reads internally, dramatically reducing per-call
syscall overhead for UUID generation. UUIDs are produced on every
issued challenge (NewV7, 3.7 times faster, down to zero allocation) and on
every challenge page render (NewString, 1.6 times faster, 1 fewer allocation).
The pool only feeds UUID generation, which is not cryptographic key
material; PoW challenge bytes and signing keys still go directly to
crypto/rand.

lib/anubis.go: three trivial optimizations in issueChallenge and
maybeReverseProxy. They reduce the number of allocations by 2%, which isn't
much, but the changes are trivial:

  - fmt.Sprintf("%x", randomData) -> hex.EncodeToString(randomData)
  - cache uuid.UUID.String() once instead of calling it three times
  - fmt.Sprintf("ogtags:allow:%s%s", ...) -> string concat

Signed-off-by: jvoisin <julien.voisin@dustri.org>
Signed-off-by: Xe Iaso <xe.iaso@techaro.lol>
Co-authored-by: Xe Iaso <xe.iaso@techaro.lol>
2026-05-30 01:05:01 -04:00
Julien Voisin 3dc962b301 perf(internal/gzip): pool *gzip.Writer per middleware instance (#1654)
gzip.NewWriterLevel allocates fresh deflate window and hash table
buffers (~1.18 MiB) on every request. This commit pools them in a closure-local
sync.Pool so each middleware instance reuses its writers.

The level is validated once at setup (NewWriterLevel against
io.Discard); pooled writers are reset to io.Discard on Put so the
pool doesn't pin response writers between requests.

Only call site is RenderIndex (lib/http.go), which serves the
challenge page, so this directly cuts the per-challenge allocation
footprint.

I benchmarked the change with the benchmark below. It lives in the commit
message rather than in a file because it is of little use outside of this
particular change.

```
package internal

import (
	"io"
	"net/http"
	"net/http/httptest"
	"testing"
)

func BenchmarkGzipMiddleware(b *testing.B) {
	payload := make([]byte, 4096)
	for i := range payload {
		payload[i] = byte(i)
	}

	inner := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write(payload)
	})
	h := GzipMiddleware(1, inner)

	b.ReportAllocs()
	b.RunParallel(func(pb *testing.PB) {
		req := httptest.NewRequest(http.MethodGet, "/", nil)
		req.Header.Set("Accept-Encoding", "gzip")
		for pb.Next() {
			rec := httptest.NewRecorder()
			h.ServeHTTP(rec, req)
			io.Copy(io.Discard, rec.Body)
		}
	})
}
```

The results are pretty nice:

Benchmarks (Linux arm64, count=10, benchstat, vs origin/main):

  GzipMiddleware-8  sec/op     158.8µs ± 4%  ->   5.2µs ± 3%  -96.72% (p=0.000)
  GzipMiddleware-8  B/op       1180.6 KiB    ->   1.9 KiB     -99.84% (p=0.000)
  GzipMiddleware-8  allocs/op  32            ->  13           -59.38% (p=0.000)

Signed-off-by: jvoisin <julien.voisin@dustri.org>
2026-05-30 00:52:37 -04:00
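
The pooling pattern described in the commit above can be sketched outside of an HTTP handler. This is an illustration under stated assumptions, not the Anubis implementation: `writerPool`, `compress`, and `decompress` are hypothetical names, the pool is package-level here rather than closure-local, and the level is fixed to `gzip.BestSpeed`.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"sync"
)

// writerPool is a hypothetical package-level pool for this sketch; the
// commit keeps its pool local to each middleware instance instead.
var writerPool = sync.Pool{
	New: func() any {
		// gzip.BestSpeed is a valid level, so the error is always nil here.
		gz, _ := gzip.NewWriterLevel(io.Discard, gzip.BestSpeed)
		return gz
	},
}

// compress gzips data with a pooled writer. The writer is pointed back at
// io.Discard before it returns to the pool so it does not pin the buffer.
func compress(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	gz := writerPool.Get().(*gzip.Writer)
	gz.Reset(&buf)

	_, writeErr := gz.Write(data)
	closeErr := gz.Close()

	gz.Reset(io.Discard)
	writerPool.Put(gz)

	if writeErr != nil {
		return nil, writeErr
	}
	if closeErr != nil {
		return nil, closeErr
	}
	return buf.Bytes(), nil
}

// decompress reverses compress so the round trip can be checked.
func decompress(data []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	// The second iteration will typically reuse the writer from the first.
	for _, msg := range []string{"first response", "second response"} {
		packed, err := compress([]byte(msg))
		if err != nil {
			panic(err)
		}
		plain, err := decompress(packed)
		if err != nil {
			panic(err)
		}
		fmt.Println(string(plain) == msg)
	}
}
```

Calling `Reset` after `Close` is what makes a `*gzip.Writer` safe to reuse: it returns the writer to its initial state with a new destination.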
Xe Iaso 926f3d1d0e fix: small security fixes (#1651)
This is based on a private evaluation of a prerelease security product.
I cannot comment further, other than to say that I am impressed by its
output.

This commit is a squash of several commits. The impactful commits
are described under the level-two Markdown headings below.

## fix(metrics): don't expose pprof by default

pprof[1] is the Go standard library profiling toolkit. It is invaluable
for diagnosing how Go programs perform in the wild. However, it can also
expose secret data set with command line flags. This is not
ideal and should be mitigated by correctly configured firewall rules. We
don't live in a world where people correctly configure firewall rules,
so we have to fix things for people. Welcome to 2026.

[1]: https://pkg.go.dev/runtime/pprof

Ref: AWOO-001

## fix(honeypot/naive): cap r9k delay to one second

Otherwise this can get unbounded, which can cause problems with lesser
HTTP proxies such as Apache.

Ref: AWOO-002

## fix(policy): mend an edge case with subrequest auth and query strings

This fixes an unlikely edge case where using subrequest auth and query
strings with path based filtering can cause reality to differ from
administrator intent. This effectively strips the query string from
subrequest auth checks. This deficiency should be fixed in the future.

Ref: AWOO-004

## fix(expressions): mend possible nil pointer deref edge case

If Anubis just started up, load averages may not be set in memory. This
can cause a nil pointer dereference which could fail requests with weird
errors until the async thread sets the load averages.

Ref: AWOO-005

## fix(lib): mend case where domainless redirects could allow cross-domain redirects

Ref: AWOO-009

## fix(expressions): validate randInt bounds before rand.IntN

Non-positive or platform-overflowing arguments to the CEL randInt
helper used to reach rand.IntN unchecked, surfacing a CEL evaluator
error during request processing when policies passed
attacker-influenced values (e.g. contentLength). Reject non-positive
bounds and detect int narrowing explicitly, returning a typed CEL
error in both cases.

Ref: AWOO-010

Signed-off-by: Xe Iaso <xe.iaso@techaro.lol>
2026-05-30 00:48:43 -04:00
12 changed files with 92 additions and 21 deletions
+3
@@ -44,3 +44,6 @@ xou
AWOO
firewalls
bindhosts
handrolled
xai
gitlab
-1
@@ -10,4 +10,3 @@ builds:
ldflags:
- -s -w
- -extldflags "-static"
- -X github.com/TecharoHQ/anubis.Version={{.Env.VERSION}}
+18 -3
@@ -1,12 +1,27 @@
// Package anubis contains the version number of Anubis.
package anubis
import "time"
import (
"runtime/debug"
"time"
)
func init() {
bi, ok := debug.ReadBuildInfo()
if !ok {
return
}
// XXX(Xe): many things in this repo assume that the development version
// of anubis is `devel` and ReadBuildInfo returns `(devel)`. Shim the gap.
if bi.Main.Version != "(devel)" {
Version = bi.Main.Version
}
}
// Version is the current version of Anubis.
//
// This variable is set at build time using the -X linker flag. If not set,
// it defaults to "devel".
// This is set from the Go module runtime version.
var Version = "devel"
// CookieName is the name of the cookie that Anubis uses in order to validate
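
The version shim in the hunk above can be tried in isolation. This is a sketch, not the Anubis source: `resolveVersion` is a hypothetical helper, and its extra check for an empty version string is an assumption added for the example.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// resolveVersion keeps the fallback when the build info carries no usable
// version. "(devel)" is what the Go toolchain reports for unstamped builds;
// treating "" the same way is an assumption made for this sketch.
func resolveVersion(buildVersion, fallback string) string {
	if buildVersion == "" || buildVersion == "(devel)" {
		return fallback
	}
	return buildVersion
}

func main() {
	version := "devel"
	// ReadBuildInfo reports ok == false for binaries built without module
	// support, in which case the fallback is kept.
	if bi, ok := debug.ReadBuildInfo(); ok {
		version = resolveVersion(bi.Main.Version, version)
	}
	fmt.Println("version:", version)
}
```

Keeping the fallback as `devel` preserves the value that, per the comment in the hunk, other parts of the repository expect for development builds.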
+4
@@ -36,6 +36,7 @@ import (
"github.com/TecharoHQ/anubis/lib/thoth"
"github.com/TecharoHQ/anubis/web"
"github.com/facebookgo/flagenv"
"github.com/google/uuid"
_ "github.com/joho/godotenv/autoload"
healthv1 "google.golang.org/grpc/health/grpc_health_v1"
)
@@ -193,6 +194,9 @@ func main() {
flagenv.Parse()
flag.Parse()
// Must be set before any concurrent UUID call.
uuid.EnableRandPool()
if *versionFlag {
fmt.Println("Anubis", anubis.Version)
return
+3
@@ -41,6 +41,9 @@ bots:
# Challenge Firefox AI previews
- import: (data)/clients/x-firefox-ai.yaml
# x.ai has a scraper that is killing gitlab instances
- import: (data)/crawlers/xai.yaml
# Allow common "keeping the internet working" routes (well-known, favicon, robots.txt)
- import: (data)/common/keep-internet-working.yaml
+8
@@ -0,0 +1,8 @@
- name: xai-crawler-and-asn
  action: DENY
  user_agent_regex: code-review-sourcing.*\+xai-research
  remote_addresses:
    - 69.12.56.0/12
- name: xai-crawler-user-agent
  action: DENY
  user_agent_regex: code-review-sourcing.*\+xai-research
+3
@@ -25,6 +25,9 @@
# Challenge Firefox AI previews
- import: (data)/clients/x-firefox-ai.yaml
# x.ai has a scraper that is killing gitlab instances
- import: (data)/crawlers/xai.yaml
# Allow common "keeping the internet working" routes (well-known, favicon, robots.txt)
- import: (data)/common/keep-internet-working.yaml
+5
@@ -23,12 +23,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Improve error messages and fix broken REDIRECT_DOMAINS link in docs ([#1193](https://github.com/TecharoHQ/anubis/issues/1193))
- Add Bulgarian locale ([#1394](https://github.com/TecharoHQ/anubis/pull/1394))
- Fixed case-sensitivity mismatch in geoipchecker.go
- Use [Go's native version stamping](https://michael.stapelberg.ch/posts/2026-04-05-stamp-it-all-programs-must-report-their-version/) instead of a handrolled variant.
- Fix CEL internal errors when iterating `headers`/`query` map wrappers by implementing map iterators for `HTTPHeaders` and `URLValues` ([#1465](https://github.com/TecharoHQ/anubis/pull/1465)).
- Enable [metrics serving via TLS](./admin/policies.mdx#tls), including [mutual TLS (mTLS)](./admin/policies.mdx#mtls).
- Enable [HTTP basic auth](./admin/policies.mdx#http-basic-authentication) for the metrics server.
- Fix a bug in the dataset poisoning maze that could allow denial of service [#1580](https://github.com/TecharoHQ/anubis/issues/1580).
- Add config option to add ASN to logs/metrics.
- Log weight when issuing challenge.
- Block x.ai's crawler for code review training.
- Gate pprof endpoints behind `metrics.debug` in the policy file.
- Limit naive honeypot r9k delay to one second.
- Fix an obscure case where adding query values to a subrequest match could cause an invalid rule match when using path based matching for protected resources.
@@ -39,6 +41,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Fix a race in the bbolt store where the asynchronous cleanup scheduled by an expired read could delete a value that had just been refreshed; the delete now only fires when the key still carries the same expired generation it observed.
- Marginally improve the performance of request processing
- Marginally improve the performance of PoW validation
- Marginally improve the performance of challenge generation/display
- Significantly improve the performance of the gzip middleware
- Significantly improve the performance of PoW validation
## v1.25.0: Necron
+24 -5
@@ -2,11 +2,28 @@ package internal
import (
"compress/gzip"
"io"
"net/http"
"strings"
"sync"
)
func GzipMiddleware(level int, next http.Handler) http.Handler {
// Validate the level once at setup; gzip.NewWriterLevel only fails for
// invalid levels and we'd rather panic now than mid-request.
if _, err := gzip.NewWriterLevel(io.Discard, level); err != nil {
panic(err)
}
// Per-middleware pool of *gzip.Writer. Each entry carries roughly 1.18 MiB
// of deflate buffers; reusing them avoids that allocation on every request.
pool := sync.Pool{
New: func() any {
gz, _ := gzip.NewWriterLevel(io.Discard, level)
return gz
},
}
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if !strings.Contains(r.Header.Get("Accept-Encoding"), "gzip") {
next.ServeHTTP(w, r)
@@ -14,11 +31,13 @@ func GzipMiddleware(level int, next http.Handler) http.Handler {
}
w.Header().Set("Content-Encoding", "gzip")
gz, err := gzip.NewWriterLevel(w, level)
if err != nil {
panic(err)
}
defer gz.Close()
gz := pool.Get().(*gzip.Writer)
gz.Reset(w)
defer func() {
gz.Close()
gz.Reset(io.Discard)
pool.Put(gz)
}()
grw := gzipResponseWriter{ResponseWriter: w, sink: gz}
next.ServeHTTP(grw, r)
+7 -5
@@ -4,6 +4,7 @@ import (
"context"
"crypto/ed25519"
"crypto/rand"
"encoding/hex"
"encoding/json"
"errors"
"fmt"
@@ -162,6 +163,7 @@ func (s *Server) issueChallenge(ctx context.Context, r *http.Request, lg *slog.L
if err != nil {
return nil, err
}
idStr := id.String()
var randomData = make([]byte, 64)
if _, err := rand.Read(randomData); err != nil {
@@ -169,9 +171,9 @@ func (s *Server) issueChallenge(ctx context.Context, r *http.Request, lg *slog.L
}
chall := challenge.Challenge{
ID: id.String(),
ID: idStr,
Method: rule.Challenge.Algorithm,
RandomData: fmt.Sprintf("%x", randomData),
RandomData: hex.EncodeToString(randomData),
IssuedAt: time.Now(),
Difficulty: rule.Challenge.Difficulty,
PolicyRuleHash: rule.Hash(),
@@ -182,11 +184,11 @@ func (s *Server) issueChallenge(ctx context.Context, r *http.Request, lg *slog.L
}
j := store.JSON[challenge.Challenge]{Underlying: s.store}
if err := j.Set(ctx, "challenge:"+id.String(), chall, 30*time.Minute); err != nil {
if err := j.Set(ctx, "challenge:"+idStr, chall, 30*time.Minute); err != nil {
return nil, err
}
lg.Info("new challenge issued", "challenge", id.String(), "weight", cr.Weight)
lg.Info("new challenge issued", "challenge", idStr, "weight", cr.Weight)
return &chall, err
}
@@ -240,7 +242,7 @@ func (s *Server) maybeReverseProxyOrPage(w http.ResponseWriter, r *http.Request)
func (s *Server) maybeReverseProxy(w http.ResponseWriter, r *http.Request, httpStatusOnly bool) {
lg, r := s.getRequestLogger(r)
if val, _ := s.store.Get(r.Context(), fmt.Sprintf("ogtags:allow:%s%s", r.Host, r.URL.String())); val != nil {
if val, _ := s.store.Get(r.Context(), "ogtags:allow:"+r.Host+r.URL.String()); val != nil {
lg.Debug("serving opengraph tag asset")
s.ServeHTTPNext(w, r)
return
+15 -5
@@ -1,14 +1,15 @@
package proofofwork
import (
"crypto/sha256"
"crypto/subtle"
"encoding/hex"
"fmt"
"log/slog"
"net/http"
"strconv"
"strings"
"github.com/TecharoHQ/anubis/internal"
chall "github.com/TecharoHQ/anubis/lib/challenge"
"github.com/TecharoHQ/anubis/lib/localization"
"github.com/a-h/templ"
@@ -66,11 +67,20 @@ func (i *Impl) Validate(r *http.Request, lg *slog.Logger, in *chall.ValidateInpu
return chall.NewError("validate", "invalid response", fmt.Errorf("%w response", chall.ErrMissingField))
}
calcString := challenge + nonceStr
calculated := internal.SHA256sum(calcString)
// Stream the challenge and nonce into a single sha256 hasher to avoid
// the intermediate "challenge + nonceStr" concatenation. Hex-encode
// the digest into a stack buffer so the comparison runs without
// allocating a heap string.
h := sha256.New()
h.Write([]byte(challenge))
h.Write([]byte(nonceStr))
var sumBuf [sha256.Size]byte
sum := h.Sum(sumBuf[:0])
var hexBuf [sha256.Size * 2]byte
hex.Encode(hexBuf[:], sum)
if subtle.ConstantTimeCompare([]byte(response), []byte(calculated)) != 1 {
return chall.NewError("validate", "invalid response", fmt.Errorf("%w: wanted response %s but got %s", chall.ErrFailed, calculated, response))
if subtle.ConstantTimeCompare([]byte(response), hexBuf[:]) != 1 {
return chall.NewError("validate", "invalid response", fmt.Errorf("%w: wanted response %s but got %s", chall.ErrFailed, string(hexBuf[:]), response))
}
// compare the leading zeroes
+2 -2
@@ -17,8 +17,8 @@ $`npm run assets`;
},
build: ({ bin, etc, systemd, doc }) => {
$`go build -o ${bin}/anubis -ldflags '-s -w -extldflags "-static" -X "github.com/TecharoHQ/anubis.Version=${git.tag()}"' ./cmd/anubis`;
$`go build -o ${bin}/anubis-robots2policy -ldflags '-s -w -extldflags "-static" -X "github.com/TecharoHQ/anubis.Version=${git.tag()}"' ./cmd/robots2policy`;
$`go build -o ${bin}/anubis -ldflags '-s -w -extldflags "-static"' ./cmd/anubis`;
$`go build -o ${bin}/anubis-robots2policy -ldflags '-s -w -extldflags "-static"' ./cmd/robots2policy`;
file.install("./run/anubis@.service", `${systemd}/anubis@.service`);
file.install("./run/default.env", `${etc}/default.env`);