Compare commits

..

3 Commits

Author SHA1 Message Date
Xe Iaso f19a5f7eb8 docs(k8s): document that Kubernetes support needs a non-default storage backend
Closes: #1602
Signed-off-by: Xe Iaso <me@xeiaso.net>
2026-06-01 10:29:23 -04:00
Julien Voisin 3dc962b301 perf(internal/gzip): pool *gzip.Writer per middleware instance (#1654)
gzip.NewWriterLevel allocates fresh deflate window and hash table
buffers (~1.18 MiB) on every request. This commit pools them in a closure-local
sync.Pool so each middleware instance reuses its writers.

The level is validated once at setup (NewWriterLevel against
io.Discard); pooled writers are reset to io.Discard on Put so the
pool doesn't pin response writers between requests.

Only call site is RenderIndex (lib/http.go), which serves the
challenge page, so this directly cuts the per-challenge allocation
footprint.

I benchmarked the change using the following benchmark,
put in the commit message instead of in a file since it's pretty much useless
outside of this particular change.

```
package internal

import (
	"io"
	"net/http"
	"net/http/httptest"
	"testing"
)

func BenchmarkGzipMiddleware(b *testing.B) {
	payload := make([]byte, 4096)
	for i := range payload {
		payload[i] = byte(i)
	}

	inner := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write(payload)
	})
	h := GzipMiddleware(1, inner)

	b.ReportAllocs()
	b.RunParallel(func(pb *testing.PB) {
		req := httptest.NewRequest(http.MethodGet, "/", nil)
		req.Header.Set("Accept-Encoding", "gzip")
		for pb.Next() {
			rec := httptest.NewRecorder()
			h.ServeHTTP(rec, req)
			io.Copy(io.Discard, rec.Body)
		}
	})
}
```

The results are pretty nice:

Benchmarks (Linux arm64, count=10, benchstat, vs origin/main):

  GzipMiddleware-8  sec/op     158.8µs ± 4%  ->   5.2µs ± 3%  -96.72% (p=0.000)
  GzipMiddleware-8  B/op       1180.6 KiB    ->   1.9 KiB     -99.84% (p=0.000)
  GzipMiddleware-8  allocs/op  32            ->  13           -59.38% (p=0.000)

Signed-off-by: jvoisin <julien.voisin@dustri.org>
2026-05-30 00:52:37 -04:00
Xe Iaso 926f3d1d0e fix: small security fixes (#1651)
This is based on private evaluation of a prerelease security product.
I cannot comment further other than I am impressed by its output.

This commit is a squash of several commits. The impactful commits
have details underneath markdown heading twos.

## fix(metrics): don't expose pprof by default

pprof[1] is the Go standard library profiling toolkit. It is invaluable
for diagnosing how Go programs perform in the wild. However it also is
able to expose secret data set with command line flags. This is not
ideal and should be mitigated by correctly configured firewall rules. We
don't live in a world where people correctly configure firewall rules,
so we have to fix things for people. Welcome to 2026.

[1]: https://pkg.go.dev/runtime/pprof

Ref: AWOO-001

## fix(honeypot/naive): cap r9k delay to one second

Otherwise this can get unbounded, which can cause problems with lesser
HTTP proxies such as Apache.

Ref: AWOO-002

## fix(policy): mend an edge case with subrequest auth and query strings

This fixes an unlikely edge case where using subrequest auth and query
strings with path based filtering can cause reality to differ from
administrator intent. This effectively strips the query string from
subrequest auth checks. This deficiency should be fixed in the future.

Ref: AWOO-004

## fix(expressions): mend possible nil pointer deref edge case

If Anubis just started up, load averages may not be set in memory. This
can cause a nil pointer dereference which could fail requests with weird
errors until the async thread sets the load averages.

Ref: AWOO-005

## fix(lib): mend case where domainless redirects could allow cross-domain redirects

Ref: AWOO-009

## fix(expressions): validate randInt bounds before rand.IntN

Non-positive or platform-overflowing arguments to the CEL randInt
helper used to reach rand.IntN unchecked, surfacing a CEL evaluator
error during request processing when policies passed
attacker-influenced values (e.g. contentLength). Reject non-positive
bounds and detect int narrowing explicitly, returning a typed CEL
error in both cases.

Ref: AWOO-010

Signed-off-by: Xe Iaso <xe.iaso@techaro.lol>
2026-05-30 00:48:43 -04:00
4 changed files with 49 additions and 5 deletions
+1
View File
@@ -39,6 +39,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Fix a race in the bbolt store where the asynchronous cleanup scheduled by an expired read could delete a value that had just been refreshed; the delete now only fires when the key still carries the same expired generation it observed.
- Marginally increase the performances of requests processing
- Marginally improve the performances of PoW validation
- Significantly improve the performances of the gzip middleware
## v1.25.0: Necron
@@ -131,11 +131,27 @@ Then point your Ingress to the Anubis port:
name: anubis
```
## Storage
By default, Anubis stores all of its data in memory. This memory is not shared between pods. If you have multiple instances of Anubis without the data being [stored outside of memory](../policies.mdx#storage-backends) and a [shared cookie key](../installation.mdx#key-generation), you will run into [unexpected behaviour](https://github.com/TecharoHQ/anubis/issues/1602) when user traffic traverses between pods.
Based on the deployment of your Kubernetes cluster, here are the preferable storage backends to pick from:
| Backend | Pro | Con |
| :------- | :-------------------------------------------------------------- | :------------------------------------------------------------------------------------------- |
| `bbolt` | Only requires a ReadWriteOnce PVC. | Does not support more than one Anubis pod. |
| `memory` | Requires no configuration. | Process memory is not shared between pods. |
| `s3api` | Great if your cluster includes Rook/Ceph to use RADOS directly. | Potentially higher latency unless you use a store like [Tigris](https://www.tigrisdata.com). |
| `valkey` | Trivial to configure in your cluster. | If your Redis/Valkey server is down, Anubis is going to have issues. |
Pick your poison accordingly. Many production deployments use the `s3api` and `valkey` backends without issue. Single node deployments can get away with either `memory` or `bbolt` depending on the facts and circumstances of the deployment.
## Envoy Gateway
If you are using envoy-gateway, the `X-Real-Ip` header is not set by default, but Anubis does require it. You can resolve this by adding the header, either on the specific `HTTPRoute` where Anubis is listening, or on the `ClientTrafficPolicy` to apply it to any number of Gateways:
HTTPRoute:
```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
@@ -160,6 +176,7 @@ spec:
```
Applying to any number of Gateways:
```yaml
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: ClientTrafficPolicy
+7
View File
@@ -219,8 +219,11 @@ Anubis offers the following storage backends:
- [`memory`](#memory) -- A simple in-memory hashmap
- [`bbolt`](#bbolt) -- An on-disk key/value store backed by [bbolt](https://github.com/etcd-io/bbolt), an embedded key/value database for Go programs
- [`s3api`](#s3api) -- Amazon S3 based storage or another compatible object store
- [`valkey`](#valkey) -- A remote in-memory key/value database backed by [Valkey](https://valkey.io/) (or another database compatible with the [RESP](https://redis.io/docs/latest/develop/reference/protocol-spec/) protocol)
:::warning
If no storage backend is set in the policy file, Anubis will use the [`memory`](#memory) backend by default. This is equivalent to the following in the policy file:
```yaml
@@ -229,6 +232,10 @@ store:
parameters: {}
```
This means that all session data that is required for the challenge mechanism to work is stored **IN PROCESS MEMORY** that is **NOT** shared between instances of Anubis. If you set up Anubis with multiple instances using the `memory` storage backend, your users will sometimes get "Administrator has misconfigured Anubis" error messages when it cannot look up the aforementioned session data.
:::
### `memory`
The memory backend is an in-memory cache. This backend works best if you don't use multiple instances of Anubis or don't have mutable storage in the environment you're running Anubis in.
+24 -5
View File
@@ -2,11 +2,28 @@ package internal
import (
"compress/gzip"
"io"
"net/http"
"strings"
"sync"
)
func GzipMiddleware(level int, next http.Handler) http.Handler {
// Validate the level once at setup; gzip.NewWriterLevel only fails for
// invalid levels and we'd rather panic now than mid-request.
if _, err := gzip.NewWriterLevel(io.Discard, level); err != nil {
panic(err)
}
// Per-middleware pool of *gzip.Writer. Each entry carries ~40 KiB of
// deflate buffers; reusing them avoids that allocation on every request.
pool := sync.Pool{
New: func() any {
gz, _ := gzip.NewWriterLevel(io.Discard, level)
return gz
},
}
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if !strings.Contains(r.Header.Get("Accept-Encoding"), "gzip") {
next.ServeHTTP(w, r)
@@ -14,11 +31,13 @@ func GzipMiddleware(level int, next http.Handler) http.Handler {
}
w.Header().Set("Content-Encoding", "gzip")
gz, err := gzip.NewWriterLevel(w, level)
if err != nil {
panic(err)
}
defer gz.Close()
gz := pool.Get().(*gzip.Writer)
gz.Reset(w)
defer func() {
gz.Close()
gz.Reset(io.Discard)
pool.Put(gz)
}()
grw := gzipResponseWriter{ResponseWriter: w, sink: gz}
next.ServeHTTP(grw, r)