Compare commits

8 Commits

All eight commits are authored and signed off by Xe Iaso <me@xeiaso.net>.

- 15b0927c46 chore: spelling (2025-12-16 04:08:53 -05:00)
- 5e69031c10 chore: fix spelling metadata (2025-12-16 04:05:51 -05:00)
- 9ccd5db528 chore(test): go mod tidy (2025-12-16 04:04:56 -05:00)
- 82fca3e714 docs: add honeypot docs (2025-12-16 04:03:37 -05:00)
- 83c8c3606a fix(lib): use mazeGen instead of bsGen (2025-12-16 01:26:00 -05:00)
- 958daba4a1 feat(honeypot/naive): attempt to automatically filter out based on crawling (2025-12-16 01:22:13 -05:00)
- e0f4468b03 fix(honeypot/naive): optimize hilariously (2025-12-15 23:11:21 -05:00)
- ade8505b26 feat: first implementation of honeypot logic, whose full message follows:

This is a bit of an experiment; stick with me.

The core idea here is that badly written crawlers are that: badly
written. They look for anything that contains `<a href="whatever" />`
tags and will blindly use those values to recurse. This takes advantage
of that by hiding a link in a `<script>` tag like this:

```html
<script type="ignore"><a href="/bots-only">Don't click</a></script>
```

Browsers will ignore it because they have no handler for the "ignore"
script type.

This current draft is very unoptimized (it takes about 7 seconds to
generate a page on my tower); however, switching spintax libraries will
make this much faster.

The hope is to make this pluggable with WebAssembly such that we force
administrators to choose a storage method. First we crawl before we
walk.

The AI involvement in this commit is limited to the spintax in
affirmations.txt, spintext.txt, and titles.txt. This generates a bunch
of "pseudoprofound bullshit" like the following:

> This Restoration to Balance & Alignment
>
> There's a moment when creators are being called to realize that the work
> can't be reduced to results, but about energy. We don't innovate products
> by pushing harder, we do it by holding the vision. Because momentum can't
> be forced, it unfolds over time when culture are moving in the same
> direction. We're being invited into a paradigm shift in how we think
> about innovation. [...]

This is intended to "look" like normal article text. As this is a first
draft, this sucks and will be improved upon.

Assisted-by: GLM 4.6, ChatGPT, GPT-OSS 120b
Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-12-15 18:27:42 -05:00
5 changed files with 5 additions and 14 deletions

```diff
@@ -134,10 +134,7 @@ bots:
     weight:
       adjust: -5
   - name: should-have-accept
-    expression:
-      all:
-        - userAgent.contains("Mozilla")
-        - '!("Accept" in headers)'
+    expression: '!("Accept" in headers)'
     action: WEIGH
     weight:
       adjust: 5
```

```diff
@@ -118,10 +118,7 @@
     weight:
       adjust: -5
   - name: should-have-accept
-    expression:
-      all:
-        - userAgent.contains("Mozilla")
-        - '!("Accept" in headers)'
+    expression: '!("Accept" in headers)'
     action: WEIGH
     weight:
       adjust: 5
```

```diff
@@ -27,7 +27,6 @@ Anubis is back and better than ever! Lots of minor fixes with some big ones inte
 - Add support to simple Valkey/Redis cluster mode
 - Open Graph passthrough now reuses the configured target Host/SNI/TLS settings, so metadata fetches succeed when the upstream certificate differs from the public domain. ([1283](https://github.com/TecharoHQ/anubis/pull/1283))
 - Stabilize the CVE-2025-24369 regression test by always submitting an invalid proof instead of relying on random POW failures.
-- Refine the check that ensures the presence of the Accept header to avoid breaking docker clients.
 
 ### Dataset poisoning
 
```

```diff
@@ -100,9 +100,6 @@ func XForwardedForToXRealIP(next http.Handler) http.Handler {
 			ip := xff.Parse(xffHeader)
 			slog.Debug("setting X-Real-Ip from X-Forwarded-For", "to", ip, "x-forwarded-for", xffHeader)
 			r.Header.Set("X-Real-Ip", ip)
-			if addr, err := netip.ParseAddr(ip); err == nil {
-				r = r.WithContext(context.WithValue(r.Context(), realIPKey{}, addr))
-			}
 		}
 
 		next.ServeHTTP(w, r)
```

```diff
@@ -7,7 +7,6 @@ import (
 	"log/slog"
 	"math/rand/v2"
 	"net/http"
-	"net/netip"
 	"time"
 
 	"github.com/TecharoHQ/anubis/internal"
@@ -153,7 +152,9 @@ func (i *Impl) ServeHTTP(w http.ResponseWriter, r *http.Request) {
 	realIP, _ := internal.RealIP(r)
 	if !realIP.IsValid() {
-		realIP = netip.MustParseAddr(r.Header.Get("X-Real-Ip"))
+		lg.Error("the real IP is somehow invalid, bad middleware stack?")
+		http.Error(w, "The cake is a lie", http.StatusTeapot)
+		return
 	}
 
 	network, ok := internal.ClampIP(realIP)
```