Commit Graph

7 Commits

Author SHA1 Message Date
Xe Iaso
fb20b36b18 feat(data/bots): add two example IRC bots
Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-04-28 00:46:59 -04:00
Xe Iaso
9f988578a4 feat(data): add challenge-browser-like.yaml
This is a huge change to Anubis and will make it far less visible to
users, at the cost of requiring additional server configuration.

If you add this bit of nginx config to your location block:

```nginx
proxy_set_header X-Http-Version $server_protocol;
```

And then adjust the bottom bot rule to this:

```yaml
- import: (data)/common/challenge-browser-like.yaml
```

Anubis will be far less aggressive than it was before. It uses some more
advanced heuristics to let through traffic that comes from a genuine
browser.

I think that this rule alone is the key feature of v1.18.0.

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-04-28 00:20:27 -04:00
Xe Iaso
029c79ba28 fix(lib/test): fix failing test and invalid cloudflare workers rule
Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-04-27 22:25:50 -04:00
Xe Iaso
80bd7c563b chore(data): reformat some things for expressions
Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-04-27 22:25:49 -04:00
Dryusdan
76514f9f32 Bump AI-robots.txt rules to version 1.29 (#383)
2025-04-27 20:52:08 -04:00
Xe Iaso
ef52550e70 fix(config): remove trailing newlines in regexes (#373)
Closes #372

Fun YAML fact of the day:

What is the difference between how these two expressions are parsed?

```yaml
foo: >
  bar
```

```yaml
foo: >-
  bar
```

The difference is invisible in YAML, but when you evaluate them to JSON
it is obvious:

```json
{
  "foo": "bar\n"
}
```

```json
{
  "foo": "bar"
}
```

Request lines and header lines _do_ end in newlines (CRLF) in HTTP/1.1
wire form, but that terminator is stripped during parsing, so the
User-Agent strings, URL path values, and header values the server
actually handles never contain it. Also HTTP/2 is a thing and does not
terminate header values with newlines.

This change makes Anubis more aggressively detect mistaken uses of the
YAML `>` block scalar indicator and nudges the user into using `>-`,
which strips the trailing newline.

I had honestly forgotten about this YAML behavior because it wasn't
relevant for so long. Oops! Glad I released a beta.

Whenever you get into this state, Anubis will throw a config parsing
error and then give you a message hinting at the folly of your ways.

```
config.Bot: regular expression ends with newline (try >- instead of > in yaml)
```

Big thanks to https://yaml-multiline.info, which helped me realize my
folly instantly.

@aiverson, this is official permission to say "told you so".

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-04-26 14:01:15 +00:00
Xe Iaso
74e11505c6 feat: enable loading config fragments (#321)
* feat(config): support importing bot policy snippets

This changes the grammar of the Anubis bot policy config to allow
importing from internal shared rules or external rules on the
filesystem.

This lets you create a file at `/data/policies/block-evilbot.yaml` and
then import it with:

```yaml
bots:
- import: /data/policies/block-evilbot.yaml
```

This also explodes the default policy file into a bunch of composable
snippets.
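
As a rough illustration of the two import sources, the Go sketch below separates bundled snippets from filesystem paths. It assumes bundled snippets are addressed with the `(data)/` prefix that appears elsewhere in this log; `resolveImport` and `embeddedPrefix` are hypothetical names, and the real loader may work differently.

```go
package main

import (
	"fmt"
	"strings"
)

// embeddedPrefix is assumed to mark snippets shipped inside the binary,
// as in `(data)/common/challenge-browser-like.yaml`.
const embeddedPrefix = "(data)/"

// resolveImport (hypothetical helper) reports whether an import refers
// to a bundled snippet and returns the name to look up in that source.
// Anything without the prefix is treated as a filesystem path.
func resolveImport(path string) (name string, embedded bool) {
	if strings.HasPrefix(path, embeddedPrefix) {
		return strings.TrimPrefix(path, embeddedPrefix), true
	}
	return path, false
}

func main() {
	for _, p := range []string{
		"(data)/common/challenge-browser-like.yaml",
		"/data/policies/block-evilbot.yaml",
	} {
		name, embedded := resolveImport(p)
		fmt.Printf("%s -> name=%s embedded=%t\n", p, name, embedded)
	}
}
```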

Thank you @Aibrew for your example gitea Atom / RSS feed rules!

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(data): update botPolicies.json to use imports

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(cmd/anubis): extract bot policies with --extract-resources

This allows a user that doesn't have anything but the Anubis binary to
figure out what the default configuration does.

* docs(data/botPolicies.yaml): document import syntax in-line

Signed-off-by: Xe Iaso <me@xeiaso.net>

* fix(lib/policy): better test importing from JSON snippets

Signed-off-by: Xe Iaso <me@xeiaso.net>

* docs(admin): Add import syntax documentation

This documents the import syntax and is based on the block comment at
the top of the default bot policy file.

* docs(changelog): add note about importing snippets

Signed-off-by: Xe Iaso <me@xeiaso.net>

* style(lib/policy/config): use an error value instead of an inline error

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-04-23 07:01:28 -04:00