Compare commits

...

6 Commits

Author SHA1 Message Date
Xe Iaso
efe648f2af Apply suggestions from code review
Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-03-22 11:27:16 -04:00
Xe Iaso
44efcea781 cmd/anubis: amend generic browser challenge to "Gecko"
I'm gonna be honest, this is an extreme galaxy brain strategy and I'm
not entirely sure if this will pan out. However I got the idea when
reading [a community post][0]

If this works, that would be so much funnier than just using "Mozilla"
in the rules. I think that this could greatly backfire though, which is
why I'm making a pull request and opening this for feedback from the
community.

It would be absolutely hilarious if this works though.

[0]: https://github.com/TecharoHQ/anubis/discussions/68#discussioncomment-12583134
2025-03-22 09:23:03 -04:00
Christian F. Coors
15d801be7d fix: installation instructions and example (#75) 2025-03-22 07:45:32 -04:00
dependabot[bot]
c66305904b build(deps): bump github.com/golang-jwt/jwt/v5 from 5.2.1 to 5.2.2 (#74)
Bumps [github.com/golang-jwt/jwt/v5](https://github.com/golang-jwt/jwt) from 5.2.1 to 5.2.2.
- [Release notes](https://github.com/golang-jwt/jwt/releases)
- [Changelog](https://github.com/golang-jwt/jwt/blob/main/VERSION_HISTORY.md)
- [Commits](https://github.com/golang-jwt/jwt/compare/v5.2.1...v5.2.2)

---
updated-dependencies:
- dependency-name: github.com/golang-jwt/jwt/v5
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-21 20:41:19 -04:00
Xe Iaso
5f7942faca cmd/anubis: delete example RSS reader rule (#67)
The example/default bot policy document had a rule to allow RSS readers
through based on paths that end with ".rss", ".xml", ".atom", or
".json". Frameworks like Rails will treat these specially, meaning that
going to /things/12345-whateverhaha.json could bypass Anubis.

I checked the history of this rule and it was present in the original
example policy file in Xe/x. This rule is likely a mistake and it has
been removed. I think it was for making my blog still work with RSS
readers.

Thanks to Graham Sutherland for reporting this over email.

Signed-off-by: Xe Iaso <me@xeiaso.net>
2025-03-21 17:20:17 -04:00
Dennis ten Hoove
869e46a4cc Add MojeekBot (#64)
* Add MojeekBot

* Update docs/docs/CHANGELOG.md

Co-authored-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: Dennis ten Hoove <36002865+dennis1248@users.noreply.github.com>

---------

Signed-off-by: Dennis ten Hoove <36002865+dennis1248@users.noreply.github.com>
Co-authored-by: Xe Iaso <me@xeiaso.net>
2025-03-21 16:59:42 -04:00
6 changed files with 30 additions and 13 deletions

View File

@@ -1 +1 @@
1.14.1
1.14.2

View File

@@ -335,6 +335,14 @@
"193.183.0.174/32"
]
},
{
"name": "mojeekbot",
"user_agent_regex": "http\\://www\\.mojeek\\.com/bot\\.html",
"action": "ALLOW",
"remote_addresses": [
"5.102.173.71/32"
]
},
{
"name": "us-artificial-intelligence-scraper",
"user_agent_regex": "\\+https\\:\\/\\/github\\.com\\/US-Artificial-Intelligence\\/scraper",
@@ -355,11 +363,6 @@
"path_regex": "^/robots.txt$",
"action": "ALLOW"
},
{
"name": "rss-readers",
"path_regex": ".*\\.(rss|xml|atom|json)$",
"action": "ALLOW"
},
{
"name": "lightpanda",
"user_agent_regex": "^Lightpanda/.*$",
@@ -387,7 +390,7 @@
},
{
"name": "generic-browser",
"user_agent_regex": "Mozilla",
"user_agent_regex": "(?i:gecko)",
"action": "CHALLENGE"
}
],

View File

@@ -11,6 +11,18 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]
- Fixed and clarified installation instructions
- Changed default challenge logic for "Gecko" in the User-Agent string
instead of "Mozilla" [#78](https://github.com/TecharoHQ/anubis/pull/78)
## v1.14.2
Livia sas Junius: Echo 2
- Remove default RSS reader rule as it may allow for a targeted attack against rails apps
[#67](https://github.com/TecharoHQ/anubis/pull/67)
- Whitelist MojeekBot in botPolicies [#47](https://github.com/TecharoHQ/anubis/issues/47)
## v1.14.1
Livia sas Junius: Echo 1

View File

@@ -29,7 +29,7 @@ Anubis uses these environment variables for configuration:
| `METRICS_BIND` | `:9090` | The network address that Anubis serves Prometheus metrics on. See `BIND` for more information. |
| `METRICS_BIND_NETWORK` | `tcp` | The address family that the Anubis metrics server listens on. See `BIND_NETWORK` for more information. |
| `SOCKET_MODE` | `0770` | *Only used when at least one of the `*_BIND_NETWORK` variables are set to `unix`.* The socket mode (permissions) for Unix domain sockets. |
| `POLICY_FNAME` | `/data/cfg/botPolicy.json` | The file containing [bot policy configuration](./policies.md). See the bot policy documentation for more details. |
| `POLICY_FNAME` | unset | The file containing [bot policy configuration](./policies.md). See the bot policy documentation for more details. If unset, the default bot policy configuration is used. |
| `SERVE_ROBOTS_TXT` | `false` | If set `true`, Anubis will serve a default `robots.txt` file that disallows all known AI scrapers by name and then additionally disallows every scraper. This is useful if facts and circumstances make it difficult to change the underlying service to serve such a `robots.txt` file. |
| `TARGET` | `http://localhost:3923` | The URL of the service that Anubis should forward valid requests to. Supports Unix domain sockets, set this to a URI like so: `unix:///path/to/socket.sock`. |
@@ -47,13 +47,15 @@ services:
METRICS_BIND: ":9090"
SERVE_ROBOTS_TXT: "true"
TARGET: "http://nginx"
POLICY_FNAME: "/data/cfg/botPolicy.json"
ports:
- 8080:8080
volumes:
- "./botPolicy.json:/data/cfg/botPolicy.json:ro"
nginx:
image: nginx
volumes:
- "./www:/usr/share/nginx/html"
- "./botPolicy.json:/data/cfg/botPolicy.json"
```
## Kubernetes

4
go.mod
View File

@@ -5,8 +5,9 @@ go 1.24.1
require (
github.com/a-h/templ v0.3.833
github.com/facebookgo/flagenv v0.0.0-20160425205200-fcd59fca7456
github.com/golang-jwt/jwt/v5 v5.2.1
github.com/golang-jwt/jwt/v5 v5.2.2
github.com/prometheus/client_golang v1.21.1
github.com/sebest/xff v0.0.0-20210106013422-671bd2870b3a
github.com/yl2chen/cidranger v1.0.2
)
@@ -34,7 +35,6 @@ require (
github.com/prometheus/client_model v0.6.1 // indirect
github.com/prometheus/common v0.62.0 // indirect
github.com/prometheus/procfs v0.15.1 // indirect
github.com/sebest/xff v0.0.0-20210106013422-671bd2870b3a // indirect
golang.org/x/mod v0.24.0 // indirect
golang.org/x/net v0.37.0 // indirect
golang.org/x/sync v0.12.0 // indirect

4
go.sum
View File

@@ -31,8 +31,8 @@ github.com/fatih/color v1.16.0 h1:zmkK9Ngbjj+K0yRhTVONQh1p/HknKYSlNT+vZCzyokM=
github.com/fatih/color v1.16.0/go.mod h1:fL2Sau1YI5c0pdGEVCbKQbLXB6edEj1ZgiY4NijnWvE=
github.com/fsnotify/fsnotify v1.7.0 h1:8JEhPFa5W2WU7YfeZzPNqzMP6Lwt7L2715Ggo0nosvA=
github.com/fsnotify/fsnotify v1.7.0/go.mod h1:40Bi/Hjc2AVfZrqy+aj+yEI+/bRxZnMJyTJwOpGvigM=
github.com/golang-jwt/jwt/v5 v5.2.1 h1:OuVbFODueb089Lh128TAcimifWaLhJwVflnrgM17wHk=
github.com/golang-jwt/jwt/v5 v5.2.1/go.mod h1:pqrtFR0X4osieyHYxtmOUWsAWrfe1Q5UVIyoH402zdk=
github.com/golang-jwt/jwt/v5 v5.2.2 h1:Rl4B7itRWVtYIHFrSNd7vhTiz9UpLdi6gZhZ3wEeDy8=
github.com/golang-jwt/jwt/v5 v5.2.2/go.mod h1:pqrtFR0X4osieyHYxtmOUWsAWrfe1Q5UVIyoH402zdk=
github.com/google/go-cmp v0.6.0 h1:ofyhxvXcZhMsU5ulbFiLKl/XBFqE1GSq7atu8tAmTRI=
github.com/google/go-cmp v0.6.0/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY=
github.com/klauspost/compress v1.17.11 h1:In6xLpyWOi1+C7tXUUWv2ot1QvBjxevKAaI6IXrJmUc=