* chore(web/js): delete proof-of-work-slow.mjs
This code has served its purpose and now needs to be retired to the
great beyond. There is no replacement for this, the fast implementation
will be used instead.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore(web): handle building multiple JS entrypoints and web workers
Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat(web): rewrite frontend worker handling
This completely rewrites how the proof of work challenge works based on
feedback from browser engine developers and starts the process of making
the proof of work function easier to change out.
- Import @aws-crypto/sha256-js to use in Firefox as its implementation
of WebCrypto doesn't jump directly from highly optimized browser
internals to JIT-ed JavaScript like Chrome's seems to.
- Move the worker code to `web/js/worker/*` with each worker named after
the hashing method and hash method implementation it uses.
- Update bench.mjs to import algorithms the new way.
- Delete video.mjs, it was part of a legacy experiment that I never had
time to finish.
- Update LibreJS comment to add info about the use of
@aws-crypto/sha256-js.
- Also update my email to my @techaro.lol address.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(web): don't hard dep webcrypto anymore
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore(lib/policy): start the deprecation process for slow
This mostly adds a warning, but the "slow" method is in the process of
being removed. Warn admins with slog.Warn.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* docs: update CHANGELOG
Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat(web/js): allow running Anubis in non-secure contexts
Signed-off-by: Xe Iaso <me@xeiaso.net>
* Update metadata
check-spelling run (pull_request) for Xe/purge-slow
Signed-off-by: check-spelling-bot <check-spelling-bot@users.noreply.github.com>
on-behalf-of: @check-spelling <check-spelling-bot@check-spelling.dev>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: check-spelling-bot <check-spelling-bot@users.noreply.github.com>
Previously the X-Forwarded-For middleware could return two commas in a
row. This is a regression test to make sure that doesn't happen again.
Imports a patch previously exclusive to Botstopper.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(lib): block XSS attacks via nonstandard URLs
This could allow an attacker to craft an Anubis pass-challenge URL that
forces a redirect to nonstandard URLs, such as the `javascript:` scheme
which executes arbitrary JavaScript code in a browser context when the
user clicks the "Try again" button.
Release-status: cut
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore: spelling
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
The TLS termination server sets X-Real-IP to be used by the back-end, but the back-end configuration example doesn't actually extract it so nginx logs (and back-end processing) fails to log or use the visiting IP in any way (it just states `unix:` if using a unix socket like in the example given, or the local IP if forwarded over TCP).
Adding real_ip_header to the config will fix this.
Signed-off-by: Moonchild <moonchild@palemoon.org>
* fix(lib): fix challenge issuance logic
Fixes#869
v1.21.0 changed the core challenge flow to maintain information about
challenges on the server side instead of only doing them via stateless
idempotent generation functions and relying on details to not change.
There was a subtle bug introduced in this change: if a client has an
unknown challenge ID set in its test cookie, Anubis will clear that
cookie and then throw an HTTP 500 error.
This has been fixed by making Anubis throw a new challenge page instead.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* test(lib): you win this time spell check
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
* Update pt-BR.json
While current version is good enough as a machine translation, it is not natural enough for what a reader would like to read while browsing sites made by native devs - including the subtle nuances from the original English version, now incorporated to the translation instead of plain, literal translations with questionable meanings.
Signed-off-by: HQuest <hquest@gmail.com>
* fix(locales/pt-BR): anubis is from Canada
CA is the ISO country code for Canada, but also the US state code for California.
Co-authored-by: Victor Fernandes <victorvalenca@gmail.com>
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: HQuest <hquest@gmail.com>
Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Victor Fernandes <victorvalenca@gmail.com>
Fixes#877
Continued from #879, event loop thrashing can cause stack space
exhaustion on ia32 systems. Previously this would thrash the event loop
in Firefox and Firefox derived browsers such as Pale Moon. I suspect
that this is the ultimate root cause of the bizarre irreproducible bugs
that Pale Moon (and maybe Cromite) users have been reporting since at
least #87 was merged.
The root cause is an invalid boolean statement:
```js
// send a progress update every 1024 iterations. since each thread checks
// separate values, one simple way to do this is by bit masking the
// nonce for multiples of 1024. unfortunately, if the number of threads
// is not prime, only some of the threads will be sending the status
// update and they will get behind the others. this is slightly more
// complicated but ensures an even distribution between threads.
if (
(nonce > oldNonce) | 1023 && // we've wrapped past 1024
(nonce >> 10) % threads === threadId // and it's our turn
) {
postMessage(nonce);
}
```
The logic here looks fine but is subtly wrong as was reported in #877
by a user in the Pale Moon community. Consider the following scenario:
`nonce` is a counter that increments by the worker count every loop.
This is intended to spread the load between CPU cores as such:
| Iteration | Worker ID | Nonce |
| :-------- | :-------- | :---- |
| 1 | 0 | 0 |
| 1 | 1 | 1 |
| 2 | 0 | 2 |
| 3 | 1 | 3 |
And so on.
The incorrect part of this is the boolean logic, specifically the part
with the bitwise or `|`. I think the intent was to use a logical or
(`||`), but this had the effect of making the `postMessage` handler fire
on every iteration. The intent of this snippet (as the comment clearly
indicates) is to make sure that the main event loop is only updated with
the worker status every 1024 iterations per worker. This had the
opposite effect, causing a lot of messages to be sent from workers to
the parent JavaScript context.
This is bad for the event loop.
Instead, I have ripped out that statement and replaced it with a much
simpler increment only counter that fires every 1024 iterations.
Additionally, only the first thread communicates back to the parent
process. This does mean that in theory the other workers could be ahead
of the first thread (posting a message out of a worker has a nonzero
cost), but in practice I don't think this will be as much of an issue as
the current behaviour is.
The root cause of the stack exhaustion is likely the pressure caused by
all of the postMessage futures piling up. Maybe the larger stack size in
64 bit environments is causing this to be fine there, maybe it's some
combination of newer hardware in 64 bit systems making this not be as
much of a problem due to it being able to handle events fast enough to
keep up with the pressure.
Either way, thanks much to @wolfbeast and the Pale Moon community for
finding this. This will make Anubis faster for everyone!
Signed-off-by: Xe Iaso <xe.iaso@techaro.lol>
Possible fix for #877
In some cases, the parallel solution finder in Anubis could cause
all of the worker promises to leak due to the fact the promises
were being improperly terminated. A recursion bomb happens in the
following scenario:
1. A worker sends a message indicating it found a solution to the proof
of work challenge.
2. The `onmessage` handler for that worker calls `terminate()`
3. Inside `terminate()`, the parent process loops through all other
workers and calls `w.terminate()` on them.
4. It's possible that terminating a worker could lead to the `onerror`
event handler.
5. This would create a recursive loop of `onmessage` -> `terminate` ->
`onerror` -> `terminate` -> `onerror` and so on.
This infinite recursion quickly consumes all available stack space, but
this has never been noticed in development because all of my computers
have at least 64Gi of ram provisioned to them under the axiom paying for
more ram is cheaper than paying in my time spent having to work around
not having enough ram. Additionally, ia32 has a smaller base stack size,
which means that they will run into this issue much sooner than users on
other CPU architectures will.
The fix adds a boolean `settled` flag to prevent termination from
running more than once.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* test: add i18n smoke test
Makes sure that all of the languages that Anubis supports show up when
the challenge page is sent to a client.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* test(i18n): build anubis so that the smoke test doesn't backoff timeout
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
This fixes a bug that was introduced in 68b653b0, in which the call
to metricsServer was passed a plain context.Background without
signal handling.
This commit adds back in the signal handling for the metrics server,
as well as for the Thoth client and storage backend.
Closes: #853
Signed-off-by: Emily Rowlands <emily@erowl.net>
* feat(anubis): add /healthz route to metrics server
Also add health check test for Docker Compose and update documentation
for health checking Anubis with Docker Compose.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore: spelling
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
Uptime Robot is a commonly used service for tracking service
interruptions. Additional policy definitions may be beneficial for
services that do publish their IP addresses in use. The list is
additionally aggregated to slightly shorten it.
Signed-off-by: Marcel Bischoff <marcel@herrbischoff.com>
This is not used yet, but it will be part of a larger strategy around
adding/removing weight based on JA4H (and other) fingerprint matches
with Thoth.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(lib): fix race condition when rendering multiple challenge pages at once
Closes#832
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(web): make try again button work
Looks like the intent of this was "try the solution again". This fix
makes the client try the challenge again.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(web): don't block a user if they have an invalid challenge cookie
Signed-off-by: Xe Iaso <me@xeiaso.net>
* docs: update CHANGELOG
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
This was causing issues with git clone against highly loaded servers. I
thought that this would be pretty innocuous, but I guess I was wrong.
Oops!
Signed-off-by: Xe Iaso <me@xeiaso.net>
* Fix centered-div class usage in index.templ
There was a redundant <center> tag around a div with centered-div class. Well, not so redundant because a typo in the class attribute caused it to not apply.
Removed another <center> tag and replaced by a div.centered-div for consistency.
Signed-off-by: Jesús Martínez Novo <martineznovo@gmail.com>
* Fix centered-div class usage in index.templ (continuation)
Template needed to be compiled into go code...
---------
Signed-off-by: Jesús Martínez Novo <martineznovo@gmail.com>
* docs(known-instances): add rpmfusion.org to known instances
Signed-off-by: Lothar Serra Mari <mail@serra.me>
* docs(known-instances): add wiki.freepascal.org to known instances
Signed-off-by: Lothar Serra Mari <mail@serra.me>
---------
Signed-off-by: Lothar Serra Mari <mail@serra.me>
* correct gitea.botPolicies extension to be yaml, not json
while Anubis probably doesn't care about the extension, and would parse a JSON file just fine too, the rest of the page talks about `gitea.botPolicies.yaml`, so let's be consistent
Signed-off-by: Evgeni Golov <evgeni@golov.de>
* chore: spelling
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Evgeni Golov <evgeni@golov.de>
Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Xe Iaso <me@xeiaso.net>
* docs(known-instances): add clew.se to known instances
Signed-off-by: Lothar Serra Mari <mail@serra.me>
* docs(known-instances): add tumfatig.net to known instances
Signed-off-by: Lothar Serra Mari <mail@serra.me>
---------
Signed-off-by: Lothar Serra Mari <mail@serra.me>
* feat(lib/policy/expressions): add system load average to bot expression inputs
This lets Anubis dynamically react to system load in order to
increase and decrease the required level of scrutiny. High load? More
scrutiny required. Low load? Less scrutiny required.
* docs: spell system correctly
Signed-off-by: Xe Iaso <me@xeiaso.net>
* Update metadata
check-spelling run (pull_request) for Xe/load-average
Signed-off-by: check-spelling-bot <check-spelling-bot@users.noreply.github.com>
on-behalf-of: @check-spelling <check-spelling-bot@check-spelling.dev>
* fix(default-config): don't enable low load average feature by default
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: check-spelling-bot <check-spelling-bot@users.noreply.github.com>
Signed-off-by: Xe Iaso <xe.iaso@techaro.lol>
* Add translation for Traditional Chinese
* Add translation for Traditional Chinese: test
* Add translation for Traditional Chinese: Add PR number to CHANGELOG
* Add translation for Traditional Chinese: test: remove empty lines
* Add translation for Traditional Chinese: test: remove empty lines
* docs(known-instances): add Duke University to known instances
Signed-off-by: Lothar Serra Mari <mail@serra.me>
* docs(known-instances): add fabulous.systems to known instances
Signed-off-by: Lothar Serra Mari <mail@serra.me>
* docs(known-instances): add coinhoards.org to known instances
Signed-off-by: Lothar Serra Mari <mail@serra.me>
* chore(spelling): exempt the known instances page
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Lothar Serra Mari <mail@serra.me>
Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Xe Iaso <me@xeiaso.net>
* feat(decaymap): add Delete method
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore(lib/challenge): refactor Validate to take ValidateInput
Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat(lib): implement store interface
Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat(lib/store): all metapackage to import all store implementations
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore(policy): import all store backends
Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat(lib): use new challenge creation flow
Previously Anubis constructed challenge strings from request metadata.
This was a good idea in spirit, but has turned out to be a very bad idea
in practice. This new flow reuses the Store facility to dynamically
create challenge values with completely random data.
This is a fairly big rewrite of how Anubis processes challenges. Right
now it defaults to using the in-memory storage backend, but on-disk
(boltdb) and valkey-based adaptors will come soon.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore(decaymap): fix documentation typo
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore(lib): fix SA4004
Signed-off-by: Xe Iaso <me@xeiaso.net>
* test(lib/store): make generic storage interface test adaptor
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore: spelling
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(decaymap): invert locking process for Delete
Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat(lib/store): add bbolt store implementation
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore: spelling
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore: go mod tidy
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore(devcontainer): adapt to docker compose, add valkey service
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(lib): make challenges live for 30 minutes by default
Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat(lib/store): implement valkey backend
Signed-off-by: Xe Iaso <me@xeiaso.net>
* test(lib/store/valkey): disable tests if not using docker
Signed-off-by: Xe Iaso <me@xeiaso.net>
* test(lib/policy/config): ensure valkey stores can be loaded
Signed-off-by: Xe Iaso <me@xeiaso.net>
* Update metadata
check-spelling run (pull_request) for Xe/store-interface
Signed-off-by: check-spelling-bot <check-spelling-bot@users.noreply.github.com>
on-behalf-of: @check-spelling <check-spelling-bot@check-spelling.dev>
* chore(devcontainer): remove port forwards because vs code handles that for you
Signed-off-by: Xe Iaso <me@xeiaso.net>
* docs(default-config): add a nudge to the storage backends section of the docs
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore(docs): listen on 0.0.0.0 for dev container support
Signed-off-by: Xe Iaso <me@xeiaso.net>
* docs(policy): document storage backends
Signed-off-by: Xe Iaso <me@xeiaso.net>
* docs: update CHANGELOG and internal links
Signed-off-by: Xe Iaso <me@xeiaso.net>
* docs(admin/policies): don't start a sentence with as
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore: fixes found in review
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: check-spelling-bot <check-spelling-bot@users.noreply.github.com>
This may seem strange, but allowlisting common crawl means that scrapers
have less incentive to scrape because they can just grab the data from
common crawl instead of scraping it again.
I'm gonna be totally honest here, I'm still not sure why #564 is still
an issue. This is really confusing and I'm going to totally throw out
how Anubis issues challenges and redo it with Valkey (#201, #622).
The problem seems to be that I assume that the makeChallenge function in
package lib is idempotent for the same client. I have no idea why this
would be inconsistent, but for some reason it is and I'm just at a loss
for words as to why this is happening.
This stops the bleeding by improving the UX as a stopgap.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* Add cookie prefix option
* Add explaination comment for TestCookieName
* Rename TestCookieName value from cookie-test-if-you-block-this-anubis-wont-work to cookie-verification
* Add changes to CHANGELOG.md
* Add values to CookieName and TestCookieName in anubis.go required for testcases
* Fix cookieDynamicDomain option not being set in Options struct
* Fix using wrong cookie name when using dynamic cookie domains
* Adjust testcases for new cookie option structs
* Add known words to expect.txt and change typo in Zombocom
* Cleanup expect.txt
* Add changes to changelog
* Bump versions of grpc and apimachinery
* Fix testcases and add additional condition for dynamic cookie domain
* lib/localization: implement localization system
Locale files are placed in lib/localization/locales/. If you add a
locale, update manifest.json with available locales.
* Exclude locales from check spelling
* tests(lib/localization): add comprehensive translations test
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(challenge/metarefresh): enable localization
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix: use simple syntax for localization in templ
Also localize CELPHASE into French according to the wishes of the
artist.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore: spelling
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore:(js): fix forbidden patterns
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore: add goi18n to tools
Signed-off-by: Xe Iaso <me@xeiaso.net>
* test(lib/localization): dynamically determine the list of supported languages
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Xe Iaso <me@xeiaso.net>
* feat: dynamic cookie domains
Replaces #685
I was having weird testing issues when trying to merge #685, so I
rewrote it from scratch to be a lot more minimal.
* chore: spelling
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore(xess): remove unused xess templates
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* chore(checker): remove unused staticHashChecker implementation
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat: add pinact and deadcode to go tools (pinact is used for the gha pinning)
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* chore: update Docker and kubectl actions to latest versions
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* chore: update Homebrew action from master to main in workflow files
See df537ec97f
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* chore: remove unused go-colorable and tools dependencies from go.sum
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* chore: update postcss-import and other dependencies to latest versions
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* chore: update Docusaurus dependencies to version 3.8.1
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* chore: downgrade playwright and playwright-core to version 1.52.0
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
---------
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
Closes#564
This one is really dumb. Take a seat and listen to my tale of woe.
While @victorvalenca was working on #693 we ran into a strange issue.
The tests would consistently pass on Firefox but instantly failed on
Chrome. After adding increasingly desperate debugging logs to the mix,
we found out that somehow Chrome was randomizing the contents of its
Accept-Language header. This was making the challenge string get
calculated differently, thus making things spuriously fail. I cannot
figure out what causes Chrome to do this other than you being in an
environment where you have more than one "system language" set.
Either way, this should finally fix this issue and bring peace to the
land forever*.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat(config): opengraph passthrough configuration
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore(ogtags): use config.OpenGraph for configuration
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore: wire up ogtags config in most of the app
Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat(ogtags): return default tags if they are supplied
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore: make OpenGraph legal so we have some sanity in reviewing
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore: spelling
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(lib): use OpenGraph.Enabled
Signed-off-by: Xe Iaso <me@xeiaso.net>
* test(lib): load default config file if one is not specified in spawnAnubis
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore(config): fix ST1005
Signed-off-by: Xe Iaso <me@xeiaso.net>
* docs: document open graph defaults and its new home in the policy file
Signed-off-by: Xe Iaso <me@xeiaso.net>
* docs(installation): point to weight threshold new home
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore: rename default to override
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore(default-config): add off-by-default opengraph settings to bot policy file
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(anubis): make build
Signed-off-by: Xe Iaso <me@xeiaso.net>
* test(lib): fix build
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat: replace cidranger with bart improving performance by 3-20x
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* perf: replace cidranger with bart for IP range checking
- Replace cidranger.Ranger with bart.Lite in RemoteAddrChecker
- Use netip.ParsePrefix instead of net.ParseCIDR for modern IP handling
- Improve performance: 3-20x faster lookups with zero heap allocations
- Update imports to use github.com/gaissmai/bart and net/netip
- Remove cidranger dependency from go.mod
Benchmark results:
- IPv4 lookups: 4x faster (15.58ns vs 63.25ns, 0 vs 2 allocs)
- IPv6 lookups: 3x faster (26.51ns vs 76.96ns, 0 vs 2 allocs)
- Insertions: 20x faster (976ns vs 19,191ns)
- Large tables: 14x faster (5.2ns vs 74.85ns)
* docs: clarify CHANGELOG to not give false impressions
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* perf: optimize string concatenation in RemoteAddrChecker hash generation
Replace fmt.Fprintln with strings.Join for 7x faster performance:
- Before: 935.1 ns/op, 784 B/op, 22 allocs/op
- After: 133.2 ns/op, 192 B/op, 1 alloc/op
The hash is used for JWT cookie validation and error code generation.
Comma separation provides the same deterministic uniqueness as newlines
but with significantly better performance during policy initialization.
* chore: remove accidentally commited string benchmark
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* style: apply Copilot suggestions
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* fix: reference the right var name
i cannot write a merge commit
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
---------
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* Revert "docs/blog: remove (#273)"
This reverts commit df3509ec99.
* chore: intro to the blog post
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
* docs(known-instances): add bugs.scummvm.org and gitlab.postmarketos.org
Signed-off-by: Lothar Serra Mari <mail@serra.me>
* chore: clean uri
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
---------
Signed-off-by: Lothar Serra Mari <mail@serra.me>
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
Co-authored-by: Jason Cameron <git@jasoncameron.dev>
* refactor(ogtags): optimize URL construction and memory allocations
* test(ogtags): add benchmarks and memory usage tests for OGTagCache
* refactor(ogtags): optimize OGTags subsystem to reduce allocations and improve request runtime by up to 66%
* Update docs/docs/CHANGELOG.md
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Jason Cameron <jasoncameron.all@gmail.com>
* refactor(ogtags): optimize URL string construction to reduce allocations
* Update internal/ogtags/ogtags.go
Co-authored-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: Jason Cameron <jasoncameron.all@gmail.com>
* test(ogtags): add fuzz tests for getTarget and extractOGTags functions
* fix(ogtags): update memory calculation logic
Prev it would say that we had allocated 18pb
=== RUN TestMemoryUsage
mem_test.go:107: Memory allocated for 10k getTarget calls: 18014398509481904.00 KB
mem_test.go:135: Memory allocated for 1k extractOGTags calls: 18014398509481978.00
Now it's fixed with
=== RUN TestMemoryUsage
mem_test.go:109: Memory allocated for 10k getTarget calls:
mem_test.go:110: Total: 630.56 KB (0.62 MB)
mem_test.go:111: Per operation: 64.57 bytes
mem_test.go:140: Memory allocated for 1k extractOGTags calls:
mem_test.go:141: Total: 328.17 KB (0.32 MB)
mem_test.go:142: Per operation: 336.05 bytes
* refactor(ogtags): optimize meta tag extraction for improved performance
* Update metadata
check-spelling run (pull_request) for json/ogmem
Signed-off-by: check-spelling-bot <check-spelling-bot@users.noreply.github.com>
on-behalf-of: @check-spelling <check-spelling-bot@check-spelling.dev>
* chore: update CHANGELOG for recent optimizations and version bump
* refactor: improve URL construction and meta tag extraction logic
* style: cleanup fuzz tests
---------
Signed-off-by: Jason Cameron <jasoncameron.all@gmail.com>
Signed-off-by: check-spelling-bot <check-spelling-bot@users.noreply.github.com>
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Xe Iaso <me@xeiaso.net>
* style: fix formatting in .air.toml and installation.mdx
* feat: add --strip-base-prefix flag to modify request paths when forwarding
Closes: #638
* refactor: apply structpacking (betteralign)
* fix: add validation for strip-base-prefix and base-prefix configuration
* fix: improve request path handling by cloning request and modifying URL path
* chore: remove integration tests as they are too annoying to debug on my system
* feat(lib): implement request weight
Replaces #608
This is a big one and will be what makes Anubis a generic web
application firewall. This introduces the WEIGH option, allowing
administrators to have facets of request metadata add or remove
"weight", or the level of suspicion. This really makes Anubis weigh
the soul of requests.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(lib): maintain legacy challenge behavior
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(lib): make weight have dedicated checkers for the hashes
Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat(data): convert some rules over to weight points
Signed-off-by: Xe Iaso <me@xeiaso.net>
* docs: document request weight
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(CHANGELOG): spelling error
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore: spelling
Signed-off-by: Xe Iaso <me@xeiaso.net>
* docs: fix links to challenge information
Signed-off-by: Xe Iaso <me@xeiaso.net>
* docs(policies): fix formatting
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(config): make default weight adjustment 5
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore(deps): update dependencies in go.mod and go.sum
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* refactor: rename variables for clarity in anubis.go and main.go
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* fix(checker): handle error when inserting IP range in ranger
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* fix(tests): simplify boolean checks in header and URL value tests
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* refactor(api): remove unused /test-error endpoint and restrict /make-challenge to development
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* build(deps): update golang-set to v2.8.0 in go.sum
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* Update metadata
check-spelling run (pull_request) for json/stuff
Signed-off-by: check-spelling-bot <check-spelling-bot@users.noreply.github.com>
on-behalf-of: @check-spelling <check-spelling-bot@check-spelling.dev>
---------
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
Signed-off-by: check-spelling-bot <check-spelling-bot@users.noreply.github.com>
* feat(lib/challenge): HTTP meta refresh challenge method
Closes#95
This challenge method enables users that don't (or won't) support
JavaScript to pass Anubis challenges. It works by using HTML meta
refresh directives to ensure that the client is a browser.
This is OFF by default. In order to enable it, an administrator MUST
choose to make the default challenge method `metarefresh`.
TODO(Xe):
- [ ] Documentation on this challenge method
- [ ] Amend wording around Anubis being a proof of work proxy in the docs
- [ ] Add configuration file syntax for the default challenge method and settings
- [ ] Test with early customers
Signed-off-by: Xe Iaso <me@xeiaso.net>
* chore: spelling
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(lib/challenge/metarefresh): use this value of err
Signed-off-by: Xe Iaso <me@xeiaso.net>
* docs: add metarefresh challenge info, Web AI Firewall Utility
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
Gives us many nice things like:
* Windows support for yeet (modulo TecharoHQ/yeet#29)
* Removes the dependency on /bin/sh or /bin/bash thanks to
mvdan.cc/sh/v3
* Checksum-compliant reproducible builds by default
Signed-off-by: Xe Iaso <me@xeiaso.net>
* Split up AI filtering files
Create aggressive/moderate/permissive policies to allow administrators to choose their AI/LLM stance.
Aggressive policy matches existing default in Anubis.
Removes `Google-Extended` flag from `ai-robots-txt.yaml` as it doesn't exist in requests.
Rename `ai-robots-txt.yaml` to `ai-catchall.yaml` as the file is no longer a copy of the source repo/file.
* chore: spelling
* chore: fix embeds
* chore: fix data includes
* chore: fix file name typo
* chore: Ignore READMEs in configs
* chore(lib/policy/config): go tool goimports -w
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Xe Iaso <me@xeiaso.net>
* Define OpenAI bot ALLOW policies
Allows OpenAI bots to be allowlisted at the choice of the Anubis administrator. None are enabled by default.
* Define MistralAI bot ALLOW policy
* chore: spelling
* Add Applebot definition
Adds Apple's search indexing bot, and allowlists it by default.
Allowlisted by default because it is equivalent to Googlebot/Bingbot. Remove Applebot from `ai-robots-txt.yaml` for the same reasons.
Remove `Applebot-Extended` from `ai-robots-txt.yaml` as it has no effect.
* chore: spelling
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Xe Iaso <me@xeiaso.net>
* feat(lib): annotate cookies with what rule was passed
Anubis JWTs now contain a policyRule claim with the cryptographic hash
of the rule that it passed. This is intended to help with a future move
away from proof of work being the default.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* test(lib): fix cookie storage logic
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(bench): await benchmark loop and adjust outline styles in templates
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* refactor: remove unused showContinueBar function and clean up video error handling
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* style: format code for consistency and readability using prettier
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
---------
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
For some reason, Google Chrome will randomly send a "full"
Accept-Language header, and other times it will send a "partial"
Accept-Language header. This makes the challenge construction
inconsistent.
This commit fixes this issue by only considering up to the first five
characters of the Accept-Language header when making a challenge string.
Signed-off-by: Xe Iaso <me@xeiaso.net>
Closes#565
The page already had the version number embedded into it, but that was
not printed to the page. This prints the version number set at compile
time to the page.
Signed-off-by: Xe Iaso <me@xeiaso.net>
This seems counter-intuitive at first glance, but let me cook.
One of the problems with Anubis is that the rule matching is super
deterministic. This means that attackers can figure out what patterns
they are hitting and change things to bypass them.
The randInt function lets you have rulesets behave nondeterministically.
This is a very easy way to hang yourself, but can be great to
psychologically mess with scraper operators. Consider this rule:
```yaml
- name: deny-lightpanda-sometimes
action: DENY
expression:
all:
- userAgent.matches("LightPanda")
- randInt(16) >= 4
```
It would match about 75% of the time.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(expression): add validation for empty ExpressionOrList
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* fix(imports): block empty file imports with improved error checking logic
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* docs(expression): improve validation to error on empty CEL expressions
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
---------
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
Closes#531
This changes `anubis_challenges_issued` to be a vector counter that
records the challenge issuance method.
Signed-off-by: Xe Iaso <me@xeiaso.net>
For websites hosted on non-standard ports (`https://example:com:8443`,
`http://www.example.net:8080`, etc.), the domains listed in
`REDIRECT_DOMAINS` must contain the port number. This commit documents
this requirement on the Installation page.
Fixes#517.
Signed-off-by: Max Chernoff <git@maxchernoff.ca>
Closes#520
For some reason, Chrome and Firefox are very picky over what they use to
match cookies that need to be deleted. Listen to me for my tale of woe:
The basic problem here is that cookies were an early hack added on the
side of the HTTP spec and they're basically impossible to upgrade or
change because who knows what relies on the exact behavior cookies use.
As a result, cookies don't just match by name, but by every setting that
exists on them. You can also have two cookies with the same name but
different values. This spec is a nightmare lol.
Even more fun: browsers will make up values for cookies if they aren't
set, meaning that getting a challenge token at `/docs` is semantically
different than a challenge token you got from `/`.
This PR fixes this issue by explicitly setting the "make sure cookie
support is working" cookie's path to `/`, meaning that it will always be
sent. Additionally, cookies are expired by setting the expiry time to
one minute in the past.
Hopefully this will fix it. I'm testing this locally and it seems to
work fine.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat(lib): ensure that clients store cookies
If a client is misconfigured and does not store cookies, then they can
get into a proof of work death spiral with Anubis. This fixes the
problem by setting a test cookie whenever the user gets hit with a
challenge page. If the test cookie is not there at challenge pass time,
then they are blocked. Administrators will also get a log message
explaining that the user intentionally broke cookie support and that this
behavior is not an Anubis bug.
Additionally, this ensures that clients being shown a challenge support
gzip-compressed responses by showing the challenge page at gzip level 1.
This level is intentionally chosen in order to minimize system impacts.
The ClearCookie function is made more generic to account for cookie
names as an argument. A correlating SetCookie function was also added to
make it easier to set cookies.
* chore(lib): clean up test code
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
Previously this made ClearCookie always clear cookies by name even when
CookieDomain was set. This change fixes this and adds tests to make sure
that this doesn't happen again.
Signed-off-by: Xe Iaso <me@xeiaso.net>
Also properly re-brand the cookies so that some of the /x/ heritage is
lost.
This will invalidate existing cookies and probably affects tests.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat(ci): use dynamic repository owner and name in Docker actions
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat(ci): support forks
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat(ci): support forks
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat(ci): add debug output for Docker repository information
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat(ci): update Docker image naming convention in workflow
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat(ci): set lowercase image name in Docker workflow
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat(ci): remove json/gha branch from Docker workflow triggers
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* feat(ci): simplify Docker registry configuration in workflow
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
---------
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
This means that yeet's version will be managed by `go.mod` and
auto-bumped with dependabot. This removes human error from the equation
and ensures that Anubis is always built with the newest version of yeet.
This also makes it trivial to make your own local packages for testing:
```text
go tool yeet
```
Signed-off-by: Xe Iaso <me@xeiaso.net>
* test(playwright): Add support to run tests in Docker/Podman
* fix command name
Co-authored-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: Henri Vasserman <henv@hot.ee>
* up the pw version as it is in package.json
* add convenience npm scripts
* chore: changelog update
Also removed a period from my other item.
* chore: fix spelling
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Henri Vasserman <henv@hot.ee>
Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Xe Iaso <me@xeiaso.net>
* Overhaul anubis.freebsd
Some changes here to reflect the discussion in pull request 274 regarding the `anubis_env`, `anubis_env_file` and `anubis_args` variables.
At the risk of improving personal choices in configuration with a minor amount more complexity, this new script now allows for the use of all three of these, together, with no interference between them all
i.e.
- if `anubis_env_file` is set, environment variables will be taken from this file
- if `anubis_env` is set, environment variables will be taken from this string of variables, and override matching variables set in `anubis_env_file`
- if `anubis_args` is set, runtime parameters will be taken from this string and override matching ones in both `anubis_env_file` and `anubis_env`
Thanks to @dlangille for the advice with this.
Signed-off-by: Paul Wilde <31094984+pswilde@users.noreply.github.com>
* Update CHANGELOG.md
Signed-off-by: Paul Wilde <31094984+pswilde@users.noreply.github.com>
* Remove unnecessary comment line
Signed-off-by: Paul Wilde <31094984+pswilde@users.noreply.github.com>
* Correct helper information for anubis_env_file
Signed-off-by: Paul Wilde <31094984+pswilde@users.noreply.github.com>
---------
Signed-off-by: Paul Wilde <31094984+pswilde@users.noreply.github.com>
* refactor: reorder import statements in fetch.go and fetch_test.go
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
* fix: optimize struct field alignment to reduce memory usage
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
---------
Signed-off-by: Jason Cameron <git@jasoncameron.dev>
The big ticket feature in this release is [CEL expression matching support](https://anubis.techaro.lol/docs/admin/configuration/expressions). This allows you to tailor your approach for the individual services you are protecting.
These can be as simple as:
```yaml
- name: allow-api-requests
action: ALLOW
expression:
all:
- '"Accept" in headers'
- 'headers["Accept"] == "application/json"'
- 'path.startsWith("/api/")'
```
Or as complicated as:
```yaml
- name: allow-git-clients
action: ALLOW
expression:
all:
- >-
(
userAgent.startsWith("git/") ||
userAgent.contains("libgit") ||
userAgent.startsWith("go-git") ||
userAgent.startsWith("JGit/") ||
userAgent.startsWith("JGit-")
)
- '"Git-Protocol" in headers'
- headers["Git-Protocol"] == "version=2"
```
The docs have more information, but here's a tl;dr of the variables you have access to in expressions:
| Name | Type | Explanation | Example |
| :-------------- | :-------------------- | :---------------------------------------------------------------------------------------------------------------------------------------- | :----------------------------------------------------------- |
| `headers` | `map[string, string]` | The [headers](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers) of the request being processed. | `{"User-Agent": "Mozilla/5.0 Gecko/20100101 Firefox/137.0"}` |
| `host` | `string` | The [HTTP hostname](https://web.dev/articles/url-parts#host) the request is targeted to. | `anubis.techaro.lol` |
| `method` | `string` | The [HTTP method](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Methods) in the request being processed. | `GET`, `POST`, `DELETE`, etc. |
| `path` | `string` | The [path](https://web.dev/articles/url-parts#pathname) of the request being processed. | `/`, `/api/memes/create` |
| `query` | `map[string, string]` | The [query parameters](https://web.dev/articles/url-parts#query) of the request being processed. | `?foo=bar` -> `{"foo": "bar"}` |
| `remoteAddress` | `string` | The IP address of the client. | `1.1.1.1` |
| `userAgent` | `string` | The [`User-Agent`](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/User-Agent) string in the request being processed. | `Mozilla/5.0 Gecko/20100101 Firefox/137.0` |
This will be made more elaborate in the future. Give me time. This is a [simple, lovable, and complete](https://longform.asmartbear.com/slc/) implementation of this feature so that administrators can get hacking ASAP.
Other changes:
- Use CSS variables to deduplicate styles
- Fixed native packages not containing the stdlib and botPolicies.yaml
- Change import syntax to allow multi-level imports
- Changed the startup logging to use JSON formatting as all the other logs do.
- Added the ability to do [expression matching with CEL](./admin/configuration/expressions.mdx)
- Add a warning for clients that don't store cookies
- Disable Open Graph passthrough by default ([#435](https://github.com/TecharoHQ/anubis/issues/435))
- Clarify the license of the mascot images ([#442](https://github.com/TecharoHQ/anubis/issues/442))
- Started Suppressing 'Context canceled' errors from http in the logs ([#446](https://github.com/TecharoHQ/anubis/issues/446))
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(js): use pure JS SHA256 library, refactor
Closes#458
Additionally, I made a horrifying discovery: Firefox seems to actively
hinder performance if you are using more than one Worker per page. It
does not spread the load out across cores like I expected. Instead it
seems to make that one Worker thrash and have to constantly context
switch, which caused a lot of slowdown.
The benchmarks in #155 continue to be the best contribution ever made to
Anubis. What clued me into there being a problem here was the fact that
the "slow" algorithm was faster than the "fast" algorithm on my laptop.
This made no intuitive sense to me so I dug further.
Either way I think this is a Firefox bug at its core, but for now we
have to work around it by doing the hacky terrible thing that I hate.
I also swapped the SHA256 operations to @aws-crypto/sha256-js on the
advice of a trusted cryptography expert. I don't know what performance
differences this makes, but I'm getting 150-225 kilohashes per second,
which is pretty dang good.
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(js): apply suggestions from code review
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix(js): use fast algo for fast worker
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* The IP address and Host should be included
* The Content-Length removed to avoid Anubis waiting for the body, which is not passed because subrequest is always using GET.
Signed-off-by: Michal Čihař <michal@weblate.org>
* feat(lib/policy): add support for CEL checkers
This adds the ability for administrators to use Common Expression
Language[0] (CEL) for more advanced check logic than Anubis previously
offered.
These can be as simple as:
```yaml
- name: allow-api-routes
action: ALLOW
expression:
and:
- '!(method == "HEAD" || method == "GET")'
- path.startsWith("/api/")
```
or get as complicated as:
```yaml
- name: allow-git-clients
action: ALLOW
expression:
and:
- userAgent.startsWith("git/") || userAgent.contains("libgit") || userAgent.startsWith("go-git") || userAgent.startsWith("JGit/") || userAgent.startsWith("JGit-")
- >
"Git-Protocol" in headers && headers["Git-Protocol"] == "version=2"
```
Internally these are compiled and evaluated with cel-go[1]. This also
leaves room for extensibility should that be desired in the future. This
will intersect with #338 and eventually intersect with TLS fingerprints
as in #337.
[0]: https://cel.dev/
[1]: https://github.com/google/cel-go
Signed-off-by: Xe Iaso <me@xeiaso.net>
* feat(data/apps): add API route allow rule for non-HEAD/GET
Signed-off-by: Xe Iaso <me@xeiaso.net>
* docs: document expression syntax
Signed-off-by: Xe Iaso <me@xeiaso.net>
* fix: fixes in review
Signed-off-by: Xe Iaso <me@xeiaso.net>
---------
Signed-off-by: Xe Iaso <me@xeiaso.net>
I guess the whole purpose is to avoid having 3001 opened to the world. This is the easyest way to do it (iptables might be an option too)
Signed-off-by: mans17 <github@spontex.org>
* deduplicate css rules by using media query to set variables
* Update xess/xess.css
Co-authored-by: Xe Iaso <me@xeiaso.net>
Signed-off-by: Benjamin Armintor <armintor@gmail.com>
---------
Signed-off-by: Benjamin Armintor <armintor@gmail.com>
Co-authored-by: Xe Iaso <me@xeiaso.net>
Anubis offers a [development container](https://containers.dev/) image in order to make it easier to contribute to the project. This image is based on [Xe/devcontainer-base/go](https://github.com/Xe/devcontainer-base/tree/main/src/go), which is based on Debian Bookworm with the following customizations:
- [Fish](https://fishshell.com/) as the shell complete with a custom theme
- [Go](https://go.dev) at the most recent stable version
- [Node.js](https://nodejs.org/en) at the most recent stable version
- [Atuin](https://atuin.sh/) to sync shell history between your host OS and the development container
- [Docker](https://docker.com) to manage and build Anubis container images from inside the development container
- [Ko](https://ko.build/) to build production-ready Anubis container images
- [Neovim](https://neovim.io/) for use with Git
This development container is tested and known to work with [Visual Studio Code](https://code.visualstudio.com/). If you run into problems with it outside of VS Code, please file an issue and let us know what editor you are using.
[dictionary.txt](dictionary.txt) | Replacement dictionary (creating this file will override the default dictionary) | one word per line | [dictionary](https://github.com/check-spelling/check-spelling/wiki/Configuration#dictionary)
[allow.txt](allow.txt) | Add words to the dictionary | one word per line (only letters and `'`s allowed) | [allow](https://github.com/check-spelling/check-spelling/wiki/Configuration#allow)
[reject.txt](reject.txt) | Remove words from the dictionary (after allow) | grep pattern matching whole dictionary words | [reject](https://github.com/check-spelling/check-spelling/wiki/Configuration-Examples%3A-reject)
[only.txt](only.txt) | Only check matching files (applied after excludes) | perl regular expression | [only](https://github.com/check-spelling/check-spelling/wiki/Configuration-Examples%3A-only)
[patterns.txt](patterns.txt) | Patterns to ignore from checked lines | perl regular expression (order matters, first match wins) | [patterns](https://github.com/check-spelling/check-spelling/wiki/Configuration-Examples%3A-patterns)
[candidate.patterns](candidate.patterns) | Patterns that might be worth adding to [patterns.txt](patterns.txt) | perl regular expression with optional comment block introductions (all matches will be suggested) | [candidates](https://github.com/check-spelling/check-spelling/wiki/Feature:-Suggest-patterns)
[line_forbidden.patterns](line_forbidden.patterns) | Patterns to flag in checked lines | perl regular expression (order matters, first match wins) | [patterns](https://github.com/check-spelling/check-spelling/wiki/Configuration-Examples%3A-patterns)
[expect.txt](expect.txt) | Expected words that aren't in the dictionary | one word per line (sorted, alphabetically) | [expect](https://github.com/check-spelling/check-spelling/wiki/Configuration#expect)
[advice.md](advice.md) | Supplement for GitHub comment when unrecognized words are found | GitHub Markdown | [advice](https://github.com/check-spelling/check-spelling/wiki/Configuration-Examples%3A-advice)
Note: you can replace any of these files with a directory by the same name (minus the suffix)
and then include multiple files inside that directory (with that suffix) to merge multiple files together.
# Update Lorem based on your content (requires `ge` and `w` from https://github.com/jsoref/spelling; and `review` from https://github.com/check-spelling/check-spelling/wiki/Looking-for-items-locally )
# Even repositories expecting pure English content can unintentionally have Non-English content... People will occasionally mistakenly enter [homoglyphs](https://en.wikipedia.org/wiki/Homoglyph) which are essentially typos, and using this pattern will mean check-spelling will not complain about them.
#
# If the content to be checked should be written in English and the only Non-English items will be people's names, then you can consider adding this.
#
# Alternatively, if you're using check-spelling v0.0.25+, and you would like to _check_ the Non-English content for spelling errors, you can. For information on how to do so, see:
# Should be `background` / `intro text` / `introduction` / `prologue` unless it's a brand or relates to _subterfuge_
(?i)\bpretext\b
# Should be `branches`
# ... unless it's really about the meal that replaces breakfast and lunch.
\b[Bb]runches\b
# Should be `briefcase`
\bbrief-case\b
# Should be `by far` or `far and away`
\bby far and away\b
# Should be `can, not only ..., ... also...`
\bcan not only.*can also\b
# Should be `cannot` (or `can't`)
# See https://www.grammarly.com/blog/cannot-or-can-not/
# > Don't use `can not` when you mean `cannot`. The only time you're likely to see `can not` written as separate words is when the word `can` happens to precede some other phrase that happens to start with `not`.
# > `Can't` is a contraction of `cannot`, and it's best suited for informal writing.
# > In formal writing and where contractions are frowned upon, use `cannot`.
# > It is possible to write `can not`, but you generally find it only as part of some other construction, such as `not only . . . but also.`
# - if you encounter such a case, add a pattern for that case to patterns.txt.
# Should be `for its` (possessive) or `because it is`
\bfor it(?:'s| is)\b
# Should be `log in`
\blogin to the
# Should be `long-standing`
\blong standing\b
# `apt-key` is deprecated
# ... instead you should be writing a pair of files:
# ... * the gpg key added to a distinct key ring file based on your project/distro/key...
# ... * the sources.list in a district file -- not simply appended to `/etc/apt/sources.list` -- (there is a newer format [DEB822](https://manpages.debian.org/bookworm/dpkg-dev/deb822.5.en.html)) that references the gpg key.
Anubis [weighs the soul of your connection](https://en.wikipedia.org/wiki/Weighing_of_souls) using a proof-of-work challenge in order to protect upstream resources from scraper bots.
Anubis is a Web AI Firewall Utility that [weighs the soul of your connection](https://en.wikipedia.org/wiki/Weighing_of_souls) using one or more challenges in order to protect upstream resources from scraper bots.
This program is designed to help protect the small internet from the endless storm of requests that flood in from AI companies. Anubis is as lightweight as possible to ensure that everyone can afford to protect the communities closest to them.
bindNetwork=flag.String("bind-network","tcp","network family to bind HTTP to, e.g. unix, tcp")
challengeDifficulty=flag.Int("difficulty",anubis.DefaultDifficulty,"difficulty of the challenge")
cookieDomain=flag.String("cookie-domain","","if set, the top-level domain that the Anubis cookie will be valid for")
cookieDynamicDomain=flag.Bool("cookie-dynamic-domain",false,"if set, automatically set the cookie Domain value based on the request domain")
cookieExpiration=flag.Duration("cookie-expiration-time",anubis.CookieDefaultExpirationTime,"The amount of time the authorization cookie is valid for")
cookiePrefix=flag.String("cookie-prefix","techaro.lol-anubis","prefix for browser cookies created by Anubis")
cookiePartitioned=flag.Bool("cookie-partitioned",false,"if true, sets the partitioned flag on Anubis cookies, enabling CHIPS support")
forcedLanguage=flag.String("forced-language","","if set, this language is being used instead of the one from the request's Accept-Language header")
hs512Secret=flag.String("hs512-secret","","secret used to sign JWTs, uses ed25519 if not set")
cookieSecure=flag.Bool("cookie-secure",true,"if true, sets the secure flag on Anubis cookies")
ed25519PrivateKeyHex=flag.String("ed25519-private-key-hex","","private key used to sign JWTs, if not set a random one will be assigned")
ed25519PrivateKeyHexFile=flag.String("ed25519-private-key-hex-file","","file name containing value for ed25519-private-key-hex")
metricsBind=flag.String("metrics-bind",":9090","network address to bind metrics to")
@@ -54,15 +63,25 @@ var (
policyFname=flag.String("policy-fname","","full path to anubis policy document (defaults to a sensible built-in policy)")
redirectDomains=flag.String("redirect-domains","","list of domains separated by commas which anubis is allowed to redirect to. Leaving this unset allows any domain.")
slogLevel=flag.String("slog-level","INFO","logging level (see https://pkg.go.dev/log/slog#hdr-Levels)")
stripBasePrefix=flag.Bool("strip-base-prefix",false,"if true, strips the base prefix from requests forwarded to the target server")
target=flag.String("target","http://localhost:3923","target to reverse proxy to, set to an empty string to disable proxying when only using auth request")
targetSNI=flag.String("target-sni","","if set, the value of the TLS handshake hostname when forwarding requests to the target")
targetHost=flag.String("target-host","","if set, the value of the Host header when forwarding requests to the target")
targetInsecureSkipVerify=flag.Bool("target-insecure-skip-verify",false,"if true, skips TLS validation for the backend")
healthcheck=flag.Bool("healthcheck",false,"run a health check against Anubis")
useRemoteAddress=flag.Bool("use-remote-address",false,"read the client's IP address from the network request, useful for debugging and running Anubis on bare metal")
debugBenchmarkJS=flag.Bool("debug-benchmark-js",false,"respond to every request with a challenge for benchmarking hashrate")
ogPassthrough=flag.Bool("og-passthrough",true,"enable Open Graph tag passthrough")
ogPassthrough=flag.Bool("og-passthrough",false,"enable Open Graph tag passthrough")
ogTimeToLive=flag.Duration("og-expiry-time",24*time.Hour,"Open Graph tag cache expiration time")
ogCacheConsiderHost=flag.Bool("og-cache-consider-host",false,"enable or disable the use of the host in the Open Graph tag cache")
extractResources=flag.String("extract-resources","","if set, extract the static resources to the specified folder")
webmasterEmail=flag.String("webmaster-email","","if set, displays webmaster's email on the reject page for appeals")
log.Fatalf("failed to generate ed25519 key: %v",err)
}
@@ -273,44 +382,49 @@ func main() {
slog.Warn("REDIRECT_DOMAINS is not set, Anubis will only redirect to the same domain a request is coming from, see https://anubis.techaro.lol/docs/admin/configuration/redirect-domains")
# # The imprint page that will be linked to at the footer of every Anubis page.
# page:
# # The HTML <title> of the page
# title: Imprint and Privacy Policy
# # The HTML contents of the page. The exact contents of this page can
# # and will vary by locale. Please consult with a lawyer if you are not
# # sure what to put here
# body: >-
# <p>Last updated: June 2025</p>
# <h2>Information that is gathered from visitors</h2>
# <p>In common with other websites, log files are stored on the web server saving details such as the visitor's IP address, browser type, referring page and time of visit.</p>
# <p>Cookies may be used to remember visitor preferences when interacting with the website.</p>
# <p>Where registration is required, the visitor's email and a username will be stored on the server.</p>
# <!-- ... -->
# Open Graph passthrough configuration, see here for more information:
At Techaro, we've been working on making Anubis even better, and in the process we want to share what we've done, how it works, and signal boost cool things the community has done. As things happen, we'll blog about them so that you can learn from our struggles.
Today we released [Anubis v1.20.0: Thancred Waters](https://github.com/TecharoHQ/anubis/releases/tag/v1.20.0). This adds a lot of new and exciting features to Anubis, including but not limited to the `WEIGH` action, custom weight thresholds, Imprint/impressum support, and a no-JS challenge. Here's what you need to know so you can protect your websites in new and exciting ways!
{/* truncate */}
## Sponsoring the product
If you rely on Anubis to keep your website safe, please consider sponsoring the project on [GitHub Sponsors](https://github.com/sponsors/Xe) or [Patreon](https://patreon.com/cadey). Funding helps pay hosting bills and offset the time spent on making this project the best it can be. Every little bit helps and when enough money is raised, [I can make Anubis my full-time job](https://github.com/TecharoHQ/anubis/discussions/278).
I am waiting to hear back from NLNet on if Anubis was selected for funding or not. Let's hope it is!
## Deprecation warning: `DIFFICULTY`
Anubis v1.20.0 is the last version to support the `DIFFICULTY` flag in the exact way it currently does. In future versions, this will be ineffectual and you should use the [custom threshold system](/docs/admin/configuration/thresholds) instead.
If this becomes an imposition in practice, this will be reverted.
## Chrome won't show "invalid response" after "Success!"
There were a bunch of smaller fixes in Anubis this time around, but the biggest one was finally squashing the ["invalid response" after "Success!" issue](https://github.com/TecharoHQ/anubis/issues/564) that had been plaguing Chrome users. This was a really annoying issue to track down but it was discovered while we were working on better end-to-end / functional testing: [Chrome randomizes the `Accept-Language` header](https://github.com/explainers-by-googlers/reduce-accept-language) so that websites can't do fingerprinting as easily.
When Anubis issues a challenge, it grabs [information that the browser sends to the user](/docs/design/how-anubis-works#challenge-format) to create a challenge string. Anubis doesn't store these challenge strings anywhere, and when a solution is being checked it calculates the challenge string from the request. This means that they'd get a challenge on one end, compute the response for that challenge, and then the server would validate that against a different challenge. This server-side validation would fail, leading to the user seeing "invalid response" after the client reported success.
I suspect this was why Vanadium and Cromite were having sporadic issues as well.
## New Features
The biggest feature in Anubis is the "weight" subsystem. This allows administrators to make custom rules that change the suspicion level of a request without having to take immediate action. As an example, consider the self-hostable git forge [Gitea](https://about.gitea.com/). When you load a page in Gitea, it creates a session cookie that your browser sends with every request. Weight allows you to mark a request that includes a Gitea session token as _less_ suspicious:
```yaml
- name: gitea-session-token
action: WEIGH
expression:
all:
# Check if the request has a Cookie header
- '"Cookie" in headers'
# Check if the request's Cookie header contains the Gitea session token
- headers["Cookie"].contains("i_love_gitea=")
# Remove 5 weight points
weight:
adjust: -5
```
This is different from the past where you could only allow every request with a Gitea session token, meaning that the invention of lying would allow malicious clients to bypass protection.
Weight is added and removed whenever a `WEIGH` rule is encountered. When all rules are processed and the request doesn't match any `ALLOW`, `CHALLENGE`, or `DENY` rules, Anubis uses [weight thresholds](/docs/admin/configuration/thresholds) to figure out how to handle that request. Thresholds are defined in the [policy file](/docs/admin/policies) alongside your bot rules:
```yaml
thresholds:
- name: minimal-suspicion # This client is likely fine, its soul is lighter than a feather
expression: weight <= 0 # a feather weighs zero units
action: ALLOW # Allow the traffic through
# For clients that had some weight reduced through custom rules, give them a
If you don't have thresholds defined in your Anubis policy file, Anubis will default to the "legacy" behaviour where browser-like clients get a challenge at the default difficulty.
:::
This lets most clients through if they pass a simple [proof of work challenge](/docs/admin/configuration/challenges/proof-of-work), but any clients that are less suspicious (like ones with a Gitea session token) are given the lightweight [Meta Refresh](/docs/admin/configuration/challenges/metarefresh) challenge instead.
Threshold expressions are like [Bot rule expressions](/docs/admin/configuration/expressions), but there's only one input: the request's weight. If no thresholds match, the request is allowed through.
### Imprint/Impressum Support
European countries like Germany [require an imprint/impressum](https://www.ionos.com/digitalguide/websites/digital-law/a-case-for-thinking-global-germanys-impressum-laws/) to be present in the footer of their website. This allows users to contact someone on the team behind a website in case they run into issues. This also must generally have a separate page where users can view an extended imprint with other information like a privacy policy or a copyright notice.
Anubis v1.20.0 and later [has support for showing imprints](/docs/admin/configuration/impressum). You can configure two kinds of imprints:
1. An imprint that is shown in the footer of every Anubis page.
2. An extended imprint / privacy policy that is shown when users click on the "Imprint" link. For example, [here's the imprint for the website you're looking at right now](https://anubis.techaro.lol/.within.website/x/cmd/anubis/api/imprint).
Imprints are configured in [the policy file](/docs/admin/policies/):
```yaml
impressum:
# Displayed at the bottom of every page rendered by Anubis.
footer: >-
This website is hosted by Zombocom. If you have any complaints or notes
about the service, please contact
<a href="mailto:contact@zombocom.example">contact@zombocom.example</a> and
we will assist you as soon as possible.
# The imprint page that will be linked to at the footer of every Anubis page.
page:
# The HTML <title> of the page
title: Imprint and Privacy Policy
# The HTML contents of the page. The exact contents of this page can
# and will vary by locale. Please consult with a lawyer if you are not
# sure what to put here.
body: >-
<p>Last updated: June 2025</p>
<h2>Information that is gathered from visitors</h2>
<p>In common with other websites, log files are stored on the web server
saving details such as the visitor's IP address, browser type, referring
page and time of visit.</p>
<p>Cookies may be used to remember visitor preferences when interacting
with the website.</p>
<p>Where registration is required, the visitor's email and a username
will be stored on the server.</p>
<!-- ... -->
```
If this is insufficient, please [file an issue](https://github.com/TecharoHQ/anubis/issues/new) with a link to the relevant legislation for your country so that this feature can be amended and improved.
### No-JS Challenge
One of the first issues in Anubis before it was moved to the [TecharoHQ org](https://github.com/TecharoHQ) was a request [to support challenging browsers without using JavaScript](https://github.com/Xe/x/issues/651). This is a pretty challenging thing to do without rethinking how Anubis works from a fundamentally low level, and with v1.20.0, [Anubis finally has support for running without client-side JavaScript](https://github.com/TecharoHQ/anubis/issues/95) thanks to the [Meta Refresh](/docs/admin/configuration/challenges/metarefresh) challenge.
When Anubis decides it needs to send a challenge to your browser, it sends a challenge page. Historically, this challenge page is [an HTML template](https://github.com/TecharoHQ/anubis/blob/main/web/index.templ) that kicks off some JavaScript, reads the challenge information out of the page body, and then solves it as fast as possible in order to let users see the website they want to visit.
In v1.20.0, Anubis has a challenge registry to hold [different client challenge implementations](/docs/admin/configuration/challenges/). This allows us to implement anything we want as long as it can render a page to show a challenge and then check if the result is correct. This is going to be used to implement a WebAssembly-based proof of work option (one that will be way more efficient than the existing browser JS version), but as a proof of concept I implemented a simple challenge using [HTML `<meta refresh>`](https://en.wikipedia.org/wiki/Meta_refresh).
In my testing, this has worked with every browser I have thrown it at (including CLI browsers, the browser embedded in emacs, etc.). The default configuration of Anubis does use the [meta refresh challenge](/docs/admin/configuration/challenges/metarefresh) for [clients with a very low suspicion](/docs/admin/configuration/thresholds), but by default clients will be sent an [easy proof of work challenge](/docs/admin/configuration/challenges/proof-of-work).
If the false positive rate of this challenge turns out to not be very high in practice, the meta refresh challenge will be enabled by default for browsers in future versions of Anubis.
### `robots2policy`
Anubis was created because crawler bots don't respect [`robots.txt` files](https://www.robotstxt.org/). Administrators have been working on refining and crafting their `robots.txt` files for years, and one common comment is that people don't know where to start crafting their own rules.
Anubis now ships with a [`robots2policy` tool](/docs/admin/robots2policy) that lets you convert your `robots.txt` file to an Anubis policy.
If you installed Anubis from [an OS package](/docs/admin/native-install), you may need to run `anubis-robots2policy` instead of `robots2policy`.
:::
We hope that this will help you get started with Anubis faster. We are working on a version of this that will run in the documentation via WebAssembly.
### Open Graph configuration is being moved to the policy file
Anubis supports reading [Open Graph tags](/docs/admin/configuration/open-graph) from target services and returning them in challenge pages. This makes the right metadata show up when linking services protected by Anubis in chat applications or on social media.
In order to test the migration of all of the configuration to the policy file, Open Graph configuration has been moved to the policy file. For more information, please read [the Open Graph configuration options](/docs/admin/configuration/open-graph#configuration-options).
You can also set default Open Graph tags:
```yaml
openGraph:
enabled: true
ttl: 24h
# If set, return these opengraph values instead of looking them up with
# the target service.
#
# Correlates to properties in https://ogp.me/
override:
# og:title is required, it is the title of the website
"og:title": "Techaro Anubis"
"og:description": >-
Anubis is a Web AI Firewall Utility that helps you fight the bots
away so that you can maintain uptime at work!
"description": >-
Anubis is a Web AI Firewall Utility that helps you fight the bots
away so that you can maintain uptime at work!
```
## Improvements and optimizations
One of the biggest improvements we've made in v1.20.0 is replacing [SHA-256 with xxhash](https://github.com/TecharoHQ/anubis/pull/676). Anubis uses hashes all over the place to help with identifying clients, matching against rules when allowing traffic through, in error messages sent to users, and more. Historically these have been done with [SHA-256](https://en.wikipedia.org/wiki/SHA-2), however this has been having a mild performance impact in real-world use. As a result, we now use [xxhash](https://xxhash.com/) when possible. This makes policy matching 3x faster in some scenarios and reduces memory usage across the board.
Anubis now uses [bart](https://pkg.go.dev/github.com/gaissmai/bart) for doing IP address matching when you specify addresses in a `remote_address` check configuration or when you are matching against [advanced checks](/docs/admin/thoth). This uses the same kind of IP address routing configuration that your OS kernel does, making it very fast to query information about IP addresses. This makes IP address range matches anywhere from 3-14 times faster depending on the number of addresses it needs to match against. For more information and benchmarks, check out [@JasonLovesDoggo](https://github.com/JasonLovesDoggo)'s PR: [perf: replace cidranger with bart for significant performance improvements #675](https://github.com/TecharoHQ/anubis/pull/675).
## What's up next?
v1.21.0 is already shaping up to be a massive improvement as Anubis adds [internationalization](https://en.wikipedia.org/wiki/Internationalization) support, allowing your users to see its messages in the language they're most comfortable with.
So far Anubis supports the following languages:
- English (Simplified and Traditional)
- French
- Portugese (Brazil)
- Spanish
If you want to contribute translations, please [file an issue](https://github.com/TecharoHQ/anubis/issues/new) with your language of choice or submit a pull request to [the `lib/localization/locales` folder](https://github.com/TecharoHQ/anubis/tree/main/lib/localization/locales). We are about to introduce features to the translation stack, so you may want to hold off a hot minute, but we welcome any and all contributions to making Anubis useful to a global audience.
Other things we plan to do:
- Move configuration to the policy file
- Support reloading the policy file at runtime without having to restart Anubis
- Detecting if a client is "brand new"
- A [Valkey](https://valkey.io/)-backed store for sharing information between instances of Anubis
- Augmenting No-JS support in the paid product
- TLS fingerprinting
- Automated testing improvements in CI (FreeBSD CI support, better automated integration/functional testing, etc.)
## Conclusion
I hope that these features let you get the same Anubis power you've come to know and love and increases the things you can do with it! I've been really excited to ship [thresholds](/docs/admin/configuration/thresholds) and the cloud-based services for Anubis.
If you run into any problems, please [file an issue](https://github.com/TecharoHQ/anubis/issues/new). Otherwise, have a good day and get back to making your communities great.
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.