mirror of
https://github.com/TecharoHQ/anubis.git
synced 2026-05-07 07:42:42 +00:00
c460169047
Most of the worst of the worst scrapers run Headless Chrome. Headless Chrome is difficult for Anubis to combat because it follows all the rules that browsers do. The worst of the worst scrapers also use residential proxy services. Those residential proxy services charge upwards of $1 per GB of data egressed or ingressed.

The Prompt API makes Chrome download a 4 GiB or 16 GiB machine learning model. When you ask it to start downloading, it will _continue_ downloading even when you leave the Anubis challenge page.

This change will make the local model answer "why is the sky blue?" in an absurd amount of detail, which wastes both bandwidth and scraper CPU (some scraping companies charge for Chrome CPU time too).

Signed-off-by: Xe Iaso <me@xeiaso.net>
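
For a rough sense of scale, the bandwidth cost to a scraper can be estimated from the figures above. This is only a back-of-the-envelope sketch: `proxyCostUSD` is a hypothetical helper, not part of Anubis. It assumes the model sizes are binary gibibytes, that the proxy bills per decimal gigabyte, and that the rate is exactly the $1 per GB floor quoted above (real rates may be higher).

```typescript
// Hypothetical helper, not part of Anubis: estimates what one forced model
// download costs a scraper that routes traffic through a metered proxy.
const BYTES_PER_GIB = 1024 ** 3; // assumption: model sizes are in GiB
const BYTES_PER_GB = 1000 ** 3; // assumption: the proxy bills per decimal GB

function proxyCostUSD(modelGiB: number, usdPerGB: number): number {
  // Convert the download size to billable gigabytes, then apply the rate.
  const gigabytes = (modelGiB * BYTES_PER_GIB) / BYTES_PER_GB;
  return gigabytes * usdPerGB;
}

// The two model sizes mentioned above, at the $1 per GB lower bound.
for (const sizeGiB of [4, 16]) {
  const cost = proxyCostUSD(sizeGiB, 1).toFixed(2);
  console.log(`${sizeGiB} GiB model: about $${cost} per download`);
}
// prints "4 GiB model: about $4.29 per download"
// prints "16 GiB model: about $17.18 per download"
```

Under those assumptions each triggered download costs the scraper a few dollars at minimum, before counting any CPU charges.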