-
OK, sorry, I missed that (or maybe I saw it in passing, but the past 6 months have been quite crazy for me, so I possibly forgot/overlooked it). However, isn't that a bit drastic? What about just updating robots.txt, as the EFF suggested recently, to block the AI engines? Or even disallowing everything, and then allowing certain 'useful' robots/crawlers back in?
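For example, something like this (a rough sketch on my part; the user-agent tokens are the published ones for the major AI crawlers: GPTBot for OpenAI, Google-Extended for Google's AI training, CCBot for Common Crawl, FacebookBot for Meta):

    # Block the known AI-training crawlers outright
    User-agent: GPTBot
    User-agent: Google-Extended
    User-agent: CCBot
    User-agent: FacebookBot
    Disallow: /

    # ...or instead deny everyone by default...
    User-agent: *
    Disallow: /

    # ...and opt specific 'useful' crawlers back in
    # (a crawler obeys the most specific group that matches it,
    # so this group overrides the * group for Googlebot)
    User-agent: Googlebot
    Disallow: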
Sorry if these are naive questions. I do understand the cost implications, particularly with the race to train, and I fully support your efforts to deal with it, as well as your maintenance of the entire forum :)
-
There are a lot of scrapers and the like here, and very few respect robots.txt (Bing doesn't respect the Crawl-delay part, so it hammered us and cost us extra bandwidth).
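(For reference, the crawl-rate control is the Crawl-delay line in robots.txt, sketched below. It's purely advisory, which is the problem: nothing forces a crawler to honor it.)

    User-agent: bingbot
    Crawl-delay: 10    # ask for ~10 seconds between requests; advisory only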
If I try more aggressive things, like blocking non-browser user agents, then I block people too, because the tools are blunt... and I don't want to block people.
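To give a sense of how blunt those tools are, here's roughly what a user-agent filter looks like (a hypothetical nginx sketch, not our actual config):

    # Inside a server { } block: reject anything not claiming to be a browser.
    if ($http_user_agent !~* "(mozilla|chrome|safari|firefox)") {
        return 403;  # also blocks curl, RSS readers, link unfurlers, old browsers...
    }

And a scraper can simply send a browser-like User-Agent string anyway, so a filter like this mostly penalizes honest non-browser clients.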
I did survey and ask at the time... and perhaps there's less scraping now; perhaps it was an early-2023 thing? I no longer have the visibility, given that they're blocked.
That question is really: do all of the photographers in the thread want all of their photos to be Creative Commons licensed, scraped by all of the AI companies, and then used to generate new content with no credit given back to you?
Because that is why the defaults changed for the whole forum... scraping by OpenAI, Microsoft, Google, Facebook... all increased significantly, with real bandwidth costs, as the race to train AI demands a constant supply of fresh content.