I have just dodged another crisis. PetalBot seems to be making requests at an accelerating rate. Then we hit a problem where I may not be able to stop nginx:

  root@helos ~ # systemctl stop nginx
  Failed to allocate directory watch: Too many open files

After that,

  root@helos ~ # ps axf | grep nginx
   804800 pts/2    S+     0:00          \_ grep nginx

so it may not be running any more, but repeating the nginx shutdown shows the same warning. And then, a few minutes after that, helos starts running again.

I added

  # case sensitive matching
  if ($http_user_agent ~ (PetalBot)) {
    return 403;
  }

to /etc/nginx/sites-available/collec.repec.org. That may keep them out.

Then killing the containers (????)

  root@helos ~ # killall -9 containerd-shim-runc-v2

and

  root@helos ~ # systemctl start nginx

--
Written by Thomas Krichel http://openlib.org/home/krichel on his 21163rd day.
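A note on the `if ($http_user_agent ~ (PetalBot))` rule: in nginx, `~` is a case-sensitive regular-expression match and `~*` is the case-insensitive variant. The Python sketch below only mirrors that distinction so it can be checked offline; it is not how nginx evaluates the rule internally. The user-agent string is copied from the access-log line quoted later in this thread, and the function name `would_block` is made up for illustration.

```python
import re

# User-agent string as it appears in the access-log line quoted later in the thread.
LOGGED_AGENT = (
    "Mozilla/5.0 (compatible;PetalBot;"
    "+https://webmaster.petalsearch.com/site/petalbot)"
)


def would_block(user_agent, pattern="PetalBot", ignore_case=False):
    """Return True if the pattern occurs anywhere in the user-agent string.

    ignore_case=False mirrors nginx's `~` operator,
    ignore_case=True mirrors `~*`.
    """
    flags = re.IGNORECASE if ignore_case else 0
    return re.search(pattern, user_agent, flags) is not None


if __name__ == "__main__":
    # The logged agent contains "PetalBot" with exactly this capitalisation.
    print(would_block(LOGGED_AGENT))
    # A differently capitalised agent would slip past the case-sensitive rule.
    print(would_block(LOGGED_AGENT.lower()))
    print(would_block(LOGGED_AGENT.lower(), ignore_case=True))
```

With the agent string seen in the log, the case-sensitive form is enough. Switching to `~*` would only matter if the bot changed the capitalisation of its name.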
Instead of querying CollEc in one user session, PetalBot opens lots of different user sessions, each coming with its own Docker container. I am now working on upgrading the middleware and adding a cron job that removes crashed Docker containers every few minutes. The cron job is not an elegant fix, but could stop Helos from crashing due to too many containers.

Christian Düben
Doctoral Candidate
Chair of Macroeconomics
Hamburg University
Germany
christian.dueben@uni-hamburg.de
https://www.christian-dueben.com

-----Original Message-----
From: CollEc-run <collec-run-bounces@lists.openlib.org> On Behalf Of Thomas Krichel
Sent: Sunday, 14 May 2023 15:39
To: CollEc Run <collec-run@lists.openlib.org>
Subject: [CollEc] PetalBot

_______________________________________________
CollEc-run mailing list
CollEc-run@lists.openlib.org
http://lists.openlib.org/cgi-bin/mailman/listinfo/collec-run
Düben, Christian writes

> Instead of querying CollEc in one user session, PetalBot opens lots of different user sessions, each coming with its own Docker container.

It's what you expect from a bot.

> I am now working on upgrading the middleware and adding a cron job that removes crashed Docker containers every few minutes. The cron job is not an elegant fix, but could stop Helos from crashing due to too many containers.

Thank you!!! I am out hiking tomorrow, leaving at 5:13 and back at 14:00 or so, in UTC+7.

--
Written by Thomas Krichel http://openlib.org/home/krichel on his 21163rd day.
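The cron job mentioned above is not shown anywhere in the thread. As a rough sketch of what such a cleanup could look like, the Python below lists all containers with `docker ps -a`, picks those whose state is `exited` or `dead`, and passes them to `docker rm`. Treating those two states as "crashed" is an assumption, as are the function names; the job actually deployed on Helos may work quite differently.

```python
import subprocess

# Assumption: "crashed" is taken to mean these Docker container states.
REMOVABLE_STATES = {"exited", "dead"}


def removable_ids(listing):
    """Pick container IDs in a removable state from '<id> <state>' lines."""
    ids = []
    for line in listing.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1] in REMOVABLE_STATES:
            ids.append(parts[0])
    return ids


def cleanup():
    """List all containers and remove those that are no longer running."""
    listing = subprocess.run(
        ["docker", "ps", "-a", "--format", "{{.ID}} {{.State}}"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    ids = removable_ids(listing)
    if ids:
        subprocess.run(["docker", "rm", *ids], check=True)
    return ids
```

A small script that calls `cleanup()` could then be run from cron every few minutes. Keeping the selection logic in `removable_ids` makes it possible to check what would be removed without touching Docker.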
Block it before it even opens sessions.

Christian Zimmermann
User-agent: PetalBot
Disallow: /

--
Lars Vilhuber, Economist
Cornell University
p: +1.607-330-5743
https://calendly.com/larsvilhuber
My working day may not be your working day. Please respond during your working day.

________________________________
From: CollEc-run <collec-run-bounces@lists.openlib.org> on behalf of Christian Zimmermann <chuichuiche@gmail.com>
Sent: Sunday, May 14, 2023 11:40
To: Thomas Krichel <krichel@openlib.org>
Cc: CollEc Run <collec-run@lists.openlib.org>
Subject: Re: [CollEc] PetalBot
I would prefer to block this at the web server level. However, I have now added a maximum app container lifetime through the middleware. The previous system supposedly only removed containers of users that left the site. The new system also removes actively used containers. Let us see how well that works. If that is insufficient, I will set up the mentioned cron job. Following tonight’s update, the app should work again.

Christian Düben
Doctoral Candidate
Chair of Macroeconomics
Hamburg University
Germany
christian.dueben@uni-hamburg.de
https://www.christian-dueben.com

From: CollEc-run <collec-run-bounces@lists.openlib.org> On Behalf Of Lars Vilhuber
Sent: Sunday, 14 May 2023 19:28
To: Christian Zimmermann <chuichuiche@gmail.com>; Thomas Krichel <krichel@openlib.org>
Cc: CollEc Run <collec-run@lists.openlib.org>
Subject: Re: [CollEc] PetalBot
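The difference between the old and the new removal policy described above can be written down in a few lines. This is a sketch of the policy only: the thread does not name the middleware or its settings, so the `Container` record, the `to_remove` function and the one-hour limit used in the example are all invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Container:
    cid: str            # container identifier (hypothetical)
    started: datetime   # when the container was started
    user_present: bool  # whether the user is still on the site


def to_remove(containers, now, max_lifetime, enforce_lifetime=True):
    """Return the IDs of containers that the policy would remove.

    enforce_lifetime=False models the previous behaviour: only containers
    whose user has left are removed. enforce_lifetime=True additionally
    removes containers older than max_lifetime, even if still in use.
    """
    doomed = []
    for c in containers:
        too_old = now - c.started > max_lifetime
        if not c.user_present or (enforce_lifetime and too_old):
            doomed.append(c.cid)
    return doomed


if __name__ == "__main__":
    now = datetime(2023, 5, 14, 20, 0, tzinfo=timezone.utc)
    fleet = [
        Container("a", now - timedelta(minutes=5), True),
        Container("b", now - timedelta(hours=3), True),
        Container("c", now - timedelta(minutes=5), False),
    ]
    limit = timedelta(hours=1)  # assumed value, not from the thread
    print(to_remove(fleet, now, limit, enforce_lifetime=False))
    print(to_remove(fleet, now, limit))
```

Under the old policy a bot that keeps its sessions open never has its containers reclaimed. A lifetime cap bounds how long each one can survive regardless of activity.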
Düben, Christian writes

> I would prefer to block this at the web server level.

Agreed.

> However, I have now added a maximum app container lifetime through the middleware. The previous system supposedly only removed containers of users that left the site. The new system also removes actively used containers. Let us see how well that works.

At this time

  root@helos ~ # bin/petal_watch
  one hit every 3 seconds

it seems to work well. Kudos and thanks.

--
Written by Thomas Krichel http://openlib.org/home/krichel on his 21164th day.
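The `bin/petal_watch` script itself is not shown in the thread. A figure like "one hit every 3 seconds" could be obtained by averaging the gaps between PetalBot entries in the nginx access log, and the sketch below does that in Python. It assumes the log uses the timestamp layout of the log line quoted elsewhere in this thread (`[14/May/2023:21:24:42 +0000]`); the function names are hypothetical.

```python
import re
from datetime import datetime

# Matches the bracketed timestamp of an access-log entry.
TIMESTAMP = re.compile(r"\[([^\]]+)\]")
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def petalbot_times(lines):
    """Collect the timestamps of all log lines that mention PetalBot."""
    times = []
    for line in lines:
        if "PetalBot" not in line:
            continue
        match = TIMESTAMP.search(line)
        if match:
            times.append(datetime.strptime(match.group(1), TIME_FORMAT))
    return times


def seconds_per_hit(lines):
    """Average gap in seconds between PetalBot hits, or None if fewer than two."""
    times = sorted(petalbot_times(lines))
    if len(times) < 2:
        return None
    span = (times[-1] - times[0]).total_seconds()
    return span / (len(times) - 1)
```

Feeding it the last few hundred lines of the access log at intervals would show whether the rate is falling after a change to the blocking rules.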
Thomas Krichel writes

>   root@helos ~ # bin/petal_watch
>   one hit every 3 seconds

Now we have no hits. I am returning to a blank response in robots.txt.

--
Written by Thomas Krichel http://openlib.org/home/krichel on his 21164th day.
Lars Vilhuber writes

> User-agent: PetalBot
> Disallow: /

This is done, but it does not help.

--
Written by Thomas Krichel http://openlib.org/home/krichel on his 21163rd day.
Thomas Krichel writes

> Lars Vilhuber writes
>
>> User-agent: PetalBot
>> Disallow: /
>
> This is done, but it does not help.

And they just read the location

  114.119.151.113 - - [14/May/2023:21:24:42 +0000] "GET /robots.txt HTTP/1.1" 301 162 "-" "Mozilla/5.0 (compatible;PetalBot;+https://webmaster.petalsearch.com/site/petalbot)"

but don't stop.

--
Written by Thomas Krichel http://openlib.org/home/krichel on his 21163rd day.
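One detail in the log line above may be worth a second look: the status field is 301, which is a redirect ("Moved Permanently"), not a 200. That particular request was therefore answered with a redirect rather than with the robots.txt rules themselves. Whether the bot then followed the redirect cannot be seen from this single line. The sketch below pulls the fields out of the quoted line in Python, assuming nginx's usual combined log layout; the regular expression and the `parse` helper are written for this example only.

```python
import re

# Field layout assumed to be nginx's "combined" access-log format.
COMBINED = re.compile(
    r'(?P<addr>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse(line):
    """Split one access-log line into named fields, or return None."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None


# The line quoted in the message above, unchanged.
LINE = (
    '114.119.151.113 - - [14/May/2023:21:24:42 +0000] '
    '"GET /robots.txt HTTP/1.1" 301 162 "-" '
    '"Mozilla/5.0 (compatible;PetalBot;'
    '+https://webmaster.petalsearch.com/site/petalbot)"'
)

entry = parse(LINE)

if __name__ == "__main__":
    print(entry["path"], entry["status"], entry["size"])
```

Checking the log for a follow-up request from the same address that received a 200 for robots.txt would show whether the rules ever reached the crawler.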
participants (4)
- Christian Zimmermann
- Düben, Christian
- Lars Vilhuber
- Thomas Krichel