I have just dodged another crisis.
Seemingly PetalBot makes requests at accelerating speed. Then we hit
a problem where I may not be able to kill nginx
root@helos ~ # systemctl stop nginx
Failed to allocate directory watch: Too many open files
After that,
root@helos ~ # ps axf | grep nginx
804800 pts/2 S+ 0:00 \_ grep nginx
so it may not be running any more, but repeating the nginx shutdown shows
the same warning.
And then, a few minutes after that, helos starts running again.
I added
# case sensitive matching
if ($http_user_agent ~ (PetalBot)) {
return 403;
}
to
/etc/nginx/sites-available/collec.repec.org
That may keep them out. Then killing the containers (????)
root@helos ~ # killall -9 containerd-shim-runc-v2
and
root@helos ~ # systemctl start nginx
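
As a rough illustration of what the user-agent rule above matches, here is a minimal Python sketch. It assumes that Python's re module behaves like nginx's regex matching for this simple pattern. The helper name is_blocked and the lowercase sample agent are hypothetical; the long user agent is the one PetalBot sent according to the access log.

```python
import re

# Pattern taken from the nginx rule; "~" in nginx is a case-sensitive match.
PATTERN = re.compile(r"(PetalBot)")

def is_blocked(user_agent: str) -> bool:
    """Return True if the rule would answer 403 for this user agent (sketch)."""
    return PATTERN.search(user_agent) is not None

# User agent as recorded in the access log.
logged = (
    "Mozilla/5.0 (Linux; Android 7.0;) AppleWebKit/537.36 (KHTML, like Gecko) "
    "Mobile Safari/537.36 (compatible; PetalBot;"
    "+https://webmaster.petalsearch.com/site/petalbot)"
)

# Hypothetical agent that spells the name in lowercase only.
lowercase_only = "petalbot/1.0"

print(is_blocked(logged))          # matched because of "PetalBot"
print(is_blocked(lowercase_only))  # not matched, the rule is case sensitive
```

If a lowercase variant ever shows up, nginx's "~*" operator would match case-insensitively instead.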
--
Written by Thomas Krichel http://openlib.org/home/krichel on his 21163rd day.
hi PetalBot gang,
https://collec.repec.org is a shiny app. Recently it has been
overloading the machine. I suspect it is your requests. From what I
understand, at my last overload, about 90 minutes ago, you made
requests less than 3 seconds apart, but my robots.txt had
Crawl-delay: 5. I have now set it to Disallow: / in an act of despair.
Can you please respect the directives in robots.txt?
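
The comparison between the crawl delay and the observed request spacing can be sketched in Python as follows. This is only an illustration: the "User-agent: *" line is an assumption (the message only quotes "Crawl-delay: 5"), the timestamps are hypothetical, and the helper names crawl_delay_for and short_gaps are made up for this sketch.

```python
from datetime import datetime
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content; only "Crawl-delay: 5" comes from the message.
ROBOTS_TXT = ["User-agent: *", "Crawl-delay: 5"]

def crawl_delay_for(agent, robots_lines):
    """Return the crawl delay in seconds that robots.txt asks of this agent."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.crawl_delay(agent)

def short_gaps(timestamps, delay):
    """Return the gaps (in seconds) between consecutive requests below delay."""
    times = sorted(
        datetime.strptime(t, "%d/%b/%Y:%H:%M:%S %z") for t in timestamps
    )
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    return [g for g in gaps if g < delay]

# Hypothetical request times in nginx access-log format.
SAMPLE = [
    "12/May/2023:11:04:22 +0000",
    "12/May/2023:11:04:24 +0000",
    "12/May/2023:11:04:31 +0000",
]

delay = crawl_delay_for("PetalBot", ROBOTS_TXT)
violations = short_gaps(SAMPLE, delay)
print(delay, violations)
```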
--
Written by Thomas Krichel http://openlib.org/home/krichel on his 21163rd day.
Yes, if their platform becomes even moderately successful, the load will likely exceed what their shiny app can handle. I warned them in the call about it, but they did not seem particularly interested.
It could be that Posit’s managed shiny server costing thousands of dollars a year performs better than the free ShinyProxy middleware that we use for CollEc. Yet, even with the paid service the app is unlikely to meet the requirements and come anywhere close to a more professional setup.
I understand why people use shiny apps. I use them myself. If you have a data science background, it is way easier to build an app with shiny than it is to build it with NodeJS. Shiny apps are a convenient way of letting users explore a data set or a method. And in an environment with frequent staff turnover where people usually do not have a web development background, like at an economics department, maintenance is easier to ensure with a shiny app than with a more complex structure.
I built an entire teaching platform in shiny and am fed up with that tool. Its poor performance, hidden reactivity layer, and limited capabilities make it annoying to work with in apps beyond simple data or method illustrations. So, my new colleague and I are currently transitioning to a React/ Next.js/ Deno/ Redis/ PostgreSQL stack. Doing that besides research, teaching, R package development, software development for the institute and university, other institute duties, and personal affairs means that it takes at least a few months until I will have time to rewrite CollEc.
I presume it is fine to wait until then, as CollEc appears not to be very popular anyway.
Christian Düben
Doctoral Candidate
Chair of Macroeconomics
Hamburg University
Germany
christian.dueben(a)uni-hamburg.de
https://www.christian-dueben.com
From: Christian Zimmermann <chuichuiche(a)gmail.com>
Sent: Friday, 12 May 2023 13:53
To: Düben, Christian <christian.dueben(a)uni-hamburg.de>
Subject: Re: [CollEc] Helos down
This is the kind of problem I foresee for the Banque de France site...
Christian Zimmermann
On Fri, May 12, 2023 at 6:52 AM Düben, Christian <christian.dueben(a)uni-hamburg.de> wrote:
There had been a bunch of exited Docker containers. I cleared them. CollEc should, in the medium term, move away from shiny apps. It is quite possible that the current system does not support request bursts from bots.
I currently do not have time for that, but I can schedule it for the end of this year.
Christian Düben
Doctoral Candidate
Chair of Macroeconomics
Hamburg University
Germany
christian.dueben(a)uni-hamburg.de
https://www.christian-dueben.com
-----Original Message-----
From: CollEc-run <collec-run-bounces(a)lists.openlib.org> On Behalf Of Thomas Krichel
Sent: Friday, 12 May 2023 13:22
To: CollEc Run <collec-run(a)lists.openlib.org>
Subject: Re: [CollEc] Helos down
Thomas Krichel writes
> I can ping it, but not more than that. I can't read email while this
> goes on, but you can write to me at editors(a)nep.repec.org.
It has been up since about 7:50 UTC. Cezar rebooted. It was out of
memory. I think if I had a root window open
There are log entries for the oom killer
root@helos /var/log # grep oom-killer kern.log
May 12 03:18:06 helos kernel: [4449730.455246] systemd invoked oom-killer: gfp_mask=0x1100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
May 12 04:20:22 helos kernel: [4459782.217108] systemd invoked oom-killer: gfp_mask=0x1100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
May 12 06:53:10 helos kernel: [4469843.887742] mutt invoked oom-killer: gfp_mask=0x1100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
Today's entries in the web access log include 4716 from PetalBot,
doing stuff like
114.119.145.116 - - [12/May/2023:11:04:22 +0000] "GET /app_direct/collec_app?_inputs_&navbars=%22tab_Coauthors%22&_values_&g_author=%22ppa246%22 HTTP/1.1" 301 162 "https://ideas.repec.org/f/ppa246.html" "Mozilla/5.0 (Linux; Android 7.0;) AppleWebKit/537.36 (KHTML, like Gecko) Mobile Safari/537.36 (compatible; PetalBot;+https://webmaster.petalsearch.com/site/petalbot)"
I suspect that PetalBot made too many requests.
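
A count like the one above can be reproduced with a short Python sketch. It assumes the access log uses nginx's combined format, where the user agent is the last quoted field. The sample lines are hypothetical (modelled on the entry quoted above, with shortened user agents and documentation-range IP addresses), and the helper name count_bot_hits is made up for this sketch.

```python
import re
from collections import Counter

# In the combined log format the user agent is the last quoted field.
UA_RE = re.compile(r'"([^"]*)"\s*$')

def count_bot_hits(lines, marker="PetalBot"):
    """Count lines whose user agent contains marker, grouped by client IP."""
    hits = Counter()
    for line in lines:
        match = UA_RE.search(line)
        if match and marker in match.group(1):
            client_ip = line.split(" ", 1)[0]
            hits[client_ip] += 1
    return hits

# Hypothetical sample entries, for illustration only.
SAMPLE = [
    '114.119.145.116 - - [12/May/2023:11:04:22 +0000] '
    '"GET /app_direct/collec_app HTTP/1.1" 301 162 '
    '"https://ideas.repec.org/f/ppa246.html" '
    '"Mozilla/5.0 (compatible; PetalBot;+https://webmaster.petalsearch.com/site/petalbot)"',
    '192.0.2.10 - - [12/May/2023:11:04:25 +0000] '
    '"GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; PetalBot)"',
    '192.0.2.20 - - [12/May/2023:11:04:30 +0000] '
    '"GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

hits = count_bot_hits(SAMPLE)
print(sum(hits.values()), dict(hits))
```

Grouping by client IP shows whether the load comes from one crawler address or from many.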
--
Written by Thomas Krichel http://openlib.org/home/krichel on his 21161st day.
_______________________________________________
CollEc-run mailing list
CollEc-run(a)lists.openlib.org
http://lists.openlib.org/cgi-bin/mailman/listinfo/collec-run
I can ping it, but not more than that. I can't read email while this goes
on, but you can write to me at editors(a)nep.repec.org.
--
Written by Thomas Krichel http://openlib.org/home/krichel on his 21161st day.