I did not find a way to add a robots.txt at the application level. We could add such a file at the web server level though.

Christian Düben
Doctoral Candidate
Chair of Macroeconomics
Hamburg University
Germany
christian.dueben@uni-hamburg.de
http://www.christian-dueben.com

-----Original Message-----
From: Christian Zimmermann <zimmermann@stlouisfed.org>
Sent: Sunday, 25 July 2021 15:25
To: Thomas Krichel <krichel@openlib.org>
Cc: Düben, Christian <Christian.Dueben@uni-hamburg.de>; CollEc Run <collec-run@lists.openlib.org>
Subject: Re: [CollEc] Helos offline

I see there is no http://collec.repec.org/robots.txt...

Christian Zimmermann FIGUGEGL!
Economic Research
Federal Reserve Bank of St. Louis
P.O. Box 442
St. Louis MO 63166-0442 USA
https://ideas.repec.org/zimm/
@CZimm_economist

On Sun, 25 Jul 2021, Thomas Krichel wrote:
Düben, Christian writes
At the beginning of June, I installed a script that records the times CollEc was accessed - no other variable, just the access time. When plotting the results aggregated by day, you can see that the number of daily app visits tends to fluctuate around 1,000 (see Subset.pdf). However, yesterday it surged to almost 30,000 (see Full_Period.pdf). Monit just notified me at 9:30 am today that the app was offline. So, I do not know whether that is related to the server issue. But tons of machines firing requests at port 80 on one day and the server becoming inaccessible on the next appears to be an odd coincidence.
Well, if you just log the times, how can you claim it's "tons of machines"? I did go through the apache log, and the surge does indeed come from a bunch of servers from Huawei's petalsearch. The requests look legit. I'm sure they use reasonable defaults. It's just that the shiny app is slow.
Apache kept saying 503 but kept logging, so it was still up. The odd thing is that we could not get through on ssh. Since that is our only route to the server, we are stuck and have to ask Cezar.
There is a change from 502 to 503:
114.119.158.156 - - [24/Jul/2021:09:04:30 +0200] "GET /app_direct/collec_app/?_inputs_&navbars=%22tab_Coauthors%22&_values_&g_author=%22pel60%22 HTTP/1.1" 502 646 "-" "Mozilla/5.0 (Linux; Android 7.0;) AppleWebKit/537.36 (KHTML, like Gecko) Mobile Safari/537.36 (compatible; PetalBot;+https://webmaster.petalsearch.com/site/petalbot)"
114.119.136.243 - - [24/Jul/2021:09:04:39 +0200] "GET /app_direct/collec_app/?_inputs_&navbars=%22tab_Coauthors%22&_values_&g_author=%22ppa963%22 HTTP/1.1" 502 646 "-" "Mozilla/5.0 (Linux; Android 7.0;) AppleWebKit/537.36 (KHTML, like Gecko) Mobile Safari/537.36 (compatible; PetalBot;+https://webmaster.petalsearch.com/site/petalbot)"
114.119.134.212 - - [24/Jul/2021:09:04:43 +0200] "GET /app_direct/collec_app/?_inputs_&navbars=%22tab_Coauthors%22&_values_&g_author=%22pkr268%22 HTTP/1.1" 503 575 "-" "Mozilla/5.0 (Linux; Android 7.0;) AppleWebKit/537.36 (KHTML, like Gecko) Mobile Safari/537.36 (compatible; PetalBot;+https://webmaster.petalsearch.com/site/petalbot)"
114.119.146.29 - - [24/Jul/2021:09:04:43 +0200] "GET /app_direct/collec_app/?_inputs_&navbars=%22tab_Coauthors%22&_values_&g_author=%22pbe625%22 HTTP/1.1" 503 575 "-" "Mozilla/5.0 (Linux; Android 7.0;) AppleWebKit/537.36 (KHTML, like Gecko) Mobile Safari/537.36 (compatible; PetalBot;+https://webmaster.petalsearch.com/site/petalbot)"
at 9:04, so that's pretty consistent with what you note. The inaccessibility presumably has to do with helos running out of memory, but why did the oom-killer not work? Well, it ran, but it was not enough. We have in syslog:
root@helos /var/log # grep 'R invoked oom-killer' syslog.1
Jul 24 08:59:09 helos kernel: [14922235.506685] R invoked oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
Jul 24 09:30:07 helos kernel: [14924093.497980] R invoked oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
Jul 24 10:26:46 helos kernel: [14927492.848174] R invoked oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
Jul 24 10:58:08 helos kernel: [14929347.932058] R invoked oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
Jul 24 12:08:50 helos kernel: [14933616.461377] R invoked oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
Jul 24 12:58:00 helos kernel: [14936548.248476] R invoked oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
Jul 24 13:10:19 helos kernel: [14937294.624810] R invoked oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
Jul 24 13:23:38 helos kernel: [14938104.947025] R invoked oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
Jul 24 14:08:06 helos kernel: [14940762.579273] R invoked oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
Jul 24 14:24:13 helos kernel: [14941739.313980] R invoked oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
Jul 24 16:32:52 helos kernel: [14949437.368614] R invoked oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
Jul 24 17:50:50 helos kernel: [14954122.341626] R invoked oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
But seemingly these oom kills were not enough to keep ssh up.
I suspect what could be done is a script that checks whether the uptime is greater than a day. If it is, grep for 'R invoked oom-killer' in syslog and, if found, reboot. Run that every hour. I've never written or run anything like that.
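Something like the following sketch, maybe. The syslog path and the one-day threshold are assumptions, and the reboot only fires when the script is invoked with --run, so the checks can be tried by hand first:

```shell
#!/bin/sh
# Watchdog sketch: reboot when the box has been up more than a day
# AND syslog shows R invoking the oom-killer. Paths are assumptions.

SYSLOG="${SYSLOG:-/var/log/syslog}"
MIN_UPTIME_SECS=86400  # one day

# Seconds since boot: integer part of the first field of /proc/uptime.
uptime_secs() {
    cut -d ' ' -f 1 /proc/uptime | cut -d . -f 1
}

# True if syslog shows R triggering the oom-killer.
oom_killed() {
    grep -q 'R invoked oom-killer' "$SYSLOG"
}

# Only act when called with --run, e.g. hourly from cron:
#   0 * * * * /usr/local/sbin/oom-watchdog.sh --run
if [ "${1:-}" = "--run" ]; then
    if [ "$(uptime_secs)" -gt "$MIN_UPTIME_SECS" ] && oom_killed; then
        logger "oom-watchdog: R oom-kills seen and uptime > 1 day, rebooting"
        /sbin/reboot
    fi
fi
```

Running it hourly from root's crontab with --run would cover the case where the oom-killer fires but ssh is already gone.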
The easier thing is to block petal via robots.txt.
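For reference, such a robots.txt served from the site root could look like the following; the Disallow path is taken from the PetalBot request URLs above, and blocking only PetalBot rather than all crawlers is an assumption:

```
# robots.txt sketch for collec.repec.org: keep PetalBot off the shiny app
User-agent: PetalBot
Disallow: /app_direct/
```

The petalsearch webmaster page the bot advertises in its user agent says it honors robots.txt, so this should stop the crawl without touching the app itself.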
Your thoughts?
--
Cheers,
Thomas Krichel http://openlib.org/home/krichel skype:thomaskrichel
_______________________________________________ CollEc-run mailing list CollEc-run@lists.openlib.org http://lists.openlib.org/cgi-bin/mailman/listinfo/collec-run