Düben, Christian writes
I went through some of the files and checked what I would need for an extension of CollEc. I have a few ideas in mind on what to add and how to present it in an interactive application.
It's very hard to do a worse job than I did visualizing that data!
When I consulted our IT department here at Hamburg University, they suggested hosting RePEc Visual on one of their managed Linux servers. At this point I am still waiting for the administration to process my application requesting such a server. And just like every administrative procedure at our institution, this takes a while. Once I have access to the respective infrastructure I am going to test implementations of RePEc Visual and potential CollEc extensions on it. Those applications would of course run under an external domain, not a Hamburg University domain.
We could run this on the existing CollEc server. This would be especially valuable if you manage to find a way to run the calculations faster. At this time, it's dreadfully slow. You could just take over the whole thing, well almost. We need to keep the mention of the sponsor, and I'd like to be acknowledged as the original creator.
I do not have Telegram and apparently do not have the correct login credentials for the Skype setup on my office Laptop. Do you use Zoom? If you do, I can send you a meeting link. If you do not, I will try to find out what login credentials our IT set for Skype.
Zoom should be fine. I'm in UTC+7. I can do late evenings no problem. My schedule is completely open. Maybe someone else would want to attend? I copy CollEc-run. -- Cheers, Thomas Krichel http://openlib.org/home/krichel skype:thomaskrichel
I'm curious. Keep me in the loop. How is this different (other than more current, and focussed only on intra-RePEc links rather than Twitter links) than https://io.mongeau.net/repec-twitter-network/?
-- Lars Vilhuber, Economist Cornell University, Executive Director, Labor Dynamics Institute and ILR School - Department of Economics American Economic Association - Data Editor Journal of Privacy and Confidentiality - Managing Editor e: lars.vilhuber@cornell.edu p: +1.607-330-5743 v: https://cornell.zoom.us/my/larsvilhuber w: http://lars.vilhuber.com/ Assistant: ldi@cornell.edu | +1.607-255-2744
In reply to: Thomas Krichel <krichel@openlib.org>, Wednesday, May 20, 2020 08:14, Subject: Re: [CollEc] RePEc Visual
Thanks for your messages. How about a Zoom meeting today at 15:30 CEST (UTC+2)?
Christian Düben Research Associate Chair of Macroeconomics Hamburg University Von-Melle-Park 5, Room 3102 20146 Hamburg Germany +49 40 42838 1898 christian.dueben@uni-hamburg.de http://www.christian-dueben.com
In reply to: Lars Vilhuber <lars.vilhuber@cornell.edu>, Wednesday, May 20, 2020 14:41, Subject: Re: [CollEc] RePEc Visual
I believe that is now. I'm not available.
-- Lars Vilhuber
In reply to: Düben, Christian <Christian.Dueben@uni-hamburg.de>, Wednesday, May 20, 2020 08:59, Subject: AW: [CollEc] RePEc Visual
I guess I should announce that more than 30 minutes in advance. Let us postpone it to one of the next days. Tomorrow is a public holiday here in Germany and my internet connection at home is unstable. So how about Friday? Here is a doodle: https://doodle.com/poll/3t7ga9nhupq95ceu. The reference time zone is UTC+2.
Christian Düben
In reply to: Düben, Christian (via CollEc-run <collec-run-bounces@lists.openlib.org>), Wednesday, May 20, 2020 15:00, Subject: Re: [CollEc] RePEc Visual
Düben, Christian writes
I guess I should announce that more than 30 minutes in advance.
We live and learn.
Tomorrow is a public holiday here in Germany
Yeah Germany is a collective leisure park!
Here is a doodle: https://doodle.com/poll/3t7ga9nhupq95ceu.
Filled.
The reference time zone is UTC+2.
It changes the zone for me. Let's wait for Lars, I don't think I ever talked to him. -- Cheers, Thomas Krichel http://openlib.org/home/krichel skype:thomaskrichel
@Thomas Krichel Me neither! I filled out the Doodle. Sorry, the bulk of the day is already full.
-- Lars Vilhuber
In reply to: Thomas Krichel <krichel@openlib.org>, Wednesday, May 20, 2020 10:55, Subject: Re: [CollEc] RePEc Visual
Not sure I am supposed to be on there, but I am fully booked all week.
Christian Zimmermann FIGUGEGL! Economic Research Federal Reserve Bank of St. Louis P.O. Box 442 St. Louis MO 63166-0442 USA https://ideas.repec.org/zimm/ @CZimm_economist
In reply to: Lars Vilhuber, Wed, 20 May 2020
Christian Zimmermann writes
Not sure I am supposed to be on there, but I am fully booked all week.
You don't have to dance at all weddings ;-) -- Cheers, Thomas Krichel http://openlib.org/home/krichel skype:thomaskrichel
Well, in 25 years I have been only at two weddings. Both involving the same person. There was a funeral in between.
Christian Zimmermann
In reply to: Thomas Krichel, Wed, 20 May 2020
Friday 2:30 PM (UTC+2).
Christian Düben is inviting you to a scheduled Zoom meeting.
Topic: CollEc
Time: May 22, 2020 02:30 PM Amsterdam, Berlin, Rome, Stockholm, Vienna
Join Zoom Meeting: https://uni-hamburg.zoom.us/j/91488614124?pwd=bkVFcHg1bkZ6MmRXd1NmWTFEdTUzUT...
Meeting ID: 914 8861 4124
Password: 387647
One tap mobile:
+14703812552,,91488614124#,,1#,387647# US (Atlanta)
+17209289299,,91488614124#,,1#,387647# US (Denver)
Dial by your location:
+1 470 381 2552 US (Atlanta)
+1 720 928 9299 US (Denver)
+1 971 247 1195 US (Portland)
Find your local number: https://uni-hamburg.zoom.us/u/aexylvKEcl
Christian Düben
In reply to: Lars Vilhuber <lars.vilhuber@cornell.edu>, Wednesday, May 20, 2020 17:54, Subject: Re: [CollEc] RePEc Visual
Düben, Christian writes
Thanks for your messages. How about a Zoom meeting today at 15:30 CEST (UTC+2)?
Sorry I was out boozing with my lover, only now got back. -- Cheers, Thomas Krichel http://openlib.org/home/krichel skype:thomaskrichel
I checked how fast CollEc computations run when executed in C and C++ through an R package. The underlying graph contained 47,192 authors who wrote at least one co-authored paper. I weighted the edges between co-authors by their number of joint papers.
First, I calculated the distance matrix. Distances are measured as the length of Dijkstra's shortest cost paths. Calculating and writing those 2,227,084,864 cell values to disk took 4.77 minutes in a process parallelized across 8 cores. Computing each author's closeness value and writing it to disk took 4.27 minutes in an 8-core process. Betweenness is quite slow in comparison.
The code still leaves space for improvement. All three measures are derived from shortest cost paths. So, it would be more efficient to derive those paths once and use them for all three measures rather than computing them thrice. Another point is the parallel process structure. The iterations' chunk size may not be optimal and could be improved through further tests.
If users only access data for a small number of authors at once, it is not even necessary to previously calculate those values and store them on disk or on a SQL database server. With the graph kept in memory computations are quasi instant for small sets of authors.
See you tomorrow.
Christian Düben
In reply to: Thomas Krichel <krichel@openlib.org>, Wednesday, May 20, 2020 14:14, Subject: Re: RePEc Visual
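The point about deriving the shortest cost paths once and reusing them can be sketched in a few lines of Python. This is only an illustration, not the R/C/C++ code described in the message: the function names (`build_graph`, `dijkstra`, `closeness_from_distances`) are hypothetical, and the edge cost of `1 / (number of joint papers)` is an assumption, since the message does not say how the paper counts are turned into costs.

```python
import heapq
from collections import defaultdict


def build_graph(papers):
    """Build an undirected co-authorship graph from lists of author IDs."""
    joint = defaultdict(int)
    for authors in papers:
        unique = sorted(set(authors))
        for i, a in enumerate(unique):
            for b in unique[i + 1:]:
                joint[(a, b)] += 1  # count joint papers per author pair
    graph = defaultdict(dict)
    for (a, b), n in joint.items():
        cost = 1.0 / n  # ASSUMPTION: more joint papers means a cheaper edge
        graph[a][b] = cost
        graph[b][a] = cost
    return dict(graph)


def dijkstra(graph, source):
    """Return the shortest-path cost from source to every reachable author."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry, a cheaper path was already found
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def closeness_from_distances(dist, source):
    """Closeness of source, reusing distances that were already computed."""
    total = sum(d for node, d in dist.items() if node != source)
    return (len(dist) - 1) / total if total > 0 else 0.0
```

For a single author, one call to `dijkstra` yields that author's row of the distance matrix, and `closeness_from_distances` reuses the same row instead of searching again. This is also why on-demand queries for a handful of authors can be fast when the graph is held in memory.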
Düben, Christian writes
I weighted the edges between co-authors by their number of joint papers.
As far as I understand, we need a binary network. Otherwise we can easily end up in a situation where we say that the shortest path between A and C is through B, even though A and C have written a paper together.
First, I calculated the distance matrix. Distances are measured as the length of Dijkstra's shortest cost paths. Calculating and writing those 2,227,084,864 cell values to disk took 4.77 minutes in a process parallelized across 8 cores. Computing each author's closeness value and writing it to disk took 4.27 minutes in an 8 core process. Betweenness is quite slow in comparison.
It would be many many times faster than what I do now.
See you tomorrow.
Yes, 19:30 my time. -- Cheers, Thomas Krichel http://openlib.org/home/krichel skype:thomaskrichel
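The concern about weighted edges raised above can be illustrated with three authors. The sketch below is hypothetical: the paper counts are made up, and the weighted cost of `1 / (number of joint papers)` is an assumption, since the thread does not state how counts map to costs. With such weights the cheapest route from A to C runs through B, while in a binary network (every edge costs 1) the direct co-authorship edge is the shortest path.

```python
import heapq


def shortest_path(edges, source, target):
    """Return (cost, path) of the cheapest path in an undirected graph.

    edges maps an author pair (u, v) to the cost of the edge between them.
    """
    graph = {}
    for (u, v), cost in edges.items():
        graph.setdefault(u, {})[v] = cost
        graph.setdefault(v, {})[u] = cost
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break  # the first time the target is popped, its cost is final
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return float("inf"), []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]


# Made-up example: A and C share one paper, each shares five with B.
joint_papers = {("A", "B"): 5, ("B", "C"): 5, ("A", "C"): 1}

# ASSUMPTION: weighted cost is the inverse of the number of joint papers.
weighted = {pair: 1.0 / n for pair, n in joint_papers.items()}
# Binary network: every co-authorship edge costs the same.
binary = {pair: 1.0 for pair in joint_papers}

weighted_cost, weighted_path = shortest_path(weighted, "A", "C")
binary_cost, binary_path = shortest_path(binary, "A", "C")
```

Under these assumptions `weighted_path` goes A, B, C and `binary_path` goes directly from A to C, which is the situation the binary network is meant to rule out.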
Dear All,
Thanks again for the insightful Zoom meeting. This helped me a lot in developing a new version of CollEc. And thank you Thomas for allowing me to advance this interesting project.
After spending much of this week on developing the new CollEc, it is now quasi done. The user interface is similar to that of RePEc Visual. Users interactively navigate through the app and are presented with CollEc data through graphical output. The documentation contains variable definitions, details on the technical implementation etc. I wrote the documentation so that someone who has never heard of graph theory understands what CollEc displays. The SQL implementation and the code generating the CollEc data from RePEc Author Service inputs are also almost ready for upload.
What I am still having some difficulties with is deploying apps through Docker containers. I am going to take a course on Docker this weekend and expect to deploy a test app some time next week.
CollEc's server does not really have any vacant CPU capacity to test the web application. I therefore suggest uploading the test application to another server where everyone on this list can access it with a password. After this testing phase it can then be officially released on the actual server. Feel free to share your thoughts on this.
Have a nice weekend.
Kind regards,
Christian
In reply to: Thomas Krichel <krichel@openlib.org>, Thursday, May 21, 2020 16:03, Subject: Re: RePEc Visual
Hi Christian, cool. Two things: since I'm late to this discussion I actually don't know where to find "RePEc Visual". Second, I have been working a lot with Docker lately, and have some experience by now (won the hard way, the old fashioned way). If you point me to your Git repo and/or your Docker build file, I can provide some insights/help. Lars -- Lars Vilhuber, Economist Cornell University, Executive Director, Labor Dynamics Institute and ILR School - Department of Economics American Economic Association - Data Editor Journal of Privacy and Confidentiality - Managing Editor e: lars.vilhuber@cornell.edu p: +1.607-330-5743 v: https://cornell.zoom.us/my/larsvilhuber w: http://lars.vilhuber.com/ <http://lars.vilhuber.com/> Assistant: ldi@cornell.edu | +1.607-255-2744 ________________________________ From: Düben, Christian <Christian.Dueben@uni-hamburg.de> Sent: Friday, May 29, 2020 16:12 To: Thomas Krichel <krichel@openlib.org>; Lars Vilhuber <lars.vilhuber@cornell.edu> Cc: CollEc Run <collec-run@lists.openlib.org> Subject: RE: RePEc Visual Dear All, Thanks again for the insightful Zoom meeting. This helped me a lot in developing a new version of CollEc. And thank you Thomas for allowing me to advance this interesting project. After spending much of this week on developing the new CollEc, it is now quasi done. The user interface is similar to that of RePEc Visual. Users interactively navigate through the app and are presented with CollEc data through graphical output. A documentation contains variable definitions, details on the technical implementation etc. I wrote the documentation so that someone who never heard of graph theory understands what CollEc displays. The SQL implementation and the code generating the CollEc data from RePEc Author Service inputs are also almost ready for upload. What I am still having some difficulties with is deploying apps through Docker containers. 
I am going to take a course on Docker this weekend and expect to deploy a test app some time next week. CollEc's server does not really have any vacant CPU capacity to test the web application. I therefore suggest to upload the test application to another server where everyone on this list can access it with a password. After this testing phase it can then be officially released on the actual server. Feel free to share your thoughts on this. Have a nice weekend. Kind regards, Christian Christian Düben Research Associate Chair of Macroeconomics Hamburg University Von-Melle-Park 5, Room 3102 20146 Hamburg Germany +49 40 42838 1898 christian.dueben@uni-hamburg.de http://www.christian-dueben.com -----Original Message----- From: Thomas Krichel <krichel@openlib.org> Sent: Donnerstag, 21. Mai 2020 16:03 To: Düben, Christian <Christian.Dueben@uni-hamburg.de> Cc: CollEc Run <collec-run@lists.openlib.org> Subject: Re: RePEc Visual Düben, Christian writes
I weighted the edges between co-authors by their number of joint papers.
As far as I understand, we need a binary network. Otherwise we can easily end up in a situation where we say that the shortest path between A and C is through B, even though A and C have written a paper together.
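The concern about weighted edges can be illustrated with a small sketch. It assumes that an edge's cost is the inverse of the number of joint papers; the thread does not say how the weights were turned into costs, so that choice, the three authors and their paper counts are all hypothetical.

```python
import heapq


def dijkstra(graph, source):
    """Return the cheapest path cost from source to every reachable node."""
    dist = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        for neighbour, cost in graph[node].items():
            candidate = d + cost
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return dist


# Hypothetical joint-paper counts: A and C are co-authors, but only once.
joint_papers = {("A", "B"): 3, ("B", "C"): 3, ("A", "C"): 1}


def build_graph(edge_cost):
    """Build an undirected graph whose edge costs come from edge_cost(papers)."""
    graph = {}
    for (u, v), papers in joint_papers.items():
        cost = edge_cost(papers)
        graph.setdefault(u, {})[v] = cost
        graph.setdefault(v, {})[u] = cost
    return graph


# Assumption: cost = 1 / number of joint papers.
weighted = dijkstra(build_graph(lambda papers: 1.0 / papers), "A")
# Binary network: every co-authorship costs exactly one step.
binary = dijkstra(build_graph(lambda papers: 1.0), "A")

print("weighted A->C:", weighted["C"])
print("binary   A->C:", binary["C"])
```

Under this assumption the weighted route from A to C via B is cheaper than the direct edge, while the binary network reports the direct link as one step.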
First, I calculated the distance matrix. Distances are measured as the length of Dijkstra's shortest cost paths. Calculating and writing those 2,227,084,864 cell values to disk took 4.77 minutes in a process parallelized across 8 cores. Computing each author's closeness value and writing it to disk took 4.27 minutes in an 8 core process. Betweenness is quite slow in comparison.
It would be many many times faster than what I do now.
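The cell count quoted above is consistent with a square distance matrix for 47,192 authors, since 47,192 × 47,192 = 2,227,084,864. On a binary network each row of that matrix can be obtained with a breadth-first search, and closeness follows from the row sum. The sketch below is only an illustration: the toy graph is hypothetical, and the closeness definition used here (number of reachable authors divided by the sum of distances to them) is one common variant, not necessarily the one CollEc uses.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Return the hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def closeness(adjacency, source):
    """Reachable nodes (excluding source) divided by the sum of their distances."""
    dist = bfs_distances(adjacency, source)
    total = sum(dist.values())
    if total == 0:
        return 0.0  # isolated author
    return (len(dist) - 1) / total


# Hypothetical chain of co-authors: A - B - C - D
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
scores = {author: closeness(adjacency, author) for author in adjacency}
print(scores)
```

Each source is independent of the others, which is why this kind of computation parallelizes well across cores.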
See you tomorrow.
Yes, 19:30 my time. -- Cheers, Thomas Krichel http://openlib.org/home/krichel skype:thomaskrichel
Hi Lars, RePEc Visual is not officially released yet. I only uploaded a preliminary draft to https://christiandueben.shinyapps.io/RePEc_Visual/ and shared that with the RePEc-run list. However, it is super slow on a free shinyapps.io subscription. I expect to publish an official version on a different server next week. It is going to contain a documentation tab and the introduction is going to be different - including a video tutorial on how to use the app. Thanks for your offer to help with Docker. I will share the file with you once I figured out the basics. Kind regards, Christian Christian Düben Research Associate Chair of Macroeconomics Hamburg University Von-Melle-Park 5, Room 3102 20146 Hamburg Germany +49 40 42838 1898 christian.dueben@uni-hamburg.de http://www.christian-dueben.com From: Lars Vilhuber <lars.vilhuber@cornell.edu> Sent: Friday, 29 May 2020 23:36 To: Düben, Christian <Christian.Dueben@uni-hamburg.de>; Thomas Krichel <krichel@openlib.org> Cc: CollEc Run <collec-run@lists.openlib.org> Subject: Re: RePEc Visual
Düben, Christian writes
CollEc's server does not really have any vacant CPU capacity to test the web application.
Sure, because it's doing the calculations for the legacy service.
I therefore suggest to upload the test application to another server
I can give you darni. I'm building ArchEc there. I can create an icanis account there, or some other name you like and then point the dns entry test.collect.repec.org to it. I'm not sure if we need to run it in a docker container. I've never done it. For me it just creates a layer of complication. I rarely get to move machines anyway, and all (bar helos) run Debian testing. -- Cheers, Thomas Krichel http://openlib.org/home/krichel skype:thomaskrichel
Hi Thomas, Testing the app on darni would be great. Thanks for the offer. For some reason ShinyProxy deploys apps from Docker containers. And I think that, at least for the first draft, sticking with the standard procedure might be the easiest solution. Kind regards, Christian Christian Düben Research Associate Chair of Macroeconomics Hamburg University Von-Melle-Park 5, Room 3102 20146 Hamburg Germany +49 40 42838 1898 christian.dueben@uni-hamburg.de http://www.christian-dueben.com -----Original Message----- From: Thomas Krichel <krichel@openlib.org> Sent: Saturday, 30 May 2020 07:08 To: Düben, Christian <Christian.Dueben@uni-hamburg.de> Cc: CollEc Run <collec-run@lists.openlib.org> Subject: Re: RePEc Visual
Düben, Christian writes
Testing the app on darni would be great. Thanks for the offer.
I created an icanis account with the same .ssh, and a link helos to an nfs of the files on helos. Also, I have a DNS name test.collec.repec.org set the the IP addresses of darni. BTW VizEc would not be a trademark ... at this time. -- Cheers, Thomas Krichel http://openlib.org/home/krichel skype:thomaskrichel
Thanks. I am going to run a test on that next week. I like "VisEc" but find "VizEc" a little confusing as there is no "z" in "Visual". Christian Düben Research Associate Chair of Macroeconomics Hamburg University Von-Melle-Park 5, Room 3102 20146 Hamburg Germany +49 40 42838 1898 christian.dueben@uni-hamburg.de http://www.christian-dueben.com -----Original Message----- From: Thomas Krichel <krichel@openlib.org> Sent: Saturday, 30 May 2020 13:34 To: Düben, Christian <Christian.Dueben@uni-hamburg.de> Cc: CollEc Run <collec-run@lists.openlib.org> Subject: Re: RePEc Visual
Düben, Christian writes
Thanks. I am going to run a test on that next week.
I like "VisEc"
Sure but you pointed out the trademark issue.
but find "VizEc" a little confusing as there is no "z" in "Visual".
I understand. Generally, the term visual is rather general, as web sites usually are visual. Maybe we can come up with somewhat more specific on what exactly the site likes to show, like relationships RelEc, or just graphics GraphEc. -- Cheers, Thomas Krichel http://openlib.org/home/krichel skype:thomaskrichel
GraphEc is a good idea. The app is about graphics, i.e. plots. I am going to think about which Ec combination best describes the app. On 30 May 2020, at 15:06, Thomas Krichel <krichel@openlib.org> wrote:
These names just confuse users. Call it like what it is: RePEc Coauthor Network Christian Zimmermann FIGUGEGL! Economic Research Federal Reserve Bank of St. Louis P.O. Box 442 St. Louis MO 63166-0442 USA https://ideas.repec.org/zimm/ @CZimm_economist
_______________________________________________ CollEc-run mailing list CollEc-run@lists.openlib.org http://lists.openlib.org/cgi-bin/mailman/listinfo/collec-run
Christian Zimmermann writes
These names just confuse users.
"IDEAS" is even more confusing.
Call it like what it is: RePEc Coauthor Network
No, that's what CollEc is about, GraphEc is broader. -- Cheers, Thomas Krichel http://openlib.org/home/krichel skype:thomaskrichel
On Sat, 30 May 2020, Thomas Krichel wrote:
Christian Zimmermann writes
These names just confuse users.
"IDEAS" is even more confusing.
Agreed. A youth mistake. I have corrected myself later.
Call it like what it is: RePEc Coauthor Network
No, that's what CollEc is about, GraphEc is broader.
As if people understand what CollEc is...
Christian Zimmermann writes
Agreed.
And EDIRC came before it, I suspect.
A youth mistake. I have corrected myself later.
Christian surely is now younger than you were at the time you conceived IDEAS ... -- Cheers, Thomas Krichel http://openlib.org/home/krichel skype:thomaskrichel
I checked the availability of various names. Short names of one or two letters plus Ec tend to be registered trademarks. GraphEc and VisuEc are still available. They are both suitable names. However, I currently favor VisuEc. I plan on adding maps to the currently implemented plots. And I have the impression that "visualize" matches that broader focus quite well. Changing CollEc's name is up to Thomas. In my opinion we should keep its name for now. I managed to deploy apps through ShinyProxy. Setting up the Docker image turned out to be fairly simple. Configuring ShinyProxy based on a sparse documentation, though, takes a while. And I am not sure about the scalability of this basic setup. Depending on how many users access it, I might need to add Nginx as a load balancer. Thomas, you mentioned that you created an icanis account on darni. What is that server's address? Kind regards, Christian Christian Düben Research Associate Chair of Macroeconomics Hamburg University Von-Melle-Park 5, Room 3102 20146 Hamburg Germany +49 40 42838 1898 christian.dueben@uni-hamburg.de http://www.christian-dueben.com -----Original Message----- From: CollEc-run <collec-run-bounces@lists.openlib.org> On Behalf Of Thomas Krichel Sent: Saturday, 30 May 2020 17:11 To: Christian Zimmermann <zimmermann@stlouisfed.org> Cc: CollEc Run <collec-run@lists.openlib.org> Subject: Re: [CollEc] RePEc Visual
Düben, Christian writes
I checked the availability of various names. Short names of one or two letters plus Ec tend to be registered trademarks. GraphEc and VisuEc are still available. They are both suitable names. However, I currently favor VisuEc.
GraphEc would be more in keeping with the traditional two-syllable structure, and be funnier as a name, since graphic has a somewhat sensationalist overtone. The idea of a "graphic description" evokes the notion that it deals with sex and crime.
I plan on adding maps to the currently implemented plots. And I have the impression that "visualize" matches that broader focus quite well.
Changing CollEc's name is up to Thomas. In my opinion we should keep its name for now.
Sure.
I managed to deploy apps through ShinyProxy. Setting up the Docker image turned out to be fairly simple. Configuring ShinyProxy based on a sparse documentation, though, takes a while. And I am not sure about the scalability of this basic setup. Depending on how many users access it, I might need to add Nginx as a load balancer.
No worries I can set this up for you.
Thomas, you mentioned that you created an icanis account on darni. What is that server's address?
All names are in the openlib.org domain.

krichel@trabbi~$ host darni
darni.openlib.org has address 95.216.35.87
darni.openlib.org has IPv6 address 2a01:4f9:2a:23a8::2

grapfec, grafec, and visuec, all .repec.org, now point to darni.
-- Cheers, Thomas Krichel http://openlib.org/home/krichel skype:thomaskrichel
Yes, I noticed GraphEc's pun on graphic content. Let us set GraphEc as the current title then. I see that you like it and I am personally more focused on the app's content right now. I see that I do not have the access rights to install anything on darni. Could you, therefore, install Java, Docker and ShinyProxy as explained on this website: https://www.shinyproxy.io/getting-started/? To run CollEc on that server I would also need access to a database within the installed MariaDB. Christian Düben Research Associate Chair of Macroeconomics Hamburg University Von-Melle-Park 5, Room 3102 20146 Hamburg Germany +49 40 42838 1898 christian.dueben@uni-hamburg.de http://www.christian-dueben.com -----Original Message----- From: Thomas Krichel <krichel@openlib.org> Sent: Wednesday, 3 June 2020 16:28 To: Düben, Christian <Christian.Dueben@uni-hamburg.de> Cc: CollEc Run <collec-run@lists.openlib.org> Subject: Re: [CollEc] RePEc Visual
Düben, Christian writes
I see that I do not have the access rights to install anything on darni.
I'll be happy to give you the root access ... in fact it's done

icanis@darni:~$ ssh root@darni
Linux darni 5.6.0-1-amd64 #1 SMP Debian 5.6.7-1 (2020-04-29) x86_64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Wed Jun 3 17:09:13 2020 from 2a01:4f9:2a:23a8::2
root@darni ~ # exit
logout
Connection to darni closed.
Could you, therefore, install Java, Docker and ShinyProxy as explained on this website: https://www.shinyproxy.io/getting-started/?
root@darni ~ # apt install java8-runtime
Reading package lists... Done
Building dependency tree
Reading state information... Done
Package java8-runtime is a virtual package provided by:
  openjdk-14-jre 14.0.1+7-1
  openjdk-13-jre 13.0.3+3-1
  openjdk-11-jre 11.0.7+9-1
  default-jre 2:1.11-72
You should explicitly select one to install.

E: Package 'java8-runtime' has no installation candidate
root@darni ~ # apt install openjdk-14-jre
...
root@darni ~ # aptitude install docker
The following NEW packages will be installed:
  docker wmdocker{a}
0 packages upgraded, 2 newly installed, 0 to remove and 13 not upgraded.
Need to get 15.3 kB of archives. After unpacking 58.4 kB will be used.
Do you want to continue? [Y/n/?]
Get: 1 http://mirror.hetzner.de/debian/packages testing/main amd64 wmdocker amd64 1.5-2 [12.8 kB]
Get: 2 http://mirror.hetzner.de/debian/packages testing/main amd64 docker all 1.5-2 [2,556 B]
Fetched 15.3 kB in 0s (128 kB/s)
Selecting previously unselected package wmdocker.
(Reading database ... 105422 files and directories currently installed.)
Preparing to unpack .../wmdocker_1.5-2_amd64.deb ...
Unpacking wmdocker (1.5-2) ...
Selecting previously unselected package docker.
Preparing to unpack .../archives/docker_1.5-2_all.deb ...
Unpacking docker (1.5-2) ...
Setting up wmdocker (1.5-2) ...
Processing triggers for man-db (2.9.1-1) ...
Processing triggers for menu (2.1.47+b1) ...
Setting up docker (1.5-2) ...

Then they suggest

root@darni ~ # cat /etc/systemd/system/docker.service.d/override.conf
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H unix:// -D -H tcp://127.0.0.1:2375
root@darni ~ # systemctl daemon-reload
root@darni ~ # sudo systemctl restart docker
Failed to restart docker.service: Unit docker.service not found.

So something here is not quite right. To be continued.
-- Cheers, Thomas Krichel http://openlib.org/home/krichel skype:thomaskrichel
Thomas Krichel writes
root@darni ~ # sudo systemctl restart docker Failed to restart docker.service: Unit docker.service not found.
Well it turns out I had to get docker.io

root@darni ~ # apt install docker.io
...
root@darni ~ # systemctl start docker
Job for docker.service failed because the control process exited with error code.
See "systemctl status docker.service" and "journalctl -xe" for details.
root@darni ~ # systemctl status docker.service
● docker.service - Docker Application Container Engine
     Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/docker.service.d
             └─override.conf
     Active: failed (Result: exit-code) since Thu 2020-06-04 06:17:12 UTC; 57s ago
TriggeredBy: ● docker.socket
       Docs: https://docs.docker.com
    Process: 3288749 ExecStart=/usr/bin/dockerd -H unix:// -D -H tcp://127.0.0.1:2375 (code=exi>
   Main PID: 3288749 (code=exited, status=203/EXEC)

Jun 04 06:17:12 darni systemd[1]: docker.service: Scheduled restart job, restart counter is at >
Jun 04 06:17:12 darni systemd[1]: Stopped Docker Application Container Engine.
Jun 04 06:17:12 darni systemd[1]: docker.service: Start request repeated too quickly.
Jun 04 06:17:12 darni systemd[1]: docker.service: Failed with result 'exit-code'.
Jun 04 06:17:12 darni systemd[1]: Failed to start Docker Application Container Engine.
Jun 04 06:18:06 darni systemd[1]: docker.service: Start request repeated too quickly.
Jun 04 06:18:06 darni systemd[1]: docker.service: Failed with result 'exit-code'.
Jun 04 06:18:06 darni systemd[1]: Failed to start Docker Application Container Engine.

override.conf looks for docker in /usr/bin/dockerd, but

root@darni ~ # which dockerd
/usr/sbin/dockerd

says it's in another place. I needed to add the 's' in /etc/systemd/system/docker.service.d/override.conf. Then

root@darni ~ # systemctl daemon-reload
root@darni ~ # systemctl start docker

and it's up and running. I suspect the shiny app should be run as an unprivileged user. Let me know if you want to keep icanis or want a different one. We can then accumulate the sudo instructions for that user. It would also be good to have the mysql database with the same name?
-- Cheers, Thomas Krichel http://openlib.org/home/krichel skype:thomaskrichel
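The status=203/EXEC in the output above is systemd's exit code for a command it could not execute, which fits the wrong dockerd path. A check of this kind can be sketched in Python; the helper names below are hypothetical and the parsing is deliberately simple (it reads ExecStart= lines only and ignores other directives).

```python
import os


def exec_start_binaries(unit_text):
    """Return the executable path of every non-empty ExecStart= line."""
    binaries = []
    for line in unit_text.splitlines():
        line = line.strip()
        if not line.startswith("ExecStart="):
            continue
        command = line[len("ExecStart="):].strip()
        if not command:
            continue  # an empty ExecStart= only resets earlier values
        # Drop systemd's optional prefix characters before the path.
        binaries.append(command.split()[0].lstrip("-@:+!"))
    return binaries


def missing_binaries(unit_text):
    """Return the ExecStart paths that are not executable files on this host."""
    return [
        path
        for path in exec_start_binaries(unit_text)
        if not (os.path.isfile(path) and os.access(path, os.X_OK))
    ]


# The drop-in from the thread, before the 's' was added to the path.
override = """[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H unix:// -D -H tcp://127.0.0.1:2375
"""

print("binaries referenced:", exec_start_binaries(override))
print("not found on this host:", missing_binaries(override))
```

On a machine where dockerd lives in /usr/sbin, the second line would list /usr/bin/dockerd, pointing at the same mismatch that `which dockerd` revealed.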
Thanks for setting it up. You mentioned in yesterday's e-mail that you gave me root access. However, I apparently need a password for that. The app itself only needs read access. It reads data from the SQL database and from other files stored on disk and displays it. The scripts generating the data run independently of the app. They require read and write access to the database and the directories the app uses and are initiated by a scheduling system. Installing and updating the app requires more extensive permissions. I need full access to Docker and ShinyProxy for that. How about two accounts? One handles the app and has minor access rights. And the other generates the data, controls the Docker images and ShinyProxy and has larger access permissions. For security reasons I suggest that these accounts can only access the new CollEc's database within MariaDB. This prevents any repercussions on non-CollEc databases. When setting these permissions we should make sure that "LOAD DATA LOCAL INFILE" or "LOAD DATA INFILE" are still available. Restricted access apparently tends to block these statements which I use to insert large data sets. Feel free to choose any name you like for the account(s) and the database. Christian Düben Research Associate Chair of Macroeconomics Hamburg University Von-Melle-Park 5, Room 3102 20146 Hamburg Germany +49 40 42838 1898 christian.dueben@uni-hamburg.de http://www.christian-dueben.com -----Original Message----- From: Thomas Krichel <krichel@openlib.org> Sent: Thursday, 4 June 2020 08:32 To: Düben, Christian <Christian.Dueben@uni-hamburg.de> Cc: CollEc Run <collec-run@lists.openlib.org> Subject: Re: [CollEc] RePEc Visual
Düben, Christian writes
You mentioned in yesterday's e-mail that you gave me root access. However, I apparently need a password for that.
icanis@darni:~$ ssh root@darni

Works for me. Am I missing something?
The app itself only needs read access. It reads data from the SQL database and from other files stored on disk and displays it. The scripts generating the data run independently of the app. They require read and write access to the database and the directories the app uses and are initiated by a scheduling system. Installing and updating the app requires more extensive permissions. I need full access to Docker and ShinyProxy for that.
How about two accounts? One handles the app and has minor access rights. And the other generates the data, controls the Docker images and ShinyProxy and has larger access permissions.
Actually I created another account "collec", then had a nap, and deleted it again. I don't see the point of the two accounts. We don't need complicated security, as we have nothing that anybody could steal. But if you want to create another user you can do that. For reasons related to the weather, I am very sleepy at this time.
For security reasons I suggest that these accounts can only access the new CollEc's database within MariaDB. This prevents any repercussions on non-CollEc databases. When setting these permissions we should make sure that "LOAD DATA LOCAL INFILE" or "LOAD DATA INFILE" are still available. Restricted access apparently tends to block these statements which I use to insert large data sets.
root@darni has access to the mysql root account. To call my understanding of mysql security rudimentary would be heaping praise on it.
Feel free to choose any name you like for the account(s) and the database.
Kindly consider the following.

(1) Once a week, I rsync all the /home /etc /var and /root as backup to aigtu, except anything that is in a folder called 'opt'. At this time, aigtu is short of space. It's a good idea to move bulky files that can be recalculated into folders called opt. For example, all the icanis path data is in a directory called opt, even though it would take months to regenerate it. You can do a

cd /var/lib/mysql
mkdir -p /var/lib/mysql/opt/foo
ln -s opt/foo foo
cd /var/lib/mysql

(2) At server migration time---not imminent for helos and darni, both are quite new---I copy all of /home, /root and /var as is. All other directories will be dealt with by hand. Thus the change in /lib/, proposed by the shiny app installation, is problematic because it needs to be remembered in a few years' time when I migrate. For sudo, just use /etc/sudoers.d files. They can conveniently be rsynced at migration time.

We operate in a resource-poor environment where migrations take place only every few years, so I don't use things like docker that are important when you have lots of servers. But it pays off to keep things in users' home directories.
-- Cheers, Thomas Krichel http://openlib.org/home/krichel skype:thomaskrichel
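The opt convention from point (1) can also be scripted. The following is a minimal sketch that mirrors the mkdir and ln commands above; the function name is hypothetical, and the demonstration runs in a temporary directory rather than /var/lib/mysql.

```python
import os
import tempfile


def link_into_opt(base_dir, name):
    """Create base_dir/opt/name and a relative symlink base_dir/name -> opt/name."""
    target = os.path.join(base_dir, "opt", name)
    os.makedirs(target, exist_ok=True)  # mkdir -p base_dir/opt/name
    link = os.path.join(base_dir, name)
    if not os.path.islink(link):
        # Relative target, like "ln -s opt/name name" run inside base_dir.
        os.symlink(os.path.join("opt", name), link)
    return link


# Demonstration in a throwaway directory; "foo" is the placeholder name
# used in the commands above.
with tempfile.TemporaryDirectory() as base:
    link = link_into_opt(base, "foo")
    link_target = os.readlink(link)
    resolved = os.path.realpath(link)
    expected = os.path.realpath(os.path.join(base, "opt", "foo"))

print("symlink points to:", link_target)
```

Because the symlink target is relative, the link should keep working when the parent directory is copied as a whole, for example during a migration.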
Sorry, I got the login process to the root account wrong in the first place. I tried to sign in to root directly without using icanis first. Now I understand how it works. Thanks. I do not know what else you store on the server. But if you say that is does not require complex security that is fine with me. The CollEc database is now located in the subdirectory. Thanks for the respective code. I received an error when installing R outside Docker but it works fine when containerized. I am going to look into that. Running R inside containers is fine for now. Regarding point (2), I am not sure which directories ShinyProxy and Docker set. My apps follow the directory structure illustrated in the cheat sheet attached to this e-mail. I can set it up in the home directory. But that does not prevent ShinyProxy and Docker from writing files elsewhere. ShinyProxy's configuration file is in /etc/shinyproxy/. Christian Düben Research Associate Chair of Macroeconomics Hamburg University Von-Melle-Park 5, Room 3102 20146 Hamburg Germany +49 40 42838 1898 christian.dueben@uni-hamburg.de http://www.christian-dueben.com -----Original Message----- From: Thomas Krichel <krichel@openlib.org> Sent: Donnerstag, 4. Juni 2020 15:03 To: Düben, Christian <Christian.Dueben@uni-hamburg.de> Cc: CollEc Run <collec-run@lists.openlib.org> Subject: Re: [CollEc] RePEc Visual Düben, Christian writes
You mentioned in yesterday's e-mail that you gave me root access. However, I apparently need a password for that.
    icanis@darni:~$ ssh root@darni
Works for me. Am I missing something?
The app itself only needs read access. It reads data from the SQL database and from other files stored on disk and displays it. The scripts generating the data run independently of the app. They require read and write access to the database and the directories the app uses and are initiated by a scheduling system. Installing and updating the app requires more extensive permissions. I need full access to Docker and ShinyProxy for that.
How about two accounts? One handles the app and has minor access rights. And the other generates the data, controls the Docker images and ShinyProxy and has larger access permissions.
Actually I created another account "collec", then had a nap, and deleted it again. I don't see the point of the two accounts. We don't need complicated security, as we have nothing that anybody could steal. But if you want to create another user you can do that. For reasons related to the weather, I am very sleepy at this time.
For security reasons I suggest that these accounts can only access the new CollEc database within MariaDB. This prevents any repercussions on non-CollEc databases. When setting these permissions we should make sure that "LOAD DATA LOCAL INFILE" and "LOAD DATA INFILE" are still available. Restricted access apparently tends to block these statements, which I use to insert large data sets.
root@darni has access to the mysql root account. To call my understanding of mysql security rudimentary would be heaping praise on it.
Feel free to choose any name you like for the account(s) and the database.
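The two-account proposal with access limited to the CollEc database could look roughly like the statements generated below. This is a sketch under assumptions: the database name `collec`, the account names `collec_app` and `collec_etl`, and the password placeholder are all hypothetical, since the thread leaves the names open. Note that server-side LOAD DATA INFILE needs the global FILE privilege, which reaches beyond a single database, whereas LOAD DATA LOCAL INFILE only needs `local_infile` to be enabled on the server and in the client.

```shell
#!/bin/sh
# Sketch: generate GRANT statements for two restricted MariaDB accounts.
set -eu

db="collec"              # hypothetical database name
app_user="collec_app"    # hypothetical read-only account for the app
etl_user="collec_etl"    # hypothetical read/write account for the data scripts
sql_file=$(mktemp)

cat > "$sql_file" <<EOF
-- Read-only account: may only SELECT from the CollEc database.
CREATE USER IF NOT EXISTS '${app_user}'@'localhost' IDENTIFIED BY 'change-me';
GRANT SELECT ON \`${db}\`.* TO '${app_user}'@'localhost';

-- Data-generating account: full rights, but only inside the CollEc database.
CREATE USER IF NOT EXISTS '${etl_user}'@'localhost' IDENTIFIED BY 'change-me';
GRANT ALL PRIVILEGES ON \`${db}\`.* TO '${etl_user}'@'localhost';

-- LOAD DATA INFILE reads a file on the server and needs the global FILE privilege.
GRANT FILE ON *.* TO '${etl_user}'@'localhost';

-- LOAD DATA LOCAL INFILE needs no FILE privilege, but the server must allow it.
SET GLOBAL local_infile = 1;
EOF

cat "$sql_file"
# To apply as the database root user, one could run: mysql -u root < "$sql_file"
```

If only LOAD DATA LOCAL INFILE is used, the FILE grant can be left out, which keeps the data account confined to its own database.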
I am having issues with connecting a Docker container with the MariaDB on the host. I tried various solutions, but nothing works. And now I am even facing a permission error when trying to access the database directly on the host.
@Lars, any advice on connecting a Docker container with MariaDB?
@Thomas, I do not want to break the host's database. I think I should therefore host another MariaDB server within a container.
Christian Düben
(In reply to the message from Düben, Christian <Christian.Dueben@uni-hamburg.de>, sent Thursday, 4 June 2020 18:25, subject "Re: [CollEc] RePEc Visual".)
No advice/experience with connecting out from the Docker, except that the default Linux docker setup does *not* allow for networking/bridging - that might be the reason you cannot connect. Also check permissions on the MariaDB - MySQL/MariaDB permissions are both at the user@host level, so you may need "user@*" or something like that to connect.
-- Lars Vilhuber, Economist Cornell University, Executive Director, Labor Dynamics Institute and ILR School - Department of Economics American Economic Association - Data Editor Journal of Privacy and Confidentiality - Managing Editor e: lars.vilhuber@cornell.edu p: +1.607-330-5743 v: https://cornell.zoom.us/my/larsvilhuber w: http://lars.vilhuber.com/ Assistant: ldi@cornell.edu | +1.607-255-2744
(In reply to the message from Düben, Christian <Christian.Dueben@uni-hamburg.de>, sent Tuesday, June 9, 2020 10:19, subject "RE: [CollEc] RePEc Visual".)
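The two suspects named in this advice, network reachability and user@host permissions, can be checked with a handful of commands. The list below is a dry-run sketch that only prints them. Several details are assumptions to verify locally: that MariaDB listens on port 3306, that the default Docker bridge uses the 172.17.x.x range, and the account name `collec_etl`. One detail is not an assumption: MariaDB uses `%` rather than `*` as the wildcard in the host part of an account name.

```shell
#!/bin/sh
# Sketch: checks for "container cannot reach MariaDB on the host".
# The commands are only written out and printed; nothing on the system changes.
set -eu

db_user="collec_etl"     # hypothetical account name
bridge_net="172.17.%"    # assumed default Docker bridge range; verify locally
checks_file=$(mktemp)

cat > "$checks_file" <<EOF
# 1. Which address does MariaDB listen on? 127.0.0.1 cannot be reached from a container.
ss -ltn | grep 3306
# 2. Which subnet and gateway does the default bridge network use?
docker network inspect bridge
# 3. Which user@host combinations exist?
mysql -u root -e "SELECT user, host FROM mysql.user;"
# 4. Create the account for connections arriving from the bridge range.
mysql -u root -e "CREATE USER IF NOT EXISTS '${db_user}'@'${bridge_net}' IDENTIFIED BY 'change-me';"
EOF

cat "$checks_file"
```

An account defined as 'user'@'localhost' does not match a connection arriving from a container address, so both the listening address and the host part of the account have to fit.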
Thanks. I will check that once I have access to the server again. Somehow it kicked me out and I cannot reconnect.
Christian Düben
(In reply to the message from Lars Vilhuber <lars.vilhuber@cornell.edu>, sent Tuesday, 9 June 2020 16:29, subject "Re: [CollEc] RePEc Visual".)
Düben, Christian writes
Thanks. I will check that once I have access to the server again. Somehow it kicked me out and I cannot reconnect.
??? are we talking about darni? I was out boozing with my lover. I think I'm innocent. -- Cheers, Thomas Krichel http://openlib.org/home/krichel skype:thomaskrichel
Düben, Christian writes
I am having issues with connecting a Docker container with the MariaDB on the host. I tried various solutions, but nothing works. And now I am even facing a permission error when trying to access the database directly on the host.
@Lars, any advice on connecting a Docker container with MariaDB?
@Thomas, I do not want to break the host's database. I think I should therefore host another MariaDB server within a container.
No, I don't think there is an issue with that. Use the main installation and don't worry about the existing databases. -- Cheers, Thomas Krichel http://openlib.org/home/krichel skype:thomaskrichel
Yes, we are talking about darni. I am currently using the server again. The database's permission error however remains. I somehow locked myself out.
Christian Düben
(In reply to the message from Thomas Krichel <krichel@openlib.org>, sent Tuesday, 9 June 2020 19:18, subject "Re: [CollEc] RePEc Visual".)
Düben, Christian writes
Yes, we are talking about darni. I am currently using the server again. The database's permission error however remains. I somehow locked myself out.
root@darni ~ # systemctl status mariadb.service
● mariadb.service - MariaDB 10.3.22 database server
     Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Tue 2020-06-09 17:27:29 UTC; 11s ago
       Docs: man:mysqld(8)
             https://mariadb.com/kb/en/library/systemd/
    Process: 3894144 ExecStartPre=/usr/bin/install -m 755 -o mysql -g root -d /var/run/mysqld (>
    Process: 3894145 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION >
    Process: 3894147 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= || VAR>
    Process: 3894195 ExecStart=/usr/sbin/mysqld $MYSQLD_OPTS $_WSREP_NEW_CLUSTER $_WSREP_START_>
   Main PID: 3894195 (code=exited, status=1/FAILURE)
     Status: "MariaDB server is down"
Jun 09 17:27:29 darni systemd[1]: Starting MariaDB 10.3.22 database server...
Jun 09 17:27:29 darni mysqld[3894195]: 2020-06-09 17:27:29 0 [Note] /usr/sbin/mysqld (mysqld 10>
Jun 09 17:27:29 darni mysqld[3894195]: 2020-06-09 17:27:29 0 [Warning] Can't create test file />
Jun 09 17:27:29 darni mysqld[3894195]: [90B blob data]
Jun 09 17:27:29 darni mysqld[3894195]: 2020-06-09 17:27:29 0 [ERROR] Aborting
Jun 09 17:27:29 darni systemd[1]: mariadb.service: Main process exited, code=exited, status=1/F>
Jun 09 17:27:29 darni systemd[1]: mariadb.service: Failed with result 'exit-code'.
Jun 09 17:27:29 darni systemd[1]: Failed to start MariaDB 10.3.22 database server.
Something is fishy but I am too drunk to deal with this. Prost!
-- Cheers, Thomas Krichel http://openlib.org/home/krichel skype:thomaskrichel
Thomas Krichel writes
Something is fishy but I am too drunk to deal with this. Prost!
Woke up with a hangover at 6:30. /var/lib/mysql and its subfolders were owned by systemd-coredump. It should be mysql.mysql I think, so I fixed that. And the permission of the directory had to be widened to 755 from 700. Then I found I had to remove /var/lib/mysql/tc.log. Now mysql is running again. -- Cheers, Thomas Krichel http://openlib.org/home/krichel skype:thomaskrichel
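The repair described above can be written down as a short sequence of commands. The sketch below is a dry run by default and is not a record of what was actually typed on darni: the `run` helper and the `APPLY` switch are illustrative additions, and `mysql:mysql` is the modern spelling of the mysql.mysql owner and group mentioned in the message.

```shell
#!/bin/sh
# Sketch of the repair steps, printed as a dry run unless APPLY=1 is set.
set -eu

datadir="/var/lib/mysql"    # path taken from the message above

run() {
    # Print the command; execute it only when APPLY=1 (requires root).
    echo "+ $*"
    if [ "${APPLY:-0}" = "1" ]; then "$@"; fi
}

run chown -R mysql:mysql "$datadir"    # hand the data directory back to mysql
run chmod 755 "$datadir"               # widen the directory mode from 700
run rm -f "$datadir/tc.log"            # remove the transaction coordinator log
run systemctl start mariadb.service
run systemctl status mariadb.service --no-pager
```

Running it without `APPLY=1` only lists the steps, which makes it easy to review them before touching a live data directory.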
Thanks. And sorry for breaking it in the first place. It should not happen again. I now use a containerized MariaDB which the other containers can directly access through the bridge network.
Christian Düben
(In reply to the message from Thomas Krichel <krichel@openlib.org>, sent Wednesday, 10 June 2020 03:04, subject "Re: [CollEc] RePEc Visual".)
Düben, Christian writes
Thanks. And sorry for breaking it in the first place. It should not happen again.
I now use a containerized MariaDB which the other containers can directly access through the bridge network.
This issue should not have prevented you from using the main installation. -- Cheers, Thomas Krichel http://openlib.org/home/krichel skype:thomaskrichel
I did indeed consider using the main installation. The container just turned out to be the easier solution because it automatically links the database to the other containers via the bridge network.
Christian Düben
(In reply to the message from Thomas Krichel <krichel@openlib.org>, sent Wednesday, 10 June 2020 11:14, subject "Re: [CollEc] RePEc Visual".)
A quick update on the new CollEc:
Setting up a database within a Docker container turned out to be a good idea. The database kept crashing for days. And with Docker I could simply spin it up again without having to ask you to fix the main installation every time. The source of the crashes was the mismatch between MariaDB default settings and the size of the inserted data. After adjusting various settings, the containerized database is now stable.
An issue that remains is the (LOAD DATA LOCAL INFILE) insert speed. Calculating four distance matrices of more than 2.2 billion cells each and writing them to disk does not take a lot of time. Loading them into the database, however, easily takes multiple days. MariaDB's column restriction requires the data to be inserted in long format, i.e. more than 8.8 billion rows. I am working on reducing that insert time to not more than a few hours.
Using the distance matrices' symmetry, the zeros along the main diagonal and the unconnectedness of authors across subgraphs, I cut that table length by more than half. Instead of N^{2} rows I now insert sum_{i} N_{i} (N_{i} - 1)/2 rows, where N_{i} is the number of authors in graph i. This reduces the more than 2.2 billion rows to around 1.02 billion rows for each of the four transition functions. As this still takes a long time, I am testing further modifications to the database. There are various MariaDB system variables apart from the already modified ones (net_read_timeout, net_write_timeout, wait_timeout, innodb-fatal-semaphore-wait-threshold, max_allowed_packet, innodb-buffer-pool-size) for which I am yet to figure out the appropriate levels.
Betweenness calculations now run in a manageable amount of time, but are only computed with three out of the four transition functions. The currently implemented exponential transition function generates edge weights small enough to crash the system when used in betweenness computations. The app therefore does not cover this combination.
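The row counts quoted in this update can be cross-checked with a short bound. The block below restates the formula from the message in LaTeX, adding only the assumption that the component sizes N_i sum to the total number of authors N. The chain of inequalities shows why the saving is always more than half. With N^2 slightly above 2.2 billion, N(N - 1)/2 is roughly 1.1 billion, so the reported 1.02 billion rows lie below that bound, as expected when the graph splits into several subgraphs.

```latex
% Rows per transition function, with N authors in total and N_i authors in
% connected subgraph i, assuming \sum_i N_i = N.
\begin{align*}
\text{full matrix:} \quad & N^2 \\
\text{upper triangle within subgraphs:} \quad
  & \sum_i \frac{N_i (N_i - 1)}{2}
    \;\le\; \frac{N (N - 1)}{2}
    \;<\; \frac{N^2}{2}
\end{align*}
```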
Addressing the database issues takes longer than I expected. You can test the app after I dealt with the performance bottlenecks. The app and the code generating the data once a day are ready to be deployed. Have a nice day. Kind regards, Christian Christian Düben Research Associate Chair of Macroeconomics Hamburg University Von-Melle-Park 5, Room 3102 20146 Hamburg Germany +49 40 42838 1898 christian.dueben@uni-hamburg.de http://www.christian-dueben.com -----Original Message----- From: CollEc-run <collec-run-bounces@lists.openlib.org> On Behalf Of Düben, Christian Sent: Mittwoch, 10. Juni 2020 11:24 To: Thomas Krichel <krichel@openlib.org> Cc: CollEc Run <collec-run@lists.openlib.org> Subject: Re: [CollEc] RePEc Visual I did indeed consider using the main installation. The container just turned out to be the easier solution because it automatically links the database to the other containers via the bridge network. Christian Düben Research Associate Chair of Macroeconomics Hamburg University Von-Melle-Park 5, Room 3102 20146 Hamburg Germany +49 40 42838 1898 christian.dueben@uni-hamburg.de http://www.christian-dueben.com -----Original Message----- From: Thomas Krichel <krichel@openlib.org> Sent: Mittwoch, 10. Juni 2020 11:14 To: Düben, Christian <Christian.Dueben@uni-hamburg.de> Cc: CollEc Run <collec-run@lists.openlib.org> Subject: Re: [CollEc] RePEc Visual Düben, Christian writes
Thanks. And sorry for breaking it in the first place. It should not happen again.
I now use a containerized MariaDB which the other containers can directly access through the bridge network.
This issue should not have prevented you from using the main installation. -- Cheers, Thomas Krichel http://openlib.org/home/krichel skype:thomaskrichel
Düben, Christian writes
Setting up a database within a Docker container turned out to be a good idea. It kept crashing for days. And with Docker I could simply spin it up again without having to ask you to fix the main installation every time.
The problem with that was just the ownership of the files in /var/lib/mysql, which was somehow handed to the container user.
The source of the crashes was the mismatch between MariaDB default settings and the size of the inserted data. After adjusting various settings, the containerized database is now stable.
At this time, you are the only MySQL user on the box, so your settings could be the default. Later, darni will need MySQL for mailman3, but on helos you will be the only MySQL user. Similarly, on both boxes you would be the only Java user. Thus the only common thing would be the web server, or am I wrong? If that was not running in a container, I could help you with some Redirect statements that will give us nice meaningful URLs.
Addressing the database issues takes longer than I expected.
Addressing anything like that takes more than we expect. Very sorry for the late reply. I fell severely ill with a renal colic on the 16th and then my eyes wandered off this issue. -- Cheers, Thomas Krichel http://openlib.org/home/krichel skype:thomaskrichel
Thomas Krichel writes
If that was not running in a container, I could help you with some Redirect statements that will give us nice meaningful URLs.
In the meantime, you could send me the doc page source code, and I can work on that. -- Cheers, Thomas Krichel http://openlib.org/home/krichel skype:thomaskrichel
I changed the documentation. Take a look.

After weeks of optimizing the database to efficiently handle billions of pre-calculated distance values, I decided to implement the alternative solution. The app now computes distances directly from the graph rather than reading pre-calculated distance values from the SQL database.

The database optimizations I implemented did improve the daily data insertions. However, the performance did not get anywhere near the intended level. And the more I optimized the database, the more server resources it consumed. As a result, other processes on the server became slower, including the app. Another issue with reading the distance values from the SQL database is the extraction time. Even with indexed search variables, queries on tables with over a billion rows are not instant.

I had tried to keep any network analysis computations out of the app itself. But given how the database approach played out, in-app calculations appear to be the preferable choice. Calculating distances when a user interacts with the app rather than in the daily data generation process drastically cuts the time and resources consumed by that daily data generation. And reading the main graph from an optimized file and deriving the distance values turns out to be even faster than querying those values from a long MariaDB table. Reading the graph into memory and calculating N distances from a specific author to all other authors in that graph takes between around 0.18 and 0.21 seconds. My attempt to push these 0.2 seconds closer to zero and to read a few KB of distance values rather than a 10 MB graph motivated my efforts to insert distances into the database in the first place. And now the supposedly less efficient solution turns out to be the more efficient one.

Despite the choice of the alternative solution I am glad to have tested the database implementation.
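The in-app distance calculation described above can be sketched roughly as follows. This is only an illustration under assumptions: the edge list, the column name `papers` and the author labels are hypothetical, the graph is built in memory rather than read from the optimized file mentioned in the message, and the inverse transition function is used for the weights.

```r
library(igraph)

# Hypothetical co-authorship edge list; 'papers' is the number of joint papers.
edges <- data.frame(
  from   = c("A", "A", "B", "C"),
  to     = c("B", "C", "C", "D"),
  papers = c(1, 4, 2, 1)
)
g <- graph_from_data_frame(edges, directed = FALSE)

# Inverse transition function: more joint papers means a lower transition cost.
w <- 1 / E(g)$papers

# Shortest cost path lengths from author "A" to every author in the graph.
# The result is a 1 x N matrix with author names as column names.
d <- distances(g, v = "A", weights = w)
print(d)
```

Only one row of the distance matrix is computed per request, which is why this stays cheap compared with storing and querying all pairwise values.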
I gained insights into the technical features of MariaDB (and MySQL) and the InnoDB engine which I may use at a later point. Much of the other data used by the app is still read from MariaDB tables. And that works fine.

Now the app is almost ready to be publicly released. I just have to fix a bug in the app's distances tab and produce the introductory video. And we have to agree on the open content questions and have to connect the app's port with one of the RePEc URLs. As I mentioned in the previous e-mail, everything related to the new implementation is up for discussion.

May I install Nginx on the server? Then I can link ShinyProxy's port 8080 to the default HTTP port 80.

You asked for the documentation's source code. Here it is:

h1("Documentation"),
h3("Network"),
p("CollEc constructs and examines the co-authorship network using methods from the field of network analysis. Assuming no computer science background on the side of many CollEc users, this documentation begins with a short introduction on the basics of networks."),
p("Graphs, i.e. networks, exist in many different applications. Those include websites on the internet, geo-spatial data, social media connections, co-authorship among economists etc. A graph consists of vertices, also called nodes, that are connected via edges. A vertex is e.g. a location in geo-spatial data, a registered social media user, a website or a researcher. Edges between vertices are e.g. roads between locations, links to other websites, co-authorship between economists etc. Both vertices and edges have attributes. A typical vertex attribute is a name. That might be the name of a location, or as in the case of CollEc the name of an economist who published co-authored research. A common edge attribute is a weight. Weights express the transition costs between vertices. In geo-spatial applications the weight might express the distance between locations. In CollEc the weight represents the degree of collaboration, i.e. the number of papers two authors wrote together. Edge weights are determined by transition functions. In CollEc you can interactively choose between different functional forms that model the transition cost between co-authors as a non-linear function of the number of joint papers. The transition costs are symmetric as CollEc uses undirected graphs. Moving from author A to author B is as costly as moving from B to A."),
p("The following plot illustrates the concept of graphs. Authors A to K, the white vertices, are connected by joint research, the blue edges."),
img(src = "Example_Graph.png", width = "300px", style = "display: block; margin-left: auto; margin-right: auto;"),
p("CollEc's network currently consists of more than 47,000 authors registered through the ", a(href = "https://authors.repec.org/", "RePEc Author Service", target = "_blank"), ", a RePEc service maintained by ", a(href = "https://ideas.repec.org/zimm/", "Christian Zimmermann", target = "_blank"), ". Each of them has at least one co-authored paper listed on RePEc. Not all of them are connected to the same graph. In fact there are over 900 unconnected sub-graphs. The largest one contains around 44,900 people, the vast majority of vertices. The remaining graphs are small and consist of e.g. two otherwise unconnected people who published a joint paper. You can find an author's graph order, the graph or network size, below distance, closeness and betweenness plots."),
h3("Variable Definition"),
h4("Distance Measures"),
p("The computed distance is the length of the shortest cost path between the two selected authors. Edge weights measure the distance between adjacent authors. The shortest cost path is, thus, the connection between two authors that minimizes the sum of edge weights, the path's length. Since the input is an undirected graph with exclusively positive weights the shortest paths are derived through ", a(href = "https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm", "Dijkstra's algorithm", target = "_blank"), ". The generated distance values are comparable within but not between transition functions."),
p("If you set ", tags$b("Weighted Edges"), " to ", tags$b("No"), ", edges are not weighted by the number of joint papers. Instead they all receive a weight of one. So they are in fact weighted. But with all weights set to the same value the weights do not play a role."),
p("If you set ", tags$b("Weighted Edges"), " to ", tags$b("Yes"), " and ", tags$b("Transition Function"), " to ", tags$b("Inverse"), ", edges are weighted by the inverse of the quantity of joint papers:"),
uiOutput("eq_distance_i"),
p("If you set ", tags$b("Weighted Edges"), " to ", tags$b("Yes"), " and ", tags$b("Transition Function"), " to ", tags$b("Gravity"), ", edges are weighted by the inverse of the squared quantity of joint papers:"),
uiOutput("eq_distance_g"),
p("If you set ", tags$b("Weighted Edges"), " to ", tags$b("Yes"), " and ", tags$b("Transition Function"), " to ", tags$b("Exponential"), ", edges are weighted by an exponential function based on the quantity of joint papers:"),
uiOutput("eq_distance_e"),
p("The following figure illustrates how the different transition functions translate the number of joint papers into edge weights. If two authors only collaborated on one paper, it does not matter which transition function is selected. They all attribute a value of one to this connection. Beyond the first paper, the effect varies between functions. ", tags$b("Inverse"), ", ", tags$b("Gravity"), " and ", tags$b("Exponential"), " all model diminishing returns to co-authorship with the same person. However, with ", tags$b("Gravity"), " the edge weight, i.e. the transition cost or the distance, drops more drastically over the first few papers than it does with ", tags$b("Inverse"), " and ", tags$b("Exponential"), "."),
img(src = "Transition_Functions.png", width = "500px", style = "display: block; margin-left: auto; margin-right: auto;"),
p("CollEc users access bilateral distances through a plot of the following type."),
img(src = "Distances_Example.png", width = "500px", style = "display: block; margin-left: auto; margin-right: auto;"),
p("In this example, the selected authors are Christian Düben and Thomas Krichel and distances are based on the inverse transition function. The two distributions represent the kernel density estimates of distances from the two authors. The curves are similar in shape but shifted along the horizontal axis. Thomas Krichel is closer to many authors in the graph than Christian Düben is. Junior researchers like Christian Düben tend to be less closely connected than people like Thomas Krichel who have been in the field for decades. The red line denotes the bilateral distance between the two selected authors. It is slightly above the blue distribution's mode and somewhere in the green distribution's upper tail. From Thomas Krichel's perspective, Christian Düben is a fairly distant author, while from Christian Düben's perspective, Thomas Krichel is just as far as many other authors are. A short text underneath the plot states the graph size and bilateral distance. The two authors in this example are part of the main graph with around 45,000 people and are located at a distance of around 2.043."),
h4("Closeness"),
p(a(href = "https://en.wikipedia.org/wiki/Closeness_centrality", "Closeness", target = "_blank"), ", or closeness centrality, is the reciprocal of the sum of the lengths of the shortest paths between a vertex and all other vertices. High closeness values imply short paths to other vertices and thus a central position."),
uiOutput("eq_closeness"),
p("\\(d(v,i)\\) is the length of the shortest cost path between vertex \\(v\\) and vertex \\(i\\). Given the constant inflow of new authors into the network, closeness values are not comparable over time."),
p("A closeness plot with Nobel prize laureate Esther Duflo as selected author and distances based on the exponential transition function looks as follows."),
img(src = "Closeness_Example.png", width = "500px", style = "display: block; margin-left: auto; margin-right: auto;"),
p("The blue distribution illustrates kernel density estimates of all authors' closeness values in that graph. Esther Duflo is part of the main graph which contains around 45,000 authors and with it around 45,000 closeness values. With a closeness of around 2.24e-05, she is one of the most centrally located economists. The red line representing that value is near the upper end of the closeness distribution. Graph size and closeness value are stated in a short text underneath the plot."),
h4("Betweenness"),
p(a(href = "https://en.wikipedia.org/wiki/Betweenness_centrality", "Betweenness", target = "_blank"), ", or betweenness centrality, measures the number of shortest paths passing through a vertex. High betweenness values imply high centrality."),
uiOutput("eq_betweenness"),
p("\\(\\sigma_{ij}\\) represents the number of shortest paths from vertex \\(i\\) to vertex \\(j\\). And \\(\\sigma_{ij}(v)\\) is the number of those paths passing through vertex \\(v\\). Given the constant inflow of new authors into the network, betweenness values are not comparable over time."),
p("A betweenness plot with Nobel prize laureate Abhijit Banerjee as selected author and distances based on the gravity transition function looks as follows."),
img(src = "Betweenness_Example.png", width = "500px", style = "display: block; margin-left: auto; margin-right: auto;"),
p("The blue distribution illustrates kernel density estimates of all authors' log betweenness values in that graph. It displays the logarithm of betweenness because the distribution of actual betweenness is so wide that its shape can merely be guessed from a plot of this size. Abhijit Banerjee is part of the main graph which contains around 45,000 authors and with it around 45,000 betweenness values. With a betweenness of around \\(\\log(6,188,136) \\approx 15.638 \\), he is one of the most centrally located economists. The red line representing that value is near the upper end of the betweenness distribution. Graph size and betweenness value are stated in a short text underneath the plot."),
h3("Technical Implementation"),
h4("Data Generation"),
p("CollEc retrieves information on co-authorship from the ", a(href = "https://authors.repec.org/", "RePEc Author Service", target = "_blank"), ". In particular, it extracts a vector of authors from each co-authored paper. These are then merged into a graph with edges weighted according to one of the four transition functions. Any result available in this web application is derived from one of these graphs. The respective code is written in R with the graph construction and analysis executed through the ", a(href = "https://igraph.org/r/", "igraph package", target = "_blank"), ". igraph is a wrapper for functions written in C and C++ which makes it very efficient. Calculating more than 2.2 billion shortest cost path lengths and writing them to disk only takes a few minutes in an eight-CPU process."),
p("Much of that data is then inserted into a SQL database. As that process takes considerably more time than just writing the data to disk, the code does not insert the full distance matrix. Instead of the \\(N \\times N\\) cells, which is currently more than 2.2 billion distance values, it uses around 1.02 billion cells. Three properties allow the data to be cut by more than 50 percent without losing information. First, an undirected graph's distance matrix is symmetric. Author A is as far from author B as author B is from author A. Second, all values along the main diagonal, the distance from an author to him- or herself, are zero. Third, only authors that are part of the same graph are located at a finite bilateral distance. The number of stored distance values is thus \\( \\sum_{i} N_{i} (N_{i} - 1)/2 \\) where \\( N_{i} \\) is the number of authors in graph \\( i \\)."),
p("Betweenness calculations using the exponential transition function include values small enough to crash the machine and are, thus, not implemented at this point."),
p("CollEc retrieves RePEc Author Service data and computes the respective results once a day. Check the footer for the current update status. The respective processes, i.e. the database and the R script generating the data, are executed from within ", a(href = "https://www.docker.com/", "Docker", target = "_blank"), " containers."),
h4("Web Application"),
p("The web application is written in ", a(href = "https://shiny.rstudio.com/", "R Shiny", target = "_blank"), ", which merges server-side R with client-side HTML, CSS and JavaScript. It reads data from the above-mentioned SQL database and displays it. The process generating the data and updating the database once a day runs independently of the web application."),
p("CollEc uses ", a(href = "https://www.shinyproxy.io/", "ShinyProxy", target = "_blank"), " to deploy the app. When a user visits the website, ShinyProxy, which is itself containerized, spins up the app through another Docker container."),
h3("Privacy"),
p("CollEc does not set any cookies apart from the ones necessary to navigate a ", a(href = "https://shiny.rstudio.com/", "Shiny", target = "_blank"), " web application. Users are not tracked anywhere outside this website and are not analyzed. There are no personalized ads and no data is shared with third parties. Dropdown menu selections, including author names, transition functions etc., are only stored as long as a session is active. They are, therefore, deleted within minutes of a user's inactivity."),
p("ShinyProxy logs access times, container crashes etc. but does not track what happens within the app. The underlying R session's output is usually not printed to a file, except during testing and debugging by the maintainer."),
p("CollEc's decision not to track and analyze users and web application usage is motivated by compliance with strict European regulation. Instead of directly observing who uses the app and how the app is used, CollEc relies on users to explicitly report their experience. If you encounter any errors, long loading times or other issues, report them ", a(href = "https://docs.google.com/forms/d/e/1FAIpQLSc6n-6FlzZx6YBorjlsSWpGm8PHbHAVxC9b...", "anonymously", target = "_blank"), " or contact the ", a(href = "http://www.christian-dueben.com", "maintainer", target = "_blank"), "."),
p("The introduction tab's video is a special case. It is a YouTube video embedded with the 'nocookies' option. YouTube only places cookies in the user's browser once he or she clicks on the play button. Those cookies are subject to YouTube's cookie policy."),
p("This privacy statement, including the extent to which data is stored and to which cookies are used, may change with future updates to the web application."),
h3("Data Access"),
p("At this point, CollEc data is only available through this application's graphical output. A functionality to download the tabular data behind it will be added in one of the next updates. In the meantime you can use other ", a(href = "https://ideas.repec.org/getdata.html", "RePEc data", target = "_blank"), "."),
h3("History"),
p("The first version of CollEc dates back to the year <YEAR>. It was developed by ", a(href = "http://openlib.org/home/krichel/", "Thomas Krichel", target = "_blank"), " who also founded the RePEc Author Service. He wrote software computing closeness and betweenness centrality using Perl and displayed the results on a static website. An ", a(href = "http://collec.repec.org/", "image", target = "_blank"), " <UPDATE LINK> of that version is still available."),
p("In 2020, after decades of maintaining this project, Thomas Krichel transferred it to ", a(href = "http://www.christian-dueben.com", "Christian Düben", target = "_blank"), ". With the new maintainer came a new implementation. CollEc was re-written from scratch. Migrating the network analysis from Perl to modern C and C++ code wrapped in R functions boosted efficiency and facilitated extensions to the analysis. The primary extensions are bilateral distances and weighted edges. The interface through which users view the data changed in various regards. Web applications' larger complexity compared to static websites gave the new maintainer the flexibility to fundamentally redefine how the data is presented. The new CollEc puts results into perspective using graphical output. When a user inquires about the distance between two authors, CollEc generates a plot comparing that bilateral distance to the distances to all other authors in the network. A short text states further information on network size etc. The new CollEc revolves around the same approach as ", a(href = "http://graphec.repec.org/", "GraphEc", target = "_blank"), ", another recently developed RePEc service, does. It is a highly interactive tool presenting easily interpretable results and comparisons."),
h3("Contributions"),
p("If you would like to contribute to CollEc, register with the ", a(href = "https://authors.repec.org/", "RePEc Author Service", target = "_blank"), " and promote it among your colleagues. RePEc handles are unique identifiers that are assigned to everything listed in RePEc, from authors to papers, journals and working paper series. Especially in the case of authors they are a major advantage over bibliographic databases that only match by name. Duplicated names are very common in a field as large as economics. And creating a network based on names would be heavily distorted. CollEc therefore constructs the co-authorship network using RePEc handles. And for an author to be assigned a RePEc handle he or she must register with the RePEc Author Service. Each additional registered economist with at least one co-authored paper fills in a missing link in CollEc."),
p("Other types of contribution are also welcome. Feel free to contact CollEc's current maintainer ", a(href = "http://www.christian-dueben.com", "Christian Düben", target = "_blank"), " with your suggestions. Errors in the web application can be reported ", a(href = "https://docs.google.com/forms/d/e/1FAIpQLSc6n-6FlzZx6YBorjlsSWpGm8PHbHAVxC9b...", "anonymously", target = "_blank"), " or via e-mail to the maintainer."),
h3("Citing CollEc"),
p(a(href = "http://www.christian-dueben.com", "Christian Düben", target = "_blank"), " is currently working on a CollEc-based paper which will be mentioned here at some point."),
tags$footer(tags$small("CollEc was founded by", a(href = "http://openlib.org/home/krichel/", "Thomas Krichel"), "and is currently maintained by", a(href = "http://www.christian-dueben.com", "Christian Düben", target = "_blank"), ". The server is sponsored by", a(href = "https://www.symplectic.co.uk/", "Symplectic", target = "_blank"), ". Report errors using this ", a(href = "https://docs.google.com/forms/d/e/1FAIpQLSc6n-6FlzZx6YBorjlsSWpGm8PHbHAVxC9b...", "form", target = "_blank"), ". Latest data update: ", textOutput("current_date_text_dc", inline = TRUE)), ".", img(src = "symplectic_logo.png", align = "right"), style = "position: static; left: 1; bottom: 0; width: 98%; text-align: left; margin:10px 0px")

Christian Düben

-----Original Message-----
From: Thomas Krichel <krichel@openlib.org>
Sent: Friday, July 3, 2020 19:06
To: Düben, Christian <Christian.Dueben@uni-hamburg.de>
Cc: CollEc Run <collec-run@lists.openlib.org>
Subject: Re: [CollEc] RePEc Visual

Thomas Krichel writes
If that was not running in a container, I could help you with some Redirect statements that will give us nice meaningful URLs.
In the meantime, you could send me the doc page source code, and I can work on that. -- Cheers, Thomas Krichel http://openlib.org/home/krichel skype:thomaskrichel
I forgot to mention that I changed the authors displayed by default in the selectize.js field. They are now the main graph's authors with the highest closeness values in the inverse transition function case. Among them are currently e.g. Daron Acemoglu, James Robinson etc. The footer position remains to be changed and the introduction to be rewritten. I am going to take care of that.

Christian Düben
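The default-author selection mentioned in the message above (main graph, inverse transition function, highest closeness) could look roughly like the following sketch. It is not the app's actual code: the edge list, the `papers` column and the choice of two default authors are assumptions made for illustration.

```r
library(igraph)

# Hypothetical edge list: four connected authors plus one isolated pair (E, F).
edges <- data.frame(
  from   = c("A", "A", "B", "C", "E"),
  to     = c("B", "C", "C", "D", "F"),
  papers = c(1, 4, 2, 1, 3)
)
g <- graph_from_data_frame(edges, directed = FALSE)

# Keep only the largest connected component, i.e. the "main graph".
comp <- components(g)
main <- induced_subgraph(g, which(comp$membership == which.max(comp$csize)))

# Closeness with inverse weights: 1 / (sum of shortest cost path lengths).
cl <- closeness(main, weights = 1 / E(main)$papers)

# Names of the two authors with the highest closeness values.
top_authors <- names(sort(cl, decreasing = TRUE))[1:2]
print(top_authors)
```

In this toy graph the selection yields authors C and A, the two vertices with the smallest summed distances to everyone else in the main component.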
I changed the footer settings. And the distances tab works now.

Christian Düben
Participants (4): Christian Zimmermann, Christian Düben, Lars Vilhuber, Thomas Krichel