Hi Christian, cool. Two things: since I'm late to this discussion, I actually don't know where to find "RePEc Visual". Second, I have been working a lot with Docker lately and have some experience by now (won the hard way, the old-fashioned way). If you point me to your Git repo and/or your Docker build file, I can provide some insights/help.

Lars

--
Lars Vilhuber, Economist
Cornell University, Executive Director, Labor Dynamics Institute
and ILR School - Department of Economics
American Economic Association - Data Editor
Journal of Privacy and Confidentiality - Managing Editor
e: lars.vilhuber@cornell.edu
p: +1.607-330-5743
v: https://cornell.zoom.us/my/larsvilhuber
w: http://lars.vilhuber.com/
Assistant: ldi@cornell.edu | +1.607-255-2744

________________________________
From: Düben, Christian <Christian.Dueben@uni-hamburg.de>
Sent: Friday, May 29, 2020 16:12
To: Thomas Krichel <krichel@openlib.org>; Lars Vilhuber <lars.vilhuber@cornell.edu>
Cc: CollEc Run <collec-run@lists.openlib.org>
Subject: RE: RePEc Visual

Dear All,

Thanks again for the insightful Zoom meeting. This helped me a lot in developing a new version of CollEc. And thank you, Thomas, for allowing me to advance this interesting project.

After spending much of this week on developing the new CollEc, it is now nearly done. The user interface is similar to that of RePEc Visual. Users interactively navigate through the app and are presented with CollEc data through graphical output. The documentation contains variable definitions, details on the technical implementation, etc. I wrote the documentation so that someone who has never heard of graph theory understands what CollEc displays. The SQL implementation and the code generating the CollEc data from RePEc Author Service inputs are also almost ready for upload.

What I am still having some difficulties with is deploying apps through Docker containers. I am going to take a course on Docker this weekend and expect to deploy a test app some time next week.

CollEc's server does not really have any vacant CPU capacity to test the web application. I therefore suggest uploading the test application to another server where everyone on this list can access it with a password. After this testing phase, it can then be officially released on the actual server.

Feel free to share your thoughts on this.

Have a nice weekend.

Kind regards,
Christian

Christian Düben
Research Associate
Chair of Macroeconomics
Hamburg University
Von-Melle-Park 5, Room 3102
20146 Hamburg
Germany
+49 40 42838 1898
christian.dueben@uni-hamburg.de
http://www.christian-dueben.com

-----Original Message-----
From: Thomas Krichel <krichel@openlib.org>
Sent: Thursday, May 21, 2020 16:03
To: Düben, Christian <Christian.Dueben@uni-hamburg.de>
Cc: CollEc Run <collec-run@lists.openlib.org>
Subject: Re: RePEc Visual

Düben, Christian writes
I weighted the edges between co-authors by their number of joint papers.
As far as I understand, we need a binary network. Otherwise we can easily be in a situation where we say that the shortest path between A and C is through B, even though A and C have written a paper together.
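
To make the concern concrete, here is a minimal sketch of the situation described above. It assumes, purely for illustration, that the number of joint papers is used directly as the edge cost in Dijkstra's algorithm; the three authors and their paper counts are hypothetical and not taken from CollEc data.

```python
import heapq


def dijkstra(graph, source):
    """Return the cheapest path cost from source to every reachable node."""
    dist = {source: 0}
    queue = [(0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for neighbor, cost in graph[node].items():
            new_d = d + cost
            if new_d < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_d
                heapq.heappush(queue, (new_d, neighbor))
    return dist


def build_graph(edges, binary):
    """Build an undirected graph; binary=True gives every edge cost 1."""
    graph = {}
    for (u, v), n_papers in edges.items():
        cost = 1 if binary else n_papers  # assumption: paper count used as cost
        graph.setdefault(u, {})[v] = cost
        graph.setdefault(v, {})[u] = cost
    return graph


# Hypothetical counts: A and C share 3 papers, A-B and B-C share 1 each.
joint_papers = {("A", "B"): 1, ("B", "C"): 1, ("A", "C"): 3}

weighted = dijkstra(build_graph(joint_papers, binary=False), "A")
binary = dijkstra(build_graph(joint_papers, binary=True), "A")

print(weighted["C"])  # cheapest weighted route from A to C
print(binary["C"])    # hop count from A to C in the binary network
```

In this toy case the weighted route from A to C costs 2 via B, which undercuts the direct edge of cost 3, whereas the binary network reports a distance of 1 because A and C are direct co-authors.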
First, I calculated the distance matrix. Distances are measured as the length of Dijkstra's shortest cost paths. Calculating and writing those 2,227,084,864 cell values to disk took 4.77 minutes in a process parallelized across 8 cores. Computing each author's closeness value and writing it to disk took 4.27 minutes in an 8-core process. Betweenness is quite slow in comparison.
It would be many many times faster than what I do now.
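
For readers unfamiliar with the two steps mentioned above (distance matrix, then closeness), the following is a small sketch on a binary network. It is not the CollEc implementation: it uses breadth-first search instead of Dijkstra, which gives the same distances when every edge has cost 1, and it assumes a common definition of closeness (number of reachable authors divided by the sum of distances to them). The toy network is hypothetical. As a side note, 2,227,084,864 equals 47,192 squared, which would be consistent with a square matrix over 47,192 authors.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Hop counts from source to every reachable author in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def closeness(adjacency):
    """Assumed definition: reachable authors divided by the sum of distances."""
    scores = {}
    for author in adjacency:
        dist = bfs_distances(adjacency, author)  # one row of the distance matrix
        total = sum(dist.values())
        scores[author] = (len(dist) - 1) / total if total > 0 else 0.0
    return scores


# Hypothetical toy network: a chain of co-authors A - B - C - D.
toy = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
scores = closeness(toy)
print(scores)

# Arithmetic check on the matrix size quoted in the message.
n_authors = 47192
cells = n_authors ** 2
print(cells)
```

Each call to `bfs_distances` is independent of the others, which is why this kind of computation can be split across cores by assigning different source authors to different workers.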
See you tomorrow.
Yes, 19:30 my time.

--
Cheers,
Thomas Krichel
http://openlib.org/home/krichel
skype:thomaskrichel