Re: [CollEc] repec genealogy/citation/coauthor raw data
Lars Vilhuber wrote:
this is what I was aiming to do:
http://www.vrdc.cornell.edu/repecgraph/
in my abundant free time. I have a programmer who would be perfectly capable of doing that, but no time right now (may change in a year). Also, if you are interested, I could propose this to one of our Cornell CS classes - they do 'client-oriented visualization projects' and could probably do this with a dump of the data, if not an API.
These are just idle thoughts, but who knows...
'Christian Zimmermann' writes
I always wanted http://collec.repec.org/ to have something like this.
Me too. But I don't have the expertise.
Thomas may help you get access to the CollEc data/server for starters.
I surely will be pleased to do that.
--
Cheers,
Thomas Krichel http://openlib.org/home/krichel skype:thomaskrichel
Thomas, Christian:
(I'll continue the discussion in English, since I might want to forward this to my RAs)
This is doable!
I may have a very competent undergraduate RA to work on this in the Spring. I'll have more details (on his availability) in about 2 weeks.
I'm putting together a quick write-up of why this makes sense from our projects, and why it might make sense for RePEc (or in general). I'm aiming for something that would be maintainable as an actual web app on (your? our?) servers as part of the network.
I can see all the data (I think) I need on the page that Christian sent me, except for the genealogy data. Possibly Thomas has data in a more concise format than us parsing the RePEc:per archive? But that's no problem. Can you provide me with a one-time dump (extract probably is NOT sufficient if done randomly, but could be time-based - birthday of the person in the database?) for now of the genealogy database? (what format are you using there?)
-- Lars Vilhuber | Executive Director Economist | Labor Dynamics Institute lars.vilhuber@cornell.edu -+- Cornell University Office/Cell : +1.607-330-5743 | ILR School - Department of Economics Fax : +1.866-873-9078 + U.S. Census Bureau - CES - LEHD
Labor Dynamics Institute: http://www.ilr.cornell.edu/ldi Visit the Cornell VirtualRDC: http://www.vrdc.cornell.edu
________________________________________ From: Thomas Krichel <krichel@openlib.org> Sent: Saturday, October 25, 2014 03:17 To: Lars Vilhuber Cc: CollEc Run; Christian Zimmermann Subject: Re: repec genealogy/citation/coauthor raw data
[...]
Lars Vilhuber writes
I'm putting together a quick write-up of why this makes sense from our projects, and why it might make sense for RePEc (or in general). I'm aiming for something that would be maintainable as an actual web app on (your? our?) servers as part of the network.
It would be built into http://collec.repec.org. I can do the integration as long as the result is available in XML. You may also want to use the CollEc server for the project.
Possibly Thomas has data in a more concise format than us parsing the RePEc:per archive?
Yes, on icanis@katri.openlib.org. Send me a public key so I can add you to the account, or send me a username as well and I will open a separate user.
--
Cheers,
Thomas Krichel http://openlib.org/home/krichel skype:thomaskrichel
Thomas:
for development including undergraduate RAs etc. (regardless of how good they are), I prefer to do the development elsewhere. I have a bunch of virtual servers. If you let me know what collec.repec.org is running (ubuntu, debian, centos, opensuse, etc.) I can replicate that on the development server.
The visualization itself may have some requirements that we'll need to figure out. Right now, exploration is in pure javascript, but that may not scale well (that's part of the project to figure that out).
For the concise data and SSH access, use the public key attached. That's mine - I won't let the RAs mess around.
-- Lars Vilhuber
________________________________________ From: Thomas Krichel <krichel@openlib.org> Sent: Tuesday, October 28, 2014 10:56 To: Lars Vilhuber Cc: CollEc Run Subject: Re: repec genealogy/citation/coauthor raw data
[...]
Lars Vilhuber writes
for development including undergraduate RAs etc. (regardless of how good they are), I prefer to do the development elsewhere. I have a bunch of virtual servers. If you let me know what collec.repec.org is running (ubuntu, debian, centos, opensuse, etc.) I can replicate that on the development server.
That's good. Load on katri (the server we use) is very high. It now maintains close to 1 billion paths. I run debian testing. If you set up the server, I suggest you create an "icanis" account, and I can rsync the data to you.
The visualization itself may have some requirements that we'll need to figure out. Right now, exploration is in pure javascript, but that may not scale well (that's part of the project to figure that out).
The CollEc site uses XSLT, so the data that you generate needs to be in XML to fit. Sure, it could point to graphics, say in a different format, or to files with
BTW, I was in Ithaca last year. Too bad I did not think about contacting you.
--
Cheers,
Thomas Krichel http://openlib.org/home/krichel skype:thomaskrichel
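A minimal sketch of what XML output for the coauthor network might look like, so that an XSLT-based site could transform it. The thread does not specify a schema: the element and attribute names (`graph`, `node`, `edge`, `weight`) and the author handles below are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET


def coauthor_graph_to_xml(edges):
    """Serialize (author_a, author_b, joint_papers) tuples as an XML string.

    The element names used here are hypothetical, not a CollEc format.
    """
    root = ET.Element("graph")
    # Collect every author handle that appears in at least one edge.
    authors = sorted({a for a, _, _ in edges} | {b for _, b, _ in edges})
    nodes = ET.SubElement(root, "nodes")
    for handle in authors:
        ET.SubElement(nodes, "node", id=handle)
    links = ET.SubElement(root, "edges")
    for a, b, weight in edges:
        # Attribute values must be strings, so the count is converted.
        ET.SubElement(links, "edge", source=a, target=b, weight=str(weight))
    return ET.tostring(root, encoding="unicode")


# Placeholder handles, not real RePEc identifiers.
sample_edges = [("pxx1", "pzz3", 3), ("pyy2", "pzz3", 1)]
xml_text = coauthor_graph_to_xml(sample_edges)
print(xml_text)
```

Keeping nodes and edges in separate lists is one common choice; it lets a stylesheet or a javascript front end read either part without scanning the other.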
I do not (yet) have a way to share the genealogy data, but we can talk about what would be most suitable.
My optimal outcome would be to have an app that can be used on CollEc, your site and the RePEc Genealogy to show the respective networks. CollEc and RG would be limited to their data, you could show all you want.
Christian Zimmermann FIGUGEGL! Economic Research Federal Reserve Bank of St. Louis P.O. Box 442 St. Louis MO 63166-0442 USA http://ideas.repec.org/zimm/ @CZimm_economist
On Tue, 28 Oct 2014, Lars Vilhuber wrote:
[...]
Christian, Thomas:
just a brief update. We've identified a (very) qualified (undergrad) RA, who'll work on this part time starting mid-January.
We'll start with a demo using a (static) data dump (I have grabbed a one-time copy of Thomas' data, will still need a dump of the genealogy data, unless the RA simply decides to scrape it). We can discuss how to most efficiently "collect" the data necessary for graphing and exploring in an "implemented production" server once we get there.
I'm excited!
Lars
-- Lars Vilhuber
________________________________________ From: 'Christian Zimmermann' <zimmermann@stlouisfed.org> Sent: Tuesday, October 28, 2014 11:03 To: Lars Vilhuber Cc: Thomas Krichel; CollEc Run Subject: Re: repec genealogy/citation/coauthor raw data
[...]
Tell me in what format you want the genealogy data. I have it in MySQL.
Christian Zimmermann
On Fri, 7 Nov 2014, Lars Vilhuber wrote:
[...]
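On the open question of a dump format for the genealogy data: one simple possibility is a flat edge list with one advisor/student pair per line. The sketch below is only an illustration. The column names (`advisor`, `student`, `year`) and the sample rows are hypothetical, since the thread does not describe the MySQL schema, and the rows stand in for whatever a query over the real tables would return.

```python
import csv
import io

# Hypothetical rows shaped like a query result; the real schema is unknown.
rows = [
    {"student": "pxx1", "advisor": "pyy2", "year": 1998},
    {"student": "pzz3", "advisor": "pyy2", "year": 2004},
]


def write_edge_list(rows, handle):
    """Write one advisor -> student edge per line as CSV with a header row."""
    writer = csv.DictWriter(handle, fieldnames=["advisor", "student", "year"])
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


# An in-memory buffer keeps the example self-contained; a file opened with
# newline="" would work the same way.
buffer = io.StringIO()
write_edge_list(rows, buffer)
dump_text = buffer.getvalue()
print(dump_text)
```

A static file like this would be enough for a demo built on a one-time dump, and it can be regenerated later if a production setup needs fresh data.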