Difference between revisions of "LGI Pilotjob Framework"

From PDP/Grid Wiki
(→‎R: add links)
m (→‎R: restructure links)
Line 21: Line 21:
 
One idea is to use this framework to provide easy access to R on the grid.
 
One idea is to use this framework to provide easy access to R on the grid.
  
* [http://nws-r.sourceforge.net/ netWorkSpaces with R]
+
* [http://nws-r.sourceforge.net/ netWorkSpaces with R], seems to have moved to the commercial scene at [http://www.revolutionanalytics.com/ Revolution Analytics]: a special edition of R for large-scale computing
* [http://biocep-distrib.r-forge.r-project.org/ Biocep-R]
+
* [http://biocep-distrib.r-forge.r-project.org/ Biocep-R]. Example: [http://user2010.org/tutorials/Chine.html Elastic-R], a google docs-like portal for data analysis in the cloud
 
* [http://epub.ub.uni-muenchen.de/8991/1/parallelR_techRep.pdf State-of-the-art in Parallel Computing with R]
 
* [http://epub.ub.uni-muenchen.de/8991/1/parallelR_techRep.pdf State-of-the-art in Parallel Computing with R]
 
* [http://cran.r-project.org/web/views/HighPerformanceComputing.html High-Performance and Parallel Computing with R]
 
* [http://cran.r-project.org/web/views/HighPerformanceComputing.html High-Performance and Parallel Computing with R]
 
* [https://spaces.umbc.edu/display/hpc/Running+R+on+HPC Running R on HPC]
 
* [https://spaces.umbc.edu/display/hpc/Running+R+on+HPC Running R on HPC]
* [http://www.revolutionanalytics.com/ Revolution Analytics]: special edition of R for large-scale computing
 
 
* Other R portals
 
* Other R portals
 
** [http://technical.bestgrid.org/index.php/GridSphere_R_Portal_Initial_Work_Plan GridSphere R portal work plan]
 
** [http://technical.bestgrid.org/index.php/GridSphere_R_Portal_Initial_Work_Plan GridSphere R portal work plan]
** [http://user2010.org/tutorials/Chine.html Elastic-R], a google docs-like portal for data analysis in the cloud
 
  
 
==Links==
 
==Links==

Revision as of 11:04, 30 November 2010

The Leiden Grid Infrastructure (LGI) is a framework for executing high-performance applications on different computer systems. It consists of one or more project servers that keep track of all jobs and resources, and a collection of resources that regularly contact the project server for work.
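
The pull model described above, in which resources regularly contact the project server for work, can be illustrated with a small in-memory sketch. It is not LGI's actual protocol or API: the class names `ProjectServer` and `Resource`, their methods, and the job identifiers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class ProjectServer:
    """Hypothetical stand-in for a project server: tracks queued and finished jobs."""
    queue: List[Tuple[str, str]] = field(default_factory=list)  # (job_id, application)
    finished: Dict[str, str] = field(default_factory=dict)      # job_id -> result

    def submit(self, job_id: str, application: str) -> None:
        """Queue a job for the given application."""
        self.queue.append((job_id, application))

    def request_work(self, application: str) -> Optional[str]:
        """Hand out the oldest queued job for this application, or None."""
        for index, (job_id, app) in enumerate(self.queue):
            if app == application:
                del self.queue[index]
                return job_id
        return None

    def report(self, job_id: str, result: str) -> None:
        """Record the result of a finished job."""
        self.finished[job_id] = result


@dataclass
class Resource:
    """Hypothetical resource daemon that serves exactly one application."""
    server: ProjectServer
    application: str

    def poll_once(self) -> bool:
        """Contact the server once; return True if a job was run."""
        job_id = self.server.request_work(self.application)
        if job_id is None:
            return False
        # A real daemon would start the application here; this sketch
        # only reports a placeholder result back to the server.
        self.server.report(job_id, f"{self.application} finished {job_id}")
        return True
```

In this sketch the server never pushes work. A resource that has nothing to do simply gets `None` back and would try again after some interval.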

This project connects LGI to the gLite grid infrastructure (EGI) by introducing an efficient way to run jobs on the grid that are managed by an LGI setup. This would have the following benefits:

  • Lower latency than submitting jobs directly to the grid.
  • Users do not have to port an application to the grid, because the LGI administrator makes sure the applications run properly.
  • Better scalability, since the WMS can have problems handling a large number of jobs.
  • Possibility for username/password authentication instead of certificates (with the aid of a robot certificate).
  • Possibility to mix grid and non-grid systems.

With a little luck, this will make the grid more accessible to less technically minded people.

Architecture of the LGI pilotjob framework.

Architecture

An existing LGI setup is the base, and that is all that is visible to the user. It is centered around the project server, which communicates with the resources that execute the work. A resource consists of a resource daemon that runs the application. On the grid there are pilotjobs, which do nothing but run a single application by means of an LGI resource daemon; they are the workhorses. The lifecycle of these pilotjobs is managed by the pilotjob manager, which makes sure that enough pilotjobs are running at all times.

With everything properly tuned, this means in practice that users can submit jobs using one of the LGI interfaces, while they are executed on the grid.
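
The pilotjob manager's task of keeping enough pilotjobs running can be sketched as a simple sizing rule. This is only an illustration under stated assumptions, not the prototype's actual policy: the function name `pilotjobs_to_submit`, the demand-driven target, and the `min_pilots` and `max_pilots` limits are all hypothetical.

```python
def pilotjobs_to_submit(queued_jobs: int, active_pilots: int,
                        min_pilots: int = 1, max_pilots: int = 10) -> int:
    """Return how many new pilotjobs to submit to the grid.

    Assumed policy (not taken from the LGI prototype): aim for one
    pilotjob per queued job, but never fewer than min_pilots and never
    more than max_pilots. active_pilots counts pilotjobs that are
    already running or waiting in the grid queue.
    """
    target = max(min_pilots, min(queued_jobs, max_pilots))
    # Never return a negative number; surplus pilotjobs are left alone
    # and are assumed to exit on their own when no work remains.
    return max(0, target - active_pilots)
```

A manager loop would call this function periodically with counts obtained from the project server and from the grid, and submit the returned number of pilotjobs. Keeping a small minimum alive even when the queue is empty is one way to keep the latency of new jobs low.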

Status

The current prototype shows good results. Once we have gained some experience with real-world applications, we plan to release a public version.

R

One idea is to use this framework to provide easy access to R on the grid.

Links