Changes between Version 4 and Version 5 of QCG-Coordinator

Timestamp:
11/05/12 15:20:36 (11 years ago)
Author:
mmamonski
Comment:

--

  • QCG-Coordinator

    v4 v5
    1  1
    2     In most parallel toolkits used within single-cluster environments, the master process spawns the worker processes using either SSH or LRMS native interfaces. This makes the task of exchanging contact information (e.g. listening host and port) between master and workers relatively easy, as the master process is always initialized before the slave processes. With a co-allocated parallel application this is an issue, as master and workers are started independently. In the !QosCosGrid stack we solved this problem with the help of an external entity: the QCG-Coordinator service. The service implements two general operations: /!PutProcessEntry/ and /!GetProcessEntry/. The master process provides contact information using the /PutProcessEntry/ method, while the slave processes acquire this information using the /!GetProcessEntry/ method, which blocks until the information is available. This relaxes the requirement that the kernels must be started in some particular order.
       2  In most parallel toolkits used within single-cluster environments, the master process spawns the worker processes using either SSH or LRMS native interfaces. This makes the task of exchanging contact information (e.g. listening host and port) between master and workers relatively easy, as the master process is always initialized before the slave processes. With a co-allocated parallel application this is an issue, as master and workers are started independently. In the !QosCosGrid stack we solved this problem with the help of an external entity: the QCG-Coordinator service. The service implements two general operations: `PutProcessEntry` and `GetProcessEntry`. The master process provides contact information using the `PutProcessEntry` method, while the slave processes acquire this information using the `GetProcessEntry` method, which blocks until the information is available. This relaxes the requirement that the kernels must be started in some particular order.
    3  3  * /PutProcessEntry(in: key, in: data)/ - puts contact information data for a given session key,
    4  4  * /GetProcessEntry(in: key, out: data)/ - gets contact information data for a given session key.
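
The put/get rendezvous described above can be illustrated with a minimal in-process sketch. This is not the QCG-Coordinator implementation or its real interface: the class `InMemoryCoordinator`, the method names `put_process_entry` and `get_process_entry`, the `timeout` parameter, the session key `"session-42"` and the host/port values are all hypothetical, and threads stand in for independently started processes. It only shows why a blocking get lets the master and the workers start in any order.

```python
import threading


class InMemoryCoordinator:
    """Hypothetical in-process stand-in for a coordinator service."""

    def __init__(self):
        self._entries = {}                  # session key -> contact data
        self._cond = threading.Condition()  # guards _entries and wakes waiters

    def put_process_entry(self, key, data):
        """Store contact data for a session key and wake any blocked readers."""
        with self._cond:
            self._entries[key] = data
            self._cond.notify_all()

    def get_process_entry(self, key, timeout=None):
        """Block until contact data for the key is available, then return it."""
        with self._cond:
            available = self._cond.wait_for(
                lambda: key in self._entries, timeout=timeout
            )
            if not available:
                raise TimeoutError(f"no entry for session key {key!r}")
            return self._entries[key]


def run_demo():
    """Start three workers before the master has published its contact data."""
    coordinator = InMemoryCoordinator()
    results = []
    results_lock = threading.Lock()

    def worker():
        # Blocks here until the master calls put_process_entry.
        contact = coordinator.get_process_entry("session-42", timeout=5.0)
        with results_lock:
            results.append(contact)

    workers = [threading.Thread(target=worker) for _ in range(3)]
    for t in workers:
        t.start()

    # The "master" publishes its listening host and port after the workers started.
    coordinator.put_process_entry("session-42", ("node01.example.org", 5000))

    for t in workers:
        t.join()
    return results


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the workers are started first on purpose: each one waits inside `get_process_entry` until the master publishes its entry, which is the ordering freedom the paragraph above describes. A real deployment would call a remote service instead of sharing an object between threads.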