Changes between Version 6 and Version 7 of QCG-Coordinator

Timestamp:
11/05/12 15:23:45 (11 years ago)
Author:
mmamonski

  • QCG-Coordinator

    v6 v7  
    1 1 
    2   In most parallel toolkits used within single-cluster environments, the master process spawns the worker processes using either SSH or LRMS native interfaces. This makes the task of exchanging contact information (e.g. listening host and port) between master and workers relatively easy, as the master process is always initialized before the slave processes. With a co-allocated parallel application this is an issue, as master and workers are started independently. In the !QosCosGrid stack we solved this problem with the help of an external entity: the QCG-Coordinator service. The service implements two general operations: `PutProcessEntry` and `GetProcessEntry`. The master process provides contact information using the `PutProcessEntry` method, while the slave processes acquire this information using the `GetProcessEntry` method, which blocks until the information is available. This relaxes the requirement that the kernels must be started in some particular order. 
      2 In most parallel toolkits used within single-cluster environments, the master process spawns the worker processes using either SSH or LRMS native interfaces. This makes the task of exchanging contact information (e.g. listening host and port) between master and workers relatively easy, as the master process is always initialized before the slave processes. With a co-allocated parallel application this is an issue, as master and workers are started independently. In the !QosCosGrid stack we solved this problem with the help of an external entity: the QCG-Coordinator service. The service implements two general operations: `PutProcessEntry` and `GetProcessEntry`. The master process provides contact information using the `PutProcessEntry` method, while the slave processes acquire this information using the `GetProcessEntry` method, which blocks until the information is available. This relaxes the requirement that the co-allocated parts of the application must be started in some particular order. 
    3 3  * `PutProcessEntry(in: key, in: data)` - puts contact information data for a given session key, 
    4 4  * `GetProcessEntry(in: key, out: data)` - gets contact information data for a given session key. 
    5   The `GetProcessEntry` operation is blocking, i.e. it waits until the process data for a given key is available. This relaxes the requirement that the kernels must be started in some particular order. The unique session key is generated by QCG-Broker and distributed to all MUSCLE kernels. The whole process of exchanging contact information is shown in the figure below. 
      5 The `GetProcessEntry` operation is blocking, i.e. it waits until the process data for a given key is available. This relaxes the requirement that the kernels must be started in some particular order. The unique session key is generated by QCG-Broker and distributed to all application processes. The whole process of exchanging contact information is shown in the figure below. 
    6 6 
    7 7 [[Image(QCG-Coordinator.png)]]
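
The put/get exchange described above can be illustrated with a minimal in-process sketch. This is not the QCG-Coordinator implementation or its wire protocol: the class `InMemoryCoordinator`, the method names `put_process_entry` and `get_process_entry`, the `timeout` parameter, the session key string and the host/port values are all assumptions made for illustration. It only shows the blocking semantics: workers may call the get operation before the master has published anything, and they are released once the entry for the shared session key arrives.

```python
import threading


class InMemoryCoordinator:
    """Toy stand-in for a coordinator service (hypothetical, in-process only)."""

    def __init__(self):
        self._entries = {}                  # session key -> contact data
        self._cond = threading.Condition()  # guards _entries and wakes waiters

    def put_process_entry(self, key, data):
        """Store contact data for a session key and wake any blocked readers."""
        with self._cond:
            self._entries[key] = data
            self._cond.notify_all()

    def get_process_entry(self, key, timeout=None):
        """Block until data for the key is available, then return it.

        The timeout is an addition for this sketch; the source only states
        that the operation blocks until the data is available.
        """
        with self._cond:
            found = self._cond.wait_for(lambda: key in self._entries, timeout=timeout)
            if not found:
                raise TimeoutError(f"no entry for session key {key!r}")
            return self._entries[key]


def run_demo():
    """Start three workers before the master publishes its contact data."""
    coordinator = InMemoryCoordinator()
    session_key = "session-42"  # in the real stack this comes from QCG-Broker
    received = []               # list.append is thread-safe in CPython

    def worker():
        # Workers start first and block here until the master has published.
        received.append(coordinator.get_process_entry(session_key, timeout=5.0))

    workers = [threading.Thread(target=worker) for _ in range(3)]
    for thread in workers:
        thread.start()

    # The master starts "late" and publishes its listening host and port.
    coordinator.put_process_entry(session_key, ("master.example.org", 5000))

    for thread in workers:
        thread.join()
    return received


if __name__ == "__main__":
    print(run_demo())
```

Because readers wait on the key rather than on the master itself, the start order of the master and the workers does not matter, which is the property the text attributes to the blocking `GetProcessEntry` operation.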