Version 30 (modified by mmamonski, 12 years ago)

QCG-Computing Installation

The QCG-Computing service (formerly known as Smoa Computing) is an open source service that acts as a computing provider, exposing on-demand access to computing resources and jobs over an HPC Basic Profile compliant Web Services interface. In addition, QCG-Computing offers a remote interface for Advance Reservation management.

Requirements

Before installing QCG-Computing, make sure that the software listed in the following subsections is installed. You may also read distribution specific notes available for:

Dependent libraries

  1. libxml2 >= 2.6.27
  2. openssl >= 0.9.7
  3. unixODBC >= 2.2.0
  4. Python (with the lxml package) >= 2.4

Note: If you install the libraries from binary packages, check that the development versions (i.e. those with header files) are installed.
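The minimum versions above can be compared mechanically once you have the installed version strings (for example from xml2-config --version or openssl version). The following Python sketch only compares version strings against the minimums listed above. The helper names are illustrative and are not part of QCG-Computing.

```python
import re

# Minimum versions copied from the list of dependent libraries above.
MINIMUMS = {
    "libxml2": "2.6.27",
    "openssl": "0.9.7",
    "unixODBC": "2.2.0",
    "Python": "2.4",
}


def parse_version(text):
    """Turn '0.9.8e' or '2.6.27' into a tuple of integers such as (0, 9, 8)."""
    return tuple(int(part) for part in re.findall(r"\d+", text))


def meets_minimum(name, installed):
    """Return True if the installed version is at least the required minimum."""
    return parse_version(installed) >= parse_version(MINIMUMS[name])


if __name__ == "__main__":
    # Hypothetical installed versions, used only to show the call.
    for name, installed in [("libxml2", "2.9.1"), ("openssl", "0.9.6")]:
        verdict = "ok" if meets_minimum(name, installed) else "too old"
        print("%s %s: %s" % (name, installed, verdict))
```

Letter suffixes such as the "e" in 0.9.8e are ignored by this comparison, which is sufficient for a minimum-version check but is not a full version parser.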

Database Backend

For the database backend we recommend PostgreSQL, as QCG-Computing is shipped with an SQL script that creates the database schema for PostgreSQL. Check that the ODBC drivers for PostgreSQL are installed (i.e. that the libodbcpsql.so and libodbcpsqlS.so files are available on your system); if not, install them separately.

DRMAA and ARES libraries

If you decide to use LSF as the LRMS, you need to install two external libraries:

  • DRMAA library for LSF (you can get it from  here)
  • ARES (Advance REServation) library for LSF (you can get it from  here)

Note: Remember where the libraries were installed. You will need this information during service configuration.

If you use SGE:

  • DRMAA library is provided with standard SGE installation.
  • Instead of the ARES library, a Python module (installed by default with QCG-Computing) is used. Thus, to make Advance Reservation capabilities accessible via QCG-Computing, you will need a Python interpreter (version 2.4 or later) together with a Python library for XML processing (lxml or elementtree).

If you use Torque with the Maui scheduler:

  • The DRMAA library can be obtained from the  PBS DRMAA sourceforge project. After installation, as described in the library README file, you need to do one of the following:
    • configure Torque to keep information about completed jobs (e.g. by setting: qmgr -c 'set server keep_completed = 60'), or
    • configure the DRMAA library to use Torque logs. Sample configuration file of the DRMAA library (PREFIX/etc/pbs_drmaa.conf):
      # pbs_drmaa.conf - Sample pbs_drmaa configuration file.
      wait_thread: 1,
      pbs_home: "/var/spool/pbs",
      pool_delay: 5,
      cache_job_state: 1,
      

  • Instead of the ARES library, a Python module (installed by default with QCG-Computing) is used. Thus, to make Advance Reservation capabilities accessible via QCG-Computing, you will need a Python interpreter (version 2.4 or later) together with a Python library for XML processing (lxml or elementtree).

Installation

QCG-Core library

QCG-Core is a utility and interoperability layer for the basic QCG services (QCG-Computing and QCG-Notification in particular). As GSI (Grid Security Infrastructure) is used to secure the transport layer, you must compile the QCG-Core library with GSI support. After a successful GridFTP installation (we recommend using version 4.x or 5.x of the Globus Toolkit), all you need to do is check that the GLOBUS_LOCATION environment variable is set correctly (and that the $GLOBUS_LOCATION/etc/globus-user-env.sh file is sourced). You must also choose the Globus flavor that you wish to compile with (e.g. gcc64dbg for a 64-bit machine, or gcc32dbg if you are installing QCG-Core on a 32-bit machine):

. $GLOBUS_LOCATION/etc/globus-user-env.sh 
tar xzf qcg-core-2.1.1.tar.gz 
cd qcg-core-2.1.1
./configure  --with-globus-flavor=gcc64dbg
make
sudo make install

Note: You may need to pass additional options to the ./configure script if some of the dependencies are installed in non-standard locations (type ./configure --help for more information).

QCG-Computing service

After downloading the source package, QCG-Computing can be installed in a few steps, e.g.:

tar xzf qcg-comp-2.4.*.tar.gz 
cd qcg-comp-2.4.*
./configure  --with-qcg-core=/opt/qcg/
make
sudo make install

Configuration

Before you start, create a new system user (e.g. qcg-comp) that will be used by the service (and only by the QCG-Computing service) to run in unprivileged mode:

useradd -d  /opt/qcg/var/log/qcg-comp/ -M  qcg-comp

Database setup

You need to create a database for QCG-Computing and appropriate tables using the provided qcg-comp-psql.sql script. It can be found in PREFIX/share/qcg-comp/db/. Currently only PostgreSQL is supported.

  • Create a new database user (e.g. qcg-comp) authenticated via password (you will later provide this password in the QCG-Computing configuration file). This user must be allowed to create new databases.
    su - postgres
    createuser -P qcg-comp 
    exit
    
  • Create a new database (e.g. qcg-comp) owned by the qcg-comp user:
    su - qcg-comp
    createdb -U qcg-comp qcg-comp
    exit
    
  • Depending on the local PostgreSQL configuration you may need to edit the pg_hba.conf file (host based authentication configuration file) and add the following lines as the two top rules:
    local   qcg-comp   qcg-comp                         md5
    host    qcg-comp   qcg-comp   127.0.0.1/32          md5
    
    in order to enable password authentication for the user qcg-comp.

Note: If you plan to use a database server located on another machine, remember to provide the IP address of the QCG-Computing machine instead of 127.0.0.1.

Note: You must reload the PostgreSQL server in order to make the changes visible, e.g.:

/etc/init.d/postgresql reload
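  • Create the database schema using the provided script (qcg-comp-psql.sql). When the script is run against a fresh database, the ERROR lines at the top of the output below are harmless; they appear to come from the script dropping tables that do not exist yet: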
psql -f /opt/qcg/share/qcg-comp/db/qcg-comp-psql.sql -U qcg-comp
 psql:/opt/qcg/share/qcg-comp/db/qcg-comp-psql.sql:3: ERROR:  table "states" does not exist
 psql:/opt/qcg/share/qcg-comp/db/qcg-comp-psql.sql:4: ERROR:  table "jobs" does not exist
 psql:/opt/qcg/share/qcg-comp/db/qcg-comp-psql.sql:5: ERROR:  table "jobs_acc" does not exist
 psql:/opt/qcg/share/qcg-comp/db/qcg-comp-psql.sql:6: ERROR:  table "reservations" does not exist
 psql:/opt/qcg/share/qcg-comp/db/qcg-comp-psql.sql:7: ERROR:  table "reservations_acc" does not exist
 psql:/opt/qcg/share/qcg-comp/db/qcg-comp-psql.sql:12: NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "states_pkey" for table "states"
 CREATE TABLE
 INSERT 0 1
 INSERT 0 1
 INSERT 0 1
 INSERT 0 1
 INSERT 0 1
 INSERT 0 1
 INSERT 0 1
 INSERT 0 1
 INSERT 0 1
 psql:/opt/qcg/share/qcg-comp/db/qcg-comp-psql.sql:53: NOTICE:  CREATE TABLE will create implicit sequence "jobs_serial_seq" for serial column "jobs.serial"
 psql:/opt/qcg/share/qcg-comp/db/qcg-comp-psql.sql:53: NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "jobs_pkey" for table "jobs"
 psql:/opt/qcg/share/qcg-comp/db/qcg-comp-psql.sql:53: NOTICE:  CREATE TABLE / UNIQUE will create implicit index "jobs_drms_id_key" for table "jobs"
 CREATE TABLE
 CREATE INDEX
 CREATE INDEX
 CREATE INDEX
 psql:/opt/qcg/share/qcg-comp/db/qcg-comp-psql.sql:88: NOTICE:  CREATE TABLE will create implicit sequence "jobs_acc_serial_seq" for serial column "jobs_acc.serial"
 psql:/opt/qcg/share/qcg-comp/db/qcg-comp-psql.sql:88: NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "jobs_acc_pkey" for table "jobs_acc"
 CREATE TABLE
 CREATE INDEX
 CREATE INDEX
 CREATE INDEX
 psql:/opt/qcg/share/qcg-comp/db/qcg-comp-psql.sql:109: NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "reservations_pkey" for table "reservations"
 CREATE TABLE
 CREATE INDEX
 psql:/opt/qcg/share/qcg-comp/db/qcg-comp-psql.sql:128: NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "reservations_acc_pkey" for table "reservations_acc"
 CREATE TABLE
 CREATE INDEX
  • You need to create an ODBC Data Source Name (DSN). You can do this by editing the system-wide configuration file (i.e. /etc/odbc.ini) and adding a new section. In this example we assume that the DSN is qcg-comp, the real database name is also qcg-comp, and the appropriate drivers for the PostgreSQL database (i.e. the libodbcpsql.so and libodbcpsqlS.so files) are located in /usr/local/lib:
[qcg-comp]
Description         = QCG-Computing database
Driver              = /usr/local/lib/libodbcpsql.so
Setup               = /usr/local/lib/libodbcpsqlS.so
Database            = qcg-comp
Servername          = localhost
Port                = 5432
ReadOnly            = No

Note: Verify that the paths given in the Driver and Setup entries are valid.

Note: If you plan to use a database server located on another machine, remember to provide the IP address of the PostgreSQL server machine instead of localhost.

Note: The PostgreSQL ODBC driver on many systems is distributed as a separate package (postgresql-odbc or similar). Do not use the PostgreSQL driver shipped with the unixODBC source tarball, as it is known to be  obsolete.

  • It is recommended to use a command line client to access the database through ODBC to check if everything was configured correctly. For example, unixODBC provides an isql client:
isql -v qcg-comp qcg-comp <password>
+---------------------------------------+
| Connected!                            |
|                                       |
| sql-statement                         |
| help [tablename]                      |
| quit                                  |
|                                       |
+---------------------------------------+
SQL>quit
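
If isql fails to connect, a common cause is a Driver or Setup path in the odbc.ini section that does not exist on the machine. The Python sketch below reads the contents of an odbc.ini file and reports such problems for one DSN. It assumes the file is plain INI syntax as in the example above; the function name check_dsn is illustrative and is not part of unixODBC or QCG-Computing.

```python
import configparser
import os


def check_dsn(ini_text, dsn="qcg-comp"):
    """Return a list of problems found in the given DSN section (empty if none)."""
    # Interpolation is disabled so that a literal '%' in a value is not an error.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)

    if not parser.has_section(dsn):
        return ["no [%s] section found" % dsn]

    problems = []
    # Option names are matched case-insensitively by configparser.
    for key in ("Driver", "Setup"):
        path = parser.get(dsn, key, fallback=None)
        if path is None:
            problems.append("%s entry is missing" % key)
        elif not os.path.isfile(path):
            problems.append("%s points to a missing file: %s" % (key, path))
    return problems


if __name__ == "__main__":
    with open("/etc/odbc.ini") as handle:
        for problem in check_dsn(handle.read()):
            print(problem)
```

An empty result only means that the section exists and both files are present; it does not prove that the driver can actually connect, so the isql test above is still worth running.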

Service configuration

You can find the complete reference guide in the QCG-Computing manual. For convenience, sample service configuration files are provided below (the service configuration is located in PREFIX/etc/qcg-compd.xml; do not confuse this file with the client configuration file, which is named qcg-comp.xml). You can use them as templates:

Template configuration for the QCG-Computing service deployed on top of LSF

<?xml version="1.0" encoding="UTF-8"?>
<sm:QCGCore
       xmlns:sm="http://schemas.qoscosgrid.org/core/2011/04/config"
       xmlns="http://schemas.qoscosgrid.org/comp/2011/04/config"
       xmlns:smc="http://schemas.qoscosgrid.org/comp/2011/04/config"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 
       <Configuration>
               <sm:ModuleManager>
                       <sm:Directory>/opt/qcg/lib/qcg-core/modules/</sm:Directory>
                       <sm:Directory>/opt/qcg/lib/qcg-comp/modules/</sm:Directory>
               </sm:ModuleManager>
 
               <sm:Service xsi:type="qcg-compd" description="QCG-Computing">
                       <sm:Logger>
                               <sm:Filename>/opt/qcg/var/log/qcg-comp/qcg-comp.log</sm:Filename>
                               <sm:Level>INFO</sm:Level>
                       </sm:Logger>
 
                       <sm:Transport>
                               <sm:Module xsi:type="sm:ecm_gsoap.service">
                                       <sm:Host>frontend.example.com</sm:Host>
                                       <sm:Port>19000</sm:Port>
                                       <sm:KeepAlive>false</sm:KeepAlive>
                                       <sm:Authentication>
                                               <sm:Module xsi:type="sm:atc_transport_gsi.service">
                                                       <sm:X509CertFile>/opt/qcg/certs/qcgcert.pem</sm:X509CertFile>
                                                       <sm:X509KeyFile>/opt/qcg/certs/qcgkey.pem</sm:X509KeyFile>
                                               </sm:Module>
                                       </sm:Authentication>
                                       <sm:Authorization>
                                               <sm:Module xsi:type="sm:atz_mapfile">
                                                       <sm:Mapfile>/etc/grid-security/grid-mapfile</sm:Mapfile>
                                               </sm:Module>
                                       </sm:Authorization>
                               </sm:Module>
                               <sm:Module xsi:type="smc:qcg-comp-service"/>
                       </sm:Transport>
 
                       <sm:Module xsi:type="submission_drmaa" path="/opt/LSF/lsf7.0.3/7.0/linux2.6-glibc2.3-x86/lib/libdrmaa.so"/>
                       <sm:Module xsi:type="reservation_ares" path="/opt/qcg/lib/libares.so"/>
 
                       <sm:Module xsi:type="lsf_jsdl_filter"/>
                       <sm:Module xsi:type="smc:notification_wsn">
                               <sm:Module xsi:type="sm:ecm_gsoap.client" >
                                     <sm:ServiceURL>http://frontend.example.com:19005/</sm:ServiceURL>
                                     <sm:Authentication>
                                           <sm:Module xsi:type="sm:atc_transport_http.client"/>
                                     </sm:Authentication>
                                           <sm:Module xsi:type="sm:ntf_client"/>
                               </sm:Module>
                       </sm:Module>
    
                       <sm:Module xsi:type="atz_ardl_filter"/>

                       <sm:Module xsi:type="application_mapper">
                               <ApplicationMapFile>/opt/qcg/etc/application_mapfile</ApplicationMapFile>
                       </sm:Module>

                       <Database>
                               <DSN>qcg-comp</DSN>
                               <User>qcg-comp</User>
                               <Password>qcg-comp</Password>
                       </Database>
 
                       <UnprivilegedUser>qcg-comp</UnprivilegedUser>
                       <FactoryAttributes>
                               <CommonName>IT cluster</CommonName>
                               <LongDescription>IT department cluster for public use</LongDescription>
                               <LocalResourceManagerType>http://example.com/LSF</LocalResourceManagerType>
                               <ContainedResources xmlns:bes-factory="http://schemas.ggf.org/bes/2006/08/bes-factory" xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl">
                                       <ContainedResource>
                                               <bes-factory:ResourceName>worker.example.com</bes-factory:ResourceName>
                                               <bes-factory:CPUArchitecture>
                                                       <jsdl:CPUArchitectureName>x86_32</jsdl:CPUArchitectureName>
                                               </bes-factory:CPUArchitecture>
                                               <bes-factory:CPUCount>4</bes-factory:CPUCount>
                                               <bes-factory:PhysicalMemory>1073741824</bes-factory:PhysicalMemory>
                                       </ContainedResource>
                               </ContainedResources>
                       </FactoryAttributes>
               </sm:Service>
       </Configuration>
</sm:QCGCore>

Template configuration for the QCG-Computing service deployed on top of Grid Engine

<?xml version="1.0" encoding="UTF-8"?>
<sm:QCGCore
       xmlns:sm="http://schemas.qoscosgrid.org/core/2011/04/config"
       xmlns="http://schemas.qoscosgrid.org/comp/2011/04/config"
       xmlns:smc="http://schemas.qoscosgrid.org/comp/2011/04/config"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 
       <Configuration>
               <sm:ModuleManager>
                       <sm:Directory>/opt/qcg/lib/qcg-core/modules/</sm:Directory>
                       <sm:Directory>/opt/qcg/lib/qcg-comp/modules/</sm:Directory>
               </sm:ModuleManager>
 
               <sm:Service xsi:type="qcg-compd" description="QCG-Computing">
                       <sm:Logger>
                               <sm:Filename>/opt/qcg/var/log/qcg-comp/qcg-comp.log</sm:Filename>
                               <sm:Level>INFO</sm:Level>
                       </sm:Logger>
 
                       <sm:Transport>
                               <sm:Module xsi:type="sm:ecm_gsoap.service">
                                       <sm:Host>frontend.example.com</sm:Host>
                                       <sm:Port>19000</sm:Port>
                                       <sm:KeepAlive>false</sm:KeepAlive>
                                       <sm:Authentication>
                                               <sm:Module xsi:type="sm:atc_transport_gsi.service">
                                                       <sm:X509CertFile>/opt/qcg/certs/qcgcert.pem</sm:X509CertFile>
                                                       <sm:X509KeyFile>/opt/qcg/certs/qcgkey.pem</sm:X509KeyFile>
                                               </sm:Module>
                                       </sm:Authentication>
                                       <sm:Authorization>
                                               <sm:Module xsi:type="sm:atz_mapfile">
                                                       <sm:Mapfile>/etc/grid-security/grid-mapfile</sm:Mapfile>
                                               </sm:Module>
                                       </sm:Authorization>
                               </sm:Module>
                               <sm:Module xsi:type="smc:qcg-comp-service"/>
                       </sm:Transport>
 
                       <sm:Module xsi:type="submission_drmaa" path="/opt/sge/lib/lx24-x86/libdrmaa.so"/>
                       <sm:Module xsi:type="reservation_python" path="/opt/qcg/lib/qcg-comp/modules/python/reservation_sge.py"/>
                        
                       <sm:Module xsi:type="sge_jsdl_filter"/>
                       <sm:Module xsi:type="smc:notification_wsn">
                               <sm:Module xsi:type="sm:ecm_gsoap.client" >
                                     <sm:ServiceURL>http://frontend.example.com:19005/</sm:ServiceURL>
                                     <sm:Authentication>
                                           <sm:Module xsi:type="sm:atc_transport_http.client"/>
                                     </sm:Authentication>
                                           <sm:Module xsi:type="sm:ntf_client"/>
                               </sm:Module>
                       </sm:Module>
    
                       <sm:Module xsi:type="atz_ardl_filter"/>

                       <sm:Module xsi:type="application_mapper">
                               <ApplicationMapFile>/opt/qcg/etc/application_mapfile</ApplicationMapFile>
                       </sm:Module>

                       <Database>
                               <DSN>qcg-comp</DSN>
                               <User>qcg-comp</User>
                               <Password>qcg-comp</Password>
                       </Database>
 
                       <UnprivilegedUser>qcg-comp</UnprivilegedUser>
                       <FactoryAttributes>
                               <CommonName>IT cluster</CommonName>
                               <LongDescription>IT department cluster for public use</LongDescription>
                               <LocalResourceManagerType>http://example.com/SunGridEngine</LocalResourceManagerType>
                               <ContainedResources xmlns:bes-factory="http://schemas.ggf.org/bes/2006/08/bes-factory" xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl">
                                       <ContainedResource>
                                               <bes-factory:ResourceName>worker.example.com</bes-factory:ResourceName>
                                               <bes-factory:CPUArchitecture>
                                                       <jsdl:CPUArchitectureName>x86_32</jsdl:CPUArchitectureName>
                                               </bes-factory:CPUArchitecture>
                                               <bes-factory:CPUCount>4</bes-factory:CPUCount>
                                               <bes-factory:PhysicalMemory>1073741824</bes-factory:PhysicalMemory>
                                       </ContainedResource>
                               </ContainedResources>
                       </FactoryAttributes>
               </sm:Service>
       </Configuration>
</sm:QCGCore>

Template configuration for the QCG-Computing service deployed on top of Torque (using the Maui scheduler)

<?xml version="1.0" encoding="UTF-8"?>
<sm:QCGCore
       xmlns:sm="http://schemas.qoscosgrid.org/core/2011/04/config"
       xmlns="http://schemas.qoscosgrid.org/comp/2011/04/config"
       xmlns:smc="http://schemas.qoscosgrid.org/comp/2011/04/config"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 
       <Configuration>
               <sm:ModuleManager>
                       <sm:Directory>/opt/qcg/lib/qcg-core/modules/</sm:Directory>
                       <sm:Directory>/opt/qcg/lib/qcg-comp/modules/</sm:Directory>
               </sm:ModuleManager>
 
               <sm:Service xsi:type="qcg-compd" description="QCG-Computing">
                       <sm:Logger>
                               <sm:Filename>/opt/qcg/var/log/qcg-comp/qcg-comp.log</sm:Filename>
                               <sm:Level>INFO</sm:Level>
                       </sm:Logger>
 
                       <sm:Transport>
                               <sm:Module xsi:type="sm:ecm_gsoap.service">
                                       <sm:Host>frontend.example.com</sm:Host>
                                       <sm:Port>19000</sm:Port>
                                       <sm:KeepAlive>false</sm:KeepAlive>
                                       <sm:Authentication>
                                               <sm:Module xsi:type="sm:atc_transport_gsi.service">
                                                       <sm:X509CertFile>/opt/qcg/etc/certs/qcgcert.pem</sm:X509CertFile>
                                                       <sm:X509KeyFile>/opt/qcg/etc/certs/qcgkey.pem</sm:X509KeyFile>
                                               </sm:Module>
                                       </sm:Authentication>
                                       <sm:Authorization>
                                               <sm:Module xsi:type="sm:atz_mapfile">
                                                       <sm:Mapfile>/etc/grid-security/grid-mapfile</sm:Mapfile>
                                               </sm:Module>
                                       </sm:Authorization>
                               </sm:Module>
                               <sm:Module xsi:type="smc:qcg-comp-service"/>
                       </sm:Transport>
  
                       <sm:Module xsi:type="submission_drmaa" path="/opt/qcg/lib/libdrmaa.so"/>
                       <sm:Module xsi:type="reservation_python" path="/opt/qcg/lib/qcg-comp/modules/python/reservation_maui.py"/>
                        
                       <sm:Module xsi:type="pbs_jsdl_filter"/>
                       <sm:Module xsi:type="atz_ardl_filter"/>
 
                       <sm:Module xsi:type="application_mapper">
                               <ApplicationMapFile>/opt/qcg/etc/application_mapfile</ApplicationMapFile>
                       </sm:Module>
 
                       <sm:Module xsi:type="smc:notification_wsn">
                               <sm:Module xsi:type="sm:ecm_gsoap.client" >
                                     <sm:ServiceURL>http://frontend.example.com:19005/</sm:ServiceURL>
                                     <sm:Authentication>
                                           <sm:Module xsi:type="sm:atc_transport_http.client"/>
                                     </sm:Authentication>
                                           <sm:Module xsi:type="sm:ntf_client"/>
                                </sm:Module>
                       </sm:Module>
 
                       <sm:Module xsi:type="sm:general_python" path="/opt/qcg/lib/qcg-comp/modules/python/monitoring.py"/>
  
                       <Database>
                               <DSN>qcg-comp</DSN>
                               <User>qcg-comp</User>
                               <Password>qcg-comp</Password>
                       </Database>
  
                       <UnprivilegedUser>qcg-comp</UnprivilegedUser>
                       <FactoryAttributes>
                               <CommonName>frontend.example.com</CommonName>
                               <LongDescription>IT department cluster for public use</LongDescription>
                       </FactoryAttributes>
               </sm:Service>
       </Configuration>
</sm:QCGCore>

In most cases it should be enough to change only the following elements:

Database
the connection data for the database created in the previous steps
Transport/Module/Host
the hostname of the machine where the service is deployed
Transport/Module/Authentication/Module/X509CertFile and Transport/Module/Authentication/Module/X509KeyFile
the service X.509 certificate and private key (consult the  Globus User Guide on how to generate a service certificate request, or use the host certificate/key pair). Make sure that the key and certificate are owned by the qcg-comp user and that the private key is not password protected (generating the certificate with the -service option implies this).
Module[type="smc:notification_wsn"]/Module/ServiceURL
the URL of the QCG-Notification service (you can set it later, i.e. after  installing the QCG-Notification service)
FactoryAttributes/ContainedResources
list of all the worker nodes managed by the LRMS. You should provide at least the hostname (<bes-factory:ResourceName>) and the CPU count (<bes-factory:CPUCount>) for every worker node in your cluster. If you are using the Torque batch system, this information can be provided automatically by enabling the python/monitoring.py module.
Module[type="submission_drmaa"]/@path
path to the DRMAA library (libdrmaa.so; search for this file within the LRMS lib directory, e.g. /opt/sge/lib/ or /opt/LSF/lsf7.0.3/7.0/linux2.6-glibc2.3-x86_64/lib/)
Module[type="reservation_ares"]/@path
path to the ARES library (LSF only)
Module[type="reservation_python"]/@path
path to the Python reservation module (SGE and Torque only) - only if you installed QCG-Computing in location other than /opt/qcg.
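
To review these settings after editing, it can help to print them from the file rather than reading the whole XML. The following Python sketch extracts a few of the elements listed above using the namespaces from the templates. The function names are illustrative, and the database password is deliberately not returned.

```python
import xml.etree.ElementTree as ET

# Namespaces as declared in the configuration templates above.
NS = {
    "sm": "http://schemas.qoscosgrid.org/core/2011/04/config",
    "smc": "http://schemas.qoscosgrid.org/comp/2011/04/config",
}
XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"


def _text(root, path):
    """Return the stripped text of the first element matching path, or None."""
    node = root.find(path, NS)
    if node is None or node.text is None:
        return None
    return node.text.strip()


def _module_path(root, module_type):
    """Return the path attribute of the first sm:Module with the given xsi:type."""
    for module in root.iter("{%s}Module" % NS["sm"]):
        if module.get(XSI_TYPE) == module_type:
            return module.get("path")
    return None


def summarize_config(root):
    """Collect the settings that usually need changing into a dictionary."""
    return {
        "host": _text(root, ".//sm:Transport/sm:Module/sm:Host"),
        "port": _text(root, ".//sm:Transport/sm:Module/sm:Port"),
        "cert_file": _text(root, ".//sm:X509CertFile"),
        "key_file": _text(root, ".//sm:X509KeyFile"),
        "notification_url": _text(root, ".//sm:ServiceURL"),
        "dsn": _text(root, ".//smc:Database/smc:DSN"),
        "drmaa_path": _module_path(root, "submission_drmaa"),
    }


if __name__ == "__main__":
    tree = ET.parse("/opt/qcg/etc/qcg-compd.xml")
    for key, value in sorted(summarize_config(tree.getroot()).items()):
        print("%s: %s" % (key, value))
```

A value of None means the element was not found, which is expected for settings that a given template does not use.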

Finally, make this configuration file readable only by the qcg-comp user (in order to protect the database password):

chown qcg-comp:qcg-comp /opt/qcg/etc/qcg-compd.xml
chmod 0600 /opt/qcg/etc/qcg-compd.xml
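
The effect of the chmod command can be verified at any later time, for example after restoring the file from a backup. The Python sketch below checks that no group or other permission bits are set. The function names are illustrative, and the default path assumes the /opt/qcg prefix used throughout this guide.

```python
import os
import stat


def mode_is_owner_only(mode):
    """Return True if the mode grants nothing to group or others (e.g. 0600)."""
    permission_bits = stat.S_IMODE(mode)
    return permission_bits & (stat.S_IRWXG | stat.S_IRWXO) == 0


def config_is_protected(path="/opt/qcg/etc/qcg-compd.xml"):
    """Check the permission bits of the service configuration file."""
    return mode_is_owner_only(os.stat(path).st_mode)


if __name__ == "__main__":
    if config_is_protected():
        print("configuration file is readable by its owner only")
    else:
        print("configuration file is accessible to group or others")
```

This check covers only the permission bits; ownership by the qcg-comp user still has to be confirmed separately.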

Note on the security model

QCG-Computing can be configured with various authentication and authorization modules. However, in a typical deployment we assume that QCG-Computing is configured as in the examples above, i.e.:

  • authentication is based on the httpg protocol,
  • authorization is based on the local grid-mapfile (see Users configuration).

Starting the service

As root, type:

/opt/qcg/sbin/qcg-compd

or, if you want the service to stay in the foreground (not daemonize) and print all logs to the console, type:

/opt/qcg/sbin/qcg-compd -d

Otherwise the logs are written to the file set in the Logger/Filename element of the service configuration (/opt/qcg/var/log/qcg-comp/qcg-comp.log in the templates above).

Note: Before starting the qcg-compd service, make sure that all environment variables needed to contact the LRMS are set (e.g. SGE_ROOT for SGE, or the whole profile.lsf file sourced in the case of LSF).

Stopping the service

The service is stopped by sending it the SIGTERM signal, e.g.:

pkill qcg-compd

Verifying the installation

  • For convenience you can add /opt/qcg/bin to your PATH variable.
  • Edit the QCG-Computing client configuration file (PREFIX/etc/qcg-comp.xml):
    • set the host and port in the ServiceURL to reflect the settings in the service configuration file (qcg-compd.xml),
    • change the authentication module type from sm:atc_transport_http.client to sm:atc_transport_gsi.client.
<?xml version="1.0" encoding="UTF-8"?>
<sm:QCGCore
        xmlns:sm="http://schemas.qoscosgrid.org/core/2011/04/config"
        xmlns="http://schemas.qoscosgrid.org/comp/2011/04/config"
        xmlns:smc="http://schemas.qoscosgrid.org/comp/2011/04/config"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
        
        <Configuration>
                <sm:ModuleManager>
                        <sm:Directory>/opt/qcg/lib/qcg-core/modules/</sm:Directory>
                        <sm:Directory>/opt/qcg/lib/qcg-comp/modules/</sm:Directory>
                </sm:ModuleManager>

                <sm:Client xsi:type="qcg-comp" description="QCG-Computing client">
                        <sm:Transport>
                                <sm:Module xsi:type="sm:ecm_gsoap.client">
                                       <sm:ServiceURL>httpg://frontend.example.com:19000/</sm:ServiceURL>
                                       <sm:Authentication>
                                               <sm:Module xsi:type="sm:atc_transport_gsi.client"/>
                                       </sm:Authentication>
                                       <sm:Module xsi:type="smc:qcg-comp-client"/>
                               </sm:Module>
                       </sm:Transport>
               </sm:Client>
       </Configuration>
</sm:QCGCore>
  • Initialize your credentials:
    grid-proxy-init 
    Your identity: /C=PL/O=GRID/O=PSNC/CN=Mariusz Mamonski
    Enter GRID pass phrase for this identity:
    Creating proxy ............................ Done
    Your proxy is valid until: Fri Jun 10 06:23:32 2011
    
  • Query the QCG-Computing service:
    qcg-comp -G | xmllint --format - # the xmllint is used only to present the result in more pleasant way
      
     <bes-factory:FactoryResourceAttributesDocument xmlns:bes-factory="http://schemas.ggf.org/bes/2006/08/bes-factory">
       <bes-factory:IsAcceptingNewActivities>true</bes-factory:IsAcceptingNewActivities>
       <bes-factory:CommonName>IT cluster</bes-factory:CommonName>
       <bes-factory:LongDescription>IT department cluster for public use</bes-factory:LongDescription>
       <bes-factory:TotalNumberOfActivities>0</bes-factory:TotalNumberOfActivities>
       <bes-factory:TotalNumberOfContainedResources>1</bes-factory:TotalNumberOfContainedResources>
       <bes-factory:ContainedResource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="bes-factory:BasicResourceAttributesDocumentType">
           <bes-factory:ResourceName>worker.example.com</bes-factory:ResourceName>
           <bes-factory:CPUArchitecture>
               <jsdl:CPUArchitectureName xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl">x86_32</jsdl:CPUArchitectureName>
           </bes-factory:CPUArchitecture>
           <bes-factory:CPUCount>4</bes-factory:CPUCount>
           <bes-factory:PhysicalMemory>1073741824</bes-factory:PhysicalMemory>
       </bes-factory:ContainedResource>
       <bes-factory:NamingProfile>http://schemas.ggf.org/bes/2006/08/bes/naming/BasicWSAddressing</bes-factory:NamingProfile>
       <bes-factory:BESExtension>http://schemas.ogf.org/hpcp/2007/01/bp/BasicFilter</bes-factory:BESExtension>
       <bes-factory:BESExtension>http://schemas.qoscosgrid.org/comp/2009/01</bes-factory:BESExtension>
       <bes-factory:LocalResourceManagerType>http://example.com/SunGridEngine</bes-factory:LocalResourceManagerType>
       <smcf:NotificationProviderURL xmlns:smcf="http://schemas.qoscosgrid.org/comp/2011/04/factory">http://localhost:2211/</smcf:NotificationProviderURL>
     </bes-factory:FactoryResourceAttributesDocument>
    
  • Submit a sample job:
    qcg-comp -c -J /opt/qcg/share/qcg-comp/doc/examples/jsdl/sleep.xml
    Activity Id: ccb6b04a-887b-4027-633f-412375559d73
    
  • Query its status:
    qcg-comp -s -a ccb6b04a-887b-4027-633f-412375559d73
    status = Executing
    qcg-comp -s -a ccb6b04a-887b-4027-633f-412375559d73
    status = Executing
    qcg-comp -s -a ccb6b04a-887b-4027-633f-412375559d73
    status = Finished
    exit status = 0
    
  • Create an advance reservation:
    • Copy the provided sample reservation description file (expressed in ARDL, the Advance Reservation Description Language):
      cp /opt/qcg/share/qcg-comp/doc/examples/ardl/oneslot.xml oneslot.xml
      
    • Edit the oneslot.xml and modify the StartTime and EndTime to dates that are in the near future,
    • Create a new reservation:
      qcg-comp -c -D oneslot.xml
      Reservation Id: aab6b04a-887b-4027-633f-412375559d7d
      
    • List all reservations:
      qcg-comp -l
      Reservation Id: aab6b04a-887b-4027-633f-412375559d7d
      Total number of reservations: 1
      
    • Check which hosts were reserved:
      qcg-comp -s -r aab6b04a-887b-4027-633f-412375559d7d
      Reserved hosts:
      worker.example.com[used=0,reserved=1,total=4]
      
    • Finally, delete the reservation:
      qcg-comp -t -r aab6b04a-887b-4027-633f-412375559d7d
      Reservation terminated.
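
The verification steps above can also be scripted. The Python sketch below parses the qcg-comp output shown in the transcripts (the "status = ..." line and the "Reserved hosts:" entries) and wraps the status query in a helper. The output format is assumed from the examples above rather than from a formal specification, and all function names are illustrative.

```python
import re
import subprocess

# Matches a line such as "status = Finished" but not "exit status = 0".
STATUS_RE = re.compile(r"^\s*status\s*=\s*(\S+)\s*$", re.MULTILINE)

# Matches a line such as "worker.example.com[used=0,reserved=1,total=4]".
HOST_RE = re.compile(
    r"^\s*(?P<host>[^\s\[]+)"
    r"\[used=(?P<used>\d+),reserved=(?P<reserved>\d+),total=(?P<total>\d+)\]\s*$",
    re.MULTILINE,
)


def parse_status(output):
    """Return the activity status found in the output, or None."""
    match = STATUS_RE.search(output)
    return match.group(1) if match else None


def parse_reserved_hosts(output):
    """Map each reserved host name to its used/reserved/total slot counts."""
    hosts = {}
    for match in HOST_RE.finditer(output):
        hosts[match.group("host")] = {
            key: int(match.group(key)) for key in ("used", "reserved", "total")
        }
    return hosts


def query_status(activity_id):
    """Run 'qcg-comp -s -a <id>' (as used above) and return the parsed status."""
    result = subprocess.run(
        ["qcg-comp", "-s", "-a", activity_id],
        capture_output=True,
        text=True,
    )
    return parse_status(result.stdout)


if __name__ == "__main__":
    sample = "Reserved hosts:\nworker.example.com[used=0,reserved=1,total=4]\n"
    print(parse_reserved_hosts(sample))
```

The parsing functions work on plain strings, so they can be exercised with saved output before they are pointed at a live service.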