Version 112 (modified by mmamonski, 12 years ago)
- Introduction
- Prerequisites
- Firewall configuration
- Installation using provided RPMS
- The Grid Mapfile
- Scheduler configuration
- Service certificates
- DRMAA library
- Service configuration
- Restricting advance reservation
- Configuring BAT Updater
- Creating applications' script space
- Note on the security model
- Starting the service
- Stopping the service
- Verifying the installation
- Maintenance
- PL-Grid Grants Support
- GOCDB
Introduction
The QCG-Computing service (the successor of the OpenDSP project) is an open-source computing provider that exposes on-demand access to computing resources and jobs over an HPC Basic Profile compliant Web Services interface. In addition, QCG-Computing offers a remote interface for managing advance reservations.
This document describes the installation of the QCG-Computing service in the PL-Grid environment. The service should be deployed on a machine (or virtual machine) that:
- has at least 1GB of memory (recommended value: 2 GB)
- has 10 GB of free disk space (most of the space will be used by the log files)
- has any modern CPU (if you plan to use a virtual machine, you should dedicate one or two cores of the host machine to it)
- is running Scientific Linux 5.5 (in most cases the provided RPMs should work with any operating system based on Red Hat Enterprise Linux 5.x, e.g. CentOS 5)
Prerequisites
We assume that you have the Torque local resource manager and the Maui scheduler already installed. Typically this is the frontend machine (i.e. the machine where the pbs_server and maui daemons are running). If you want to install the QCG-Computing service on a separate submit host, you should read these notes.
Since version 2.4 the QCG-Computing service discovers installed applications using the Environment Modules package. For this reason you should install modules on the QCG host and mount the directories that contain all module files used on your cluster.
Firewall configuration
In order to expose the QosCosGrid services externally you need to open the following ports in the firewall:
- 19000 (TCP) - QCG-Computing
- 19001 (TCP) - QCG-Notification
- 2811 (TCP) - GridFTP server
- 9000-9500 (TCP) - GridFTP port-range (if you want to use different port-range adjust the GLOBUS_TCP_PORT_RANGE variable in the /etc/xinetd.d/gsiftp file)
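The port list above could be turned into firewall rules along the following lines. This is only a sketch: it assumes an iptables-based firewall and an INPUT chain that accepts appended rules, and the function name qcg_firewall_rules is made up for illustration. The function only prints the commands so that they can be reviewed before being applied.

```shell
#!/bin/sh
# Hypothetical helper: print (do not apply) iptables rules for the QCG ports.
# Assumption: the site uses iptables; adapt the chain and rule position locally.
qcg_firewall_rules() {
    # 19000 QCG-Computing, 19001 QCG-Notification, 2811 GridFTP,
    # 9000:9500 GridFTP data port range (must match GLOBUS_TCP_PORT_RANGE)
    for port in 19000 19001 2811 9000:9500; do
        echo "iptables -A INPUT -p tcp --dport $port -j ACCEPT"
    done
}

# Example: review the output, then apply it as root
# qcg_firewall_rules
# qcg_firewall_rules | sh
```

If the INPUT chain already ends with a REJECT or DROP rule, the rules would have to be inserted before it rather than appended.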
- Install the database backend (PostgreSQL):
yum install postgresql postgresql-server
- Install UnixODBC and the PostgreSQL ODBC driver:
yum install unixODBC postgresql-odbc
- Install the Torque devel package and the rpmbuild package (needed to build DRMAA):
rpm -i torque-devel-your-version.rpm
yum install rpm-build
The X.509 host certificate (signed by the Polish Grid CA) and key should already be installed in the following locations:
- /etc/grid-security/hostcert.pem
- /etc/grid-security/hostkey.pem
Most grid services and security infrastructures are sensitive to clock skew. We therefore recommend installing a Network Time Protocol daemon or using any other solution that provides accurate clock synchronization.
Installation using provided RPMS
- Create the following users:
- qcg-comp - needed by the QCG-Computing service
- qcg-broker - the user that the QCG-Broker service would be mapped to
useradd -r -d /opt/plgrid/var/log/qcg-comp/ qcg-comp
useradd -r -d /opt/plgrid/var/log/qcg-broker/ qcg-broker
- and the following group:
- qcg-dev - this group is allowed to read the configuration and log files. Please add the qcg services' developers to this group.
groupadd -r qcg-dev
- install QCG (testing) and PL-Grid (official) repositories:
- Official PL-Grid repository
rpm -Uvh http://software.plgrid.pl/packages/repos/plgrid-repos-2010-2.noarch.rpm
- QosCosGrid testing repository (latest version, including new features and latest bug fixes, but may be unstable)
cat > /etc/yum.repos.d/qcg.repo << EOF
[qcg]
name=QosCosGrid YUM repository
baseurl=http://fury.man.poznan.pl/qcg-packages/sl/x86_64/
enabled=1
gpgcheck=0
EOF
- install QCG-Computing using YUM Package Manager:
yum install qcg-comp
yum install qcg-comp-client
- install grid-ftp server using YUM Package Manager:
yum install qcg-dep-gridftp-server
- setup QCG-Computing database using provided script:
/opt/plgrid/qcg/share/qcg-comp/tools/qcg-comp-install.sh
Welcome to qcg-comp installation script!
This script will guide you through process of configuring proper environment
for running the QCG-Computing service. You have to answer few questions regarding
parameters of your database. If you are not sure just press Enter and use the
default values.
Use local PostgreSQL server? (y/n) [y]: y
Database [qcg-comp]:
User [qcg-comp]:
Password [qcg-comp]: MojeTajneHaslo
Create database? (y/n) [y]: y
Create user? (y/n) [y]: y
Checking for system user qcg-comp...OK
Checking whether PostgreSQL server is installed...OK
Checking whether PostgreSQL server is running...OK
Performing installation
* Creating user qcg-comp...OK
* Creating database qcg-comp...OK
* Creating database schema...OK
* Checking for ODBC data source qcg-comp...
* Installing ODBC data source...OK
Remember to add appropriate entry to /var/lib/pgsql/data/pg_hba.conf (as the first rule!) to allow user qcg-comp to
access database qcg-comp. For instance:
host qcg-comp qcg-comp 127.0.0.1/32 md5
and reload Postgres server.
Add a new rule to pg_hba.conf as requested:
vim /var/lib/pgsql/data/pg_hba.conf
/etc/init.d/postgresql reload
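Instead of editing the file by hand, the rule can be prepended with a small script. The following is a minimal sketch, not part of the QCG packages; the function name add_first_hba_rule is hypothetical. It places the rule on the first line, as the installation script requests, and does nothing if exactly the same line is already present anywhere in the file.

```shell
#!/bin/sh
# Hypothetical helper: make a rule the first line of pg_hba.conf.
add_first_hba_rule() {
    # $1 = path to pg_hba.conf, $2 = rule line to add
    hba=$1
    rule=$2
    # Skip if the exact line already exists (it is not moved to the top).
    grep -qxF -- "$rule" "$hba" && return 0
    tmp=$(mktemp) || return 1
    # Write through cat so the ownership and mode of the original file are kept.
    { printf '%s\n' "$rule"; cat "$hba"; } > "$tmp" && cat "$tmp" > "$hba"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example (as root), followed by a reload of the server:
# add_first_hba_rule /var/lib/pgsql/data/pg_hba.conf \
#     'host qcg-comp qcg-comp 127.0.0.1/32 md5'
# /etc/init.d/postgresql reload
```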
Install the EGI Accepted CA certificates (this also installs the Polish Grid CA):
cd /etc/yum.repos.d/
wget http://repository.egi.eu/sw/production/cas/1/current/repo-files/EGI-trustanchors.repo
yum clean all
yum install ca-policy-egi-core
The above instructions were based on this manual
Install the PL-Grid Simple-CA certificate (not part of IGTF):
wget http://software.plgrid.pl/packages/general/ca_PLGRID-SimpleCA-1.0-2.noarch.rpm
rpm -i ca_PLGRID-SimpleCA-1.0-2.noarch.rpm
#install certificate revocation list fetching utility
wget https://dist.eugridpma.info/distribution/util/fetch-crl/fetch-crl-2.8.5-1.noarch.rpm
rpm -i fetch-crl-2.8.5-1.noarch.rpm
#get fresh CRLs now
/usr/sbin/fetch-crl
#install cron job for it
cat > /etc/cron.daily/fetch-crl.cron << EOF
#!/bin/sh
/usr/sbin/fetch-crl
EOF
chmod a+x /etc/cron.daily/fetch-crl.cron
The Grid Mapfile
This tutorial assumes that the QCG-Computing service is configured in such a way that every authenticated user must be authorized against the grid-mapfile. This file can be created manually by an administrator (if the service is run in "test mode") or generated automatically based on the LDAP directory service.
Manually created grid mapfile (for testing purposes only)
#for test purpose only add mapping for your account
echo '"MyCertDN" myaccount' >> /etc/grid-security/grid-mapfile
LDAP generated grid mapfile
#
# 1. install qcg grid-mapfile generator
#
yum install qcg-gridmapfilegenerator
#
# 2. configure gridmapfilegenerator - remember to change
# * url property to your local ldap replica
# * search base
# * filter expression
# * security context
vim /opt/plgrid/qcg/etc/qcg-comp/plggridmapfilegenerator.conf
#
# 3. run the gridmapfile generator in order to generate gridmapfile now
#
/opt/plgrid/qcg/sbin/qcg-gridmapfilegenerator.sh
After installing and running this tool you will find three files:
- /etc/grid-security/grid-mapfile.local - here you can put a list of DNs and local Unix account names that will be merged with the data acquired from the local LDAP server
- /etc/grid-security/grid-mapfile.deny - here you can put a list of DNs (only DNs!) that you want to deny access to the QCG-Computing service
- /etc/grid-security/grid-mapfile - the final gridmap file generated from the above two files and the information available in the local LDAP server. Do not edit this file as it is generated automatically!
The gridmapfile generator script is run every 10 minutes. Moreover, it issues su - $USERNAME -c 'true' > /dev/null for every new user who does not yet have a home directory (thus triggering pam_mkhomedir, if installed).
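To check which local account a given certificate DN is mapped to, the generated file can be queried with a short script. This is a sketch only: it assumes the usual grid-mapfile line format, a DN in double quotes followed by the account name, as in the manual example above. The function name gridmap_lookup and the DN in the usage example are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the local account mapped to a DN.
# Assumed line format: "DN possibly with spaces" account
gridmap_lookup() {
    # $1 = grid-mapfile path, $2 = certificate DN (without quotes)
    awk -v dn="$2" '
        /^[[:space:]]*#/ { next }              # skip comment lines
        match($0, /^"[^"]*"/) {
            this = substr($0, 2, RLENGTH - 2)  # DN without the quotes
            rest = substr($0, RLENGTH + 1)     # everything after the DN
            sub(/^[[:space:]]+/, "", rest)
            if (this == dn) { print rest; found = 1; exit }
        }
        END { exit (found ? 0 : 1) }           # non-zero status if not mapped
    ' "$1"
}

# Example:
# gridmap_lookup /etc/grid-security/grid-mapfile '/C=PL/O=GRID/CN=Jan Kowalski'
```

The function exits with a non-zero status when the DN has no mapping, which makes it usable in scripts that verify the generator output.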
Scheduler configuration
Add appropriate rights for the qcg-comp and qcg-broker users in the Maui scheduler configuration file:
vim /var/spool/maui/maui.cfg
# primary admin must be first in list
ADMIN1 root
ADMIN2 qcg-broker
ADMIN3 qcg-comp
Service certificates
Copy the service certificate and key into the /opt/plgrid/qcg/etc/qcg-comp/certs/ directory. Remember to set appropriate permissions on the key file.
cp /etc/grid-security/hostcert.pem /opt/plgrid/qcg/etc/qcg-comp/certs/qcgcert.pem
cp /etc/grid-security/hostkey.pem /opt/plgrid/qcg/etc/qcg-comp/certs/qcgkey.pem
chown qcg-comp /opt/plgrid/qcg/etc/qcg-comp/certs/qcgcert.pem
chown qcg-comp /opt/plgrid/qcg/etc/qcg-comp/certs/qcgkey.pem
chmod 0600 /opt/plgrid/qcg/etc/qcg-comp/certs/qcgkey.pem
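The permissions of the key file can be verified afterwards with a small check. This is a sketch that assumes GNU stat (as shipped with Scientific Linux); the function name check_key_mode is hypothetical. It accepts only modes 600 and 400, i.e. a key readable by its owner alone.

```shell
#!/bin/sh
# Hypothetical helper: verify that a private key is readable by its owner only.
check_key_mode() {
    # $1 = path to the key file
    mode=$(stat -c '%a' "$1") || return 2   # GNU stat: octal permission bits
    case $mode in
        600|400) return 0 ;;
        *) echo "$1: mode $mode is too permissive" >&2
           return 1 ;;
    esac
}

# Example:
# check_key_mode /opt/plgrid/qcg/etc/qcg-comp/certs/qcgkey.pem
```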
DRMAA library
Install via YUM repository:
yum install pbs-drmaa #Torque
yum install pbspro-drmaa #PBS Professional
Alternatively, compile DRMAA from the source package downloaded from SourceForge.
After installation you need to either:
- configure the DRMAA library to use Torque logs (RECOMMENDED). Sample configuration file of the DRMAA library (/opt/plgrid/qcg/etc/pbs_drmaa.conf):
Note: Remember to mount the server log directory as described in the earlier note.
# pbs_drmaa.conf - Sample pbs_drmaa configuration file.
wait_thread: 1,
pbs_home: "/var/spool/pbs",
cache_job_state: 600,
or
- configure Torque to keep information about completed jobs (e.g.: by setting: qmgr -c 'set server keep_completed = 300').
It is possible to set the default queue by setting default job category (in the /opt/plgrid/qcg/etc/pbs_drmaa.conf file):
job_categories: {
  default: "-q plgrid",
},
Service configuration
Edit the preinstalled service configuration file (/opt/plgrid/qcg/etc/qcg-comp/qcg-compd.xml):
<?xml version="1.0" encoding="UTF-8"?>
<sm:QCGCore
    xmlns:sm="http://schemas.qoscosgrid.org/core/2011/04/config"
    xmlns="http://schemas.qoscosgrid.org/comp/2011/04/config"
    xmlns:smc="http://schemas.qoscosgrid.org/comp/2011/04/config"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Configuration>
    <sm:ModuleManager>
      <sm:Directory>/opt/plgrid/qcg/lib/qcg-core/modules/</sm:Directory>
      <sm:Directory>/opt/plgrid/qcg/lib/qcg-comp/modules/</sm:Directory>
    </sm:ModuleManager>
    <sm:Service xsi:type="qcg-compd" description="QCG-Computing">
      <sm:Logger>
        <sm:Filename>/opt/plgrid/var/log/qcg-comp/qcg-compd.log</sm:Filename>
        <sm:Level>INFO</sm:Level>
      </sm:Logger>
      <sm:Transport>
        <sm:Module xsi:type="sm:ecm_gsoap.service">
          <sm:Host>frontend.example.com</sm:Host>
          <sm:Port>19000</sm:Port>
          <sm:KeepAlive>false</sm:KeepAlive>
          <sm:Authentication>
            <sm:Module xsi:type="sm:atc_transport_gsi.service">
              <sm:X509CertFile>/opt/plgrid/qcg/etc/qcg-comp/certs/qcgcert.pem</sm:X509CertFile>
              <sm:X509KeyFile>/opt/plgrid/qcg/etc/qcg-comp/certs/qcgkey.pem</sm:X509KeyFile>
            </sm:Module>
          </sm:Authentication>
          <sm:Authorization>
            <sm:Module xsi:type="sm:atz_mapfile">
              <sm:Mapfile>/etc/grid-security/grid-mapfile</sm:Mapfile>
            </sm:Module>
          </sm:Authorization>
        </sm:Module>
        <sm:Module xsi:type="smc:qcg-comp-service"/>
      </sm:Transport>
      <sm:Module xsi:type="pbs_jsdl_filter"/>
      <sm:Module xsi:type="atz_ardl_filter"/>
      <sm:Module xsi:type="sm:general_python" path="/opt/plgrid/qcg/lib/qcg-comp/modules/python/monitoring.py"/>
      <sm:Module xsi:type="sm:general_python" path="/opt/plgrid/qcg/lib/qcg-comp/modules/python/plgrid_info.py"/>
      <sm:Module xsi:type="sm:general_python" path="/opt/plgrid/qcg/lib/qcg-comp/modules/python/modules_info.py"/>
      <sm:Module xsi:type="submission_drmaa" path="/opt/plgrid/qcg/lib/libdrmaa.so"/>
      <sm:Module xsi:type="reservation_python" path="/opt/plgrid/qcg/lib/qcg-comp/modules/python/reservation_maui.py"/>
      <sm:Module xsi:type="notification_wsn">
        <PublishedBrokerURL>https://frontend.example.com:19011/</PublishedBrokerURL>
        <sm:Module xsi:type="sm:ecm_gsoap.client">
          <sm:ServiceURL>http://localhost:19001/</sm:ServiceURL>
          <sm:Authentication>
            <sm:Module xsi:type="sm:atc_transport_http.client"/>
          </sm:Authentication>
          <sm:Module xsi:type="sm:ntf_client"/>
        </sm:Module>
      </sm:Module>
      <sm:Module xsi:type="application_mapper">
        <ApplicationMapFile>/opt/plgrid/qcg/etc/qcg-comp/application_mapfile</ApplicationMapFile>
      </sm:Module>
      <Database>
        <DSN>qcg-comp</DSN>
        <User>qcg-comp</User>
        <Password>qcg-comp</Password>
      </Database>
      <UnprivilegedUser>qcg-comp</UnprivilegedUser>
      <FactoryAttributes>
        <CommonName>klaster.plgrid.pl</CommonName>
        <LongDescription>PL Grid cluster</LongDescription>
      </FactoryAttributes>
    </sm:Service>
  </Configuration>
</sm:QCGCore>
In most cases it should be enough to change only the following elements:
- Transport/Module/Host
- the hostname of the machine where the service is deployed
- Transport/Module/Authentication/Module/X509CertFile and Transport/Module/Authentication/Module/X509KeyFile
- the service private key and X.509 certificate. Make sure that the key and certificate are owned by the qcg-comp user. If you installed the certificate and key files in the recommended location, you do not need to edit these fields.
- Module[type="smc:notification_wsn"]/PublishedBrokerURL
- the external URL of the QCG-Notification service (You can do it later, i.e. after installing the QCG-Notification service)
- Module[type="smc:notification_wsn"]/Module/ServiceURL
- the localhost URL of the QCG-Notification service (You can do it later, i.e. after installing the QCG-Notification service)
- Module[type="submission_drmaa"]/@path
- path to the DRMAA library (the libdrmaa.so). Also, if you installed the DRMAA library using provided SRC RPM you do not need to change this path.
- Module[type="reservation_python"]/@path
- path to the reservation module. Change this if you are using different scheduler than Maui (e.g. use reservation_moab.py for Moab, reservation_pbs.py for PBS Pro)
- Database/Password
- the qcg-comp database password
- FactoryAttributes/CommonName
- a common name of the cluster (e.g. reef.man.poznan.pl). You can use any name that is unique among all systems (e.g. cluster name + domain name of your institution)
- FactoryAttributes/LongDescription
- a human readable description of the cluster
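Simple element values such as Transport/Module/Host can also be changed from a script. The following is a minimal sketch assuming GNU sed and assuming that the element sits on a single line, as in the sample file above; the function name set_qcg_host and the host name in the example are hypothetical. A real XML tool would be more robust for anything beyond such one-line substitutions.

```shell
#!/bin/sh
# Hypothetical helper: replace the content of the <sm:Host> element.
# Assumptions: GNU sed (-i), the element is on one line,
# and the host name contains no '|' or '&' characters.
set_qcg_host() {
    # $1 = path to qcg-compd.xml, $2 = new host name
    sed -i "s|<sm:Host>[^<]*</sm:Host>|<sm:Host>$2</sm:Host>|" "$1"
}

# Example:
# set_qcg_host /opt/plgrid/qcg/etc/qcg-comp/qcg-compd.xml qcg.example.org
```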
Restricting advance reservation
By default the QCG-Computing service can reserve any number of hosts. One can limit it by configuring the Maui/Moab scheduler and the QCG-Computing service properly:
- In Maui/Moab mark some subset of nodes, using the partition mechanism, as reservable for QCG-Computing:
# all users can use both the DEFAULT and RENABLED partition
SYSCFG PLIST=DEFAULT,RENABLED
#in Moab you should use 0 instead DEFAULT
#SYSCFG PLIST=0,RENABLED
# mark some set of the machines (e.g. 64 nodes) as reservable
NODECFG[node01] PARTITION=RENABLED
NODECFG[node02] PARTITION=RENABLED
NODECFG[node03] PARTITION=RENABLED
...
NODECFG[node64] PARTITION=RENABLED
- Tell the QCG-Computing to limit reservation to the aforementioned partition by editing the /opt/plgrid/qcg/etc/qcg-comp/sysconfig/qcg-compd configuration file:
export QCG_AR_MAUI_PARTITION="RENABLED"
- Moreover, QCG-Computing (since version 2.4) can enforce limits on the maximum reservation duration (default: one week) and size (measured in the number of reserved slots):
...
<ReservationsPolicy>
  <MaxDuration>24</MaxDuration> <!-- 24 hours -->
  <MaxSlots>100</MaxSlots>
</ReservationsPolicy>
...
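For larger partitions the NODECFG lines from the first step do not have to be typed by hand. The sketch below generates them under the assumption that node names consist of a common prefix followed by a two-digit, zero-padded index (as in node01 to node64 above); the function name nodecfg_lines is hypothetical, and the naming scheme of a real cluster may differ.

```shell
#!/bin/sh
# Hypothetical helper: print NODECFG lines for a range of nodes.
# Assumption: node names are <prefix><two-digit index>, e.g. node01.
nodecfg_lines() {
    # $1 = node name prefix, $2 = first index, $3 = last index, $4 = partition
    i=$2
    while [ "$i" -le "$3" ]; do
        printf 'NODECFG[%s%02d] PARTITION=%s\n' "$1" "$i" "$4"
        i=$((i + 1))
    done
}

# Example: review the output before adding it to maui.cfg
# nodecfg_lines node 1 64 RENABLED
```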
Configuring BAT Updater
Installation
- install BAT using YUM
yum install qcg-bat-updater
Configuration
First, ask the BAT administrator for all the credentials (username/password and X.509 certificate) needed to connect to BAT. Copy the received keystore to the file /opt/plgrid/qcg/etc/qcg-bat-updater/truststore.ts (make sure this file is readable only by root). Now you are ready to edit the QCG BAT configuration file /opt/plgrid/qcg/etc/qcg-bat-updater/config.properties. You should change the following properties:
- qcg.db.pass - password of the QCG-Computing database (see <Database> section of the qcg-compd.xml file)
- qcg.bat.user and qcg.bat.pass - put values provided by the BAT administrator
- qcg.bat.keystore.pass - keystore pass (provided with key by the BAT administrator)
- qcg.site.name - your site name
- qcg.batch.server - hostname where the batch server is running
- qcg.pbs.home - the root of the Torque spool directory (e.g. /var/torque)
- qcg.bat.debug - controls verbosity of the log messages
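The properties listed above can be set from a script as well. The sketch below assumes that config.properties uses plain key=value lines without spaces around the equals sign; the function name set_property and the site name in the example are hypothetical. An existing key is replaced in place, and a missing key is appended at the end.

```shell
#!/bin/sh
# Hypothetical helper: set key=value in a Java-style properties file.
# Assumptions: lines look like key=value, and the value contains no
# backslashes (awk -v would interpret them as escape sequences).
set_property() {
    # $1 = properties file, $2 = key, $3 = value
    tmp=$(mktemp) || return 1
    awk -v k="$2" -v v="$3" '
        BEGIN { FS = "=" }
        $1 == k { print k "=" v; done = 1; next }   # replace existing key
        { print }
        END { if (!done) print k "=" v }            # append missing key
    ' "$1" > "$tmp" && cat "$tmp" > "$1"            # cat keeps file mode and owner
    status=$?
    rm -f "$tmp"
    return $status
}

# Example:
# conf=/opt/plgrid/qcg/etc/qcg-bat-updater/config.properties
# set_property "$conf" qcg.site.name MY-SITE
# set_property "$conf" qcg.pbs.home /var/torque
```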
Operations
Usage records are sent to the main BAT server (BAT Broker) every hour by the QCG BAT Updater (acting as BAT agent). The QCG BAT Updater is started every hour via a cron job (installed in /etc/cron.hourly/qcg-bat-updater.cron). The QCG BAT Updater is a Java batch application that on every run:
- reads jobs accounting information from the QCG-Computing database
- converts it to a proper XML format
- sends it over ActiveMQ channel (SSL secured) to BAT-Broker
- stores id of the last record sent to the file /opt/plgrid/var/run/qcg-bat-updater/last.id.
The QCG-BAT Updater logs can be found in /opt/plgrid/var/log/qcg-bat-updater/qcg-bat-updater.log.
Creating applications' script space
A common case for the QCG-Computing service is that an application is accessed using an abstract application name rather than an absolute executable path. The application name/version to executable path mappings are stored in the file /opt/plgrid/qcg/etc/qcg-comp/application_mapfile:
cat /opt/plgrid/qcg/etc/qcg-comp/application_mapfile
# ApplicationName ApplicationVersion Executable
date * /bin/date
LPSolve 5.5 /usr/local/bin/lp_solve
QCG-OMPI /opt/QCG/qcg/share/qcg-comp/tools/cross-mpi.starter
It is also common to provide wrapper scripts here rather than target executables. A wrapper script can handle aspects of the application lifetime such as environment initialization, copying files from/to scratch storage, and application monitoring. It is recommended to create a separate directory for those wrapper scripts (e.g. on the application partition) and to grant write permission on it to the QCG developers group. This directory must be readable by all users and from every worker node (the application partition usually fulfils those requirements).
mkdir /opt/exp_soft/plgrid/qcg-app-scripts
chown :qcg-dev /opt/exp_soft/plgrid/qcg-app-scripts
chmod g+rwx /opt/exp_soft/plgrid/qcg-app-scripts
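As an illustration of what such a wrapper script might do, the sketch below runs a command inside a temporary scratch directory, copying input files in and results back. It is not one of the scripts shipped with QCG-Computing; the function name run_in_scratch and the staging policy (all regular files of the submit directory) are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical wrapper sketch: run a command in a per-job scratch directory.
run_in_scratch() {
    # $1 = scratch root directory, remaining arguments = command to run
    scratch_root=$1
    shift
    submit_dir=$(pwd)
    work_dir=$(mktemp -d "$scratch_root/job.XXXXXX") || return 1
    # stage in: copy the regular files of the submit directory to scratch
    find "$submit_dir" -maxdepth 1 -type f -exec cp {} "$work_dir"/ \;
    # run the application inside the scratch directory
    ( cd "$work_dir" && "$@" )
    status=$?
    # stage out: copy everything back, then clean up the scratch space
    cp -R "$work_dir"/. "$submit_dir"/
    rm -rf "$work_dir"
    return $status
}

# Example (the scratch path is site specific):
# run_in_scratch /mnt/lustre/scratch/people/$USER /usr/local/bin/lp_solve model.lp
```

The exit status of the wrapped command is preserved, so the service still sees the application's own result.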
Note on the security model
QCG-Computing can be configured with various authentication and authorization modules. However, in a typical deployment we assume that QCG-Computing is configured as in the above example, i.e.:
- authentication is based on the httpg protocol,
- authorization is based on the local grid-mapfile.
Starting the service
As root type:
/etc/init.d/qcg-compd start
The service logs can be found in:
/opt/plgrid/var/log/qcg-comp/qcg-compd.log
The service assumes that the following commands are in the standard search path:
- pbsnodes
- showres
- setres
- releaseres
- checknode
If any of the above commands is not installed in a standard location (e.g. /usr/bin), you may need to edit the /opt/plgrid/qcg/etc/sysconfig/qcg-compd file and set the PATH variable appropriately, e.g.:
# INIT_WAIT=5
#
# DRM specific options
export PATH=$PATH:/opt/maui/bin
If you compiled DRMAA with logging switched on, you can also set the DRMAA logging level there:
# INIT_WAIT=5
#
# DRM specific options
export DRMAA_LOG_LEVEL=INFO
Also provide the location of the root scratch directory:
# INIT_WAIT=5
#
export QCG_SCRATCH_DIR_ROOT="/mnt/lustre/scratch/people/"
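Whether the scheduler commands listed above are reachable with a given PATH can be checked with a few lines of shell. This is a minimal sketch; the function name missing_commands is hypothetical. It prints the name of every command that cannot be found, so empty output means that all of them are available.

```shell
#!/bin/sh
# Hypothetical helper: print each command that is not found in PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Example: run with the same PATH as configured for the service
# PATH=$PATH:/opt/maui/bin missing_commands pbsnodes showres setres releaseres checknode
```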
Note: In the current version, whenever you restart the PostgreSQL server you also need to restart the QCG-Computing and QCG-Notification services:
/etc/init.d/qcg-compd restart
/etc/init.d/qcg-ntfd restart
Stopping the service
The service can be stopped using the following command:
/etc/init.d/qcg-compd stop
Verifying the installation
- For convenience you can install the qcg environment module:
cp /opt/plgrid/qcg/share/qcg-core/misc/qcg.module /usr/share/Modules/modulefiles/qcg
module load qcg
- Edit the QCG-Computing client configuration file (/opt/plgrid/qcg/etc/qcg-comp/qcg-comp.xml):
- set the Host and Port to reflect the changes in the service configuration file (qcg-compd.xml).
<?xml version="1.0" encoding="UTF-8"?>
<sm:QCGCore
    xmlns:sm="http://schemas.qoscosgrid.org/core/2011/04/config"
    xmlns="http://schemas.qoscosgrid.org/comp/2011/04/config"
    xmlns:smc="http://schemas.qoscosgrid.org/comp/2011/04/config"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Configuration>
    <sm:ModuleManager>
      <sm:Directory>/opt/qcg/lib/qcg-core/modules/</sm:Directory>
      <sm:Directory>/opt/qcg/lib/qcg-comp/modules/</sm:Directory>
    </sm:ModuleManager>
    <sm:Client xsi:type="qcg-comp" description="QCG-Computing client">
      <sm:Transport>
        <sm:Module xsi:type="sm:ecm_gsoap.client">
          <sm:ServiceURL>httpg://frontend.example.com:19000/</sm:ServiceURL>
          <sm:Authentication>
            <sm:Module xsi:type="sm:atc_transport_gsi.client"/>
          </sm:Authentication>
          <sm:Module xsi:type="smc:qcg-comp-client"/>
        </sm:Module>
      </sm:Transport>
    </sm:Client>
  </Configuration>
</sm:QCGCore>
- Initialize your credentials:
grid-proxy-init -rfc
Your identity: /O=Grid/OU=QosCosGrid/OU=PSNC/CN=Mariusz Mamonski
Enter GRID pass phrase for this identity:
Creating proxy .................................................................. Done
Your proxy is valid until: Wed Apr 6 05:01:02 2012
- Query the QCG-Computing service:
qcg-comp -G | xmllint --format - # the xmllint is used only to present the result in more pleasant way
<bes-factory:FactoryResourceAttributesDocument xmlns:bes-factory="http://schemas.ggf.org/bes/2006/08/bes-factory">
  <bes-factory:IsAcceptingNewActivities>true</bes-factory:IsAcceptingNewActivities>
  <bes-factory:CommonName>IT cluster</bes-factory:CommonName>
  <bes-factory:LongDescription>IT department cluster for public use</bes-factory:LongDescription>
  <bes-factory:TotalNumberOfActivities>0</bes-factory:TotalNumberOfActivities>
  <bes-factory:TotalNumberOfContainedResources>1</bes-factory:TotalNumberOfContainedResources>
  <bes-factory:ContainedResource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="bes-factory:BasicResourceAttributesDocumentType">
    <bes-factory:ResourceName>worker.example.com</bes-factory:ResourceName>
    <bes-factory:CPUArchitecture>
      <jsdl:CPUArchitectureName xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl">x86_32</jsdl:CPUArchitectureName>
    </bes-factory:CPUArchitecture>
    <bes-factory:CPUCount>4</bes-factory:CPUCount>
    <bes-factory:PhysicalMemory>1073741824</bes-factory:PhysicalMemory>
  </bes-factory:ContainedResource>
  <bes-factory:NamingProfile>http://schemas.ggf.org/bes/2006/08/bes/naming/BasicWSAddressing</bes-factory:NamingProfile>
  <bes-factory:BESExtension>http://schemas.ogf.org/hpcp/2007/01/bp/BasicFilter</bes-factory:BESExtension>
  <bes-factory:BESExtension>http://schemas.qoscosgrid.org/comp/2011/04</bes-factory:BESExtension>
  <bes-factory:LocalResourceManagerType>http://example.com/SunGridEngine</bes-factory:LocalResourceManagerType>
  <smcf:NotificationProviderURL xmlns:smcf="http://schemas.qoscosgrid.org/comp/2011/04/factory">http://localhost:2211/</smcf:NotificationProviderURL>
</bes-factory:FactoryResourceAttributesDocument>
- Submit a sample job:
qcg-comp -c -J /opt/plgrid/qcg/share/qcg-comp/doc/examples/jsdl/sleep.xml
Activity Id: ccb6b04a-887b-4027-633f-412375559d73
- Query its status:
qcg-comp -s -a ccb6b04a-887b-4027-633f-412375559d73
status = Executing
qcg-comp -s -a ccb6b04a-887b-4027-633f-412375559d73
status = Executing
qcg-comp -s -a ccb6b04a-887b-4027-633f-412375559d73
status = Finished
exit status = 0
- Create an advance reservation:
- copy the provided sample reservation description file (expressed in ARDL - Advance Reservation Description Language)
cp /opt/plgrid/qcg/share/qcg-comp/doc/examples/ardl/oneslot.xml oneslot.xml
- Edit the oneslot.xml and modify the StartTime and EndTime to dates that are in the near future,
- Create a new reservation:
qcg-comp -c -D oneslot.xml
Reservation Id: aab6b04a-887b-4027-633f-412375559d7d
- List all reservations:
qcg-comp -l
Reservation Id: aab6b04a-887b-4027-633f-412375559d7d
Total number of reservations: 1
- Check which hosts were reserved:
qcg-comp -s -r aab6b04a-887b-4027-633f-412375559d7d
Reserved hosts:
worker.example.com[used=0,reserved=1,total=4]
- Delete the reservation:
qcg-comp -t -r aab6b04a-887b-4027-633f-412375559d7d
Reservation terminated.
- Check if the grid-ftp is working correctly:
globus-url-copy gsiftp://your.local.host.name/etc/profile profile
diff /etc/profile profile
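Rather than repeating the status query for the sample job by hand, the check can be wrapped in a polling loop. The sketch below relies only on the qcg-comp -s -a call and the "status = Finished" output shown above; the "status = Failed" branch is an assumption about how an unsuccessful job is reported, and the function name wait_for_activity and the QCG_STATUS_CMD override are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: poll an activity until it reaches a final state.
# QCG_STATUS_CMD may override the status command (useful for dry runs).
wait_for_activity() {
    # $1 = activity id, $2 = poll interval in seconds (default: 10)
    id=$1
    interval=${2:-10}
    while :; do
        # intentionally unquoted: the command string is split into words
        out=$(${QCG_STATUS_CMD:-qcg-comp -s -a} "$id") || return 2
        case $out in
            *"status = Finished"*) printf '%s\n' "$out"; return 0 ;;
            # assumption: failed jobs are reported as "status = Failed"
            *"status = Failed"*)   printf '%s\n' "$out"; return 1 ;;
        esac
        sleep "$interval"
    done
}

# Example:
# wait_for_activity ccb6b04a-887b-4027-633f-412375559d73 5
```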
Maintenance
The historic usage information is stored in two relations of the QCG-Computing database: jobs_acc and reservations_acc. You can always archive old usage data to a file and delete it from the database using the psql client:
psql -h localhost qcg-comp qcg-comp
Password for user qcg-comp:
Welcome to psql 8.1.23, the PostgreSQL interactive terminal.
Type: \copyright for distribution terms
      \h for help with SQL commands
      \? for help with psql commands
      \g or terminate with semicolon to execute query
      \q to quit
qcg-comp=> \o jobs.acc
qcg-comp=> SELECT * FROM jobs_acc where end_time < date '2010-01-10';
qcg-comp=> \o reservations.acc
qcg-comp=> SELECT * FROM reservations_acc where end_time < date '2010-01-10';
qcg-comp=> \o
qcg-comp=> DELETE FROM jobs_acc where end_time < date '2010-01-10';
qcg-comp=> DELETE FROM reservations_acc where end_time < date '2010-01-10';
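The same archiving session can be scripted. The sketch below only generates the psql input for a given cutoff date, using the two relation names from the text; the function name archive_sql is hypothetical. The output can be reviewed first and then piped to psql.

```shell
#!/bin/sh
# Hypothetical helper: print psql commands that archive and then delete
# accounting records older than the given date.
archive_sql() {
    # $1 = cutoff date in YYYY-MM-DD format
    case $1 in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
        *) echo "usage: archive_sql YYYY-MM-DD" >&2
           return 1 ;;
    esac
    cat <<EOF
\o jobs.acc
SELECT * FROM jobs_acc WHERE end_time < date '$1';
\o reservations.acc
SELECT * FROM reservations_acc WHERE end_time < date '$1';
\o
DELETE FROM jobs_acc WHERE end_time < date '$1';
DELETE FROM reservations_acc WHERE end_time < date '$1';
EOF
}

# Example: review the output, then execute it
# archive_sql 2010-01-10
# archive_sql 2010-01-10 | psql -h localhost qcg-comp qcg-comp
```

Because the DELETE statements run in the same session as the SELECTs, it is worth checking that jobs.acc and reservations.acc were actually written before relying on them as the only copy.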
You may also wish to install the logrotate configuration for QCG-Computing:
yum install qcg-comp-logrotate
PL-Grid Grants Support
Since version 2.2.7 QCG-Computing is integrated with the PL-Grid grants system. The integration with the grants system has three main interaction points:
- QCG-Computing can accept jobs that have the grant id set explicitly. One must use the <jsdl:JobProject> element, e.g.:
<?xml version="1.0" encoding="UTF-8"?>
<jsdl:JobDefinition
    xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl"
    xmlns:jsdl-hpcpa="http://schemas.ggf.org/jsdl/2006/07/jsdl-hpcpa"
    xmlns:jsdl-qcg-comp-factory="http://schemas.qoscosgrid.org/comp/2011/04/jsdl/factory">
  <jsdl:JobDescription>
    <jsdl:JobIdentification>
      <jsdl:JobProject>Manhattan</jsdl:JobProject>
    </jsdl:JobIdentification>
    <jsdl:Application>
      <jsdl-hpcpa:HPCProfileApplication>
      ...
- QCG-Computing can provide information about the local grants to the upper layers (e.g. QCG-Broker), so that they can use it for scheduling purposes. One can enable it by adding the following line to the QCG-Computing configuration file (qcg-compd.xml):
</sm:Transport>
...
<sm:Module xsi:type="sm:general_python" path="/opt/plgrid/qcg/lib/qcg-comp/modules/python/plgrid_info.py"/>
Please note that this module requires the qcg-gridmapfilegenerator to be installed.
- The grant id is provided in resource usage record sent to the BAT accounting service
GOCDB
Please remember to register the QCG-Computing and QCG-Notification services in the GOCDB using the QCG.Computing and QCG.Notification service types respectively.