Version 29 (modified by pkopta, 10 years ago)
Application mapping
The QCG-Computing service maps an abstract application name to the absolute path of a wrapper script. The mapping is stored in the file:
/etc/qcg/qcg-comp/application_mapfile
The file has the following syntax:
APPLICATION-NAME VERSION SCRIPT-PATH
where VERSION can be an asterisk (*), which means "any version". The mapping file is reloaded periodically (every 5 minutes by default), so there is no need to restart the qcg-compd service after updating the file. Example application_mapfile:
MATLAB * /opt/qcg-app-scripts/apps/matlab.app
NAMD * /opt/qcg-app-scripts/apps/namd.app
bash * /opt/qcg-app-scripts/apps/bash.app
R * /opt/qcg-app-scripts/apps/R.app
CFX * /opt/qcg-app-scripts/apps/cfx.app
fluent * /opt/qcg-app-scripts/apps/fluent.app
nwchem * /opt/qcg-app-scripts/apps/nwchem.app
g09 * /opt/qcg-app-scripts/apps/g09.app
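The lookup described above can be sketched as a small shell function. This is only an illustration of the file syntax, not the code used by QCG-Computing: the function name resolve_app is hypothetical, and the rules that an exact VERSION entry wins over a "*" entry and that names are compared case-sensitively are assumptions.

```shell
# resolve_app NAME VERSION [MAPFILE]
# Prints the SCRIPT-PATH mapped to NAME and VERSION; returns 1 if nothing matches.
# Assumption: an exact VERSION entry takes precedence over a "*" entry.
resolve_app() {
    local name="$1" version="$2"
    local mapfile="${3:-/etc/qcg/qcg-comp/application_mapfile}"
    local app ver path wildcard=""

    # Each line has the form: APPLICATION-NAME VERSION SCRIPT-PATH
    while read -r app ver path || [ -n "$app" ]; do
        [ "$app" = "$name" ] || continue
        if [ "$ver" = "$version" ]; then
            printf '%s\n' "$path"        # exact version match
            return 0
        elif [ "$ver" = "*" ] && [ -z "$wildcard" ]; then
            wildcard="$path"             # remember the first "any version" entry
        fi
    done < "$mapfile"

    # Fall back to the wildcard entry, if there was one.
    [ -n "$wildcard" ] && printf '%s\n' "$wildcard"
}
```

For example, resolve_app NAMD 6.2 prints the path of a dedicated 6.2 entry if the file has one, and otherwise the path of the NAMD * entry.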
The scripts for the most common applications are available in the qcg-appscripts RPM package.
For QCG-Computing to be usable, you must provide a mapping for at least the bash application, i.e.:
bash * /opt/qcg-app-scripts/apps/bash.app
Frequently Asked Question: Can I simply map bash to /bin/bash, as in the line below?
bash * /bin/bash
The answer is "No". You must use the bash.qcg wrapper script, which does much more (e.g. it sets up environment variables such as QCG_NODEFILE).
qcg-appscripts package
This package contains QCG application scripts for the most common applications. Apart from launching the application, the application scripts:
- load the appropriate modules,
- convert input files to UNIX character encoding,
- set up the environment,
- launch the user's helper scripts (preprocess, postprocess, assistant),
- monitor the execution of the application through defined schemes or user scripts,
- handle interactive jobs,
- etc.
All files are installed in directory:
/usr/share/qcg-appscripts
This directory contains the following subdirectories:
- apps - application configuration files,
- app-scripts - application scripts,
- config - QCG application scripts configuration,
- core - QCG scripts library,
- tools - QCG tools used by applications.
The administrator must create or edit the application configuration files (apps), which contain settings specific to each cluster, such as:
- name of the application module,
- environment variables needed by application.
Files from the apps directory should be referenced by the QCG-Computing application mapping file (/etc/qcg/qcg-comp/application_mapfile).
The application scripts must be accessible to jobs running on the cluster worker nodes. The apps, app-scripts, config, core and tools directories must therefore be copied to a directory shared by all worker nodes. This is done by the:
/usr/sbin/qcg-appscripts-deploy
script included in the package. This script reads:
/etc/qcg/qcg-comp/app-scripts/config
configuration file to find the destination directory where the scripts should be deployed.
Installation
To install QCG application scripts:
- install the qcg-appscripts RPM package,
- edit the /etc/qcg/qcg-comp/app-scripts/config configuration file and set the cluster_shared_path variable to point to a directory shared by all worker nodes,
- launch the qcg-appscripts-deploy script,
- create or edit the application configuration files in the $cluster_shared_path/apps directory and refer to them in the QCG-Computing application mapping file (/etc/qcg/qcg-comp/application_mapfile).
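Before launching the deploy script it can be useful to check that cluster_shared_path really points to a usable directory. The following is a minimal sketch of such a check. The function names are hypothetical, and the sketch assumes that the configuration file uses shell-style assignments of the form cluster_shared_path=/some/dir, which this page does not state explicitly.

```shell
# get_shared_path [CONFIG]
# Prints the value of cluster_shared_path found in CONFIG.
# Assumption: the file contains a line such as  cluster_shared_path=/some/dir
get_shared_path() {
    local config="${1:-/etc/qcg/qcg-comp/app-scripts/config}"
    # Keep the last assignment, drop quote characters and trailing whitespace.
    sed -n 's/^[[:space:]]*cluster_shared_path[[:space:]]*=[[:space:]]*//p' "$config" |
        tail -n 1 | tr -d "\"'" | sed 's/[[:space:]]*$//'
}

# check_shared_path [CONFIG]
# Succeeds and prints the directory only if it is set, exists and is writable.
check_shared_path() {
    local dir
    dir=$(get_shared_path "$1")
    if [ -z "$dir" ]; then
        echo "cluster_shared_path is not set" >&2
        return 1
    fi
    if [ ! -d "$dir" ] || [ ! -w "$dir" ]; then
        echo "not a writable directory: $dir" >&2
        return 1
    fi
    printf '%s\n' "$dir"
}
```

If check_shared_path succeeds, /usr/sbin/qcg-appscripts-deploy can be launched as described in the steps above.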
Upgrade
After installing a new version of the QCG application scripts package:
- launch the qcg-appscripts-deploy script to update the files in the shared directory.
The qcg-appscripts-deploy script updates the app-scripts, core and tools directories. Files in the apps directory, apart from bash.app, common.app and template.app, are not overwritten.
New application
To support a new application, the following elements must be provided:
- application configuration file (script in apps directory),
- application script (script in app-scripts directory),
- mapping in /etc/qcg/qcg-comp/application_mapfile file.
The application configuration file is a simple script which:
- loads the required modules or sets the PATH/LD_LIBRARY_PATH variables.
An existing application configuration file can be used as a base when creating a new one.
New application version
To create a new version of an already supported application, it is sufficient to:
- copy an existing application configuration file, e.g.:
cp /opt/qcg-app-scripts/apps/namd.app /opt/qcg-app-scripts/apps/namd-6.2.app
- change the loaded module and/or environment variables in $cluster_shared_path/apps/namd-6.2.app,
- add the new application version to the QCG application mapping file (/etc/qcg/qcg-comp/application_mapfile), e.g.:
...
NAMD 6.2 /opt/qcg-app-scripts/apps/namd-6.2.app
...
common.conf
- QCG_SCRATCH_DIR - the location of the job temporary directory (e.g. $TMPDIR) - OPTIONAL - defaults to $TMPDIR (if set) or "/tmp",
- QCG_NO_DEBUG - if set, the qcg.debug file is not created - OPTIONAL,
- QCG_GROUP_DIR_ROOT - The root of groups home directory (PL-Grid only). Default: PLG_GROUPS_SHARED - OPTIONAL,
- MACHINE_FILE - the name of the environment variable pointing to the machine file (e.g. PBS_NODEFILE) - OPTIONAL.
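Two of these settings involve a level of indirection or a default, which the following sketch illustrates. It is not the actual QCG code: the function names scratch_dir and resolve_machine_file are hypothetical. The default chain for the scratch directory follows the description above, and MACHINE_FILE is treated as the name of an exported environment variable that in turn holds the path of the machine file.

```shell
# scratch_dir
# Prints QCG_SCRATCH_DIR if set, otherwise $TMPDIR if set, otherwise /tmp.
scratch_dir() {
    printf '%s\n' "${QCG_SCRATCH_DIR:-${TMPDIR:-/tmp}}"
}

# resolve_machine_file
# MACHINE_FILE holds the NAME of an environment variable (e.g. PBS_NODEFILE),
# not a path. Prints the path stored in that variable; returns 1 if
# MACHINE_FILE is unset, the named variable is unset, or the file is unreadable.
resolve_machine_file() {
    local path
    [ -n "${MACHINE_FILE:-}" ] || return 1
    path=$(printenv "$MACHINE_FILE") || return 1
    [ -r "$path" ] || return 1
    printf '%s\n' "$path"
}
```

With MACHINE_FILE=PBS_NODEFILE in common.conf, resolve_machine_file prints the value of $PBS_NODEFILE inside a running job.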
bash.qcg
This application usually needs no configuration, as it is assumed that the bash binary can always be found in the /bin directory. However, you can request extra arguments to be passed to bash by specifying BASH_ARGS in common.conf, e.g.:
BASH_ARGS=--login
matlab.qcg
In matlab.conf simply load the MATLAB module:
module load matlab/current
Or set the PATH variable:
export PATH=$PATH:/opt/exp_soft/matlab/bin/
namd.qcg
In namd.conf load the NAMD module:
module load namd/current
Additionally, if namd2 must be started using a command other than mpiexec, you can provide additional configuration variables:
MPIEXEC=charmrun
MPIEXEC_ARGS="++verbose ++local +p`cat $QCG_NODEFILE | wc -l`"
g09.qcg
In g09.conf load the Gaussian module:
module load gaussian/current
and set the following variables:
# amount of memory in bytes allocated for the job
MEM_BYTES=`qstat -f $PBS_JOBID | grep Resource_List.mem | cut -f 2 -d = | tr "b" " " `
# primary, unlimited, scratch
GAUSSIAN_SCRATCH="/mnt/lustre/$USER"
# additional fast scratches, the value after the colon denotes the max used size in MB
GAUSSIAN_LIMITED_SCRATCHES="$TMPDIR:100"
# used as shared memory key (must be an integer!)
QCG_GAUSSIAN_SESSION_KEY=${PBS_JOBID%%.*}
Moreover, you have to compile the QCG Gaussian helper library, which is LD_PRELOADed before starting Gaussian:
cd app-scripts/tools/qcg-gaussian
make
gcc -ggdb -Wall -fPIC -p -o libqcg-gaussian-dbg.so -shared qcg-gaussian.c -ldl
gcc -O3 -Wall -fPIC -p -o libqcg-gaussian.so -shared qcg-gaussian.c -ldl
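Because QCG_GAUSSIAN_SESSION_KEY must be an integer, it may help to see what the ${PBS_JOBID%%.*} expansion used in g09.conf produces. The sketch below wraps the same expansion in a hypothetical function, gaussian_session_key, that also rejects results containing non-digit characters. The job IDs used with it are made-up sample values.

```shell
# gaussian_session_key JOB_ID
# Applies the same expansion as ${PBS_JOBID%%.*}: everything from the first
# dot onwards is removed. Fails if the result is not a plain integer.
gaussian_session_key() {
    local job_id="$1"
    local key="${job_id%%.*}"
    case "$key" in
        ''|*[!0-9]*)
            echo "not an integer session key: '$key' (from job ID '$job_id')" >&2
            return 1
            ;;
    esac
    printf '%s\n' "$key"
}
```

For a sample job ID such as 123456.batch.example.org the function prints 123456. A job ID whose part before the first dot contains anything other than digits is rejected.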
abinit.qcg
In abinit.conf load the abinit module:
module load abinit/current
Alternatively, you may set extra mpiexec arguments, e.g.:
MPIEXEC_ARGS=" --mca orte_tmpdir_base $PWD "
or set additional variables:
export MV2_ENABLE_AFFINITY=0
Application Scripts - Developers view
This section is targeted at QCG developers rather than administrators. It lists the environment variables which influence the job life cycle:
Common Input Environment Variables
- QCG_MODULES_LIST - list of environment modules to be loaded (separated by spaces)
- QCG_PREPROCESS - file name of the preprocess script
- QCG_MONITOR - file name of the monitor script
- QCG_MONITOR_INTERVAL - determines how frequently the QCG_MONITOR script should be called (in seconds) - TBD
- QCG_NTF_CONSUMER_URL - the address of the Notification consumer interested in receiving notification about job output status change - TBD
- QCG_NTF_WATCH_PATTERN - the regular expression that should trigger notification - TBD
- QCG_NTF_WATCH_FILE - the file to be watched (default: stdout/err file) - TBD
- QCG_NTF_WATCH_INTERVAL - determines how often the watched file should be checked (in seconds) - TBD
- QCG_POSTPROCESS - file name of the postprocess script
- QCG_OUTERR_FILE - target filename of the joined output/error stream file (default: output.log)
- QCG_ZIPPED_INPUTS - set if the input files must be unzipped first (e.g. QCG_ZIPPED_INPUTS=inputs.zip)
- QCG_ZIP_OUTPUTS - if set, the results are zipped into the $QCG_ZIP_OUTPUTS.zip file
- QCG_ZIP_OUTPUTS_FILTER - the wildcard pattern of files to be stored in the zip file - TBD
- QCG_COMP_PROCESSES_MAP - topology of hybrid application (e.g. QCG_COMP_PROCESSES_MAP="4:1:1:1")
- QCG_FORCE_SCRATCH - force chdir to temporary directory
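As an illustration of how a script might consume two of these variables, the sketch below loads every module named in QCG_MODULES_LIST and picks the output file name with its documented default. The function names are hypothetical and this is not the code shipped in the core directory. It assumes that the module command of the Environment Modules system is available in the shell.

```shell
# load_qcg_modules
# Calls "module load" once for every space-separated entry of QCG_MODULES_LIST.
load_qcg_modules() {
    local mod
    # Unquoted on purpose: the list is separated by spaces, so the shell's
    # word splitting yields one module name per iteration.
    for mod in ${QCG_MODULES_LIST:-}; do
        module load "$mod" || {
            echo "cannot load module: $mod" >&2
            return 1
        }
    done
}

# outerr_file
# Prints the joined output/error file name, falling back to output.log.
outerr_file() {
    printf '%s\n' "${QCG_OUTERR_FILE:-output.log}"
}
```

With QCG_MODULES_LIST="namd/current gaussian/current" the first function runs module load namd/current followed by module load gaussian/current.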
Application Scripts - Users view
The following environment variables are available in any job submitted as a QCG application (e.g. bash):
- QCG_NODEFILE - a machinefile-like file listing the nodes allocated for the job
- QCG_PPN - number of processes per node
- QCG_SPN - number of allocated slots per node (usually equal to QCG_PPN; it differs only if the job was submitted with: #QCG node=a:b:c where b>c)
- QCG_PROCS - total number of processes
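A user's bash job can inspect these variables directly. The sketch below prints a one-line summary of the allocation. The function name summarize_allocation is hypothetical, and the sketch assumes that QCG_NODEFILE contains one host name per line, repeated once per slot, as is common for machine files such as PBS_NODEFILE.

```shell
# summarize_allocation
# Prints the number of distinct nodes and of lines in QCG_NODEFILE,
# together with the values of QCG_PROCS and QCG_PPN.
summarize_allocation() {
    local lines nodes
    if [ ! -r "${QCG_NODEFILE:-}" ]; then
        echo "QCG_NODEFILE is not set or not readable" >&2
        return 1
    fi
    lines=$(grep -c . "$QCG_NODEFILE")             # non-empty lines
    nodes=$(sort -u "$QCG_NODEFILE" | grep -c .)   # distinct host names
    echo "nodes=$nodes nodefile_lines=$lines procs=${QCG_PROCS:-unset} ppn=${QCG_PPN:-unset}"
}
```

Calling summarize_allocation at the top of a job script records the allocation in the job output, which can help when checking that the requested topology was granted.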