Cluster Installation: Difference between revisions

From XDSwiki
(remove out-of-date grid engine installation stuff; update to show forkxds and DLS UGE usage (thanks to Graeme W!))

Revision as of 10:06, 10 August 2017

XDS can be run in cluster mode using any command-line job scheduling software such as Grid Engine, Condor, Torque/PBS, LSF or SLURM. We implemented Grid Engine, a distributed resource management system that monitors the CPU and memory usage of the available computing resources and schedules each job on the least-used computer. Grid Engine was chosen for its high scalability, cost effectiveness, ease of maintenance and high throughput. It was developed by Sun Microsystems (Sun Grid Engine, SGE), later acquired by Oracle and subsequently by Univa. The latest versions are closed source, but older versions are open source and are supplied with many Linux distributions, including Red Hat/CentOS 6.x. Open-source forks also exist: Open Grid Scheduler and Son of Grid Engine.

XDS Cluster setup

To set up XDS in cluster mode, the forkxds script needs to be changed so that it accesses the cluster environment and sends jobs to different machines. Example scripts used with Univa Grid Engine (UGE) at Diamond (from https://github.com/DiamondLightSource/fast_dp/tree/master/etc/uge_array - thanks to Graeme Winter!) are below; they may need to be adapted to the local environment. Note that forkxds uses the qsub command to submit forkxds_job to Grid Engine.
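Both scripts have to be found through PATH: xds calls forkxds by name, and forkxds locates forkxds_job with `which`. The following is a minimal sketch for checking that the modified copies are the ones that get picked up, rather than the forkxds that ships with XDS. The helper `resolves_to` and the scratch directory with empty stand-in scripts are assumptions made for illustration only; in real use `script_dir` would be the directory holding the modified scripts.

```shell
#!/bin/bash
# resolves_to NAME DIR: succeed if NAME is found on PATH inside DIR.
# (Hypothetical helper, not part of XDS or the Diamond scripts.)
resolves_to() {
    found=$(command -v "$1") || return 1
    [ "$(dirname "$found")" = "$2" ]
}

# Demonstration with empty stand-in scripts in a scratch directory.
script_dir=$(mktemp -d)
for name in forkxds forkxds_job; do
    printf '#!/bin/bash\n' > "$script_dir/$name"
    chmod +x "$script_dir/$name"
done

# Prepend the directory so these copies shadow any other forkxds on PATH.
PATH="$script_dir:$PATH"

for name in forkxds forkxds_job; do
    if resolves_to "$name" "$script_dir"; then
        echo "$name: OK"
    else
        echo "$name: not first on PATH"
    fi
done
```

The same PATH setting must be in effect in the shell that starts xds or xds_par, since that is where the lookup of forkxds happens.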

#!/bin/bash
# File: forkxds
#                    forkxds          Version DLS-2017/08
#
# enables  multi-tasking by splitting the COLSPOT and INTEGRATE
# steps of xds into independent jobs. Each job is carried out by 
# a Fortran main program (mcolspot, mcolspot_par, mintegrate, or
# mintegrate_par). The jobs are distributed among the processor 
# nodes of the NFS cluster network.
#
# 'forkxds' is called by xds or xds_par by the Fortran instruction
# CALL SYSTEM('forkxds ntask maxcpu main rhosts'),
#    ntask  ::total number of independent jobs (tasks)
#   maxcpu  ::maximum number of processors used by each job
#    main   ::name of the main program to be executed; could be
#             mcolspot | mcolspot_par | mintegrate | mintegrate_par
#   rhosts  ::names of CPU cluster nodes in the NFS network 
#
# DLS UGE port of script to operate nicely with cluster 
# scheduling system - will work with any XDS usage but is 
# aimed for fast_dp see fast_dp#3. Options passed through environment:
#
# FORKXDS_PRIORITY - priority within queue, e.g. 1024
# FORKXDS_PROJECT - UGE project to assign for this
# FORKXDS_QUEUE - queue to submit to

ntask=$1  #total number of jobs
maxcpu=$2 #maximum number of processors used by each job
main=$3   #name of the main program to be executed

rm -f forkxds.params
itask=1
while test $itask -le $ntask
do
   echo $main >> forkxds.params
   itask=`expr $itask + 1`
done

# save environment
echo "PATH=$PATH" > forkxds.env
echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH" >> forkxds.env

# check environment for queue; project; priority information
# (each option is appended to qsub_opt so that several can be combined)
qsub_opt=""
if [[ -n "$FORKXDS_PRIORITY" ]] ; then
    qsub_opt="$qsub_opt -p $FORKXDS_PRIORITY"
fi

if [[ -n "$FORKXDS_PROJECT" ]] ; then
    qsub_opt="$qsub_opt -P $FORKXDS_PROJECT"
fi

if [[ -n "$FORKXDS_QUEUE" ]] ; then
    qsub_opt="$qsub_opt -q $FORKXDS_QUEUE"
fi

qsub $qsub_opt -sync y -V -cwd -pe smp $maxcpu -t 1-$ntask \
`which forkxds_job`
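
The FORKXDS_PRIORITY, FORKXDS_PROJECT and FORKXDS_QUEUE variables listed in the script header map to the qsub options -p, -P and -q. The following dry-run sketch prints the command line that would be submitted for a given environment instead of calling qsub, which can help when adapting the script to another site. The function name `build_qsub_cmd` and the queue and project names are hypothetical; only the qsub options themselves are taken from the script above.

```shell
#!/bin/bash
# build_qsub_cmd NTASK MAXCPU
# Prints the qsub command line for the given task count and CPU count.
# Hypothetical helper for illustration; it does not submit anything.
build_qsub_cmd() {
    qsub_opt=""
    # Append each option only if its variable is set and non-empty.
    if [ -n "$FORKXDS_PRIORITY" ]; then
        qsub_opt="$qsub_opt -p $FORKXDS_PRIORITY"
    fi
    if [ -n "$FORKXDS_PROJECT" ]; then
        qsub_opt="$qsub_opt -P $FORKXDS_PROJECT"
    fi
    if [ -n "$FORKXDS_QUEUE" ]; then
        qsub_opt="$qsub_opt -q $FORKXDS_QUEUE"
    fi
    echo "qsub$qsub_opt -sync y -V -cwd -pe smp $2 -t 1-$1 forkxds_job"
}

# Example with made-up project and queue names, run in a subshell so
# the variables do not leak into the calling environment.
(
    FORKXDS_PRIORITY=""
    FORKXDS_PROJECT="myproject"
    FORKXDS_QUEUE="medium.q"
    build_qsub_cmd 8 4
)
# prints: qsub -P myproject -q medium.q -sync y -V -cwd -pe smp 4 -t 1-8 forkxds_job
```

In the printed command, `-t 1-8` requests an array job with eight tasks, `-sync y` makes qsub wait until the job has finished, `-V` exports the current environment and `-cwd` runs the tasks in the current directory. The parallel environment name `smp` is defined by the site's Grid Engine configuration and may differ elsewhere.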

#!/bin/bash
# File: forkxds_job

params=$(awk "NR==$SGE_TASK_ID" forkxds.params)
JOB=`echo $params | awk '{print $1}'`

# load environment
. forkxds.env

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH
export PATH=$PATH
echo $SGE_TASK_ID | $JOB
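
The link between the two scripts is the file forkxds.params together with the SGE_TASK_ID variable that Grid Engine sets for each array task. The sketch below simulates a single task on a workstation without any scheduler, to show how the task number selects a line of forkxds.params and is then piped into the selected program. The function `fake_main` is a hypothetical stand-in for the real XDS program, and SGE_TASK_ID is set by hand here.

```shell
#!/bin/bash
# Work in a scratch directory so no real XDS files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# forkxds writes the program name once per task (here: 3 tasks).
rm -f forkxds.params
for itask in 1 2 3; do
    echo "mcolspot_par" >> forkxds.params
done

# Hypothetical stand-in for the real program: it only reports the
# task number it reads from standard input.
fake_main() {
    read -r task
    echo "task $task would run here"
}

# Grid Engine would set SGE_TASK_ID; set it by hand for the simulation.
SGE_TASK_ID=2

# Same lookup as in forkxds_job: pick line number SGE_TASK_ID, then
# take its first word as the program name.
params=$(awk "NR==$SGE_TASK_ID" forkxds.params)
JOB=$(echo "$params" | awk '{print $1}')
echo "task $SGE_TASK_ID selected program: $JOB"

# forkxds_job pipes the task number into $JOB; the stand-in is used here.
echo "$SGE_TASK_ID" | fake_main
```

Because every task reads its own number from SGE_TASK_ID, a single qsub call with `-t 1-$ntask` is enough to start all the jobs.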