Tuesday, July 9, 2013

Queuing and Scheduling

Useful HPC commands/links

Find out what queues are available:
qconf -sql

Find out information about a named queue:
qconf -sq serial.q

In the queue configuration, the number of slots on a host usually equals its number of cores.
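
The qconf -sq output is a list of "attribute value" lines, so it is easy to read from Python, for example to pull out the slots setting. A minimal sketch: parse_queue_config, queue_config and SAMPLE are names made up for this note, and the sample text is illustrative, not captured from a real cluster.

```python
import subprocess


def parse_queue_config(text):
    """Turn `qconf -sq` style output into a dict of attribute -> value."""
    # Long values are wrapped with a trailing backslash; join them first.
    joined = []
    pending = ""
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.endswith("\\"):
            pending += stripped[:-1].strip()
            continue
        joined.append(pending + stripped)
        pending = ""
    if pending:
        joined.append(pending)

    config = {}
    for line in joined:
        # First token is the attribute name, the rest is its value.
        parts = line.split(None, 1)
        if len(parts) == 2:
            config[parts[0]] = parts[1].strip()
    return config


def queue_config(queue_name):
    """Run `qconf -sq <queue_name>` and parse it (needs SGE on the PATH)."""
    text = subprocess.check_output(["qconf", "-sq", queue_name],
                                   universal_newlines=True)
    return parse_queue_config(text)


# Illustrative sample only (hypothetical host names).
SAMPLE = """qname                 serial.q
hostlist              @allhosts
slots                 1,[node01=8], \\
                      [node02=8]
"""

if __name__ == "__main__":
    print(parse_queue_config(SAMPLE)["slots"])
```

On a machine with SGE installed, queue_config('serial.q')['slots'] should give the slots string for that queue.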

A good introduction to SGE:
http://talby.rcs.manchester.ac.uk/~ri/_linux_and_hpc_lib/sge_intro.html

Translation between PBS and SGE
http://wiki.ibest.uidaho.edu/index.php/Tutorials:_SGE_PBS_Converting

Passing environment variables and arguments to a job:
http://stackoverflow.com/questions/9719337/how-to-properly-pass-on-an-environment-variable-to-sun-grid-engine
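
One way to do this from Python is to build the qsub command line with the -v option, which takes a comma-separated list of NAME=value pairs. A rough sketch: build_qsub_command is a made-up helper, and it assumes the values contain no commas, since commas separate the pairs.

```python
def build_qsub_command(script, env=None, args=()):
    """Build a qsub argument list that passes `env` to the job via -v."""
    cmd = ["qsub"]
    if env:
        # Sort the names so the command line is reproducible.
        pairs = ["%s=%s" % (name, env[name]) for name in sorted(env)]
        cmd += ["-v", ",".join(pairs)]
    cmd.append(script)
    # Anything after the script name is passed to the script as arguments.
    cmd += [str(a) for a in args]
    return cmd


if __name__ == "__main__":
    # job.sh, N and MODE are placeholder names.
    print(" ".join(build_qsub_command("job.sh", {"N": 10, "MODE": "fast"})))
```

The resulting list can be handed to subprocess.check_call on a submit host.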

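Using Python to poll the status of a job:

    import subprocess

    def job_status(job_id):
        """Return the state of job_id (e.g. 'qw' or 'r'), or None if qstat does not list it."""
        cmd = 'qstat -f | grep %s' % job_id
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True,
                                universal_newlines=True)
        output, _ = proc.communicate()
        if not output.strip():
            # grep matched nothing: the job has finished or never existed
            return None
        # Job lines read "job-ID prior name user state ...", so the state is field 5
        return output.split()[4]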
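
To actually wait for a job, the status check can be wrapped in a loop. A minimal sketch: wait_for_job is a made-up name, and get_status stands for any function that takes a job id and returns its state, or nothing once the job no longer shows up in qstat.

```python
import time


def wait_for_job(get_status, job_id, interval=30, timeout=None):
    """Poll get_status(job_id) until it returns nothing; return the states seen."""
    seen = []
    start = time.time()
    while True:
        status = get_status(job_id)
        if not status:
            # The job is no longer listed, so treat it as finished.
            return seen
        if not seen or seen[-1] != status:
            seen.append(status)  # record each change of state once
        if timeout is not None and time.time() - start > timeout:
            raise RuntimeError("job %s still in state %s after %s s"
                               % (job_id, status, timeout))
        time.sleep(interval)


if __name__ == "__main__":
    # Fake status source standing in for a real qstat call.
    fake = iter(["qw", "r", "r", None])
    print(wait_for_job(lambda job_id: next(fake), "12345", interval=0))
```

Keep the interval reasonably long on a shared cluster, since every poll runs qstat against the scheduler.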


Explanation of queue error state E:
http://www.gridengine.info/2008/01/20/understanding-queue-error-state-e/