What does my Job Status mean?

Summary

If you are working on Great Lakes and are wondering what your job status means, this article provides definitions and details.

As always, if you have questions or concerns while going through the learning material, please don't hesitate to stop by Office Hours (Mondays 2-3pm and Thursdays 3-4pm), check out our HPC-related workshops linked below, or submit a ticket to arc-support@umich.edu.

"Go forth and compute!" - Dr. Charles Antonelli

Environment


Please do not simply copy and paste the commands here. Any <value> placed within angle brackets (the greater-than and less-than symbols) needs to be replaced and the symbols removed. For example, if your uniqname is ‘umstudent’ and you see ‘/home/<uniqname>’ in the instructions, type ‘/home/umstudent’.

When listing your submitted jobs with squeue -u <uniqname>, the final column, titled "NODELIST(REASON)", gives the reason that a pending job has not started yet. The possible statuses are listed below.
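
For example, assuming the uniqname is ‘umstudent’ (the job ID, partition, and other values below are purely illustrative, and the exact columns shown by squeue can vary with cluster configuration):

  squeue -u umstudent
   JOBID PARTITION     NAME      USER ST  TIME NODES NODELIST(REASON)
 1234567  standard   my_job umstudent PD  0:00     1 (Priority)

Here PD in the ST column means the job is pending, and the value in parentheses in the final column is one of the statuses described below.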

Statuses

  • Resources
    • This job is waiting for the resources (CPUs, Memory, GPUs) it requested to become available. Resources become available when currently running jobs complete. The job with Resources in the NODELIST(REASON) column is the top priority job and should be started next.
       
  • Priority
    • This job is not the top priority, so it must wait in the queue until it becomes the top priority job. Once it becomes the top priority job, the NODELIST(REASON) column will change to “Resources”. The priority of all pending jobs can be shown with the sprio command (see the example after this list). A job’s priority is determined by two factors: fairshare and age. The fairshare factor is influenced by the amount of resources recently consumed by members of your Slurm account; more recent usage means a lower fairshare priority. The age factor is determined by the job’s queued time; the longer the job has been waiting in the queue, the higher the age priority.
       
  • AssocGrpCpuLimit
    • This job was submitted with a Slurm account that has a limit on the number of CPUs that may be used at one time. This limit applies collectively to all jobs from all users of the same Slurm account. Once some of the jobs running under this Slurm account complete, this reason will change to Priority or Resources unless there is some other limit or dependency. All jobs running under a given Slurm account can be viewed by running squeue --account=<account_name> (see the example after this list).
       
  • AssocGrpGRES
    • This job was submitted with a Slurm account that has a limit on the number of GPUs that may be used at one time. This limit applies collectively to all jobs from all users of the same Slurm account. Once some of the jobs running under this Slurm account complete, this reason will change to Priority or Resources unless there is some other limit or dependency. All jobs running under a given Slurm account can be viewed by running squeue --account=<account_name>.
       
  • AssocGrpMem
    • This job was submitted with a Slurm account that has a limit on the amount of memory that may be used at one time. This limit applies collectively to all jobs from all users of the same Slurm account. Once some of the jobs running under this Slurm account complete, this reason will change to Priority or Resources unless there is some other limit or dependency. All jobs running under a given Slurm account can be viewed by running squeue --account=<account_name>.
       
  • AssocGrpBillingMinutes
    • This job was submitted with a Slurm account that has a limit on the monetary charges that may be accrued. Jobs that are pending with this reason will not start until the limit has been raised or the monthly bill has been processed.
       
  • Dependency
    • This job has a dependency on another job and will not start until that dependency is met. The most common dependency is waiting for another job to complete (see the example after this list).
       
  • QOSMinGRES
    • This job was submitted to the GPU partition but did not request a GPU, so it will never start. The job should be deleted and resubmitted to a different partition or, if a GPU is needed, resubmitted to the GPU partition with a GPU request. A GPU can be requested by adding the following line to a batch script: #SBATCH --gres=gpu:1 (see the batch-script sketch after this list).
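
Checking priority (Priority status): the sprio command reports the factors that make up each pending job’s priority. A few common invocations are shown below; the exact columns displayed depend on how priority weights are configured on the cluster.

  sprio                    # priority factors for all pending jobs
  sprio -l                 # long format, including the age and fairshare components
  sprio -j <jobid>         # priority factors for a single pending job
  sprio -u <uniqname>      # priority factors for your own pending jobs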
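
Checking account-wide usage (AssocGrpCpuLimit, AssocGrpGRES, AssocGrpMem statuses): listing every job under your Slurm account shows what is currently counting against the account’s CPU, GPU, or memory limit. Replace <account_name> with your own Slurm account.

  squeue --account=<account_name>               # all running and pending jobs under the account
  squeue --account=<account_name> -t RUNNING    # only the jobs currently consuming the limit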
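
Creating a dependency (Dependency status): a dependency is normally set at submission time with the --dependency option. The job ID and script names below are hypothetical, and afterok is just one of several dependency types Slurm supports.

  sbatch first_job.sbat                                  # prints something like "Submitted batch job 1234567"
  sbatch --dependency=afterok:1234567 second_job.sbat    # starts only after job 1234567 completes successfully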
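
Requesting a GPU (QOSMinGRES status): a minimal batch-script sketch is shown below. The account name, resource amounts, and walltime are placeholders, and the partition name ‘gpu’ may differ on your cluster; adjust them for your own work.

  #!/bin/bash
  #SBATCH --job-name=gpu_example
  #SBATCH --account=<account_name>
  #SBATCH --partition=gpu
  #SBATCH --gres=gpu:1            # request one GPU
  #SBATCH --cpus-per-task=1
  #SBATCH --mem=8g
  #SBATCH --time=01:00:00

  # your GPU-enabled commands go here, for example:
  nvidia-smi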
       

Additional resources