Phase 2.c: Understanding the Slurm Preamble and Creating a Batch Script [Phase2.c/3]

Summary

Are you familiar with Linux, able to access your data, and have your software installed, but want to learn how to create a Batch Script? You've come to the right place. This is Phase 2.c/3 of the self-guided practice. This Phase has 3 sections in total, and you can find them in the Index for the HPC self-guided practice.

As always, if you have questions or concerns while going through the learning material, please don't hesitate to stop by Office Hours (Mondays 2-3pm and Thursdays 3-4pm) or submit a ticket to arc-support@umich.edu. Our HPC-related workshops will be linked below.

"Go forth and compute!" - Dr. Charles Antonelli

Environment

High Performance Computing (HPC) on the Great Lakes cluster (non-sensitive data)

Directions 

Please do not simply copy and paste the commands here. Any <value> enclosed in less-than and greater-than symbols must be replaced with your own value, and the symbols removed. For example, if your uniqname is ‘umstudent’ and you see ‘/home/<uniqname>’ in the instructions, write ‘/home/umstudent’.

Understanding the Slurm Preamble and Creating a Batch Script:

  • Basics of Slurm Batch Scripting
    • Interpreter Line: #!/bin/bash
    • Slurm Directives: Lines starting with #SBATCH that specify job parameters and scheduling options. Together they make up the Slurm Preamble.
    • Environment: Load software or activate environments and, if needed, specify where the data lives and where to write output.
      • After the Slurm Preamble, you’ll want to load any module(s) needed for your script. If applicable, source your .bashrc file and activate the environment before the job commands. 
    • Job Commands: Commands that execute your application tasks
       
  • Understanding the Slurm Preamble 
    Please note: The menus for an interactive session in Open OnDemand are the GUI equivalent of the #SBATCH lines in a batch script.
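To make the idea of a preamble concrete, here is a minimal sketch of one. The job name, partition, and resource values are illustrative assumptions rather than recommended settings, and <account> is a placeholder you must replace. Each directive has its comment on the line above it.

```shell
#!/bin/bash
# Slurm Preamble: all #SBATCH lines go before the first command in the script.
# Job name shown in the queue (hypothetical name)
#SBATCH --job-name=example_job
# Account to charge; replace <account> with your own
#SBATCH --account=<account>
# Partition to run in ("standard" is an assumed value)
#SBATCH --partition=standard
# One node, one task, one CPU core
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
# Memory per node
#SBATCH --mem=1G
# Walltime limit in HH:MM:SS
#SBATCH --time=00:10:00
# Log file: %x expands to the job name, %j to the job ID
#SBATCH --output=%x-%j.log

# First command: everything from here on is the job itself.
# SLURM_JOB_ID is set by Slurm inside a job; the fallback covers a plain test run.
job_id="${SLURM_JOB_ID:-not-under-slurm}"
echo "Job ${job_id} started"
```

Because bash treats every #SBATCH line as an ordinary comment, the same file can also be run directly with bash for a quick check before it is submitted.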
  • Tips for efficient Slurm Job submissions
    • Write clear job scripts. Comment your scripts clearly for easy understanding and maintenance.
    • Test software and scripts. Before submitting a large job, test your script on smaller datasets or examples, either locally or on the login node.
      • Interactive Sessions can be used for debugging. 
    • Request only what you need. If you don’t know how long your script will run or which resources it needs, request a longer walltime than your test run took, with similar resources. Then adjust those values to optimize both your script’s and the scheduler’s efficiency.
      • Helpful commands:
        • my_job_estimate -h
          • The -h option prints help text on how to use this tool. The tool estimates how much your job would cost if it ran with the given resources.
          • If your job finishes before the specified walltime, you are charged only for the resources you requested, for the time the job actually ran.
        • my_job_statistics <jobID>
          • This tool is for jobs that are finished and will output a report with Memory and CPU efficiency. 
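One way to turn a test run into a walltime request is to pad the measured runtime and format it as HH:MM:SS. The sketch below assumes a hypothetical test that took 754 seconds and an arbitrary safety factor of 2. Both numbers are illustrative, not a site recommendation.

```shell
#!/bin/bash
# Hypothetical measurement: a small test run took 754 seconds.
test_seconds=754
# Assumed safety factor; lower it once you know how your job behaves.
safety_factor=2

padded=$(( test_seconds * safety_factor ))

# Convert the padded runtime to the HH:MM:SS form used by --time.
hours=$(( padded / 3600 ))
minutes=$(( (padded % 3600) / 60 ))
seconds=$(( padded % 60 ))
walltime=$(printf '%02d:%02d:%02d' "$hours" "$minutes" "$seconds")

echo "#SBATCH --time=${walltime}"
```

After the real job finishes, the efficiency report from my_job_statistics can guide how far the request can be trimmed.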
             
  • Practice: Creating a Batch script
    We use the file extension .sbat for batch scripts so they are easy to distinguish from other scripts. You may use any file extension that works best for you. When submitting a ticket, please include the full path and name of the script.
    • Create a batch script with a text editor of your choice. 
    • Add the Shell Interpreter as the first line.
    • Create your Slurm Preamble with the directives to set the parameters for your job.
    • Load your software, or source your .bashrc file and activate the environment.
    • If your data is not in your home directory, specify where the data lives.
    • Add the command that runs your software, along with its input arguments, options, and data.
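Put together, the steps above might look like the following sketch. The job name, resource values, DATA_DIR, OUTPUT_FILE, and the file-counting command are hypothetical stand-ins for your own settings and software. The environment lines are left commented out because their placeholders must be filled in first.

```shell
#!/bin/bash
#SBATCH --job-name=practice_job
#SBATCH --account=<account>
#SBATCH --partition=standard
#SBATCH --cpus-per-task=1
#SBATCH --mem=1G
#SBATCH --time=00:05:00
#SBATCH --output=%x-%j.log

# Environment: uncomment and edit the lines that apply to your software.
# module load <software>/<version>
# source ~/.bashrc
# conda activate <environment>

# Data location: defaults to the current directory, which for a batch job
# is the directory the job was submitted from. Override it if your data
# lives elsewhere.
DATA_DIR="${DATA_DIR:-.}"
OUTPUT_FILE="${OUTPUT_FILE:-practice_output.txt}"

# Job command: a stand-in for your real application. It counts the regular
# files in DATA_DIR and writes the result to OUTPUT_FILE.
file_count=$(find "$DATA_DIR" -maxdepth 1 -type f | wc -l)
echo "Found ${file_count} files in ${DATA_DIR}" > "$OUTPUT_FILE"
cat "$OUTPUT_FILE"
```

Saving this as, for example, practice.sbat gives you a small script to practice submitting before you swap in your own software and data.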
       
  • Run your script with: sbatch <nameOFscript>.sbat
    • Helpful commands:
      • squeue
        • Shows pending and running jobs in the queue; add -u <uniqname> to list only your own
      • scancel <jobID>
        • Cancels the job with the specified jobID
      • scontrol show job <jobID>
        • Shows details for the specified job, which can be helpful for troubleshooting and when submitting tickets.
      • sbatch --test-only <nameOFscript>.sbat
        • Validates the batch script parameters without submitting the job
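The commands above can be combined into a short submit-and-check sequence. This is a sketch, assuming a script named practice.sbat (a hypothetical file name) and a shell where the Slurm commands are available. It uses the --parsable option of sbatch so that the job ID can be captured in a variable.

```shell
#!/bin/bash
# Hypothetical script name; pass your own as the first argument.
script="${1:-practice.sbat}"

if ! command -v sbatch >/dev/null 2>&1; then
    status="slurm-unavailable"
    echo "sbatch not found; run this on a cluster login node."
elif submit_output=$(sbatch --parsable "$script"); then
    # --parsable prints the job ID, optionally followed by ";cluster".
    job_id="${submit_output%%;*}"
    status="submitted"
    echo "Submitted job ${job_id}"

    # Check the job's place in the queue and its full details.
    squeue -j "$job_id"
    scontrol show job "$job_id"

    # Uncomment to cancel the job:
    # scancel "$job_id"
else
    status="submit-failed"
    echo "sbatch could not submit ${script}"
fi
```

Keeping the job ID in a variable makes it easy to paste into a ticket along with the full path of the script.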

More information on how to troubleshoot a Batch Script is available in Phase 3.

External resources

Links will be added here as this document evolves.