Bayes computation servers (Department of Statistics)

Bayes Computers

This is a set of servers on which you can run computations not suitable for Great Lakes. It is available to faculty, postdocs, and PhD students in the Statistics Department.

  • bayes-node1.stat.lsa.umich.edu (8-core)
  • bayes-node2.stat.lsa.umich.edu (8-core)
  • bayes-node3.stat.lsa.umich.edu (8-core)
  • bayes-node4.stat.lsa.umich.edu (24-core)
  • bayes-node5.stat.lsa.umich.edu (24-core)

Check Load

Connecting

  1. First read the rest of these instructions.
  2. Check the load to find the least busy node to use.
  3. Review the "SSH" page if you are not familiar with using SSH.
  4. Use SSH to connect to the chosen machine, replacing n with the node number (1 to 5): ssh bayes-noden.stat.lsa.umich.edu
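As a small convenience for step 4, the hostname pattern from the list of servers can be wrapped in a shell function. This is only a sketch: the function name bayes_host is made up for illustration and is not installed on the nodes.

```shell
# bayes_host N: print the full hostname of Bayes node N (1 to 5).
# Hypothetical helper; the hostname pattern comes from the server list.
bayes_host() {
    case "$1" in
        [1-5]) printf 'bayes-node%s.stat.lsa.umich.edu\n' "$1" ;;
        *) echo "node number must be 1 to 5" >&2; return 1 ;;
    esac
}

# Usage (not run here): connect to node 4 after checking the load.
#   ssh "$(bayes_host 4)"
```

Rejecting numbers outside 1 to 5 avoids a confusing SSH "could not resolve hostname" error after a typo.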

Home Directory

For longer-standing users in the Statistics Department, your AFS home directory is your home directory on all of the Bayes machines. You can safely save important data there, but it is limited to 10 GB.

For newer users in the Statistics Department, your home directory is located on the Bayes nodes' mounted volume, /data/uniqname. This data volume is shared across the Bayes nodes, so you will have the same home directory whichever node you log in to.

Scratch Space

Copy your code and data into the scratch space to run the computation. The space is located at /data/uniqname, where uniqname is your username. Because this is scratch space, use it only temporarily: data lost from here cannot be recovered. Copy your results back into your AFS space for safekeeping. The same scratch space is available on all the nodes.

Load Balancing

Make sure you don't overload the computation nodes, as that slows them down tremendously and prevents you and others from completing work. Check the Bayes Cluster Load Graphs for the current load.

  • Make sure the load per node doesn't exceed approximately 80%.
  • This is about 2 white squares worth of space at the top of the graphs.
  • No more than 7.0 out of 8.0, or 22.0 out of 24.0, on the axis values.
  • This translates to 7 cores out of 8 cores on machines with 8 cores and 22 cores out of 24 cores on machines with 24 cores.
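If you are already logged in to a node, the same limits can be checked from the command line instead of the graphs. The sketch below assumes a Linux node where /proc/loadavg is readable. The function name load_ok is made up for illustration, and the limits 7 and 22 are the ones listed above.

```shell
# load_ok LOAD LIMIT: succeed (exit 0) when LOAD is at or below LIMIT.
# awk does the decimal comparison, which plain [ ] cannot do.
load_ok() {
    awk -v load="$1" -v limit="$2" 'BEGIN { exit !(load + 0 <= limit + 0) }'
}

# Hypothetical usage on a node: compare the 1-minute load average with
# the limit for that node (7 on 8-core nodes, 22 on 24-core nodes).
#   read -r load1 _ < /proc/loadavg
#   load_ok "$load1" 7 && echo "room for a job" || echo "node is busy"
```

The load average counts running and waiting processes, so it is only a rough guide to how many cores are in use.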

Running Jobs

Starting a job

  1. SSH to a node that doesn't have much load on it.
  2. Copy your data and code into your scratch space, /data/uniqname.
  3. Run the screen command: screen
  4. Start your job in the background (it will be inside the screen command): my_program &
  5. Detach the screen (this step is important): Ctrl-a followed by Ctrl-d.
  6. Close your session: exit

Detaching in Step 5 allows your program to run when you are away without any issues.
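As an alternative to steps 3 to 5, screen can start a session that is already detached, so there is no risk of forgetting the detach step. This is a sketch of a different route from the interactive one described above, not a replacement for it. The function name start_job and the session name fit1 are hypothetical.

```shell
# start_job NAME COMMAND...: launch COMMAND in a new, already-detached
# screen session called NAME, so it keeps running after you log out.
start_job() {
    name="$1"
    shift
    # -d -m starts the session detached; -S gives it a name.
    screen -dmS "$name" "$@"
}

# Usage (hypothetical session name and program):
#   start_job fit1 ./my_program
#   screen -ls          # list your sessions
#   screen -r fit1      # reattach to check on the job
```

A named session is easier to pick out in the output of screen -ls than a numeric session ID when several jobs are running.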

Checking on your job

  1. SSH to the same node as before.
  2. If you had more than one screen session, enter the command screen -ls to identify them. 
  3. Type the command to resume your screen: screen -r SessionID, or simply screen -r if you had only one session.
  4. If the job is done, you can close screen by entering exit. If it is not done, detach with Ctrl-a followed by Ctrl-d.
  5. Close your session: exit

Make sure to copy your relevant files (your results) to your AFS space.
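Copying results out of scratch can be wrapped in a small function that refuses to run when the source directory is missing. This is a minimal sketch: the function name save_results and the results directory in the usage line are assumptions, and the destination assumes your home directory is your AFS space.

```shell
# save_results SRC DEST: copy the directory SRC into DEST, preserving
# timestamps and permissions, and fail loudly if SRC is missing.
save_results() {
    src="$1"
    dest="$2"
    if [ ! -d "$src" ]; then
        echo "no such directory: $src" >&2
        return 1
    fi
    mkdir -p "$dest" && cp -Rp "$src" "$dest"/
}

# Usage (hypothetical paths; replace uniqname with your username):
#   save_results /data/uniqname/results "$HOME"
```

Keep the 10 GB AFS limit in mind and copy only the files you need to keep.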

Anaconda Python shared installation

Python versions 2.7 (base) and 3.7 (python3) are available via a shared installation of Anaconda Python. You could install your own private copy of Anaconda in your AFS home directory, but this is unnecessary: AFS is limited to 10 GB, so you may run out of space, and the shared installation should be more efficient.

To set up your .bashrc to find the Anaconda Python base install for the first time, type the following at your terminal prompt:

  • /usr/local/anaconda/bin/conda init bash

This will add the following text block to the bottom of your .bashrc (which will be shared across the Bayes nodes because your home directory is shared across them as well):

# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/usr/local/anaconda/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/usr/local/anaconda/etc/profile.d/conda.sh" ]; then
        . "/usr/local/anaconda/etc/profile.d/conda.sh"
    else
        export PATH="/usr/local/anaconda/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<

Once you exit your SSH session and start a new one, or run the command "source ~/.bashrc", you should see the name of your session's active environment in parentheses at the start of your terminal prompt. This should appear unless you have customized your shell prompt in an unusual way.

Python 2.7

This environment is the default (base) installed environment on the Bayes nodes.  

Python 3.7

To switch into the Python 3.7 (python3) environment, type:

  • conda activate python3

If you want to deactivate Python 3.7 (python3) and return to Python 2.7 (base), do this:

  • conda deactivate
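If you have customized your prompt and the environment name is not shown, you can still find out which environment is active. The sketch below relies on the CONDA_DEFAULT_ENV variable, which conda sets when an environment is activated. The function name active_env is made up for illustration.

```shell
# active_env: print the name of the active conda environment,
# or "none" when no environment is active in this shell.
active_env() {
    printf '%s\n' "${CONDA_DEFAULT_ENV:-none}"
}

# Usage:
#   active_env      # prints the environment name, e.g. base or python3
```

Running python --version after switching is another quick way to confirm which interpreter you will get.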

Changing your default version of Python

If you'd rather have the Python 3.7 (python3) environment as your default when you log in, edit your .bashrc and, at the end of the file, after the closing comment,

# <<< conda initialize <<<

you can add:

  • conda activate python3
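The same edit can be made from the command line without opening an editor. This is a sketch only: the function name set_default_env is hypothetical, and it appends the line only if it is not already there, so running it twice does no harm.

```shell
# set_default_env FILE ENV: append "conda activate ENV" to FILE unless
# that exact line is already present.
set_default_env() {
    file="$1"
    line="conda activate $2"
    # -x matches whole lines, -F treats the text literally, -q is silent.
    grep -qxF "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage (run after conda init has added its block to your .bashrc):
#   set_default_env "$HOME/.bashrc" python3
```

Because the line is appended to the end of the file, it lands after the conda initialize block, as required.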

Switching back to Python 2.7 (base) if you've changed your default environment in your .bashrc file

If you ever need to switch to the Python 2.7 (base) environment, you can just type: 

  • conda activate base
     

Details

Article ID: 1500
Created: Tue 5/26/20 6:03 PM
Modified: Thu 8/13/20 12:20 PM