The bash shell
A shell is a user interface for accessing system resources and executing commands. One of the most commonly used shells is the Bourne Again SHell, or bash. By default, when you log into the Research Computing systems you are operating in a bash shell.
Some commonly used bash commands:
man [command] -- Show the manual pages for a given [command]
touch -- Create a basic text file
mkdir -- Create a folder / directory
ls -- List files in a directory
cd -- Change directory
pwd -- Show current directory
cat -- Concatenate files and print on the standard output
Commands are typically issued with the following format:
command [options] [arguments]
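As a hedged illustration of that format, the short session below strings together the commands listed above. The directory name demo and the file name notes.txt are made up for the example, and everything happens in a temporary directory created with mktemp so that no existing files are affected.

```shell
# Work in a throwaway directory (the names below are illustrative only).
workdir=$(mktemp -d)
cd "$workdir"

mkdir demo                        # command: mkdir, argument: demo
cd demo                           # move into the new directory
pwd                               # print the full path of the current directory
touch notes.txt                   # create an empty file
echo "first line" > notes.txt     # add some text so cat has something to show
ls -l                             # command: ls, option: -l (long listing)
cat notes.txt                     # command: cat, argument: notes.txt
```

When you are done, the temporary directory can be deleted, or simply left for the system to clean up.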
System information:
Sometimes we want to learn more about what is happening on the system, e.g., how heavily it is being used.
We can use a utility called htop to get more information about the running state of the system. You can see a summary of the total CPU load (%Cpu(s)) and the memory usage (KiB Mem) at the top of the screen. To change the memory units, press E. We will go into more detail on using htop later in this tutorial.
Useful htop field descriptions:
PID -- Process ID
USER -- User ID
PR -- The scheduling priority level; can take values from 1 to 99, with 1 being the highest priority
RES -- Resident memory size; the amount of physical memory that your program is currently using
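For a quick, non-interactive look at the same kind of summary, one option is to read the values directly. The sketch below is not how htop works internally; it assumes a Linux system that exposes /proc/loadavg and /proc/meminfo and provides the coreutils nproc command, and the variable names are arbitrary.

```shell
# Assumes Linux with /proc available (typical for HPC nodes).
cores=$(nproc)                                    # CPU cores available to this shell
load1=$(cut -d ' ' -f 1 /proc/loadavg)            # 1-minute load average
mem_total_kib=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
mem_avail_kib=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)

echo "cores:               $cores"
echo "1-minute load:       $load1"
echo "memory total (KiB):  $mem_total_kib"
echo "memory avail. (KiB): $mem_avail_kib"
```

As a rough rule of thumb, a 1-minute load average near or above the core count suggests that the machine is busy.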
If we're installing packages or compiling code, it might be useful to know which operating system and distribution we're using. To print the OS distribution and version number, we can run cat /etc/system-release.
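The file /etc/system-release is generally found on Red Hat-family distributions and may be missing elsewhere. The following is a minimal sketch of a lookup that falls back to /etc/os-release and then to uname; the function name show_os_release is hypothetical and not part of any system.

```shell
# Print the distribution name and version, trying the most specific source first.
show_os_release() {
    if [ -r /etc/system-release ]; then
        cat /etc/system-release                  # Red Hat-style systems
    elif [ -r /etc/os-release ]; then
        # os-release holds KEY=value lines; PRETTY_NAME is a one-line summary.
        # If that key is absent, fall back to the kernel name and release.
        grep '^PRETTY_NAME=' /etc/os-release || uname -sr
    else
        uname -sr                                # last resort: kernel name and release
    fi
}

show_os_release
```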
On Andes and Polaris, if I have forgotten the system specifications, I can always print the message of the day again by running cat /etc/motd.
On Discovery, getting status is a little different since the cluster is composed of a number of nodes, i.e., individual machines.
Remember that the HPC resources at Dartmouth are shared resources across the entire research community. Being conscious of this aspect is especially important when on Andes, Polaris, or on the login node on Discovery.
Andes and Polaris are meant to be used in an interactive way, i.e., the way you typically use a computer. Being a good citizen on these systems entails carefully monitoring your processes to make sure that you're not using a disproportionate amount of CPU resources or memory.
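One way to keep an eye on your own footprint without opening an interactive monitor is to ask ps for just your processes. This is a minimal sketch, assuming a ps that accepts the pcpu and pmem output fields (as the procps version on most Linux systems does); the function name my_top_processes is made up for this example.

```shell
# Show the five processes owned by the current user that use the most CPU.
my_top_processes() {
    # -u selects processes by user ID. Each "-o field=" requests one column
    # with an empty header, so no header line is printed and sort only
    # sees data rows. Columns: PID, %CPU, %MEM, command name.
    ps -u "$(id -u)" -o pid= -o pcpu= -o pmem= -o comm= |
        sort -k2,2nr |          # sort numerically on the %CPU column, largest first
        head -n 5
}

my_top_processes
```

If one of your processes stays near the top with a high CPU or memory share, that is a sign to check whether it is behaving as intended.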
Discovery, on the other hand, is primarily used for batch scheduled jobs, i.e., submitting jobs to a scheduler. When you log into Discovery, you are on the login node, also referred to as the head node. This node's main purpose is to provide a place for users to submit and monitor jobs. It can be used for small tasks, such as compiling code or monitoring jobs, but not for testing or running user programs. Since it is not meant for computationally intensive tasks, it can quickly bog down when users run real computations on it. This typically results in adverse system behavior that affects other users, e.g., users being unable to submit jobs to the cluster.