There is one method for submitting jobs to Bell. You may use SLURM to submit jobs to a partition on Bell. You may use either the batch or interactive mode to run your jobs. Use batch mode for finished programs; use interactive mode only for debugging.

In this section, you'll find a few pages describing the basics of creating and submitting SLURM jobs, as well as a number of example SLURM jobs that you may be able to adapt to your own needs.

The Simple Linux Utility for Resource Management (SLURM) is a system providing job scheduling and job management on compute clusters. With SLURM, a user requests resources and submits a job to a queue. The system will then take jobs from queues, allocate the necessary nodes, and execute them.

Do NOT run large, long, multi-threaded, parallel, or CPU-intensive jobs on a front-end login host. All users share the front-end hosts, and running anything but the smallest test job will negatively impact everyone's ability to use Bell. Always use SLURM to submit your work as a job.

Submitting a Job

Follow the links below for information on the steps of submitting a job, and other basic information about jobs. A number of example SLURM jobs are also available.

To stop a job before it finishes or remove it from a queue, use the scancel command:

$ scancel myjobid

Job Dependencies

Dependencies are an automated way of holding and releasing jobs. Jobs with a dependency are held until the condition is satisfied. Only once the condition is satisfied do jobs become eligible to run, and they must still queue as normal.

Job dependencies may be configured to ensure jobs start in a specified order. Jobs can be configured to run after other job state changes, such as when the job starts or the job ends.

These examples illustrate setting dependencies in several ways. Typically dependencies are set by capturing and using the job ID from the last job submitted.

To run a job after job myjobid has started:

$ sbatch --dependency=after:myjobid myjobsubmissionfile

To run a job after job myjobid ends without error:

$ sbatch --dependency=afterok:myjobid myjobsubmissionfile

To run a job after job myjobid ends with errors:

$ sbatch --dependency=afternotok:myjobid myjobsubmissionfile

To run a job after job myjobid ends with or without errors:

$ sbatch --dependency=afterany:myjobid myjobsubmissionfile

To set more complex dependencies on multiple jobs and conditions:

$ sbatch --dependency=after:myjobid1:myjobid2:myjobid3,afterok:myjobid4 myjobsubmissionfile

Holding a Job

Sometimes you may want to submit a job but not have it run just yet. For example, you may want to allow labmates to cut in front of you in the queue: hold your job until their jobs have started, and then release it.

To place a hold on a job before it starts running, use the scontrol hold command:

$ scontrol hold myjobid

Once a job has started running, it cannot be placed on hold.

To release a hold on a job, use the scontrol release command:

$ scontrol release myjobid

You can find the job ID using the squeue command, as explained in the SLURM Job Status section.

Checking Job Output

Once a job has been submitted and has started, it will write its standard output and standard error to files that you can read.

SLURM catches output written to standard output and standard error, that is, what would be printed to your screen if you ran your program interactively. Unless you specified otherwise, SLURM will put the output in the directory where you submitted the job, in a file named slurm- followed by the job ID, with the extension .out (for example, slurm-myjobid.out). Note that both stdout and stderr will be written into the same file, unless you specify otherwise.

If your program writes its own output files, those files will be created as defined by the program. This may be in the directory where the program was run, or may be defined in a configuration or input file. You will need to check the documentation for your program for more details.

Redirecting Job Output

It is possible to redirect job output to somewhere other than the default location with the --error and --output directives:

#!/bin/sh -l
#SBATCH --output=/home/myusername/joboutput/myjob.out
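The Job Dependencies text above notes that dependencies are typically set by capturing the job ID of the last job submitted. The following is a minimal sketch of one way to do that in a POSIX shell, not a prescribed method. It assumes sbatch prints its usual "Submitted batch job <id>" line on success; the helper names job_id_from_sbatch_output and dependency_option, and the file names step1.sub and step2.sub, are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical helpers for chaining SLURM jobs; names are illustrative only.

# Extract the job ID from sbatch's default success message.
# Assumption: the message has the form "Submitted batch job 12345",
# so the job ID is the last whitespace-separated field.
job_id_from_sbatch_output() {
    printf '%s\n' "$1" | awk '{ print $NF }'
}

# Build a --dependency option from a dependency type and one or more job IDs.
# $1 is the type (after, afterok, afternotok, afterany); the rest are job IDs.
dependency_option() {
    dep_type=$1
    shift
    opt="--dependency=$dep_type"
    for id in "$@"; do
        opt="$opt:$id"
    done
    printf '%s\n' "$opt"
}

# Example use on a cluster (commented out so the sketch has no side effects):
# first=$(job_id_from_sbatch_output "$(sbatch step1.sub)")
# sbatch "$(dependency_option afterok "$first")" step2.sub
```

With the two commented lines enabled, step2.sub would become eligible to run only after the first job ends without error. sbatch also offers a --parsable option that prints the job ID in a form intended for scripts, which may be preferable to parsing the default message.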