The postmortem Utility

The postmortem utility archives information that is useful for understanding why jobs do not run as expected. When you contact technical support about a job, we highly recommend that you run this utility and upload the resulting archive when you fill in the support request form on the web site (https://support.schrodinger.com/s/contactsupport).

The syntax of the postmortem utility is as follows:

$SCHRODINGER/jsc postmortem [options] <jobid1> [<jobid2> ...]

The postmortem is written either as a single ZIP archive file (the default) or as a hierarchical directory structure (if the -n option is used). The contents of the postmortem are exactly the same in both cases. The files created by the command are overwritten each time the command is run.

In the case of a Job Server backend failure, the job directories are not cleaned up and can be recovered by running a postmortem command. Job directory zip files are saved with the filename format <jobid>-postmortem/<jobid>-jobdir.zip and can be used to recover partially finished jobs.
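
As a rough sketch of the recovery step, the helper below builds the documented path for a job ID and extracts the job directory archive if it exists. It assumes the postmortem was created with -n (or has already been unpacked) in the current working directory and that the unzip tool is installed. The function names jobdir_zip_path and recover_jobdir and the default destination directory are hypothetical and are not part of the Schrodinger tools.

```shell
# Print the documented location of the saved job directory archive.
jobdir_zip_path() {
    printf '%s-postmortem/%s-jobdir.zip\n' "$1" "$1"
}

# Extract the saved job directory into <dest> (default: recovered-<jobid>).
# Assumption: the postmortem directory is in the current working directory.
recover_jobdir() {
    jobid=$1
    dest=${2:-recovered-$jobid}
    archive=$(jobdir_zip_path "$jobid")
    if [ ! -f "$archive" ]; then
        echo "no job directory archive at $archive" >&2
        return 1
    fi
    mkdir -p "$dest" && unzip -q "$archive" -d "$dest"
}
```

For example, recover_jobdir <jobid> would unpack <jobid>-postmortem/<jobid>-jobdir.zip into recovered-<jobid>.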

Supported options are:

Option
    Description

-n, --no-zip
    Create the postmortem archive as a directory rather than as a zip archive.

--without-server-logs
    Skip collecting server log entries for the job.

--with-subjobs
    Include all the subjobs (by default, only failed subjobs are included).

--without-subjobs
    Do not include any subjobs (by default, only failed subjobs are included).

--without-job-files
    Skip collecting log files produced by the job.

--without-redaction
    Do not redact sensitive information from the postmortem. (By default, it is redacted from all files and filenames.) With this option, all files produced by the job are included, unless that is overridden with --without-job-files.

--replace value
    A string to replace when redacting sensitive information.
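
To illustrate how the options combine, here is a minimal sketch of a wrapper that either prints or runs a postmortem command. The name run_postmortem and the DRY_RUN variable are hypothetical. The check that rejects --with-subjobs together with --without-subjobs is also an assumption of this sketch, since the behavior of jsc for that combination is not described here.

```shell
# Hypothetical wrapper around "jsc postmortem"; not part of the Schrodinger tools.
# With DRY_RUN=1 it prints the command instead of running it.
run_postmortem() {
    with=0
    without=0
    for arg in "$@"; do
        case $arg in
            --with-subjobs) with=1 ;;
            --without-subjobs) without=1 ;;
        esac
    done
    # The two subjob options ask for opposite things, so this sketch
    # refuses the combination rather than guessing which one wins.
    if [ "$with" = 1 ] && [ "$without" = 1 ]; then
        echo "choose either --with-subjobs or --without-subjobs" >&2
        return 2
    fi
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$SCHRODINGER/jsc postmortem $*"
    else
        "$SCHRODINGER/jsc" postmortem "$@"
    fi
}
```

With DRY_RUN=1 set, run_postmortem -n --with-subjobs <jobid> only prints the command line, which makes it easy to check the options before creating the postmortem.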

Replacing sensitive information

You should ensure that no sensitive information is added to the archive.

The job name, which may contain sensitive information, is by default “redacted,” i.e., replaced with a generic string, in all names and contents of files in the postmortem archive. The output of the postmortem command mentions this:

Replacing sensitive strings:
			myjobname => JOBNAME_______________

You can explicitly turn off this behavior by using the --without-redaction option:

$SCHRODINGER/jsc postmortem --without-redaction <jobid>

Using the --without-redaction option causes both the output files and the log files of a job to be included in the postmortem archive. If the --without-redaction option is used in conjunction with the --without-job-files option, neither the output files nor the log files are included.

You can also redact any other sensitive string with the --replace option, which replaces each occurrence with an automatically generated string:

$SCHRODINGER/jsc postmortem --replace <string1> --replace <string2> ... <jobid>

The --without-redaction and --replace options also affect file and directory names. If one or more names contain sensitive strings, they are redacted just like file contents.
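
When several strings need to be redacted, the repeated option can be generated rather than typed by hand. The following sketch only prints the command line; it does not run it. The helper name redact_cmd is hypothetical, the option is spelled --replace as in the options list, and the single-quote wrapping assumes that the strings themselves contain no single quotes.

```shell
# Print a postmortem command that redacts each extra string given after the job ID.
# Usage: redact_cmd <jobid> <string1> [<string2> ...]
redact_cmd() {
    jobid=$1
    shift
    cmd="$SCHRODINGER/jsc postmortem"
    for s in "$@"; do
        # Quote each string so that embedded spaces survive when the
        # printed line is pasted into a shell.
        cmd="$cmd --replace '$s'"
    done
    printf '%s %s\n' "$cmd" "$jobid"
}
```

For example, redact_cmd <jobid> <string1> <string2> prints a command with one --replace option per string, followed by the job ID.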

Contents of a postmortem

Every postmortem contains a file called README.txt, which briefly describes the files included in the archive. Job Server diagnostics are always included in the postmortem, while job files can be selectively included using command options. The details of each file type and the options used to include or exclude them are listed below.

By default, the postmortem contains:

  • Job Server diagnostics
  • Log files
  • Failed subjobs, if any
  • Job server log entries

Job Server diagnostics

  • General information—These files contain information not pertaining to a specific job:

    • execution-host-<jobid>.info — technical information about the host on which the job was executed, retrieved at the time of the backend failure
    • server.info — information about the job server used to run the job
    • client.info — information about the client used to talk to the job server
  • Supervisor logs—These are the log files generated by the job supervisor (the process which runs on the execution host and is responsible for running the backend and transferring the output files to the job server host):

    • supv-<jobid>.log — the log messages that the supervisor writes
    • supv-stdout-<jobid>.log — output generated while running the supervisor process
    • supv-stderr-<jobid>.log — error messages generated while running the supervisor process
  • Launch log—This is the log file generated during job launch.

    Listed as: launch-<jobid>.log
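
As a quick check of an unpacked postmortem, the sketch below lists the diagnostic file names given above for a job ID and reports which of them are absent from a directory. It assumes that these files sit at the top level of the unpacked postmortem directory; the function names diagnostic_files and missing_diagnostics are hypothetical.

```shell
# Print the Job Server diagnostic file names documented for a job ID.
diagnostic_files() {
    jobid=$1
    printf '%s\n' \
        "execution-host-$jobid.info" \
        "server.info" \
        "client.info" \
        "supv-$jobid.log" \
        "supv-stdout-$jobid.log" \
        "supv-stderr-$jobid.log" \
        "launch-$jobid.log"
}

# Print the documented diagnostic files that are absent from <dir>.
# Usage: missing_diagnostics <dir> <jobid>
missing_diagnostics() {
    dir=$1
    jobid=$2
    diagnostic_files "$jobid" | while IFS= read -r name; do
        [ -e "$dir/$name" ] || printf '%s\n' "$name"
    done
}
```

An empty result from missing_diagnostics means that every documented diagnostic file was found in the directory.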

Job files

  • The archive of the job directory—Only included if explicitly requested, via the -SAVE option or the SCHRODINGER_SAVE_JOBDIR environment variable. The parent job directory archive does not contain subjobs, and subjob job directory archives are included in the postmortem only if the -SAVESUBJOBS option is used. Using the -SAVEFAILEDSUBJOBS option includes only the job directories of failed subjobs.

    Listed as: <jobid>-jobdir.zip

  • Output files—By default, only those files with the .log extension are included. If postmortem was called with --without-redaction, then all the output files are included. However, if postmortem was called with --without-job-files, then no output files at all are included.

    Listed as: output_files/<output1>, output_files/<output2>, etc.

  • Input files—They will be included only if postmortem was called with --without-redaction, and only when the job had not been downloaded before the postmortem command. In practice, this means they will almost never be included.

    Listed as: input_files/<input1>, input_files/<input2>, etc.

  • Subjob information—If there are subjobs, by default, only subjobs that failed will be included. If postmortem was called with --with-subjobs, all subjobs will be included. If it was called with --without-subjobs, none will be included.

    Listed as: subjobs/<status>/<subjobid>

    Each <subjobid> directory contains the same job-specific files as the top directory for the main job.

  • Job server log—This file contains log messages pertaining to the job (and its subjobs, if there were any and if postmortem was called with --with-subjobs) written by the job server. It will not be included if postmortem was called with --without-server-logs.

    Listed as: jobserver.log
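
The inclusion rules for output files and subjobs can be summarized as a small decision sketch. The functions below only inspect a list of postmortem options and print the resulting policy; they do not call jsc. The function names and the policy labels (log-only, all, none, failed) are hypothetical. The option that suppresses job files is spelled --without-job-files, as in the options list, and letting the last subjob option win when both are given is an assumption.

```shell
# Print which output files the given options select: log-only, all, or none.
output_file_policy() {
    policy=log-only
    for arg in "$@"; do
        if [ "$arg" = --without-redaction ]; then policy=all; fi
    done
    # --without-job-files overrides --without-redaction, so check it last.
    for arg in "$@"; do
        if [ "$arg" = --without-job-files ]; then policy=none; fi
    done
    echo "$policy"
}

# Print which subjobs the given options select: failed, all, or none.
# Assumption: if both subjob options are given, the last one wins.
subjob_policy() {
    policy=failed
    for arg in "$@"; do
        case $arg in
            --with-subjobs) policy=all ;;
            --without-subjobs) policy=none ;;
        esac
    done
    echo "$policy"
}
```

For example, output_file_policy --without-redaction --without-job-files prints none, which matches the rule that suppressing job files takes precedence.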

Troubleshooting Tips

  • If the backend has failed, you should look at the archive of the job directory, if there is one. That archive may contain log files produced by the backend.

  • If you suspect that there was a general system issue (e.g. a machine ran out of disk space), you should look at the general information files.

  • If you suspect that the job control system failed to run the backend or to transfer the output files, you should look at the supervisor log files, the server log file, or both.

  • If you suspect that files are missing from the archive, you can review the command used to create the postmortem (found at the top of postmortem.log) and determine which files are supposed to be included, according to the rules given under Contents of a postmortem.
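
To review the command that created a postmortem, it is enough to print the first lines of postmortem.log. The sketch below assumes an unpacked postmortem directory that contains postmortem.log at its top level. The function name show_postmortem_command is hypothetical, and the default of five lines is a guess, since the exact layout of the log is not described here.

```shell
# Print the first lines of postmortem.log from an unpacked postmortem directory.
# Usage: show_postmortem_command <dir> [<number-of-lines>]
show_postmortem_command() {
    dir=$1
    lines=${2:-5}
    if [ ! -f "$dir/postmortem.log" ]; then
        echo "postmortem.log not found in $dir" >&2
        return 1
    fi
    head -n "$lines" "$dir/postmortem.log"
}
```

The options shown in that command can then be compared with the inclusion rules to decide whether a file is really missing or was simply not requested.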