Transitioning from Job Control to Job Server
Job Server is the new default starting in 2025-4
As of the 2025-4 release, Schrödinger software uses Job Server by default for submitting and managing computational jobs. This new infrastructure replaces the legacy Job Control system to provide a more robust, secure, and scalable solution for modern computing environments. Customers who currently use Job Control to submit jobs to remote hosts will need to deploy and configure Job Server.
Advantages of Job Server
The legacy Job Control system was based on a decentralized model that had limitations in today's increasingly complex computing landscapes. Job Server is a modern, centralized system designed to overcome these challenges.
- Disconnect After Job Submission: Submit your jobs from your laptop or workstation and then disconnect from the network or even shut down your machine. The centralized Job Server manages the job independently, and your results will be ready for download when you reconnect.
- Enhanced Security: Job Server introduces a more secure and simplified network configuration. All communication is centralized through just two configurable TCP ports, reducing the need for extensive firewall exceptions. The system uses a modern, certificate-based authentication model where users register with the server once. This eliminates the need for users to set up passwordless SSH between client machines and compute nodes, a common requirement of the legacy system. All data transfers are TLS encrypted, ensuring your data is secure.
- Improved Scalability & Reliability: The new architecture is built to handle a large number of jobs efficiently. It uses a robust relational database to store job metadata, significantly reducing the likelihood of failures due to database corruption that could occur with the old file-based system.
- Centralized Configuration: Instead of each user managing their own schrodinger.hosts file, administrators now configure host entries centrally in a hosts.yml file on the Job Server. This ensures consistency and simplifies management for all users accessing a shared cluster.
- Modern System Support: Job Server is designed to integrate with modern operating system features like cgroups, providing better resource management on compute nodes.
Host Error Message
With Job Server as the new default, users who have not yet configured their remote connections for the new system will encounter the following error in 2025-4 when attempting to submit a job to a remote host:
Output: No server could be found for -HOST <host> because you don't have authentication certificates for any job servers.
To authenticate against a server, run:
$SCHRODINGER/jsc cert get <job-server-address>
Contact your local admin for the server address.
If you see this message, your environment needs to be configured for Job Server (see the "What this means for your specific setup" section below).
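Once your administrator has given you the server address, the registration step from the error message can be wrapped in a small guard so that it fails with a clear message when the environment is incomplete. The following is a minimal sketch, not an official script: the function name register_with_job_server is hypothetical, and the only Schrödinger command it runs is the `$SCHRODINGER/jsc cert get` command quoted in the error message above.

```shell
#!/bin/sh
# Sketch only: register_with_job_server is a hypothetical helper name.
# It runs the command quoted in the error message:
#   $SCHRODINGER/jsc cert get <job-server-address>

register_with_job_server() {
    address="$1"

    # The jsc executable is located through the SCHRODINGER variable.
    if [ -z "${SCHRODINGER:-}" ]; then
        echo "SCHRODINGER is not set; point it at your installation first." >&2
        return 1
    fi

    # The address must come from your local administrator.
    if [ -z "$address" ]; then
        echo "usage: register_with_job_server <job-server-address>" >&2
        return 2
    fi

    # Request authentication certificates from the given Job Server.
    "$SCHRODINGER/jsc" cert get "$address"
}

# Example call (replace the placeholder with the address from your admin):
# register_with_job_server "<job-server-address>"
```

The helper returns 1 when SCHRODINGER is unset and 2 when no address is supplied; otherwise it passes through the exit status of the jsc command itself.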
What this means for your specific setup
The impact of this change depends on how you run your Schrödinger jobs. There are three common job submission scenarios:
Scenario 1: Local Job Submission on a Laptop / Workstation
For users who only run jobs on their local machine (localhost), the transition is seamless and the job submission workflow does not change. The only visible difference is that Maestro's Job Monitor panel looks slightly different under Job Server.
Impact: No change
Action Required: None
Scenario 2: Remote Job Submission to a single-node Workstation
This includes users who submit jobs from their client machine to a shared, remote single-node workstation that does not use a formal queueing system.
Impact: With the 2025-4 release, your current remote job submission workflow will stop working.
Action Required: To continue running jobs remotely, you will need to install and configure Job Server on the shared workstation. A key part of this is installing a queueing system on that machine. To simplify this process, we provide an automated setup_jobserver script that can install and configure both SLURM and Job Server for you. Follow the instructions in Installing Job Server to get Job Server set up.
Scenario 3: Remote Job Submission to an HPC Cluster
This includes users who submit jobs from their client machine to an HPC environment that already has a queueing system like SLURM, SGE, LSF, or PBS Pro installed.
Impact: With the 2025-4 release, your current remote job submission workflow will stop working.
Action Required: The transition is generally straightforward because the necessary queueing system is already in place. Your HPC/IT administrator will need to install the Job Server service on a designated server host (e.g., a login node). The main task will be to centralize your compute host configurations into the new hosts.yml file. Follow the instructions in Installing Job Server to get Job Server set up.
Reverting to Legacy Job Control (Temporary Solution)
We strongly encourage all users to transition to the new Job Server infrastructure to take advantage of its benefits. However, we understand that some sites may need more time to adapt. As a temporary measure, you can revert to the legacy Job Control system by disabling the JOB_SERVER feature flag on your client machine:
$SCHRODINGER/utilities/feature_flags -d JOB_SERVER
Please be aware that this is not a recommended long-term solution. The legacy Job Control system will no longer be actively developed and will not receive the same level of quality assurance as Job Server. Our support efforts will be focused on the new system.
We are confident that the move to Job Server will provide a significantly improved job submission and management experience for all our users.
Reach out to help@schrodinger.com if you need assistance with transitioning to Job Server.