You can use HTCondor to run the nextnano software on your local computer infrastructure (“on-premise”). Essentially, the nextnanomat software submits the job either locally or to the “HTCondor” cluster.
This feature is only supported with our new license system.
Download the HTCondor installer from HTCondor.
Click "Download" and go to "Current Stable Release" of "UW Madison" (as of September 16, 2019, HTCondor 8.8.5).
Choose "Native Packages". We have tested the following versions:
Click "Next" and then accept the License Agreement.
Select "Create a new HTCondor Pool" and fill in the name of the pool, e.g. nextnanoHTCondorPool. This is a unique name for your pool of machines.
Select "Join an existing HTCondor Pool" and fill in the hostname of the central manager, e.g. computername, where nextnanoHTCondorPool has been created.
Select "Submit jobs to HTCondor Pool" and choose "Always run jobs and never suspend them". (Alternative: If you do not want other people to run jobs on your machine at all, select "Do not run jobs on this machine". If you do not want other people to run jobs on your machine while you are working, select "When keyboard has been idle for 15 minutes". You can of course modify these settings later.)
www).) All PCs in your network should use the same domain name. This does not necessarily have to be your Windows domain.
$(CONDOR_HOST), $(IP_ADDRESS), *.yourdomainname.com, 192.168.178.* (Replace *.cs.wisc.edu with your domain name and add your local IP subnet, e.g. 192.168.178.*.) On Windows, you can obtain your IP subnet by running the ipconfig command in cmd.exe.
C:\condor\). The directory \Program Files\ seems to be problematic, so we do not recommend using it.
Click "Install". (You need Administrator rights.)
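The write access list described above combines your domain name with your local IP subnet. As an illustration only, the following minimal Python sketch derives a pattern such as 192.168.178.* from an IPv4 address reported by ipconfig and assembles the comma-separated list. It assumes a subnet in which the first three octets identify the network (a /24 subnet), which may not match your setup. The function names are hypothetical and are not part of HTCondor or nextnanomat.

```python
import ipaddress


def subnet_wildcard(ipv4_address: str) -> str:
    """Return a pattern like '192.168.178.*' for the given IPv4 address.

    Assumption: the first three octets identify the subnet (/24).
    """
    # Validate the input; raises ValueError for a malformed address.
    address = ipaddress.IPv4Address(ipv4_address)
    first_three_octets = str(address).split(".")[:3]
    return ".".join(first_three_octets) + ".*"


def write_access_list(domain: str, ipv4_address: str) -> str:
    """Assemble the comma-separated write access list used in the installer."""
    entries = [
        "$(CONDOR_HOST)",
        "$(IP_ADDRESS)",
        f"*.{domain}",
        subnet_wildcard(ipv4_address),
    ]
    return ", ".join(entries)
```

For example, write_access_list("yourdomainname.com", "192.168.178.23") yields the list shown in the write access step above.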
A few more setups
condor_store_cred add -debug for more output information on the error.
Cloud computing. If everything is set up correctly, the “HTCondor” section is highlighted in green and the available computers show up in “Cluster”. If this is not the case, HTCondor might not be installed on the computer where you are running nextnanomat. Please also check that the HTCondor installation path is set correctly within nextnanomat, e.g. the default path C:\condor might not be the one where you installed HTCondor.
Hostname (for HTCondor pool): computername.yourcompanyname.com
Policy: "Always run jobs"
Accounting domain: yourcompanyname.com
Read access: *
Write access: $(CONDOR_HOST), $(IP_ADDRESS), *.yourcompanyname.com, 192.168.178.*
Administrator: $(IP_ADDRESS)
You can find your HTCondor config settings in the file
Let's look at an example below.
RELEASE_DIR = C:\condor
LOCAL_CONFIG_FILE = $(LOCAL_DIR)\condor_config.local
REQUIRE_LOCAL_CONFIG_FILE = FALSE
LOCAL_CONFIG_DIR = $(LOCAL_DIR)\config
use SECURITY : HOST_BASED
#CONDOR_HOST: $(FULL_HOSTNAME) # on computer called homer
CONDOR_HOST: homer # on computer called lisa
COLLECTOR_NAME = TheSimpsonsCondorPool # only on computer called homer
#UID_DOMAIN = # empty if you do not have a domain
UID_DOMAIN = simpson.com
SOFT_UID_DOMAIN=TRUE # entry is missing if you do not have a domain
FILESYSTEM_DOMAIN = simpson.com # entry is missing if you do not have a domain
CONDOR_ADMIN =
SMTP_SERVER =
ALLOW_READ = *
ALLOW_WRITE = $(CONDOR_HOST), $(IP_ADDRESS), *.simpson.com, 192.168.188.*, 2001:db8:2042::*
ALLOW_ADMINISTRATOR = $(IP_ADDRESS)
use POLICY : ALWAYS_RUN_JOBS
#use POLICY : DESKTOP
WANT_VACATE = FALSE
WANT_SUSPEND = TRUE
#DAEMON_LIST = MASTER SCHEDD COLLECTOR NEGOTIATOR STARTD # on computer called homer
#DAEMON_LIST = MASTER SCHEDD STARTD KBDD # on computer called lisa if keyboard idle 15 minutes option was chosen
DAEMON_LIST = MASTER SCHEDD STARTD # on computer called lisa
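When comparing the settings of several machines, it can help to read such a file into a dictionary. The following is a minimal Python sketch, not an official HTCondor parser: it only handles the simple forms that appear in the example above (NAME = value, NAME: value, use CATEGORY : TEMPLATE, and lines starting with #). Treating text after a " #" as an annotation follows the way the example above is written and is an assumption of this sketch. The function name is hypothetical.

```python
import re


def parse_condor_config(text: str) -> dict:
    """Parse simple condor_config-style text into a dictionary (sketch only)."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and lines that are commented out entirely.
        if not line or line.startswith("#"):
            continue
        # Assumption: drop a trailing " # note", as used in the example above.
        line = re.split(r"\s+#", line, maxsplit=1)[0].strip()
        # Collect "use CATEGORY : TEMPLATE" lines under the key "use".
        if line.lower().startswith("use "):
            category, _, template = line[4:].partition(":")
            settings.setdefault("use", []).append(
                (category.strip(), template.strip())
            )
            continue
        # Accept both "NAME = value" and "NAME: value".
        match = re.match(r"^([A-Za-z_][A-Za-z0-9_.]*)\s*[=:]\s*(.*)$", line)
        if match:
            settings[match.group(1).upper()] = match.group(2).strip()
    return settings
```

With the example above, parse_condor_config(text)["UID_DOMAIN"] would give simpson.com, which can then be compared against the value read from another machine's file.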
Show information on HTCondor cluster
condor_status command are shown, i.e. the number of compute slots is displayed.
condor_q to show the status of your submitted jobs, i.e. select condor_q, and then press the Refresh button.
Results of HTCondor simulations
<nextnano simulation output folder>\<name of input file>\.
<input file name>.log.
condor_submit <filename>.sub: Submits a job to the pool.
condor_q: Shows the current state of your own jobs in the queue.
condor_q -nobatch -global -allusers: Shows the state of all jobs of all users in the cluster.
condor_q -goodput -global -allusers: Shows the state and occupied CPU of all jobs in the cluster.
condor_q -allusers -global -analyze: Shows detailed information for every job in the cluster.
condor_q -global -allusers -hold: Shows why jobs are in the hold state.
condor_status: Shows the state of all available resources.
condor_rm: Removes jobs from a queue:
condor_rm -all: Removes all jobs from a queue.
condor_rm <cluster>.<id>: Removes jobs on cluster <cluster> with id <id> (It seems <cluster>. can be omitted, and
condor_release -all: If any jobs are in the hold state, use this command to restart them.
condor_restart: Restarts all HTCondor daemons/services after changes in the config file.
condor_version: Returns the version number of HTCondor.
condor_store_cred query: Returns info about the credentials stored for HTCondor jobs.
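If you want to call the commands listed above from a script instead of typing them into a command prompt, a thin wrapper around Python's subprocess module is one option. The following is a minimal sketch, assuming the HTCondor tools are on the PATH of the machine running the script. The function name run_condor is hypothetical, and the sketch returns None when the tool cannot be found instead of raising an error.

```python
import shutil
import subprocess


def run_condor(command: str, *args: str, timeout: float = 30.0):
    """Run an HTCondor command-line tool and return its standard output.

    Returns None if the tool is not found on the PATH.
    """
    executable = shutil.which(command)
    if executable is None:
        return None
    completed = subprocess.run(
        [executable, *args],
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode the output to str
        timeout=timeout,      # avoid waiting forever on an unreachable pool
        check=False,          # leave error handling to the caller
    )
    return completed.stdout
```

For example, run_condor("condor_q", "-global", "-allusers", "-hold") would return the text that the corresponding command prints, or None on a computer without HTCondor.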
Q: I submitted a job to HTCondor, but nothing happens. nextnanomat says “transmitted”.
A: It could be that nextnanomat has not read in all required settings. You can try typing condor_restart into the command line. Please make sure that you have entered your credentials using condor_store_cred add -debug. You should then start nextnanomat again.
Q: I submitted a job to HTCondor, but the Batch line of nextnanomat is stuck with preparing. What is wrong?
A1: Did you store your credentials after the installation of HTCondor? If not, enter condor_store_cred add into the command prompt to add your password, see above (Recommended Installation Process).
A2: Did you change your password recently? If yes, you have to reenter your credentials for HTCondor. Enter condor_store_cred add into the command prompt to add your password, see above (Recommended Installation Process). If this does not work, try entering condor_store_cred add -debug for more output information on the error.
Q: I specified target machines in Tools - Options. Afterwards every job submitted to HTCondor is stuck with transmitting. What is wrong?
A: The value for UID_DOMAIN within the condor_config file needs to be the same for every computer of your cluster. (You can easily test it in a command prompt with condor_status -af uiddomain.) If it is not the same value, no matching computer will be found and the job will not be transmitted successfully.
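The check described in this answer can also be automated. The following minimal Python sketch takes the text printed by condor_status -af uiddomain, assuming it contains one value per line, and reports whether all values agree. The function names are hypothetical.

```python
from collections import Counter


def uid_domain_counts(condor_status_output: str) -> Counter:
    """Count how often each UID_DOMAIN value occurs in the command output."""
    values = [
        line.strip()
        for line in condor_status_output.splitlines()
        if line.strip()  # ignore blank lines
    ]
    return Counter(values)


def uid_domains_consistent(condor_status_output: str) -> bool:
    """Return True if at most one distinct UID_DOMAIN value was reported."""
    return len(uid_domain_counts(condor_status_output)) <= 1
```

If uid_domains_consistent returns False, the counts from uid_domain_counts show which values occur, which helps to find the computer whose condor_config differs.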
If you receive the following error when you type in condor_status,
C:\Users\"<your user name>">condor_status
Error: communication error
CEDAR:6001:Failed to connect to <123.456.789.123>
you can check whether the computer associated with this IP address is your HTCondor computer using the following command.
If it is not the expected computer, you can open a Command Prompt as Administrator and type in ipconfig /flushdns to flush the DNS Resolver Cache.
C:\Users\"<your user name>">ipconfig /flushdns
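To see from a script which host name an IP address resolves to, a reverse DNS lookup is one option. The following is a minimal Python sketch using only the standard library. It is an illustration under that assumption, not a feature of HTCondor or nextnanomat, and the function name is hypothetical.

```python
import socket


def reverse_lookup(ip_address: str):
    """Return the host name that the IP address resolves to, or None."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip_address)
    except OSError:
        # Covers socket.herror and socket.gaierror (no entry, lookup failure).
        return None
    return hostname
```

If the returned name is not the expected HTCondor computer, flushing the DNS Resolver Cache as shown above and repeating the lookup shows whether a stale cache entry was the cause.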
(We are working on it.)