## Option A: HTCondor

You can use HTCondor to run the nextnano software on your local computer infrastructure (“on-premise”). Essentially, the nextnanomat software submits the job either locally or on the “HTCondor” cluster. In both cases, the results of the calculations are located on your local computer.

This feature is only supported with our new license system.

### Screenshot

The following shows a screenshot. 6 computers are connected to the HTCondor pool called e25nn. 120 slots are configured, 44 are currently available. Computers 2, 3, 4 and 6 are selected to accept jobs. Computers 2 and 6 are currently not available as they are in use.

1. In the webpage, click on Download and go to Current Stable Release of UW Madison (as of September 16 2019, HTCondor 8.8.5).
2. We recommend the file for Windows in Native Packages. We have tested the following versions:
• Version 8.9.2 condor-8.9.2-471265-Windows-x64.msi
• Version 8.8.5 condor-8.8.5-480168-Windows_x64.msi
• Version 8.6.13 condor-8.6.13-453497-Windows-x64.msi
• Version 8.6.12 condor-8.6.12-446077-Windows-x64.msi
• Version 8.6.11 condor-8.6.11-440910-Windows-x64.msi

Install HTCondor.

1. Start installer
2. Click Next and then accept License Agreement
3. Then there are two options. One special computer, the Central Manager, manages all HTCondor jobs; all other computers are normal pool members. If there is no Central Manager yet, you have to create a new pool.
   1. If you are on the Central Manager, choose Create a new HTCondor Pool and fill in the name of the pool, e.g. nextnanoHTCondorPool. This is a unique name for your pool of machines.
   2. If you are not on the Central Manager, choose Join an existing HTCondor Pool and fill in the hostname of the Central Manager, e.g. computername where nextnanoHTCondorPool has been created.
4. Tick Submit jobs to HTCondorPool and choose Always run jobs and never suspend them. (Alternative: If you do not want other people to run jobs on your machine at all, select Do not run jobs on this machine. If you do not want other people to run jobs on your machine while you are working, select When keyboard has been idle for 15 minutes. You can of course modify these settings later.)
5. Fill in your domain name (example: your Windows domain, e.g. yourcompanyname.com, without www). All PCs in your network should get the same domain name; it does not necessarily have to be your Windows domain.
6. Hostname of SMTP Server and email address of administrator (not needed currently, leave it blank)
7. Path to Java Virtual Machine (not needed currently, leave it blank)
8. Host with Read access: *
9. Host with Write access: $(CONDOR_HOST),$(IP_ADDRESS), *.yourdomainname.com, 192.168.178.*, (Replace *.cs.wisc.edu with your domain name and add your local IP subnet e.g. 192.168.178.*). On Windows you can obtain your IP subnet using cmd.exe with the ipconfig command.
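
The Host with Write access entry in the last step can be assembled from your domain name and one IP address of your network. The following is a minimal sketch, not part of nextnanomat or HTCondor. The helper names `subnet_wildcard` and `allow_write_value` are hypothetical, and the code assumes an IPv4 network in which the first three octets identify the subnet, as in the 192.168.178.* example.

```python
import ipaddress


def subnet_wildcard(ip: str) -> str:
    """Replace the last octet of an IPv4 address with '*'.

    Assumption: the first three octets identify the subnet, which matches
    the 192.168.178.* example but may not hold for every network.
    """
    addr = ipaddress.IPv4Address(ip)  # raises ValueError for malformed input
    octets = str(addr).split(".")
    return ".".join(octets[:3]) + ".*"


def allow_write_value(domain: str, ip: str) -> str:
    """Build the comma-separated list for the 'Host with Write access' field."""
    parts = ["$(CONDOR_HOST)", "$(IP_ADDRESS)", f"*.{domain}", subnet_wildcard(ip)]
    return ", ".join(parts)


if __name__ == "__main__":
    # Use the IPv4 address reported by ipconfig on your own machine.
    print(allow_write_value("yourdomainname.com", "192.168.178.23"))
```

The IP address passed in is the one that ipconfig reports for your machine; the value 192.168.178.23 is only a placeholder.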

#### Config file

You can find your HTCondor config settings in the file C:\condor\condor_config. Let's look at an example below.

• Your company is called Simpson.
• Your Windows domain is called simpson.com.
• Your HTCondor pool shall have the name TheSimpsonsCondorPool.
• The HTCondor host that manages the HTCondor jobs has the computer name homer.simpson.com.
• Your computer is called lisa.simpson.com.
• The computers in your network have the IP range 192.168.188.*. (or 2001:db8:2042::* in IPv6)
RELEASE_DIR = C:\condor
LOCAL_CONFIG_FILE = $(LOCAL_DIR)\condor_config.local
REQUIRE_LOCAL_CONFIG_FILE = FALSE
LOCAL_CONFIG_DIR = $(LOCAL_DIR)\config
use SECURITY : HOST_BASED
#CONDOR_HOST: $(FULL_HOSTNAME)                           # on computer called homer
CONDOR_HOST: homer                                       # on computer called lisa
COLLECTOR_NAME = TheSimpsonsCondorPool                   # only on computer called homer
#UID_DOMAIN =                                            # empty if you do not have a domain
UID_DOMAIN = simpson.com
SOFT_UID_DOMAIN=TRUE                                     # entry is missing if you do not have a domain
FILESYSTEM_DOMAIN = simpson.com                          # entry is missing if you do not have a domain
CONDOR_ADMIN =
SMTP_SERVER =
ALLOW_READ = *
ALLOW_WRITE = $(CONDOR_HOST), $(IP_ADDRESS), *.simpson.com, 192.168.188.*, 2001:db8:2042::*
ALLOW_ADMINISTRATOR = $(IP_ADDRESS)
use POLICY : ALWAYS_RUN_JOBS
#use POLICY : DESKTOP
WANT_VACATE = FALSE
WANT_SUSPEND = TRUE
#DAEMON_LIST = MASTER SCHEDD COLLECTOR NEGOTIATOR STARTD # on computer called homer
#DAEMON_LIST = MASTER SCHEDD STARTD KBDD                 # on computer called lisa if keyboard idle 15 minutes option was chosen
DAEMON_LIST = MASTER SCHEDD STARTD                       # on computer called lisa
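
To compare the settings of several computers, it can help to read the assignments of the example configuration above into a dictionary. The following is a minimal sketch with a hypothetical helper `parse_condor_config`. It is not HTCondor's own parser: it assumes that a trailing `# ...` is an annotation as in the example, it skips `use ...` lines, and it does not expand macros such as `$(LOCAL_DIR)`.

```python
import re

# NAME, then '=' or ':' as separator (both appear in the example), then the value.
_ASSIGNMENT = re.compile(r"^([A-Za-z_][A-Za-z0-9_.]*)\s*[=:]\s*(.*)$")


def parse_condor_config(text: str) -> dict:
    """Collect NAME = value assignments from condor_config-style text."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, commented-out lines and 'use CATEGORY : TEMPLATE' lines.
        if not line or line.startswith("#") or line.lower().startswith("use "):
            continue
        match = _ASSIGNMENT.match(line)
        if match is None:
            continue
        name, value = match.groups()
        # Assumption: anything after '#' is an annotation, as in the example.
        settings[name] = value.split("#", 1)[0].strip()
    return settings


if __name__ == "__main__":
    sample = "CONDOR_HOST: homer   # on computer called lisa\nUID_DOMAIN = simpson.com\n"
    print(parse_condor_config(sample))
```

A later assignment of the same name overwrites an earlier one in this sketch, so the last value in the file is the one returned.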

### Submitting jobs to HTCondor pool with nextnanomat

Submit job

1. Add a job to the Batch list in the Run tab.
2. Click on the Run in HTCondor Cluster button (button with triangle and network).

Show information on HTCondor cluster

1. Click on Show Additional Info for Cluster Simulation.
2. Press the Refresh button on the right.
3. The results of the condor_status command are shown, i.e. the number of compute slots is displayed.
4. You can select another HTCondor command such as condor_q to show the status of your submitted jobs, i.e. select condor_q, and then press the Refresh button.
• You can type in any command in the line System command:, e.g. dir.
• The button Open Documentation opens the online documentation (this website).

Results of HTCondor simulations

• Once your HTCondor jobs are finished, the results are automatically copied back to your simulation output folder <nextnano simulation output folder>\<name of input file>\.
• For debugging purposes regarding the HTCondor job, you can analyze the generated log file, <input file name>.log.

### Useful HTCondor commands for the Command Prompt

• condor_submit <filename>.sub Submit a job to the pool.
• condor_q Shows current state of own jobs in the queue.
• condor_q -nobatch -global -allusers Shows state of all jobs of all users in the cluster.
• condor_q -goodput -global -allusers Shows state and occupied CPU of all jobs in the cluster.
• condor_q -allusers -global -analyze Detailed information for every job in the cluster.
• condor_q -global -allusers -hold Shows why jobs are in hold state.
• condor_status Shows state of all available resources.
• condor_status -long Shows state of all available resources together with much additional information.
• condor_rm Remove jobs from a queue:
• condor_rm -all Removes all jobs from a queue.
• condor_rm <cluster>.<id> Removes jobs on cluster <cluster> with id <id> (It seems <cluster>. can be omitted, and id is the JOB_IDS number.)
• condor_release -all If any jobs are in state hold, use this command to restart them.
• condor_restart Restart all HTCondor daemons/services after changes in config file.
• condor_version Returns the version number of HTCondor
• condor_store_cred query Returns info about the credentials stored for HTCondor jobs
• condor_history Lists the recently submitted jobs. If for a specific job ID the status has the value ST=C, then this job has been completed (C) successfully.
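
If you call these commands from a script instead of the Command Prompt, a thin wrapper is sufficient. The following is a minimal sketch that only uses commands and flags from the list above. The names `COMMANDS`, `remove_job_command` and `run` are hypothetical, and the sketch assumes that the HTCondor executables are on the PATH.

```python
import subprocess

# Argument lists for some of the commands listed above.
COMMANDS = {
    "my_jobs": ["condor_q"],
    "all_jobs": ["condor_q", "-nobatch", "-global", "-allusers"],
    "held_jobs": ["condor_q", "-global", "-allusers", "-hold"],
    "slots": ["condor_status"],
    "version": ["condor_version"],
}


def remove_job_command(cluster: int, job_id: int) -> list:
    """Build 'condor_rm <cluster>.<id>' for a single job."""
    return ["condor_rm", f"{cluster}.{job_id}"]


def run(args: list) -> str:
    """Run a command and return its standard output as text.

    Assumption: the HTCondor executables are on the PATH.
    """
    result = subprocess.run(args, capture_output=True, text=True, check=False)
    return result.stdout
```

For example, `run(COMMANDS["held_jobs"])` returns the same text that `condor_q -global -allusers -hold` prints in the Command Prompt.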

### Configuration options for the Central Manager computer

With this option in the condor_config file on the Central Manager, you can set a policy that spreads jobs out over several machines rather than filling all slots of one computer before filling the slots of the other computers.

##------nn: SPREAD JOBS BREADTH-FIRST OVER SERVERS
##-- Jobs are "spread out" as much as possible,
##   so that each machine is running the fewest number of jobs.
NEGOTIATOR_PRE_JOB_RANK = isUndefined(RemoteOwner) * (- SlotId)
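
The effect of this expression can be illustrated with a toy model. The following is a minimal sketch, not the actual negotiator: it only considers unclaimed slots, ignores job requirements and preemption, and breaks ties by list order, which is an assumption. The names `Slot`, `pre_job_rank` and `assign` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Slot:
    machine: str
    slot_id: int
    remote_owner: Optional[str] = None  # None models an undefined RemoteOwner


def pre_job_rank(slot: Slot) -> int:
    """Mirror isUndefined(RemoteOwner) * (- SlotId); a higher value is preferred."""
    return (1 if slot.remote_owner is None else 0) * (-slot.slot_id)


def assign(jobs: list, slots: list) -> dict:
    """Place each job on the unclaimed slot with the highest rank."""
    placement = {}
    for job in jobs:
        free = [s for s in slots if s.remote_owner is None]
        if not free:
            break  # no unclaimed slot left
        best = max(free, key=pre_job_rank)  # first slot in list order wins a tie
        best.remote_owner = job
        placement[job] = (best.machine, best.slot_id)
    return placement


if __name__ == "__main__":
    pool = [Slot("homer", 1), Slot("homer", 2), Slot("lisa", 1), Slot("lisa", 2)]
    print(assign(["job1", "job2", "job3"], pool))
```

Because slot 1 of every machine ranks higher than slot 2 of any machine, the first two jobs in this model land on different computers before a second slot is used.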

### FAQ

Q: I submitted a job to HTCondor, but nothing happens. nextnanomat says “transmitted”.

A: It could be that nextnanomat has not read in all required settings. Try typing condor_restart into the command line. Please make sure that you entered your credentials using condor_store_cred add -debug. Then start nextnanomat again.

Q: I submitted a job to HTCondor, but the Batch line of nextnanomat is stuck with preparing. What is wrong?

A1: Did you store your credentials after the installation of HTCondor? If not, enter condor_store_cred add into the command prompt to add your password, see above (Recommended Installation Process).

A2: Did you change your password recently? If yes, you have to re-enter your credentials for HTCondor. Enter condor_store_cred add into the command prompt to add your password, see above (Recommended Installation Process). If this does not work, try condor_store_cred add -debug for more output information on the error.

Q: I specified target machines in Tools - Options. Afterwards every submitted job to HTCondor is stuck with transmitting. What is wrong?

A: The value for UID_DOMAIN within the condor_config file needs to be the same for every computer of your cluster. (You can easily test it in a command prompt with condor_status -af uiddomain) If it's not the same value, no matching computer will be found and the job won't be transmitted successfully.
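
The check described in this answer can be automated on the text that condor_status -af uiddomain prints, which contains one value per line. The following is a minimal sketch with hypothetical helper names `uid_domain_counts` and `is_consistent`. It only inspects text that you pass in and does not call HTCondor itself.

```python
from collections import Counter


def uid_domain_counts(output: str) -> Counter:
    """Count how often each UID_DOMAIN value occurs in the command output."""
    values = [line.strip() for line in output.splitlines() if line.strip()]
    return Counter(values)


def is_consistent(output: str) -> bool:
    """Return True if all slots report the same UID_DOMAIN value."""
    return len(uid_domain_counts(output)) <= 1


if __name__ == "__main__":
    # Paste the output of 'condor_status -af uiddomain' here.
    sample = "simpson.com\nsimpson.com\nlisa\n"
    print(is_consistent(sample), dict(uid_domain_counts(sample)))
```

If `is_consistent` returns False, the counts show which value deviates, and the condor_config file of the corresponding computers should be corrected.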

### Problems with HTCondor

#### Error: communication error

If you receive the following error when you type in condor_status

C:\Users\"<your user name>">condor_status
Error: communication error
CEDAR:6001:Failed to connect to <123.456.789.123>

you can check whether the computer associated with this IP address is your HTCondor computer using the following command.

nslookup 123.456.789.123

If it is not the expected computer, you can open a Command Prompt as Administrator and type in ipconfig /flushdns to flush the DNS Resolver Cache.

C:\Users\"<your user name>">ipconfig /flushdns

#### Error? Check the Log files

If you encounter any strange errors, you can find some hints in the history or Log files generated by HTCondor. You can find them here:

C:\condor\spool

• history

C:\condor\log

• CollectorLog
• MasterLog
• MatchLog
• NegotiatorLog
• ProcLog
• SchedLog
• SharedPortLog
• StarterLog
• StartLog

More details can be found here: Logging in HTCondor
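
Searching all log files by hand is tedious, so a small script can collect the lines that contain a keyword. The following is a minimal sketch with a hypothetical helper `find_lines`. The default keyword ERROR is an assumption, as HTCondor log messages are not guaranteed to contain this word, so adapt it to the message you are looking for.

```python
from pathlib import Path

# Log file names as listed above.
LOG_NAMES = (
    "CollectorLog", "MasterLog", "MatchLog", "NegotiatorLog", "ProcLog",
    "SchedLog", "SharedPortLog", "StarterLog", "StartLog",
)


def find_lines(log_dir: str, keyword: str = "ERROR", names=LOG_NAMES) -> list:
    """Return (file name, line number, line) for lines containing the keyword."""
    hits = []
    for name in names:
        path = Path(log_dir) / name
        if not path.is_file():
            continue  # not every computer writes every log file
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if keyword.lower() in line.lower():  # case-insensitive match
                    hits.append((name, number, line.rstrip()))
    return hits


if __name__ == "__main__":
    for name, number, line in find_lines(r"C:\condor\log"):
        print(f"{name}:{number}: {line}")
```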

### Known bugs

• HTCondor < 8.9.5 works with all nextnano executables
• HTCondor >= 8.9.5 works with nextnano executables newer than 2020-Jan

### Run your custom executable on HTCondor with nextnanomat

You can even run your own executable with nextnanomat locally or on HTCondor! We tested the following programs: Hello World, Quantum ESPRESSO and ABINIT.

#### Input file identifier

An input file identifier is a special string in the input file that signals to nextnanomat whether the input file is an input file for the nextnano++, nextnano³, nextnano.QCL or nextnano.MSB software, or for a custom executable.
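
The idea can be sketched as a simple substring search. The following is a minimal illustration, not the nextnanomat implementation: how nextnanomat actually matches identifiers is not documented here, so plain substring matching is an assumption. The names `IDENTIFIERS` and `detect_program` are hypothetical, and the identifiers are the example values used in the settings below.

```python
from typing import Optional

# Example identifiers from the settings sections of this page.
IDENTIFIERS = {
    "HelloWorld": "Hello World",
    "&control": "Quantum ESPRESSO",
    "acell": "ABINIT",
}


def detect_program(text: str, identifiers: dict = IDENTIFIERS) -> Optional[str]:
    """Return the program whose identifier occurs in the input file text."""
    for identifier, program in identifiers.items():
        if identifier in text:  # assumption: plain substring match
            return program
    return None  # no known identifier found


if __name__ == "__main__":
    print(detect_program("&control\n  calculation = 'scf'\n"))
```

An identifier should be a string that is unlikely to appear in input files of other programs, otherwise the wrong executable could be selected.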

#### Settings for Hello World (HW)

In nextnanomat, we need the following settings:

• Path to executable file: e.g. D:\HW\HelloWorld.exe
• Input file identifier: e.g. HelloWorld
• Working directory: Select 'Simulation output folder'
• HTCondor: Output folder and files (transfer_output_files = …): .

Open input file input_file_for_HelloWorld.in (or any other input file that contains the string HelloWorld) and run the simulation either locally or on HTCondor.

#### Settings for Quantum ESPRESSO (QE)

Our folder structure is

• D:\QE\inputfile\My_QE_inputfile.in (QE input file)
• D:\QE\input\pseudo\C.UPF (pseudopotential file for atom species 'C' as specified in input file)
• D:\QE\exe\pw.exe (QE executable file)
• D:\QE\exe\*.dll (all dll files needed by pw.exe)
• D:\QE\working_directory\QE_nextnanomat_HTCondor.bat (batch file)

In nextnanomat, we need the following settings:

• Path to executable file: e.g. D:\QE\working_directory\QE_nextnanomat_HTCondor.bat
• Path to folder with additional files: D:\QE\
• Input file identifier: e.g. &control
• Working directory: Select 'Simulation output folder'
• HTCondor: Output folder and files (transfer_output_files = …): .
• (Additional arguments passed to the executable: $INPUTFILE)

The batch file (*.bat) contains the following content:

.\exe\pw.exe -in .\My_QE_inputfile.in

This means that relative to the working directory, pw.exe is started, and the specified input file is read in. In this input file, the following quantities are specified:

• C.UPF: name of pseudopotential file
• ./input/pseudo/: path to pseudopotential file C.UPF

Open input file My_QE_inputfile.in and run the simulation either locally or on HTCondor.

Things that could be improved:

• Write all files into output folder created by nextnanomat. In particular, the folder output/ should be moved.
• condor_exec.exe is deleted (better: do not copy it back)
• all *.dll files should be deleted (better: do not copy them back)
• Don't copy back *.exe and *.dll files (both HTCondor and local)

#### Settings for ABINIT

Our folder structure is

• D:\abinit\inputfile\t30.in (ABINIT input file)
• D:\abinit\input\* (input files needed by ABINIT)
• D:\abinit\exe\abinit.exe (ABINIT executable file)
• D:\abinit\exe\*.dll (all dll files needed by abinit.exe)
• D:\abinit\working_directory\abinit_nextnanomat.bat (batch file)

In nextnanomat, we need the following settings:

• Path to executable file: e.g. D:\abinit\working_directory\abinit_nextnanomat.bat
• Path to folder with additional files: D:\abinit\
• Input file identifier: e.g. acell
• Working directory: Select 'Simulation output folder'
• HTCondor: Output folder and files (transfer_output_files = …): .
• Additional arguments passed to the executable: (empty)

The batch file (*.bat) contains the following content:

.\exe\abinit.exe < .\input\ab_nextnanomat_HTCondor.files

This means that relative to the working directory, abinit.exe is started, and the specified input file is read in. In this input file, the following quantities are specified:

• .\inputfile\t30.in: name of input file
• .\input\14si.pspnc: name of pseudopotential file

Open input file t30.in and run the simulation either locally or on HTCondor.

##### Notes
• condor_exec.exe is deleted (better: do not copy it back)
• all *.dll files should be deleted (better: do not copy them back)
• Don't copy back *.exe and *.dll files (both HTCondor and local)

## Option B: Amazon EC2 (aws)

(We are working on it.)