[Beowulf] HPC workflows

mark somers m.somers at chem.leidenuniv.nl
Wed Nov 28 05:22:44 PST 2018

As a follow-up note on workflows,

we have also used 'sshfs like constructs' to help non-technical users compute things on local clusters, the actual CERN grid
infrastructure and (national) supercomputers. We built some middleware suitable for that many moons ago:


It works great for Python-coded workflows on workstations. So, coming back to the 'sshfs trick':

We have some organic chemists here who run a great many Gaussian calculations and only know Windows. They create
input files using the Gaussian GUI on their workstations and save them in a special directory that is synced using
SyncBackPro to a CentOS server. On that server a Python script runs via cron every 5 minutes to push these Gaussian input files
into our LGI setup. Compute resources hooked up to our LGI that can run Gaussian pick up those jobs, run them using Slurm /
Torque / gLite or whatever is suitable on that compute resource, and eventually upload the results into the LGI repository again.
The Python cron job on the CentOS server notices finished jobs in the LGI queue, downloads the results into a special output
directory and removes the job from the LGI queue. The Windows workstation with SyncBackPro then retrieves the outputs to the
Windows share they all use. This has been running 24x7 for several years now without a glitch, using supercomputers, the actual
grid and local clusters, without these organic chemists having to worry about Unix or details like that.
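
To make the loop above concrete, here is a rough sketch of what one cron pass could look like. It is not our actual script and
it does not use the real LGI API: InMemoryQueue, sync_once, the directory names and the .gjf / .log extensions are all
hypothetical stand-ins chosen for illustration.

```python
import shutil
from pathlib import Path


class InMemoryQueue:
    """Toy stand-in for the job queue; NOT the real LGI API."""

    def __init__(self):
        self.jobs = {}  # job_id -> {"name", "input", "output"}
        self._next_id = 0

    def submit(self, name, input_text):
        self._next_id += 1
        self.jobs[self._next_id] = {"name": name, "input": input_text, "output": None}
        return self._next_id

    def finish(self, job_id, output_text):
        # Simulates a compute resource uploading its results.
        self.jobs[job_id]["output"] = output_text

    def finished(self):
        return {j: job for j, job in self.jobs.items() if job["output"] is not None}

    def remove(self, job_id):
        del self.jobs[job_id]


def sync_once(queue, inbox, submitted, outbox):
    """One cron pass: submit new inputs, then collect finished results."""
    for directory in (inbox, submitted, outbox):
        directory.mkdir(parents=True, exist_ok=True)

    # Push new input files and move them aside so they are not resubmitted.
    for path in sorted(inbox.glob("*.gjf")):
        queue.submit(path.stem, path.read_text())
        shutil.move(str(path), str(submitted / path.name))

    # Download results of finished jobs, then remove them from the queue.
    for job_id, job in list(queue.finished().items()):
        (outbox / f"{job['name']}.log").write_text(job["output"])
        queue.remove(job_id)
```

A real version would call sync_once from a script started by cron, with the queue object wrapping whatever middleware is in
use. It would probably also want to guard against picking up input files that the sync tool has only half written.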

So I can concur, a seemingly simple 'sshfs trick' should not be underestimated :).

We also have many Unix-literate users here who use the Python API to build workflows via LGI, or the simple LGI CLI to
submit jobs from their workstations.


mark somers
tel: +31715274437
mail: m.somers at chem.leidenuniv.nl
web:  http://theorchem.leidenuniv.nl/people/somers
