launching jobs with simfactory without necessarily syncing the sources

Issue #619 resolved
anonymous created an issue

I would like to launch jobs on a remote machine with simfactory without necessarily syncing the sources. Therefore, my Cactus directory on the remote machine is not necessarily the one guessed by simfactory in the simfactory/mdb/machines/*.ini files. For example on damiana currently simfactory wants Cactus to be in /home/baiotti/damiana/Cactus while my Cactus is in /home/baiotti/Cactus. So currently I get the error:

    simfactory/bin/sim --remote damiana submit test --configuration sim_damiana --parfile=~/test.par --walltime=48:00:00 --procs=36 --queue=intel.q --num-threads=1
    Warning: could not resolve hostname mbaiotti2
    Warning: No xauth data; using fake authentication data for X11 forwarding.
    /home/baiotti/datura/Cactus: No such file or directory.

Would it be possible to have more freedom in specifying the Cactus directory on the remote machine? For the time being I have locally modified the damiana.ini file.

P.S. mbaiotti2 is the hostname of my laptop, from which I am sending the simfactory commands. I don't know why the above warning is printed (--verbose does not say more). I am using Version 1473M.

Keyword: simfactory
Keyword: remote
Keyword: directory

Comments (12)

  1. anonymous reporter

    Why do you specify --remote when you are already on damiana? Can you please try without that option? Simfactory should be able to figure out the source location when called from within.

    Frank

  2. anonymous reporter

    Never mind, so you do send the commands remotely. You do not have to change the damiana ini file. You can overwrite the sourcebasedir in the defs.local.ini file. I do this for some machines as well.

  3. Ian Hinder

    I think that SimFactory should not require the user to configure things which can be determined automatically. If someone is running simfactory from a particular Cactus directory on Damiana, why can't simfactory use that directory to determine the sourcebasedir when building?

    The sourcebasedir machine entry should really only be used when determining the destination of a sync command (not the source), and should not be needed when building, running or submitting simulations locally (this implies that the source tree location should be a property of the simulation, not something read from the machine database).

  4. Frank Löffler

    You are right, Ian, but if I understand correctly, Luca is trying to submit from a remote machine: his laptop. I might have created that confusion in the first place, thinking along the same lines as you are now. Simfactory does figure out the right directory when invoked from within a Cactus tree; at least I am pretty sure it does. When invoked remotely, however, it has no good way to determine where Cactus is actually installed, and it might even be installed in multiple places.

    In this case the only option is to tell simfactory. One possibility is of course to change the machine entry directly, but the better, and intended, option is to change the corresponding entry in the configuration file defs.local.ini. This file isn't stored in the svn repository, so it cannot cause conflicts when updating simfactory itself, and it is much harder to accidentally commit such user-specific entries.

    This leaves open only the question of what the simfactory default should be. At the moment it is probably just the preference of the original author of the configuration, and it is up to the bulk of the users of that particular machine to agree on something and request that it become the default.

  5. Ian Hinder

    OK, so the answer to the original question is: "configure simfactory using its configuration file, defs.local.ini, to specify information about the remote machine that it cannot know". Can the ticket be closed?

    Regarding the situation on damiana, if you use simfactory directly on the head node, you have to prefix every command with "--machine damiana/datura". This is very tiresome. I would prefer simfactory to know which machine I meant by the directory I was in, and for that I need separate directories for the two (such a feature is not implemented yet in any case). It also means you can have the same configuration name for a given project for all machines, and not have to add _damiana or _datura suffixes all the time.

  6. Frank Löffler

    Would it help if simfactory knew how to look at a certain environment variable that is similar in effect to the --machine option?

  7. Erik Schnetter

    Luca

    Simfactory can handle multiple Cactus source trees per machine because many people use this. When you use --remote, Simfactory has to find the "correct" Cactus source tree on the remote system. It does so by looking at where (on the local machine) you call Simfactory from, and then using the "equivalent" source tree on the remote system. For this, it needs to know where all the source trees are stored.

    It is unfortunate that Damiana and Datura share the same head node and the same file system, but are otherwise different systems requiring different configuration options. (This is actually also the case on certain DOE and NCSA systems.) Simfactory has the notion of a "default" configuration (the one called "sim"), and since there is only one default configuration, each source tree is really only meant for a single machine, i.e. either Damiana or Datura. Hence the "damiana" and "datura" subdirectories.

    This could be changed, of course, and maybe should, but the current best solution is to either use the "datura" subdirectory (which I would recommend), or to add the two lines

    [datura]
    sourcebasedir = /home/@USERS

    at the end of your etc/defs.local.ini.
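
The two mechanisms discussed in this thread (a local definition overriding the machine database entry, and a local source tree being mapped to its "equivalent" on the remote machine) can be illustrated with a small Python sketch. This is not Simfactory's actual code: the function names, the trimmed-down ini contents, and the laptop path /Users/luca are all hypothetical, chosen only to reproduce the paths seen in the error message above.

```python
import configparser
import posixpath

# Hypothetical, heavily trimmed stand-ins for a machine database entry and
# for etc/defs.local.ini. Real files contain many more keys.
MACHINE_DB = """
[datura]
sourcebasedir = /home/baiotti/datura
"""

LOCAL_DEFS = """
[datura]
sourcebasedir = /home/baiotti
"""


def effective_sourcebasedir(machine, mdb_text, local_text):
    """Return sourcebasedir for `machine`, letting local definitions win."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(mdb_text)    # defaults from the machine database
    parser.read_string(local_text)  # values read later replace earlier ones
    return parser[machine]["sourcebasedir"]


def remote_source_tree(local_tree, local_base, remote_base):
    """Map a local source tree to the 'equivalent' path under remote_base."""
    relative = posixpath.relpath(local_tree, local_base)
    return posixpath.join(remote_base, relative)


if __name__ == "__main__":
    # Assumed laptop layout: the Cactus tree sits directly under /Users/luca.
    for local_defs in ("", LOCAL_DEFS):
        base = effective_sourcebasedir("datura", MACHINE_DB, local_defs)
        print(remote_source_tree("/Users/luca/Cactus", "/Users/luca", base))
```

Under these assumptions, the run without local definitions resolves to /home/baiotti/datura/Cactus, the directory reported as missing, while the run with the override resolves to /home/baiotti/Cactus.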
