SimFactory should not silently disable thorns

Issue #1248 new
Ian Hinder created an issue

It is very confusing when you have a thorn in your thornlist, and simfactory silently disables it because it is included in the "disabled-thorns" entry of the machine. SimFactory should at the very least print a message warning the user when the job is submitted that certain thorns requested in the thornlist have been disabled. Printing it when building won't be enough, as the output will likely get lost, but it should also be printed there. It would also be useful to have an option to disable the disabling code, as it has always confused me and wasted my time, and never benefitted me, to my knowledge.
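The warning requested here could be sketched roughly as follows. This is only an illustration, not SimFactory's actual code: the function names are hypothetical, and it assumes that thorn list entries are `Arrangement/Thorn` lines with `#` comments and that the machine's "disabled-thorns" entry is a whitespace-separated list of the same names.

```python
import sys


def find_disabled_thorns(thornlist_text, disabled_thorns):
    """Return thorns that appear in the thorn list but are disabled on the machine.

    Assumed formats (not taken from SimFactory itself):
    - thornlist_text: one "Arrangement/Thorn" entry per line, "#" starts a comment,
      lines starting with "!" are directives and are skipped
    - disabled_thorns: whitespace-separated "Arrangement/Thorn" names
    """
    disabled = set(disabled_thorns.split())
    requested = []
    for line in thornlist_text.splitlines():
        entry = line.split("#", 1)[0].strip()  # drop trailing comments
        if not entry or entry.startswith("!"):  # blank line or directive
            continue
        requested.append(entry.split()[0])
    return [thorn for thorn in requested if thorn in disabled]


def warn_disabled_thorns(thornlist_text, disabled_thorns, machine, out=sys.stderr):
    """Print one warning per requested-but-disabled thorn and return the list."""
    hits = find_disabled_thorns(thornlist_text, disabled_thorns)
    for thorn in hits:
        print(
            f"Warning: thorn {thorn} is in the thorn list "
            f"but is disabled on machine {machine}",
            file=out,
        )
    return hits
```

Calling `warn_disabled_thorns` both at build time and at job submission would cover the two places mentioned above.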

Comments (3)

  1. Erik Schnetter

    You should be able to set "disabled-thorns =" in the [default] section of your defs.local.ini.
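As a rough illustration of that setting, the snippet below embeds the suggested `defs.local.ini` fragment and reads it back. It uses Python's standard `configparser` as a stand-in; SimFactory's own ini handling may differ, and the helper name `read_disabled_thorns` is hypothetical.

```python
import configparser

# The override suggested above: an empty "disabled-thorns" entry in [default].
DEFS_LOCAL_INI = """
[default]
disabled-thorns =
"""


def read_disabled_thorns(ini_text, section="default"):
    """Return the disabled-thorns entry of a section as a list of names."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    value = parser.get(section, "disabled-thorns", fallback="")
    return value.split()


# An empty entry yields an empty list, i.e. no thorns to disable.
NO_THORNS_DISABLED = read_disabled_thorns(DEFS_LOCAL_INI)
```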

    It is quite possible that this never benefitted you, because we were very careful only to add thorns to the Einstein Toolkit thorn list that work on the standard production systems. If you use thorns with potentially complex dependencies (e.g. PETSc that may not be available on some systems), or if you port the Einstein Toolkit to new architectures (e.g. Blue Gene/Q), then the ability to disable thorns is very useful.

    Instead of printing a warning (nothing wrong with this), what you probably really want is to briefly check the parameter file upon job submission, so that you receive your error messages much earlier.

    In your case, which thorn was disabled on what machine? How did you get around this? Presumably, it was a thorn that you needed -- was it disabled by mistake, or did you have to invest non-trivial effort to make it work? In the latter case, I would argue that disabled-thorns may have been useful to you, since it allowed you to use that machine (or get started there) without having to care about this thorn first.

  2. Ian Hinder reporter

    It was hwloc on supermuc, presumably because the configuration script didn't work. I've now fixed this (see other ticket), so I can now run hwloc on that machine. I see your point, but I didn't have hwloc in my thornlist in the first place (it was not a standard ET thornlist). It was only when I added it that I ran into the problem, and a lot of head-scratching ensued! Only when I checked the thornlist in the config directory did I see that the thorn had been commented out. I can see someone getting very confused if they run what they think is an identical thornlist on two different machines, and get an error on one of them because a thorn is not compiled in. This would be solved by a more visible warning though, so I now think that would be better than disabling this feature.

    I'm uncomfortable about simfactory looking at the parameter file on job submission, as it breaks an abstraction (simfactory should not know much about cactus). Would it be possible/good to add the information about the machines to the thornlist rather than adding the information about the thorns to the machine definition? That keeps cactus information with the cactus file, and it would be quite obvious when looking at the thornlist that there is an issue on certain machines. People are probably more likely to look at the thornlist than the machine definition (I think). I think it doesn't matter where the warning is, or when, as long as the user notices it eventually.

  3. Erik Schnetter

    Simfactory would call Cactus with a special option to test the parameter file. For this, it would run Cactus on the front end. There would be an MDB entry teaching Simfactory how to do this. On some machines, this probably can't work (cross-compiling), so it would be disabled there.

    Yes, this information should live in the thorn list. This way (since you have your own thorn list) you would never even have disabled the thorn.

    We already have comments in the beginning of the thorn list teaching people how to enable OpenCL. This could be formalised in the same way.

    I suggest the following syntax:

    !ENABLE [machine] = [thorn] [thorn] ...
    !DISABLE [machine] = [thorn] [thorn] ...

    This is reasonably similar to the current CRL syntax, and the meaning is immediately clear.

    Thorns that should be disabled by default should have a !DISABLED in front of them. In this way, Cactus will treat them as comments. The !ENABLE command would remove the !DISABLED, and the !DISABLE would add such a !DISABLED.
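A minimal sketch of how a tool might apply these directives for one machine is shown below. It is only an illustration of the proposal, not existing SimFactory or Cactus behaviour: the function names are hypothetical, it assumes one thorn per line, and it leaves the directive lines themselves in the output. What should happen if a thorn is listed under both !ENABLE and !DISABLE for the same machine is not specified by the proposal and is not handled specially here.

```python
def parse_directive(line):
    """Parse '!ENABLE machine = thorn ...' or '!DISABLE machine = thorn ...'.

    Returns (keyword, machine, [thorns]) or None for any other line.
    """
    stripped = line.strip()
    for keyword in ("!ENABLE", "!DISABLE"):
        # The trailing space keeps "!DISABLED thorn" from matching "!DISABLE".
        if stripped.startswith(keyword + " ") and "=" in stripped:
            left, right = stripped[len(keyword):].split("=", 1)
            return keyword, left.strip(), right.split()
    return None


def apply_directives(thornlist_text, machine):
    """Return the thorn list with the directives for `machine` applied."""
    lines = thornlist_text.splitlines()

    # First pass: collect the thorns to enable or disable on this machine.
    enable, disable = set(), set()
    for line in lines:
        directive = parse_directive(line)
        if directive and directive[1] == machine:
            keyword, _, thorns = directive
            (enable if keyword == "!ENABLE" else disable).update(thorns)

    # Second pass: remove or add the !DISABLED prefix on matching thorn lines.
    result = []
    for line in lines:
        words = line.split()
        if len(words) >= 2 and words[0] == "!DISABLED" and words[1] in enable:
            result.append(line.replace("!DISABLED", "", 1).lstrip())
        elif words and words[0] in disable:
            result.append("!DISABLED " + line.lstrip())
        else:
            result.append(line)
    return "\n".join(result)
```

With a thorn list containing `!DISABLE supermuc = ExternalLibraries/hwloc`, calling `apply_directives(text, "supermuc")` would prefix the `ExternalLibraries/hwloc` line with `!DISABLED`, while the same thorn list on any other machine would be left unchanged.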
