simfactory's user level run command does not record jobid in a segment's properties.ini file

Issue #2198 new
Roland Haas created an issue

The (user-level) run command does not record a jobid in a segment's properties.ini file, which means one cannot, e.g., restart from a checkpoint, since simfactory complains about an unset jobid:

simfactory/bin/sim create-run test1 --parfile par/tov_ETK_2018_lisbon.par --procs 1 --walltime 0:5:0 &
grep jobid ~/simulations/test1/output-0000/SIMFACTORY/properties.ini

shows the issue (by not producing any output). The error obtained when trying to restart is along the lines of

Simulation name: binarybh9
Error: job id is negative
Aborting Simfactory

This was reported by Qingwen Wang in http://lists.einsteintoolkit.org/pipermail/users/2018-September/006524.html

A workaround is to never use run as a user and always use submit instead, or to use the cleanup command, after which simfactory no longer objects to the missing job id.
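
To see which segments of a simulation are affected before attempting a restart, one can scan each segment's properties.ini for a jobid entry. The following is a minimal Python sketch and not part of simfactory. It assumes that properties.ini holds plain `key = value` lines and that segments live in `output-NNNN/SIMFACTORY/` below the simulation directory, as in the path used above. The function names `has_jobid` and `segments_without_jobid` are hypothetical.

```python
from pathlib import Path


def has_jobid(properties_file):
    """Return True if the file contains a non-empty `jobid = ...` line.

    Assumption: properties.ini uses plain `key = value` lines.
    """
    for line in Path(properties_file).read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "jobid" and value.strip():
            return True
    return False


def segments_without_jobid(simulation_dir):
    """List the names of segment directories whose properties.ini lacks a jobid."""
    # Match only four-digit segment directories such as output-0000.
    pattern = "output-[0-9][0-9][0-9][0-9]/SIMFACTORY/properties.ini"
    missing = []
    for props in sorted(Path(simulation_dir).glob(pattern)):
        if not has_jobid(props):
            # props is <sim>/output-NNNN/SIMFACTORY/properties.ini
            missing.append(props.parent.parent.name)
    return missing
```

Calling `segments_without_jobid(Path.home() / "simulations" / "test1")` would then list the segments for which the grep above prints nothing.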

Keyword: None

Comments (3)

  1. Ian Hinder

    "sim run" doesn't create a job, because it doesn't use a queueing system. It seems to me that the bug here is that resuming from a checkpoint requires a job id.

  2. Roland Haas reporter

    The issue arises because simfactory wants to know "is the previous job still running?" when creating a new segment, so that it knows whether it can clean up the old one and/or has to chain a new segment.

    Making "sim run" make up its own job id is just a hack to avoid this.
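
The alternative raised in this thread, a check that tolerates a missing job id instead of requiring one, could look roughly like the sketch below. It is a hypothetical illustration and not simfactory's actual code: the function name, the `is_job_active` callback, and the use of `None`, an empty string, or `-1` to mean "no job id recorded" are all assumptions.

```python
def previous_job_may_be_running(jobid, is_job_active):
    """Decide whether the previous segment's queued job could still be running.

    jobid: the recorded job id, or None / "" / "-1" if none was recorded
        (assumed representation of an unset job id).
    is_job_active: callable that takes a job id and returns True if the
        queueing system still lists that job (hypothetical helper).
    """
    if jobid is None or str(jobid).strip() in ("", "-1"):
        # No job id was recorded, e.g. the segment was started with "sim run",
        # so there is no queued job to wait for. A locally started process
        # could still be alive; this sketch does not check for that.
        return False
    return is_job_active(jobid)
```

With such a check, a segment without a job id would be treated as finished as far as the queueing system is concerned, and only segments with a recorded id would be looked up in the queue.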
