Opened 2 years ago

Last modified 2 years ago

#1890 new enhancement

formaline capture simfactory information

Reported by: jonah.maxwell.miller@…
Owned by: Erik Schnetter
Priority: unset
Milestone:
Component: Other
Version: development version
Keywords: formaline
Cc:

Description

It would be nice if Formaline captured the machine description used by simfactory (i.e., properties.ini, optionlist, submitscript) for a simulation. This information would be convenient for reproducing the exact configuration on a machine later.

Attachments (0)

Change History (8)

comment:1 Changed 2 years ago by Ian Hinder

Isn't this available in the SIMFACTORY metadata directory already?

comment:2 Changed 2 years ago by Erik Schnetter

I don't think the MDB entry is stored.

Also, it would be good to have these in the json files, not just in the directory.

comment:3 Changed 2 years ago by Ian Hinder

Is this a property of the configuration (at the time of build), or the simulation (at the time of submission/run)? I would say the latter, since we want to know with 100% certainty what was used (simfactory could have been updated between build and submission, which would change the MDB file). Is there a mechanism that simfactory could use to "register" the files at submission time, so that Formaline would pick them up and store them in its records?

comment:4 Changed 2 years ago by Frank Löffler

I also don't quite understand what Formaline should do here. Formaline deals with the source code, and it generally does its job. Simfactory deals with build options, submission scripts, etc., and it does so too.

The option list can usually be found in name/SIMFACTORY/cfg/OPTIONLIST, and the other scripts are in run/, or as expanded versions in the individual restarts; properties.ini is saved as well. As long as users don't delete this, I don't really see a reason why we should duplicate it.

In more general terms, I see Simfactory 'above' Formaline in terms of layers. If anything, Simfactory might know about Formaline's special output and post-process it if wanted, not the other way around.

comment:5 Changed 2 years ago by jonah.maxwell.miller@…

What I was thinking of when I submitted the ticket was a convenient way of packaging all of simfactory's information in, e.g., a json file or tarball so that the whole directory tree structure doesn't have to be carried around. Perhaps this would be a job for simfactory not formaline?

comment:6 in reply to:  5 Changed 2 years ago by Frank Löffler

Replying to jonah.maxwell.miller@…:

What I was thinking of when I submitted the ticket was a convenient way of packaging all of simfactory's information in, e.g., a json file or tarball so that the whole directory tree structure doesn't have to be carried around. Perhaps this would be a job for simfactory not formaline?

I think it would. On the other hand, simfactory already makes a copy of the executable, which contains the formaline output. So, all you really would need is the simfactory directory. If you dislike that this is a directory with a number of files in it, you could put it into a 'tar', but what would be the point of that, other than possibly archiving?

What do you plan to do? Maybe there are other ways to do that?

comment:7 Changed 2 years ago by jonah.maxwell.miller@…

For the time being, I plan to make a tarball of the simfactory parameters, optionlist, and submit script, the source tree generated by formaline, the parameter file, and the formaline json file. This way I have a single file containing all I need to reproduce a result that I can attach to, say, a plot or visualization. I don't actually want the executable, because it's large and it will probably become stale in a bit.

There is probably a better way to do this.
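
A minimal sketch of the tarball approach described in this comment, assuming the files have already been located. The function name `bundle_simulation` and all paths in the usage note are hypothetical and are not part of simfactory or Formaline:

```python
import tarfile
from pathlib import Path


def bundle_simulation(paths, archive_path):
    """Pack the given files and directories into one gzip-compressed tarball.

    Paths that do not exist are skipped and returned, so the caller can
    report what is missing from the bundle.
    """
    missing = []
    with tarfile.open(archive_path, "w:gz") as tar:
        for p in map(Path, paths):
            if p.exists():
                # Store each entry under its last path component only, so the
                # original directory tree does not have to be carried around.
                # Note: two inputs with the same name would collide.
                tar.add(p, arcname=p.name)
            else:
                missing.append(str(p))
    return missing
```

It could be called with, for example, the properties.ini, the option list, the submit script, the parameter file, the Formaline json file, and the Formaline source tree directory; where those live depends on the simulation layout.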

comment:8 Changed 2 years ago by Erik Schnetter

Formaline is not just about storing the source code. Formaline is about making simulations reproducible, and that includes capturing all parameters (and parameter changes) at run time, and recording the simulation environment (machine, user, directory, time, various UUIDs, etc.), and also putting part of that information automatically into various places (ASCII and HDF5 output, unless disabled).

I don't know whether Formaline records the number of MPI processes (probably yes) and the number of OpenMP threads (probably no), but it should record both, as well as the job id from the queuing system. It would be trivial to record the MDB entry that Simfactory used to create a simulation, and thus it should be done. "Recording" doesn't mean that it should be in one of the directories -- "recording" means that Formaline puts this information into all the output channels where recorded information is available. These are currently (mostly) a human-readable text file and a json file, as well as a database server if one is configured (it typically isn't, unfortunately). ((We should revisit this -- people find it quite easy to set up a private git repository, so we should go this route if it makes people use it.)) This information is also available at run time, if there is e.g. a web server or a log server running as a thorn. It should probably also go, in its entirety, into large output files.
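
As a rough illustration of what "recording" the MDB entry and the run environment in a json channel could look like, here is a sketch in Python. It is not Formaline's implementation. It assumes the MDB entry is available as plain ini-style text with one section per machine (simfactory's own ini syntax may contain constructs that `configparser` does not accept), and that the scheduler exposes the job id through `SLURM_JOB_ID` or `PBS_JOBID`. The function name `machine_record` and the record layout are hypothetical:

```python
import configparser
import json
import os


def machine_record(ini_text, machine, env=None):
    """Build a JSON-serializable record of one machine entry plus run-time info."""
    env = os.environ if env is None else env
    # Disable interpolation so '%' characters in values are kept verbatim.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return {
        "machine": machine,
        # Raises KeyError if the section for this machine is absent.
        "mdb_entry": dict(parser[machine]),
        # None if the variable is not set in the environment.
        "omp_num_threads": env.get("OMP_NUM_THREADS"),
        # Assumption: job id comes from Slurm or PBS environment variables.
        "job_id": env.get("SLURM_JOB_ID") or env.get("PBS_JOBID"),
    }
```

The resulting dictionary can be passed to `json.dumps` and written next to the other recorded metadata.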
