Include IllinoisGRMHD into the Toolkit, as an Arrangement

Issue #1796 closed
Zach Etienne created an issue

IllinoisGRMHD, an open-source, compact rewrite of the Illinois NR group's GRMHD code, has been graciously hosted by the ET community in its Bitbucket repo for more than a year, under the EinsteinEvolve arrangement.

This ticket requests official inclusion of an updated IllinoisGRMHD and closely-related thorns into the ET, as an arrangement.

IllinoisGRMHD brings a number of key new features to the Toolkit, including:

1) a staggered, vector-potential-based, AMR-compatible GRMHD scheme that automatically preserves divergenceless B-fields without the need for specialized interpolation schemes or divergence cleaning (see the sketch after this list);

2) excision-less, robust modeling of GRMHD flows into black-hole interiors;

3) a highly robust conservative-to-primitive solver, which checks the physicality of the conservative variables prior to inversion and modifies them minimally to restore physicality.
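
To give a flavor of feature 1: B is computed as the discrete curl of an edge-centered vector potential A on a staggered (Yee-type) grid, so the matched finite differences make the discrete divergence of B cancel identically. Below is a minimal flat-space illustration of that idea; it is not IllinoisGRMHD source, and the indexing macro and names are hypothetical:

    #define IDX(i,j,k) ((i) + nx*((j) + ny*(k)))

    /* Illustration only: face-centered B from the curl of A.  Any discrete
       divergence built from the same differences cancels term by term. */
    void curl_A_to_B(const double *Ax, const double *Ay, const double *Az,
                     double *Bx, double *By, double *Bz,
                     int nx, int ny, int nz,
                     double dx, double dy, double dz)
    {
      for (int k = 1; k < nz; k++)
        for (int j = 1; j < ny; j++)
          for (int i = 1; i < nx; i++) {
            Bx[IDX(i,j,k)] = (Az[IDX(i,j,k)] - Az[IDX(i,j-1,k)])/dy
                           - (Ay[IDX(i,j,k)] - Ay[IDX(i,j,k-1)])/dz;
            By[IDX(i,j,k)] = (Ax[IDX(i,j,k)] - Ax[IDX(i,j,k-1)])/dz
                           - (Az[IDX(i,j,k)] - Az[IDX(i-1,j,k)])/dx;
            Bz[IDX(i,j,k)] = (Ay[IDX(i,j,k)] - Ay[IDX(i-1,j,k)])/dx
                           - (Ax[IDX(i,j,k)] - Ax[IDX(i,j-1,k)])/dy;
          }
    }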

The proposed arrangement will include:

1) IllinoisGRMHD: core GRMHD evolution routines, with some significant bugfixes since the original version currently in the ET's Bitbucket repo; about 3,400 lines of code (cloc)

2) ID_converter_ILGRMHD: converts HydroBase variables into variables IllinoisGRMHD can read (e.g., IllinoisGRMHD uses a velocity definition consistent with the magnetic induction equation, not the Valencia formulation; see the note after this list); 209 lines of code (cloc)

3) convert_to_HydroBase: does the reverse of ID_converter_ILGRMHD, needed for compatibility with HydroBase-based analysis thorns; 126 lines of code.
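
A note on item 2, for concreteness (standard definitions, spelled out here for the reader rather than quoted from the paper): IllinoisGRMHD evolves the 3-velocity v^i = u^i/u^0 appearing in the induction equation, whereas the Valencia formulation uses v^i_(Val) = (u^i/u^0 + beta^i)/alpha. The conversion is therefore just v^i = alpha*v^i_(Val) - beta^i, together with its inverse.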

All codes are written in C99 and are fully OpenMP-ified. You can read more about IllinoisGRMHD in its code-announcement paper (CQG, in press): http://arxiv.org/abs/1501.07276

Additionally, a Guide to Getting Started with IllinoisGRMHD has been written and is available on the IllinoisGRMHD webpage: http://math.wvu.edu/~zetienne/ILGRMHD/

You may now download all three of these thorns (licensed GNU GPL v2 or higher) from: math.wvu.edu/~zetienne/IllinoisGRMHD_arrangement_July_20_2015.tar.gz

Keyword: GRMHD
Keyword: IllinoisGRMHD

Comments (23)

  1. Roland Haas

    For what it is worth (given that I don't actually use either code right now): I would think it useful to have both GRHydro and IllinoisGRMHD, since they are somewhat complementary right now.

  2. Steven R. Brandt

    On the Aug 3 call, interest was expressed in making this part of the Toolkit.

  3. Roland Haas

    The second review report is with Zach. Are there objections to moving IllinoisGRMHD (and its compatibility thorns, if we include them) into a repository of its own, which would then be Zach's, in the sense that development there is independent in the way that, e.g., Carpet's is?

  4. Frank Löffler

    That would be OK, of course. ET developers would need to have write access for releases, though.

  5. Frank Löffler

    Some thoughts, in the order I found them (not severity):

    convert_to_HydroBase:

    • Most thorns follow the convention of starting with an upper-case letter. Not a must, but these stand out a little because of that.

    • It would be nice to provide a README file. It doesn't need to be long; just look at some examples from other thorns. Not a must, but it would be nice.

    • Why do you explicitly enable storage for ADMBase and HydroBase variables? Did you find that these thorns do this insufficiently?

    • You might want to schedule "in INITIAL after HydroBase_Initial" instead of in POSTINITIAL after SetTmunu. SetTmunu is, as far as I can see, also scheduled in Initial, not in PostInitial, so this 'after' statement is always fulfilled already.

    • The make.code.defn file contains "Main make.code.defn file for thorn smallbPoyn".

    • Why do you have it depend on CarpetRegrid2?

    ID_converter_ILGRMHD:

    • It would be nice to provide a README file. It doesn't need to be long; just look at some examples from other thorns. Not a must, but it would be nice.

    • There is a "shares: driver" statement without anything actually being shared.

    • Same question about explicit storage of other thorns' variables.

    • You schedule the conversion IN HydroBase_Initial instead of after it, explicitly depending on AFTER statements of a few known ID thorns. This is likely to fail for other ID thorns. Would it be possible to schedule AFTER HydroBase_Initial?

    • The make.code.defn file says "for thorn IllinoisGRMHD_InitialData"?

    • The random perturbation code would be better placed in a separate thorn. Wouldn't it also be useful for purposes other than the conversion?

    Both:

    • There are no testsuites.

    • There is no (at least short) documentation. It doesn't need to be much, given these thorns.

  6. Zach Etienne reporter

    Hi Frank,

    convert_to_HydroBase:

    • Most thorns follow the convention of starting with an upper-case letter. Not a must, but these stand out a little because of that.

    I renamed convert_to_HydroBase -> Convert_to_HydroBase

    • It would be nice to provide a README file. It doesn't need to be long; just look at some examples from other thorns. Not a must, but it would be nice.

    I added both README and LICENSE (BSD 2-clause) files.

    • Why do you explicitly enable storage for ADMBase and HydroBase variables? Did you find that these thorns do this insufficiently?

    A force of habit: I really hate segfaults when storage isn't enabled properly. I'm worried I might break something if I adjust this.

    • You might want to schedule "in INITIAL after HydroBase_Initial" instead of in POSTINITIAL after SetTmunu. SetTmunu is, as far as I can see, also scheduled in Initial, not in PostInitial, so this 'after' statement is always fulfilled already.

    Scheduling in the ET is a nightmare. As usual, this scheduling was set up after a long period of trial and error. I'm worried I might break something if I adjust this.

    • The make.code.defn file contains "Main make.code.defn file for thorn smallbPoyn". Why do you have it depend on CarpetRegrid2?

    Fixed.

    ID_converter_ILGRMHD:

    • It would be nice to provide a README file. It doesn't need to be long; just look at some examples from other thorns. Not a must, but it would be nice.

    I added both README and LICENSE (BSD 2-clause) files.

    • There is a "shares: driver" statement without actually anything sharing.

    Removed.

    • Same question about explicit storage of other thorns' variables.

    Same response; worried I might break something.

    • You schedule the conversion IN HydroBase_Initial instead of after it, explicitly depending on AFTER statements of a few known ID thorns. This is likely to fail for other ID thorns. Would it be possible to schedule AFTER HydroBase_Initial?

    Yes, this is not desirable behavior. Still needs to be fixed.

    • The make.code.defn file says "for thorn IllinoisGRMHD_InitialData"?

    Fixed.

    • The random perturbation code would be better placed in a separate thorn. Wouldn't it also be useful for purposes other than the conversion?

    It's conceivable it might be useful for other purposes, though the one purpose I have in mind is code validation: by adding a perturbation of order machine epsilon to the initial data in a trusted version of IllinoisGRMHD, I can quickly measure how roundoff errors grow. This enables me to verify that new versions of IllinoisGRMHD agree with trusted versions to roundoff error. So I'd prefer to leave this be for the time being.
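
    For concreteness, the perturbation amounts to something like the following (a minimal sketch, not the actual thorn code; the function name is hypothetical):

        /* Sketch: nudge each grid-function value at roughly the 15th
           significant digit.  A fixed seed keeps the perturbed run
           reproducible across reruns. */
        #include <stdlib.h>

        void perturb_roundoff(double *gf, int n, double eps, unsigned seed)
        {
          srand(seed);
          for (int i = 0; i < n; i++) {
            const double r = 2.0*((double)rand()/RAND_MAX) - 1.0; /* in [-1,1] */
            gf[i] *= 1.0 + eps*r;  /* eps ~ 1e-15 gives a roundoff-level nudge */
          }
        }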

    both:

    • There are no testsuites

    As IllinoisGRMHD is nearly useless without these compatibility-layer thorns, I was waiting to add testsuites to IllinoisGRMHD until these thorns had been added to the Toolkit.

    • There is no (at least short) documentation. It doesn't need to be much, given these thorns.

    Given how little these thorns actually do, I added a couple paragraphs in the README files to document the thorns.

  7. Zach Etienne reporter

    Hi again, Frank.

    I have fixed the convert_to_HydroBase and ID_converter_ILGRMHD schedule.ccl files after verifying that the STORAGE blocks are in fact not required. I also rescheduled both thorns' initial routines to run AFTER HydroBase_Initial as you suggested, again verifying that the code output is identical to the original. This enabled me to schedule ID_converter_ILGRMHD so that it no longer depends on AFTER statements of known ID thorns. Bottom line: these changes have no effect beyond roundoff error in my simulations.
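
    For reference, the rescheduled entry now takes roughly this form in schedule.ccl (a sketch only; the scheduled function name here is illustrative, not necessarily the one in the thorn):

        schedule IDconverter_lowlevel AT initial AFTER HydroBase_Initial
        {
          LANG: C
        } "Convert HydroBase initial data into IllinoisGRMHD variables"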

    The only remaining item is to generate testsuites. I'll follow the instructions at http://einsteintoolkit.org/documentation/UsersGuide/UsersGuidech9.html#x13-121000C1.8.5 and http://einsteintoolkit.org/documentation/UsersGuide/UsersGuidech6.html#x9-34000B2.6 to do this, using TOVSolver initial data.

    Can these compatibility-layer thorns be accepted into the next version of the Toolkit with testsuites promised but not yet included? I am concerned because my work schedule looks very busy over the coming month.

    Thanks in advance for letting me know.

  8. Ian Hinder

    There are also some tips on writing Cactus test cases at https://docs.einsteintoolkit.org/et-docs/adding_a_test_case.

    Typically, we require a thorn to have test cases before being added to the toolkit. This is part of the "value proposition" that comes from being in the toolkit; people should expect that a certain level of quality assurance and software good practice has been applied to the components of the toolkit. However, these tests are not required to be comprehensive. Would it be relatively straightforward to add at least a minimal test to each thorn that tests its primary functionality? Or could you argue that these thorns are tested by other thorns with test cases which make use of them?

  9. Zach Etienne reporter

    A test case will be added as required, despite the fact that time limitations mean it will be mostly useless to me (see below; I'd love to be proven wrong). I figure that at least the users might benefit from an example par file.

    After any update to IllinoisGRMHD that might conceivably introduce beyond-roundoff-level differences (due to a bug), I manually perform full correctness testing on a physically motivated scenario prior to pushing my changes. This involves a careful analysis of the growth of roundoff error versus a trusted version. I also compare the trusted version with itself, but with a 15th-significant-digit initial perturbation. If the growth of roundoff error in the new version of IllinoisGRMHD does not match the growth in the trusted version, I conclude that the new version has a bug.

    As you might imagine, this testing requires many-timestep runs with AMR grid movement, and I don't care if the test takes a couple hours; my science requires that the tool be reliable and consistent with trusted versions beyond a doubt. Perhaps we should have a longer discussion about the possibility of including more robust correctness testing in ET.

  10. Zach Etienne reporter

    I have just added a magnetized TOV code test to the IllinoisGRMHD/test directory. This required adding a new thorn, called Seed_Magnetic_Fields. I also added a test.ccl file and verified that running make [config name]-testsuite on this test yields:

    Success: 21 files identical

    Note that this test merely provides a sanity check. In other words, passing it is a necessary but not sufficient condition for acceptance of any IllinoisGRMHD patch; contact me (Zach) for details on the more robust set of tests that must be applied to any patch that might conceivably produce beyond-roundoff differences from the original version. Our #1 goal is to ensure roundoff-error identity with the original IllinoisGRMHD; otherwise we risk getting the science wrong.

    Frank: Please review the new Seed_Magnetic_Fields at your earliest convenience; it's only 51 lines, not including the .ccl files.

  11. Erik Schnetter

    Are you saying that one should not modify the code unless a certain set of tests is run? In the spirit of open source, I think this implies that these tests should be publicly available. (They don't need to be clean, or easy to run.) I understand that these tests may have taken more time to develop than the code itself, but without them the code would probably not exist, or its results would not be believable.

    Are these tests described somewhere? A description in the thorn's documentation, or a pointer from the README to an arxiv paper etc. would be good.

  12. Zach Etienne reporter

    My understanding, based on Ian Hinder's message to the [Users] list (Subject: Test case notes), was that tests cannot be long in duration:

    "Tests should run as quickly as possible; ideally in less than one second. Longer tests are OK if this is absolutely necessary to test for regressions."

    I interpret this to mean that my usual tests for correctness, which might require evolutions of order an hour long, running in parallel across machines, are disallowed in ET.

    Regardless, I had already added instructions for proper correctness testing in the parameter file of my test case:

    In this parameter file, I have added a code comment that outlines the basic requirements for agreement. Essentially, if a random perturbation at the 15th significant digit is added to the initial data (via the parameter ID_converter_ILGRMHD::random_pert = 1e-15), differences between this perturbed run and an unperturbed run after 80 iterations should appear at the 12th significant digit. Unfortunately, 80 iterations require several minutes.
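
    As a worked example of what "appear at the 12th significant digit" means: the number of matching significant digits between the perturbed and unperturbed values of a quantity can be estimated from their relative difference (a sketch; matching_digits is a hypothetical helper, not part of the thorns):

        #include <math.h>

        /* Estimate how many leading significant digits two values share. */
        double matching_digits(double a, double b)
        {
          if (a == b) return 16.0;                /* cap at ~double precision */
          const double rel = fabs(a - b) / fmax(fabs(a), 1e-300);
          return -log10(rel);                     /* rel ~ 1e-12 => ~12 digits */
        }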

    Ideally, I'd like to run the test in parallel at higher resolution for at least a few hundred iterations (sound-crossing times). In addition, asking Carpet to move the grids around would be ideal, à la Eq. 85 of http://arxiv.org/pdf/1007.2848.pdf. What would be the easiest way to tell Carpet to move the grids around according to such a prescription, or something similar? Another thorn?
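
    One option I can imagine is a tiny thorn that steers CarpetRegrid2's (steerable) refinement-center parameters from a routine scheduled every iteration. A hypothetical sketch, assuming CarpetRegrid2's position_x_1 etc. are the right knobs, and using an illustrative circular trajectory rather than the actual Eq. 85 prescription:

        #include <math.h>
        #include <stdio.h>
        #include "cctk.h"
        #include "cctk_Arguments.h"

        /* Hypothetical: move refinement box 1 along a circle of radius R. */
        void MoveGrids_Prescribed(CCTK_ARGUMENTS)
        {
          DECLARE_CCTK_ARGUMENTS;
          const double R = 2.0, omega = 0.1;      /* illustrative values */
          char val[64];
          snprintf(val, sizeof val, "%.17g", R*cos(omega*cctk_time));
          CCTK_ParameterSet("position_x_1", "CarpetRegrid2", val);
          snprintf(val, sizeof val, "%.17g", R*sin(omega*cctk_time));
          CCTK_ParameterSet("position_y_1", "CarpetRegrid2", val);
        }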

  13. Erik Schnetter

    It is true that we have neither a systematic way nor the automated infrastructure to run such tests. I am thus very happy that you have such rigorous standards, and presumably also a battery of tests.

    My concern was that this may make it difficult for others to contribute to the code, since it will be up to you to validate all changes. I don't have a good answer, but a public, specific description of these tests -- and, in the long term, an official collection of such tests for all parts of the ET -- would go a long way.

  14. Zach Etienne reporter

    Then I have a proposal:

    1) I will keep the current test in the ET as-is; it takes <~10 seconds to complete and thus meets the walltime requirements.

    2) I will set up an independent, open-source, automated testing mechanism for IllinoisGRMHD that will be in place for the next ET release. This will probably require another thorn or two, to do things like move Carpet grids around. If the ET folks would like, I'd be happy to incorporate the codes for this testing mechanism into the Toolkit as well. I will also arrange for resources here at WVU so that the automated tests run here, with the results made public on the IllinoisGRMHD website (http://math.wvu.edu/~zetienne/ILGRMHD/).

    3) To avoid any potential issues until (2) is implemented, I declare the main branch of the WVUThorns arrangement frozen (modulo any issues found in code review) until the next ET release.

    Please let me know if you think this plan will be acceptable.

  15. Zach Etienne reporter

    Clarification:

    In (2), when I said "next ET release", I was referring to the first 2016 release. In (3), when I said "next ET release", I was referring to the November 2015 release.

    Sorry for any confusion.

  16. Ian Hinder

    Replying to [comment:14 Zach Etienne]:

    My understanding, based on Ian Hinder's message to the [Users] list (Subject: Test case notes), was that tests cannot be long in duration:

    "Tests should run as quickly as possible; ideally in less than one second. Longer tests are OK if this is absolutely necessary to test for regressions."

    I interpret this to mean that my usual tests for correctness, which might require evolutions of order an hour long, running in parallel across machines, are disallowed in ET.

    Hi Zach,

    The point is that the tests in the thorns' 'test' directory are intended to be tests which can be run frequently and easily. They are run roughly after every commit on some fairly small virtual machines, and people are encouraged to run the tests after checking out the ET, to make sure that everything is currently working on their machine. Thus, they should be fast to run. Usually this means regression tests with small amounts of output.

    In addition, a code should have correctness tests, and these may need to be run on a cluster, consist of multiple simulations (resolutions), run for longer, produce more output, etc. The existing Cactus test mechanism is not suitable for this, and there is no alternative framework. I have argued in the past for the need for such a framework, and of course nobody disagreed, but it is a question of resources to develop such a framework.

    If you have existing correctness tests, then it would be really good if you could include instructions for running them, e.g. in the README or documentation. If they need access to large test data, I would be reluctant to include it in the thorn directory. Maybe a separate repository somewhere?

    I would also be interested in discussions about how a correctness-testing framework would look. It would need to be able to submit simulations to clusters and collect their data, hence simfactory. It would also need to be able to analyse the data, e.g. collecting data from multiple processes as you normally do when analysing your data; i.e., it would need an analysis framework (https://docs.einsteintoolkit.org/et-docs/Analysis_and_post-processing). It would also need IT infrastructure similar to a Jenkins server to automate running testing jobs, providing interfaces for people to see what failed, look at results, etc. It's nontrivial, which is probably why it doesn't already exist.

  17. Zach Etienne reporter

    Replying to [comment:18 hinder]:

    Replying to [comment:14 Zach Etienne]:

    My understanding, based on Ian Hinder's message to the [Users] list (Subject: Test case notes), was that tests cannot be long in duration:

    "Tests should run as quickly as possible; ideally in less than one second. Longer tests are OK if this is absolutely necessary to test for regressions."

    I interpret this to mean that my usual tests for correctness, which might require evolutions of order an hour long, running in parallel across machines, are disallowed in ET.

    The point is that the tests in the thorns' 'test' directory are intended to be tests which can be run frequently and easily. They are run roughly after every commit on some fairly small virtual machines, and people are encouraged to run the tests after checking out the ET, to make sure that everything is currently working on their machine. Thus, they should be fast to run. Usually this means regression tests with small amounts of output.

    My correctness tests require very little output (for TOV stars, just the maximum density plus the L2 norm of the density versus time suffices). The issue is that they require orders of magnitude more resources than a few seconds of testing on a server.

    In addition, a code should have correctness tests, and these may need to be run on a cluster, consist of multiple simulations (resolutions), run for longer, produce more output, etc. The existing Cactus test mechanism is not suitable for this, and there is no alternative framework. I have argued in the past for the need for such a framework, and of course nobody disagreed, but it is a question of resources to develop such a framework.

    Great to hear! I would argue that resource requirements depend most sensitively on the timescale of development: if major patches land on a weekly timescale, then as long as complete, continuous correctness tests take about a day or two, we should be fine. For IllinoisGRMHD, a souped-up laptop or a desktop collecting dust could do all the necessary correctness testing within hours. Then it would just be a matter of automatically generating roundoff-error build-up comparison plots (à la Fig. 1 of the IllinoisGRMHD paper, http://arxiv.org/pdf/1501.07276.pdf) for a couple of quantities of physical interest, and automatically uploading these to a web server. I already have the basic infrastructure components in place here at WVU; I'd just need to write the scripts and run them as a cron job.

    If you have existing correctness tests, then it would be really good if you could include instructions for running them, e.g. in the README or documentation. If they need access to large test data, I would be reluctant to include it in the thorn directory. Maybe a separate repository somewhere?

    The easiest correctness test is to run the parfile for more iterations and compare the roundoff-error build-up in the maximum density against the trusted code. It is remarkable how sensitive this quantity is to any small, beyond-roundoff-level difference. I'll write instructions for doing so in the README, along with a small table of expected results. Thanks for the tip!

    I would be interested also in discussions about how a correctness-testing framework would look. It would need to be able to submit and collect data from simulations on clusters, hence simfactory. It would also need to be able to analyse the data, e.g. collecting data from multiple processes as you normally do when analysing your data. i.e. it would need an analysis framework (https://docs.einsteintoolkit.org/et-docs/Analysis_and_post-processing). It would also need IT infrastructure similar to a jenkins server to automate running testing jobs, providing interfaces for people to see what failed, look at results, etc. It's nontrivial, which is probably why it doesn't already exist.

    Such an automated framework indeed sounds nontrivial, requiring a lot of human time to complete. I would suggest starting with something quick and dirty, like scripts that simply publish plots (like Fig. 1 of http://arxiv.org/pdf/1501.07276.pdf) to a web page automatically, so that within a second a human could recognize by eye whether correctness was violated. This idea is unlikely to scale up to many thorns, but I think it will work well for WVUThorns at least, and I'm happy to share my codes for doing so, in case they are useful.

  18. Zach Etienne reporter

    As requested, I have just added thorndoc .tex files to all thorns in the WVUThorns arrangement.

    Please let me know if there are any remaining concerns regarding WVUThorns incorporation into the next ET release.
