parfiles accept 1.0, 0.0, 1e3 for integer typed parameters

Issue #1958 closed
Roland Haas created an issue

Currently I can put

Cactus::cctk_itlast = 1.0e3

into a parameter file and Cactus will accept this parfile (cctk_itlast = 0.5 fails).

My feeling would that it is better only allow "[-+]?[0-9]+" for integers and not accept everything and then check afterwards if it could be converted to an integer without loss of accuracy.

This probably changed when the parfile gained the ability to evaluate expressions (both using piraha and before). Mostly I am concerned that accepting floating point numbers that have integer values can mask errors in parfile generating scripts (eg I have one that apparently sets Llama's n_angular to 24.0) and which will crop up when the input to the parfile generator changes a bit.

Maybe the simplest way to achieve this is, is to test like this:

char valuestring[];
char *endp;
long int intval = strtol(valuestring, &endptr, 10);
if(*endptr && (strdod(valuestring, &endptr), !*endptr)) {
  CCTK_VERROR("Float given for int parameter");
}

which tests if the parameter cannot be converted to an integer but can be converted to a floating-point number. I believe there is already a similar test in the parfile code to check if valuestring is a constant or an expression.

Keyword:

Comments (27)

  1. Frank Löffler
    • removed comment

    Such an extra check shouldn't be necessary. Parameters are correctly parsed as integers or reals. However, right now the code includes a step that tries to convert any real to integer if it can be done without loss of precision. This is 'safe' because any integer can be interpreted as real as well. Look for integerize() in src/piraha/Call.cc. Making this function a no-op catches the example given in this ticket and (in the one test I did) still works otherwise as far as my limited test can show.

    However, since this was apparently done deliberately, I would wait for input from Steve to see why, and what possible consequences of removing it could be.

    In the end it is most likely more a question of what we like.

    As to masking errors in par file generators: as long as it generates 24.0 (dot zero) all should be fine. As soon as you get something else than an "integer", Cactus should complain loudly. Is the problem that a generator might create a gazillion files of which some could work (by chance hitting an integer), and some might not? I could see that this could be ... inconvenient.

  2. Steven R. Brandt
    • removed comment

    My recollection is that I tried to make it work the way you suggest initially, but got enough errors from existing thorns that I decided it wasn't what people wanted. Similarly, assigning 1 or 0 to a boolean works, so it seems consistent.

  3. Roland Haas reporter
    • removed comment

    Hmm, well, as Frank pointed out, the code will refuse actual non-integer values so there is no harm other than aesthetics. Are you sure that there are thorns out that use 1.0 instead of 1? I am asking because the pre-Piraha parser did have exactly the

    https://bitbucket.org/cactuscode/cactus/src/c365e508a8a971f6ff61ea7aca547eaec7002650/src/main/Parameters.c?at=parallel_testsuite&fileviewer=file-view-default#Parameters.c-2189
    
    and the parameter file parser would only parser parameters to strings.
    
    I think this is different from accepting 0 and 1 for bools since 0 and 1 are actually what the bools look like in the code (CCTK_BOOLEAN parameters have integer values of 0 and 1 and eg in Fortran one needs to check for this and cannot check for truth).
    
    Frank: I am saying that there is a difference between 1.0 and 1 since only the latter fits the definition of what eg the C compiler will accept as an "int" variable. The conversion in integerize() is fine and is ok for the result of expressions that are assigned to integer parameters, this is about accepting the constant 1.0 for an integer parameter.
    
  4. Frank Löffler
    • removed comment

    I just compiled all of the ET with integerize() being a no-op, i.e., with the strict check.

    Concerning boolean: another example is accepting "yes" and "no". Which isn't anything like any programming language uses.

    The question is indeed, if we should treat 1.0 as being an integer number. I would approach that independent of C or Fortran. I also don't quite see 1.0 being the problem. I don't see any real use case for 1.0 besides maybe convenience for par-file scripts. The current format also accepts things like 1.e6, which is a lot shorter than the same written as usual integer, and easier to parse by eye.

    I don't have a strong opinion either way, but also didn't see any strong reason for either option yet. We might have one other potential problem. We might hit problems with small errors making the test ineffective, especially for large numbers. If we do, and if we officially allow the current syntax, we would need to find a way around that. In this sense, I lean towards the proposed strict check. It would always be possible for a user to get from a float to an int by hand: int(1.0).

  5. Steven R. Brandt
    • removed comment

    OK, well, if it compiled without the 1.0 -> 1 auto conversion, then I guess I don't remember what motivated me to make it work this way.

    I'm not sure what you mean by "small errors making the test ineffective."

    My comment about booleans, btw, is that "true" and "false" aren't obviously 1 and 0 (at least not to me). It is in C, of course. For the shell, however, 0=true, non-zero=error code.

    In the end, however, I also don't have a strong opinion about this. :)

  6. Frank Löffler
    • removed comment

    I then propose to go with the original suggestion of this ticket; to remove the automatic conversion, by not calling/removing the integerize() function.

  7. Roland Haas reporter
    • removed comment

    I admit to being confused. Since the check only happens at runtime (when the .par files are read) then compilation seems to not be sufficient (also since C++ will let you do "int a = (double)b"). I would also expect that all tests in the testsuite will pass since the pre-piraha parser would allow 1.0 so no test could rely on it. It would however be good to check if things like:

    thorn::par = 2*1
    

    still work after the change or not. Please note that I am only suggesting that

    thorn::par = 1.0
    

    should fail, not

    thorn::par = 1.0*1
    

    (mostly since the later would require that we re-introduce integer arithmetic into the evaluator if all arguments of an expression are integers).

    If you make the change, please also update the user guide which currently states that "In cases where this is clear, types will be converted, e.g. 3.0 will convert to an integer, but not 3.1." in section B 3.2. Booleans being represented as CCTK_INT is documented in the UserGuide (section C1.6 section "Parameters" in my current checkout) documenting C's convention for true/false. There is however a body of Fortran code around that uses param.eq.1 to test for truth rather than the more correct .not. (param.eq.0), so at least for the actual code we are stuck with representing true/false as 1/0.

  8. Steven R. Brandt
    • removed comment

    OK, so the tests that fail with this change are:

        magnetizedTOV (from IllinoisGRMHD)
        ML_CCZ4_sgw3d (from ML_CCZ4_Test)
        ML_CCZ4_sgw3d_rhs (from ML_CCZ4_Test)
        expressions (from TestPar)
    

    Line 16 of ML_CCZ4_sgw3d_rhs.par assigns 1.0 to advectLapse (which is an INT). ML_CCZ4_sgw3d does this on line 15. Line 133 of magnetizedTOV.par sets evolveA with 1.0, and it's a real. If we are going to make this change, we will need to update these thorns.

  9. Roland Haas reporter
    • removed comment

    I updated the test parfiles for all but TestPar. For that one I am not sure what int parameters is set to a real value. Steve: do you have a branch with the code you used to test this?

  10. Steven R. Brandt
    • removed comment

    Sorry, all I did was remove the calls to integerize() in Call.cc, then deleted the function itself. I'll handle it.

  11. Roland Haas reporter
    • removed comment

    Ah, I see. That was not quite what I had intended with the ticket. Removing integerize() will mean that:

    intparam = ceil(1.23)
    

    and

    intparam = 100/2
    

    fail, will it (since all arithmetic is done for doubles, yes?)?

    The ticket is only asking (see comment:3) to enforce that constants (literals whatever we want to call them) assigned to parameters must be acceptable as "int" to C. Having an implicit conversion from real to int is fine for expressions (with the check that the conversion is identical) since all our expression math is done using reals and not using ints (is it).

  12. Steven R. Brandt
    • removed comment

    The attached patch removes the integerize() function )which performs the conversion from real to int when that was safe). It also changes the return type of floor, ceil, and trunc to return type int (which allows expressions.par to work without integerize()).

  13. Steven R. Brandt
    • removed comment

    Roland, the integerize() function will only convert to integer if it can do so without loss of precision, i.e. 1.0 -> 1, 1.5 -> no conversion.

  14. Roland Haas reporter
    • removed comment

    I tested the patch with the attached parfile and it fails with:

    Activating thorn TestPar...Success -> active implementation TestPar
    WARNING level 0 from host 8992d193.ncsa.illinois.edu process 0
      while executing schedule bin (none), routine (no thorn)::(no routine)
      in thorn TestPar, file int.par:7:
      -> Invalid assignment: Attempting to set a variable of type INT with (REAL)10
    

    due to

    TestPar::int2[2] = 20*0.5
    

    Commenting out line 7 I get:

    WARNING level 0 from host 8992d193.ncsa.illinois.edu process 0
      while executing schedule bin (none), routine (no thorn)::(no routine)
      in thorn TestPar, file int.par:8:
      -> Invalid assignment: Attempting to set a variable of type INT with (REAL)10
    

    due to

    TestPar::int2[3] = 20/2.
    

    Commenting out line 8 I get

    WARNING level 0 from host 8992d193.ncsa.illinois.edu process 0
      while executing schedule bin (none), routine (no thorn)::(no routine)
      in thorn TestPar, file int.par:10:
      -> Invalid assignment: Attempting to set a variable of type INT with (REAL)20
    

    from

    TestPar::int2[5] = 20.
    

    which is ok but I think should output the exact string from the parfile and not the real value.

    On the other hand it accepted

    TestPar::int2[4] = 15/2
    

    which it should not.

    Finally commenting out line 10 I get no error yet looking at the output file there is:

    real2[0] = 0
    

    which is clearly wrong since it was set to trunc(1e300). This fails due to casting the result of trunc/ceil etc to int since int cannot hold all the integer values that a double can. Note that if we ever switch to 64 bit CCTK_INT (there have been proposals for that) then using CCTK_REAL to hold CCTK_INT is no longer feasible since the mantissa of a double is not wide enough to exactly represent all 64 bit integers.

    The patched version also has the strange effect that:

    TestPar::int2[1] = 20/2
    

    is allowed but

    TestPar::int2[2] = 20*0.5
    TestPar::int2[3] = 20/2.
    

    are forbidden.

    Note that it also does abort after the first error found rather than reporting all errors and then aborting.

  15. Steven R. Brandt
    • removed comment

    OK, so like many tickets, this one seems to be growing in scope. Why shouldn't the division of two integers yield an integer? It worked this way before the patch.

  16. Steven R. Brandt
    • removed comment

    As for reporting multiple errors, I'm not sure what the solution should be. Currently, I report errors with CCTK_Error(). That mechanism doesn't have the capability to report errors on multiple lines. It could be extended, or I could create some new special purpose error reporting tool--but that should be another ticket IMHO.

  17. Steven R. Brandt
    • removed comment

    I've updated the patch to deal with the trunc(1e300) issue. There are a number of ways this could be handled, but what I chose to do was to put integerize() back and have trunc() call it. This allows you to assign the output of trunc() to an int. I could leave it as a real and require int(trunc()) if that's what people want.

  18. Frank Löffler
    • removed comment

    I agree with Steve here. Let me summarize.

    We have the cases

    int/int without remainder
    

    I believe we should allow this, and make the result an integer. Assigning integers to reals takes care of real parameters.

    int/real
    real/int
    

    Those have to be reals. The question remains what to do with reals that could be represented as integers, in case this would be needed (e.g., variable assignment). We have this question in multiple places, so let me call it 'Q'.

    int/int with remainder
    

    This is a question up for debate. Possibilities are 1) Treat '/' as integer division, throwing away the remainder 2) Treat '/' as integer division, and making a non-zero remainder an error. 3) Treat '/' as usual division, making the result always a real. Question 'Q' becomes an issue.

    My thoughts: 1) This is the C/C++ way of handling this, but is dangerous. I would like 1/2 to mean 0.5, not 0. 2) This is better than 1), but I like 3) more. 3) I think we should be able to use 1/2 as 0.5. But this assumes we can treat certain reals as integers where necessary, or that we introduce a separate integer division operator for when this would be needed.

    'Q': If we would convert reals to integers where possible without loss, it would mean we can safely use 1/2 to mean 0.5, but we would get an error if we'd assign that to an integer parameter. At the same time, 2/2 could be use there.

    My personal verdict: I liked the way it was before: with automatic conversion where possible.

    One problem I could see is 'real' precision, e.g., 3*(1/3) not being exactly '1', and with that, not being converted to an integer.

    The question of implementing the 'where possible' correctly is a separate issue.

  19. Roland Haas reporter
    • removed comment

    Sorry, this all seems to have gotten a bit out of hand to me.

    I think that we really cannot change (again) how the parfiles act since we have to remain backwards compatible with old parfiles. Ideally this should have been the original parfile parser, yet when we introduced piraha we changed what parfiles are acceptable (specifically we disallowed some that were acceptable before). We should not be changing this yet again. In this light even my request to disallow "1.0" for an int is questionable since apparently even having this abilityaround for a single release had parameter files using it crop up in our test suites (though this may actually have been a change in type of the parameters in McLachlan from real to int due to the rewrite branch).

    Rather than extend the discussion even further, I'd like to suggest to leave it as it was in the last release. Would this be with all involved?

  20. Steven R. Brandt
    • removed comment

    My memory is fuzzy at this point, so I could be wrong, but I think Piraha converts 1.0 to 1 because the old parser allowed it. In any event, leaving it the same is fine with me.

  21. Roland Haas reporter
    • changed status to resolved
    • removed comment

    True, it started to accept 1.0 in c18cc9b (which is the pre-piraha expression parser). So the (unintended) change of allowing 1.0 for 1 is older (and due to me, hanging my head in shame) than I had thought.

    So for now I'd say to leave it as is rather than going down the rabbit hole of finding what the exact best behaviour is.

  22. Roland Haas reporter
    • removed comment

    Replying to [comment:20 sbrandt]:

    As for reporting multiple errors, I'm not sure what the solution should be. Currently, I report errors with CCTK_Error(). That mechanism doesn't have the capability to report errors on multiple lines. It could be extended, or I could create some new special purpose error reporting tool--but that should be another ticket IMHO.

    Reporting multiple errors is tricky and usually not "clean". What other code does in this situation is to make the "error" reports a level 1 warning (which according to the docs are mistakes that will most likely make the code produce the wrong answer) and keep track (in a global variable or similar) that errors were encountered. Then after the full parfile is parsed, it checks whether any errors were encountered and then aborts.

  23. Log in to comment