allow calculations in parameter assignments in parameter files

Issue #798 closed

Roland Haas created an issue 2012-04-11

The attached patch extends the parameter file parser and SetParameter routine to allow things like:

foo::bar = "=2*sin(foo::baz)"

i.e. arithmetic and access to already-set parameters. The new behaviour is triggered if the parameter string (for real, boolean and int parameters) starts with an "=" sign. Otherwise it branches off into the old code.

The largest change is actually in the expression parser, which has been extended to handle e.g. exponential notation and negations. It now uses a state machine to parse its input.

I have been using this for a while and had no ill side effects (but then really only accumulators are currently using the expression parser).

Eventually I'd like to make the schedule IF/WHILE statements accept expressions using both parameters and grid scalars as well. This would be useful to simplify scheduling things only in the last MoL substep or in all but the last substep.

Comments (30)

Roland Haas reporter
- changed status to open
- removed comment
- 2012-04-11T11:36:08+00:00
Erik Schnetter
- removed comment
I believe it is the wrong approach to extend Cactus with home-grown syntaxes and parsers. I wish there had been faster progress with attaching Python or Lua to Cactus, so that we could use this instead.

I do not see why there needs to be an equals sign at the beginning. There is presumably no difference between "x = 2" and "x = =2", and the statement "x = 2*y" is currently ill-defined. If anything, it would give the parser more exposure, making it correct more quickly.

The patch is good, and adds functionality that is very useful and currently missing. Please apply, with the suggestion above.

Tongue-in-cheek I want to suggest to add another feature, namely being able to call a function. This will then allow us to run a Cactus simulation to determine the value of a parameter...
- 2012-04-11T12:07:09+00:00
Ian Hinder
- removed comment
Another approach is to use parameter file scripts. These are explained in the SimFactory User Guide (http://simfactory.org/info/documentation/userguide/use.html#parameter-file-scripts). SimFactory supports these directly.
- 2012-04-11T12:31:37+00:00
Roland Haas reporter
- removed comment
I added the "=" marker because right now a number with a "=" would be invalid, which means that I can guarantee that old parameter files are not affected (saves on flame emails). It is also a bit faster, since only a leading "=" triggers the construction of a parse tree, recursive function calls, etc.

Attached please find an updated patch that no longer requires an extra "=": it first tries feeding its input to strtol/atoi/SetBoolean and, if that fails, tries the expression parser.
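
The number-first strategy described above could look roughly like the following. This is only a sketch, not the code from the patch: `set_int_parameter`, `expr_eval_fn` and `stub_eval` are hypothetical names, and the stub evaluator (which only understands `N*M`) stands in for the real expression parser.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical evaluator interface: returns 0 on success, -1 on failure. */
typedef int (*expr_eval_fn)(const char *text, long *result);

/* Stand-in for the expression parser; it only handles "N*M".
 * The counter shows how often the expensive path is taken. */
static int stub_calls = 0;
static int stub_eval(const char *text, long *result)
{
    long a, b;
    stub_calls++;
    if (sscanf(text, "%ld*%ld", &a, &b) == 2) {
        *result = a * b;
        return 0;
    }
    return -1;
}

/* Cheap path first: accept the string only if strtol consumes all of it.
 * Anything else is handed to the evaluator. */
static int set_int_parameter(const char *text, expr_eval_fn eval, long *result)
{
    char *end = NULL;
    long value;

    assert(text != NULL && eval != NULL && result != NULL);
    errno = 0;
    value = strtol(text, &end, 10);
    if (end != text && *end == '\0' && errno == 0) {
        *result = value;       /* plain number, evaluator never invoked */
        return 0;
    }
    return eval(text, result); /* e.g. "2*3" */
}
```

With this layout a plain value such as `42` never reaches the evaluator, while `2*3` fails the `strtol` check (the scan stops at `*`) and falls through to it.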

Python support would be nice, yes. Something that's easier to add support for is Tcl (at least I have added Tcl to some code in the past but never python).

I can add a "system()" function easily that would use the result of a system() call :-P.

Ian: A third option is to use simfactory's @(...)@ syntax.

Do we still want the patch (it adds a third option for parameter dependencies)?

The parser itself would still be nice for grid functions.
- 2012-04-11T13:07:06+00:00
Frank Löffler
- removed comment
One problem with system() is that this seems to break several MPI implementations. We (still) have problems with this on QueenBee (because it is so old), and OpenMPI also has / can have problems with system() calls: it makes the code hang. I had to switch back from using OpenMPI to Mpich2 because of this.

See http://www.open-mpi.org/faq/?category=openfabrics#ofa-fork and http://www.open-mpi.org/faq/?category=tuning#fork-warning

The suggested `ompi_info --param btl openib --parsable | grep have_fork_support:value | cut -d: -f7` does not work for me, as Cactus simulations using openmpi still hang after system(), while the same compiled against mpich2 work.
- 2012-04-11T13:17:08+00:00
Roland Haas reporter
- removed comment
I was not really serious about adding system(). I find it rather horrible, to tell the truth :-). For this kind of thing I would think using parameter scripts (e.g. perl is very well suited to these unstructured things, being an unstructured language itself :-) ) is best, unless you really need access to things that only exist during runtime. I currently only support numbers in the expression parser, e.g. something like `whoami` or `hostname` does not work.
- 2012-04-11T13:22:01+00:00
Erik Schnetter
- removed comment
I would have hoped that explicitly testing for numbers (atoi etc.) is not necessary, but that the expression parser handles these correctly automatically.

I still think that this is a useful feature to have. Please apply.
- 2012-04-11T14:23:26+00:00
Roland Haas reporter
- removed comment
The parser handles them fine (that was one of the tests). Testing is just an optimization, since going into the parser is very expensive (mallocs, string parsing, etc.). It's also what SpEC does :-).

I'll apply it.
- 2012-04-11T15:50:14+00:00
Roland Haas reporter
- changed status to resolved
- removed comment
- 2012-04-11T18:58:49+00:00
Roland Haas reporter
- changed status to open
- removed comment
- 2012-04-18T01:37:15+00:00
Roland Haas reporter
- changed status to open
- removed comment
the patch in expr2.patch adds some missing features:
- recognizes "nan" and "inf"
- allows constructs like 1*-1 (before this had to be written as "1*(-1)")
- add remainder operator '%' (implemented via fmod)
- some cleanup in comments and the test code
- 2012-04-18T03:18:15+00:00
Erik Schnetter
- removed comment
A few questions:

What about fmod for integer operations? Why don't they use % directly?

You use ^ for exponentiation; this is maybe confusing in a C-like syntax. Can you use `**` instead?

Are two-argument functions supported? If so, fmod and pow could be used directly.

Are there test cases for all of this? I imagine a parameter file that exercises most of this, including some corner cases. A new thorn in the CactusTest arrangement would be appropriate.

Is -inf recognized?

I see that ! is supported. What about & | ^ && || etc.?

What are the _ and @ operators in the table?

Note: I am not making requests here (except for test cases). I'd like to know what you think, and open issues can go into a TODO list.
- 2012-04-18T08:03:43+00:00
Roland Haas reporter
- removed comment
The expression code uses a macro to actually do the math, so I have to use the same operator/function for both integers and reals. The way that it is written right now, it ends up being:

`int retval = fmod(a,b)` if both a and b are of integer type. I think this is identical to `a%b`, since C will automatically truncate the double to an integer. It actually has the added benefit that it does not have the implementation-defined behaviour for b<0 that `a%b` has. If either a or b is real, then retval is a double value.

I could use `**` for exponentiation I think, but it might require a bit of rewriting. Right now I don't have to distinguish between `*` and `**`, all operators are uniquely determined by a single character (even `&&` and `||`). I don't believe that this assumption is hard-coded anywhere but would have to test. Probably straightforward.

The way that the macros are written, they don't support two-argument functions. All functions are treated as unary operators. Adding two argument functions will require changing the reverse polish notation generator and evaluator since the meaning of parenthesis changes depending on whether they follow a two-argument function or not (right now there is no difference between parenthesis around function arguments and just grouping ones).

-inf is supported (it ends up as -(inf) I think since I only group the - sign with numbers).

No bitwise operations are supported (mostly because the macro would balk at `((double) a) & ((double) b)`); `!` is the boolean not. `&&` and `||` are supported (and documented :-), even though the majority of this functionality existed in Cactus before but was only used for the accumulator parameters).

I use '_' and '@' in the table since I need to encode each operator as a string for the RPN parser (which translates them into the OP_XXX codes). '_' is the unary '-' and '@' is the unary '+'. I cannot leave them as '-' and '+' since the RPN generator looks only at the token string, not at context. Having the replacement the same length as the original makes life simpler (this is the reason for the horrible use of a hard-coded state '10' in line 594).
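
As an illustration of the encoding, a context-aware pre-pass could rewrite unary signs before the RPN generator sees them. This is a minimal sketch under assumptions, not the state machine from the patch: `mark_unary_signs` is a hypothetical name, the set of characters treated as "operator before a sign" is a guess, and exponent notation such as `1e-5` is not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copy `in` to `out`, replacing unary '-' with '_' and unary '+' with '@'.
 * A sign counts as unary when it starts the expression or follows another
 * operator or an opening parenthesis. `out` must hold strlen(in) + 1 bytes;
 * the replacement keeps the length unchanged. */
static void mark_unary_signs(const char *in, char *out)
{
    char prev = '\0'; /* last non-blank character emitted */
    size_t i;

    assert(in != NULL && out != NULL);
    for (i = 0; in[i] != '\0'; i++) {
        char c = in[i];
        int unary_context =
            (prev == '\0') || (strchr("+-*/%^(_@<>=!&|", prev) != NULL);

        if (c == '-' && unary_context)
            c = '_';
        else if (c == '+' && unary_context)
            c = '@';
        out[i] = c;
        if (!isspace((unsigned char)c))
            prev = c;
    }
    out[i] = '\0';
}
```

With this pass, `1*-1` becomes `1*_1`, so a generator that only looks at token strings can tell the two meanings of `-` apart.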

I like the suggestions and am impressed you actually waded through the code.

So, in order of time to implement, from little work to a bit more work:

1. `&` and `|`, if we can live with a cast to (long int) for the doubles
2. `^` -> `**`; this would free `^` for the bitwise exclusive or
3. test cases
4. two-argument functions; requires some changes to parenthesis handling (different behaviour after name or after operator), and likely some new type of binary operator and code to handle it in the RPN generator.

I am sure there are fully featured evaluators out there (in fact GNU libmatheval comes to mind, which even features derivatives), though that might be overkill.
- 2012-04-18T08:34:25+00:00
Erik Schnetter
- removed comment
As much as I dislike the sign convention for %, I think it is important to be consistent with C (and Fortran), and to be consistent between / and %.

Here is an idea:
- define a function `CCTK_INT imod(CCTK_INT, CCTK_INT)` that wraps `%`
- define a macro `isint(T) ((T)0.1==(T)0.0)`
- use `isint(x) ? imod(x,y) : fmod(x,y)`
- 2012-04-18T09:04:05+00:00
Roland Haas reporter
- removed comment
sigh. Error checking is hard and humans have a funny idea of what a logical syntax for math is. Anyhow attached please find a patch that makes things a bit easier.

1. somewhat useful error messages
2. a test suite
3. some fixes in the priorities of operators (should now be essentially as in C, though associativity is likely different, e.g. for `**`)
4. properly handles multi-char operators now
5. `**` for power
6. handles nan/inf/-inf/-nan
7. fixes a bug where the first state transition happened twice (is an issue for tokens that may not repeat)
8. I've left fmod in for now (forgot to rewrite the macro to take a type as argument as well, have to do some real work now).

Also, what I wanted to say in the last post is that the behaviour of '%' is implementation dependent, but fmod() is not, and for this reason I would prefer fmod() with integer arguments over '%'.
- 2012-04-18T18:07:50+00:00
Erik Schnetter
- removed status
- removed comment
With the test suite okay to apply.

More nagging:
- Is the comparison operator really = and not == ?
- The behaviour of % is now fixed (as of C99), but used to be implementation defined.
- Issues such as % vs. fmod or the associativity for `**` should be documented, probably with a caveat that we may change this in the future.
- I see that the order in which the parameters are given does now matter. That is new -- the documentation should warn about this, and that the behaviour may change. (We may want to disallow accessing a parameter that is set at a later time.)

Thanks for all the work!
- 2012-04-18T18:14:54+00:00
Roland Haas reporter
- changed status to resolved
- removed comment
Applied. Changed the comparison operators to '==' and '!='. Documented the associativity of exponentiation, the fact that '%' is used for both float and integer values and maps to fmod(), and that the order of parameters matters (it already said "already set parameter" in the documentation).

I'll leave the integer types for '%' for "future work". As well as two argument functions.
- 2012-04-18T19:22:13+00:00
Frank Löffler
- changed status to open
- removed comment
I don't really like the notion of the ordering being important. Would it be a lot of work to first skip parameters which depend on values of other parameters and then continue assigning until all parameters are set or some are left which should produce an error (which would catch loops for instance)?
- 2012-04-19T10:18:12+00:00
Erik Schnetter
- removed comment
This issue is not new, it already existed for accumulator parameters. I suggest a separate ticket for this.
- 2012-04-19T10:35:23+00:00
Frank Löffler
- removed comment
Accumulator parameters are not an issue because addition is commutative and setting accumulator parameters directly is not possible. Aside from the ActiveThorns issue, nothing in parameter files was so far dependent on the order (and we should resolve that once and for all as well, IMHO).
- 2012-04-19T10:48:21+00:00
Erik Schnetter
- removed comment
Accumulator parameters can have arbitrary expressions, not just "x+y". These expressions are not necessarily commutative or associative.
- 2012-04-19T10:52:14+00:00
Frank Löffler
- removed comment
Whatever these expressions result in will be summed up by Cactus. This summation is commutative. Can these expressions themselves depend on other parameters?
- 2012-04-19T11:04:45+00:00
Roland Haas reporter
- removed comment
Nono, not just summed. Arbitrary expressions (see UsersGuide section D2.3.2):

· ACCUMULATOR specifies that this is an accumulator parameter. Such parameters cannot be set directly, but are set by other parameters who specify this one as an ACCUMULATOR-BASE. The expression is a two-parameter arithmetical expression of x and y. Setting the parameter consists of evaluating this expression successively, with x being the current value of the parameter (at the first iteration this is the default value), and y the value of the setting parameter. This procedure is repeated, starting from the default value of the parameter, each time one of the setting parameters changes.

Making this not order dependent would require changes to how Cactus parses parameter files. Non-trivial ones at that, since you have to track parameter dependencies of some sort (assume A refers to B, which refers to C). The easiest way might be to store the initialization as a string and mark a parameter as "not initialized", then on first access of the value via ParameterGet() check this flag and evaluate the expression if needed. Not sure if that also catches DECLARE_PARAMETERS or if that one uses a different mechanism to access values (in FORTRAN it's just a common data area, so something _is_ different).
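
The "store the string, evaluate on first access" idea could be sketched as below. None of these names exist in Cactus: `lazy_param`, `lazy_get` and `stub_eval` are hypothetical. The stub evaluator only parses a number, where a real one would call the expression parser and look up other parameters. The extra "evaluating" state is an addition to the idea above, meant to catch cyclic dependencies.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { PARAM_UNSET, PARAM_EVALUATING, PARAM_READY } param_state;

typedef struct {
    const char *expr;   /* initializer text as written in the parameter file */
    param_state state;  /* the "not initialized" flag, plus a cycle marker */
    double value;       /* valid only when state == PARAM_READY */
} lazy_param;

typedef double (*eval_fn)(const char *expr);

/* Stand-in evaluator that counts how often it runs. */
static int eval_calls = 0;
static double stub_eval(const char *expr)
{
    eval_calls++;
    return strtod(expr, NULL);
}

/* Returns 0 and stores the value in *out, or -1 if the parameter is
 * reached again while its own expression is still being evaluated. */
static int lazy_get(lazy_param *p, eval_fn eval, double *out)
{
    assert(p != NULL && eval != NULL && out != NULL);
    if (p->state == PARAM_EVALUATING)
        return -1;
    if (p->state == PARAM_UNSET) {
        p->state = PARAM_EVALUATING;
        p->value = eval(p->expr);
        p->state = PARAM_READY;
    }
    *out = p->value;
    return 0;
}
```

The expression is evaluated at most once per parameter, and only when its value is first requested, so the position of the assignment in the file no longer matters in this model.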

Note that I am emphatically not volunteering to implement this. :-)
- 2012-04-19T11:20:26+00:00
Erik Schnetter
- removed comment
It would be easier to keep track of which parameters are read and written at what time. This will catch errors where the result depends on the order.

One would keep track of which variables have been read and written while interpreting the parameter file. Writing to a parameter that has already been read would be flagged as error.
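
A minimal sketch of such bookkeeping, assuming a small fixed-size table and hypothetical function names (`note_read`, `note_write`). The table stores the caller's pointers, so the names must outlive it, and reads beyond `MAX_TRACKED` are silently dropped in this simplified version.

```c
#include <assert.h>
#include <string.h>

#define MAX_TRACKED 64

/* Names of parameters that an expression has read so far. */
static const char *read_names[MAX_TRACKED];
static int n_read = 0;

static int was_read(const char *name)
{
    int i;
    for (i = 0; i < n_read; i++)
        if (strcmp(read_names[i], name) == 0)
            return 1;
    return 0;
}

/* Record that an expression accessed `name`. */
static void note_read(const char *name)
{
    assert(name != NULL);
    if (!was_read(name) && n_read < MAX_TRACKED)
        read_names[n_read++] = name;
}

/* Returns 0 if the write is fine, or -1 if `name` has already been read,
 * meaning an earlier expression depended on the order of the file. */
static int note_write(const char *name)
{
    assert(name != NULL);
    return was_read(name) ? -1 : 0;
}
```

The parameter-file interpreter would call `note_read()` for every parameter an expression references and `note_write()` before every assignment, reporting an error when the latter returns `-1`.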
- 2012-04-19T11:23:00+00:00
Frank Löffler
- removed comment
Right. The usage with (x+y) is so common that I didn't think of other possibilities. However, the only other usage I found is ((x>y)*x+!(x>y)*y) - essentially max(x,y) - which is also commutative. I don't think we should really support non-commutative expressions here, this was probably never intended. But you are right, this is beyond the scope of this ticket.
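
To make the commutativity point concrete, the accumulator procedure can be sketched as a left fold over the setting parameters. `accumulate` and the three expression functions are hypothetical; `accum_sum` and `accum_max` mirror the two expressions mentioned above, and `accum_noncomm` is a made-up example of an expression whose result depends on the order.

```c
#include <assert.h>
#include <stddef.h>

/* A two-argument accumulator expression f(x, y). */
typedef double (*accum_fn)(double x, double y);

static double accum_sum(double x, double y) { return x + y; }
static double accum_max(double x, double y) { return (x > y) * x + !(x > y) * y; }
static double accum_noncomm(double x, double y) { return 2.0 * x + y; }

/* Start from the default value and apply f once per setting parameter,
 * in the order given. */
static double accumulate(accum_fn f, double default_value,
                         const double *setters, size_t n)
{
    double x = default_value;
    size_t i;

    assert(f != NULL && (setters != NULL || n == 0));
    for (i = 0; i < n; i++)
        x = f(x, setters[i]);
    return x;
}
```

With a default of 0 and setters {1, 2} versus {2, 1}, the sum and max expressions give the same result in both orders, while `2*x+y` gives 4 in one order and 5 in the other.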

What would be important here is: should entries within parameter files be order-independent? I would argue that this would be a good idea.

Concerning an implementation: Yes, the parameter file would need to be parsed multiple times: first for the active thorns and at least once more for the actual parameters (~~#705~~). And yes, Cactus would need to track dependencies.

> Writing to a parameter that has already been read would be flagged as error.

Consider:

A::a = "=B::b"
B::b = 5

When parsing this line-by-line, A::a would get the default value of B::b (let us assume 4), would be marked as being read, and the second line would throw an error. But what we actually want is that A::a = 5, not 4, and no error.
- 2012-04-19T11:46:26+00:00
Roland Haas reporter
- removed comment
TODO:
- allow expressions in default parameter values (and ranges?). This requires changing perl code.
- track access to unset parameters in parameter file, or make parameter file position independent (separate issue)
- if there is a difference between C99 '%' and (int)fmod(int,int): implement special code for '%'
- bitwise operators
- 2012-04-19T20:22:34+00:00
Frank Löffler
- removed comment
While I agree that a lot of operators might be nice we should not get ahead of ourselves and spend a lot of time on operators that might be difficult to implement and are then not used.
- 2012-04-19T21:14:22+00:00
Erik Schnetter
- removed comment
I believe this patch has been applied, and the ticket should thus be closed.

To request additional functionality I would open another ticket (and state why this functionality is important). If this is merely a collection of ideas, then I would add a comment to the source code (no patch review required for this).
- 2012-05-05T09:21:03+00:00
Roland Haas reporter
- changed status to resolved
- removed comment
- 2012-05-11T09:54:45+00:00
Roland Haas reporter
- edited description
- changed status to closed
- 2019-02-21T20:21:09+00:00

Assignee: –

Type: enhancement

Priority: minor

Status: closed

Component: Cactus

Milestone: –

Version: –

Votes: 0

Watchers: 0