Output timer tree as XML

Issue #791 open
Ian Hinder created an issue

The attached patch to Carpet adds a parameter (off by default) which outputs the timer tree from each process to an XML file in the output directory at the end of the run. The output looks like this:

<timer name = "main"> 7.7702 
  <timer name = "CallFunction"> 0.000241 
    <timer name = "thorns"> 0.000233 
      <timer name = "CaKernel_FreeDevMem"> 0.000221 
        <timer name = "PostCall"> 2e-06 </timer>
        <timer name = "PreCall"> 1e-06 </timer>
      </timer>
    </timer>
  </timer>
  <timer name = "CarpetStartup"> 0.00982 
    <timer name = "AllocateGridHierarchy"> 5e-06 </timer>

An alternative schema would be to have the name and timer value as subelements; i.e.

<timer>
  <name>main</name>
  <value>7.7702</value>
  <children>
    <timer>
      <name>CallFunction</name>
      <value<0.000241</name>
  ...
</timer>

The <children> tag might not be necessary, but might make it easier to parse. Having one file per process is not ideal; we would like to add reductions across processes to give min, max, average, standard deviation etc, also for the standard output display.

OK to commit as a work-in-progress? We can change the schema later if it turns out to be easier to deal with.

Keyword:

Comments (15)

  1. Erik Schnetter
    • removed comment

    This only outputs the xml timers at the end of the simulation, and if it terminates successfully. Often, this is not the case, and the timer output is then lost. I suggest to output the xml timers also at other occasions, same as all other timer output. Since it may not be possible to append, I suggest to output to a temporary file and rename it when output succeeds, as we do with checkpoint files. A large out_every will then, in effect, only output at shutdown. Implementing this may be easiest if the timing output is moved to TimerReport.

    FWIW, I like the name/value pairs without the <children> tag. As you say, this can be adapted later.

    Okay to commit.

    You may need to ensure that you have sufficient accuracy for output; by default, C++ rounds to 6 digits. A call to setprecision may help.

    I prefer output file names to have equal length, filling process numbers with zeros, as this sorts them better with ls.

  2. Ian Hinder reporter
    • removed comment

    Yes, I thought about the fact that output could be lost, and the implications for appending and renaming. I agree with what you say, and indeed this can be added later.

    Regarding the schema, I have found practically that it is useful to have lists of elements enclosed by a single common element. That means I can ask for all the members of <children>. On the other hand, I probably should only look at elements that I know, to make the format extensible, so in the end maybe the <children> tag is not necessary. I will have to restrict to elements of the correct tag anyway. I will likely change the schema in the near future because at the moment it's a little annoying to process.

    Yes, the output accuracy should be fixed.

    Regarding the output filename, I used to agree, but now I don't :) Sorting with ls is only one use case. A more useful thing to do is to programatically load the timers via a loop, or from a specific process, in which case you are given an integer and want to find the filename. At the moment, for the carpetlib timers, I have to load all the filenames, parse all the 0004 etc into integers, and then select the filename which corresponds to the integer I want. Having the actual process number in the filename would make this much more straightforward. The fact that ls doesn't have the ability to sort filenames with numerical components into numerical order shouldn't force us to adopt an inelegant solution. Another problem is that the number of leading zeroes is hard-coded. Anyway, we eventually shouldn't have per-process output files anyway; the data should all be sent to the root process and added to a single XML file.

    The original patch was committed as http://www.carpetcode.org/hg/carpet/index.cgi/rev/27ddee06ff1d.

    I have another patch relating to this: http://www.carpetcode.org/hg/carpet/index.cgi/rev/e0e84d10eac8

    This second patch allows you to include the historical values of the timers (i.e. every time they are stopped) in the XML file. This allows you to determine if the run times for each routine are consistent, or if they "jitter". This was very helpful for optimising performance. There is no information about the coordinate time or iteration number, as this is not available to the TimerNode class. I'm not sure if it's possible to get that. A Carpet function get_current_iteration() would be helpful. Is there a global notion of current iteration? We would still have multiple entries for each iteration (e.g. for MoL substeps). Each time is stored in a vector<double> every time the timer is stopped. Could this be a problem? Might it grow large? Maybe this could be stored conditionally on the parameter being set?

  3. Erik Schnetter
    • removed comment

    The way to access the current iteration or coordinate is via cctkGH. One could store this in a global variable when a function is called from the scheduler, and access it from there, but this does not seem elegant -- could cctkGH instead be passed to the timer?

    Each time is stored in a vector<double> every time the timer is stopped. Could this be a problem?

    Yes, definitely. What if there are hundreds of timers and tens of thousands of iterations? This would require many megabytes of storage. Some timers may even be called many times per iteration; we cannot assume that a timer is only called once per iteration. This also makes calling timers very expensive, as they may have to allocate storage. This needs to be disabled unless explicitly enabled by a parameter.

  4. Ian Hinder reporter
    • removed comment

    I see the following potential tasks:

    1. Change the XML schema (quick); 2. Modify the history patch to only store historical values if enabled by a parameter (quick); 3. Possibly add iteration information to the timers (not clear what to do); 4. Possibly limit the amount of history stored by storing it only every so many iterations (depends on 3); 5. Output the XML file regularly, not just in shutdown (should be OK); 6. Send all the timer values to the root processor and merge them into one XML file (hard); 7. Add reductions of the timer values to the XML file (depends on 6).

    What aspect of this should be implemented by ET_2012_05? I won't have time to do all of this before the freeze, and development work is supposed to happen at the start of a release cycle, not the end. I can probably do 1 and 2 before the freeze. In the release notes, we could indicate that this is a new feature which is still under development, and some aspects of it might change.

  5. Erik Schnetter
    • removed comment

    (1) Yes, we can change the XML schema now. We don't need to warn users that things may change. (2) Yes, please.

  6. Ian Hinder reporter
    • removed comment

    Should the milestone be bumped, or removed? I wonder what our logic should be for deciding whether a particular issue is assigned a milestone. We have a large number of tickets without milestones.

  7. Frank Löffler
    • removed comment

    Someone needs to decide whether that bug or feature should be important for the next release. That can be the reporter, or maybe later a developer. We don't have a hard policy on this, and for now we don't need to I believe.

    For this particular issue I would say that it is up to you. It is clear that this will not be part of the next release, so one of the two things should happen.

  8. Log in to comment