Provide access to CarpetLib timers

Issue #834 new
Ian Hinder created an issue

CarpetLib has several internal timers which contain useful information. I would like to make this more accessible to the user. Currently, these timers can be output by setting

CarpetLib::print_timestats_every = 1

and the timer output is written to files called

carpetlib-timing-statistics.NNNN.txt

These files are in a nonstandard format, contain lots of information, and are difficult to interpret. It is also not realistic to have this output enabled routinely in production simulations because there is one file per process, which leads to large numbers of output files when running with large numbers of processes. There is also currently no way to reduce the information into a min/max/average, which is what we really care about most of the time.

Cactus already has the ability to do the above if the timer values are stored in grid arrays. I propose that CarpetLib should define some grid arrays similar to those currently defined in Carpet for its timers. For example,

CCTK_REAL timing TYPE=array DIM=1 SIZE=1 DISTRIB=constant TAGS='checkpoint="no"'
{
  sent_bytes_count
  sent_bytes_per_second
  received_bytes_count
  received_bytes_per_second
  comm_time
} "Per-processor timing information"

sent_bytes_count should come from commit_send_space::isend. received_bytes_count should come from commstate::sizes_irecv. comm_time should come from commstate::step. (Aside: Erik mentioned that the hierarchy of these timer names is incorrect; this should be fixed.)

My aim is to be able to compare sent_bytes_per_second and received_bytes_per_second with the advertised bandwidth of the interconnect. For example, our own cluster has 5 GB/s full-duplex bandwidth. I would like to display reductions of these variables on standard output using CarpetIOBasic, and store these reductions in output files using CarpetIOScalar. This would allow me to see very easily whether the communication is making efficient use of the hardware.
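To make the intended comparison concrete, here is a rough Python sketch of reducing per-process rates to min/max/average and expressing each as a fraction of the advertised bandwidth. The function names and the sample rates are hypothetical and only for illustration; in practice the reduction would be done by the Cactus reduction operators, not by code like this. The 5 GB/s figure is taken as 5e9 bytes per second, which is an assumption about the units.

```python
def reduce_rates(rates):
    """Reduce a list of per-process rates (bytes/s) to min, max and average."""
    return {
        "minimum": min(rates),
        "maximum": max(rates),
        "average": sum(rates) / len(rates),
    }


def fraction_of_peak(rate, peak_bytes_per_second):
    """Express a measured rate as a fraction of the advertised bandwidth."""
    return rate / peak_bytes_per_second


# Advertised interconnect bandwidth: 5 GB/s, assumed to mean 5e9 bytes/s.
PEAK = 5.0e9

# Hypothetical sent_bytes_per_second values from three processes.
per_process_rates = [2.0e9, 2.5e9, 3.0e9]

summary = reduce_rates(per_process_rates)
fractions = {name: fraction_of_peak(value, PEAK) for name, value in summary.items()}

for name in ("minimum", "maximum", "average"):
    print(f"{name}: {summary[name]:.3e} B/s ({100.0 * fractions[name]:.1f}% of peak)")
```

A large gap between the minimum and maximum fractions would point to load imbalance in communication rather than a uniformly slow interconnect.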

The timers I mentioned above would tell us about the actual time to transmit data (or at least, as close as I can find), but do not include any overheads introduced by processing the data before giving it to MPI. We probably want to expose these overhead times as well.

When should these variables be updated from the CarpetLib timers? At what point in the schedule, and how frequently?

How should the rates (sent_bytes_per_second etc) be computed? Carpet uses a decaying average algorithm to compute its speeds. Does it make sense to do something similar here?
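As a starting point for discussion, one possible form of a decaying average is sketched below in Python. This is a generic exponentially weighted average and is not claimed to match what Carpet actually does; the class name, the `timescale` parameter and the choice of weighting by elapsed communication time are all assumptions made for illustration.

```python
import math


class DecayingRate:
    """Exponentially decaying average of a transfer rate in bytes/s.

    Hypothetical helper, not part of Carpet or CarpetLib.
    """

    def __init__(self, timescale):
        # Time in seconds over which old samples lose most of their weight.
        self.timescale = timescale
        self.rate = 0.0
        self.initialised = False

    def update(self, delta_bytes, delta_seconds):
        """Fold in the bytes moved and time spent since the last update."""
        if delta_seconds <= 0.0:
            # No communication time elapsed: keep the previous estimate.
            return self.rate
        sample = delta_bytes / delta_seconds
        if not self.initialised:
            # The first sample defines the initial estimate.
            self.rate = sample
            self.initialised = True
        else:
            # Longer intervals give the new sample more weight.
            weight = math.exp(-delta_seconds / self.timescale)
            self.rate = weight * self.rate + (1.0 - weight) * sample
        return self.rate


sent_rate = DecayingRate(timescale=10.0)
first = sent_rate.update(delta_bytes=100.0, delta_seconds=1.0)
second = sent_rate.update(delta_bytes=200.0, delta_seconds=1.0)
```

The deltas here would be differences of the accumulated byte count and communication time between two updates, so the answer to how often the variables are updated also determines how the rate is sampled.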

Does it make sense to separate sent_bytes_per_second and received_bytes_per_second, given that we only have the combined communication time available?

It would probably also be useful to measure latency. Would commstate::step:cnt be the correct number to measure, and if so, how should this be expressed?

Does it make sense to provide these values in CarpetLib, or should Carpet provide the infrastructure itself and get the information from the CarpetLib timers?
