Opened 6 years ago

Last modified 6 years ago

#970 new defect

CarpetLib::barriers fails with multipatch

Reported by: Roland Haas Owned by: Erik Schnetter
Priority: minor Milestone:
Component: Carpet Version:
Keywords: Cc:

Description

This happens during the initial storage allocation where there are mismatching barriers in dh::add and gdata::gdata. The underlying reason seems to be that Carpet/StorageCrease has a loop (schematically) around line 93 of Storage.cc.

{{
for (m = 0; m < maps; ++m) {
  new gf<T>   // calls dh::add
  arrdata.AT(group).AT(m).data.AT(var)->set_timelevels   // eventually calls gdata::gdata
}
}}

This causes a barrier error when one process owns a component on map 0 but another owns only a component on map 1, since in this case the first one will encounter the barriers as:

dh::add (map 0)
gdata::gdata (component on map 0)
dh::add (map 1)

while the other process sees:

dh::add (map 0)
gdata::gdata (component on map 0)
dh::add (map 1)

The actual error is then (where there are some extra printf() lines that I added):

INFO (Carpet): [tl=0] Starting initialisation
INFO (Carpet): [tl=0] GroupStorageIncrease
INFO (Carpet): [tl=0]   ADMBASE::SHIFT_STATE: increase to 1
dh::add added varindex 0: shift_state
CHECKPOINT: processor 16, file /work/00945/rhaas/Zelmani/arrangements/Carpet/CarpetLib/src/dh.cc, line 2176
Adding varindex 0: shift_state
INFO (Carpet): [tl=0]   ADMBASE::DTLAPSE_STATE: increase to 1
dh::add added varindex 1: dtlapse_state
CHECKPOINT: processor 16, file /work/00945/rhaas/Zelmani/arrangements/Carpet/CarpetLib/src/dh.cc, line 2176
Adding varindex 1: dtlapse_state
INFO (Carpet): [tl=0]   ADMBASE::DTSHIFT_STATE: increase to 1
dh::add added varindex 2: dtshift_state
CHECKPOINT: processor 16, file /work/00945/rhaas/Zelmani/arrangements/Carpet/CarpetLib/src/dh.cc, line 2176
Adding varindex 2: dtshift_state
INFO (Carpet): [tl=0]   ADMBASE::LAPSE: increase to 1
dh::add added varindex 15: alp
CHECKPOINT: processor 16, file /work/00945/rhaas/Zelmani/arrangements/Carpet/CarpetLib/src/dh.cc, line 2176
dh::add added varindex 15: alp
CHECKPOINT: processor 16, file /work/00945/rhaas/Zelmani/arrangements/Carpet/CarpetLib/src/dh.cc, line 2176
WARNING level 0 in thorn CarpetLib processor 16 host c305-212.ls4.tacc.utexas.edu
  (line 251 of /work/00945/rhaas/Zelmani/arrangements/Carpet/CarpetLib/src/dist.cc):
  -> Wrong id for Barrier "CarpetLib::dist::checkpoint": expected 506880075d, found 783988953d

This looks like something that is rather hard to fix in general for little benefit (i.e. it affects only debugging runs with multipatch). Should this even be reported (if only so that there is official notice that this is known behaviour)? Should the fix be just a warning if Carpet encounters this situation?
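
For readers unfamiliar with the check behind the "Wrong id" message above, the idea can be sketched without MPI as follows. This is only an illustration under assumptions: the names barrier_id and first_wrong_rank are hypothetical, the hash is arbitrary, and the real check in dist.cc may derive and compare its ids differently.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical id for a barrier call site, derived from its file name and
// line number. The real id computation in CarpetLib may differ.
std::size_t barrier_id(const std::string& file, int line) {
  return std::hash<std::string>{}(file + ":" + std::to_string(line));
}

// Given the id that each rank brought to the same barrier, return the first
// rank whose id differs from rank 0's, or -1 if all ranks agree.
int first_wrong_rank(const std::vector<std::size_t>& ids) {
  for (std::size_t r = 1; r < ids.size(); ++r) {
    if (ids[r] != ids[0]) return static_cast<int>(r);
  }
  return -1;
}
```

If one rank arrives at the barrier from a different call site than the others, its id differs and the check fails, which is the situation reported in the log.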

Attachments (0)

Change History (2)

comment:1 in reply to:  description Changed 6 years ago by Roland Haas

Replying to rhaas:

> This happens during the initial storage allocation where there are mismatching barriers in dh::add and ...
>
> This causes a barrier error when one process owns a component on map 0 but another owns only a component on map 1, since in this case the first one will encounter the barriers as:

dh::add (map 0)
gdata::gdata (component on map 0)
dh::add (map 1)

while the other process sees:

dh::add (map 0)
gdata::gdata (component on map 0)
dh::add (map 1)

This is wrong (since this behaviour would actually work). The second process sees:

dh::add (map 0)
dh::add (map 1)
gdata::gdata (component on map 1)
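
The mismatch can be reproduced with a small stand-alone sketch that needs neither Carpet nor MPI. It assumes, as the sequences above suggest, that every process passes the dh::add barrier once per map while the gdata::gdata barrier is reached only for maps on which the process owns a component. The function names barrier_sequence and first_mismatch are hypothetical and not part of Carpet.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Barrier labels one process passes through in the schematic loop over maps.
// Assumption: dh::add is hit for every map, gdata::gdata only for maps on
// which this process owns a component.
std::vector<std::string> barrier_sequence(const std::vector<bool>& owns_on_map) {
  std::vector<std::string> seq;
  for (std::size_t m = 0; m < owns_on_map.size(); ++m) {
    seq.push_back("dh::add (map " + std::to_string(m) + ")");
    if (owns_on_map[m]) {
      seq.push_back("gdata::gdata (component on map " + std::to_string(m) + ")");
    }
  }
  return seq;
}

// Index of the first barrier at which two processes disagree,
// or -1 if both sequences are identical.
int first_mismatch(const std::vector<std::string>& a,
                   const std::vector<std::string>& b) {
  const std::size_t n = a.size() < b.size() ? a.size() : b.size();
  for (std::size_t i = 0; i < n; ++i) {
    if (a[i] != b[i]) return static_cast<int>(i);
  }
  return a.size() == b.size() ? -1 : static_cast<int>(n);
}
```

Calling barrier_sequence({true, false}) and barrier_sequence({false, true}) gives the two sequences listed above, and first_mismatch on them returns index 1: the first process is in gdata::gdata for map 0 while the second is already in dh::add for map 1.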

comment:2 Changed 6 years ago by Erik Schnetter

The code and the parameter enabling these barriers should be annotated, so that people know this is the expected behaviour.
