-
Notifications
You must be signed in to change notification settings - Fork 225
ThomasFermi: Fix Domain Decomposition #2007
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ThomasFermi: Fix Domain Decomposition #2007
Conversation
|
OK I get it. This structure is coming from the effect that fieldTmp can not receive fields to the guard to build a local volume without borders. It is only possible to send fields from the local guard to the border of the next GPU. |
|
Ah sure, yes. totally forgot about that. The border area is correct now but the guard is still not updated correctly with the next GPUs border... Instead of just adding the guard to the border we need to:
basically the same communication but one supercell more. offline discussion with @psychocoderHPC: we don't sent the full supercell for fields but just the number of cells needed (from guard) |
|
better but not yet perfect. with #2010 (first draft) we are now away from the unholy target and are straight on to the highway to hell: |
|
@psychocoderHPC and I just found we messed up a manual event dependency in our local changes creating a race condition during energy density creation for the last image. Pushing fixed update to this PR now. |
|
@psychocoderHPC @n01r done (pls add your review) :) |
n01r
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work! 🎇
One may say you even did the work of god here in excorcising the unholy!
| /** specify field to particle interpolation scheme | ||
| * | ||
| * @todo this needs to be done independently/twice if ion species (rho) and electron | ||
| * species (ene) are of different shape |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In principle one could add another todo here
If one were to use two different atomic species they would currently each do their own density and energy calculations. Yet they would have no knowledge of each other. But of course IPD effects do not care about the species and so it remains a technical detail to be solved on how to include that. 😃
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, that's a todo I have in my collisional methods, too xD
but let's keep it that way for the bugfix, feel free to add a .. note:: or .. warning:: to the user docs you write right now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure sure, that's a follow-up feature and I have a hunch that it requires some careful thinking. 😆
| density->template computeValue< CORE + BORDER, DensitySolver >(*srcSpecies, currentStep); | ||
| dc.releaseData( SrcSpecies::FrameType::getName() ); | ||
| EventTask densityEvent = density->asyncCommunication( __getTransactionEvent() ); | ||
| densityEvent += density->asyncCommunicationGather( densityEvent ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for me again: the first of the two lines above gets the event of the FieldTmp operation and the second line says that in addition to this event the results from the previous event in the BORDER region of the neigboring GPU have ton be communicated to the GUARD of this GPU. Only afterwards the density values for particles whose shape extends outside the BORDER region will be correct, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, this code had two bugs with regards to domain decomposition:
- the
BORDERwas not filled correct (with additions from neighbor-GPU particles that have shape contributions) and - since you access the density and energy density fields with respect to shape you also access the guard which was not created at all.
The first event, asyncCommunication builds a valid BORDER, the second event, asyncCommunicationGather builds a valid GUARD. Both APIs are also doxygen documented now since #2010
Adds communication to the `FieldTmp` slots to cure an unholy domain-decomposition bug in ThomasFermi ionization.
|
@psychocoderHPC pls merge me :) |
|
@n01r more of a coding-priest, spreading the word of bug fixing |
Adds the following changes to the `release-0.3.0` branch: - Multiple ionizers per species: `ionizer< ... >` became a sequence of `ionizers< ... >` ComputationalRadiationPhysics#1999 - Thomas-Fermi: - fixes in domain decomposition ComputationalRadiationPhysics#2007 - fixes compile issue (missing include) ComputationalRadiationPhysics#2003 - BSI models restructured ComputationalRadiationPhysics#2013 - ADK: fix effective principal quantum number `nEff` ComputationalRadiationPhysics#2011 - multiple ionization algorithms can be applied per species, e.g. cut-off barrier suppression ionization (BSI), probabilistic field ionization (ADK) and collisional ionization ComputationalRadiationPhysics#1999 - FieldTmp: Gather support to fill GUARD ComputationalRadiationPhysics#2009 - Fix undefined device variable in tilted laser pulse ComputationalRadiationPhysics#2017 - CompileTime Accessor: Type (::type) ComputationalRadiationPhysics#1998 - TBG Macros: Outdated Comment ComputationalRadiationPhysics#2004 Increases the state of the `0.3.0` release candidate to `rc3`.
Adds the following changes to the `release-0.3.0` branch: - Multiple ionizers per species: `ionizer< ... >` became a sequence of `ionizers< ... >` ComputationalRadiationPhysics#1999 - Thomas-Fermi: - fixes in domain decomposition ComputationalRadiationPhysics#2007 - fixes compile issue (missing include) ComputationalRadiationPhysics#2003 - BSI models restructured ComputationalRadiationPhysics#2013 - ADK: fix effective principal quantum number `nEff` ComputationalRadiationPhysics#2011 - multiple ionization algorithms can be applied per species, e.g. cut-off barrier suppression ionization (BSI), probabilistic field ionization (ADK) and collisional ionization ComputationalRadiationPhysics#1999 - FieldTmp: Gather support to fill GUARD ComputationalRadiationPhysics#2009 - Fix undefined device variable in tilted laser pulse ComputationalRadiationPhysics#2017 - CompileTime Accessor: Type (::type) ComputationalRadiationPhysics#1998 - TBG Macros: Outdated Comment ComputationalRadiationPhysics#2004 Increases the state of the `0.3.0` release candidate to `rc3`.


Adds communication to the
FieldTmpslots to cure an unholy domain-decomposition bug in Thomas-Fermi ionization. (Density and Energy in the a local GPU's border has contributions from the next GPU.) The full example shown below is given in #2008 but applies for all cases where Thomas-Fermi ionization is used.BSI+ADK+TF_dev

BSI+ADK

Unfortunately, there seems an other bug (maybe wrongly shifted or some caching issue?). But at least it's already less severe:
BSI+ADK+TF_PR

To Do
and does it affect the field ionization methods, too?