Member

@ax3l ax3l commented Dec 14, 2016

Allows using multiple GridBuffers ("slots") in FieldTmp for all operations that were done with it so far.

Required for, e.g., T_e and n_e in Thomas-Fermi Ionization.

Current control: the number of scalar temporary fields can be set in memory.param via:

/** number of scalar fields that are reserved as temporary fields */
constexpr uint32_t fieldTmpNumSlots = 1;
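As a rough illustration of what a slot count like this controls, here is a minimal, self-contained sketch that allocates one independent scalar buffer per slot. `ScalarSlot` and `allocateSlots` are hypothetical stand-ins invented for this example, not PIConGPU's `FieldTmp` or `GridBuffer`, and the value `2` is only an assumed setting.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed user setting, mirroring the parameter in memory.param.
constexpr uint32_t fieldTmpNumSlots = 2;

// Hypothetical scalar field: one float per cell (not PIConGPU's GridBuffer).
struct ScalarSlot
{
    explicit ScalarSlot( std::size_t numCells ) : values( numCells, 0.0f )
    {
    }

    // set every cell of this slot to the same value
    void setValue( float v )
    {
        for( float& x : values )
            x = v;
    }

    std::vector< float > values;
};

// Allocate one independent buffer per requested slot.
inline std::vector< ScalarSlot > allocateSlots( std::size_t numCells )
{
    std::vector< ScalarSlot > slots;
    slots.reserve( fieldTmpNumSlots );
    for( uint32_t i = 0; i < fieldTmpNumSlots; ++i )
        slots.emplace_back( numCells );
    return slots;
}
```

Each slot owns its own storage, so writing to slot 0 leaves slot 1 untouched, and the memory cost grows linearly with the slot count.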

Still throws dozens of __host__ __device__ warnings but might compile already.

Thanks to @psychocoderHPC for pair programming support.

Usage

/* I want to access FieldTmp! */
DataConnector &dc = Environment<>::get().DataConnector();

/* And while we are defining this code, just saying: I will use two (2) slots of it,
 *   <<So, do you copy?! - ... Brzzzz >> */
PMACC_CASSERT_MSG(
    _please_allocate_at_least_two_FieldTmp_slots_in_memory_param,
    fieldTmpNumSlots >= 2
);
 
/* load each (scalar field) slot without copying data to host */
FieldTmp& density = dc.getData< FieldTmp >( FieldTmp::getUniqueId( 0 ), true );
FieldTmp& eneKin = dc.getData< FieldTmp >( FieldTmp::getUniqueId( 1 ), true );

/* reset density and kinetic energy values to zero */
density.getGridBuffer().getDeviceBuffer().setValue( FieldTmp::ValueType( 0.0 ) );
eneKin.getGridBuffer().getDeviceBuffer().setValue( FieldTmp::ValueType( 0.0 ) );
 
/* calculate and add the density values from all species */
ForEach<
    VectorAllSpecies, /* you want to use only e- species here */
    picongpu::detail::ComputeDensity< bmpl::_1, bmpl::int_< CORE + BORDER > >,
    MakeIdentifier< bmpl::_1 >
> computeDensity;
computeDensity( forward(&density), currentStep );

/* calculate and add the kinetic energy values from all species */
ForEach<
    VectorAllSpecies, /* you want to use only e- species here */
    picongpu::detail::ComputeEnergy< bmpl::_1, bmpl::int_< CORE + BORDER > >,
    MakeIdentifier< bmpl::_1 >
> computeEnergy;
computeEnergy( forward(&eneKin), currentStep );
 
/* add results of all species that are still in GUARD to the next GPU's BORDER */
EventTask densityEvent = density.asyncCommunication( __getTransactionEvent() );
EventTask energyEvent = eneKin.asyncCommunication( __getTransactionEvent() );
__setTransactionEvent( densityEvent + energyEvent );

/* do work on density & eneKin */
// ...

/* release data */
dc.releaseData( FieldTmp::getUniqueId( 0 ) );
dc.releaseData( FieldTmp::getUniqueId( 1 ) );

To Do

  • remove all new warnings
  • RT test in default slot 0
  • RT test with a slot number > 0

@ax3l ax3l added component: core in PIConGPU (core application) feature labels Dec 14, 2016
@ax3l ax3l added this to the Next Stable: 0.3.0 / 1.0.0 milestone Dec 14, 2016
@ax3l ax3l force-pushed the topic-fieldTmpMultiSlots branch from 5d0b3cd to 5597e48 Compare December 14, 2016 23:19
//! how many bytes for buffer is reserved to communication in one direction
constexpr uint32_t BYTES_EXCHANGE_X = 4 * 256 * 1024; //4 MiB
constexpr uint32_t BYTES_EXCHANGE_Y = 6 * 512 * 1024; //6 MiB
constexpr uint32_t BYTES_EXCHANGE_Z = 4 * 256 * 1024; //4 MiB
Member

A little off-topic but just because I'm seeing it right now: in many memory.param files of the different examples there are different sizes of memory that need to be reserved for exchange but the comments often read the same numbers.
Just for consistency we could remove the comments there and state in the beginning that the unit used there is Byte.
Here at this point it would be 1 MiB rather than 4 MiB, right?
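The arithmetic behind that question can be checked at compile time. The sketch below copies the three expressions from the snippet under review and compares them against MiB multiples; the helper constants `KiB` and `MiB` are introduced only for this illustration.

```cpp
#include <cstdint>

// Helper units, introduced only for this illustration.
constexpr uint32_t KiB = 1024;
constexpr uint32_t MiB = 1024 * KiB;

// Expressions as they appear in the snippet under review.
constexpr uint32_t BYTES_EXCHANGE_X = 4 * 256 * 1024;
constexpr uint32_t BYTES_EXCHANGE_Y = 6 * 512 * 1024;
constexpr uint32_t BYTES_EXCHANGE_Z = 4 * 256 * 1024;

// 4 * 256 KiB = 1024 KiB = 1 MiB, and 6 * 512 KiB = 3072 KiB = 3 MiB.
static_assert( BYTES_EXCHANGE_X == 1 * MiB, "4 * 256 KiB is 1 MiB, not 4 MiB" );
static_assert( BYTES_EXCHANGE_Y == 3 * MiB, "6 * 512 KiB is 3 MiB, not 6 MiB" );
static_assert( BYTES_EXCHANGE_Z == 1 * MiB, "4 * 256 KiB is 1 MiB, not 4 MiB" );
```

So the X and Z comments overstate the size by a factor of four and the Y comment by a factor of two.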

Member Author

@ax3l ax3l Dec 15, 2016

> different sizes of memory

That is generally OK, since each example has different particle migration patterns, which lead to different size requirements for the exchange buffers that copy those particles every time step.

> Just for consistency we could remove the comments there and state in the beginning that the unit used there is Byte.

Great idea! Open a separate issue or directly a PR for that?

Member

I can do that if you want, after yours is through - if you don't wanna rebase. But since this is such a minor change which only affects the comments I'd also be fine if it happened in the same PR.

There were some more typos in parts of the code that you didn't specifically touch in the files. Here to be precise, and in the comment below where it says negativ instead of negative. But a separate commit would be enough for this.

Member Author

yep, not touched, saw those too.

Member

n01r commented Dec 15, 2016

Great! 😄 ✨

DataConnector &dc = Environment<>::get().DataConnector();
FieldTmp& fieldTmp = dc.getData<FieldTmp > (FieldTmp::getName(), true);

PMACC_CASSERT_MSG(
Member

I like how with this implementation some overhead from the Particles class is removed.
Do I understand correctly, though, that every time I instantiate a FieldTmp object I would have to add this cassert? I guess this could not just be moved into the constructor?

Member Author

@ax3l ax3l Dec 15, 2016

Every time you run dc.getData you assume the slot you access is really allocated in the current simulation. The assert checks that the user did not trick you by not allocating the slots, so that getData() will not throw at run time (RT). I just added it since the slot count is currently a compile-time (CT) user parameter, so CT checking makes sense.
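The trade-off described here can be sketched in plain C++. This is only an illustration under stated assumptions: `slotIdRuntime`, `requireSlots`, the string `"FieldTmp"` and the slot count of `2` are all hypothetical, and `static_assert` stands in for `PMACC_CASSERT_MSG`.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Assumed value of the user parameter from memory.param.
constexpr uint32_t fieldTmpNumSlots = 2;

// Run-time check: an unallocated slot is only noticed when this code runs.
inline std::string slotIdRuntime( uint32_t slot )
{
    if( slot >= fieldTmpNumSlots )
        throw std::out_of_range( "FieldTmp slot " + std::to_string( slot ) + " is not allocated" );
    return "FieldTmp" + std::to_string( slot );
}

// Compile-time check: asking for more slots than configured fails the build.
template< uint32_t T_requiredSlots >
constexpr bool requireSlots()
{
    static_assert(
        T_requiredSlots <= fieldTmpNumSlots,
        "please allocate more FieldTmp slots in memory.param" );
    return true;
}

// A code path that needs two slots states that requirement once, up front.
constexpr bool twoSlotsAvailable = requireSlots< 2 >();
```

Changing `requireSlots< 2 >` to `requireSlots< 3 >` would stop compilation with the message above, whereas `slotIdRuntime( 2 )` only fails once the simulation reaches that call.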

Member

n01r commented Dec 15, 2016

thanks for the lovely usage example 👍

@ax3l ax3l force-pushed the topic-fieldTmpMultiSlots branch from 5597e48 to da203c4 Compare December 15, 2016 15:15
fieldTmp->getHostDataBox().getPointer());

dc.releaseData(FieldTmp::getName());
dc.releaseData( FieldTmp::getName() + "0" );
Member

please use FieldTmp::getUniqueId( 0 )
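A short sketch of why a helper is preferable to concatenating the slot number by hand. The naming scheme below (field name followed by the decimal slot index) is an assumption inferred from the `getName() + "0"` line in the diff, and `FieldTmpSketch` is a hypothetical stand-in, not the actual PIConGPU class.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in, not the PIConGPU FieldTmp class.
struct FieldTmpSketch
{
    static std::string getName()
    {
        return "FieldTmp";
    }

    // Assumed scheme: field name followed by the decimal slot index.
    static std::string getUniqueId( uint32_t slotId )
    {
        return getName() + std::to_string( slotId );
    }
};
```

With the ID built in one place, call sites such as `getData` and `releaseData` cannot drift apart if the naming scheme ever changes.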

ValueType());

dc.releaseData(FieldTmp::getName());
dc.releaseData( FieldTmp::getName() + "0" );
Member

please use FieldTmp::getUniqueId( 0 )

@ax3l ax3l force-pushed the topic-fieldTmpMultiSlots branch from da203c4 to bc89360 Compare December 15, 2016 15:32
@ax3l ax3l changed the title from "[WIP] FieldTmp: Multiple Slots" to "FieldTmp: Multiple Slots" Dec 15, 2016
Expose a user input variable `fieldTmpNumSlots` which defines how
many `FieldTmp` scalar fields will be needed. This defaults
as before to `1` additional field.
@ax3l ax3l force-pushed the topic-fieldTmpMultiSlots branch from bc89360 to eb781d8 Compare December 17, 2016 12:43
Member Author

ax3l commented Dec 18, 2016

@psychocoderHPC review comments updated! :)

@n01r if you want, you can now write a kernel using two FieldTmp quantities to test whether the infrastructure is ready for TF (Thomas-Fermi).

Member Author

ax3l commented Dec 20, 2016

RT test: oh oh, segfaults with 2 slots enabled and standard I/O of FieldTmp quantities....

@ax3l ax3l force-pushed the topic-fieldTmpMultiSlots branch from eb781d8 to 9b37b76 Compare December 21, 2016 12:35
Member

n01r commented Dec 21, 2016

Thx for the update! I will rebase on that and continue implementing.

2016-12-22 Update: The first run with the two new fields was successful! I only have to calculate the actual "temperature" and density values and we're in the Thomas-Fermi business! 😛

@psychocoderHPC psychocoderHPC merged commit b7b1bfa into ComputationalRadiationPhysics:dev Jan 3, 2017
@ax3l ax3l deleted the topic-fieldTmpMultiSlots branch January 5, 2017 22:00
@n01r n01r mentioned this pull request Jan 9, 2017
2 tasks
