Asymmetric Actor Critic and Related Memory Processing #180
Currently, it is necessary to modify several components in skrl to support asymmetric learning, for example:
I'm working on separating (starting with the environment wrappers on this branch) the concepts of observation and state (currently mixed in skrl) to support asymmetric learning, but it may take some time.
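Conceptually, the separation looks something like this (just an illustrative sketch of the intent, not the final skrl API; the `states` key lookup is a placeholder):

```python
import torch

# Illustrative sketch only (not the final skrl API): the point is that the
# wrapper exposes the partial observation (for the policy) and the full,
# privileged state (for the value function) as two separate tensors.
class AsymmetricWrapperSketch:
    def __init__(self, env):
        self._env = env

    def step(self, actions: torch.Tensor):
        obs, reward, terminated, truncated, info = self._env.step(actions)
        # hypothetical: the underlying simulator reports the privileged state in info
        state = info.get("states", obs)
        return obs, state, reward, terminated, truncated, info
```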
Hi @Toni-SM, I tried to access the branch, but it seems to be no longer available. I'm currently trying to implement the same asymmetric actor-critic setup to train an agent with IsaacLab, and I wanted to ask if this functionality has already been implemented or if there's any ongoing work on it.
@Toni-SM, thanks so much for your work on this library. I'm trying to use it to train an agent with IsaacGym as a simulator, and I wanted to use the asymmetric actor-critic variant of PPO as is done in the IsaacGymEnvs repo (for example, in the IndustReal environments). Because of this, my observation is currently a dict that looks like:
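(roughly; the `obs`/`states` keys follow the IsaacGymEnvs convention, and the shapes here are just placeholders)

```python
import torch

num_envs = 64  # placeholder

obs_dict = {
    # per-environment policy observation (what the actor sees)
    "obs": torch.zeros(num_envs, 32),
    # per-environment privileged state (what the critic sees), larger than the observation
    "states": torch.zeros(num_envs, 96),
}
```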
I'm using PPO_RNN as the agent. The difficulty I run into is that the memory class is built specifically with raw tensors in mind. I wrote a subclass of the `RandomMemory` class to handle the storage of these elements (the state) separately, but this loop seems to check every element of the memory to see whether it is a float tensor and fill it with NaNs, which causes an error when it gets to my shoehorned dict. Currently I've resolved this by commenting out those lines in my installation of skrl, but I was wondering whether they need to be there at all. The last commit on those lines mentions that they exist for backwards compatibility with old versions of torch, but I didn't follow why exactly that is needed to support old versions of torch.

Another issue I ran into is this line in the `PPO_RNN` class itself. The cast to `float` breaks my code because the observation is a dict, not a tensor. I was wondering whether that cast needs to be there at all, since the user controls the type of the state when they write the environment, so they can just ensure they're passing float tensors as input.

Please also let me know if there's a better way to implement asymmetric actor-critic models in skrl; I didn't see anything in the docs, but it's possible I just missed it.
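For reference, a minimal sketch of one possible workaround (dimensions and helper names are placeholders, not skrl API): pack both pieces into a single flat float tensor before it reaches the memory, and split it back into the actor and critic views inside the models.

```python
import torch

# Minimal sketch of a possible workaround (not skrl API): keep the memory and
# the float cast happy by handing them one flat float tensor, and recover the
# actor/critic views inside the models. Dimensions below are placeholders.
OBS_DIM = 32     # policy (actor) observation size -- placeholder
STATE_DIM = 96   # privileged (critic) state size -- placeholder

def pack(obs_dict: dict) -> torch.Tensor:
    """Concatenate the policy observation and the privileged state."""
    return torch.cat([obs_dict["obs"].float(), obs_dict["states"].float()], dim=-1)

def split(packed: torch.Tensor) -> tuple:
    """Split the packed tensor back into (observation, privileged state)."""
    return packed[..., :OBS_DIM], packed[..., OBS_DIM:OBS_DIM + STATE_DIM]
```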
Thanks for your time, and thanks again for all the work on the repo! It's been very readable and easy to work with.