@frankaging frankaging commented Apr 19, 2025

Description

This cherry-picks changes from Peter's local attempt at implementing LoRA with a pyvene backbone: main...peterwz-subspace

With this change, users can pass additional intervention kwargs through a dedicated argument instead of overloading the subspaces dict, as we do right now.

Concretely, we currently do this:

    ref_outputs, policy_outputs_orig = self.ax_model(
        base={
            "input_ids": minibatch_inputs["input_ids"],
            "attention_mask": minibatch_inputs["attention_mask"],
        },
        unit_locations=unit_locations,
        output_original_output=True,
        # overloading `subspaces` which is intended only for intervening subspaces
        subspaces=[{
            "k": self.training_args.topk,
            "steering_factor": minibatch_inputs["steering_factors"],
        }],
        use_cache=False,
    )

This overloads subspaces, which is intended only for selecting the subspaces to intervene on. We can now do:

    ref_outputs, policy_outputs_orig = self.ax_model(
        base={
            "input_ids": minibatch_inputs["input_ids"],
            "attention_mask": minibatch_inputs["attention_mask"],
        },
        unit_locations=unit_locations,
        output_original_output=True,
        intervention_additional_kwargs={
            "k": self.training_args.topk,
            "steering_factor": minibatch_inputs["steering_factors"],
        },
        use_cache=False,
    )
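
For readers unfamiliar with the pattern, here is a minimal, dependency-free sketch of how extra kwargs could be forwarded to an intervention without touching subspaces. It is not pyvene's actual implementation: ToySteeringIntervention and ToyIntervenableModel are hypothetical stand-ins, and the scalar steering_factor is a simplification of the per-minibatch value used above.

```python
# Toy sketch only: these classes are hypothetical stand-ins, not pyvene's real API.
from typing import Any, Dict, List, Optional


class ToySteeringIntervention:
    """Scales the k largest entries of a vector by a steering factor."""

    def forward(
        self,
        base: List[float],
        subspaces: Optional[Dict[str, Any]] = None,
        **kwargs: Any,
    ) -> List[float]:
        # Extra settings arrive as ordinary keyword arguments, so `subspaces`
        # stays reserved for subspace selection (unused in this toy).
        k = kwargs.get("k", len(base))
        steering_factor = kwargs.get("steering_factor", 1.0)
        # Indices of the k largest activations.
        order = sorted(range(len(base)), key=lambda i: base[i], reverse=True)
        top = set(order[:k])
        return [v * steering_factor if i in top else v for i, v in enumerate(base)]


class ToyIntervenableModel:
    """Passes `intervention_additional_kwargs` through to the intervention."""

    def __init__(self, intervention: ToySteeringIntervention) -> None:
        self.intervention = intervention

    def __call__(
        self,
        base: List[float],
        subspaces: Optional[Dict[str, Any]] = None,
        intervention_additional_kwargs: Optional[Dict[str, Any]] = None,
    ) -> List[float]:
        extra = intervention_additional_kwargs or {}
        return self.intervention.forward(base, subspaces=subspaces, **extra)


model = ToyIntervenableModel(ToySteeringIntervention())
steered = model(
    [0.5, 2.0, 1.0],
    intervention_additional_kwargs={"k": 2, "steering_factor": 10.0},
)
print(steered)  # [0.5, 20.0, 10.0]
```

Keeping the extra settings in their own dict means an intervention that does use subspaces can receive both without key collisions.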

Testing Done

TBD

Checklist:

  • My PR title strictly follows the format: [Your Priority] Your Title
  • I have attached the testing log above
  • I provide enough comments to my code
  • I have changed documentations
  • I have added tests for my changes

@frankaging frankaging changed the title from "[P1] Allowing interventions to take additional kwargs including module" to "[P1] Allowing interventions to take additional kwargs + module inputs" Apr 19, 2025
@frankaging frankaging changed the title from "[P1] Allowing interventions to take additional kwargs + module inputs" to "[P1] Allowing interventions to take additional kwargs + module inputs (#215)" Apr 19, 2025
@frankaging frankaging merged commit beebdc5 into main Apr 20, 2025
3 checks passed