jmartin-tech
left a comment
Testing shows this changed the prompt count for LatentWhoisSnippet, looking closer into why.
From 0.10.3.1
latentinjection.LatentWhois base.TriggerListDetector: PASS ok on 28/ 28
latentinjection.LatentWhoisSnippet base.TriggerListDetector: PASS ok on 32/ 32
On main @ 55da36b
latentinjection.LatentWhois base.TriggerListDetector: PASS ok on 28/ 28
latentinjection.LatentWhoisSnippet base.TriggerListDetector: PASS ok on 32/ 32
This PR branch:
latentinjection.LatentWhois base.TriggerListDetector: PASS ok on 28/ 28
latentinjection.LatentWhoisSnippet base.TriggerListDetector: PASS ok on 256/ 256
I suspect this is not an expected change.
Thanks, good catch. I believe the randomisation logic changed for this probe to be more in line with common practice in garak, rather than predicated on |
Co-authored-by: Erick Galinkin <erick.galinkin@gmail.com> Signed-off-by: Leon Derczynski <leonderczynski@gmail.com>
b80749a to
2e119fe
Review of LatentWhoisSnippet shows the permutations selected are consistent with, but not identical to, the previous prompts. Applying soft_prompt_cap as a limiter in the Full version is part of the reason no identical prompts were found in testing.
This can land as is. However, I left some minor comments on this unexpected use of soft_prompt_cap in Full probes, and on some applications of FactSnippetMixin that seem spurious and may continue to make this module difficult to maintain.
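To illustrate how a capped selection can be consistent without matching an earlier selection, here is a minimal sketch. It assumes a toy population of 256 prompts and a cap of 32; `sample_with_cap` is a hypothetical helper written for this example, not garak's implementation, and the probe's real sampling logic may differ.

```python
import random


def sample_with_cap(prompts, cap, seed):
    """Return at most `cap` prompts, chosen reproducibly for a given seed.

    Hypothetical helper for illustration only.
    """
    if len(prompts) <= cap:
        return list(prompts)
    rng = random.Random(seed)  # local RNG, so global random state is untouched
    return rng.sample(prompts, cap)


# Toy population standing in for every template/injection permutation.
population = [f"prompt-{i}" for i in range(256)]

first_run = sample_with_cap(population, cap=32, seed=0)
second_run = sample_with_cap(population, cap=32, seed=0)
other_seed = sample_with_cap(population, cap=32, seed=1)
```

With a fixed seed the selection repeats exactly across runs, while a different seed or a different sampling strategy still yields 32 prompts but generally not the same ones.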
class LatentWhoisSnippet(LatentInjectionFactSnippetEiffelFull):
class LatentWhoisSnippetFull(FactSnippetMixin, LatentWhois):
Optional: this does not look to be needed, as nothing is being inherited from FactSnippetMixin.
FactSnippetMixin.injection_instructions is still accessible as written in line 602 if this is removed.
- class LatentWhoisSnippetFull(FactSnippetMixin, LatentWhois):
+ class LatentWhoisSnippetFull(LatentWhois):
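The point about the attribute remaining accessible can be sketched as below. This is a toy reconstruction under stated assumptions: the class bodies and the placeholder instruction string are invented for illustration, and the actual code at line 602 is not shown in this thread.

```python
class FactSnippetMixin:
    # Placeholder value; the real instructions are not reproduced here.
    injection_instructions = ["<placeholder instruction with {payload}>"]


class LatentWhois:
    """Stand-in for the real probe class."""

    def __init__(self):
        self.prompts = []


class LatentWhoisSnippetFull(LatentWhois):  # FactSnippetMixin is NOT a base
    # An explicit class-qualified reference works without inheritance.
    injection_instructions = FactSnippetMixin.injection_instructions


probe = LatentWhoisSnippetFull()
uses_mixin_value = (
    probe.injection_instructions is FactSnippetMixin.injection_instructions
)
mixin_in_mro = FactSnippetMixin in LatentWhoisSnippetFull.__mro__
```

Because the reference names the mixin directly, dropping it from the bases changes the MRO but not the value the probe sees. Keeping it in the bases only matters if other attributes or methods are picked up through inheritance.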
Snippet building needs refactoring anyway after the current fix in #1181 - will update that PR and then it should be processed directly after this one.
on second thoughts:
- will leave this inheritance to express intent
- #1181 (bugfix: reduce latent optimisation permutation explosion) would suffer scope creep if it addressed this
- will pick up in later refactor-only PR that needn't block recalibration
Sounds like expected behaviour - both are intentionally sampling rather than using the whole population. Good that it's consistent. Tests based on validation learnings are welcome.
The LatentInjection module had an overly complex inheritance graph. This is updated to use mixins, as follows:
- LatentInjectionMixin - retained; used in all probes; adds tags, detector, methods for assembling prompts & triggers
- NonFullMixin - used to map Full probes to standard (i.e. lightweight) versions
- TranslationMixin - templates and assembly for translation-based injections
- FactSnippetMixin - docs, instructions, injections, and assembly for fact snippet-based instructions
- LatentWhoisSnippet took a heavy refactoring; previously inherited from LatentInjectionFactSnippetEiffelFull, now inherits LatentWhois and FactSnippetMixin

Verification
- python -m pytest tests/probes/test_probes_latentinjection.py (this should pass solo without config fixtures being loaded)
- python -m pytest tests/probes/test_probes.py
- True/False depending on their size
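The mixin layout described above can be sketched roughly as follows. Only the mixin names and the detector name come from this PR. The `ExampleProbe*` classes, the attributes `contexts`, `injections`, `cap`, `seed` and `active`, and the `build_prompts` method are assumptions made for illustration and are not garak's actual API.

```python
import random


class LatentInjectionMixin:
    """Shared by all probes: detector plus prompt assembly (simplified)."""

    primary_detector = "base.TriggerListDetector"

    def build_prompts(self):
        # Full population: every context paired with every injection.
        return [f"{c} {i}" for c in self.contexts for i in self.injections]


class FactSnippetMixin:
    """Supplies snippet-style injections (placeholder values)."""

    injections = ["<injection A>", "<injection B>"]


class NonFullMixin:
    """Maps a Full probe to a lightweight one by sampling its prompts."""

    active = True  # assumed: lightweight versions run by default
    cap = 4        # hypothetical cap
    seed = 0       # hypothetical fixed seed for reproducibility

    def build_prompts(self):
        prompts = super().build_prompts()  # resolved via the MRO
        if len(prompts) <= self.cap:
            return prompts
        return random.Random(self.seed).sample(prompts, self.cap)


class ExampleProbeFull(FactSnippetMixin, LatentInjectionMixin):
    active = False  # assumed: Full versions are opt-in
    contexts = ["<doc 1>", "<doc 2>", "<doc 3>"]


class ExampleProbe(NonFullMixin, ExampleProbeFull):
    pass


full_prompts = ExampleProbeFull().build_prompts()
light_prompts = ExampleProbe().build_prompts()
```

Placing NonFullMixin first in the bases lets its `build_prompts` wrap the Full version through `super()`, so the lightweight probe needs no body of its own.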