-
Notifications
You must be signed in to change notification settings - Fork 56
Register tracing export #178
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
buffalojoec
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw this is still in draft, so just a few fly-by things I noticed!
This is looking really good, though. Make sure to add some tests and a README section for the new feature!
harness/src/lib.rs
Outdated
| #[cfg(feature = "fuzz-fd")] | ||
| pub slot: u64, | ||
|
|
||
| pub enable_register_tracing: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is always going to be false when the feature is disabled, I think it should be private, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making it private prevents such manual instantiations where we get a compiler complaint that the enable_register_tracing is private:
mollusk/harness/tests/fd_test_vectors.rs
Line 69 in fc77245
| ..Default::default() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah crud. What about just using the environment variable everywhere for the logic gate? The problem is, if someone flips this to true on Mollusk directly, but doesn't provide the SBF_TRACE_DIR, it's going to disable tracing anyway. This would be confusing for people.
Alternatively, we could choose a default dir under target if we really want this to be public. However, doing something like this:
let mut mollusk = Mollusk::default();
mollusk.enable_register_tracing = true;
......would still not add the callback and not activate tracing, as per the current implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah crud. What about just using the environment variable everywhere for the logic gate? The problem is, if someone flips this to
trueon Mollusk directly, but doesn't provide theSBF_TRACE_DIR, it's going to disable tracing anyway. This would be confusing for people.
You're right - it would be confusing. There's another case that also concerns me - it's related to the environment variable only approach. As I'm going to write some tests for this feature it's that I would need to set the envvar from within the test itself. Setting envvars is unsafe as tests are run in the same process but in different threads. Hence having multiple tests playing with the SBF_TRACE_DIR var is UB. This is the reason I added with_register_tracing in the first place - just having a constructor that has the ability to instantiate mollusk with register tracing but without the need to play with the envvar.
Alternatively, we could choose a default dir under
targetif we really want this to be public. However, doing something like this:
I think it's probably a good idea to have a default path.
let mut mollusk = Mollusk::default(); mollusk.enable_register_tracing = true; ......would still not add the callback and not activate tracing, as per the current implementation.
I think I came up with some sort of a possible solution - I can bury the enable_register_tracing bool in the callback structure - thus nobody would be able to set it without actually providing a handler - I'll try to rework it to see the final outcome and will show it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Setting envvars is unsafe as tests are run in the same process but in different threads. Hence having multiple tests playing with the SBF_TRACE_DIR var is UB.
Why wouldn't the SBF_TRACE_DIR variable just be readonly across all threads?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I came up with some sort of a possible solution - I can bury the enable_register_tracing bool in the callback structure - thus nobody would be able to set it without actually providing a handler - I'll try to rework it to see the final outcome and will show it here.
Cool! Ping me when you throw it up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've pushed the changes. I've also added one test implementing a custom register tracing callback counting the number of JMP class instructions executed. Tests help me reason about the API and its consumption later. On one side it's now that you can tweak existing codebases with the SBF_TRACE_DIR and on the other side it's possible to enable register tracing without the need to tweak the same environment variable at all - hence the with_register_tracing constructor.
What I don't like up to now is that you can instantiate Mollusk::default() and change the callback expecting this will enable register tracing. Unfortunately it won't because it's the ProgramCache that also needs to have register tracing tweaked. Yet doesn't such post-fact changes also not work for the rest of the Mollusk fields where there's a cross dependency between some of them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Setting envvars is unsafe as tests are run in the same process but in different threads. Hence having multiple tests playing with the SBF_TRACE_DIR var is UB.
Why wouldn't the
SBF_TRACE_DIRvariable just be readonly across all threads?
It could be but if we need to enable tracing for one test and skip it for another the envvar is not very convenient IMO.
harness/src/lib.rs
Outdated
| /// Create a new Mollusk instance just like the `new` method but | ||
| /// with register tracing enabled using a default callback. | ||
| #[cfg(feature = "register-tracing")] | ||
| pub fn with_register_tracing(program_id: &Pubkey, program_name: &str) -> Self { | ||
| let mut mollusk = Mollusk { | ||
| enable_register_tracing: true, | ||
| ..Self::default() | ||
| }; | ||
| let program_cache = ProgramCache::new( | ||
| &mollusk.feature_set, | ||
| &mollusk.compute_budget, | ||
| mollusk.enable_register_tracing, | ||
| ); | ||
| mollusk.program_cache = program_cache; | ||
| mollusk.invocation_inspect_callback = | ||
| Box::new(register_tracing::DefaultRegisterTracingCallback {}); | ||
| mollusk.add_program(program_id, program_name, &DEFAULT_LOADER_KEY); | ||
| mollusk | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need this new constructor? The existing Mollusk::new will call into Self::default, and the logic gate where you check for the SBF_TRACE_DIR environment variable will be evaluated. So people can just use Mollusk::new!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need this new constructor? The existing
Mollusk::newwill call intoSelf::default, and the logic gate where you check for theSBF_TRACE_DIRenvironment variable will be evaluated. So people can just useMollusk::new!
I'd be glad to have your review up to now and if you prefer I'll rework it to have everything gated behind the environment variable - I have in mind something like this:
diff --git a/harness/src/lib.rs b/harness/src/lib.rs
index 815ec07..b89830f 100644
--- a/harness/src/lib.rs
+++ b/harness/src/lib.rs
@@ -689,32 +689,6 @@ impl Mollusk {
mollusk
}
- /// Create a new Mollusk instance just like the `new` method but
- /// with register tracing enabled using a default callback.
- ///
- /// If `SBF_TRACE_DIR` is set it will override the passed `sbf_trace_dir`.
- #[cfg(feature = "register-tracing")]
- pub fn with_register_tracing(
- program_id: &Pubkey,
- program_name: &str,
- sbf_trace_dir: &str,
- ) -> Self {
- let mut mollusk = Mollusk::default();
- mollusk.invocation_inspect_callback =
- Box::new(register_tracing::DefaultRegisterTracingCallback {
- sbf_trace_dir: std::env::var("SBF_TRACE_DIR").unwrap_or(sbf_trace_dir.into()),
- });
- mollusk.program_cache = ProgramCache::new(
- &mollusk.feature_set,
- &mollusk.compute_budget,
- mollusk
- .invocation_inspect_callback
- .is_register_tracing_callback(),
- );
- mollusk.add_program(program_id, program_name, &DEFAULT_LOADER_KEY);
- mollusk
- }
-
/// Add a program to the test environment.
///
/// If you intend to CPI to a program, this is likely what you want to use.
diff --git a/harness/src/register_tracing.rs b/harness/src/register_tracing.rs
index ddb3cc0..7c18b1e 100644
--- a/harness/src/register_tracing.rs
+++ b/harness/src/register_tracing.rs
@@ -85,7 +85,7 @@ impl InvocationInspectCallback for DefaultRegisterTracingCallback {
// This callback is specifically implemented to handle register tracing.
fn is_register_tracing_callback(&self) -> bool {
- true
+ std::env::var("SBF_TRACE_DIR").is_ok()
}
}
diff --git a/harness/tests/register_tracing.rs b/harness/tests/register_tracing.rs
index ba9e85c..0c30d15 100644
--- a/harness/tests/register_tracing.rs
+++ b/harness/tests/register_tracing.rs
@@ -87,11 +87,11 @@ fn test_custom_register_tracing_callback() {
}
std::env::set_var("SBF_OUT_DIR", "../target/deploy");
+ std::env::set_var("SBF_TRACE_DIR", "/tmp/sbf_trace_dir");
let program_id = Pubkey::new_unique();
let payer_pk = Pubkey::new_unique();
- let mut mollusk =
- Mollusk::with_register_tracing(&program_id, "test_program_primary", "sbf_trace_dir");
+ let mut mollusk = Mollusk::new(&program_id, "test_program_primary");
// Phase 1 - basic register tracing test.
Bury the logic in the callback structure - if there's a register tracing callback then have the functionality enabled. With the current with_register_tracing constructor it's now possible to switch on register tracing without playing with envvars. This is shown in the provided test in this commit. Tweaking the SBF_TRACE_DIR envvar, however, is useful for existing codebases where Mollusk users are willing to enable register tracing instantly without making any changes to their code.
|
@procdump I see why this is such a pain. We have to coordinate initial cache compilation as well as per-invocation tracing, all while giving users the ability to customize behavior. I threw a few commits up to see what you think. Basically I've implemented the following:
With those changes, we can lose the extra constructor and simplify a few things. Considering what I said in bullet 1, we actually could just remove it from |
1. fixes a comment to be above the statement it pertains to 2. removes with_register_tracing 3. changes the register_tracing test to use the enable_register_tracing variable for switching on/off instead of playing with the callback.
| mollusk.enable_register_tracing = false; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added your enable_register_tracing field back to Mollusk, for disabling the trace during instruction invocation.
There's a subtle discrepancy here. I've reworked the test so that I can show you what I mean and eventually you can debug it yourself.
Let's say you want to set enable_register_tracing to false and you'd expect that this will disable the trace during instruction invocation because in process_instruction_inner the false value is passed through InvokeContext:
let _enable_register_tracing = false;
#[cfg(feature = "register-tracing")]
let _enable_register_tracing = self.enable_register_tracing;
let program_runtime_environments: ProgramRuntimeEnvironments =
ProgramRuntimeEnvironments {
program_runtime_v1: Arc::new(
create_program_runtime_environment_v1(
&runtime_features,
&execution_budget,
/* reject_deployment_of_broken_elfs */ false,
/* debugging_features */ _enable_register_tracing,
)
.unwrap(),
),
program_runtime_v2: Arc::new(create_program_runtime_environment_v2(
&execution_budget,
/* debugging_features */ _enable_register_tracing,
)),
};
let mut invoke_context = InvokeContext::new(
&mut transaction_context,
&mut program_cache,
EnvironmentConfig::new(
Hash::default(),
/* blockhash_lamports_per_signature */ 5000, // The default value
&callback,
&runtime_features,
&program_runtime_environments,
&program_runtime_environments,
&sysvar_cache,
),
self.logger.clone(),
self.compute_budget.to_budget(),
self.compute_budget.to_cost(),
);However register tracing still remains enabled surprisingly. The reason for this goes deep into the runtime - particularly in the execution of the instruction within the VM (let's look in the case of being into interpreted mode):
https://github.com/anza-xyz/sbpf/blob/c0c1197edbd1379a2f1ae86140fec30ec46b08e9/src/interpreter.rs#L176
https://github.com/anza-xyz/sbpf/blob/c0c1197edbd1379a2f1ae86140fec30ec46b08e9/src/interpreter.rs#L190
Basically it reads enable_register_tracing from the Executable instead of InvokeContext's ProgramRuntimeEnvironments.
Due to the circumstances upper the register tracing unit test fails now. Isn't it more appropriate to read the flag from InvokeContext's EnvironmentConfig's program_runtime_v2's Config?
With such a change we'd be able to switch on/off tracing as you'd suggested.
Otherwise it's confusing for users that if they tweak the flag it won't influence the tracing.
NB: I've pushed the amended test on purpose so that you see that it fails.
Considering what I said in bullet 1, we actually could just remove it from add_program completely. What do you think?
Doesn't this suggestion depend on fixing the issue explained upper?
Problem
With solana-sbpf 0.13.1 it's possible to collect register tracing data in a consistent way. Yet a testing harness like mollusk does not provide a default mechanism to collect this data in a useful format for later ingestion.
The idea of this PR is to provide such mechanism in a way consumer tools could take advantage of - particularly willing to analyze the tracing data and generate statistics on top of it.
Solution
The PR takes advantage of the
invocation-inspect-callbackfeature where a callback could be called prior to or after the processed instruction. A default post instruction callback for storing the register tracing data in files is provided and it basically persists the register sets, sbpf instructions and an identification sha256 hash of the executable that was actually used for the generation of this data. The motivation behind providing the sha256 identification is that files could grow in number and it's necessary for the consumer to be able to distinguish which shared object to use when analyzing the tracing data in a deterministic way. Also in order to let consumers instantly take advantage of this feature it's written in a way that register tracing gets enabled as long as theSBF_TRACE_DIRis provided denoting the directory in which the tracing data will be persisted. The motivation behind this approach is that it will let users instantly get out the tracing data of their tests which can eventually be passed to statistics generators such as code coverage tools, disassemblers, etc.Review
Before getting further I'd be happy to get some feedback if you find this approach feasible or have a better idea how it may be done.