
Storage node fix #1559

Open
cryptoAtwill wants to merge 8 commits into storage-cli-extracted from storage-node-fix

Conversation

@cryptoAtwill
Contributor

@cryptoAtwill cryptoAtwill commented Mar 31, 2026

Note

High Risk
High risk: this PR extends the on-chain blobs actor state/method dispatch with new execution-job APIs and changes legacy top-down finality/sync behavior (quorum=1 fast paths and chunked range fetching), both of which can affect consensus progress and message execution ordering.

Overview
Adds an MVP execution job queue to the blobs actor: new shared types/selectors, new Method entries, actor handlers (CreateJob/ClaimJob/CompleteJob/FailJob/GetJob/ListJobs), and persistent State.execution with basic validation and status transitions.

Expands the blobs actor FEVM InvokeContract facade to manually parse/encode additional selectors (operator register/query, execution job calls, and finalizeBlob), enabling delegated (f410) senders; the storage gateway and node registration paths are updated to prefer delegated senders, use fevm_invoke, and include better nonce handling/logging.

Improves legacy topdown robustness/performance: single-validator subnets bypass quorum dependence and can query freshest finalized parent view; validator changes and topdown messages are fetched/processed in fixed-size chunks via new range RPC methods, with syncer logic ignoring stale sequentiality errors after concurrent advances.

Updates ipc-cli to gate storage support behind an ipc-storage feature, adds storage quickstart docs and new commands (execution job submit/list/status, bucket create/list), and enhances node init to auto-fill validator signing config, reuse default EVM keys when missing, and generate missing resolver network keys.

Written by Cursor Bugbot for commit 4404440. This will update automatically on new commits. Configure here.

@cryptoAtwill cryptoAtwill requested a review from a team as a code owner March 31, 2026 11:10

@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 3 potential issues.

Autofix Details

Bugbot Autofix prepared fixes for all 3 issues found in the latest run.

  • ✅ Fixed: Hardcoded sequence and chain ID break on-chain transactions
    • Updated storage cp upload signing to use the delegated f410 sender plus on-chain sequence and chain ID instead of hardcoded zeros.
  • ✅ Fixed: Negation overflow panic for i32::MIN in debug builds
    • Replaced negation-based magnitude conversion with value.unsigned_abs() to avoid overflow on i32::MIN.
  • ✅ Fixed: Executor runs arbitrary binaries from on-chain data
    • Restricted executor job binaries to normalized local:// relative paths under a configured local binary directory with canonical path enforcement.

Create PR

Or push these changes by commenting:

@cursor push 3343b91a9d
Preview (3343b91a9d)
diff --git a/ipc-storage/ipc-decentralized-storage/src/bin/node.rs b/ipc-storage/ipc-decentralized-storage/src/bin/node.rs
--- a/ipc-storage/ipc-decentralized-storage/src/bin/node.rs
+++ b/ipc-storage/ipc-decentralized-storage/src/bin/node.rs
@@ -31,7 +31,7 @@
 use fvm_shared::message::Message;
 use ipc_decentralized_storage::node::{launch, NodeConfig};
 use std::net::{SocketAddr, SocketAddrV4, SocketAddrV6};
-use std::path::PathBuf;
+use std::path::{Component, Path, PathBuf};
 use std::str::FromStr;
 use std::time::Duration;
 use tendermint_rpc::Url;
@@ -193,6 +193,10 @@
     /// Polling interval in seconds
     #[arg(long, default_value = "5")]
     poll_interval_secs: u64,
+
+    /// Directory containing executable binaries referenced by jobs via local://
+    #[arg(long, default_value = "./executor-bin", env = "EXECUTOR_BIN_DIR")]
+    binary_dir: PathBuf,
 }
 
 #[tokio::main]
@@ -508,7 +512,7 @@
     if value >= 0 {
         EthU256::from(value as u32)
     } else {
-        EthU256::MAX - EthU256::from((-value) as u32) + EthU256::from(1u8)
+        EthU256::MAX - EthU256::from(value.unsigned_abs()) + EthU256::from(1u8)
     }
 }
 
@@ -735,6 +739,18 @@
 
     let mf = SignedMessageFactory::new(sk, from_f410, sequence, ChainID::from(chain_id));
     let mut tx_client = client.bind(mf);
+    let binary_dir = std::fs::canonicalize(&args.binary_dir).with_context(|| {
+        format!(
+            "failed to canonicalize executor binary directory: {}",
+            args.binary_dir.display()
+        )
+    })?;
+    if !binary_dir.is_dir() {
+        anyhow::bail!(
+            "executor binary directory is not a directory: {}",
+            binary_dir.display()
+        );
+    }
 
     let poll_interval = Duration::from_secs(args.poll_interval_secs);
     loop {
@@ -745,7 +761,10 @@
         }
 
         let job = pending_jobs[0].clone();
-        info!("Found candidate job {} with binary_ref={}", job.id, job.binary_ref);
+        info!(
+            "Found candidate job {} with binary_ref={}",
+            job.id, job.binary_ref
+        );
 
         let latest = get_job(&tx_client, job.id).await?;
         let Some(latest) = latest else {
@@ -776,7 +795,8 @@
         )
         .await
         .context("failed to send ClaimJob transaction via InvokeContract facade")?;
-        if claim_res.response.check_tx.code.is_err() || claim_res.response.deliver_tx.code.is_err() {
+        if claim_res.response.check_tx.code.is_err() || claim_res.response.deliver_tx.code.is_err()
+        {
             info!(
                 "ClaimJob rejected for {}: check={:?} deliver={:?} log={} info={}",
                 job.id,
@@ -790,18 +810,26 @@
         }
         info!("ClaimJob accepted for {}", job.id);
 
-        let run_result = run_job_binary(&job.binary_ref, &job.args).await;
+        let run_result = run_job_binary(&binary_dir, &job.binary_ref, &job.args).await;
         match run_result {
             Ok((exit_code, stdout, stderr)) if exit_code == 0 => {
                 let output_commitment = fendermint_actor_blobs_shared::bytes::B256(
                     *blake3::hash([stdout.as_bytes(), stderr.as_bytes()].concat().as_slice())
                         .as_bytes(),
                 );
-                let output_refs = vec![format!("inline://stdout/{}", hex::encode(output_commitment.0))];
+                let output_refs = vec![format!(
+                    "inline://stdout/{}",
+                    hex::encode(output_commitment.0)
+                )];
                 let complete_res = TxClient::<TxCommit>::fevm_invoke(
                     &mut tx_client,
                     BLOBS_ACTOR_ADDR,
-                    encode_complete_job_calldata(job.id, output_refs, output_commitment.0, exit_code),
+                    encode_complete_job_calldata(
+                        job.id,
+                        output_refs,
+                        output_commitment.0,
+                        exit_code,
+                    ),
                     TokenAmount::zero(),
                     gas_params.clone(),
                 )
@@ -922,7 +950,8 @@
 }
 
 async fn get_job(client: &impl QueryClient, id: u64) -> Result<Option<ExecutionJob>> {
-    let params = RawBytes::serialize(GetJobParams { id }).context("failed to serialize GetJob params")?;
+    let params =
+        RawBytes::serialize(GetJobParams { id }).context("failed to serialize GetJob params")?;
 
     let msg = Message {
         version: Default::default(),
@@ -951,14 +980,42 @@
     Ok(job)
 }
 
-async fn run_job_binary(binary_ref: &str, args: &[String]) -> Result<(i32, String, String)> {
-    // MVP runner supports local paths directly or `local://` URIs.
-    let binary = binary_ref
+async fn run_job_binary(
+    binary_dir: &Path,
+    binary_ref: &str,
+    args: &[String],
+) -> Result<(i32, String, String)> {
+    let binary_rel = binary_ref
         .strip_prefix("local://")
-        .unwrap_or(binary_ref)
-        .to_string();
+        .ok_or_else(|| anyhow::anyhow!("unsupported binary_ref scheme, expected local://"))?;
+    let binary_rel = Path::new(binary_rel);
+    if binary_rel.as_os_str().is_empty() || binary_rel.is_absolute() {
+        anyhow::bail!("invalid local binary_ref path: {}", binary_ref);
+    }
+    if binary_rel
+        .components()
+        .any(|c| !matches!(c, Component::Normal(_)))
+    {
+        anyhow::bail!(
+            "binary_ref must be a normalized relative path: {}",
+            binary_ref
+        );
+    }
 
-    let output = TokioCommand::new(binary)
+    let binary = binary_dir.join(binary_rel);
+    let binary = std::fs::canonicalize(&binary)
+        .with_context(|| format!("failed to resolve executable path: {}", binary.display()))?;
+    if !binary.starts_with(binary_dir) {
+        anyhow::bail!(
+            "binary_ref resolved outside allowed directory: {}",
+            binary.display()
+        );
+    }
+    if !binary.is_file() {
+        anyhow::bail!("executable is not a file: {}", binary.display());
+    }
+
+    let output = TokioCommand::new(&binary)
         .args(args)
         .output()
         .await

diff --git a/ipc/cli/src/commands/storage/cp.rs b/ipc/cli/src/commands/storage/cp.rs
--- a/ipc/cli/src/commands/storage/cp.rs
+++ b/ipc/cli/src/commands/storage/cp.rs
@@ -17,6 +17,10 @@
 
 use fendermint_actor_blobs_shared::bytes::B256;
 use fendermint_rpc::client::FendermintClient;
+use fendermint_rpc::QueryClient;
+use fendermint_vm_actor_interface::eam::EthAddress as FvmEthAddress;
+use fendermint_vm_message::query::FvmQueryHeight;
+use fvm_shared::address::Address;
 use fvm_shared::chainid::ChainID;
 
 use crate::commands::storage::{bucket, client::GatewayClient, config::StorageConfig, path};
@@ -72,11 +76,9 @@
                 // Storage -> Storage (copy between buckets)
                 copy_between_buckets(global, args).await
             }
-            (false, false) => {
-                Err(anyhow!(
-                    "At least one path must be a storage path (ipc://...)"
-                ))
-            }
+            (false, false) => Err(anyhow!(
+                "At least one path must be a storage path (ipc://...)"
+            )),
         }
     }
 }
@@ -95,9 +97,7 @@
     // Handle recursive directory upload
     if local_path.is_dir() {
         if !args.recursive {
-            return Err(anyhow!(
-                "Cannot copy directory without -r/--recursive flag"
-            ));
+            return Err(anyhow!("Cannot copy directory without -r/--recursive flag"));
         }
         return upload_directory(local_path, &storage_path, args).await;
     }
@@ -112,7 +112,11 @@
     storage_path: &path::StoragePath,
     args: &CopyArgs,
 ) -> Result<()> {
-    println!("Uploading {} -> {}", local_path.display(), storage_path.to_uri());
+    println!(
+        "Uploading {} -> {}",
+        local_path.display(),
+        storage_path.to_uri()
+    );
 
     // Read file data
     let data = fs::read(local_path)
@@ -155,7 +159,10 @@
     );
 
     // Convert hash to B256
-    let blob_hash_hex = upload_response.hash.strip_prefix("0x").unwrap_or(&upload_response.hash);
+    let blob_hash_hex = upload_response
+        .hash
+        .strip_prefix("0x")
+        .unwrap_or(&upload_response.hash);
     let blob_hash_bytes = hex::decode(blob_hash_hex)?;
     let mut hash_array = [0u8; 32];
     hash_array.copy_from_slice(&blob_hash_bytes[..32]);
@@ -172,18 +179,39 @@
     println!("Registering object on-chain...");
 
     // Create FendermintClient for on-chain operations
-    let fm_client = FendermintClient::new_http(
-        config.tendermint_rpc_url.parse()?,
-        None,
-    )?;
+    let fm_client = FendermintClient::new_http(config.tendermint_rpc_url.parse()?, None)?;
 
     // Create bound client with secret key
-    let secret_key = fendermint_rpc::message::SignedMessageFactory::read_secret_key(
-        &config.secret_key_file
-    )?;
+    let secret_key =
+        fendermint_rpc::message::SignedMessageFactory::read_secret_key(&config.secret_key_file)?;
 
-    let message_factory =
-        fendermint_rpc::message::SignedMessageFactory::new_secp256k1(secret_key, 0, ChainID::from(0));
+    let pk = secret_key.public_key();
+    let from_f1 = Address::new_secp256k1(&pk.serialize()).context("failed to create f1 address")?;
+    let from_eth = FvmEthAddress::new_secp256k1(&pk.serialize())
+        .context("failed to derive delegated address from secret key")?;
+    let from_f410 =
+        Address::new_delegated(10, &from_eth.0).context("failed to create f410 address")?;
+    let sequence = get_sequence_opt(&fm_client, &from_f410)
+        .await
+        .context("failed to get delegated account sequence")?
+        .ok_or_else(|| {
+            anyhow!(
+                "delegated sender {} not found on-chain; cross-fund this delegated address and retry (native f1 {} is intentionally not used)",
+                from_f410, from_f1
+            )
+        })?;
+    let chain_id = fm_client
+        .state_params(FvmQueryHeight::default())
+        .await
+        .context("failed to get state params")?
+        .value
+        .chain_id;
+    let message_factory = fendermint_rpc::message::SignedMessageFactory::new(
+        secret_key,
+        from_f410,
+        sequence,
+        ChainID::from(chain_id),
+    );
     let mut bound_client = fm_client.bind(message_factory);
 
     // Add object to bucket
@@ -202,11 +230,26 @@
     .await
     .context("Failed to register object on-chain")?;
 
-    println!("✓ Successfully uploaded and registered: {}", storage_path.key);
+    println!(
+        "✓ Successfully uploaded and registered: {}",
+        storage_path.key
+    );
 
     Ok(())
 }
 
+/// Get the next sequence number (nonce) of an account if it exists.
+async fn get_sequence_opt(client: &impl QueryClient, addr: &Address) -> Result<Option<u64>> {
+    let state = client
+        .actor_state(addr, FvmQueryHeight::default())
+        .await
+        .context("failed to get actor state")?;
+    match state.value {
+        Some((_id, state)) => Ok(Some(state.sequence)),
+        None => Ok(None),
+    }
+}
+
 /// Upload a directory recursively
 async fn upload_directory(
     local_dir: &Path,
@@ -229,7 +272,11 @@
             let storage_key = if storage_base.key.is_empty() {
                 rel_path_str
             } else {
-                format!("{}/{}", storage_base.key.trim_end_matches('/'), rel_path_str)
+                format!(
+                    "{}/{}",
+                    storage_base.key.trim_end_matches('/'),
+                    rel_path_str
+                )
             };
 
             let file_storage_path = path::StoragePath {
@@ -271,7 +318,11 @@
     local_path: &Path,
     args: &CopyArgs,
 ) -> Result<()> {
-    println!("Downloading {} -> {}", storage_path.to_uri(), local_path.display());
+    println!(
+        "Downloading {} -> {}",
+        storage_path.to_uri(),
+        local_path.display()
+    );
 
     // Get/load config
     let config_path = args.config.clone().unwrap_or_else(|| {
@@ -323,7 +374,10 @@
     _local_dir: &Path,
     _args: &CopyArgs,
 ) -> Result<()> {
-    println!("Downloading directory {} recursively...", storage_base.to_uri());
+    println!(
+        "Downloading directory {} recursively...",
+        storage_base.to_uri()
+    );
 
     // TODO: Implement by listing objects with prefix and downloading each
     // This requires implementing the ls command first to reuse list_objects functionality


if value >= 0 {
EthU256::from(value as u32)
} else {
EthU256::MAX - EthU256::from((-value) as u32) + EthU256::from(1u8)


Negation overflow panic for i32::MIN in debug builds

Low Severity

In abi_int256_from_i32, the expression (-value) as u32 when value is i32::MIN (-2147483648) causes an arithmetic overflow because negating i32::MIN exceeds i32::MAX. This panics in debug builds. While i32::MIN is an unusual exit code, using value.unsigned_abs() would avoid the issue entirely.

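A minimal sketch of the overflow and the fix (the `magnitude` helper here is illustrative, not a name from the PR): `(-value) as u32` panics in a debug build when `value == i32::MIN`, because `-i32::MIN` does not fit in `i32`, while `unsigned_abs` computes the magnitude without ever negating in `i32`.

```rust
// Converting a negative i32 (e.g. an exit code) to its magnitude.
// `(-value) as u32` would overflow for i32::MIN; unsigned_abs never does.
fn magnitude(value: i32) -> u32 {
    value.unsigned_abs()
}

fn main() {
    assert_eq!(magnitude(-5), 5);
    // -i32::MIN does not fit in i32, but its magnitude fits in u32.
    assert_eq!(magnitude(i32::MIN), 2_147_483_648);
    println!("ok");
}
```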

let code = output.status.code().unwrap_or(-1);
let stdout = String::from_utf8_lossy(&output.stdout).to_string();
let stderr = String::from_utf8_lossy(&output.stderr).to_string();
Ok((code, stdout, stderr))


Executor runs arbitrary binaries from on-chain data

High Severity

run_job_binary executes whatever path is in binary_ref from on-chain job data using TokioCommand::new(binary). Any on-chain user who creates a job can cause the executor to run arbitrary local commands (e.g., binary_ref = "/bin/rm" with args = ["-rf", "/"]). There's no allowlist, sandboxing, or path restriction — the executor trusts on-chain input to specify the binary to run.


@karlem karlem force-pushed the storage-node-fix branch from c14c584 to 65436c5 Compare April 1, 2026 22:00
} else {
format!("{}...(truncated)", &s[..max])
}
}


String truncation panics on multi-byte UTF-8 boundaries

Medium Severity

The truncate function slices the string at a raw byte offset (&s[..max]), which panics if max falls in the middle of a multi-byte UTF-8 character. The input strings come from String::from_utf8_lossy, which replaces invalid bytes with the 3-byte U+FFFD replacement character. Any non-ASCII process output (or invalid bytes producing replacement characters) can cause the executor to crash when stderr exceeds 512 bytes.

Additional Locations (1)
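A boundary-safe variant can be sketched as follows (`truncate_safe` is an illustrative name, not the PR's function): walk the cut point back until it lands on a `char` boundary, so the slice is always valid UTF-8 even when the budget falls inside a multi-byte character such as U+FFFD.

```rust
// Truncate to at most `max` bytes without splitting a UTF-8 code point.
// `&s[..max]` panics if `max` is not a char boundary; walking back fixes that.
fn truncate_safe(s: &str, max: usize) -> String {
    if s.len() <= max {
        return s.to_string();
    }
    let mut end = max;
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    format!("{}...(truncated)", &s[..end])
}

fn main() {
    // U+FFFD is 3 bytes; a raw cut at byte 4 would land mid-character.
    let noisy = "ab\u{FFFD}\u{FFFD}";
    assert_eq!(truncate_safe(noisy, 4), "ab...(truncated)");
    assert_eq!(truncate_safe("hello", 10), "hello");
    println!("ok");
}
```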

pub struct ExecutionState {
pub next_job_id: u64,
pub jobs: Vec<ExecutionJob>,
}


Unbounded inline Vec in actor state degrades all operations

Medium Severity

ExecutionState stores all jobs in an inline Vec<ExecutionJob> that is serialized as part of the root State tuple. Jobs are never removed. Since State uses Serialize_tuple, the entire jobs vector is deserialized on every actor method call — including unrelated operations like get_stats, add_blob, or credit queries. As jobs accumulate, gas costs for all blobs actor operations grow linearly and the actor can eventually become unusable.

Additional Locations (1)
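One possible shape for the fix, sketched in plain Rust with a `BTreeMap` as a stand-in (in the actual actor this would be a CID-rooted IPLD HAMT so that only the touched entries are loaded, and all names here are illustrative): key jobs by id so a lookup reads one entry instead of deserializing every job on every call.

```rust
use std::collections::BTreeMap;

// Illustrative job record; fields mirror the PR's ExecutionJob loosely.
#[derive(Debug, Clone)]
struct ExecutionJob {
    id: u64,
    binary_ref: String,
}

// Keyed lookup instead of an inline Vec serialized with the root State.
#[derive(Default)]
struct ExecutionState {
    next_job_id: u64,
    jobs: BTreeMap<u64, ExecutionJob>,
}

impl ExecutionState {
    fn create_job(&mut self, binary_ref: String) -> u64 {
        let id = self.next_job_id;
        self.next_job_id += 1;
        self.jobs.insert(id, ExecutionJob { id, binary_ref });
        id
    }

    fn get_job(&self, id: u64) -> Option<&ExecutionJob> {
        self.jobs.get(&id)
    }
}

fn main() {
    let mut st = ExecutionState::default();
    let id = st.create_job("local://echo".to_string());
    assert_eq!(st.get_job(id).unwrap().id, id);
    println!("ok");
}
```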

- Changed the IP address of "validator-1" in `ipc-subnet-config.yml` to ensure correct network routing and connectivity for the validator node.

@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 5 total unresolved issues (including 4 from previous reviews).



Err(e) => return Err(e),
}
}
}


Null round not detected in latest_finalized_parent_view

High Severity

The latest_finalized_parent_view function assumes null rounds are returned as errors containing the string "null round", but the actual get_block_hash implementation (IPCProviderProxy) returns Ok(BlockHashResult { is_null: true, block_hash: vec![] }) for null rounds. The Ok match arm does not check res.is_null, so the function can return an IPCParentFinality with an empty block_hash for a null round instead of skipping it. This incorrect finality can propagate to consensus proposals.

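The missing check can be sketched as follows (`BlockHashResult` is modelled after the shape described above; `finalized_hash` is an illustrative helper, not a function from the PR): an Ok result with `is_null == true` must be treated as a null round and skipped, never turned into finality with an empty hash.

```rust
// Stand-in for the RPC result type described in the review comment.
struct BlockHashResult {
    is_null: bool,
    block_hash: Vec<u8>,
}

// Returns None for a null round so the caller skips the height,
// rather than proposing finality with an empty block hash.
fn finalized_hash(res: BlockHashResult) -> Option<Vec<u8>> {
    if res.is_null {
        return None;
    }
    Some(res.block_hash)
}

fn main() {
    let null_round = BlockHashResult { is_null: true, block_hash: vec![] };
    assert_eq!(finalized_hash(null_round), None);

    let real = BlockHashResult { is_null: false, block_hash: vec![1, 2, 3] };
    assert_eq!(finalized_hash(real), Some(vec![1, 2, 3]));
    println!("ok");
}
```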

