SIMD-0341: v0 Account Compression #341
base: main
Conversation
> - `combined_hash`: 32-byte hash of concatenated account pubkey and current
>   account data
>
> **Behavior**:
What authorization is required to compress/decompress an account?
That depends on the "compression condition", which is intentionally set to false in this SIMD to focus discussion on the mechanism. SIMD-0344 extends the condition to include rent delinquency but other SIMDs can add account authority/owner checks to the condition as well. In any case, it's out of scope for this SIMD.
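For context, a minimal sketch of what that stub looks like (illustrative only; the function name and signature are not taken from the SIMD text):

```rust
// Illustrative sketch only: under this SIMD the compression condition always
// evaluates to false, so no account is eligible for compression yet.
// Follow-up SIMDs (e.g. SIMD-0344's rent delinquency, or owner/authority
// opt-in) would replace the body with real checks.
fn compression_condition(/* account: &AccountSharedData */) -> bool {
    false
}
```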
> - `original_data`: Pointer to the bincode serialization of the original
>   `AccountSharedData` object, which includes the pubkey
> - `data_len`: Length of the original data in bytes
Is it expected that this information would be stored offchain for recovery?
yes, it will be off-chain but who stores the data and how isn't specified here. RPC providers and archival nodes are the more obvious options but wallets can cache account data for users, apps can subscribe to account update events and store the data, etc. It's pretty open ended and the best choice will likely depend on the account and its usage.
> ## Impact
>
> ### DApp and Wallet Developers
If an app or wallet developer wanted to view the data of an account, would this mean that they would have to simulate a transaction with a decompress call to the compression system program?
If the account is compressed, the data isn't available on-chain. If that account needs to be used then it must be decompressed, which requires providing the data. The off-chain source of that data can vary, like I mentioned in a previous comment.
> ### Validators
>
> - **Memory/Storage savings**: If enough accounts are compressed, a
What is the expected storage savings given an account being compressed with X bytes?
Compressed accounts will take up 64 bytes of space (pubkey and data hash), so the space savings will be X - 64. Every active account must at the very least store the pubkey + owner pubkey + lamports + etc so space savings will always be positive. I believe token accounts are the most common and they're about 165 bytes, for reference.
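A quick sketch of that arithmetic, assuming the 64-byte compressed entry (32-byte pubkey plus 32-byte data hash) described in this proposal; the 165-byte figure is just the token-account example from the comment above:

```rust
// Sketch of the space-savings math above; the constant mirrors the proposed
// CompressedAccount entry (pubkey + data_hash).
const COMPRESSED_ENTRY_SIZE: usize = 32 + 32;

/// Bytes saved by compressing an account that occupies `x` bytes on-chain.
fn space_savings(x: usize) -> usize {
    x.saturating_sub(COMPRESSED_ENTRY_SIZE)
}

fn main() {
    // e.g. a ~165-byte token account saves roughly 101 bytes.
    println!("{} bytes saved", space_savings(165));
}
```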
> compressed set
> - **Mitigation**: if the account corresponding to the pubkey was previously
>   compressed it must be recovered rather than recreated. if a collision has
>   occured a different pubkey must be used. RPCs and good errors can provide
> RPCs and good errors can provide the relevant info

This is incredibly important to get right. If we don't outline exactly all the edge cases and how this works for developers, it'll be a nightmare for anyone to integrate with. We should consider putting more detail either in this SIMD on how these edge cases and errors will be displayed, or have a separate document outlining how we can properly set up the right infrastructure to handle the expected behavior with compressed state.
I added more to the SIMD regarding errors and RPC updates that will hopefully mitigate some of the concern. A separate doc on recommended infra setup + workflow changes is a good idea. I'll reach out to some of the RPC providers to see if we can put something like that together.
Would it be low-hanging fruit to allow programs to compress/decompress their own state? Either by allowing them to call the syscall directly, or to CPI into the compression_system_program with the account_to_compress as signer?
Will an account appear as uninitialized if it is compressed? It would be useful for programs to be able to distinguish between uninitialized vs compressed. Or will a compressed account be assigned to the compressed_system_program rather than the normal system_program?
> 1. an economic mechanism to limit state growth
> 2. a predicate to determine which accounts can be compressed
> 3. a compression scheme that removes accounts from the global state while
>    allowing for safe and simple recovery.
Worthwhile to include lowering rent cost in this list too? It's a main motivator in the above paragraph. (Will be a separate SIMD, of course.)
My preference is that the data is stored in a secondary db that can be fetched by RPCs just like a snapshot, but it's not in the hot path for a validator snapshot. RPC providers and indexing protocols can plug it into their systems.
I wouldn't expect this to be too much of a lift -- it would just require modifying the compression condition to allow owners/authorities to compress their accounts. The compression condition is intentionally made to be […]
The intention is that "compressed account : decompression" is functionally very similar to "uninitialized account : account creation". So, unless a compressed account is being decompressed, it should appear the same as an uninitialized account to users and programs. Main rationale is that this is just much simpler and less error prone implementation-wise. I'm curious why being able to distinguish them within the runtime would be useful. If there's a good enough reason I'll think about how best to expose the distinction in the runtime.
Where did the design decision for a 10-byte slice come from? E.g., why not 16 bytes? Also, not sure it's worth the trouble; using the full 32-byte pubkey is easier and avoids having to check whether an account can be created.
The intention was to use the minimum number of bytes to reduce the snapshot size as much as possible. Given what the […]
I’m curious, the document seems to imply that only accounts that meet some TBD criteria can be compressed. Why not allow any account to be compressed as long as the signature or signing seeds are properly provided? Some protocols want to be able to compress their program accounts, on demand, to recover account fees (as is done here).
I don't see any reason we can't eventually support something like this. It'd be a fairly straightforward extension to the compression condition, but I intentionally left that and similar things out of this SIMD to minimize controversy and get alignment on the basics first.
Awesome, fully understand and appreciate that. Just wanted to make sure it was seen as a potentially useful thing to do instead of hard-coding a “last_seen_epoch > 10” or something like that.
Imo there is functionally a difference between […]
Updates:
> **Behavior**:
>
> - MUST verify that the caller is the hardcoded compression system program
Could you specify whether this check happens at program deploy-time or run-time?
> - compute the 32-byte `data_hash`
> - replace the account in account databse with a compressed account entry
> - (optional) emit a compression event for off-chain data archival
`sol_compress_account` needs CU pricing
> - this is treated exactly like a new account allocation so rent
>   requirements, load limits, etc all apply
> - if verification succeeds, the compressed account entry must be replaced
>   with the full account data
`sol_decompress_account` needs CU pricing
> - `data_len`: Length of the account data in bytes
> - `owner`: 32-byte public key of the account owner
> - `executable`: Whether the account is executable
> - `rent_epoch`: The rent epoch of the original account
Syscalls unfortunately can only take 5 arguments (iirc).
Maybe we can use one of the account structs in the program SDK, such as the ones from the CPI syscalls.
> - `data_len`: Length of the account data in bytes
> - `owner`: 32-byte public key of the account owner
> - `executable`: Whether the account is executable
> - `rent_epoch`: The rent epoch of the original account
I'm pretty sure that rent_epoch is not part of the lthash on any clusters anymore. I don't know which SIMD removed rent epoch from the hash though (cc @brooksprumo)
It's fine to remove from this syscall
Yep, rent epoch was removed with the Accounts Lt Hash in SIMD-215. Let's remove it here as well.
> #### `sol_decompress_account(..)`
>
> **Purpose**: Recovers a compressed account by restoring its original data.
I wonder if this syscall interface should be changed to improve user UX.
Correct me if I'm wrong but the implied flow here is as follows (for large accounts):
- User creates a dummy account that will hold a copy of the original data
- User uploads fragments of original account data to target account using multiple transactions
- User invokes the compression program, which invokes this syscall. Under the hood, the runtime allocates the target account and copies over the data
- User cleans up the dummy account
I feel like we could combine some of these steps and skip the syscall copy
Semantically, the compression program itself can delete the dummy account and transfer rent/balance/metadata to the new account. In that case, the copy vs move becomes an implementation detail. Is that sort of what you have in mind?
rather, we can make it so the decompression syscall does the delete. If the program does it then move vs copy isn't an implementation detail.
> the copy vs move becomes an implementation detail. Is that sort of what you have in mind?
@igor56D I feel like it would be very tricky to implement an account move in a syscall handler, since all the surrounding code (transaction executor, VM) assumes that accounts are pinned in memory.
EDIT: Would probably be easier if the move happens in the post-transaction cleanup after the program has finished executing.
good call, I'm being pretty hand-wavy because I'm not as familiar with these low-level details. Conceptually, it seems feasible to take `tmp_pubkey -> data`, "alias" `original_pubkey -> tmp_pubkey -> data` at decompression time, then resolve/collapse that whenever it makes the most sense.
I'll need to do some digging into the execution env to formulate something more concrete.
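A very rough sketch of the aliasing idea, just to show its shape; the `AccountsIndex` type and its methods here are hypothetical and are not how agave's accounts-db is actually structured:

```rust
use std::collections::HashMap;

type Pubkey = [u8; 32]; // stand-in for the real Pubkey type

#[derive(Default)]
struct AccountsIndex {
    data: HashMap<Pubkey, Vec<u8>>,   // pubkey -> account data
    aliases: HashMap<Pubkey, Pubkey>, // original pubkey -> temp/buffer pubkey
}

impl AccountsIndex {
    /// At decompression time: point the original pubkey at the buffer account
    /// that already holds the uploaded data.
    fn alias(&mut self, original: Pubkey, tmp: Pubkey) {
        self.aliases.insert(original, tmp);
    }

    /// Reads follow the alias until it has been collapsed.
    fn get(&self, pubkey: &Pubkey) -> Option<&Vec<u8>> {
        match self.aliases.get(pubkey) {
            Some(tmp) => self.data.get(tmp),
            None => self.data.get(pubkey),
        }
    }

    /// Later (e.g. post-transaction cleanup): move the bytes and drop the alias.
    fn collapse(&mut self, original: Pubkey) {
        if let Some(tmp) = self.aliases.remove(&original) {
            if let Some(bytes) = self.data.remove(&tmp) {
                self.data.insert(original, bytes);
            }
        }
    }
}
```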
> ```rust
> pub struct CompressedAccount {
>     pub pubkey: Pubkey,
>     pub data_hash: [u8; 32],
Please note that the 'lattice hash' is 2048 bytes long, so the spec should mention how to go from the 2048-byte hash to a 32-byte hash. The lattice hash is simply BLAKE3 with extended output (XOF). We can simply take the first 32 bytes here, since BLAKE3 has the convenient property that, for a constant input, the low bytes of the hash output don't change as the XOF output length parameter is increased.
EDIT: `blake3_2048(lthash_preimage)[:32] == blake3_32(lthash_preimage)`
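For what it's worth, the prefix property is easy to check with the `blake3` crate (the preimage bytes below are an arbitrary placeholder, not the real lthash input):

```rust
// Demonstrates that the first 32 bytes of BLAKE3's extended (XOF) output equal
// the standard 32-byte BLAKE3 hash of the same input.
fn main() {
    let lthash_preimage: &[u8] = b"example preimage bytes";

    // blake3_32(lthash_preimage)
    let hash32 = blake3::hash(lthash_preimage);

    // blake3_2048(lthash_preimage)
    let mut hasher = blake3::Hasher::new();
    hasher.update(lthash_preimage);
    let mut out2048 = [0u8; 2048];
    hasher.finalize_xof().fill(&mut out2048);

    // blake3_2048(lthash_preimage)[:32] == blake3_32(lthash_preimage)
    assert_eq!(&hash32.as_bytes()[..], &out2048[..32]);
    println!("prefix property holds");
}
```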
(What we should NOT do is something like `blake3_32(blake3_2048(lthash_preimage))`)
I was using lthash.out as described in SIMD-0215:

> out(a): blake3.fini( blake3.append( blake3.init(), a ) ), i.e. the 32-byte blake3 of the 2048-byte data.

^^ I'm confused by this but I assumed it was what you meant by 32-byte mode. I'll change the doc to explicitly use `blake3_32` to avoid confusion.
Haha yeah we don't have definitions of these blake3 pseudocode functions
> (What we should NOT do is something like `blake3_32(blake3_2048(lthash_preimage))`)
Oops 🙃 I do blake3_32(blake3_2048(..)) for the checksum currently:
https://github.com/anza-xyz/agave/blob/7df820d832d2ae1152e7cdc13e495797600f4b3d/lattice-hash/src/lt_hash.rs#L52-L56
But, that's only used for logging purposes and can be changed. We should do the right thing for compression stuff.
> - **Compression**: replacing arbitrary account data with a fixed size
>   commitment. Not to be confused with traditional compression; the data
>   cannot be directly recovered from compressed state.
"Compression" is overloaded, especially in this space (see: compressed nfts) and as mentioned here, account compression isn't analogous to traditional compression. Using a completely new term might be clearer, maybe "Compacting"? Just a suggestion, feel free to ignore.
I'd also love to not call it compression. I've been in favor of freeze/frozen/thaw/thawed personally.
I also don't like the name but figured it had enough history that renaming would introduce more confusion.
Some alternative naming suggestions: "stubbed" accounts, "digest" accounts, "proxy" accounts
Anything but "compression"! Frozen/thawed sounds better.
> provided in this SIMD -- it is assumed to always be false, meaning no
> account is eligible for compression.
> - `data_hash = lthash.out(Account)` where Account includes the pubkey,
>   lamports, data, owner, executable, and rent_epoch fields. See
rent_epoch is not an input to the hash function.
> ### Syscalls for Compression Operations
>
> The following new syscalls will be introduced to support account compression
> operations. The compression system program will support two instructions that
> just wrap these syscalls. For the time being, these syscalls can only be used
> from the compression system program but that constraint may be relaxed in the
> future.
What is the reason for doing compression/uncompression in syscalls, instead of adding two new system program instructions for doing this? Is there an advantage to these operations being syscalls rather than native program instructions?
Similarly, why a new program? Compression/un-compression feels like it fits within the system program's purview.
> - `account_pubkey`: 32-byte public key of the account to decompress
> - `lamports`: The lamport balance of the original account
> - `data`: Pointer to the original account data bytes
What if the original account data is larger than the max transaction size? There probably needs to be a mechanism for buffering the account data somewhere before uncompressing it.
Yeah, we should do something akin to program deployment.
Agreed with the buffering but thinking out loud, would it actually be beneficial to limit the size of compressed accounts to the max transaction size? If an account is bigger than the tx size limit, maybe it should live uncompressed onchain to prevent tpu/tx bloat?
this shouldn't materially impact tx bloat. If someone wants a large account with specific data on-chain, they need to upload the data in parts across multiple transactions. The same applies to a large account getting decompressed. When an account is decompressed, it persists on-chain like any other active account unless it gets compressed again so subsequent accesses don't need to re-upload.
> When creating new accounts, the runtime MUST verify the target pubkey does not
> already exist as a compressed account, just like with uncompressed accounts.
>
> #### Execution Error
What happens if a user attempts to access a compressed account in the vm? Is this an execution error? I think it's simpler if it's required that all accounts that an instruction accesses (including through nested cpi calls) have already been decompressed before the vm invocation.
During VM execution, attempts to access a compressed account are equivalent to attempts to access a non-existing account, and return the same error.
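A toy sketch of that behavior (types and the error value are made up, this is not agave code), showing a compressed entry surfacing the same error as a missing account inside the VM:

```rust
// Hypothetical account-load step: compressed entries are indistinguishable
// from missing accounts as far as the executing program is concerned.
#[allow(dead_code)]
enum StoredAccount {
    Active(Vec<u8>),      // full account data
    Compressed([u8; 32]), // only the data hash remains on-chain
}

fn load_for_vm(entry: Option<&StoredAccount>) -> Result<&Vec<u8>, &'static str> {
    match entry {
        Some(StoredAccount::Active(data)) => Ok(data),
        // Same error path for compressed and non-existent accounts.
        Some(StoredAccount::Compressed(_)) | None => Err("AccountNotFound"),
    }
}

fn main() {
    let compressed = StoredAccount::Compressed([0u8; 32]);
    // Both loads fail with the same error.
    assert_eq!(load_for_vm(Some(&compressed)), load_for_vm(None));
}
```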
> **Parameters**:
>
> - `account_pubkey`: 32-byte public key of the account to decompress
> - `lamports`: The lamport balance of the original account
What will happen to the lamports when the account is compressed, from a global tracking perspective?
Currently (I think) if you sum up all the lamports in all the accounts it should be equal to the bank capitalization. With this change that will no longer work without decompressing all the accounts.
Maybe they get transferred to some kind of system account?
> subsequent attempts to access the account MUST fail unless the account has
> been decompressed.
>
> While marking the account as compressed must be done synchronously, the
I'm not sure what this paragraph is meant to address, or if I'm missing something. This is an implementation detail for the backing database that is irrelevant to this SIMD.
A validator could never compress the account in the database (this may be useful for an RPC provider) as long as they calculate the lt_hash correctly.
I agree, this is bordering on an implementation detail. I'll remove it to avoid confusion.
> Compressed accounts are stored directly in the account database like regular
> accounts, but with a special compressed account structure:
Seems like implementation details? We may or may not do exactly this. I agree we need the account's pubkey and data_hash, but I envision us doing it as the CompressedAccount struct shows below.
> ### Snapshot Format
>
> - **Impact**: New snapshot format including compressed accounts
I don't think we need (nor want) to change the snapshot format.
We do need to modify the account storage file format though.
> **Critical Implementation Detail**: Validators MUST NOT delete the full account
> data until all `accountSubscribe` subscribers have been notified of the
> compression event. This ensures that off-chain services have the opportunity
> to archive the original account data before it is permanently removed.
>
> This approach enables:
>
> - **Archive services**: Third-party services can maintain comprehensive
>   compressed data archives using existing subscription infrastructure
> - **Application-specific storage**: DApps can store their own compressed
>   account data through established patterns
> - **Redundancy**: Multiple parties can maintain copies for data availability
Could we end up in situations where:

a) the original data of a compressed account is entirely lost
b) the data can be found by some validators, but not others (e.g. some validators are hooked up to a paid 3rd party for storage and some aren't)

What would you expect/want to happen in these scenarios? Can redundancy be truly guaranteed, and can it be fairly distributed to all validators (and quickly enough for a slot, globally)?

Is the expectation that a Solana project would ensure that it does a good job at storing its data off-chain? Let's say there's some trading happening which requires the decompression of an account. If that project turned their archive service off, and validators can't access that data, could that risk assets becoming un-tradeable? This feels like we could risk things becoming centralised. Let me know if I've misunderstood something.
There is a theoretical possibility that compressed account data is completely lost, though it's extremely low. Even if every indexer, RPC provider, wallet provider, etc somehow loses the data (or refuses to release it), you can always fall back to replaying the ledger history from some snapshot before the account was compressed.

The protocol provides the guarantee that you can always recover a compressed account if you can provide the original data. Users/apps must decide for themselves how strong the data availability guarantee must be for their use case, and plan accordingly. The best guarantee comes from running your own full node that's synced with the network. Others may prefer to subscribe to a stream of compression events from an RPC provider and store the compressed data themselves. Less sophisticated users are likely to rely on wallet backups or indexers to keep the data available.

In any case, the complexity is moved out of the protocol and into off-chain infra.

> If that project turned their archive service off, and validators can't access that data, could that risk assets becoming un-tradeable?

In reality, most full node indexers will have this data stored, though this isn't backed by clear-cut incentives for now. Worst case scenario is needing to replay the ledger. There's also the option of storing the data in some distributed storage protocol like Filecoin or IPFS. The availability concern is purely theoretical until the compressed state size grows dramatically, in which case we'll need to consider a more complete data availability system.
> There is a theoretical possibility that compressed account data is completely lost, though it's extremely low.

Guarantees of future availability are essential. This doesn't go far enough.

> fall back to replaying the ledger history from some snapshot before the account was compressed

This is not as easy as it sounds (according to people who have done it and told me it was tough).

> full node indexers will have this data stored

This is a weak economic assumption. There's not much incentive to store terabytes of cold data in a separate data store for the outside chance that someone will pay $0.000025 to read it. Keep in mind the accounts are primarily cold because people aren't using/reading them.

However, some of the discussion further above suggests a solution that comes closer to a guarantee of future availability with a data store that will always exist and does not require replay. The process for thawing a frozen account, which is similar to deploying a program using a buffer account, might also be used to freeze the account: write the account state to the ledger in one or more TXs before freezing it.

- The theoretical possibility of losing the ledger history is almost nil as long as Solana exists.
- The ledger history is replicated across many independent players who have a financial incentive to keep it. Also, the raw data is publicly available at no cost today, so anyone can load historical data if they prefer to roll their own. The economic incentives are aligned.
- The process to thaw a frozen account will still require an off-chain actor to read the account data from the ledger and then use that as an input to the thawing process. However, you won't need to replay TXs; just read the ledger history.

Footnotes:

- The validators are disconnected from the full ledger history, so the entire process can't be done in-protocol. The external actor will be required to read from the ledger history & execute the thawing process.
- Using the ledger history might also give some extra cryptographic peace of mind since the archived account state can be verified within the block by using the hash. In comparison, data stored in an off-chain SQL database might become corrupted or incomplete.

A parting question: Since the validators are not connected to the ledger history, does that mean the ledger history is technically an "off-chain database"? (I'll see myself out) ;-)
So first of all, a Solana "user" is assumed to have a full node, or is paying someone else to run a full node directly or indirectly. There is no way for the protocol to make any guarantees about users' data without this assumption. The best any protocol can do is that if the user's one full node stays consistent then the user is not at risk of any loss.

If a user has access to a full node they can always replicate just the part of the compressed state that the user cares about.

We need tools to automate the compressed state recovery from the ledger.

If you want the weak assumption that there exists at least one full node in the network that will keep the user's state, that user just needs to pay enough to keep it from being compressed. But this is also a weak economic incentive.
> We need tools to automate the compressed state recovery from the ledger.

The elegance of storing the frozen account state in the ledger is that the DB storage & retrieval tools already exist. `getSignaturesForAddress()` + `getTransaction()` or an Old Faithful gRPC stream will return the account state without a specialized SQL index or API. The new tooling will be related to the freeze/thaw processes.
None that can offer a guarantee of completeness or future availability. For example, streaming accounts through Geyser to a database won't work because there's no guarantee that messages will be received or that the database will be complete. Losing a single message could have catastrophic consequences for the account holder.
I think all of your points about the tricky implementation are valid. Nonetheless, the guarantee is more important than the implementation challenge. Saving data in the ledger checks the boxes for storage & retrieval guarantees, universal availability, & economic alignment for the infrastructure providers.
I can't figure out a way to make this work well enough. Even if we introduce the complexity I mentioned above, compression requests will still be prohibitively expensive. It's very important that compression is cheap -- the network needs to be able to promptly and sufficiently respond to growing state/rent prices by evicting delinquent state, which may require a lot of compression requests. Requiring all the compressed state to flow through the blocks will be a challenge for throughput.
You mentioned that replaying the ledger is difficult now. Would addressing that problem make more sense? Ledger history can tell you the slot the account was compressed at, so you'll know how far to replay. Not sure how much we can rely on historical snapshot retention, but reliable replay + snapshot fetching should make account reconstruction relatively doable.
if that doesn't seem sufficient, I'd prefer @aeyakovenko's previous idea of maintaining a separate compressed state snapshot that's replicated on all validators over making compression really difficult.
> maintaining a separate compressed state snapshot that's replicated on all validators

This also works, and was my original suggestion at some earlier meetups. Hot & Cold account sets with separate snapshots & hashes. NVMe storage is affordable, and drive sizes keep getting bigger. JWash mentioned a concern about validator start times when the validator needs to validate two account DBs. That could be a more solvable problem and provide the guarantees.
P.S. Validator startup times will be slower, but all good validators run hot spares, so there's minimal downtime during failover. Slow startup times are manageable.
Could we add a transaction-scoped (ephemeral) decompression syscall so a tx can read (and optionally commit a write to) a compressed account? We would introduce a couple of other new syscalls:

```rust
fn sol_transient_decompress_account(
    compressed_acct_pubkey: Pubkey, // aka index into accounts array
    current_state: &[u8],           // full account struct bytes
);

fn sol_transient_commit_compressed_account(
    compressed_acct_pubkey: Pubkey,
    new_state: &[u8],
);
```

The runtime just verifies that the provided state matches the stored `data_hash`. Example pseudo-code of how this would be used in a tx:

```
compressed_account_state = [...]
compressed_account_pubkey = compressedAccount1A3b5D7f9GhKm2pQ4rStVxYz6uL8

TX(a):
  accounts: [owner, compressed_account_pubkey, new_owner, SYSTEM_PROGRAM]
  signers: [owner]
  instructions:
    0: sol_transient_decompress_account(compressed_account_pubkey, compressed_account_state)
    1: transfer_asset_ownership(compressed_account_pubkey, new_owner);
       sol_transient_commit_compressed_account(compressed_account_pubkey, new_state)
```

The other […]

This covers the common pattern where an authority needs to read or locally authorize effects (NFTs/RWAs/hotspots, etc.) but doesn't need the account to remain "active" after the tx. It avoids touching AccountsDB entirely in the read-only case, and avoids the rehydrate/recompress loop in the write-through case. It also keeps snapshots small since the compressed entry never expands.

This wouldn't work well with accounts meant to be "global", shared state (DEX orderbooks, AMMs) that needs a consistent, persistent view across txs. Those should just remain in the active set. Also, this probably only works well for small account sizes. Probably should limit it to accounts that are at most N bytes, as you don't want each tx submitting excess ix bytes causing TPU bloat.

EDIT: Maybe we don't need […]
@ultd IIUC this isn't incompatible with the existing proposal -- no new syscalls would need to be introduced. This proposal intentionally leaves the compression condition itself stubbed out to focus discussion on the compression mechanism rather than all the potential ways to use it (ephemeral compressed accounts, rent eviction, etc). For example, if there was a subsequent SIMD that expanded that compression condition to include something like […]