Description
Follow-on to #593.
This specification describes Ethereum 2.0's networking wire protocol.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
Use of libp2p
This protocol uses the libp2p networking stack. libp2p provides a composable wrapper around common networking primitives, including:
- Transport.
- Encryption.
- Stream multiplexing.
Clients MUST comply with the corresponding libp2p specification whenever libp2p-specific protocols are mentioned. This document will link to those specifications when applicable.
Client Identity
Identification
Upon first startup, clients MUST generate an RSA key pair in order to identify the client on the network. The SHA-256 multihash of the public key is the client's Peer ID, which is used to look up the client in libp2p's peer book and allows a client's identity to remain constant across network changes.
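As a rough illustration of the Peer ID derivation, the sketch below hashes a public key with SHA-256, wraps the digest in a multihash (code 0x12, length 0x20), and prints it in base58. The function names and the placeholder key bytes are assumptions made for this example; real libp2p implementations hash a specific serialized form of the key, which is not reproduced here.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data: bytes) -> str:
    """Encode bytes using the Bitcoin base58 alphabet."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = BASE58_ALPHABET[remainder] + encoded
    # Each leading zero byte is represented by a leading '1'.
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def sha256_multihash(payload: bytes) -> bytes:
    """Prefix a SHA-256 digest with its multihash code (0x12) and length (0x20)."""
    digest = hashlib.sha256(payload).digest()
    return bytes([0x12, len(digest)]) + digest


def peer_id_from_public_key(serialized_public_key: bytes) -> str:
    """Derive a human-readable Peer ID from serialized public key bytes."""
    return base58_encode(sha256_multihash(serialized_public_key))


# Placeholder bytes standing in for a serialized RSA public key (assumption).
example_peer_id = peer_id_from_public_key(b"placeholder public key bytes")
```

Because every SHA-256 multihash starts with the same two bytes, Peer IDs derived this way share the familiar Qm prefix seen in the addressing example.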
Addressing
Clients on the Ethereum 2.0 network are identified by multiaddrs. multiaddrs are self-describing network addresses that include the client's location on the network, the transport protocols it supports, and its peer ID. For example, the human-readable multiaddr for a client located at example.com, available via TCP on port 8080, and with peer ID QmUWmZnpZb6xFryNDeNU7KcJ1Af5oHy7fB9npU67sseEjR would look like this:
/dns4/example.com/tcp/8080/p2p/QmUWmZnpZb6xFryNDeNU7KcJ1Af5oHy7fB9npU67sseEjR
We refer to the /dns4/example.com part as the 'lookup protocol', the /tcp/8080 part as the 'networking protocol', and the /p2p/<peer ID> part as the 'identity protocol'.
Clients MAY use either dns4 or ip4 lookup protocols. Clients MUST set the networking protocol to /tcp followed by a port of their choosing. It is RECOMMENDED to use the default port of 9000. Clients MUST set the identity protocol to /p2p/ followed by their peer ID.
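The addressing rules above can be checked mechanically. The following is a minimal sketch that splits a human-readable multiaddr into its lookup, networking, and identity parts and rejects anything outside the rules in this section. The ClientAddress type and parse_client_multiaddr function are hypothetical names for this example and are not part of any libp2p library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientAddress:
    lookup_protocol: str  # "dns4" or "ip4"
    host: str
    port: int
    peer_id: str


def parse_client_multiaddr(multiaddr: str) -> ClientAddress:
    """Split /<lookup>/<host>/tcp/<port>/p2p/<peer ID> and validate each part."""
    parts = multiaddr.split("/")
    # A well-formed address splits into: "", lookup, host, "tcp", port, "p2p", peer ID.
    if len(parts) != 7 or parts[0] != "":
        raise ValueError("unexpected multiaddr shape: " + multiaddr)
    _, lookup, host, networking, port, identity, peer_id = parts
    if lookup not in ("dns4", "ip4"):
        raise ValueError("lookup protocol must be dns4 or ip4")
    if networking != "tcp":
        raise ValueError("networking protocol must be tcp")
    if identity != "p2p" or not peer_id:
        raise ValueError("identity protocol must be /p2p/<peer ID>")
    if not host:
        raise ValueError("missing host")
    if not port.isdigit() or not 0 < int(port) < 65536:
        raise ValueError("invalid TCP port")
    return ClientAddress(lookup, host, int(port), peer_id)


example = parse_client_multiaddr(
    "/dns4/example.com/tcp/8080/p2p/QmUWmZnpZb6xFryNDeNU7KcJ1Af5oHy7fB9npU67sseEjR"
)
```

This only handles the textual form shown above; binary multiaddr encoding is left to the libp2p implementation.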
Relevant libp2p Specifications
Transport
Clients communicate with one another over a TCP stream. Through that TCP stream, clients receive messages either as a result of a 1-1 RPC request/response between peers or via pubsub broadcasts.
Weak-Subjectivity Period
Some of the message types below depend on a calculated value called the 'weak subjectivity period' to be processed correctly. The weak subjectivity period is a function of the size of the validator set at the last finalized epoch. The goal of the weak-subjectivity period is to define the maximum number of validator set changes a client can tolerate before requiring out-of-band information to resync.
The definition of this function will be added to the 0-beacon-chain specification in the coming days.
Messaging
All ETH 2.0 messages conform to the following structure:
+--------------------------+
| protocol path |
+--------------------------+
| compression ID |
+--------------------------+
| |
| compressed body |
| (SSZ encoded) |
| |
+--------------------------+
The protocol path is a human-readable prefix that identifies the message's contents. It is compliant with the libp2p multistream specification. For example, the protocol path for libp2p's internal ping message is /p2p/ping/1.0.0. All protocol paths include a version for future upgradeability. In practice, client implementors will not have to manually prepend the protocol path since libp2p implements this as part of the libp2p library.
The compression ID is a single-byte sigil that denotes which compression algorithm is used to compress the message body. Currently, the following compression algorithms are supported:
- ID 0x00: no compression
- ID 0x01: Snappy compression
We suggest starting with Snappy because of its high throughput (~250MB/s without needing assembler), permissive license, and availability in a variety of different languages.
Finally, the compressed body is the SSZ-encoded message body after being compressed by the algorithm denoted by the compression ID.
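To make the framing concrete, here is a minimal sketch of how the compression ID and body could be packed and unpacked once libp2p has handled the protocol path. The function names and the codec registry are assumptions for this example. Only the identity codec (0x00) is registered, because Snappy is not in the Python standard library; a Snappy codec would be registered under 0x01 by whatever library the client chooses.

```python
from typing import Callable, Dict, Tuple

NO_COMPRESSION = 0x00
SNAPPY = 0x01

# Each codec is a (compress, decompress) pair keyed by its compression ID.
Codec = Tuple[Callable[[bytes], bytes], Callable[[bytes], bytes]]

# Assumption: only the identity codec is available in this sketch.
CODECS: Dict[int, Codec] = {
    NO_COMPRESSION: (lambda body: body, lambda body: body),
}


def frame_body(ssz_body: bytes, compression_id: int = NO_COMPRESSION) -> bytes:
    """Prefix the (compressed) SSZ body with its single-byte compression ID."""
    if compression_id not in CODECS:
        raise ValueError("unsupported compression ID: 0x%02x" % compression_id)
    compress, _ = CODECS[compression_id]
    return bytes([compression_id]) + compress(ssz_body)


def unframe_body(framed: bytes) -> bytes:
    """Read the compression ID and return the decompressed SSZ body."""
    if not framed:
        raise ValueError("missing compression ID")
    compression_id = framed[0]
    if compression_id not in CODECS:
        raise ValueError("unsupported compression ID: 0x%02x" % compression_id)
    _, decompress = CODECS[compression_id]
    return decompress(framed[1:])
```

Keeping the codecs in a table means a new compression ID can be supported without touching the framing logic.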
Relevant Specifications
Messages
The schema of message bodies is notated like this:
(
field_name_1: type
field_name_2: type
)
SSZ serialization is field-order dependent. Therefore, fields MUST be encoded and decoded according to the order described in this document. The encoded values of each field are concatenated to form the final encoded message body. Embedded structs are serialized as Containers unless otherwise noted.
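The sketch below illustrates field-order-dependent encoding using the field layout of the hello message defined later in this document. It is not a full SSZ implementation: it assumes fixed-width little-endian integers and omits any length prefix or other framing the SSZ specification may require for containers. The Hello class and both function names are hypothetical.

```python
import struct
from dataclasses import dataclass


@dataclass
class Hello:
    network_id: int
    latest_finalized_root: bytes
    latest_finalized_epoch: int
    best_root: bytes
    best_slot: int


# "<" selects little-endian with no padding (an assumption about byte order).
# B = uint8, 32s = bytes32, Q = uint64, in the order the fields are declared.
HELLO_LAYOUT = struct.Struct("<B32sQ32sQ")


def encode_hello(message: Hello) -> bytes:
    """Concatenate the encoded fields in declaration order."""
    for root in (message.latest_finalized_root, message.best_root):
        if len(root) != 32:
            raise ValueError("roots must be exactly 32 bytes")
    return HELLO_LAYOUT.pack(
        message.network_id,
        message.latest_finalized_root,
        message.latest_finalized_epoch,
        message.best_root,
        message.best_slot,
    )


def decode_hello(data: bytes) -> Hello:
    """Decode the fields in the same order they were encoded."""
    return Hello(*HELLO_LAYOUT.unpack(data))
```

Swapping two fields in the layout string would still decode without error but would yield different values, which is why the order in this document is normative.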
All ETH 2.0 RPC messages prefix their protocol path with /eth/serenity.
Handshake
Hello
Protocol Path: /eth/serenity/hello/1.0.0
Body:
(
network_id: uint8
latest_finalized_root: bytes32
latest_finalized_epoch: uint64
best_root: bytes32
best_slot: uint64
)
Clients exchange hello messages upon connection, forming a two-phase handshake. The first message the initiating client sends MUST be the hello message. In response, the receiving client MUST respond with its own hello message.
Clients SHOULD immediately disconnect from one another following the handshake above under the following conditions:
- If network_id belongs to a different chain, since the client definitionally cannot sync with this client.
- If the time between each peer's latest_finalized_epoch exceeds the weak-subjectivity period, since syncing with this client would be unsafe.
- If the latest_finalized_root shared by the peer is not in the client's chain at the expected epoch. For example, if Peer 1 in the diagram below has (root, epoch) of (A, 5) and Peer 2 has (B, 3), Peer 1 would disconnect because it knows that B is not the root in their chain at epoch 3:
              Root A

              +---+
              |xxx|  +----+ Epoch 5
              +-+-+
                ^
                |
              +-+-+
              |   |  +----+ Epoch 4
              +-+-+
Root B          ^
                |
+---+         +-+-+
|xxx+<---+--->+   |  +----+ Epoch 3
+---+    |    +---+
         |
       +-+-+
       |   |  +-----------+ Epoch 2
       +-+-+
         ^
         |
       +-+-+
       |   |  +-----------+ Epoch 1
       +---+
Once the handshake completes, the client with the higher latest_finalized_epoch or best_slot (if the clients have equal latest_finalized_epochs) SHOULD send beacon block roots to its counterparty via beacon_block_roots.
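The post-handshake decisions described above could be sketched as follows. This is an illustration under stated assumptions: the weak-subjectivity period is treated as a number of epochs, the local chain is represented as a hypothetical mapping from epoch to finalized root, and all class and function names are invented for this example.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Hello:
    network_id: int
    latest_finalized_root: bytes
    latest_finalized_epoch: int
    best_root: bytes
    best_slot: int


def should_disconnect(
    local: Hello,
    remote: Hello,
    local_roots_by_epoch: Dict[int, bytes],
    weak_subjectivity_period: int,
) -> bool:
    """Apply the three disconnect conditions to a received hello message."""
    # 1. The peer is on a different chain.
    if remote.network_id != local.network_id:
        return True
    # 2. The finalized epochs are further apart than the weak-subjectivity
    #    period (assumed here to be measured in epochs).
    gap = abs(local.latest_finalized_epoch - remote.latest_finalized_epoch)
    if gap > weak_subjectivity_period:
        return True
    # 3. The peer's finalized root conflicts with our chain at that epoch.
    #    Epochs we have no record of cannot be checked and are let through.
    expected = local_roots_by_epoch.get(remote.latest_finalized_epoch)
    if expected is not None and expected != remote.latest_finalized_root:
        return True
    return False


def should_send_block_roots(local: Hello, remote: Hello) -> bool:
    """True if we have the higher finalized epoch, or the higher best_slot on a tie."""
    ours = (local.latest_finalized_epoch, local.best_slot)
    theirs = (remote.latest_finalized_epoch, remote.best_slot)
    return ours > theirs
```

Comparing (epoch, slot) tuples gives the tie-breaking behavior directly: best_slot only matters when the finalized epochs are equal.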
RPC
These protocols represent RPC-like request/response interactions between two clients. Clients send serialized request objects to streams at the protocol paths described below, and wait for a response. If no response is received within a reasonable amount of time, clients MAY disconnect.
Beacon Block Roots
Protocol Path: /eth/serenity/rpc/beacon_block_roots/1.0.0
Body:
# BlockRootSlot
(
block_root: HashTreeRoot
slot: uint64
)
(
roots: []BlockRootSlot
)
Send a list of block roots and slots to the peer.
Beacon Block Headers
Protocol Path: /eth/serenity/rpc/beacon_block_headers/1.0.0
Request Body:
(
start_root: HashTreeRoot
start_slot: uint64
max_headers: uint64
skip_slots: uint64
)
Response Body:
# Eth1Data
(
deposit_root: bytes32
block_hash: bytes32
)
# BlockHeader
(
slot: uint64
parent_root: bytes32
state_root: bytes32
randao_reveal: bytes96
eth1_data: Eth1Data
body_root: HashTreeRoot
signature: bytes96
)
(
headers: []BlockHeader
)
Requests beacon block headers from the peer starting from (start_root, start_slot). The response MUST contain fewer than max_headers headers. skip_slots defines the maximum number of slots to skip between blocks. For example, requesting blocks starting at slot 2 with a skip_slots value of 2 would return the blocks at [2, 4, 6, 8, 10]. In cases where a slot is undefined for a given slot number, the closest previous block MUST be returned. For example, if slot 4 were undefined in the previous example, the returned array would contain [2, 3, 6, 8, 10]. If slot 3 were also undefined, the array would contain [2, 6, 8, 10] - i.e., duplicate blocks MUST be collapsed.
The skip_slots parameter helps facilitate light client sync - for example, in #459 - and allows clients to balance the peers from whom they request headers. A client could, for instance, request every 10th block from a set of peers, where each peer has a different starting block, in order to populate block data.
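The slot selection rules for header requests can be sketched as below. The function works on slot numbers only and ignores start_root; the name select_header_slots and the available_slots argument (the slots for which the responder has a block) are assumptions for this example. It reads "fewer than max_headers" literally, so it stops before the response would reach max_headers entries.

```python
from bisect import bisect_right
from typing import Iterable, List


def select_header_slots(
    available_slots: Iterable[int],
    start_slot: int,
    skip_slots: int,
    max_headers: int,
) -> List[int]:
    """Pick the slots whose headers would be returned for a request."""
    if skip_slots < 1:
        raise ValueError("skip_slots must be at least 1 in this sketch")
    available = sorted(set(available_slots))
    if not available:
        return []
    selected: List[int] = []
    target = start_slot
    # Stop at the last known slot, or once one more header would reach max_headers.
    while target <= available[-1] and len(selected) + 1 < max_headers:
        # Index of the closest block at or before the target slot.
        index = bisect_right(available, target)
        if index > 0:
            candidate = available[index - 1]
            # Collapse duplicates: the same block may satisfy several targets.
            if not selected or candidate != selected[-1]:
                selected.append(candidate)
        target += skip_slots
    return selected
```

With blocks at slots 1 through 10, a request for start_slot 2 and skip_slots 2 selects [2, 4, 6, 8, 10]; removing slot 4, and then slot 3, reproduces the other two arrays given in the text.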
Beacon Block Bodies
Protocol Path: /eth/serenity/rpc/beacon_block_bodies/1.0.0
Request Body:
(
block_roots: []HashTreeRoot
)
Requests the block_bodies associated with the provided block_roots from the peer. Responses MUST return block_bodies in the order of the block_roots provided in the request. If the receiver does not have a particular block_root, it MUST return a zero-value block_body (i.e., a zero-filled bytes32).
Response Body:
For type definitions of the below objects, see the 0-beacon-chain specification.
# BlockBody
(
proposer_slashings: []ProposerSlashing
attester_slashings: []AttesterSlashing
attestations: []Attestation
deposits: []Deposit
voluntary_exits: []VoluntaryExit
transfers: []Transfer
)
(
block_bodies: []BlockBody
)
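The ordering and zero-value rules for block body responses could look like the sketch below on the responding side. Bodies are treated as opaque, already-serialized bytes, and the lookup table bodies_by_root and the function name are assumptions made for this example.

```python
from typing import Dict, List

# Zero-filled bytes32 returned in place of a body the responder does not have.
ZERO_BODY = bytes(32)


def collect_block_bodies(
    requested_roots: List[bytes],
    bodies_by_root: Dict[bytes, bytes],
) -> List[bytes]:
    """Return one entry per requested root, in request order."""
    return [bodies_by_root.get(root, ZERO_BODY) for root in requested_roots]
```

Returning a placeholder instead of omitting the entry keeps the response positionally aligned with the request, so the requester can match bodies to roots by index.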
Beacon Chain State
Note: This section is preliminary, pending the definition of the data structures to be transferred over the wire during fast sync operations.
Protocol Path: /eth/serenity/rpc/beacon_chain_state/1.0.0
Request Body:
(
hashes: []HashTreeRoot
)
Requests contain the hashes of Merkle tree nodes that, when merkleized, yield the block's state_root.
Response Body: TBD
The response will contain the values that, when hashed, yield the hashes inside the request body.
Broadcast
These protocols represent 'topics' that clients can subscribe to via GossipSub.
Beacon Blocks
The response bodies of each topic below map to the response bodies of the Beacon RPC methods above. Note that since broadcasts have no concept of a request, any limitations to the RPC response bodies do not apply to broadcast messages.
Topics:
- beacon/block_roots
- beacon/block_headers
- beacon/block_bodies
Voluntary Exits
Topic: beacon/exits
Body:
See the 0-beacon-chain spec for the definition of the VoluntaryExit type.
(
exit: VoluntaryExit
)
Transfers
Topic: beacon/transfer
Body:
See the 0-beacon-chain spec for the definition of the Transfer type.
(
transfer: Transfer
)
Clients MUST ignore transfer messages if transfer.slot < current_slot - GRACE_PERIOD, where GRACE_PERIOD is an integer that represents the number of slots that a remote peer is allowed to drift from current_slot in order to take potential network time differences into account.
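A minimal sketch of the grace-period check follows. The value chosen for GRACE_PERIOD is purely illustrative, since this document does not fix one, and the function name is hypothetical.

```python
# Hypothetical value; the specification does not define GRACE_PERIOD here.
GRACE_PERIOD = 4


def should_ignore_transfer(
    transfer_slot: int,
    current_slot: int,
    grace_period: int = GRACE_PERIOD,
) -> bool:
    """True if the transfer is older than the allowed drift from current_slot."""
    # Python integers cannot underflow. An implementation using uint64 would
    # need to guard against current_slot being smaller than grace_period.
    return transfer_slot < current_slot - grace_period
```

For example, with current_slot at 100 and a grace period of 4, a transfer for slot 95 is ignored while one for slot 96 is still processed.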
Shard Attestations
Topics: shard-{number}, where number is an integer in [0, SHARD_SUBNET_COUNT), and beacon/attestations.
The Attestation object below includes fully serialized AttestationData in its data field. See the 0-beacon-chain spec for the definition of the Attestation type.
Body:
(
attestations: []Attestation
)
Only aggregate attestations are broadcast to the beacon/attestations topic.
Clients SHOULD NOT send attestations for shards that the recipient is not interested in. Clients receiving uninteresting attestations MAY disconnect from senders.
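The topic naming and the aggregate-only rule can be sketched as below. The value of SHARD_SUBNET_COUNT is a placeholder, because this document does not define it, and both function names are hypothetical.

```python
# Placeholder value; SHARD_SUBNET_COUNT is not defined in this document.
SHARD_SUBNET_COUNT = 10

BEACON_ATTESTATIONS_TOPIC = "beacon/attestations"


def shard_topic(number: int, subnet_count: int = SHARD_SUBNET_COUNT) -> str:
    """Build the topic name for a shard subnet in [0, subnet_count)."""
    if not 0 <= number < subnet_count:
        raise ValueError("shard subnet number out of range")
    return "shard-%d" % number


def may_publish(topic: str, is_aggregate: bool) -> bool:
    """Only aggregate attestations may go to the beacon/attestations topic."""
    if topic == BEACON_ATTESTATIONS_TOPIC:
        return is_aggregate
    return True
```

How shards map onto subnet numbers is not specified here, so the sketch only validates the range.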
Relevant Specifications
Client Synchronization
When a client joins the network, or has otherwise fallen behind the latest_finalized_root or latest_finalized_epoch, the client MUST perform a sync in order to catch up with the head of the chain. This specification defines two sync methods:
- Standard: Used when clients already have state at latest_finalized_root or latest_finalized_epoch. In a standard sync, clients process per-block state transitions until they reach the head of the chain.
- Fast: Used when clients do not have state at latest_finalized_root or latest_finalized_epoch. In a fast sync, clients use RPC methods to download nodes in the state tree for a given state_root via the /eth/serenity/rpc/beacon_chain_state/1.0.0 endpoint. The basic algorithm is as follows:
  - Peer 1 and Peer 2 connect. Peer 1 has (C, 1) and Peer 2 has (A, 5). Peer 1 validates that this new head is within the weak subjectivity period.
  - If the head is within the weak subjectivity period, Peer 1 checks the validity of the new chain by verifying that all children point to valid parent roots.
  - Peer 1 then takes the state root of (A, 5) and sends /eth/serenity/rpc/beacon_chain_state/1.0.0 requests recursively to its peers in order to build its SSZ BeaconState.
Note that nodes MUST perform a fast sync if they do not have state at their starting finalized root. For example, if Peer 1 in the example above did not have the state at (C, 1), Peer 1 would have to perform a fast sync because it would have no base state to compute transitions from.
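The choice between the two sync methods could be sketched as below. The function and enum names are hypothetical, and the weak-subjectivity period is assumed to be expressed in epochs, which this document does not confirm.

```python
from enum import Enum
from typing import Optional


class SyncMethod(Enum):
    STANDARD = "standard"
    FAST = "fast"


def plan_sync(
    local_finalized_epoch: int,
    remote_finalized_epoch: int,
    weak_subjectivity_period: int,
    has_state_at_finalized_root: bool,
) -> Optional[SyncMethod]:
    """Pick a sync method, or None if no sync should be attempted with this peer."""
    # Nothing to do if the peer is not ahead of us.
    if remote_finalized_epoch <= local_finalized_epoch:
        return None
    # A head outside the weak-subjectivity period is unsafe to sync to
    # (the period is assumed here to be a number of epochs).
    if remote_finalized_epoch - local_finalized_epoch > weak_subjectivity_period:
        return None
    # Without a base state there is nothing to apply block transitions to.
    if not has_state_at_finalized_root:
        return SyncMethod.FAST
    return SyncMethod.STANDARD
```

In the example above, Peer 1 at epoch 1 connecting to Peer 2 at epoch 5 would run a standard sync if it holds the state at (C, 1) and a fast sync otherwise.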
Open Questions
Encryption
This specification does not currently define an encrypted transport mechanism because the set of libp2p-native encryption libraries is limited. libp2p currently supports an encryption scheme called SecIO, which is a variant of TLSv1.3 that uses a peer's public key for authentication rather than a certificate authority. While SecIO is available for public use, it has not been audited and is going to be deprecated when TLSv1.3 ships.
Another potential solution would be to support an encryption scheme such as Noise. The Lightning Network team has successfully deployed Noise in production to secure inter-node communications.
Granularity of Topics
This specification defines granular GossipSub topics - i.e., beacon/block_headers vs. simply beacon. The goal of using granular topics is to simplify client development by defining a single payload type for each topic. For example, beacon/block_headers will only ever contain block headers, so clients know the content type without needing to read the message body itself. This may have drawbacks. For example, having too many topics may hinder peer discovery speed. If this is the case, this specification will be updated to use less granular topics.
Block Structure Changes
The structure of blocks may change due to #649. Changes that affect this specification will be incorporated here once the PR is merged.