feat: minimal e2e framework by cmwaters · Pull Request #1586 · celestiaorg/celestia-app

cmwaters · 2023-03-31T14:41:48Z

Ref: #1256

This PR puts together the base pieces for an e2e testing suite. It includes a CLI that can:

- Set up the file directories for multiple nodes, including generating a genesis with several accounts
- Start the testnet (with different versions and at different start heights)
- Stop the network and clean up the used resources

Further work aims to support:

- compatibility testing
- upgrade testing
- sync testing
- fuzz testing
  - for non-determinism
  - invariant checking
  - as a generalized tool for other teams

codecov-commenter · 2023-03-31T14:57:56Z

Codecov Report

Merging #1586 (9a6e52f) into main (058442d) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main    #1586   +/-   ##
=======================================
  Coverage   50.96%   50.96%           
=======================================
  Files          92       92           
  Lines        5751     5751           
=======================================
  Hits         2931     2931           
  Misses       2520     2520           
  Partials      300      300

evan-forbes

this is dope! I appreciate the similarities to the tendermint e2e test, and see how we can add the desired tests.

Are we wanting to eventually enable celestia-node to use this as well? I could see them being able to reuse a lot of the code here.

Mainly I just had a few questions, but other than that I think it LGTM, and we should merge and start adding tests/iterating as needed.

testing/e2e/cmd/e2e/main.go

testing/e2e/pkg/exec.go

testing/e2e/pkg/setup.go

evan-forbes · 2023-04-02T19:48:36Z

testing/e2e/pkg/setup.go

+services:
+{{- range .Nodes }}
+  {{ .Name }}:
+    labels:
+      e2e: true
+    container_name: {{ .Name }}
+    image: ghcr.io/celestiaorg/celestia-app:{{ index .Versions 0 }}
+    entrypoint: ["/bin/celestia-appd"]
+    command: ["start"]
+    init: true


My local docker setup might be wonky, because I had to add user: root in order to get the test to work. Without it, the test appears to be waiting, but none of the nodes actually start because they do not have permission to write to the volume.

does anyone know what I should do to fix that locally? change the group permissions I have for docker? cc @sweexordious

Hmm. The docker image runs as user 1001; however, I made the permissions of the volume 0o777, so everyone should have read, write, and execute access.

cmwaters · 2023-04-03T08:04:02Z

Are we wanting to eventually enable celestia-node to use this as well? I could see them being able to reuse a lot of the code here.

Yeah, potentially. Matt mentioned having a common repository for e2e testing of various node types in a system. If we are to do this, I'd definitely recommend reusing parts of this code.

evan-forbes

👍

rach-id

Awesome stuff 🎉

When I run this locally, the validators start correctly, but when I want to stop them using ctrl + c, everything hangs until I kill the terminal session.
When this happens, I check the running docker images and I see none, so probably this is not related to docker runtime.

Is that the correct way to stop the framework?
Is anyone having the same issue, or is it just me?

testing/e2e/pkg/rpc.go

cmwaters · 2023-04-03T13:38:02Z

Is that the correct way to stop the framework?

It should catch the SIGTERM and just cancel the context. Let me dig in a little. In any case, running stop and then cleanup should stop the network, but I guess you're referring to running the entire sequence.

MSevey · 2023-04-03T17:02:55Z

cc @celestiaorg/testing for visibility

Bidon15 · 2023-04-04T08:18:12Z

Few questions from my side

Why did we opt for an in-house framework instead of using interchaintest?

state-sync example
chain-upgrade example

why not just re-use their implementation of chain creation using docker-compose and networking?

https://github.com/strangelove-ventures/interchaintest/blob/main/chain/cosmos/cosmos_chain.go

It has the same capabilities as we have rn.

Same with network pruning/setup
https://github.com/strangelove-ventures/interchaintest/blob/main/internal/dockerutil/setup.go

imho, it looks like they have taken care of docker network cleanup, which is clumsy

if we want more control over implementation - why not reuse compose-spec like libp2p does?

Seeing the matrix versioning schema that libp2p does - we can def reuse the hard work they pushed.

Pros of above

Both already have a CI/CD implementation, meaning that we can speed up dev time and contribute back if there is something we really need.

Bidon15 · 2023-04-04T08:23:21Z

My main concern with this PR - we are signing up to maintain it long-term, while there are tools that can help us achieve the same without maintaining it (interchaintest is constantly being updated, and the same goes for libp2p/test-plan)
cc: @MSevey

cmwaters · 2023-04-04T12:23:54Z

Hey,

Why did we opt for an in-house framework instead of using interchaintest?

The simple answer is that we have a modified fork of the SDK which is not compatible with interchaintest. There are changes I've spoken about in the past that we can (and I think probably should) make, which would remove this compatibility issue and eliminate the need to maintain the SDK fork.

Secondly, interchaintest is very IBC orientated. This means some of the initial goals of upgrade testing (we will have a different upgrade mechanism to most cosmos chains), compatibility testing and sync testing don't seem to natively fit their framework.

Lastly, for an open source library, I think their documentation and tutorials are lacking (although the inline commenting is relatively helpful). As well as maintenance, we also have to consider the onboarding cost of understanding a new (and large) codebase (when something simpler might suffice).

My main concern with this PR - we are signing up to maintain it long-term

I'm sympathetic to this. Ideally we reach our testing goals with little duplication and overlap. When we spoke prior to this it seemed like testground was not suitable. Rene tells me that robusta uses a different more light-weight framework that may be ideal for the outlined purposes. I'm not sure about the other (libp2p) library you mention but I definitely think that what's done here is quite generic: It's simply managing a series of binaries within a network and using the RPC endpoints to make assertions.

What I'm more concerned about is that it seems like these testing goals are perhaps the same across the other teams (node have their "swamp" testing) which seems, at least to me, to heavily overlap with what core/app is doing. If we have a "testing" team then perhaps efforts from individual teams can be consolidated

rootulp

Overall LGTM. I'm encountering an error while trying to run locally

$ ./build/e2e -f networks/simple.toml
Setting up network simple-53126
Spinning up testnet
Starting validator01 at height 0 on http://localhost:4202
Error: starting network: exec: "docker compose": executable file not found in $PATH

I have docker compose installed in my $PATH:

$ docker-compose --version
Docker Compose version v2.15.1
$ which docker-compose
/usr/local/bin/docker-compose

Should we list that installing docker compose is a prerequisite (potentially in README.md)?
Am I missing a different prerequisite step?

rootulp · 2023-04-04T17:36:35Z

go.mod

 	github.com/dgraph-io/badger/v2 v2.2007.4 // indirect
 	github.com/dgraph-io/ristretto v0.1.0 // indirect
 	github.com/dgryski/go-farm v0.0.0-20200201041132-a6ae2369ad13 // indirect
+	github.com/docker/docker v20.10.21+incompatible


[informational] I was concerned by the +incompatible suffix but it seems ok per https://stackoverflow.com/a/57356777

rootulp · 2023-04-04T19:31:07Z

testing/e2e/pkg/rpc.go

+		return nil, errors.New("network is not running")
+	}
+
+	// return heights in ascending order


[nit] this line claims to sort heights in ascending order but the function godoc comment claims to return heights in descending order.

// GetHeights loops through all running nodes and returns an array of heights
// in order of highest to lowest.
func GetHeights(ctx context.Context, testnet *Testnet) ([]int64, error) {

Bidon15 · 2023-04-05T09:43:35Z

There are 2 topics that got confused and we clarified them here.

The original idea behind this PR is for teams to do nightly tests using minimal e2e framework. I thought that this is only for core/app and integration ones executed on every PR.

Going forward, we separated into 2 topics:

Nightly Tests.
Integration Tests that are executed per PR.

Nightly tests

@celestiaorg/devops will implement infra features for robusta-nightlies after @cmwaters shares the notion doc with us.
Robusta-nightly is a good point as the telemetry is already there and a fuzzing client can be plugged in easily.

Integration tests

This PR should be used at the integration level, run per PR. In addition, we need to migrate it to a standalone repo, so @celestiaorg/celestia-node and @rollkit can use it to fulfil their testing strategies.

In addition, we will reuse the compose-spec that libp2p/test-plan has to make it CI/CD ready out-of-the-box

Thanks for the input and clarifications 🚀

staheri14

Nice work on this PR. 👍

staheri14 · 2023-04-06T23:52:59Z

testing/e2e/pkg/exec.go

+	"path/filepath"
+)
+
+// execute executes a shell command.


Suggested change

-// execute executes a shell command.
+// exec executes a shell command.

staheri14 · 2023-04-07T00:07:31Z

testing/e2e/pkg/testnet.go

+	Key    crypto.PrivKey
+}
+
+func LoadTestnet(manifest Manifest, file string) (*Testnet, error) {


[nit-optional] I think filePath is closer to what this parameter means.

Suggested change

-func LoadTestnet(manifest Manifest, file string) (*Testnet, error) {
+func LoadTestnet(manifest Manifest, filePath string) (*Testnet, error) {

staheri14 · 2023-04-07T00:23:36Z

testing/e2e/pkg/testnet.go

+
+// Address returns an RPC endpoint address for the node.
+func (n Node) AddressRPC() string {
+	return fmt.Sprintf("%v:26657", n.IP.String())


[nit-optional]: Introducing a named constant for the port number 26657 would improve the code's maintainability and readability.

staheri14 · 2023-04-07T00:25:47Z

testing/e2e/pkg/testnet.go

+}
+
+func (n Node) IsValidator() bool {
+	return n.SelfDelegation != 0


[optional] Since a non-zero SelfDelegation is the only requirement for being recognized as a validator, why not use a boolean type for SelfDelegation instead?

Because we may want to set the self delegation to an actual number, which will reflect the initial voting power of the validator.

evan-forbes · 2023-04-18T12:16:34Z

morge?

cmwaters · 2023-04-23T19:29:13Z

I have docker compose installed in my $PATH:

Sorry for missing this @rootulp. docker compose isn't the same binary as docker-compose; it's a subcommand of the docker binary that was added recently. If you haven't got it, then I think you need to upgrade your docker version.

minimal e2e framework

d42b418

cmwaters requested a review from evan-forbes as a code owner March 31, 2023 14:41

MSevey requested review from a team and rootulp and removed request for a team March 31, 2023 14:42

merge with main

a4b32f7

MSevey requested a review from a team March 31, 2023 14:45

rootulp assigned cmwaters Mar 31, 2023

lint

fdb185b

cmwaters requested a review from rach-id as a code owner March 31, 2023 15:01

more linting

7c280fc

evan-forbes reviewed Apr 2, 2023

implement suggestions

4888dd4

MSevey requested a review from a team April 3, 2023 08:08

evan-forbes approved these changes Apr 3, 2023

rach-id reviewed Apr 3, 2023

evan-forbes requested a review from staheri14 April 3, 2023 15:12

evan-forbes removed the request for review from staheri14 April 3, 2023 18:06

evan-forbes added the testing items that are strictly related to adding or extending test coverage label Apr 3, 2023

evan-forbes added this to the Mainnet milestone Apr 3, 2023

rootulp reviewed Apr 4, 2023

staheri14 previously approved these changes Apr 7, 2023

evan-forbes previously approved these changes Apr 17, 2023

Bidon15 previously approved these changes Apr 20, 2023

This was referenced Apr 21, 2023

testing/integration: implement shrex suite celestiaorg/celestia-node#2119

Closed

testing/integration: implement blob suite celestiaorg/celestia-node#2120

Closed

Merge branch 'main' into cal/e2e

96fae19

cmwaters dismissed stale reviews from Bidon15, evan-forbes, and staheri14 via 96fae19 April 23, 2023 19:30

MSevey requested a review from a team April 23, 2023 19:30

go mod

36d9c0f

evan-forbes approved these changes Apr 24, 2023

Merge branch 'main' into cal/e2e

9a6e52f

MSevey requested a review from a team April 24, 2023 14:13

rootulp approved these changes Apr 25, 2023

cmwaters merged commit 23fb8f3 into main Apr 25, 2023

cmwaters deleted the cal/e2e branch April 25, 2023 08:33
