-
Notifications
You must be signed in to change notification settings - Fork 67
Network: support down NIC #130
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: FingerLeader <[email protected]>
Signed-off-by: FingerLeader <[email protected]>
Signed-off-by: FingerLeader <[email protected]>
Signed-off-by: FingerLeader <[email protected]>
Signed-off-by: FingerLeader <[email protected]>
Signed-off-by: FingerLeader <[email protected]>
Signed-off-by: FingerLeader <[email protected]>
Signed-off-by: FingerLeader <[email protected]>
Signed-off-by: FingerLeader <[email protected]>
Signed-off-by: FingerLeader <[email protected]>
Signed-off-by: FingerLeader <[email protected]>
Signed-off-by: FingerLeader <[email protected]>
Signed-off-by: FingerLeader <[email protected]>
|
[REVIEW NOTIFICATION] This pull request has been approved by:
To complete the pull request process, please ask the reviewers in the list to review by filling The full list of commands accepted by this bot can be found here. DetailsReviewer can indicate their review by submitting an approval review. |
Signed-off-by: FingerLeader <[email protected]>
b864518 to
e19ff4a
Compare
Signed-off-by: FingerLeader <[email protected]>
pkg/server/chaosd/network.go
Outdated
| NICDownCommand := fmt.Sprintf("ifconfig %s down", attack.Device) | ||
|
|
||
| cmd := exec.Command("bash", "-c", NICDownCommand) | ||
| if err = cmd.Start(); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I run it under a non-root user, it runs failed but did not report any error here. Because it will not wait for the command to complete.
https://pkg.go.dev/os/exec#Cmd.Start
It's better to use cmd.CombinedOutput.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But using cmd.CombinedOutput or cmd.Run will make process stalled when exectuing sleep $duration, is it acceptable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's OK. chaosd should't exit before recover it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If chaosd will not exit, I think nohup script is useless. Should I remove it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we can remove the nohup in cmd
| return nil | ||
| } | ||
|
|
||
| func (s *Server) recoverNICDownScheduled(attack *core.NetworkCommand) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this way, we need to set the experiment's status to Destroyed.
I think it's better to refine the schedule framework, and let it recover automatically by the schedule framework.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can refine the schedule framework in next pr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this solution is temporary, real schedule is need, especially for some dangerous experiments.
Signed-off-by: FingerLeader <[email protected]>
Signed-off-by: FingerLeader <[email protected]>
WangXiangUSTC
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rest LGTM
| return errors.WithStack(err) | ||
| } | ||
|
|
||
| if attack.Duration != "-1" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this should be len(attack.Duration) != 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
len(attack.Duration) has been checked before this function, -1 means attack will run continuously. It will be marked in doc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not check this field before executing this attack?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This field has been checked for necessity before executing, if len(n.Duration) == 0 { return errors.New("duration is required") }, the judgement here is only to decide whether to run continuously.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember you add a note that the attack.Duration is required, but if users set it to -1, this attack won't be recovered automatically. This behavior is very dangerous. Can we forbid users to set it to -1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, this -1 option is added on purpose, because I think some user may need this option to let the experiment run continuously, especially those who run NIC down locally rather than remotely. And attack.Duration is hardly ever set to negative numbers, I think -1 is enough to be a warning sign (maybe -2333 or -32768 is better).
WangXiangUSTC
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
pkg/server/chaosd/network.go
Outdated
| recoverCmd := exec.Command("bash", "-c", NICUpCommand) | ||
| stdout, err := recoverCmd.CombinedOutput() | ||
| if err != nil { | ||
| log.Error(recoverCmd.String()+string(stdout), zap.Error(err)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can delete this log. refer to https://github.com/chaos-mesh/rfcs/blob/main/text/2021-12-09-logging.md#relations-between-error-and-logging
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed these log.Error, PTAL : )
Signed-off-by: FingerLeader <[email protected]>
| if err != nil { | ||
| return errors.WithStack(err) | ||
| } | ||
| attack.IPAddress = strings.TrimRight(string(stdoutBytes), "\n\x00") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add some comments to explain why need to trim \x00
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DONE IT!
Signed-off-by: FingerLeader <[email protected]>
Signed-off-by: FingerLeader <[email protected]>
Signed-off-by: FingerLeader <[email protected]>
cwen0
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
/merge |
|
This pull request has been accepted and is ready to merge. DetailsCommit hash: 2af752c |
Notice: The parameter
durationis set to mandatory, because disabling the remote server's NIC is dangerous.