-
Notifications
You must be signed in to change notification settings - Fork 944
feature: stats of cri manager #1431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
198958c to
f34e77e
Compare
Codecov Report
@@ Coverage Diff @@
## master #1431 +/- ##
==========================================
- Coverage 41.45% 39.71% -1.75%
==========================================
Files 273 274 +1
Lines 17714 18584 +870
==========================================
+ Hits 7344 7381 +37
- Misses 9461 10294 +833
Partials 909 909
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First Round Review
| return nil, fmt.Errorf("failed to create sandbox meta store: %v", err) | ||
| } | ||
|
|
||
| imageFSPath := imageFSPath(path.Join(config.HomeDir, "containerd/root"), defaultSnapshotterName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we should add utils' function to retrieve the root folder in the project next step.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A separate PR will be submitted to deal with similar things.
| for { | ||
| err := s.Sync() | ||
| if err != nil { | ||
| logrus.Errorf("failed to sync snapshot stats: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should add TODO for this line, because it maybe print error message every 10 seconds.
The TODO is to track the error and report it to the monitor or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
daemon/mgr/snapshot.go
Outdated
| if sn, ok := s.snapshots[key]; ok { | ||
| return sn, nil | ||
| } | ||
| return Snapshot{}, fmt.Errorf("failed to get %q in snapshot store", key) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should return const ErrNotFound error here. No need to custom right there.
Therefore, we can use the const error to check result of Get and make the code clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
| sn.Inodes = uint64(usage.Inodes) | ||
| s.store.Add(sn) | ||
| } | ||
| for _, sn := range s.store.List() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you mind to add more information about this part? For example, about the delete logic. Why do you only delete the out of date snapshot?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose of this operation is to keep the synchronization with client.SnapshotService.
A case:
when you remove a container,you also need to remove snapshot. However, SnapshotStore will not be notified. So we need to delete snapshots from SnapshotStore that doesn't exist actually.
| return nil, fmt.Errorf("ContainerStats Not Implemented Yet") | ||
| containerID := r.GetContainerId() | ||
|
|
||
| container, err := c.ContainerMgr.Get(ctx, containerID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we use a function to get the cs? If we have the function, we will reuse the function in the ListContainerStats. Don't repeat yourself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it is not necessary to do that, the function getContainerMetrics has been encapsulated, what has been repeated is just error handling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so. The function like generateContainerStatInfo wraps the c.ContainerMgr.Stats and c.getContainerMetrics to avoid the repeat the same thing. In the future, you only need to update the wrapped function.
Or use the getContainerMetrics to contains the c.ContainerMgr.Stats, WDTY?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have to admit it's more elegant. Thank you.
| timestamp := time.Now().UnixNano() | ||
| var usedBytes, inodesUsed uint64 | ||
| for _, sn := range snapshots { | ||
| // Use the oldest timestamp as the timestamp of imagefs info. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add more information about this logic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Timestamp is latest time which the information were collected, we hava to be consistent. So I choose the oldest timestamp as the timestamp of imagefs info here.
e2690e7 to
6584c5d
Compare
| return nil, fmt.Errorf("ContainerStats Not Implemented Yet") | ||
| containerID := r.GetContainerId() | ||
|
|
||
| container, err := c.ContainerMgr.Get(ctx, containerID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so. The function like generateContainerStatInfo wraps the c.ContainerMgr.Stats and c.getContainerMetrics to avoid the repeat the same thing. In the future, you only need to update the wrapped function.
Or use the getContainerMetrics to contains the c.ContainerMgr.Stats, WDTY?
cri/v1alpha1/cri_utils.go
Outdated
| // deviceUUID gets device uuid of a device. The passed in rdev should | ||
| // be linux device number. | ||
| func deviceUUID(rdev uint64) (string, error) { | ||
| const uuidDir = "/dev/disk/by-uuid" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we put the const at the beginning of file?
cri/v1alpha1/cri_utils.go
Outdated
| if err != nil { | ||
| return 0, err | ||
| } | ||
| stat := info.Sys().(*syscall.Stat_t) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://golang.org/pkg/os/#FileInfo Sys() maybe nil and could we use stat, ok := info.Sys().(*syscall.Stat_t) to handle here? Because the assert failure will cause the panic.
cri/v1alpha1/cri_utils.go
Outdated
| anno "github.com/alibaba/pouch/cri/annotations" | ||
| "github.com/alibaba/pouch/daemon/mgr" | ||
| "github.com/alibaba/pouch/pkg/utils" | ||
| "github.com/go-openapi/strfmt" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please move it into next group
cb1cdb3 to
02f17da
Compare
daemon/mgr/snapshot.go
Outdated
| } | ||
|
|
||
| // NewSnapshotsSyncer creates a snapshot syncer. | ||
| func NewSnapshotsSyncer(store *SnapshotStore, cli ctrd.APIClient, period time.Duration) *SnapshotsSyncer { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NewSnapshotsSyncer will not be exported, rename it to newSnapshotsSyncer would be better.
|
Except the minor issue mentioned above, LGTM |
Signed-off-by: Starnop <[email protected]>
|
LGTM |
Signed-off-by: Starnop [email protected]
Ⅰ. Describe what this PR did
This PR implement ContainerStats(), ListContainerStats() and ImageFsInfo() of CRI manager.
All of these interfaces are used to retrieve the metrics of containers or image file system info.
Ⅱ. Does this pull request fix one issue?
None
Ⅲ. Describe how you did it
Ⅳ. Describe how to verify it
Use crictl tools:
or use curl tools:
Ⅴ. Special notes for reviews