[orchagent] Set ABRT signal in STATE_DB during a SAI failure #2556

Closed
vivekrnv wants to merge 18 commits into sonic-net:master from vivekrnv:orch_abrt


@vivekrnv vivekrnv commented Dec 5, 2022

Signed-off-by: Vivek Reddy Karri [email protected]

What I did

  • On a SAI programming failure, orchagent now sets the ORCH_ABRT_STATUS flag in STATE_DB. The flag is deleted when orchagent restarts.
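
The flag lifecycle described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `FakeStateDb`, `markAbort`, and `clearAbortOnStart` are hypothetical names, and the in-memory map stands in for the real STATE_DB connection so the sketch compiles on its own. Only the key name `ORCH_ABRT_STATUS` and the value `1` come from the description.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical in-memory stand-in for STATE_DB, used only so this sketch
// is self-contained. The PR itself talks to the database via DBConnector.
class FakeStateDb {
public:
    void set(const std::string& key, const std::string& value) { data_[key] = value; }
    void del(const std::string& key) { data_.erase(key); }
    bool exists(const std::string& key) const { return data_.count(key) != 0; }

    // Returns an empty string when the key is absent.
    std::string get(const std::string& key) const {
        auto it = data_.find(key);
        return it == data_.end() ? std::string() : it->second;
    }

private:
    std::unordered_map<std::string, std::string> data_;
};

// Key name taken from the PR description.
static const std::string kOrchAbrtKey = "ORCH_ABRT_STATUS";

// Assumed to run on the SAI failure path, before the process exits.
void markAbort(FakeStateDb& db) { db.set(kOrchAbrtKey, "1"); }

// Assumed to run once during orchagent startup.
void clearAbortOnStart(FakeStateDb& db) { db.del(kOrchAbrtKey); }
```

In this sketch the flag stays set from the moment of failure until the next startup, which is the window in which other processes could read it.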

Why I did it

How I verified it

Simulate a SAI failure:

Nov 21 18:14:59.131240 mtbc-sonic-01-2410 ERR swss#orchagent: :- create: create status: SAI_STATUS_INVALID_ATTRIBUTE_MAX
Nov 21 18:14:59.131240 mtbc-sonic-01-2410 ERR swss#orchagent: :- sflowCreateSession: Failed to create sample packet session with rate 512
Nov 21 18:14:59.131240 mtbc-sonic-01-2410 ERR swss#orchagent: :- handleCreate: Encountered failure in create operation, exiting orchagent, SAI API: SAI_API_SAMPLEPACKET, status: SAI_STATUS_INVALID_ATTRIBUTE_MAX

Check that STATE_DB is updated after the failure, and that the flag is cleared once orchagent is restarted:

root@mtbc-sonic-01-2410:/home/admin# sonic-db-cli STATE_DB GET ORCH_ABRT_STATUS
1
root@mtbc-sonic-01-2410:/home/admin#

root@mtbc-sonic-01-2410:/home/admin# sonic-db-cli STATE_DB GET ORCH_ABRT_STATUS

root@mtbc-sonic-01-2410:/home/admin#

Details if related


vivekrnv commented Dec 6, 2022

/azpw run Azure.sonic-swss

@mssonicbld
Collaborator

/AzurePipelines run Azure.sonic-swss

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

DBConnector state_db("STATE_DB", 0);

/* Clear the ORCH_ABRT_STATUS flag in STATE_DB */
state_db.del(ORCH_ABRT);
Collaborator

Contributor Author

@vivekrnv vivekrnv Dec 6, 2022

That can be done, but clearing it here gives more time buffer to the processes dependent on this flag.

extern bool gLogRotate;
extern string gRecordFile;

void notifyAbort(){
Collaborator

The API name does not match its functionality: the function both notifies and aborts, but the name suggests it only notifies. Please either move the abort back to the original code and use this function only to notify, or rename it to notifyAndAbort.
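
For illustration, the two options could look roughly like the sketch below. It is not the PR's code: `FlagStore` is a hypothetical in-memory stand-in for STATE_DB, and the injectable `terminate` callback is an assumption added here so the sketch can be exercised without killing the process. Only the names `notifyAbort`, `notifyAndAbort`, and `ORCH_ABRT_STATUS` come from this thread.

```cpp
#include <cstdlib>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical in-memory stand-in for STATE_DB.
using FlagStore = std::unordered_map<std::string, std::string>;

// Option 1: the function only notifies; the caller stays responsible
// for aborting, as in the original code path.
void notifyAbort(FlagStore& stateDb) {
    stateDb["ORCH_ABRT_STATUS"] = "1";
}

// Option 2: the name states both effects. `terminate` defaults to
// std::abort; it is a parameter only so this sketch is testable.
void notifyAndAbort(FlagStore& stateDb,
                    const std::function<void()>& terminate = [] { std::abort(); }) {
    notifyAbort(stateDb);  // set the flag first so it is visible after exit
    terminate();
}
```

Either way, the flag is written before the process terminates, so the name change affects readability rather than behavior.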

Contributor Author

will update

Contributor Author

Handled

Signed-off-by: Vivek Reddy Karri <[email protected]>
@vivekrnv vivekrnv requested a review from prsunny December 9, 2022 00:24
@vivekrnv
Contributor Author

/azpw run Azure.sonic-swss

@mssonicbld
Collaborator

/AzurePipelines run Azure.sonic-swss

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@vivekrnv
Contributor Author

/azpw run Azure.sonic-swss

@mssonicbld
Collaborator

/AzurePipelines run Azure.sonic-swss

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@vivekrnv
Contributor Author

/azpw run Azure.sonic-swss

@mssonicbld
Collaborator

/AzurePipelines run Azure.sonic-swss

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@vivekrnv vivekrnv closed this Jan 5, 2023