Skip to content

[ntp]: Add configuration to avoid ntpd from panic and exit#4263

Merged
lguohan merged 2 commits intosonic-net:masterfrom
SuvarnaMeenakshi:sumeenak_ntp
Mar 22, 2020
Merged

[ntp]: Add configuration to avoid ntpd from panic and exit#4263
lguohan merged 2 commits intosonic-net:masterfrom
SuvarnaMeenakshi:sumeenak_ntp

Conversation

@SuvarnaMeenakshi
Copy link
Contributor

- What I did
Add configuration to avoid ntpd from panic and exit if the drift between new time and current system time is large.
- How I did it
Added "tinker panic 0" in ntp.conf file.

- How to verify it
[this assumes that there is a valid NTP server IP in config_db/ntp.conf]

  1. Change the current system time to a bad time with a large drift from time in ntp server; drift should be greater than 1000s.
  2. Reboot the device.

Before the fix:
3. upon reboot, ntp-config service comes up fine, ntp service goes to active(exited) state without any error message. This is because the offset between new time (from ntp server) and the current system time is very large, ntpd goes to panic mode and exits. The system continues to show the bad time.

After the fix:
3. Upon reboot, ntp-config comes up fine, ntp services comes up from and stays in active (running) state. The system clock gets synced with the ntp server time.

- Description for the changelog

- A picture of a cute animal (not mandatory but encouraged)

the drift between new time and current system time is large.
@jleveque
Copy link
Contributor

Retest vs please

@jleveque
Copy link
Contributor

Retest vsimage please

@zhenggen-xu
Copy link
Collaborator

With ntp "-g" starting option, I,E reverting: #2589 , Do you still see the panic? That "-g" worked well for me for difference cases, so just wondering if you see a case where even with "-g" you could get panic. Also I understood this PR won't hurt, just wanted to see if we can get to the bottom of the issue.

@jleveque
Copy link
Contributor

@SuvarnaMeenakshi: Have you had a chance to test @zhenggen-xu's hypothesis? The PR he is referring to was reverted in master yesterday via #4265.

@SuvarnaMeenakshi
Copy link
Contributor Author

With ntp "-g" starting option, I,E reverting: #2589 , Do you still see the panic? That "-g" worked well for me for difference cases, so just wondering if you see a case where even with "-g" you could get panic. Also I understood this PR won't hurt, just wanted to see if we can get to the bottom of the issue.

I tested with "-g" option, that after reverting #2589. I don't see the panic. "-g" works, "tinker panic 0" also works, deciding to keep both. Tested keeping both the options, and works fine.

@SuvarnaMeenakshi
Copy link
Contributor Author

retest vsimage please

@lguohan lguohan merged commit cfe754f into sonic-net:master Mar 22, 2020
rlhui pushed a commit that referenced this pull request Mar 23, 2020
- What I did
Add configuration to avoid ntpd from panic and exit if the drift between new time and current system time is large.

- How I did it
Added "tinker panic 0" in ntp.conf file.

- How to verify it
[this assumes that there is a valid NTP server IP in config_db/ntp.conf]

Change the current system time to a bad time with a large drift from time in ntp server; drift should be greater than 1000s.
Reboot the device.
Before the fix:
3. upon reboot, ntp-config service comes up fine, ntp service goes to active(exited) state without any error message. This is because the offset between new time (from ntp server) and the current system time is very large, ntpd goes to panic mode and exits. The system continues to show the bad time.

After the fix:
3. Upon reboot, ntp-config comes up fine, ntp services comes up from and stays in active (running) state. The system clock gets synced with the ntp server time.
yxieca pushed a commit that referenced this pull request Apr 3, 2020
- What I did
Add configuration to avoid ntpd from panic and exit if the drift between new time and current system time is large.

- How I did it
Added "tinker panic 0" in ntp.conf file.

- How to verify it
[this assumes that there is a valid NTP server IP in config_db/ntp.conf]

Change the current system time to a bad time with a large drift from time in ntp server; drift should be greater than 1000s.
Reboot the device.
Before the fix:
3. upon reboot, ntp-config service comes up fine, ntp service goes to active(exited) state without any error message. This is because the offset between new time (from ntp server) and the current system time is very large, ntpd goes to panic mode and exits. The system continues to show the bad time.

After the fix:
3. Upon reboot, ntp-config comes up fine, ntp services comes up from and stays in active (running) state. The system clock gets synced with the ntp server time.
tiantianlv pushed a commit to SONIC-DEV/sonic-buildimage that referenced this pull request Apr 24, 2020
…ic-net#4263)

- What I did
Add configuration to avoid ntpd from panic and exit if the drift between new time and current system time is large.

- How I did it
Added "tinker panic 0" in ntp.conf file.

- How to verify it
[this assumes that there is a valid NTP server IP in config_db/ntp.conf]

Change the current system time to a bad time with a large drift from time in ntp server; drift should be greater than 1000s.
Reboot the device.
Before the fix:
3. upon reboot, ntp-config service comes up fine, ntp service goes to active(exited) state without any error message. This is because the offset between new time (from ntp server) and the current system time is very large, ntpd goes to panic mode and exits. The system continues to show the bad time.

After the fix:
3. Upon reboot, ntp-config comes up fine, ntp services comes up from and stays in active (running) state. The system clock gets synced with the ntp server time.
@SuvarnaMeenakshi SuvarnaMeenakshi deleted the sumeenak_ntp branch December 14, 2020 09:28
mssonicbld added a commit that referenced this pull request Feb 27, 2026
…lly (#25635)

#### Why I did it
src/sonic-swss
```
* b2c4e649 - (HEAD -> 202511, origin/202511) [202511][ssw][ha] vnetorch support primary/secondary live update for custom bfd (#4228) (6 hours ago) [Jing Zhang]
* d317c8f9 - [countersyncd]: Fix cargo bench compiling error (#4263) (17 hours ago) [mssonicbld]
* 725d6660 - [HFT]: keep STATE_DB session stream_status in sync with profile stream_state (#4190) (22 hours ago) [mssonicbld]
* fa01e8d2 - [countersyncd]: Modify the exit behavior of the main function (#4225) (27 hours ago) [mssonicbld]
* 82243e7f - [macsecmgr]: Fix MACsec unconfigure failures due to time out (#4233) (2 days ago) [mssonicbld]
* 8d476206 - [ci] Install redis from Debian (#4251) (3 days ago) [mssonicbld]
* 0c11b096 - [Bug] This patch is to fix a major bug in the for loop of the (#4173) (3 days ago) [ashishalok-arista]
```
#### How I did it
#### How to verify it
#### Description for the changelog
mssonicbld added a commit that referenced this pull request Mar 13, 2026
…atically (#25254)

#### Why I did it
src/sonic-utilities
```
* 20a7131b - (HEAD -> master, origin/master, origin/HEAD) clear: make --namespace optional for arp and ndp commands (#4355) (5 minutes ago) [Oleksandr Ivantsiv]
* f56e4a78 - show version: replace --verbose with --brief flag (#4350) (20 hours ago) [Ashwin Srinivasan]
* 5e50cf3d - Wait for monit monitor <service> operation to complete during config (#4295) (23 hours ago) [Hemanth Kumar Tirupati]
* 0306ea20 - Change sensorshow conn to use TCP socket (#4343) (2 days ago) [Chenyang Wang]
* cb5b3e82 - Fix route_check.py redis client memory usage (#4217) (2 days ago) [Roee Bar]
* e93a5c3c - config: allow golden config to override mac, platform, asic_id (#4348) (2 days ago) [securely1g]
* 0024c8d4 - Add non -B- hwsku names as well (#4331) (2 days ago) [dakotac-arista]
* eb7301cc - Fix unit tests (#4345) (3 days ago) [william8545]
* 052199c0 - [Arista] Add Arista-7050CX3-32C-C28S4 to generic_config_updater (#4257) (4 days ago) [byu343]
* ed68290a - Add multi-ASIC namespace support for show/config subinterface(s) command (#4298) (4 days ago) [william8545]
* 9c9f099d - New CLI proposal for PHY diagnostics (#4214) (4 days ago) [Prince George]
* 9e3373df - Fix generate_dump to preserve per-ASIC subdirectory structure for sdk_dbg collection (#4334) (4 days ago) [william8545]
* 3fe8972f - Add multi-ASIC namespace support for ARP/NDP show and clear commands (#4231) (4 days ago) [Oleksandr Ivantsiv]
* be5fe2aa - Add multi-ASIC namespace support for VLAN and FDB operations (#4230) (4 days ago) [Oleksandr Ivantsiv]
* e74fca78 - Add multi-ASIC namespace support for static route commands (#4269) (4 days ago) [Oleksandr Ivantsiv]
* 599e7c71 - Add multi-ASIC namespace support for ACL table add/remove commands (#4270) (4 days ago) [Oleksandr Ivantsiv]
* d09d6cd6 - Add CLI support for "show interfaces <intf> <phy-signal/phy-serdes>" commands (#4312) (4 days ago) [prajjwal-arista]
* 345f5686 - Add multi-asic namespace support for IPv6 link-local commands (#4289) (4 days ago) [william8545]
* edd4b190 - Add multi-asic namespace support for crm show resources command (#4290) (4 days ago) [william8545]
* 2b52a051 - [multi-asic] Add namespace support for vxlan and vnet show/config commands (#4299) (4 days ago) [william8545]
* 03160905 - [fast-reboot][cosmetic] Fixed debug/error prints with the correct reboot type (#4285) (4 days ago) [Yair Raviv]
* 6eedf8a7 - [warm-reboot][multi-asic] Added error-handling for faulty ASIC/s after orchagent freeze (#4297) (4 days ago) [Yair Raviv]
* 2330bab5 - [BMC] Add new BMC CLIs for manual session management and reset root password (#4238) (4 days ago) [Ben Levi]
* 4d0cc933 - Fix issue: pmon services's restart count is not cleared during config reload (#4314) (4 days ago) [Stephen Sun]
* 0a1bbc55 - Fix the generate_dump for BCM Asic Q3D (#4326) (6 days ago) [saksarav-nokia]
* 1580ccce - GCU generates suboptimal plan for CreateOnly paths (#4335) (6 days ago) [Brad House - Nexthop]
* 369e703e - GCU: Add path tracing support (#4317) (7 days ago) [Brad House - Nexthop]
* bc05e1a4 - [GCU]: Restart telemetry container on port speed change via GCU to handle OID update (#4248) (7 days ago) [Xincun Li]
* 73f1ea51 - Fix warning messages due to nose test deprecation (#4322) (8 days ago) [Brad House - Nexthop]
* ebfefbd8 - [Arista] Add TH5 HWSKU to list for pfcwd support (#4329) (8 days ago) [dakotac-arista]
* 0d969b85 - [DPU] Add support for HA Set Counters (#4283) (8 days ago) [Connor Roos]
* 44f8c37b - [DPU] Add CLI to trigger and dump flows (#4278) (8 days ago) [Vivek]
* 76bf567e - [show interfaces] "show interfaces flap" command does not support multi-ASIC platforms (#4316) (9 days ago) [pnakka28]
* 2ec21e19 - Limit PFC WD Detection time to maximum value of 1000ms (#4306) (9 days ago) [Hemanth Kumar Tirupati]
* 99b1b76a - Modified dualtor_neighbor_check to use mux neighbor_mode (#4227) (10 days ago) [manamand2020]
* 5dfd11ed - Fix 'show version' KeyError when sonic_version.yml has missing fields (#4324) (10 days ago) [securely1g]
* 4c77f9d4 - fix: skip PORT_INGRESS/EGRESS_MIRROR_CAPABLE check for ERSPAN mirror sessions (#4323) (11 days ago) [bingwang-ms]
* d8d2a39e - fix scapy delayed import when we have large routes (#4315) (11 days ago) [Hemanth Kumar Tirupati]
* c6601cda - [LACP retry-count] Syntax Fix for Trixie (#4274) (11 days ago) [Yair Raviv]
* f54d0a7c - Add fsync to config save to persist config across power cycle (#4313) (11 days ago) [Jianyue Wu]
* e5f77f61 - Fix unit test assertions broken by spelling typo PRs (#4321) (13 days ago) [rustiqly]
* 7660b19f - Fix spelling typos in muxcable modules (#4259) (2 weeks ago) [rustiqly]
* f7d820f3 - Fix spelling typos in config/main.py (#4261) (2 weeks ago) [rustiqly]
* 244942bd - Fix spelling typos in scripts/ (#4262) (2 weeks ago) [rustiqly]
* 89001b10 - Fix spelling typos in show/ and clear/ modules (#4263) (2 weeks ago) [rustiqly]
* d6e646c2 - Fix spelling typos in config/config_mgmt.py (#4260) (2 weeks ago) [rustiqly]
* e244129c - Fix spelling typos in config/nat.py (#4258) (2 weeks ago) [rustiqly]
* 5a0c48f0 - In route_check.py, Convey the IJSON Backend using an env variable (#4294) (2 weeks ago) [venkit-nexthop]
* e2712fc1 - Fix spelling typos across utilities_common, config plugins, and misc modules (#4264) (2 weeks ago) [rustiqly]
* 4211edee - Fixed show vxlan remotemac ambiguity (#4121) (2 weeks ago) [Gnanapriya [Marvell]]
* cfd23f97 - Add FEC histograms to generate_dump output (#4244) (2 weeks ago) [Fraser Gordon]
* 8882a633 - [storm-control] Fixed show storm-control interface command display (#4122) (2 weeks ago) [Gnanapriya [Marvell]]
* 7a1e656e - [fibshow]: Fix exception when blackhole routes are present (#4189) (2 weeks ago) [Ravi Minnikanti(Marvell)]
* 2b3f14de - [marvell-teralynx] Enhance techsupport to include HWSKU configs (#4161) (3 weeks ago) [Naveen-Rampuram]
* 9cb7b3e6 - Merge pull request #4275 from tirupatihemanth/fix_scapy_lagkeepalive (3 weeks ago) [Ying Xie]
|\ 
| failure_prs.log skip_prs.log 7e54ddff - Fix delayed scapy import when we have a lot of routes (3 weeks ago) [Hemanth Kumar Tirupati]
* | cbb31f0d - [multi-asic] fix utilities_common Db helper (#4273) (3 weeks ago) [Yakiv Huryk]
* | f65ddfa2 - Prevent early exit of reboot status (#4282) (3 weeks ago) [Gagan Punathil Ellath]
* | 14840074 - [fast-reboot] Remove teamsyncd timer override by fast-boot (#4233) (3 weeks ago) [Yair Raviv]
* | a3085380 - [lag_keepalive] add `--namespace` option (#4194) (4 weeks ago) [Yair Raviv]
* | abc8bba1 - [teamd_retry_count] Add support for --namespace parameter (#4195) (4 weeks ago) [Yair Raviv]
* | c05d995c - [warm/fast-reboot] check per-ASIC FW upgrade status (#4196) (4 weeks ago) [Yair Raviv]
* | 433d01c1 - [check_db_integrity] Add NETNS environment (#4197) (4 weeks ago) [Yair Raviv]
* | 441595c7 - [centralize_database] Add --namespace option (#4198) (4 weeks ago) [Yair Raviv]
* | 0f3b5291 - [multi-asic][warm-reboot] Support warm-reboot on Multi-ASIC systems (#4199) (4 weeks ago) [Yair Raviv]
* | 28623ca9 - [multi-asic][warm_restart] add Multi-ASIC support for warm_restart commands (#4200) (4 weeks ago) [Yair Raviv]
* | 3cd228af - Add filesystem sync after plugin installation (#4251) (4 weeks ago) [Jianyue Wu]
* | 1d78c210 - Add .github/copilot-instructions.md for AI-assisted development (#4271) (4 weeks ago) [rustiqly]
* | 7895da57 - Fix dump port state CLI command crash on multi-asic platforms (#4229) (4 weeks ago) [Setu Patel]
|/ 
* bcb1d4bb - Clearing /tmp/tmp* is unsafe with parallel builds (#4268) (4 weeks ago) [Brad House - NextHop]
* 8103627e - Fix sonic-utilities submodule update failure due to ijson library (#4256) (4 weeks ago) [venkit-nexthop]
* 85becedc - [Mellanox] Add restricted sysfs to fw control list (#4240) (4 weeks ago) [Noa Or]
* 275bdc6c - Add multi-asic support for sonic-clear queue wredcounters and counter poll , --nonzero support for show queue wredcounters (#4152) (5 weeks ago) [saksarav-nokia]
* fbc85ee4 - Fix j2 files not getting packaged (#4250) (5 weeks ago) [Saikrishna Arcot]
* a9543cba - Fix route_check.py to not hog a lot of memory (#4205) (5 weeks ago) [venkit-nexthop]
* 40260d5b - Fix JsonMove._get_value to Support Both String and Integer List Indices (#4237) (5 weeks ago) [Xincun Li]
* 0a3ef184 - refactor: enhance show bfd summary command (#4242) (5 weeks ago) [Chenyang Wang]
* 7c6dfdc2 - Update the error message for sfputil debug loopback command (#4224) (5 weeks ago) [Ariz Zubair]
* f246da25 - [Fast-linkup] Added CLIs for config/show (#4182) (6 weeks ago) [Yair Raviv]
* 87703c1 - Use Singleton PlatformDataProvider to reduce module import time (#4183) (6 weeks ago) [Hemanth Kumar Tirupati]
* 0dae5f2 - [sfputil] Fix issue: should not do low power mode or reset for non-present ports (#4206) (6 weeks ago) [Junchao-Mellanox]
* 5f56518 - generate_dump: add interface FEC stats (#4093) (6 weeks ago) [Fraser Gordon]
* 2e9e81c - [GCU] Update WRED_PROFILE and BUFFER_POOL validators for GCU (#4219) (6 weeks ago) [Dev Ojha]
* 2350203 - Update bash completions for sonic-utilities commands (#4163) (6 weeks ago) [Saikrishna Arcot]
* 5052e02 - Fix the PSU show command error message on platform without psu at all (#4151) (6 weeks ago) [Yuanzhe]
* 7d9ec5d - Fix issue that namespace is not correctly fetched in Multi ASIC environment for mirror capability checking (#4159) (6 weeks ago) [Stephen Sun]
* f473b4f - Fix multi asic initialization for dump command (#4108) (6 weeks ago) [Gagan Punathil Ellath]
* 0f45e43 - Add current and configured frequency to DOM CLI (#4209) (7 weeks ago) [Ariz Zubair]
* 6f0b181 - Added counterpoll CLI support (#4106) (7 weeks ago) [Dhanasekar Rathinavel]
* 3d5bef9 - [multi-asic][Mellanox] Add multi-ASIC support for generate_dump and update FW upgrade script (#4192) (7 weeks ago) [Oleksandr Ivantsiv]
* 8451f01 - sonic-utilities: Support for clearing aggregate VOQ counters(#2001) (#4044) (8 weeks ago) [manish1-arista]
* 21f013f - Add q3d SKUs to gcu_field_operation_validators.conf.json (#4201) (8 weeks ago) [HP]
* 1a15091 - Fix multi asic connection creation (#4109) (8 weeks ago) [Gagan Punathil Ellath]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants