Chronyd service randomly fails to start during boot

Affected Products:

  • All Azure images configured to use the /dev/ptp_hyperv device in the /etc/chrony.conf configuration file

Opened: 2022-11-09

Severity: 2-Minor

Symptoms:
The chronyd service can intermittently fail to start during boot. This happens only when /etc/chrony.conf is configured to use the /dev/ptp_hyperv device:

[azureuser@test ~]$ cat /etc/chrony.conf | grep ptp_hyperv
refclock PHC /dev/ptp_hyperv poll 3 dpoll -2 offset 0 stratum 2

During boot, several processes are started concurrently. Occasionally the /dev/ptp_hyperv device takes longer than usual to become available, so the chronyd service is started before the device has been created. We have noticed that this usually happens on the first boot following a kernel upgrade, and might not happen during subsequent boots.

[azureuser@test ~]$ systemctl status chronyd
● chronyd.service - NTP client/server
Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Tue 2022-11-08 21:07:06 UTC; 59s ago
Docs: man:chronyd(8)
man:chrony.conf(5)
Process: 752 ExecStopPost=/usr/libexec/chrony-helper remove-daemon-state (code=exited, status=0/SUCCESS)
Process: 734 ExecStart=/usr/sbin/chronyd $OPTIONS (code=exited, status=1/FAILURE)

Nov 08 21:07:05 testvm-1 systemd[1]: Starting NTP client/server...
Nov 08 21:07:06 testvm-1 chronyd[750]: chronyd version 4.2 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +SCFILTER +SIGND +ASYNCDNS +NTS +SECHASH +IPV6 +DEBUG)
Nov 08 21:07:06 testvm-1 chronyd[750]: Could not open /dev/ptp_hyperv : No such file or directory
Nov 08 21:07:06 testvm-1 chronyd[750]: Fatal error : Could not open PHC
Nov 08 21:07:06 testvm-1 chronyd[734]: Could not open PHC
Nov 08 21:07:06 testvm-1 systemd[1]: chronyd.service: Control process exited, code=exited status=1
Nov 08 21:07:06 testvm-1 systemd[1]: chronyd.service: Failed with result 'exit-code'.
Nov 08 21:07:06 testvm-1 systemd[1]: Failed to start NTP client/server.
[root@testvm-1 ~]#
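
To confirm that a failed start matches this issue, the check can be scripted. The following is only a minimal sketch and not part of the original procedure: the function name phc_device_status is hypothetical, and it assumes the device is configured on an uncommented "refclock PHC" line like the one shown earlier.

```shell
#!/bin/sh
# Sketch: report whether the PHC device named in chrony.conf exists.
# phc_device_status is a hypothetical helper name, not a chrony tool.

phc_device_status() {
    conf="${1:-/etc/chrony.conf}"
    [ -r "$conf" ] || { echo "cannot read $conf" >&2; return 2; }

    # Take the device path from the first uncommented "refclock PHC <device>" line.
    dev=$(awk '$1 == "refclock" && $2 == "PHC" { print $3; exit }' "$conf")
    if [ -z "$dev" ]; then
        echo "no PHC refclock configured"
        return 0
    fi

    # chrony accepts driver options after the path, separated by colons; strip them.
    dev=${dev%%:*}

    if [ -e "$dev" ]; then
        echo "present: $dev"
        return 0
    fi
    echo "missing: $dev"
    return 1
}
```

Called with no argument, phc_device_status reads /etc/chrony.conf. With the configuration shown above, it prints "missing: /dev/ptp_hyperv" and returns 1 while the device has not been created yet.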

Solution:
After the virtual machine has booted successfully, log in and restart the chronyd service. By this time, the /dev/ptp_hyperv device should be available, and the chronyd service should start without any problem:

[azureuser@test ~]$ sudo systemctl stop chronyd
[azureuser@test ~]$
[azureuser@test ~]$ sudo systemctl start chronyd
[azureuser@test ~]$
[azureuser@test ~]$ sudo systemctl status chronyd
● chronyd.service - NTP client/server
Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2022-11-08 20:31:45 UTC; 1min ago
Docs: man:chronyd(8)
man:chrony.conf(5)
[…]

To make sure this issue does not happen in the future, you can configure the chronyd service to wait for the /dev/ptp_hyperv device file to become available before starting, as described in the steps below:

  1. Using sudo and your favorite editor, edit the /etc/udev/rules.d/99-azure-hyperv-ptp.rules file and add the following two lines at the end:

# tag the ptp subsystem so that chronyd.service can wait for it
ACTION=="add", SUBSYSTEM=="ptp", TAG+="systemd"

After the edit, the file should look like this:

[azureuser@test ~]$ sudo cat /etc/udev/rules.d/99-azure-hyperv-ptp.rules
# Mellanox VFs also produce a /dev/ptp device. To avoid the conflict,
# we will rename the hyperv ptp interface "ptp_hyperv"
SUBSYSTEM=="ptp", ATTR{clock_name}=="hyperv", SYMLINK += "ptp_hyperv"

# tag the ptp subsystem so that chronyd.service can wait for it
ACTION=="add", SUBSYSTEM=="ptp", TAG+="systemd"

NOTE: If the /etc/udev/rules.d/99-azure-hyperv-ptp.rules file does not exist at all, the issue described in this article most probably does not affect your virtual machine.

  2. Create the /etc/systemd/system/chronyd.service.d folder:

[azureuser@test ~]$ sudo mkdir -p /etc/systemd/system/chronyd.service.d

  3. Using sudo and your favorite editor, create the /etc/systemd/system/chronyd.service.d/local.conf file with the following content:

# wait for the /dev/ptp_hyperv device to become available
[Unit]
Requires=dev-ptp_hyperv.device
After=dev-ptp_hyperv.device

  4. Reload the systemd manager configuration and restart the chronyd service:

[azureuser@test ~]$ sudo systemctl daemon-reload
[azureuser@test ~]$
[azureuser@test ~]$ sudo systemctl stop chronyd
[azureuser@test ~]$
[azureuser@test ~]$ sudo systemctl start chronyd
[azureuser@test ~]$
[azureuser@test ~]$ sudo systemctl status chronyd
● chronyd.service - NTP client/server
Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2022-11-08 21:09:52 UTC; 1min ago
Docs: man:chronyd(8)
man:chrony.conf(5)
[…]

  5. To verify the fix, you can reboot the virtual machine once.
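
The configuration steps above (udev tag, drop-in folder, drop-in file, reload and restart) can also be applied with a small script. This is a hedged sketch rather than an official tool: the function names ensure_udev_tag and write_dropin and the "apply" argument are illustrative assumptions, while the file paths and file contents are the ones given in the steps.

```shell
#!/bin/sh
# Sketch: apply the udev tag and the chronyd drop-in idempotently.
# ensure_udev_tag, write_dropin and the "apply" argument are hypothetical names.

RULES_FILE=/etc/udev/rules.d/99-azure-hyperv-ptp.rules
DROPIN_DIR=/etc/systemd/system/chronyd.service.d

# Append the systemd tag rule unless the rules file already contains it.
ensure_udev_tag() {
    rules_path="$1"
    # Per the NOTE above: without this file the VM is most probably not affected.
    [ -f "$rules_path" ] || { echo "skip: $rules_path not found" >&2; return 1; }
    if ! grep -qF 'ACTION=="add", SUBSYSTEM=="ptp", TAG+="systemd"' "$rules_path"; then
        printf '\n%s\n%s\n' \
            '# tag the ptp subsystem so that chronyd.service can wait for it' \
            'ACTION=="add", SUBSYSTEM=="ptp", TAG+="systemd"' >> "$rules_path"
    fi
}

# Create the drop-in folder and write local.conf with the content from the steps.
write_dropin() {
    dropin_dir="$1"
    mkdir -p "$dropin_dir"
    cat > "$dropin_dir/local.conf" <<'EOF'
# wait for the /dev/ptp_hyperv device to become available
[Unit]
Requires=dev-ptp_hyperv.device
After=dev-ptp_hyperv.device
EOF
}

# Touch the real system only when called with "apply" (run as root).
if [ "${1:-}" = "apply" ]; then
    set -eu
    ensure_udev_tag "$RULES_FILE"
    write_dropin "$DROPIN_DIR"
    systemctl daemon-reload
    systemctl stop chronyd
    systemctl start chronyd
    systemctl status chronyd --no-pager
fi
```

Saved under a name of your choice and run with sudo and the "apply" argument, the script performs the same changes as the manual steps. Running it a second time does not append the udev rule again.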

If you still face the same issue, please contact ProComputers Support as instructed in this article.

References: