..
  # Copyright (c) 2024, Arm Limited.
  #
  # SPDX-License-Identifier: Apache-2.0

###############
Troubleshooting
###############

This page describes common issues and steps to resolve them.

.. _python-fqdn:

#. ``python3 -c 'import socket; print(socket.getfqdn())'`` does not show the FQDN:

   If the above Python command fails to print the correct FQDN, your system
   may be affected by `this python bug `__. The solution is to hardcode the
   correct FQDN in ``/etc/hosts``. To do so, follow the steps in the
   `Debian manual `__.

#. ``Create VFs for `` task fails with ``echo 2 > /sys/bus/pci/devices//sriov_numvfs ... echo: I/O error``:

   CRA is most easily deployed with exclusive control over a PF. Ensure that
   no other applications, such as DPDK or QEMU, are using the PF. Additionally,
   ensure that the PF is bound to its default kernel driver, such as ``i40e``
   or ``mlx5_core``.

   1. Check whether the NIC port is used by other applications, such as QEMU
      or DPDK, and stop them.
   2. Check whether the NIC is bound to the ``vfio-pci`` driver. This can be
      done with ``/usr/local/bin/dpdk-devbind.py -s``.
   3. Bind the NIC to its default driver. Taking ``i40e`` as an example, bind
      the NIC port to ``i40e`` with
      ``/usr/local/bin/dpdk-devbind.py -b i40e ``.
   4. Re-deploy the CRA solution.

   CRA can also be deployed onto specific VFs on a machine. This can be used
   to share the underlying PFs with other applications, such as DPDK or QEMU.
   To use this deployment model, specify the VF PCIe addresses in the
   ``pcie_addr`` for the worker node. See the user guide for more information.

#. Tasks fail with the error ``Timeout (12s) waiting for privilege escalation prompt``:

   This error happens when the latency to execute tasks is high. Append
   ``-T 120`` to the ``ansible-playbook`` command to increase the timeout to
   120 seconds.

#. Playbook fails due to PCIe address having active routes:

   .. code-block:: none

      TASK [bind_pcie_addrs : Fail if any PCIe address has active routes] ******************************************************
      fatal: [worker-node]: FAILED! => {"changed": false, "msg": "At least one PCIe address has an active route. Binding this to igb_uio may cause a loss in connectivity."}

   This error happens when a dataplane interface (PF/VF/ENI) on a worker node
   has an active route. In this step, the playbook rebinds the dataplane
   interface to a different driver, which would remove any routes that use it.
   Since this may result in a loss of connectivity, the solution does not
   remove the routes automatically. To use that dataplane interface, remove
   its routes and re-run the playbook.
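
To find out which routes trigger the active-route failure described above, the check can be reproduced by hand on the worker node. The following Python sketch is illustrative only and is not part of the CRA playbooks. The helper names ``interfaces_for_pcie_addr`` and ``routes_for_interface`` and the example address ``0000:00:00.0`` are hypothetical. It assumes a Linux host with the iproute2 ``ip`` command installed, and a device that is still bound to a kernel network driver so that its sysfs ``net`` directory exists.

.. code-block:: python

   import subprocess
   from pathlib import Path


   def interfaces_for_pcie_addr(pcie_addr, sysfs_root="/sys/bus/pci/devices"):
       """Return the kernel network interface names backed by a PCIe address."""
       net_dir = Path(sysfs_root) / pcie_addr / "net"
       # The "net" directory is absent when no kernel network driver is bound.
       if not net_dir.is_dir():
           return []
       return sorted(entry.name for entry in net_dir.iterdir())


   def routes_for_interface(iface):
       """Return the routes that use the given interface, one string per route."""
       result = subprocess.run(
           ["ip", "route", "show", "dev", iface],
           capture_output=True,
           text=True,
           check=False,
       )
       return [line.strip() for line in result.stdout.splitlines() if line.strip()]


   if __name__ == "__main__":
       pcie_addr = "0000:00:00.0"  # hypothetical address; use your own pcie_addr
       for iface in interfaces_for_pcie_addr(pcie_addr):
           for route in routes_for_interface(iface):
               print(f"{iface}: {route}")

Any route printed by this sketch would need to be removed, for example with ``ip route del``, before re-running the playbook. Confirm first that connectivity to the worker node does not depend on that route.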