Quickstart Guide¶
Introduction¶
Welcome to the CNF Reference Architecture quickstart guide. This guide provides quick guidance for running a sample containerized networking application in a Kubernetes cluster comprised of a single AArch64 machine.
This reference solution targets networking software developers and performance analysis engineers who have in-depth Kubernetes and networking knowledge but are not necessarily familiar with the AArch64 architecture.
Familiarity with certain open source projects, e.g., Ansible, Kubernetes, and DPDK, will make this guide and the reference solution easier to understand.
By following the steps in this quickstart guide to the end, you will set up a single-node Kubernetes cluster. The Kubernetes controller and application Pods are deployed on a single AArch64 machine. The DPDK testpmd sample application is deployed in the Application Pod. The Pod has one interface for the K8s network and one interface for the VF/PF/ENI connected to the network requiring packet processing. The testpmd application receives a packet on a port, swaps the source and destination MAC addresses, IP addresses, and ports, and forwards the packet out the same port. The single-node Kubernetes cluster topology is shown below.
The topology diagram above illustrates the major components of the deployment and their relationship.
DUT (Device Under Test) is the only AArch64 machine in the single-node Kubernetes cluster. The Kubernetes controller and Application Pods run on this machine. It can be a physical machine or an AWS EC2 Graviton 2/3 instance.
DPDK testpmd application implements the 5-tuple swap networking function in software and forwards packets out the same port on which they were received.
TG (Traffic Generator) generates and sends packets to the AArch64 machine’s NIC card via the Ethernet cable. It can be a hardware TG, e.g., an IXIA chassis, or a software TG running on a regular server, e.g., TRex, DPDK Pktgen, or Scapy.
Management Node can be any bare-metal machine, VM, or container. It is used to download the project source code, log in to the DUT, create the Kubernetes cluster, and deploy the application.
Infrastructure Setup¶
This guide can be run on physical hardware or on AWS EC2 cloud instances.
This guide requires the following setup:
Physical Hardware Setup¶
The DUT is an AArch64 architecture machine and the only node in the Kubernetes cluster. A NIC card is plugged into its PCIe slot to connect to the traffic generator via an Ethernet cable.
Hardware Minimum Requirements
The DUT has the following hardware requirements:
AArch64 v8 CPU
Minimum 1GHz and 5 CPU cores
DPDK compatible NIC
Connection to the internet to download and install packages
Minimum 8G of RAM
At least one 1G hugepage is available. To easily allocate one, add the relevant Linux command line parameters as described in the FAQ (a sketch is shown below this list)
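As an illustration only (the FAQ is authoritative), allocating a single 1G hugepage at boot typically means appending kernel command line parameters such as:
default_hugepagesz=1G hugepagesz=1G hugepages=1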
Software Minimum Requirements
The following items are expected of the DUT’s software environment:
DUT is running Ubuntu 20.04 (Focal)
Admin (root) privileges are required to set up the DUT
The Fully Qualified Domain Name (FQDN) of the DUT can be checked with the python3 -c 'import socket; print(socket.getfqdn())' command. See Troubleshooting if the proper FQDN is not shown.
The PCIe address of the NIC port attached to the traffic generator is confirmed with sudo lshw -C network -businfo (illustrated below this list)
CPU cores are isolated via the isolcpus, nohz_full, rcu_nocbs, cpuidle.off, cpufreq.off Linux command line parameters. See the FAQ for more details.
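The PCIe addresses appear in the Bus info column of the lshw output. A hypothetical example (device names and NIC models will differ on your system):
$ sudo lshw -C network -businfo
Bus info          Device      Class    Description
==================================================
pci@0000:06:00.0  enP6p1s0f0  network  MT27800 Family [ConnectX-5]
pci@0000:06:00.1  enP6p1s0f1  network  MT27800 Family [ConnectX-5]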
Management node can be any bare-metal machine, VM, or container. The management node is used to download the repository, access the DUT via ssh, and configure the Kubernetes cluster by executing an Ansible playbook. The Ansible playbook is executed locally on the management node and configures the DUT via ssh.
Software Minimum Requirements
Can execute Ansible
Can ssh into the DUT using SSH keys. See FAQ for more details.
Admin (root) or sudo privileges are required
TG can be any traffic generator capable of generating IP packets. The TG must be connected to the DUT.
AWS EC2 Setup¶
DUT is an EC2 instance meeting the following requirements:
EC2 Requirements
c6gn.2xlarge or c7gn.2xlarge (or larger) instance
Secondary ENI attached, with the node.k8s.amazonaws.com/no_manage: true tag applied (a tagging sketch follows this list). The ENI’s PCIe address is known; it should be 0000:00:06.0
Connection to the internet to download and install packages
EC2 instance is associated with the required AWS VPC CNI IAM policies, in addition to the AmazonEC2ContainerRegistryReadOnly policy.
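As a sketch, the no_manage tag can be applied with the aws CLI (the ENI ID below is hypothetical; the EC2 console works equally well):
$ aws ec2 create-tags --resources eni-0123456789abcdef0 \
    --tags Key=node.k8s.amazonaws.com/no_manage,Value=true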
Software Requirements
Amazon Linux 2 AMI
aws CLI installed, with permission to call describe-instance-types (a quick check is sketched after this list)
CPU cores are isolated via the isolcpus Linux command line parameter. See the FAQ for more details
At least one 1G hugepage is available. To easily allocate one, add the relevant Linux command line parameters as described in the FAQ
SSH access enabled via SSH keypair
Admin (root) or sudo privileges
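A quick sketch of confirming the aws CLI is installed with the needed permission (assumes credentials and region are already configured; c6gn.2xlarge has 8 vCPUs):
$ aws ec2 describe-instance-types --instance-types c6gn.2xlarge \
    --query 'InstanceTypes[0].VCpuInfo.DefaultVCpus'
8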
Management node can be any bare-metal machine, VM, or container. The management node is used to download the repository, access the DUT via ssh, and configure the Kubernetes cluster by executing an Ansible playbook. The Ansible playbook is executed locally on the management node and configures the DUT via ssh.
Software Minimum Requirements
Can execute Ansible
Can ssh into the DUT using SSH keys. See FAQ for more details.
Admin (root) or sudo privileges are required
TG can be any traffic generator capable of generating IP packets. For EC2 deployment, this is typically another EC2 instance in the same VPC running a software-based traffic generator, such as Pktgen DPDK.
Tested Platforms¶
The steps described in this quickstart guide have been validated on the following platforms.
Physical Hardware¶
DUT¶
Ampere Altra (Neoverse-N1)
Ubuntu 20.04.3 LTS (Focal Fossa)
NIC¶
Mellanox ConnectX-5
OFED driver: MLNX_OFED_LINUX-5.4-3.1.0.0
Firmware version: 16.30.1004 (MT_0000000013)
Intel X710
Firmware version: 6.01
Note
To use a Mellanox NIC, install the OFED driver, then update and configure the NIC firmware by following the guidance in the FAQ. A quick driver check is sketched below.
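One way to confirm the OFED driver installation (a sketch; ofed_info ships with MLNX_OFED, output shown for the version tested here):
$ ofed_info -s
MLNX_OFED_LINUX-5.4-3.1.0.0: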
Management Node¶
Ubuntu 20.04 system
Python 3.8
Ansible 6.5.0
AWS EC2 Instance¶
DUT¶
c6gn.2xlarge instance
Amazon Linux 2
Kernel 5.10.184-175.731.amzn2.aarch64
Security group settings
Security group settings added per the Kubernetes Ports and Protocols guidelines
Additionally, to allow traffic between the x86 and Arm EC2 instances, permit all ports and protocols in the inbound rules (a sketch follows this list)
Secondary ENI attached at device index 1, with node.k8s.amazonaws.com/no_manage set to true
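As a sketch, an allow-all inbound rule from the TG instance’s security group can be added with the aws CLI (group IDs below are hypothetical; the EC2 console works equally well):
$ aws ec2 authorize-security-group-ingress --group-id sg-0aaaabbbbccccdddd \
    --protocol all --source-group sg-0eeeeffff00001111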
Management Node¶
Ubuntu 20.04 system
Python 3.8
Ansible 6.5.0
Prerequisite¶
Management Node¶
The management node needs several dependencies installed, e.g., git, curl, python3.8, pip, Ansible, and repo. Follow the guidelines below for Ubuntu 20.04.
Make sure sudo is available and install git, curl, python3.8, python3-pip, python-is-python3 by executing:
$ sudo apt-get update
$ sudo apt-get install git curl python3.8 -y
$ sudo apt-get install python3-pip python-is-python3 -y
Install ansible and netaddr by executing:
$ sudo python3 -m pip install ansible==6.5.0 netaddr
Note
Install the ansible package and not the ansible-core package, as this solution makes use of community packages not included in the ansible-core Python package.
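One way to confirm the expected package is installed:
$ python3 -m pip show ansible | head -n 2
Name: ansible
Version: 6.5.0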
Configure git with your name and email address:
$ git config --global user.email "[email protected]"
$ git config --global user.name "Your Name"
Follow the instructions provided in git-repo to install the repo tool manually.
Follow the FAQ to set up SSH keys on the management node (a minimal sketch follows). For EC2 deployment, use the SSH keypair assigned to the instance.
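A minimal sketch of the usual SSH key flow for a physical DUT (user1 and dut.arm.com are this guide’s example values; the FAQ is authoritative):
$ ssh-keygen -t ed25519
$ ssh-copy-id [email protected]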
DUT¶
Complete the steps below by following the suggestions provided.
Download Source Code¶
Unless mentioned specifically, all operations henceforth are executed on the management node.
Create a new folder that will be the workspace, henceforth referred to as <nw_cra_workspace> in these instructions:
mkdir <nw_cra_workspace>
cd <nw_cra_workspace>
export NW_CRA_RELEASE=refs/tags/NW-CRA-2024.03.29
Note
Sometimes new features and additional bug fixes are made available in
the git repositories, but are not tagged yet as part of a release.
To pick up these latest changes, remove the
-b <release tag>
option from the repo init
command below.
However, please be aware that such untagged changes may not be formally
verified and should be considered unstable until they are tagged in an
official release.
To clone the repository, run the following commands:
repo init \
-u https://git.gitlab.arm.com/arm-reference-solutions/arm-reference-solutions-manifest.git \
-b ${NW_CRA_RELEASE} \
-m cnf-reference-arch.xml
repo sync
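Once repo sync completes, the workspace should contain the cnf-reference-arch directory used throughout the remaining steps (a sketch; exact contents depend on the manifest):
$ ls
cnf-reference-arch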
Create Single Node Cluster¶
Create Ansible Inventory File¶
The Ansible playbooks in this repository are easiest to use with inventory files to keep track of the cluster nodes. For this solution we need one inventory file.
A template inventory.ini is provided at <nw_cra_workspace>/cnf-reference-arch/inventory.ini with the following contents:
[controller]
<fqdn_or_ec2_ip> ansible_user=<remote_user> ansible_private_key_file=<key_location>
; replace above line with DUT FQDN & optionally ansible_user and ansible_private_key_file.
; If an optional variable is not used, delete the entire key=<placeholder>.
[worker]
<fqdn_or_ec2_ip> ansible_user=<remote_user> pcie_addr=<pcie_addr_from_lshw> dpdk_driver=<driver_name> ansible_private_key_file=<key_location>
; replace above line with DUT FQDN, PCIe address, DPDK linux driver & optionally ansible_user and ansible_private_key_file.
; If an optional variable is not used, delete the entire key=<placeholder>.
Filling in the inventory file differs between a physical hardware setup and an AWS EC2 setup.
Physical Hardware Inventory File¶
Replace <fqdn_or_ec2_ip>
with the FQDN of the DUT. The same FQDN should be used for both [controller]
and [worker]
as this is a single-node setup.
<remote_user> specifies the user name used to log in to the DUT.
Replace <pcie_addr_from_lshw>
with the PCIe address of the port on the DUT connected to the traffic generator.
If the DUT uses a Mellanox ConnectX-5 NIC to connect to the traffic generator, replace <driver_name> with mlx5_core. Otherwise, replace it with vfio-pci.
ansible_private_key_file should be set to the identity file used to connect to each instance, if other than the default key used by ssh.
As an example, if the user name used to access the DUT is user1, the DUT FQDN is dut.arm.com
and is connected to the traffic generator on PCIe address 0000:06:00.1
with a NIC compatible with the vfio-pci
driver, then inventory.ini
would contain:
[controller]
dut.arm.com ansible_user=user1
[worker]
dut.arm.com ansible_user=user1 pcie_addr=0000:06:00.1 dpdk_driver=vfio-pci
AWS EC2 Inventory File¶
Replace <fqdn_or_ec2_ip>
with the primary IP address of the DUT. The same IP address should be used for both [controller]
and [worker]
as this is a single-node setup.
<remote_user> specifies the user name used to log in to the DUT.
Replace <pcie_addr_from_lshw> with the PCIe address of the secondary ENI with the node.k8s.amazonaws.com/no_manage: true tag. This is typically 0000:00:06.0.
Replace <driver_name> with igb_uio.
ansible_private_key_file
should be set to the SSH key pair generated at instance creation.
As an example, if the user name to access the DUT is ec2-user
, the DUT IP is 10.100.100.100
, the private key file is ssh-key-pair.pem
, and the secondary ENI PCIe address is 0000:00:06.0
, then inventory.ini
would contain:
[controller]
10.100.100.100 ansible_user=ec2-user ansible_private_key_file=ssh-key-pair.pem
[worker]
10.100.100.100 ansible_user=ec2-user pcie_addr=0000:00:06.0 dpdk_driver=igb_uio ansible_private_key_file=ssh-key-pair.pem
Execute the Playbook¶
Physical Hardware¶
To setup the Kubernetes cluster on physical hardware, run:
$ ansible-playbook -i inventory.ini create-cluster.yaml -K
It will start by asking for the sudo password of the remote user on the DUT (the prompt may say BECOME password instead). If the remote user has passwordless sudo on the DUT, the -K flag can be omitted (a sudoers sketch follows).
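If passwordless sudo is desired, a typical sudoers entry (added on the DUT with sudo visudo; user1 is this guide’s example user) looks like:
user1 ALL=(ALL) NOPASSWD: ALL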
See the user guide for the full list of actions this playbook will take.
EC2 Instance¶
First, note the AWS region the EC2 instance is deployed in. Next, use this table to obtain the correct AWS Elastic Container Registry (ECR) URL for the AWS region.
To setup the Kubernetes cluster on an EC2 instance, substitute the corresponding values for aws_region
and ecr_registry_url
and run:
$ ansible-playbook -i inventory.ini create-cluster.yaml -e '{aws_inst: true, deploy_on_vfs: false, aws_region: us-west-2, ecr_registry_url: 602401143452.dkr.ecr.us-west-2.amazonaws.com}'
See the user guide for the full list of actions this playbook will take.
Validate the Cluster¶
At this point in time, the setup should look like the Single-node Kubernetes cluster topology at the beginning of this document. The DUT should be in a Kubernetes cluster and running a private docker registry.
To verify, ssh into the DUT and run kubectl get nodes.
The output should look like:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
dut.arm.com Ready control-plane 24m v1.25.0
Also run kubectl describe node $(hostname) | grep -A 5 ^Allocatable:
to ensure allocatable CPUs and 1G hugepages are correct.
The output should look like:
$ kubectl describe node $(hostname) | grep -A 5 ^Allocatable:
Allocatable:
arm.com/dpdk: 2
cpu: 4
ephemeral-storage: 189217404206
hugepages-1Gi: 1Gi
Finally, verify the local docker registry is running with docker ps -f name=registry. The output should look like:
$ docker ps -f name=registry
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
53656144b298 registry:2 "/entrypoint.sh /etc…" 46 hours ago Up 33 minutes 0.0.0.0:443->443/tcp, 5000/tcp registry
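As an additional sketch of a check, the registry’s v2 HTTP API can be queried from the DUT; it should return a JSON list of repositories (-k skips certificate verification in case of a self-signed certificate):
$ curl -k https://localhost/v2/_catalog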
Run the Sample Application¶
Run¶
Now, it is time to apply the dpdk-testpmd.yaml
Ansible playbook. To do so,
run the following commands on the management node:
$ cd <nw_cra_workspace>/cnf-reference-arch/examples/dpdk-testpmd
$ ansible-playbook -i ../../inventory.ini dpdk-testpmd.yaml
For an EC2 instance, run the following commands:
$ cd <nw_cra_workspace>/cnf-reference-arch/examples/dpdk-testpmd
$ ansible-playbook -i ../../inventory.ini dpdk-testpmd.yaml -e '{aws_inst: true, deploy_on_vfs: false}'
See the dpdk-testpmd user guide for the full list of actions this playbook will take.
Once the playbook finishes, ssh into the DUT and deploy the DPDK testpmd application with the dpdk-deployment.yaml file, which has been copied to the DUT home directory:
$ cd $HOME
$ kubectl apply -f dpdk-deployment.yaml
Check the deployment status with:
$ kubectl get deploy
NAME READY UP-TO-DATE AVAILABLE AGE
dpdk-testpmd 1/1 1 1 2m31s
Test¶
Monitor the application pod status by running kubectl get pods
on the DUT.
It may take some time to start up.
kubectl get pods
should show something like:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
dpdk-testpmd-fbb6cd468-d7x8g 1/1 Running 0 31s
Once the pod is in the “Running” state, view its logs with kubectl logs <podname>.
The logs should contain something similar to:
$ kubectl logs dpdk-testpmd-fbb6cd468-d7x8g
...
+ ./build/app/dpdk-testpmd --lcores 1@9,2@10 -a 0000:07:02.0 -- --forward-mode=5tswap --port-topology=loop --auto-start
...
Set 5tswap packet forwarding mode
Auto-start selected
Configuring Port 0 (socket 0)
Port 0: link state change event
Port 0: link state change event
Port 0: CA:7D:57:CB:B0:5F
Checking link statuses...
These logs show port 0 has MAC address CA:7D:57:CB:B0:5F
with PCIe address
0000:07:02.0
on the DUT.
Configure the traffic generator to send packets to the NIC port, using the MAC address shown above as the destination MAC. If deploying on AWS EC2 instances, also ensure the destination IP matches the primary IP of the dataplane ENI.
This example uses a destination MAC address of CA:7D:57:CB:B0:5F and a destination IP of 198.18.0.21.
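As an illustration, a software TG such as Scapy could send matching packets with a one-liner like the following (eth1 is a hypothetical TG-side interface; the MAC and IP match this example):
$ sudo python3 -c 'from scapy.all import Ether, IP, UDP, Raw, sendp; \
sendp(Ether(dst="CA:7D:57:CB:B0:5F")/IP(dst="198.18.0.21")/UDP(sport=1024, dport=1024)/Raw(b"\x00"*64), iface="eth1", count=100)'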
Then, dpdk-testpmd
will forward those packets out on port 0 after swapping the MAC, IP and port(s).
In this example, the packets transmitted by dpdk-testpmd
will have the source MAC set to CA:7D:57:CB:B0:5F
and the source IP will be 198.18.0.21
.
The destination MAC and IP will be set to the source MAC and IP of the packets transmitted by the traffic generator.
Stop¶
The pods can be stopped by deleting the deployment: run kubectl delete deploy dpdk-testpmd on the DUT.
Then, clean up the Kubernetes cluster by executing sudo kubeadm reset -f
and sudo rm -rf /etc/cni/net.d
on the DUT.