Below you’ll find a compilation of notes and errors encountered during the installation of the Juniper vMX router in KVM. The vMX image provides a nice way to have a local lab, and it is fairly manageable with the vmx script, virsh and maybe a script of your own ;-) It is also a solution aimed at service providers who want network function virtualization (NFV), so the router is designed to sustain high throughput.
First of all, you must be aware that the vMX router is actually made of two parts:
- vRE / vCP : virtual routing engine, the control plane
- vPFE / vFP : virtual packet forwarding engine, the data plane
As this virtual router is meant to be deployed in production environments, it offers several installation modes that improve performance, for example by giving the VM direct access to the network card (SR-IOV). In our case we will stay with the standard installation and lite mode. The lite mode can be configured inside the VM directly.
Preparing the system
Ok, so you basically need KVM and libvirt installed.
Be sure to load the drivers for nested virtualisation (https://wiki.ubuntu.com/kvm). Before a production installation, you can also go through this documentation: https://www.juniper.net/documentation/en_US/vmx/topics/topic-map/vmx-installing-on-kvm.html#id-preparing-the-ubuntu-host-to-install-vmx
$ sudo modprobe kvm-intel
If you run into the following error, it means that your Intel-VT / AMD-V virtualization options are disabled in the BIOS or not supported at all by your CPU.
FATAL: Error inserting kvm_intel (/lib/modules/2.6.20-15-generic/kernel/drivers/kvm/kvm-intel.ko): Operation not supported
Running dmesg may show the cause at the end of its output.
Then you can retry. To make these settings permanent, adjust the following configuration files:
screw@kvmhost:~/vmx$ sudo vim /etc/sysctl.conf
screw@kvmhost:~/vmx$ sudo vim /etc/default/qemu-kvm
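Before going into the BIOS, a quick look from the shell can tell whether the CPU advertises the virtualization extensions at all. Here is a minimal sketch, assuming a Linux host that exposes /proc/cpuinfo and the kvm_intel module parameters. The function names count_virt_cpus and nested_enabled are mine and not part of any tool.

```shell
# Count the logical CPUs that advertise hardware virtualization.
# vmx = Intel VT-x, svm = AMD-V. The file path is a parameter so the
# function can also be tried against a saved copy of /proc/cpuinfo.
count_virt_cpus() {
    cpuinfo="${1:-/proc/cpuinfo}"
    grep -E -c '^flags.*[[:space:]](vmx|svm)([[:space:]]|$)' "$cpuinfo"
}

# Succeed when nested virtualisation is switched on in the KVM module.
# The parameter file contains Y or 1 when it is enabled.
nested_enabled() {
    param_file="${1:-/sys/module/kvm_intel/parameters/nested}"
    [ -r "$param_file" ] || return 1
    case "$(cat "$param_file")" in
        Y|1) return 0 ;;
        *)   return 1 ;;
    esac
}
```

On the KVM host, count_virt_cpus should print a number greater than 0. If it prints 0, loading kvm-intel will most likely fail as described above. On an AMD host, pass /sys/module/kvm_amd/parameters/nested to nested_enabled instead.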
Starting the vMX
Create configuration files
The configuration file is a YAML file which can be broken down into a few parts:
The configuration of the host and the links to the qcow2 images:
---
#Configuration
HOST:
    identifier                : R2
    host-management-interface : ens33
    routing-engine-image      : "/home/screw/vmx/images/junos-vmx-x86-64-17.2R1.13.qcow2"
    routing-engine-hdd        : "/home/screw/vmx/images/vmxhdd.img"
    forwarding-engine-image   : "/home/screw/vmx/images/vFPC-20170523.img"
---
The external bridge configuration, which gives external access to the VMs:
#External bridge configuration
BRIDGES:
    - type  : external
      name  : br-ext
---
Then the vRE VM parameters, such as CPUs, RAM, console port, interfaces. This interface will be used to communicate with the vPFE.
#vRE VM parameters
CONTROL_PLANE:
    vcpus       : 1
    memory-mb   : 1024
    console_port: 8601
    interfaces  :
      - type      : static
        ipaddr    : 10.102.144.201
        macaddr   : "0A:00:DD:d8:4f:b8"
---
Then the vPFE VM parameters, with the interface, in the same IP subnet to allow communication between the two. A specific bridge will be created.
#vPFE VM parameters
FORWARDING_PLANE:
    memory-mb   : 2048
    vcpus       : 3
    console_port: 8602    # must differ from the vRE console port
    device-type : virtio
    interfaces  :
      - type      : static
        ipaddr    : 10.102.144.41
        macaddr   : "0A:00:DE:4f:84:23"
---
Finally the interfaces configuration. These are the production interfaces you will use to run routing protocols and other stuff:
---
#Interfaces
JUNOS_DEVICES:
   - interface   : ge-0/0/0
     mac-address : "02:06:0A:7b:84:50"
     description : "ge-0/0/0 interface"
   - interface   : ge-0/0/1
     mac-address : "02:06:0A:4d:ec:ce"
     description : "ge-0/0/1 interface"
Links configuration files
The links configuration file is documented here: https://www.juniper.net/documentation/en_US/vmx14.1/topics/task/configuration/vmx-virtio-devices-binding.html
The file is another YAML file with the following format (an example sits in config/vmx-junosdev.conf) :
interfaces :
     - link_name  : vmx_link1
       mtu        : 1500
       endpoint_1 :
         - type     : junos_dev
           vm_name  : vmx1
           dev_name : ge-0/0/0
       endpoint_2 :
         - type     : bridge_dev
           dev_name : bridge1
     - link_name  : vmx_link2
       mtu        : 1500
       endpoint_1 :
         - type     : junos_dev
           vm_name  : vmx2
           dev_name : ge-0/0/0
       endpoint_2 :
         - type     : bridge_dev
           dev_name : bridge1
     - link_name  : vmx_link3
       endpoint_1 :
         - type     : junos_dev
           vm_name  : vmx1
The following command launches the first instance. Repeat it for each router, with its own configuration file:
sudo bash vmx.sh --install --cfg config/R1.conf
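Since the install command has to be repeated for every router, a small loop saves some typing. This is only a sketch, assuming that each router has its own file named R*.conf in the config directory (as with config/R1.conf above). The install_all function and the DRY_RUN variable are hypothetical helpers of mine, not options of vmx.sh.

```shell
# Run "vmx.sh --install" once per router configuration file.
# Set DRY_RUN=1 to print the commands instead of executing them.
install_all() {
    conf_dir="${1:-config}"
    for cfg in "$conf_dir"/R*.conf; do
        [ -e "$cfg" ] || continue   # the glob matched nothing
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "sudo bash vmx.sh --install --cfg $cfg"
        else
            sudo bash vmx.sh --install --cfg "$cfg"
        fi
    done
}
```

Running it first with DRY_RUN=1 from the vmx directory shows which configuration files would be picked up before anything is started.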
Then you can launch the connection script:
sudo bash vmx.sh --bind-dev --cfg config/links.conf
Errors encountered while starting the vMX
Bash, line not interpreted
Some lines of the script are reported as not interpreted. This is because the shebang specifies sh instead of bash. You can either launch the script with the bash command or replace the shebang with #!/bin/bash in the vmx.sh script.
Generally, the error comes from the privileges of the user that executed the script. Run the script as root so that it can perform the necessary checks on libvirt and hugepages. If the error persists, re-run the script.
Then, if it still fails, you can check that everything is properly configured on your host:
pub@kvmhost:~/vmx$ cat /etc/default/qemu-kvm | grep HUGE
KVM_HUGEPAGES=1
pub@kvmhost:~/vmx$ cat /proc/meminfo | grep Huge
AnonHugePages:         0 kB
HugePages_Total:      44
HugePages_Free:       44
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB
pub@kvmhost:~/vmx$ cat /etc/sysctl.conf | grep -i huge
# Allocate 256 HugePageTables (start with a low number but increas it before using it
vm.nr_hugepages = 8192
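To compare what the host reserves with the memory-mb values of your configuration files, it helps to turn the meminfo counters into a single figure. Here is a small sketch that multiplies HugePages_Total by Hugepagesize. The function name hugepages_total_mib is mine, and the file argument only exists so that it can be tried on a saved copy of /proc/meminfo.

```shell
# Print the total memory reserved as huge pages, in MiB, from a
# /proc/meminfo style file (HugePages_Total x Hugepagesize in kB).
hugepages_total_mib() {
    meminfo="${1:-/proc/meminfo}"
    awk '/^HugePages_Total:/ { n = $2 }
         /^Hugepagesize:/    { kb = $2 }
         END { print n * kb / 1024 }' "$meminfo"
}
```

With the values shown above (44 pages of 1048576 kB) it prints 45056, that is 44 GiB reserved for huge pages.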
network ‘br-ext’ already exists
If this happens, it means that a br-ext bridge is already registered but not active. It may be the result of an unsuccessful attempt to run the script or of a previous setup. You can clean it up by running the following virsh commands:
pub@kvmhost:~/vmx$ virsh
virsh # net-list
 Name                 State      Autostart     Persistent
----------------------------------------------------------
 default              active     yes           yes

virsh # net-list --all
 Name                 State      Autostart     Persistent
----------------------------------------------------------
 br-ext               inactive   no            yes
 default              active     yes           yes

virsh # net-undefine br-ext
Network br-ext has been undefined

virsh # net-list --all
 Name                 State      Autostart     Persistent
----------------------------------------------------------
 default              active     yes           yes
If the network is active and you want to achieve the same result, you need to destroy it first with:
virsh # net-destroy br-ext
Libvirt file creation issue
If you encounter the following error, install the missing Python module (netifaces in this case):
File "/home/nugraha/Documents/vmx/scripts/common/vmx_configure.py", line 9, in <module>
    import netifaces as ni
ImportError: No module named netifaces
panic: CPU0 does not support X87 or SSE: 1
You have to configure the cpu-mode as host-passthrough in the vRE xml.
vim build/vmx1/xml/vRE-generated.xml
<cpu mode="host-passthrough">
Failed to start domain
If you run into this error, check that your IPs, MACs and console ports are unique across your configuration files.
error : internal error: early end of file from monitor, possible problem: device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -msg timestamp=on
2018-05-21T17:24:21.734512Z qemu-system-x86_64: -chardev socket,id=charserial0,host=127.0.0.1,port=8600,telnet,server,nowait: Failed to bind socket: Address already in use
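In the log above, the culprit is a console port that is already in use. Hunting for such collisions by eye gets tedious with several routers, so here is a rough sketch that lists the values used more than once for a given key. It assumes the simple "key : value" layout of the configuration files shown earlier, with no trailing comments on those lines. The function name find_duplicates is mine.

```shell
# Print every value of a given key that appears more than once across
# the configuration files passed as arguments.
find_duplicates() {
    key="$1"; shift
    # Keep the "key : value" lines, strip the key, trailing blanks and
    # quotes, lowercase hex digits so MAC addresses compare equal, then
    # keep only the repeated values.
    grep -h -E "^[[:space:]]*(-[[:space:]]*)?$key[[:space:]]*:" "$@" \
        | sed -E 's/^[^:]*:[[:space:]]*//; s/[[:space:]]+$//; s/"//g' \
        | tr 'A-F' 'a-f' \
        | sort | uniq -d
}
```

For example, find_duplicates console_port config/R*.conf prints the console ports that are declared twice, and the same call works with ipaddr, macaddr and mac-address. An empty output means that no value is repeated for that key.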
Deleted default bridge
If the default bridge is missing or you have deleted it accidentally, recreate it from this configuration found on GitHub:
<network>
  <name>default</name>
  <bridge name="virbr0" />
  <forward/>
  <ip address="192.168.122.1" netmask="255.255.255.0">
    <dhcp>
      <range start="192.168.122.2" end="192.168.122.254" />
    </dhcp>
  </ip>
</network>
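Once the XML above is saved to a file, the network still has to be registered in libvirt. The following is a possible sketch using the virsh net-define, net-start and net-autostart subcommands. The file name default.xml, the define_network function and the VIRSH variable (which lets you substitute echo for a dry run) are assumptions of mine.

```shell
# Register a libvirt network from an XML file, start it and mark it
# for autostart. VIRSH can be overridden, e.g. VIRSH=echo for a dry run.
define_network() {
    xml="$1"
    name="$2"
    ${VIRSH:-sudo virsh} net-define "$xml" &&
    ${VIRSH:-sudo virsh} net-start "$name" &&
    ${VIRSH:-sudo virsh} net-autostart "$name"
}
```

Calling define_network default.xml default should bring the default network back, which you can confirm with net-list --all.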
You can use the following commands to manage your lab:
virsh list
virsh destroy
virsh net-list --all
virsh net-undefine
Auto Image Upgrade: DHCP Client State Reset: fxp0.0
This is an automatic process based on DHCP for operating system upgrade on Juniper switches. You can disable it by entering the following commands in JunOS configuration mode:
delete chassis auto-image-upgrade
commit
As always, you will be asked to change the root password on the JunOS box. Here is the snippet:
edit system
set root-authentication plain-text-password
XX....XX
Message from syslog
Message from syslogd@ at Apr 9 15:28:36 …
fpc0 Frame 8: sp = 0xffeeb978, pc = 0x807c415
Message from syslogd@ at Nov 11 09:07:28 …
fpc0 Scheduler Oinker
This log message kept bugging me. The only way I found to silence it is a regex that keeps it from reaching the console.
edit system syslog user *
set match "!fpc"
Linecard discovery and communication checks
This Juniper troubleshooting procedure is interesting: https://www.juniper.net/documentation/en_US/vmx/topics/task/verification/vmx-vm-connection-troubleshooting-esxi.html
Two things seen there: troubleshoot communication between vFP and vCP using ping, and check linecard discovery by vCP.
To ping, you need to find the IPs on both vCP and vFP. This can be done with show interfaces terse on the vCP and ifconfig on the vFP.
Then the following ping command is used:
root> ping 188.8.131.52 routing-instance __juniper_private1__
To check the line card discovery, run show chassis fpc and look for a linecard in slot 0, then run show interfaces terse and look for ge-0/x/x interfaces.
If they are missing, two commands are recommended to restart the corresponding processes:
request chassis fpc slot 0 restart
and, if it fails with the message "FPC is in transition":
restart chassis-control
CPU usage / Configure lite-mode
If you see high CPU usage, you might need to change the performance mode to lite.
root# edit chassis fpc 0
root# set lite-mode
Source:
https://forums.juniper.net/t5/vMX/Juniper-VMX-bad-cpu-usage-using-lite-mode-in-kvm-compared-to/td-p/318927
https://www.juniper.net/documentation/en_US/vmx/topics/task/configuration/vmx-chassis-flow-caching-enabling.html
SQUASHFS issue on vFP
SQUASHFS is a compressed read-only filesystem that is generally used for CD images, liveCD… Getting these error messages may mean that the drive or the media has an issue. Check your img file (hash fingerprint) and, if there is no issue on that side, reboot your VM.
SQUASHFS error: squashfs_read_data failed to read block 0x7ca248
SQUASHFS error: Unable to read fragment cache entry [7ca248]
SQUASHFS error: Unable to read page, block 7ca248, size 980f
SQUASHFS error: Unable to read fragment cache entry [7ca248]
SQUASHFS error: Unable to read page, block 7ca248, size 980f
SQUASHFS error: Unable to read fragment cache entry [7ca248]
SQUASHFS error: Unable to read page, block 7ca248, size 980f
SQUASHFS error: Unable to read fragment cache entry [7ca248]
SQUASHFS error: Unable to read page, block 7ca248, size 980f
SQUASHFS error: Unable to read fragment cache entry [7ca248]
SQUASHFS error: Unable to read page, block 7ca248, size 980f
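Checking the image fingerprint can be scripted as well. Here is a minimal sketch, assuming that a SHA-256 digest is published alongside the image you downloaded (use the matching tool if only another hash type is available). The function name verify_image is mine.

```shell
# Compare the SHA-256 digest of an image with an expected value.
# Prints OK on a match, returns 1 and prints the actual digest otherwise.
verify_image() {
    image="$1"
    expected="$2"
    actual=$(sha256sum "$image" | awk '{ print $1 }')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $image"
    else
        echo "MISMATCH: $image (got $actual)" >&2
        return 1
    fi
}
```

You would call it with the path of the vFP image and the digest published for it. A mismatch suggests downloading or copying the image again before going any further.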
For reference, here are the default credentials:
- vCP: root / no password
- vFP: root / root
If you are crazy enough to decide to emulate multiple linecards, here’s an interesting link: https://jncie.eu/how-to-deploy-vmx-with-multiple-res-and-multiple-fpcs-in-eve-ng-kvm/
In that particular case, you have multiple vPFE for one single vCP.
Mounting issue on vPFE for /dev/sda2
If you get the following error:
mount: mounting /dev/sda2 on /mnt failed: No such file or directory
you can fix it by changing the virtual hard disk drive type to IDE for this machine.
Moving VMs and files
In case you want to change the host to increase performance or just to move the VM to another lab environment, I noted that it is generally preferable to have a fresh install instead. It will spare you some useless troubleshooting time :-)