Build an OpenStack/Ceph cluster with Cumulus Networks in GNS3: part 2
Adding virtual machine images to GNS3
I’m going to assume that at this stage, you’ve got a fully working (and tested) GNS3 install on a suitably powerful Linux host. Once that is complete, the next step is to download the two virtual machine images we discussed in part 1 of this blog, and integrate them into GNS3.
In my setup, I downloaded the Cumulus VX 4.0 QCOW2 image (though you are welcome to try newer releases which should work), which you can obtain by visiting this link: https://cumulusnetworks.com/accounts/login/?next=/products/cumulus-vx/download/
I also downloaded the Ubuntu Server 18.04.4 QCOW2 image from here: https://cloud-images.ubuntu.com/bionic/current/bionic-server-cloudimg-amd64.img
Once you have downloaded these two images, the next task is to integrate them into GNS3. To do this:
Make the following additional changes once the images have been copied over:
a. :/symbols/classic/ethernet_switch.svg for Cumulus VX
b. :/symbols/classic/server.svg for Ubuntu Server
With this stage complete, you can proceed to building your infrastructure on the canvas.
Build virtual infrastructure on the canvas
Once we’ve defined our QEMU VM’s, the real fun starts! We can now simply click and drag our infrastructure onto the canvas. GNS3 doesn’t support orthogonal lines for the connections, so it can be a little crowded by the time you’ve completed as complex an architecture as we are building here – however the effort is well worth it, especially when you consider that you can right click on any connection and sniff the traffic running over it! This is a great learning and investigative tool.
One important learning is this – GNS3 does not have the capability to edit the number of network connections on a device on the canvas once you’ve connected it up – you have to delete all your connections if you want to edit this property. Thus it’s worth taking some time to plan out the design or simply add more ports than you need.
You’ll also need to edit the amount of RAM some of the VM’s are allocated, and the disk sizes too. Also for our static DHCP allocations for the Cumulus VX switches you will need to set the MAC addresses recorded in the table below. Note that the sizes and values recorded in this table are the ones I have tested with – they should however be viewed as minimum viable values, and may need to be increased depending on how you want to test your virtualized infrastructure:
Note that all disks are created as sparse disks, and so will not occupy the amount of storage specified above. When I had fully built my demo environment, it occupied a total of around 112GB.
In general, the network topology is laid out as follows:
- Mgmt01 connects to both a Cloud and a NAT device in GNS3 – the NAT device provides fast connectivity, outbound, for all VM’s on the canvas. The Cloud device is very slow, but allows you to SSH directly into your management VM if you wish. You can leave this out if you prefer.
- On all other nodes, eth0 is connected to swmgmt, and is used for out-of-band management.
- All OpenStack VM’s have 5 Ethernet ports. After eth0 (management), the other 4 are used in bonded pairs for the two physical networks suggested in the openstack-ansible example document. They are wired up alternatively to leaf01 and leaf02 in pairs, starting at eth3 on these switches (eth1 and eth2 were used for testing purposes in the early stages of the design and are not assigned in the current version).
- The high numbered ports on the switches are used for the interconnects between the switches that enable MLAG to operate.
Your resulting canvas should look something like this:
Creating cloud-init configurations
Once you have built your infrastructure on the canvas, the next task on our list is to build a set of cloud-init ISO images. Although the Cumulus VX images have a well known default username and password that we can make use of, the Ubuntu images do not – in fact they have no default password set at all, as they expect to get this from the cloud orchestration system (e.g. OpenStack, Amazon EC2, etc.). Fortunately for us, cloud-init is built into the Ubuntu cloud images we downloaded, and can perform any of a number of tasks, from setting a default password, to configuring the network interfaces (even bonding can be set up!), and running arbitrary commands. On the first boot of every VM, cloud-init searches a well known set of locations for its configuration data, and one of these happens to be an ISO image attached to the VM. Thus we can create a small, unique ISO image for each VM that does the following:
- Sets the hostname for each VM
- Sets the password for the default user account (ubuntu)
- Adds an SSH public key to this user account for passwordless management
- Changes the boot parameters of the VM to use the old eth0, eth1,… style of network interface naming
- Installs Python (to enable further automation with Ansible later on).
In addition, for our “management” VM, our cloud-init scripts go even further, both installing Ansible and even cloning the GitHub repository that accompanies this article to the home ubuntu user’s directory.
Rather than go through the code in detail here, we’ll leave you to explore it yourself, as it is all available here: https://github.com/jamesfreeman959/gns3-cumulus-openstack-ansible
Makefile’s have been placed at appropriate places in the directory structure to help you get started easily. The process for building the ISO files is as simple as:
a. git clone https://github.com/jamesfreeman959/gns3-cumulus-openstack-ansible
Once this is completed, you will find that all your VM images will have all essential configuration performed on their first boot. However don’t boot all your nodes just yet! For everything to come up cleanly, we need to configure the network in a logical sequence, the next step in which is to complete the configuration of the management node.
Configuring the management node
Right click on the mgmt01 VM and click Start. Leave all other VM’s powered off at this stage. Now when you double-click on it, a console should open and you should see the VM boot up. You will also see it reboot – this is part of the cloud-init configuration which disables persistent network port naming.
Once the reboot completes, you should be presented with a normal login prompt. If you are using the cloud-init examples that accompany this blog, log in with ubuntu/ubuntu. If all has gone well, you will have a clone of the accompanying git repository in your home directory.
In our infrastructure, our management VM is going to perform a number of important functions (as well as running the Ansible playbooks to configure the rest of the infrastructure). It will act as a DNS server for our infrastructure, and also a DHCP server with static leases defined for the Cumulus VX images. It even acts as a simple NAT router so that our infrastructure can download files from the internet, via the NAT1 cloud we placed onto the canvas.
Assuming you’ve created the infrastructure as described (including the MAC addresses), you can simply change into the 3-mgmt01 subdirectory on the git clone, and run the Ansible playbook as follows:
ansible-playbook -i inventory site.yml
When the playbook completes, all the functionality of the management VM will be configured, and we can power the out-of-band management switch.
Configuring the out-of-band management network
The out-of-band management switch is a simple, layer 2 switch – however it still needs to be told that this is its configuration. Fortunately, Cumulus Networks’ switches are easy to configure using Ansible, and a playbook is provided in the accompanying git repository. Simply change into the 4-swmgmt subdirectory, and run the playbook as follows:
ansible-playbook -i inventory site.yml
Once it completes successfully, swmgmt will be configured as a layer-2 switch for managing all the other devices on our virtual infrastructure. From here, you can power the other 4 switches.
Configuring the spine-leaf switch topology
Once the 4 switches which comprise the spine-leaf topology boot up, you can proceed to configure them. Once again, Ansible playbooks are provided for exactly this purpose. The switch configuration for these switches is a simple layer-2 MLAG configuration – more advanced configurations are possible and I’ll leave it to you to explore more advanced options – however the code provided in the accompanying repository will set up a spine and leaf topology, with the switch ports configured to support resilient bonding on all the OpenStack nodes. Note that the port configuration assumes you have wired up the virtual machines as discussed earlier in this blog, and you will need to edit the configuration if you wish to change the port assignments.
Once you are ready to configure the switches, simply run the playbook in the 5-swcore subdirectory, as follows:
ansible-playbook -i inventory site.yml
When that playbook completes successfully, you will have a fully configured infrastructure with a resilient switching architecture and bonded network configurations on all nodes. The final stage is to power up all remaining nodes, and to install OpenStack.
With the switching infrastructure set up, you should now be able to power on all remaining virtual machines. They will obtain all their initial configuration (including networking) from the cloud-init ISO’s we attached earlier, and will present themselves as a blank canvas on which OpenStack can be installed. It is worth noting that although I have created this as a worked example for OpenStack, you could use this to simulate just about any distributed architecture.
Installing OpenStack, even from the openstack-ansible project, is a fairly lengthy and involved process. All details are given in the README.md file of the 6-osansible subdirectory of the GitHub repository accompanying this article, so we won’t repeat them again here.
In short the process involves cloning the openstack-ansible code onto the management node, preparing the node for OpenStack deployment (a special bootstrap script is provided for this), setting appropriate variables to configure the installation, and then running the playbooks in the correct sequence. If you’ve never done this before, this can be quite a daunting task, especially when you come across issues such as the incompatibility between ceilometer and the ujson Python module that exists in the openstack-ansible 19 release.
A great deal of credit goes to the team that manage the official openstack-ansible IRC channel, without whom the creation of this demo environment would not have been possible.
Although the nested virtualization that this setup takes advantage of can at times be slow, and the hardware requirements, especially in terms of memory and I/O performance are significant, the techniques we have covered here offer great potential for both training, and development purposes.
For example, you could build a real production OpenStack configuration on physical servers, using real physical Cumulus Networks powered switches using most of the same playbooks and cloud-init configuration data that we have used here. Similarly, you could simulate your production environment in GNS3 on a suitably powerful host, thus performing penetration testing, or testing new configurations, or even failure modes, to see how the environment responds. For example, you can easily power down entire sections of the GNS3 virtual infrastructure, or delete connections, in full confidence that you are not going to (accidentally or otherwise) impact any other vital services. The environment is a complete sandbox, and so is ideal for security testing, especially if you are investigating the impact of certain kinds of attacks or malware.
The opportunities provided by this kind of setup are endless, and the kindness of Cumulus Networks in making Cumulus VX available for free means you can easily simulate your real network infrastructure in a contained, virtual environment.
Source:: Cumulus Networks