Introduction
I’ve not posted about NetBox before, which I really should have done as I’ve been using it for years. It is a fantastic piece of open-source software that has been steadily improving over time. Originally conceived by Jeremy Stretch of packetlife.net (I’m sure you can guess where the name of my own blog came from), it has matured into a feature-rich infrastructure modeling application that meets the needs of network engineers.
You can read all about NetBox here including of course how to get started building your own:
https://netbox.readthedocs.io/en/stable/
I’ve been running the application since its early releases, as it proved so much more helpful than spreadsheets for keeping track of IP addresses and prefix allocations in production. Like most good software it has continually evolved, and it was the arrival of the API that made it really shine and got me thinking about how I could lean on it to improve my workflows.
So where does automation fit into this? Automation is another buzzword doing the rounds at the moment, because the big network vendors are releasing their own software automation solutions that change the very way infrastructure devices are managed (think SDN). Automation can also simply mean developing software to decrease the amount of manual work required. Creating tools like this makes life as a network engineer easier and ultimately saves time, with the added benefits of increasing consistency and reducing errors. There are of course many ways to achieve any goal, but I’ll be describing and showing examples of the code I have written and how I have implemented automation to help manage what could be classed as legacy network infrastructure, i.e. devices that still need to be managed and configured individually via the CLI.
The first thing to make clear is that if you have no experience with python, jinja2, YAML, git, etc., you should expect a very steep learning curve. I’ve been using python for 6+ years so I’m already very familiar with the constructs of the language and use it frequently as part of my role. I highly recommend the following book by Eric Matthes if you are interested in getting up to speed with python.
Methodology
I started by breaking down the manual steps of what I was trying to achieve. The goal is ultimately fairly simple: use code to create the configuration for a device and record that information in NetBox. The format I decided to use for the device data was a simple choice, as it is easy to read for both human beings and machines. YAML is ubiquitous, used heavily, and doesn’t require a huge amount of learning. Of course, I need to stress that before you can code a switch configuration, you need to know what the configuration would look like if it were written manually. Any engineer who has been configuring devices for even a short time will recognise that comparable devices share a large amount of almost identical configuration. In the past engineers would have kept text files as templates, so that the next time a device required configuration the bulk of it came from the template. The same principle applies here. The YAML file holds the information that differs between devices, i.e. the variables.
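To make the template-plus-variables principle concrete, here is a minimal sketch. It uses Python's standard-library string.Template purely as a stand-in for the templating language (the real templates in this article use jinja2), and the hostnames, gateway and the two-line configuration are made-up illustration values rather than anything from my production files.

```python
from string import Template

# Static configuration shared by every device of this type.
# string.Template is only a standard-library stand-in for a real
# templating language here.
STATIC_CONFIG = Template(
    "hostname $hostname\n"
    "!\n"
    "ip default-gateway $gateway\n"
    "!\n"
)

# Per-device variables, as they would be held in one YAML file per device
# (illustrative values only).
device_a = {"hostname": "SWITCH-A", "gateway": "192.0.2.1"}
device_b = {"hostname": "SWITCH-B", "gateway": "192.0.2.1"}

# The same template produces a consistent configuration for each device.
config_a = STATIC_CONFIG.substitute(device_a)
config_b = STATIC_CONFIG.substitute(device_b)
print(config_a)
print(config_b)
```

The only thing that changes between the two outputs is the data fed in, which is exactly the role the YAML file plays.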
Example YAML construct
Example file:
device-name.YAML
hostname: Device-Name
vlans: 2,3,4,5
gateway: 192.0.2.1
mgmt_address: 192.0.2.10
mgmt_netmask: 255.255.255.0
mgmt_interface: Vlan5
site: site-name
rack_a: rack-id-a
rack_b: rack-id-b
position_a: 42
position_b: 41
serial_a: serial-a
serial_b: serial-b
interfaces:
  - name: Port-channel1
    description:
    mode:
    vlan:
    enabled: False
As you can see, the YAML holds the unique information about a device. In this case the device is a stack, so some information has corresponding a and b entries for the two devices that make up the stacked pair. For devices that are managed individually, a file per physical device can be used. The single file with a and b entries is how I have implemented the code wherever a shared control plane is in operation, as with stacking.
Now, this is just a snippet of course, and I would not suggest creating these files manually. Each file includes many lines for the interfaces available on a device, and this is where some helper python scripts come into play.
Example file:
cisco2960-xr_template_create_24.py
#!/usr/bin/env python3
# Append a blank YAML template for a stacked pair of Cisco 2960-XR switches
with open("cisco2960-xr_24.yaml", "a") as f:
    # Device-level keys, left blank to be filled in per device
    print('hostname: ', file=f)
    print('vlans: ', file=f)
    print('gateway: ', file=f)
    print('mgmt_address: ', file=f)
    print('mgmt_netmask: ', file=f)
    print('mgmt_interface: ', file=f)
    print('site: ', file=f)
    print('rack_a: ', file=f)
    print('rack_b: ', file=f)
    print('position_a: ', file=f)
    print('position_b: ', file=f)
    print('serial_a: ', file=f)
    print('serial_b: ', file=f)
    print('interfaces: ', file=f)
    # Logical port-channel interfaces
    for x in range(1, 49):
        print('  - name: Port-channel' + str(x), file=f)
        print('    description: ', file=f)
        print('    mode: ', file=f)
        print('    vlan: ', file=f)
        print('    enabled: False', file=f)
    # Physical interfaces on stack member 1
    for x in range(1, 27):
        print('  - name: GigabitEthernet1/0/' + str(x), file=f)
        print('    description: ', file=f)
        print('    mode: ', file=f)
        print('    vlan: ', file=f)
        print('    channel-group: ', file=f)
        print('    enabled: False', file=f)
    for x in range(1, 3):
        print('  - name: TenGigabitEthernet1/0/' + str(x), file=f)
        print('    description: ', file=f)
        print('    mode: ', file=f)
        print('    vlan: ', file=f)
        print('    channel-group: ', file=f)
        print('    enabled: False', file=f)
    # Physical interfaces on stack member 2
    for x in range(1, 27):
        print('  - name: GigabitEthernet2/0/' + str(x), file=f)
        print('    description: ', file=f)
        print('    mode: ', file=f)
        print('    vlan: ', file=f)
        print('    channel-group: ', file=f)
        print('    enabled: False', file=f)
    for x in range(1, 3):
        print('  - name: TenGigabitEthernet2/0/' + str(x), file=f)
        print('    description: ', file=f)
        print('    mode: ', file=f)
        print('    vlan: ', file=f)
        print('    channel-group: ', file=f)
        print('    enabled: False', file=f)
A rather simple script that produces YAML file templates on demand. A script like this should be created for each switch model, because models differ in interface quantities and sometimes in logical naming schemes.
These scripts are fairly simple and didn’t require much effort to implement. The next step, however, took a considerable amount of time to develop and test.
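If maintaining one script per model becomes tedious, the same idea can be driven from a small table of per-model interface layouts. The following is only a sketch under my own assumptions: the MODELS table, build_template and interface_block are hypothetical names, and the counts simply mirror the script above rather than any official port listing.

```python
# Device-level keys written blank at the top of every template
DEVICE_KEYS = [
    "hostname", "vlans", "gateway", "mgmt_address", "mgmt_netmask",
    "mgmt_interface", "site", "rack_a", "rack_b", "position_a",
    "position_b", "serial_a", "serial_b",
]

# Hypothetical per-model layout: port-channel count, stack members and
# (interface name prefix, ports per member) pairs.
MODELS = {
    "cisco2960-xr_24": {
        "port_channels": 48,
        "members": 2,
        "physical": [("GigabitEthernet", 26), ("TenGigabitEthernet", 2)],
    },
}


def interface_block(name, physical):
    """Return the blank YAML lines for a single interface."""
    block = [f"  - name: {name}", "    description: ", "    mode: ", "    vlan: "]
    if physical:
        block.append("    channel-group: ")
    block.append("    enabled: False")
    return block


def build_template(model):
    """Return the blank YAML template text for the given model."""
    layout = MODELS[model]
    lines = [f"{key}: " for key in DEVICE_KEYS]
    lines.append("interfaces: ")
    for number in range(1, layout["port_channels"] + 1):
        lines += interface_block(f"Port-channel{number}", physical=False)
    for member in range(1, layout["members"] + 1):
        for prefix, count in layout["physical"]:
            for port in range(1, count + 1):
                lines += interface_block(f"{prefix}{member}/0/{port}", physical=True)
    return "\n".join(lines) + "\n"


template_text = build_template("cisco2960-xr_24")
print(template_text[:200])
```

Supporting a new model then means adding one entry to the table instead of copying and editing a whole script.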
Example Jinja2 Construct
I decided to use jinja2 as the building block to create device configuration. I knew upfront what a good configuration was and worked backward to create that same good configuration using jinja2 as the templating language. The jinja2 file, therefore, includes all of the static device configurations. It then uses the YAML file to insert the required variables and loop over the interfaces.
Example file:
cisco2960-xr_full_config_template.jinja2
hostname {{ hostname }}
!
.....
!
!
spanning-tree mode rapid-pvst
spanning-tree portfast default
spanning-tree portfast bpduguard default
spanning-tree extend system-id
!
!
vlan {{ vlans }}
!
lldp run
!
{% for interface in interfaces %}
  {% if 'Port-channel' in interface['name'] %}
    {% if interface['enabled'] %}
interface {{ interface['name'] }}
      {% if interface['description'] %}
 description {{ interface['description']|upper }}
      {% endif %}
      {% if interface['mode'] %}
        {% if 'trunk' in interface['mode']|lower %}
          {% if interface['vlan'] %}
 switchport trunk allowed vlan {{ interface['vlan'] }}
 switchport mode trunk
!
          {% else %}
 switchport mode trunk
!
          {% endif %}
        {% elif 'access' in interface['mode']|lower %}
 switchport access vlan {{ interface['vlan'] }}
 switchport mode access
!
        {% endif %}
      {% endif %}
    {% endif %}
  {% endif %}
{% endfor %}
interface FastEthernet0
 no ip address
!
.....
This is just a snippet of course. The full template also includes code to run against the physical interfaces and various static system configuration elements, but I’ve chopped those out for brevity. The way to get this working correctly is to keep rendering the template and checking the output against your manually created configuration. Once the two match, you have a template that will produce consistent configuration for that device type. As with the helper scripts, a new jinja2 template needs to be created for each device type managed using this methodology.
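Checking the rendered output against a known-good manual configuration can itself be scripted. The sketch below uses Python's standard difflib module. The function name config_diff and the tiny sample configurations are illustrative assumptions, not part of my own tooling.

```python
import difflib


def config_diff(reference, rendered):
    """Return unified-diff lines between a known-good and a rendered config."""
    return list(difflib.unified_diff(
        reference.splitlines(),
        rendered.splitlines(),
        fromfile="manual",
        tofile="rendered",
        lineterm="",
    ))


# Illustrative configurations only
manual = "hostname SW1\n!\nlldp run\n!\n"
rendered_ok = "hostname SW1\n!\nlldp run\n!\n"
rendered_bad = "hostname SW1\n!\n!\n"

# An empty diff means the template reproduces the manual configuration
print(config_diff(manual, rendered_ok))

# Any remaining lines show exactly what the template still gets wrong
for line in config_diff(manual, rendered_bad):
    print(line)
```

An empty list is the signal that the template is ready to be trusted for that device type.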
Running the template requires some code again:
Example file:
render_template.py
#!/usr/bin/env python3
import sys

import yaml
from colours import PrintInColour
from jinja2 import Environment, FileSystemLoader, TemplateNotFound

USAGE = "python3 render_template.py <required_template.jinja2> <Device YAML path>"


def usage_exit(message):
    """Print an error and the expected command structure, then exit."""
    PrintInColour.red(message)
    PrintInColour.green("Command structure required is:")
    PrintInColour.green(USAGE)
    sys.exit(1)


# Initialize the Jinja2 environment to load templates
# from the current directory
env = Environment(loader=FileSystemLoader('.'), trim_blocks=True, lstrip_blocks=True)

try:
    template = env.get_template(sys.argv[1])
except (IndexError, TemplateNotFound):
    usage_exit("You must specify a template and YAML")

if not template.filename.endswith('jinja2'):
    usage_exit("You should only use jinja2 code as a template")

# Load the context YAML file into a Python dictionary
try:
    with open(sys.argv[2], 'r') as datafile:
        context = yaml.safe_load(datafile)
except IndexError:
    usage_exit("Have you specified the required files?")
except yaml.YAMLError:
    PrintInColour.red("Valid YAML is required, please check your YAML file")
    sys.exit(1)
except OSError:
    PrintInColour.red("A valid file is required")
    sys.exit(1)

# Render the template and print the resulting document
rendered_template = template.render(**context)
print(rendered_template)
The code gives a warning if not run correctly and ensures that the user understands how to run the code required.
This solves the first issue by ensuring that device configurations are created using code. The next issue is how to record all of this configuration in NetBox without entering it manually through the web interface.
I often use manual configuration via the web interface in NetBox because it is easy, but only really when I am making a few changes. Whenever there was a need for bulk processing I would originally create files for import instead, as NetBox has a native import from CSV feature which works really well. The next big step is to use the API to read the YAML files and update NetBox automatically.
To use the API you must first create a token for your user. If you want to write updates to NetBox, the token needs to be a read/write token. A read-only token is sufficient if you are purely reading data.
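As a rough illustration of how the token is used, the sketch below builds (but does not send) a plain HTTP request for a device lookup with the standard library. The URL netbox.example.com, the token value and the function name build_device_request are placeholders of mine, and the "Token" authorization scheme is how NetBox's REST API has traditionally accepted tokens, so check the documentation for the version you run.

```python
import urllib.parse
import urllib.request


def build_device_request(base_url, token, hostname):
    """Build a GET request that looks up a device by name via the REST API."""
    query = urllib.parse.urlencode({"name": hostname})
    url = f"{base_url.rstrip('/')}/api/dcim/devices/?{query}"
    return urllib.request.Request(url, headers={
        # The API token is sent in the Authorization header
        "Authorization": f"Token {token}",
        "Accept": "application/json",
    })


# Placeholder URL and token, for illustration only
request = build_device_request(
    "https://netbox.example.com", "0123456789abcdef", "Device-Name"
)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it. A read-only token is enough for a lookup like this, whereas creating or updating objects needs a token with write access.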
Example NetBox Update Python Scripts
The first thing to make clear is that my scripts use modularisation. The common functions reside in a modules directory. This improves the readability of the code and ensures consistency when adjustments are required across multiple scripts. I’ve found the best way to modularise python code is to include the modules directory in your PYTHONPATH environment variable. You can then import your code as you would any standard library module, as it will automatically be found within your path.
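The effect of adding a directory to the module search path can be seen in a small self-contained sketch. It creates a throwaway modules directory in a temporary location and adds it to sys.path at runtime, which has the same effect on imports as listing the directory in PYTHONPATH before starting Python. The module name nb_helpers and its greet function are invented for the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Stand-in for a shared "modules" directory holding common functions
modules_dir = Path(tempfile.mkdtemp()) / "modules"
modules_dir.mkdir()
(modules_dir / "nb_helpers.py").write_text(
    "def greet():\n"
    "    return 'hello from modules'\n"
)

# Adding the directory to sys.path makes everything in it importable,
# just as if it had been listed in PYTHONPATH.
sys.path.insert(0, str(modules_dir))
importlib.invalidate_caches()

nb_helpers = importlib.import_module("nb_helpers")
print(nb_helpers.greet())
```

In day-to-day use the environment variable is the tidier option, because none of the scripts need to know where the directory lives.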
I’m also using multiple PyPI packages:
pynetbox
PyYAML
Jinja2
You can communicate with NetBox via standard HTTP calls, but as I know python I prefer to use pynetbox, the NetBox API client library.
I’ll break this next file into multiple parts and explain as we progress through it.
Example file part 1:
netbox_cisco_stack_update.py
#!/usr/bin/env python3
import copy
import sys

import yaml

import netbox  # My own module of common NetBox functions
from colours import PrintInColour

# The script uses the yaml context ingestion file to create unique dictionaries required both for port-channel creation / updates
# and physical interface updates
# Unique to Cisco stacked switches / 3850 & 2960-XR
# 1. Stacked switches are represented as two distinct devices within NetBox but only contain one YAML configuration file

# Load the context YAML file into a Python dictionary
try:
    with open(sys.argv[1], 'r') as datafile:
        context = yaml.safe_load(datafile)
except IndexError:
    PrintInColour.red("You must specify your YAML file on running the script")
    sys.exit(1)
except (OSError, yaml.YAMLError):
    PrintInColour.red("Could not load your YAML, check the file exists and conforms to the required standard")
    sys.exit(1)

# Function definitions
# All required functions being sourced from modules

# Check how many changes were made
device_changes_made = 0
port_channel_creations = 0
port_channel_updates = 0
physical_updates_made = 0
cables_connected = 0

# Load PyNetBox API
netbox_urls = {
    'prod': 'https://netbox.ip-life.net',
    'uat': 'https://uat-netbox.ip-life.net',
    'test': 'https://test-netbox.ip-life.net'
}
netbox_url = netbox_urls['prod']  # Select required URL
nb = netbox.load_netbox_api(netbox_url)
if netbox_url != netbox_urls['prod']:
    PrintInColour.red("Non production URL in use ({0})".format(nb.base_url))

# Get devices (only look up the stack partner once the first device is found)
device_a = nb.dcim.devices.get(name=context['hostname'])
device_b = None
if device_a:
    device_b = nb.dcim.devices.get(name=netbox.return_stack_pair_name(device_a.name))

# Check devices exist
if not device_a or not device_b:
    PrintInColour.red("Check the stacked devices have been created as at least one is missing in NetBox")
    sys.exit(1)
First, we import the various modules required. The netbox library is my own custom module where all common functions are defined. Second, the YAML is loaded, and an error is produced if it is in an invalid format. Third, we create some variables to record which changes are made. Fourth, we load the NetBox API, and finally we attempt to get the devices in question from NetBox and check that they actually exist.
I would urge you when developing scripts that you should try and get into the mindset of always ensuring your script fails in a controlled manner when an error is encountered. Most people are overwhelmed when presented with a python error. I find that including a warning about what has gone wrong and terminating the program early makes the script much easier to use in the long term.
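One way to keep this habit cheap is a single helper that prints a plain-language message and exits, used wherever something can go wrong. This is a minimal sketch rather than the code from my scripts: fail and get_yaml_path are hypothetical names, and it writes to stderr with print instead of using my colour helper so that it stays self-contained.

```python
import sys


def fail(message, usage=None, code=1):
    """Print a readable error (and optional usage hint), then exit."""
    print(f"ERROR: {message}", file=sys.stderr)
    if usage:
        print(f"Usage: {usage}", file=sys.stderr)
    sys.exit(code)


def get_yaml_path(argv):
    """Return the YAML path from the command line or exit with guidance."""
    if len(argv) < 2:
        fail(
            "You must specify your YAML file on running the script",
            usage=f"python3 {argv[0]} <device.yaml>",
        )
    return argv[1]


# A correct invocation returns the path without any output
print(get_yaml_path(["netbox_update.py", "device-name.yaml"]))
```

The user sees one sentence and a usage line instead of a traceback, and the non-zero exit code still tells any calling shell script that the run failed.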
Example file part 2:
netbox_cisco_stack_update.py
# Check this script is being used on an access switch
PrintInColour.green("Performing dictionary creations and checks")
netbox.check_device_is_access(nb, device_a)
# Perform checks of device details and update if required
# Add required dict entries
dd_a = ({ 'device_type': device_a.device_type.id, 'device_role': device_a.device_role.id, 'site': device_a.site.id })
dd_b = ({ 'device_type': device_b.device_type.id, 'device_role': device_b.device_role.id, 'site': device_b.site.id })
# Check Site (int, required)
# Only require a single site check per stack...
device_changes_made = netbox.update_site(nb, context, device_a, dd_a, device_changes_made)
device_changes_made = netbox.update_site(nb, context, device_b, dd_b, device_changes_made)
# Check Rack (int)
device_changes_made = netbox.update_rack(nb, context, 'rack_a', device_a, dd_a, device_changes_made)
device_changes_made = netbox.update_rack(nb, context, 'rack_b', device_b, dd_b, device_changes_made)
# Check Position - lowest unit position of the device (int) / ISSUE - Ensure the positions make a unique set before updating the dict, and provide feedback that a mistake has occurred if they do not
device_changes_made = netbox.update_position(context, 'position_a', device_a, dd_a, device_changes_made)
device_changes_made = netbox.update_position(context, 'position_b', device_b, dd_b, device_changes_made)
# Check Serial number (str)
# Check for unique entries before adding....
device_changes_made = netbox.update_serial(context, 'serial_a', device_a, dd_a, device_changes_made)
device_changes_made = netbox.update_serial(context, 'serial_b', device_b, dd_b, device_changes_made)
# Make required changes
# If position is added, face must be added. Defaulting to front.
# device_a.update({ 'device_type': device_a.device_type.id, 'device_role': device_a.device_role.id, 'site': device_a.site.id, 'serial': context['serial_a'], 'face': 'front', 'position': 38 })
netbox.update_device(device_a, dd_a)
netbox.update_device(device_b, dd_b)
We now reach the part of the script where updates are attempted within NetBox. The functions reside in the netbox module, which keeps the script significantly smaller and, as mentioned previously, ensures consistent code. I’ve included a sample of the netbox module further down to show how those functions are structured.
First, we check if the device is an access switch type. If not the script terminates with a warning. Second, dictionaries are created for each device in the stack with required keys and values. We then check that the site, rack, position, and serial number are up to date in NetBox. There is a final update of the device which takes place to create any missing configuration.
Example file:
netbox.py
#!/usr/bin/env python3
import copy
import ipaddress
import netbox
import os
import sys
import pynetbox
from colours import PrintInColour
.....
def check_device_is_access(api, device):
    """Ensure the device is a Cisco access switch, otherwise exit gracefully.

    Args:
        api: The NetBox API variable
        device: The device by name
    Raises:
        SystemExit: after warning the user, if used for the wrong device
    """
    try:
        d = api.dcim.devices.get(name=device)
        is_access = 'access' in str(d.device_role).lower()
    except Exception:
        PrintInColour.red("Something has gone wrong in the check_device_is_access module")
        sys.exit(1)
    # Exit outside the try block so the warning above is not printed as well
    if not is_access:
        PrintInColour.red('The device is not a Cisco Access switch, you should not be using this version with this device type..')
        sys.exit(1)
.....
def update_site(api, yaml_context, _device, _dd, _device_changes_made):
    """Update the device dictionary with the site from the YAML if it differs.

    Args:
        api: NetBox entrypoint
        yaml_context: YAML context
        _device: Device
        _dd: Device specific dictionary
        _device_changes_made: Running count of changes performed
    Returns:
        The updated count of changes (unchanged if no update was needed)
    """
    try:
        if not yaml_context['site']:
            PrintInColour.red("YAML does not include site information")
            return _device_changes_made
        # Nothing to do if NetBox already matches the YAML
        if _device.site and yaml_context['site'].lower() == _device.site.display.lower():
            return _device_changes_made
        # Update NetBox based on context data
        site = api.dcim.sites.get(name=yaml_context['site'].upper())
        if site:
            _dd.update({'site': site.id})
            return _device_changes_made + 1
        # Always return a number so the caller never receives None (TypeError)
        return _device_changes_made
    except Exception:
        PrintInColour.red("Something has gone wrong in the update_site module ...")
        raise
def update_rack(api, yaml_context, yaml_text, _device, _dd, _device_changes_made):
    """Update the device dictionary with the rack from the YAML if it differs.

    Args:
        api: NetBox entrypoint
        yaml_context: YAML context
        yaml_text: YAML key holding the rack facility ID (Cisco devices in the format rack_a & rack_b)
        _device: Device
        _dd: Device specific dictionary
        _device_changes_made: Running count of changes performed
    Returns:
        The updated count of changes (unchanged if no update was needed)
    """
    try:
        if not yaml_context[yaml_text]:
            PrintInColour.red("YAML does not include rack facility ID information")
            return _device_changes_made
        # Get Rack info as NetBox can error unexpectedly if you try and return the facility_id using device_a.rack.facility_id method
        rack = api.dcim.racks.get(facility_id=yaml_context[yaml_text].upper())
        if not rack:
            PrintInColour.red('Rack {0} is missing from NetBox'.format(yaml_context[yaml_text]))
            return _device_changes_made
        if not _device.rack:
            _dd.update({'rack': rack.id})
            return _device_changes_made + 1
        # Compare by NetBox ID so the device's current rack is what gets checked
        if _device.rack.id == rack.id:
            _dd.update({'rack': rack.id})
            return _device_changes_made
        # First nullify the existing info so it can be updated
        dd_del = copy.deepcopy(_dd)
        dd_del.update({'rack': None})
        _device.update(dd_del)
        # Update NetBox based on context data
        _dd.update({'rack': rack.id})
        return _device_changes_made + 1
    except Exception:
        PrintInColour.red("Something has gone wrong in the update_rack module ...")
        raise
The functions above are a sample of those called from the module.
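The pattern these module functions share is that each one receives the running change count, updates the device dictionary in place if needed, and returns the new count for the caller to store. The sketch below shows that pattern on plain dictionaries with no NetBox connection. The function update_serial_sketch is a hypothetical illustration and not the update_serial function from my module.

```python
def update_serial_sketch(yaml_context, yaml_key, current_serial, device_dict, changes_made):
    """Queue a serial number update if the YAML differs from the current value.

    Returns the running change count, incremented only when an update is queued.
    """
    wanted = yaml_context.get(yaml_key)
    if not wanted or wanted == current_serial:
        return changes_made  # Nothing to change
    device_dict["serial"] = wanted  # Mutates the caller's dictionary in place
    return changes_made + 1


# Illustrative data: device a is already correct, device b has no serial recorded
context = {"serial_a": "serial-a", "serial_b": "serial-b"}
dd_a, dd_b = {}, {}
changes = 0
changes = update_serial_sketch(context, "serial_a", "serial-a", dd_a, changes)
changes = update_serial_sketch(context, "serial_b", "", dd_b, changes)
print(changes, dd_a, dd_b)
```

Because integers are immutable in python, the count has to be returned and reassigned by the caller, whereas the dictionary can simply be updated in place.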
We then progress on in the script to create the required port-channel interfaces. The YAML includes blank port-channel interfaces for the total number the device supports.
Example file part 3:
netbox_cisco_stack_update.py
# Import required port channels from context
port_channel_intfs_a = []
for x in context.get('interfaces'):
    if 'Port-channel' in x.get('name'):
        port_channel_intfs_a.append(x)

# Deepcopy the list of dictionaries to ensure unique device ids
port_channel_intfs_b = copy.deepcopy(port_channel_intfs_a)

# Update dictionary with required variables
netbox.update_portchannel_dictionaries_cisco(nb, port_channel_intfs_a, device_a)
netbox.update_portchannel_dictionaries_cisco(nb, port_channel_intfs_b, device_b)

# Create missing port-channel interfaces
port_channel_creations = netbox.create_port_channel_interfaces(nb, port_channel_intfs_a, device_a, port_channel_creations)
port_channel_creations = netbox.create_port_channel_interfaces(nb, port_channel_intfs_b, device_b, port_channel_creations)

# Update port-channel interfaces
port_channel_updates = netbox.update_portchannel_interfaces(nb, port_channel_intfs_a, device_a, port_channel_updates)
port_channel_updates = netbox.update_portchannel_interfaces(nb, port_channel_intfs_b, device_b, port_channel_updates)

# Import physical interfaces from context
# The substring match also picks up the TenGigabitEthernet interfaces
physical_interfaces_a = []
physical_interfaces_b = []
for x in context.get('interfaces'):
    if 'GigabitEthernet1' in x.get('name'):
        physical_interfaces_a.append(x)
    elif 'GigabitEthernet2' in x.get('name'):
        physical_interfaces_b.append(x)

# Update dictionary with required variables
netbox.update_physical_dictionaries_cisco(nb, physical_interfaces_a, device_a)
netbox.update_physical_dictionaries_cisco(nb, physical_interfaces_b, device_b)

# Update physical interfaces
physical_updates_made = netbox.update_physical_interfaces(nb, physical_interfaces_a, device_a, physical_updates_made)
physical_updates_made = netbox.update_physical_interfaces(nb, physical_interfaces_b, device_b, physical_updates_made)

# Update cable connections
cables_connected = netbox.update_cables_connections(nb, physical_interfaces_a, device_a, cables_connected)
cables_connected = netbox.update_cables_connections(nb, physical_interfaces_b, device_b, cables_connected)

# Display changes made
total_changes = (device_changes_made + port_channel_creations + port_channel_updates
                 + physical_updates_made + cables_connected)
if total_changes >= 1:
    PrintInColour.green("Complete, NetBox updated. A total of {0} changes were made on the system".format(total_changes))
    if device_changes_made >= 1:
        PrintInColour.green("Number of device specific information updated = {0}".format(device_changes_made))
    if port_channel_creations >= 1:
        PrintInColour.green("Port-channel creations completed = {0}".format(port_channel_creations))
    if port_channel_updates >= 1:
        PrintInColour.green("Port-channels updated = {0}".format(port_channel_updates))
    if physical_updates_made >= 1:
        PrintInColour.green("Physical Interfaces updated = {0}".format(physical_updates_made))
    if cables_connected >= 1:
        PrintInColour.green("Cables connected = {0}".format(cables_connected))
else:
    PrintInColour.green("Complete, No changes required, YAML configuration file matches NetBox configuration")
The script loads the port-channel interfaces from the YAML file and proceeds to deep copy into two separate variables for each device. Any missing port-channel interfaces within NetBox are created and then updated with details from the YAML. A similar process is performed for the physical interfaces except of course the creation of those interfaces is handled when originally creating the device in NetBox using the templated physical interfaces within the device template. I also perform an operation where the logical interface names are transformed from the recorded ‘GigabitEthernet2’ to ‘GigabitEthernet1’ to match the NetBox device record. Each device’s physical interfaces are then updated with the required configuration. Finally, the changes made are displayed to the user once the script has finished running.
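The splitting and renaming step can be sketched on its own with a regular expression. This is an illustration under my own assumptions: split_by_member and MEMBER_RE are hypothetical names, and the real work in my scripts happens inside the netbox module functions, which are not shown here.

```python
import re

# Matches names such as GigabitEthernet2/0/5 or TenGigabitEthernet1/0/2
MEMBER_RE = re.compile(
    r"^(?P<prefix>(?:Ten)?GigabitEthernet)(?P<member>\d+)/(?P<rest>\d+/\d+)$"
)


def split_by_member(interfaces):
    """Group physical interfaces by stack member, renaming each to member 1.

    Every stack member is a separate device in NetBox, so each device's
    interfaces are recorded there as <prefix>1/0/<port>.
    """
    members = {}
    for intf in interfaces:
        match = MEMBER_RE.match(intf["name"])
        if not match:
            continue  # Logical interfaces such as Port-channel are skipped
        renamed = dict(intf, name=f"{match['prefix']}1/{match['rest']}")
        members.setdefault(int(match["member"]), []).append(renamed)
    return members


# Illustrative interface entries as they would appear in the YAML
yaml_interfaces = [
    {"name": "Port-channel1", "enabled": False},
    {"name": "GigabitEthernet1/0/1", "enabled": True},
    {"name": "GigabitEthernet2/0/1", "enabled": True},
    {"name": "TenGigabitEthernet2/0/2", "enabled": False},
]
by_member = split_by_member(yaml_interfaces)
for member, intfs in sorted(by_member.items()):
    print(member, [i["name"] for i in intfs])
```

The copies returned are new dictionaries, so the names loaded from the YAML are left untouched for anything else that needs the original stack numbering.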
Conclusion
As you can see, NetBox is an incredibly powerful platform that allows gradual improvements to workflows using common toolsets. I mentioned at the beginning of this article that you will have a hard time if you aren’t familiar with the various tools. This is because the traditional network engineer role is undergoing a steady transformation in which many new technologies need to be learned. The following Cisco article details many of the important skills required:
This is aligning the future of network engineering much more closely with the role of a developer, which is quite a change of mindset. Tools like git, for example, become a necessity when coding applications, as does learning the methodologies of software releases and automation techniques.
We’re heading into a future where very exciting changes are taking place and there is a paradigm shift in the way modern networks are being deployed. There are, however, many improvements that can be made to how existing networks are managed by applying automation techniques to legacy infrastructure. I’m sure there are very few companies that have the resources to perform a wholesale replacement of their network to use all of the latest and greatest technology. Automation is effectively the ability to develop and use software to solve problems. I’m quite sure that NetBox will continue to evolve, and that the engineers who depend on it will continue to use it for increasingly powerful automation.