EdgeOS - Load-Balancing Dual WAN (Part 1)

This is the first post in a two-part series. When I started working from home full-time, reliable Internet access became essential. The two primary ISPs in my area are Comcast (unreliable) and AT&T (slow). To make sure I always had a working connection, I signed up for both, which meant configuring dual-WAN on my Ubiquiti EdgeRouter ERLite-3.

Introduction

I decided to write about my specific setup because I’ve found some interesting quirks that aren’t really discussed on the Ubiquiti Guide on the matter. Specifically, as I’ve had the configuration now for quite a while, there have been changes I’ve had to make and things that I’ve found that don’t work properly. In part 1, I’ll walk you through the whole process including the setup that Ubiquiti covered in their article to get a basic setup. In part 2, I’ll shift into the more detailed configurations and additional notes on my part.

Specific Note on ERLite-3

I strongly recommend having a way to use the console port on the ERLite-3 if you plan to do any sort of dual-WAN configuration on it. With only three ports, two go to the WAN connections and the only remaining path for administration is the LAN port. If a change that affects the LAN port doesn’t work as expected, your only way to recover the router is to reboot it.

For my own console port, I purchased this serial to USB converter and this rollover cable. This setup works great using screen on macOS with no additional drivers.
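For reference, a console session from macOS might look like the sketch below. The device name is a placeholder (it depends on your adapter), and 115200 baud is, as far as I know, the EdgeRouter console default, so treat both as assumptions to verify against your own hardware.

```
# List USB serial devices; the exact name depends on your adapter
ls /dev/tty.usb*

# Open the console at 115200 baud
# (tty.usbserial-XXXX is a hypothetical name; use the one listed above)
screen /dev/tty.usbserial-XXXX 115200
```

To leave the session, press Ctrl-A followed by K and confirm.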

Things That Break

In general, dual-WAN on EdgeOS makes the system more complex, and certain tasks now need more thought than they did before. This section covers a few things you can expect to break or to require more complex configuration.

QoS

If you’re relying on QoS for something critical, the traditional QoS configuration on EdgeOS may not work as expected with a dual-WAN setup. If you intend to apply your shaping policies on your LAN interface and your two WAN connections have different characteristics (primarily speed), you have no (easy) way to express both, because the LAN interface accepts only one policy.

As I plan on going through each part of my EdgeRouter configuration on this blog, I hope to be able to provide a post at some point that digs into the details on this one and provides a practical solution.

Port Forwarding

The port-forward branch of the EdgeOS configuration is a shortcut built into the platform to make destination NAT simpler. If you use this branch, you’ll notice that you can only select one WAN interface. It still works, in that the selected WAN interface will receive traffic, but you will not be able to receive traffic on both WAN interfaces. Incoming traffic cannot arrive over both connections until you sit down and implement your own destination NAT rules.

I’ll update this section with the exact steps once I’ve done this myself. Be aware that it gets more complicated when you (A) use hairpin NAT (me!) and (B) use zone-based firewalling (also me!).
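As a rough illustration of the difference (not my working configuration), here is what the shortcut looks like next to a hand-written destination NAT rule for the second WAN. The internal address 192.168.1.10, port 443, and the rule numbers are hypothetical, and the sketch deliberately leaves out the firewall rules, hairpin NAT, and zone handling mentioned above.

```
# Port-forward shortcut: note that wan-interface takes a single interface
set port-forward wan-interface eth0
set port-forward lan-interface eth1
set port-forward rule 1 description "HTTPS to internal server"
set port-forward rule 1 original-port 443
set port-forward rule 1 protocol tcp
set port-forward rule 1 forward-to address 192.168.1.10
set port-forward rule 1 forward-to port 443

# Hand-written destination NAT for the same service arriving on eth2
# (hypothetical rule number and addresses; firewall rules not shown)
set service nat rule 10 description "HTTPS via eth2"
set service nat rule 10 type destination
set service nat rule 10 inbound-interface eth2
set service nat rule 10 protocol tcp
set service nat rule 10 destination port 443
set service nat rule 10 inside-address address 192.168.1.10
set service nat rule 10 inside-address port 443
```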

Configuration

Initial Planning

WAN Performance Disparity

If you don’t have good, recent speed tests for each provider, now would be a great time to run them. Plug directly into each provider’s device and run an appropriate speed test. The results matter for some of the configuration in part 2, so make sure you note them. I recommend running the test a few times (and against different servers) and averaging your results for download speed, as this is what we’ll be concerned with later.

Port Definitions

For the purposes of this article, we’ll be using the following port assignments:

  • eth0 - WAN 1
  • eth1 - LAN
  • eth2 - WAN 2

You should ensure that your WAN connections are configured and working properly on the ports of your choice on your device. Adjust the configuration examples I provide appropriately for your setup.
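One quick way to check this, assuming the port layout above, is to look at the interface and routing summaries from operational mode. Each WAN port should show an address, and each connection should contribute a default route.

```
# Run from operational mode (not configure mode)

# eth0 and eth2 should each be up with an IP address assigned
show interfaces

# Look for a default route (0.0.0.0/0) via each WAN gateway
show ip route
```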

Configuration Mode

The examples below will assume you are in configuration mode on your device. To enter configuration mode, type configure and press enter from your SSH/console session.
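For anyone new to EdgeOS, a typical session looks roughly like this. Nothing takes effect until you commit, and nothing survives a reboot until you save.

```
# Enter configuration mode
configure

# ... enter the set commands from the sections below ...

# Review pending changes against the running configuration
compare

# Apply the changes to the running configuration
commit

# Persist the running configuration across reboots
save

# Leave configuration mode
exit
```

Committing without saving can be a useful safety net while experimenting: if a change locks you out, a reboot returns the router to the last saved configuration.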

Configure a Basic Load-Balance Group

Start by configuring a load-balance group on your device that will provide the parameters for how the balancing should be done. We’re going to start with a very basic configuration:

set load-balance group G interface eth0
set load-balance group G interface eth2
set load-balance group G lb-local enable
set load-balance group G sticky dest-addr enable
set load-balance group G sticky source-addr enable

Let’s walk through each of these with an explanation:

  • set load-balance group G interface <interface>
    • This command is simply adding the interface to the group as a target. We’ll go back and define more here later.
  • set load-balance group G lb-local enable
    • This will tell our router to use this load-balance group for its own traffic as well. This ensures that things like dns-forwarding get balanced.
  • set load-balance group G sticky <type> enable
    • These settings (we used dest-addr and source-addr) keep new traffic on the same interface as earlier traffic that matches on the chosen fields. I tied dest-addr (on the Internet) and source-addr (on the LAN) together so that a given LAN host always reaches a given remote address over the same WAN, which keeps things like WebRTC behaving properly.
    • Basically anything that requires you to connect to multiple ports on the same remote address (some VPNs) has the opportunity to misbehave without this.

Setup Address Exclusions

Even though we will plan to load-balance most traffic, there are certain things we shouldn’t plan to load-balance. Amongst those things is our LAN network.

Create a network group that represents your LAN subnet (such as 192.168.1.0/24):

set firewall group network-group PRIVATE_NETS network <your LAN subnet>
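With the example subnet filled in, the group definition might look like the sketch below. The description text is my own placeholder. The last three lines are an optional, broader variant that covers every RFC 1918 private range, which can be handy if you have more than one internal subnet.

```
# Single LAN subnet (example value; substitute your own)
set firewall group network-group PRIVATE_NETS description "Local networks"
set firewall group network-group PRIVATE_NETS network 192.168.1.0/24

# Optional: cover all RFC 1918 private ranges instead of one subnet
set firewall group network-group PRIVATE_NETS network 10.0.0.0/8
set firewall group network-group PRIVATE_NETS network 172.16.0.0/12
set firewall group network-group PRIVATE_NETS network 192.168.0.0/16
```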

Setup Firewall Modify Rules

Now we need to specify the policy that will cause load-balancing to occur on the interface where it is assigned. Here’s the configuration we should put in place:

set firewall modify LOAD_BALANCE rule 10 action modify
set firewall modify LOAD_BALANCE rule 10 destination group network-group PRIVATE_NETS
set firewall modify LOAD_BALANCE rule 10 modify table main

set firewall modify LOAD_BALANCE rule 20 action modify
set firewall modify LOAD_BALANCE rule 20 destination group address-group ADDRv4_eth0
set firewall modify LOAD_BALANCE rule 20 modify table main

set firewall modify LOAD_BALANCE rule 30 action modify
set firewall modify LOAD_BALANCE rule 30 destination group address-group ADDRv4_eth2
set firewall modify LOAD_BALANCE rule 30 modify table main

set firewall modify LOAD_BALANCE rule 100 action modify
set firewall modify LOAD_BALANCE rule 100 modify lb-group G

Let’s walk through each of these with an explanation:

  • set firewall modify LOAD_BALANCE rule <number> action modify
    • This sets the rule’s action to modify, meaning the rule changes how matching packets are routed rather than accepting or dropping them.
  • set firewall modify LOAD_BALANCE rule <number> destination group <specification>
    • This tells the rule when to apply. In this instance, rules 10-30 apply when the traffic is bound for either PRIVATE_NETS or the IPv4 address of eth0 or eth2.
  • set firewall modify LOAD_BALANCE rule <number> modify table main
    • Matching traffic is routed using the main routing table, which keeps it out of the load-balance group.
  • set firewall modify LOAD_BALANCE rule <number> modify lb-group G
    • After the preceding rules, use the load-balance group G that we created earlier for all remaining traffic.
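The commands above define the ruleset but don’t show it being attached to an interface, and a modify ruleset does nothing until it is applied somewhere. Ubiquiti’s guide applies it inbound on the LAN interface; assuming eth1 is your LAN port as in the port definitions above, that looks like this:

```
# Apply the LOAD_BALANCE ruleset to traffic entering from the LAN
# (eth1 is the LAN port in this article; adjust for your setup)
set interfaces ethernet eth1 firewall in modify LOAD_BALANCE
```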

At this point in time, you have a basic load-balancing configuration. You should commit the configuration at this point but you’ll likely find some odd behaviors in the future. We’ll cover additional configuration in part 2.

Verify Load Balancing

You can validate that your load balancing is working properly using a few commands. First, to see the health of each interface, use this command (outside of configure mode):

show load-balance watchdog

You’ll receive a response similar to this:

Group G
  eth0
  status: OK
  pings: 19998
  fails: 63
  run fails: 0/2
  route drops: 1
  ping gateway: 8.8.8.8 - REACHABLE
  last route drop   : Tue Feb  2 01:54:07 2021
  last route recover: Tue Feb  2 02:05:27 2021

  eth2
  status: OK
  pings: 19979
  fails: 64
  run fails: 0/2
  route drops: 1
  ping gateway: 1.1.1.1 - REACHABLE
  last route drop   : Tue Feb  2 01:54:07 2021
  last route recover: Tue Feb  2 02:05:16 2021

From this screen, we can easily see that both of our interfaces are operational with pings running properly.

Next, we’ll show the flow of traffic across the load-balancing to ensure that our traffic is actually being balanced:

show load-balance status

You’ll receive a response similar to this:

Group G
    Balance Local  : true
    Lock Local DNS : false
    Conntrack Flush: true
    Sticky Bits    : 0x00000003

  interface   : eth0
  reachable   : true
  status      : active
  gateway     : <hidden>
  route table : 201
  weight      : 50%
  fo_priority : 100
  flows
      WAN Out   : 291K
      WAN In    : 0
      Local ICMP: 20019
      Local DNS : 0
      Local Data: 42013

  interface   : eth2
  reachable   : true
  status      : active
  gateway     : <hidden>
  route table : 202
  weight      : 50%
  fo_priority : 100
  flows
      WAN Out   : 382K
      WAN In    : 481
      Local ICMP: 20001
      Local DNS : 0
      Local Data: 17728

Specifically, the fields under flows show the traffic on each interface. In my own experience, it can take some time for the balancing to become visible because existing sessions stay on the interface they started on. I usually just check back the next day.

Summary

You now have a functional, generic load-balancing dual-WAN configuration for your EdgeOS device. If left in this configuration, you’ll probably start to notice little problems here and there that cause service issues or frustration. Continue to part 2 for the additional configuration that rounds out the setup. Thanks for your time!