In a previous blog post, I showed you how to use Route Reflectors, which provide better scalability by reducing the need for a BGP full mesh. BGP confederations are another way of achieving the same thing. There are, however, differences, which I will tackle in this post.
I will work on the following topology:
Above we have two ASs (AS 230 & AS 300) peering over an eBGP session. However, AS 230 comprises several inner ASs (AS 65001, AS 65002 & AS 65003). The parent AS is called a BGP Confederation. Notice also how each sub-AS (child AS) uses a private ASN, whereas the BGP confederation peers with AS 300 using a public ASN. This is how you would normally see it implemented and, although not a requirement, you should keep it as a design best practice.
Below is the configuration on R3 & R4, followed by more detail on the relevant commands:
router bgp 65002
 bgp router-id 10.255.255.3
 bgp confederation identifier 230
 bgp confederation peers 65003
 [ removed for clarity ]
 neighbor 10.255.255.4 remote-as 65003
 neighbor 10.255.255.4 ebgp-multihop 2
 neighbor 10.255.255.4 update-source Loopback0
 neighbor 10.255.255.4 next-hop-self
router bgp 65003
 bgp router-id 10.255.255.4
 bgp confederation identifier 230
 bgp confederation peers 65001 65002
 [ removed for clarity ]
 neighbor 10.255.255.3 remote-as 65002
 neighbor 10.255.255.3 ebgp-multihop 2
 neighbor 10.255.255.3 update-source Loopback0
 neighbor 10.255.255.3 next-hop-self
Use the router bgp <subASN> command to configure the sub-AS. The confederation that the sub-AS belongs to is set using the bgp confederation identifier <ASN> command. You then need to specify the peering sub-ASs, which you do using the bgp confederation peers <subASNsList> command.
Normally, it is good practice to configure an eBGP session over the directly connected subnet, unless you are trying to load-balance over two links connecting the two BGP peers. In this case, however, it is best practice to use the loopback IPs for the peering sessions between sub-ASs. This is why the ebgp-multihop and update-source commands are needed (the loopbacks are not directly connected), and why next-hop-self is used, so that the advertised NEXT_HOP is the peer's loopback, an address the receiving router can reach through the IGP.
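To make the relationship between these commands a bit more concrete, here is a small Python sketch that renders the same stanza from a handful of parameters. The function name confed_config and its arguments are my own invention for illustration (this is not a Cisco tool), and it only covers neighbors that sit in another sub-AS.

```python
def confed_config(sub_as, router_id, confed_id, confed_peers, inter_sub_as_neighbors):
    """Render an IOS-style BGP confederation stanza as text.

    inter_sub_as_neighbors maps a peer loopback address to that peer's sub-AS.
    Hypothetical helper: only neighbors in *another* sub-AS are handled here.
    """
    lines = [
        f"router bgp {sub_as}",
        f" bgp router-id {router_id}",
        f" bgp confederation identifier {confed_id}",
        " bgp confederation peers " + " ".join(str(p) for p in confed_peers),
    ]
    for addr, remote_as in inter_sub_as_neighbors.items():
        lines.append(f" neighbor {addr} remote-as {remote_as}")
        # Loopback peering between sub-ASs: the peer is not directly connected.
        lines.append(f" neighbor {addr} ebgp-multihop 2")
        lines.append(f" neighbor {addr} update-source Loopback0")
        lines.append(f" neighbor {addr} next-hop-self")
    return "\n".join(lines)


# R3 from the topology: sub-AS 65002, peering with R4 in sub-AS 65003.
r3 = confed_config(65002, "10.255.255.3", 230, [65003], {"10.255.255.4": 65003})
print(r3)
```

Calling it with R4's values (sub-AS 65003, peers 65001 and 65002, neighbor 10.255.255.3 in 65002) should give the second stanza shown above.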
And now, to the goodies!
With a single network command on R5, we will advertise the prefix 192.168.1.0/24 not only to R1, but to the rest of the confederation! I think that is very cool. Had we configured AS 230 without a BGP confederation (and its sub-ASs), the prefix would have been advertised over (R5, R1, eBGP), (R1, R2, iBGP) and (R1, R4, iBGP). Notice that for R3 to learn the prefix, we would have had to do one of the following:
- Add an extra iBGP session between R3 and R1 (remember that routes learnt via eBGP are automatically passed on to iBGP peers)
- Use RR (Route Reflectors)
- Advertise the route manually using the network command
So to recap, using a BGP confederation, we achieve the following:
R5 advertises the prefix, which gets to R1. In turn, R1 will automatically pass the prefix on to its confederation external peers, R4 and R2. R4 will then automatically pass the prefix on to R3 over its confederation external session. Finally, R3, having received the prefix from a confederation external peer, will automatically pass it on to its internal BGP peer, R2.
R2 will also receive the prefix from R1 and will automatically pass it on to its internal confederation peer, R3. Once received, R3 will pass it on to its external peer, R4.
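If you want to convince yourself of the difference, here is a rough Python sketch that floods the prefix across the sessions of this topology. It models only one rule (a route learnt over iBGP is not passed on to another iBGP peer) and ignores best-path selection and AS_PATH loop detection. The session list and the function name are my assumptions based on the description above.

```python
from collections import deque

# BGP sessions assumed from the topology described in this post.
SESSIONS = [("R5", "R1"), ("R1", "R2"), ("R1", "R4"), ("R2", "R3"), ("R3", "R4")]


def routers_learning_prefix(origin, asn):
    """Return the set of routers that end up holding the prefix."""
    peers = {}
    for a, b in SESSIONS:
        peers.setdefault(a, []).append(b)
        peers.setdefault(b, []).append(a)

    # State: (router, session type the route arrived over); None = originated.
    seen = {(origin, None)}
    queue = deque(seen)
    while queue:
        router, learnt_via = queue.popleft()
        for peer in peers[router]:
            out_type = "ibgp" if asn[router] == asn[peer] else "ebgp"
            if learnt_via == "ibgp" and out_type == "ibgp":
                continue  # iBGP-learnt routes are not re-advertised over iBGP
            state = (peer, out_type)
            if state not in seen:
                seen.add(state)
                queue.append(state)
    return {router for router, _ in seen}


# AS 230 as one flat AS (no confederation, no route reflectors).
flat = routers_learning_prefix(
    "R5", {"R5": 300, "R1": 230, "R2": 230, "R3": 230, "R4": 230})

# AS 230 as a confederation: sessions between sub-ASs behave like eBGP.
confed = routers_learning_prefix(
    "R5", {"R5": 300, "R1": 65001, "R2": 65002, "R3": 65002, "R4": 65003})

print(sorted(flat))
print(sorted(confed))
```

In the flat case R3 never appears in the result, while in the confederation case every router learns the prefix.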
When using BGP confederations, to avoid loops inside the confederation, two additional AS_PATH segment types are used: AS_CONFED_SEQUENCE and AS_CONFED_SET. These are used similarly to the regular AS_PATH segment types (AS_SEQUENCE & AS_SET), but they record sub-AS numbers.
When a prefix is advertised outside the confederation, the AS_CONFED_SEQUENCE and AS_CONFED_SET segments are stripped out, and the confederation identifier (230 here) is prepended to the AS_PATH instead!
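Here is a minimal Python sketch of that AS_PATH handling. The dictionary layout and the function names advertise and is_loop are made up for illustration, and AS_CONFED_SET is left out to keep it short.

```python
def advertise(path, local_sub_as, confed_id, session):
    """Return the AS path as it would be sent to a peer.

    path is {"confed": [sub-ASs], "seq": [regular ASNs]} (illustrative layout).
    session is "ibgp", "confed-ebgp" (between sub-ASs) or "ebgp".
    """
    confed, seq = list(path["confed"]), list(path["seq"])
    if session == "confed-ebgp":
        # Crossing a sub-AS boundary: record our sub-AS in AS_CONFED_SEQUENCE.
        confed.insert(0, local_sub_as)
    elif session == "ebgp":
        # Leaving the confederation: strip the sub-ASs, prepend the public ASN.
        confed = []
        seq.insert(0, confed_id)
    return {"confed": confed, "seq": seq}


def is_loop(path, local_sub_as):
    """A router rejects a path that already contains its own sub-AS."""
    return local_sub_as in path["confed"]


from_r5 = {"confed": [], "seq": [300]}                 # as received by R1
r1_to_r4 = advertise(from_r5, 65001, 230, "confed-ebgp")
r4_to_r3 = advertise(r1_to_r4, 65003, 230, "confed-ebgp")
outside = advertise(r4_to_r3, 65002, 230, "ebgp")      # R3 to a true external peer

print(r4_to_r3)
print(outside)
print(is_loop(r4_to_r3, 65001))
```

The path R3 ends up with has the sub-ASs 65003 and 65001 in the confederation part, followed by 300, and that confederation part disappears as soon as the route leaves AS 230.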
As a last difference from standard eBGP operation, note that MED and LOCAL_PREF behave as they would within a standard (non-confederation) AS: BGP ignores the sub-AS boundaries for these attributes, i.e. MED and LOCAL_PREF are passed on between sub-ASs.
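As a rough illustration, the Python sketch below compares what a route carries over a session between sub-ASs with what it carries over a true eBGP session. It is a simplification under my own assumptions: the route is one learnt from another AS (so its MED is not passed on to a third AS), and the function name attributes_sent is hypothetical.

```python
def attributes_sent(route, session):
    """Return the attributes carried to a peer over the given session type.

    session is "ibgp", "confed-ebgp" (between sub-ASs) or "ebgp" (true external).
    """
    sent = dict(route)
    if session == "ebgp":
        # Leaving the confederation: LOCAL_PREF is not sent to external peers,
        # and a MED learnt from another AS is not passed on to a third AS.
        sent.pop("local_pref", None)
        sent.pop("med", None)
    return sent


route = {"prefix": "192.168.1.0/24", "med": 0, "local_pref": 100}
across_sub_as = attributes_sent(route, "confed-ebgp")
to_external = attributes_sent(route, "ebgp")

print(across_sub_as)
print(to_external)
```

Across the sub-AS boundary the route keeps both attributes, just as it would over iBGP.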
Let’s now take a quick look at the show ip bgp <route> command output, on R3:
R3#sh ip bgp 192.168.1.0
BGP routing table entry for 192.168.1.0/24, version 4
Paths: (2 available, best #1, table Default-IP-Routing-Table)
Advertised to update-groups:
(65003 65001) 300
10.255.255.4 (metric 156160) from 10.255.255.4 (10.255.255.4)
Origin IGP, metric 0, localpref 100, valid, confed-external, best
10.255.255.2 (metric 156160) from 10.255.255.2 (10.255.255.1)
Origin IGP, metric 0, localpref 100, valid, confed-internal
You can see that there are two paths to reach the prefix:
- AS65003, AS65001, AS300: the sub-ASs shown in parentheses in the output are the confederation part of the path (AS_CONFED_SEQUENCE). The metric of 156160 is actually the EIGRP metric for reaching the NEXT_HOP of 10.255.255.4. We can also see that the prefix was learnt from 10.255.255.4, which has the BGP router-id of 10.255.255.4 (in brackets).
- AS65001, AS300: similarly, 156160 is the IGP metric for reaching the NEXT_HOP of 10.255.255.2. The route was learnt from 10.255.255.2, which has the BGP router-id of 10.255.255.1.
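If you ever need to pick these paths apart in a script, a path line such as "(65003 65001) 300" from the output above can be split into its confederation part and its regular part. This is only a sketch: parse_as_path is a name I made up, and it assumes the simple layout seen here (an optional parenthesised group followed by plain ASNs).

```python
import re


def parse_as_path(line):
    """Split a path line into (confederation sub-ASs, regular ASNs)."""
    match = re.match(r"\s*(?:\(([\d\s]*)\))?\s*([\d\s]*)$", line)
    if match is None:
        raise ValueError(f"unrecognised AS path line: {line!r}")
    confed = [int(asn) for asn in (match.group(1) or "").split()]
    regular = [int(asn) for asn in match.group(2).split()]
    return confed, regular


best = parse_as_path("(65003 65001) 300")
plain = parse_as_path("300")

print(best)
print(plain)
```

The first list holds the sub-ASs the route crossed inside the confederation, and the second holds the ordinary AS_PATH that external peers would recognise.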