David S. Miller 3976001c9d Merge branch 'ipv6-Improve-user-experience-with-multipath-routes'
David Ahern says:

====================
net: ipv6: Improve user experience with multipath routes

This series closes a couple of gaps between IPv4 and IPv6 with respect
to multipath routes:

1. IPv4 allows all nexthops of multipath routes to be deleted using just
   the prefix and length; IPv6 only deletes the first nexthop for the
   route if only the prefix and length are given.

2. IPv4 returns multipath routes encoded in the RTA_MULTIPATH attribute.
   IPv6 returns a series of routes with the same prefix and length - one
   for each nexthop. This happens for both dumps and notifications.

IPv6 does accept RTA_MULTIPATH encoded routes, but installs them as a
series of routes.
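
As a rough illustration of what "encoded in the RTA_MULTIPATH attribute"
means on the wire, the sketch below packs and walks a multipath payload in
user space. It assumes the layout of struct rtnexthop and struct rtattr from
linux/rtnetlink.h and native byte order; the helper names (pack_nexthop,
parse_multipath) and the ifindex values in the usage note are hypothetical
and are not part of this series.

```python
import ipaddress
import struct

RTA_GATEWAY = 5          # attribute type from linux/rtnetlink.h
RTNH_FMT = "=HBBi"       # struct rtnexthop: rtnh_len, rtnh_flags, rtnh_hops, rtnh_ifindex
RTA_FMT = "=HH"          # struct rtattr: rta_len, rta_type
RTNH_SIZE = struct.calcsize(RTNH_FMT)
RTA_SIZE = struct.calcsize(RTA_FMT)


def align4(n):
    """Round up to the 4-byte alignment used by netlink attributes."""
    return (n + 3) & ~3


def pack_nexthop(gateway, ifindex, weight=1):
    """Hypothetical helper: encode one nexthop with a nested RTA_GATEWAY."""
    gw = ipaddress.IPv6Address(gateway).packed
    rta = struct.pack(RTA_FMT, RTA_SIZE + len(gw), RTA_GATEWAY) + gw
    # rtnh_hops carries weight - 1; ip(8) prints it back as "weight N".
    return struct.pack(RTNH_FMT, RTNH_SIZE + len(rta), 0, weight - 1, ifindex) + rta


def parse_multipath(payload):
    """Walk an RTA_MULTIPATH payload; return (gateway, ifindex, weight) tuples."""
    hops = []
    off = 0
    while off + RTNH_SIZE <= len(payload):
        rtnh_len, _flags, rtnh_hops, ifindex = struct.unpack_from(RTNH_FMT, payload, off)
        if rtnh_len < RTNH_SIZE or off + rtnh_len > len(payload):
            break  # truncated or malformed entry
        gateway = None
        pos, end = off + RTNH_SIZE, off + rtnh_len
        while pos + RTA_SIZE <= end:
            rta_len, rta_type = struct.unpack_from(RTA_FMT, payload, pos)
            if rta_len < RTA_SIZE or pos + rta_len > end:
                break
            if rta_type == RTA_GATEWAY:
                raw = payload[pos + RTA_SIZE:pos + rta_len]
                gateway = str(ipaddress.IPv6Address(raw))
            pos += align4(rta_len)
        hops.append((gateway, ifindex, rtnh_hops + 1))
        off += align4(rtnh_len)
    return hops
```

For example, parse_multipath(pack_nexthop("2001:db8:1::2", 3) +
pack_nexthop("2001:db8:2::2", 4)) yields one tuple per nexthop, which is the
shape a dump or notification carrying RTA_MULTIPATH would present to a
listener.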

Patch 1 addresses the first item by allowing IPv6 multipath routes to be
deleted using just the prefix and length. Patch 2 addresses the second by
allowing IPv6 multipath routes to be returned encoded in the RTA_MULTIPATH
attribute.

Patches 3 and 4 update the RTM_{NEW,DEL}ROUTE notifications to generate
1 notification with RTA_MULTIPATH where applicable.

Patch 5 prints IPv6 addresses in compressed format when showing route
replace errors. This was noticed while testing REPLACE failures.

The end result for multipath routes:
1. Dump
   - RTA_MULTIPATH used for multipath routes

    $ ip -6 ro ls vrf red
    2001:db8:1::/120 dev eth1 proto kernel metric 256  pref medium
    2001:db8:2::/120 dev eth2 proto kernel metric 256  pref medium
    2001:db8:200::/120 metric 1024
	    nexthop via 2001:db8:1::2  dev eth1 weight 1
	    nexthop via 2001:db8:2::2  dev eth2 weight 1
    ...

2. Route Add
   - one notification with RTA_MULTIPATH attribute

    $ ip -6 ro add vrf red 2001:db8:200::/120 nexthop via 2001:db8:1::2 nexthop via 2001:db8:2::2

    $ ip mon route
    2001:db8:200::/120 table red metric 1024
	nexthop via 2001:db8:1::2  dev eth1 weight 1
	nexthop via 2001:db8:2::2  dev eth2 weight 1

3. Route Replace
   - one notification with RTA_MULTIPATH attribute

    $ ip -6 ro replace vrf red 2001:db8:200::/120 nexthop via 2001:db8:1::16 nexthop via 2001:db8:2::16

    $ ip mon route
    Replaced 2001:db8:200::/120 table red metric 1024
	    nexthop via 2001:db8:1::16  dev eth1 weight 1
	    nexthop via 2001:db8:2::16  dev eth2 weight 1

   - on a failure after the insertion of the first nexthop (which means
     the original route has been replaced in the FIB), a notification is
     sent with the successful nexthops and then the nexthops are deleted
     with one notification per hop. This is consistent with how it works
     today except the successful additions are coalesced into 1
     notification.

4. Route Delete
   - when an entire multipath route is deleted using only the
     prefix/length, 1 notification is generated:
    $ ip -6 ro del vrf red 2001:db8:200::/120

    $ ip mon route
    Deleted 2001:db8:200::/120 table red metric 1024
	    nexthop via 2001:db8:1::16  dev eth1 weight 1
	    nexthop via 2001:db8:2::16  dev eth2 weight 1

   - if a delete request contains nexthops, one notification is
     generated per nexthop deleted. This is unavoidable since IPv6
     allows a single nexthop to be deleted within a multipath route.

5. Route Appends
   - IPv6 allows nexthops to be appended to an existing route. In this
     case one notification is sent for the new route with the append
     flag set.

    $ ip -6 ro append vrf red 2001:db8:200::/120 nexthop via 2001:db8:2::20 nexthop via 2001:db8:1::20

    $ ip mon route
    Append 2001:db8:200::/120 table red metric 1024
	    nexthop via 2001:db8:1::2  dev eth1 weight 1
	    nexthop via 2001:db8:2::2  dev eth2 weight 1
	    nexthop via 2001:db8:2::20  dev eth2 weight 1
	    nexthop via 2001:db8:1::20  dev eth1 weight 1

   - on failure of an append, a notification is sent with the route
     containing all of the nexthops successfully added, and it is
     followed by delete notifications as the hops are removed
     returning the route to its prior state. This is consistent with
     how it works today except the successful additions are coalesced
     into 1 notification.
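
The notification counts described in the list above can be summarized with a
small user-space model. This is only a sketch of the behavior as stated in
this cover letter, not kernel code: the function names, the fail_at
parameter, and the order in which rollback deletes are emitted are all
assumptions made for illustration.

```python
def notifications_for_add(prefix, nexthops, fail_at=None):
    """Model an add/replace: one coalesced RTM_NEWROUTE for the nexthops that
    were installed, then, on failure, one RTM_DELROUTE per hop rolled back.

    fail_at is the (hypothetical) index of the first nexthop that fails.
    """
    added = list(nexthops) if fail_at is None else list(nexthops[:fail_at])
    events = []
    if added:
        events.append(("RTM_NEWROUTE", prefix, tuple(added)))
    if fail_at is not None:
        # Rollback order is an assumption of this model.
        for hop in added:
            events.append(("RTM_DELROUTE", prefix, (hop,)))
    return events


def notifications_for_delete(prefix, installed, requested=None):
    """Model a delete: prefix/length only gives 1 notification for the whole
    route; a request that names nexthops gives 1 notification per hop removed.
    """
    if requested is None:
        return [("RTM_DELROUTE", prefix, tuple(installed))]
    return [("RTM_DELROUTE", prefix, (hop,)) for hop in requested if hop in installed]
```

A listener written against this model has to handle both shapes: a single
message carrying several nexthops, and a run of single-nexthop delete
messages for the same prefix.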

Addresses some of the inconsistencies also noted by Roopa at netdev0.1:
https://www.netdev01.org/docs/prabhu-linux_ipv4_ipv6_inconsistencies_talk_slides.pdf

v4
- changed series to do encoding in 1 patch and updating notifications
  in separate patches to make it easier to review and understand

- 1 notification for delete when using prefix/length; 1 notification for
  append

- handle delete of a single nexthop without RTA_MULTIPATH in delete request

- updated commit messages and cover letter

v3
- removed the need for a user API to opt-in to change. Requiring an
  API just shifts the difference from same API with different
  behavior to different API to achieve equivalent behavior
- route notifications changed to use RTA_MULTIPATH for add and replace
- updated commit messages and cover letter

v2
- fixed locking in patch 1 as noted by DaveM
- changed user API for patch 2 to require an rtmsg with RTM_F_ALL_NEXTHOPS
  set in rtm_flags
- revamped explanation of patch 2 and cover letter
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-04 19:58:15 -05:00
arch Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-02-02 16:54:00 -05:00
block blk-mq: Remove unused variable 2017-01-18 15:14:15 -07:00
certs
crypto crypto: api - Clear CRYPTO_ALG_DEAD bit before registering an alg 2017-01-23 22:41:32 +08:00
Documentation Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2017-02-03 16:58:20 -05:00
drivers virtio_net: exploit napi_complete_done() return value 2017-02-04 19:38:28 -05:00
firmware
fs Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-02-02 16:54:00 -05:00
include net: ipv6: Change notifications for multipath add to RTA_MULTIPATH 2017-02-04 19:58:14 -05:00
init cgroup: move CONFIG_SOCK_CGROUP_DATA to init/Kconfig 2017-01-11 09:47:10 -05:00
ipc ipc/sem.c: fix incorrect sem_lock pairing 2017-01-10 18:31:55 -08:00
kernel trace: rename trace_print_hex_seq arg and add kdoc 2017-02-03 15:50:18 -05:00
lib lib: Introduce priority array area manager 2017-02-03 16:35:42 -05:00
mm mm, page_alloc: fix premature OOM when racing with cpuset mems update 2017-01-24 16:26:14 -08:00
net net: ipv6: Use compressed IPv6 addresses showing route replace error 2017-02-04 19:58:14 -05:00
samples Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-01-28 10:33:06 -05:00
scripts gcc-plugins: update gcc-common.h for gcc-7 2017-01-03 12:08:59 -08:00
security Introduce a sysctl that modifies the value of PROT_SOCK. 2017-01-24 12:10:51 -05:00
sound ASoC: Fixes for v4.10 2017-01-11 19:49:27 +01:00
tools Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-01-28 10:33:06 -05:00
usr kbuild: initramfs cleanup, set target from Kconfig 2017-01-05 09:40:16 -08:00
virt KVM/ARM updates for 4.10-rc4 2017-01-17 15:04:59 +01:00
.cocciconfig
.get_maintainer.ignore
.gitattributes
.gitignore
.mailmap mailmap: add codeaurora.org names for nameless email commits 2017-01-10 18:31:55 -08:00
COPYING
CREDITS
Kbuild
Kconfig
MAINTAINERS lib: Introduce priority array area manager 2017-02-03 16:35:42 -05:00
Makefile Linux 4.10-rc6 2017-01-29 14:25:17 -08:00
README

Linux kernel
============

This file was moved to Documentation/admin-guide/README.rst

Please notice that there are several guides for kernel developers and users.
These guides can be rendered in a number of formats, like HTML and PDF.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result from upgrading your kernel.