kernel_optimize_test/net/sched/act_nat.c

362 lines
8.0 KiB
C
Raw Normal View History

[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
/*
* Stateless NAT actions
*
* Copyright (c) 2007 Herbert Xu <herbert@gondor.apana.org.au>
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License as published by the Free
* Software Foundation; either version 2 of the License, or (at your option)
* any later version.
*/
#include <linux/errno.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/netfilter.h>
#include <linux/rtnetlink.h>
#include <linux/skbuff.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/string.h>
#include <linux/tc_act/tc_nat.h>
#include <net/act_api.h>
net/sched: act_nat: validate the control action inside init() the following script: # tc qdisc add dev crash0 clsact # tc filter add dev crash0 egress matchall \ > action nat ingress 1.18.1.1 1.18.2.2 pass index 90 # tc actions replace action nat \ > ingress 1.18.1.1 1.18.2.2 goto chain 42 index 90 cookie c1a0c1a0 # tc actions show action nat had the following output: Error: Failed to init TC action chain. We have an error talking to the kernel total acts 1 action order 0: nat ingress 1.18.1.1/32 1.18.2.2 goto chain 42 index 90 ref 2 bind 1 cookie c1a0c1a0 Then, the first packet transmitted by crash0 made the kernel crash: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 #PF error: [normal kernel read fault] PGD 800000002d180067 P4D 800000002d180067 PUD 7cb8b067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 3 PID: 164 Comm: kworker/3:1 Not tainted 5.0.0-rc4.gotochain_crash+ #533 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 Workqueue: ipv6_addrconf addrconf_dad_work RIP: 0010:tcf_action_exec+0xb8/0x100 Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 <48> 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 RSP: 0018:ffffae4500e2fad0 EFLAGS: 00010246 RAX: 000000002000002a RBX: ffff9fa52e28c800 RCX: 0000000001011201 RDX: 0000000000000000 RSI: 0000000000000056 RDI: ffff9fa52ca12800 RBP: ffffae4500e2fb70 R08: 0000000000000022 R09: 000000000000000e R10: 00000000ffffffff R11: 0000000001011201 R12: ffff9fa52cbc9c00 R13: ffff9fa52cbc9c08 R14: 0000000000000001 R15: ffff9fa52ca12780 FS: 0000000000000000(0000) GS:ffff9fa57db80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000073f8c004 CR4: 00000000001606e0 Call Trace: tcf_classify+0x58/0x120 __dev_queue_xmit+0x40a/0x890 ? ndisc_next_option+0x50/0x50 ? ___neigh_create+0x4d5/0x680 ? ip6_finish_output2+0x1b5/0x590 ip6_finish_output2+0x1b5/0x590 ? ip6_output+0x68/0x110 ip6_output+0x68/0x110 ? nf_hook.constprop.28+0x79/0xc0 ndisc_send_skb+0x248/0x2e0 ndisc_send_ns+0xf8/0x200 ? addrconf_dad_work+0x389/0x4b0 addrconf_dad_work+0x389/0x4b0 ? __switch_to_asm+0x34/0x70 ? process_one_work+0x195/0x380 ? addrconf_dad_completed+0x370/0x370 process_one_work+0x195/0x380 worker_thread+0x30/0x390 ? process_one_work+0x380/0x380 kthread+0x113/0x130 ? kthread_park+0x90/0x90 ret_from_fork+0x35/0x40 Modules linked in: act_nat veth ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel mbcache jbd2 snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd cryptd glue_helper snd_timer snd joydev virtio_balloon pcspkr soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs qxl ata_generic pata_acpi drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm virtio_net virtio_blk net_failover failover virtio_console drm crc32c_intel floppy ata_piix libata virtio_pci virtio_ring virtio serio_raw dm_mirror dm_region_hash dm_log dm_mod CR2: 0000000000000000 Validating the control action within tcf_nat_init() proved to fix the above issue. A TDC selftest is added to verify the correct behavior. Fixes: db50514f9a9c ("net: sched: add termination action to allow goto chain") Fixes: 97763dc0f401 ("net_sched: reject unknown tcfa_action values") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-20 22:00:06 +08:00
#include <net/pkt_cls.h>
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
#include <net/icmp.h>
#include <net/ip.h>
#include <net/netlink.h>
#include <net/tc_act/tc_nat.h>
#include <net/tcp.h>
#include <net/udp.h>
netns: make struct pernet_operations::id unsigned int Make struct pernet_operations::id unsigned. There are 2 reasons to do so: 1) This field is really an index into an zero based array and thus is unsigned entity. Using negative value is out-of-bound access by definition. 2) On x86_64 unsigned 32-bit data which are mixed with pointers via array indexing or offsets added or subtracted to pointers are preffered to signed 32-bit data. "int" being used as an array index needs to be sign-extended to 64-bit before being used. void f(long *p, int i) { g(p[i]); } roughly translates to movsx rsi, esi mov rdi, [rsi+...] call g MOVSX is 3 byte instruction which isn't necessary if the variable is unsigned because x86_64 is zero extending by default. Now, there is net_generic() function which, you guessed it right, uses "int" as an array index: static inline void *net_generic(const struct net *net, int id) { ... ptr = ng->ptr[id - 1]; ... } And this function is used a lot, so those sign extensions add up. Patch snipes ~1730 bytes on allyesconfig kernel (without all junk messing with code generation): add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730) Unfortunately some functions actually grow bigger. This is a semmingly random artefact of code generation with register allocator being used differently. gcc decides that some variable needs to live in new r8+ registers and every access now requires REX prefix. Or it is shifted into r12, so [r12+0] addressing mode has to be used which is longer than [r8] However, overall balance is in negative direction: add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730) function old new delta nfsd4_lock 3886 3959 +73 tipc_link_build_proto_msg 1096 1140 +44 mac80211_hwsim_new_radio 2776 2808 +32 tipc_mon_rcv 1032 1058 +26 svcauth_gss_legacy_init 1413 1429 +16 tipc_bcbase_select_primary 379 392 +13 nfsd4_exchange_id 1247 1260 +13 nfsd4_setclientid_confirm 782 793 +11 ... put_client_renew_locked 494 480 -14 ip_set_sockfn_get 730 716 -14 geneve_sock_add 829 813 -16 nfsd4_sequence_done 721 703 -18 nlmclnt_lookup_host 708 686 -22 nfsd4_lockt 1085 1063 -22 nfs_get_client 1077 1050 -27 tcf_bpf_init 1106 1076 -30 nfsd4_encode_fattr 5997 5930 -67 Total: Before=154856051, After=154854321, chg -0.00% Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-17 09:58:21 +08:00
static unsigned int nat_net_id;
static struct tc_action_ops act_nat_ops;
static const struct nla_policy nat_policy[TCA_NAT_MAX + 1] = {
[TCA_NAT_PARMS] = { .len = sizeof(struct tc_nat) },
};
static int tcf_nat_init(struct net *net, struct nlattr *nla, struct nlattr *est,
struct tc_action **a, int ovr, int bind,
net/sched: prepare TC actions to properly validate the control action - pass a pointer to struct tcf_proto in each actions's init() handler, to allow validating the control action, checking whether the chain exists and (eventually) refcounting it. - remove code that validates the control action after a successful call to the action's init() handler, and replace it with a test that forbids addition of actions having 'goto_chain' and NULL goto_chain pointer at the same time. - add tcf_action_check_ctrlact(), that will validate the control action and eventually allocate the action 'goto_chain' within the init() handler. - add tcf_action_set_ctrlact(), that will assign the control action and swap the current 'goto_chain' pointer with the new given one. This disallows 'goto_chain' on actions that don't initialize it properly in their init() handler, i.e. calling tcf_action_check_ctrlact() after successful IDR reservation and then calling tcf_action_set_ctrlact() to assign 'goto_chain' and 'tcf_action' consistently. By doing this, the kernel does not leak anymore refcounts when a valid 'goto chain' handle is replaced in TC actions, causing kmemleak splats like the following one: # tc chain add dev dd0 chain 42 ingress protocol ip flower \ > ip_proto tcp action drop # tc chain add dev dd0 chain 43 ingress protocol ip flower \ > ip_proto udp action drop # tc filter add dev dd0 ingress matchall \ > action gact goto chain 42 index 66 # tc filter replace dev dd0 ingress matchall \ > action gact goto chain 43 index 66 # echo scan >/sys/kernel/debug/kmemleak <...> unreferenced object 0xffff93c0ee09f000 (size 1024): comm "tc", pid 2565, jiffies 4295339808 (age 65.426s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 08 00 06 00 00 00 00 00 00 00 00 00 ................ backtrace: [<000000009b63f92d>] tc_ctl_chain+0x3d2/0x4c0 [<00000000683a8d72>] rtnetlink_rcv_msg+0x263/0x2d0 [<00000000ddd88f8e>] netlink_rcv_skb+0x4a/0x110 [<000000006126a348>] netlink_unicast+0x1a0/0x250 [<00000000b3340877>] netlink_sendmsg+0x2c1/0x3c0 [<00000000a25a2171>] sock_sendmsg+0x36/0x40 [<00000000f19ee1ec>] ___sys_sendmsg+0x280/0x2f0 [<00000000d0422042>] __sys_sendmsg+0x5e/0xa0 [<000000007a6c61f9>] do_syscall_64+0x5b/0x180 [<00000000ccd07542>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [<0000000013eaa334>] 0xffffffffffffffff Fixes: db50514f9a9c ("net: sched: add termination action to allow goto chain") Fixes: 97763dc0f401 ("net_sched: reject unknown tcfa_action values") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-20 21:59:59 +08:00
bool rtnl_held, struct tcf_proto *tp,
struct netlink_ext_ack *extack)
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
{
struct tc_action_net *tn = net_generic(net, nat_net_id);
struct nlattr *tb[TCA_NAT_MAX + 1];
net/sched: act_nat: validate the control action inside init() the following script: # tc qdisc add dev crash0 clsact # tc filter add dev crash0 egress matchall \ > action nat ingress 1.18.1.1 1.18.2.2 pass index 90 # tc actions replace action nat \ > ingress 1.18.1.1 1.18.2.2 goto chain 42 index 90 cookie c1a0c1a0 # tc actions show action nat had the following output: Error: Failed to init TC action chain. We have an error talking to the kernel total acts 1 action order 0: nat ingress 1.18.1.1/32 1.18.2.2 goto chain 42 index 90 ref 2 bind 1 cookie c1a0c1a0 Then, the first packet transmitted by crash0 made the kernel crash: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 #PF error: [normal kernel read fault] PGD 800000002d180067 P4D 800000002d180067 PUD 7cb8b067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 3 PID: 164 Comm: kworker/3:1 Not tainted 5.0.0-rc4.gotochain_crash+ #533 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 Workqueue: ipv6_addrconf addrconf_dad_work RIP: 0010:tcf_action_exec+0xb8/0x100 Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 <48> 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 RSP: 0018:ffffae4500e2fad0 EFLAGS: 00010246 RAX: 000000002000002a RBX: ffff9fa52e28c800 RCX: 0000000001011201 RDX: 0000000000000000 RSI: 0000000000000056 RDI: ffff9fa52ca12800 RBP: ffffae4500e2fb70 R08: 0000000000000022 R09: 000000000000000e R10: 00000000ffffffff R11: 0000000001011201 R12: ffff9fa52cbc9c00 R13: ffff9fa52cbc9c08 R14: 0000000000000001 R15: ffff9fa52ca12780 FS: 0000000000000000(0000) GS:ffff9fa57db80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000073f8c004 CR4: 00000000001606e0 Call Trace: tcf_classify+0x58/0x120 __dev_queue_xmit+0x40a/0x890 ? ndisc_next_option+0x50/0x50 ? ___neigh_create+0x4d5/0x680 ? ip6_finish_output2+0x1b5/0x590 ip6_finish_output2+0x1b5/0x590 ? ip6_output+0x68/0x110 ip6_output+0x68/0x110 ? nf_hook.constprop.28+0x79/0xc0 ndisc_send_skb+0x248/0x2e0 ndisc_send_ns+0xf8/0x200 ? addrconf_dad_work+0x389/0x4b0 addrconf_dad_work+0x389/0x4b0 ? __switch_to_asm+0x34/0x70 ? process_one_work+0x195/0x380 ? addrconf_dad_completed+0x370/0x370 process_one_work+0x195/0x380 worker_thread+0x30/0x390 ? process_one_work+0x380/0x380 kthread+0x113/0x130 ? kthread_park+0x90/0x90 ret_from_fork+0x35/0x40 Modules linked in: act_nat veth ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel mbcache jbd2 snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd cryptd glue_helper snd_timer snd joydev virtio_balloon pcspkr soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs qxl ata_generic pata_acpi drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm virtio_net virtio_blk net_failover failover virtio_console drm crc32c_intel floppy ata_piix libata virtio_pci virtio_ring virtio serio_raw dm_mirror dm_region_hash dm_log dm_mod CR2: 0000000000000000 Validating the control action within tcf_nat_init() proved to fix the above issue. A TDC selftest is added to verify the correct behavior. Fixes: db50514f9a9c ("net: sched: add termination action to allow goto chain") Fixes: 97763dc0f401 ("net_sched: reject unknown tcfa_action values") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-20 22:00:06 +08:00
struct tcf_chain *goto_ch = NULL;
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
struct tc_nat *parm;
int ret = 0, err;
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
struct tcf_nat *p;
if (nla == NULL)
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
return -EINVAL;
err = nla_parse_nested(tb, TCA_NAT_MAX, nla, nat_policy, NULL);
if (err < 0)
return err;
if (tb[TCA_NAT_PARMS] == NULL)
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
return -EINVAL;
parm = nla_data(tb[TCA_NAT_PARMS]);
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
err = tcf_idr_check_alloc(tn, &parm->index, a, bind);
if (!err) {
ret = tcf_idr_create(tn, parm->index, est, a,
&act_nat_ops, bind, false);
if (ret) {
tcf_idr_cleanup(tn, parm->index);
return ret;
}
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
ret = ACT_P_CREATED;
} else if (err > 0) {
if (bind)
return 0;
if (!ovr) {
tcf_idr_release(*a, bind);
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
return -EEXIST;
}
} else {
return err;
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
}
net/sched: act_nat: validate the control action inside init() the following script: # tc qdisc add dev crash0 clsact # tc filter add dev crash0 egress matchall \ > action nat ingress 1.18.1.1 1.18.2.2 pass index 90 # tc actions replace action nat \ > ingress 1.18.1.1 1.18.2.2 goto chain 42 index 90 cookie c1a0c1a0 # tc actions show action nat had the following output: Error: Failed to init TC action chain. We have an error talking to the kernel total acts 1 action order 0: nat ingress 1.18.1.1/32 1.18.2.2 goto chain 42 index 90 ref 2 bind 1 cookie c1a0c1a0 Then, the first packet transmitted by crash0 made the kernel crash: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 #PF error: [normal kernel read fault] PGD 800000002d180067 P4D 800000002d180067 PUD 7cb8b067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 3 PID: 164 Comm: kworker/3:1 Not tainted 5.0.0-rc4.gotochain_crash+ #533 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 Workqueue: ipv6_addrconf addrconf_dad_work RIP: 0010:tcf_action_exec+0xb8/0x100 Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 <48> 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 RSP: 0018:ffffae4500e2fad0 EFLAGS: 00010246 RAX: 000000002000002a RBX: ffff9fa52e28c800 RCX: 0000000001011201 RDX: 0000000000000000 RSI: 0000000000000056 RDI: ffff9fa52ca12800 RBP: ffffae4500e2fb70 R08: 0000000000000022 R09: 000000000000000e R10: 00000000ffffffff R11: 0000000001011201 R12: ffff9fa52cbc9c00 R13: ffff9fa52cbc9c08 R14: 0000000000000001 R15: ffff9fa52ca12780 FS: 0000000000000000(0000) GS:ffff9fa57db80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000073f8c004 CR4: 00000000001606e0 Call Trace: tcf_classify+0x58/0x120 __dev_queue_xmit+0x40a/0x890 ? ndisc_next_option+0x50/0x50 ? ___neigh_create+0x4d5/0x680 ? ip6_finish_output2+0x1b5/0x590 ip6_finish_output2+0x1b5/0x590 ? ip6_output+0x68/0x110 ip6_output+0x68/0x110 ? nf_hook.constprop.28+0x79/0xc0 ndisc_send_skb+0x248/0x2e0 ndisc_send_ns+0xf8/0x200 ? addrconf_dad_work+0x389/0x4b0 addrconf_dad_work+0x389/0x4b0 ? __switch_to_asm+0x34/0x70 ? process_one_work+0x195/0x380 ? addrconf_dad_completed+0x370/0x370 process_one_work+0x195/0x380 worker_thread+0x30/0x390 ? process_one_work+0x380/0x380 kthread+0x113/0x130 ? kthread_park+0x90/0x90 ret_from_fork+0x35/0x40 Modules linked in: act_nat veth ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel mbcache jbd2 snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd cryptd glue_helper snd_timer snd joydev virtio_balloon pcspkr soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs qxl ata_generic pata_acpi drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm virtio_net virtio_blk net_failover failover virtio_console drm crc32c_intel floppy ata_piix libata virtio_pci virtio_ring virtio serio_raw dm_mirror dm_region_hash dm_log dm_mod CR2: 0000000000000000 Validating the control action within tcf_nat_init() proved to fix the above issue. A TDC selftest is added to verify the correct behavior. Fixes: db50514f9a9c ("net: sched: add termination action to allow goto chain") Fixes: 97763dc0f401 ("net_sched: reject unknown tcfa_action values") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-20 22:00:06 +08:00
err = tcf_action_check_ctrlact(parm->action, tp, &goto_ch, extack);
if (err < 0)
goto release_idr;
p = to_tcf_nat(*a);
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
spin_lock_bh(&p->tcf_lock);
p->old_addr = parm->old_addr;
p->new_addr = parm->new_addr;
p->mask = parm->mask;
p->flags = parm->flags;
net/sched: act_nat: validate the control action inside init() the following script: # tc qdisc add dev crash0 clsact # tc filter add dev crash0 egress matchall \ > action nat ingress 1.18.1.1 1.18.2.2 pass index 90 # tc actions replace action nat \ > ingress 1.18.1.1 1.18.2.2 goto chain 42 index 90 cookie c1a0c1a0 # tc actions show action nat had the following output: Error: Failed to init TC action chain. We have an error talking to the kernel total acts 1 action order 0: nat ingress 1.18.1.1/32 1.18.2.2 goto chain 42 index 90 ref 2 bind 1 cookie c1a0c1a0 Then, the first packet transmitted by crash0 made the kernel crash: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 #PF error: [normal kernel read fault] PGD 800000002d180067 P4D 800000002d180067 PUD 7cb8b067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 3 PID: 164 Comm: kworker/3:1 Not tainted 5.0.0-rc4.gotochain_crash+ #533 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 Workqueue: ipv6_addrconf addrconf_dad_work RIP: 0010:tcf_action_exec+0xb8/0x100 Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 <48> 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 RSP: 0018:ffffae4500e2fad0 EFLAGS: 00010246 RAX: 000000002000002a RBX: ffff9fa52e28c800 RCX: 0000000001011201 RDX: 0000000000000000 RSI: 0000000000000056 RDI: ffff9fa52ca12800 RBP: ffffae4500e2fb70 R08: 0000000000000022 R09: 000000000000000e R10: 00000000ffffffff R11: 0000000001011201 R12: ffff9fa52cbc9c00 R13: ffff9fa52cbc9c08 R14: 0000000000000001 R15: ffff9fa52ca12780 FS: 0000000000000000(0000) GS:ffff9fa57db80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000073f8c004 CR4: 00000000001606e0 Call Trace: tcf_classify+0x58/0x120 __dev_queue_xmit+0x40a/0x890 ? ndisc_next_option+0x50/0x50 ? ___neigh_create+0x4d5/0x680 ? ip6_finish_output2+0x1b5/0x590 ip6_finish_output2+0x1b5/0x590 ? ip6_output+0x68/0x110 ip6_output+0x68/0x110 ? nf_hook.constprop.28+0x79/0xc0 ndisc_send_skb+0x248/0x2e0 ndisc_send_ns+0xf8/0x200 ? addrconf_dad_work+0x389/0x4b0 addrconf_dad_work+0x389/0x4b0 ? __switch_to_asm+0x34/0x70 ? process_one_work+0x195/0x380 ? addrconf_dad_completed+0x370/0x370 process_one_work+0x195/0x380 worker_thread+0x30/0x390 ? process_one_work+0x380/0x380 kthread+0x113/0x130 ? kthread_park+0x90/0x90 ret_from_fork+0x35/0x40 Modules linked in: act_nat veth ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel mbcache jbd2 snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd cryptd glue_helper snd_timer snd joydev virtio_balloon pcspkr soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs qxl ata_generic pata_acpi drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm virtio_net virtio_blk net_failover failover virtio_console drm crc32c_intel floppy ata_piix libata virtio_pci virtio_ring virtio serio_raw dm_mirror dm_region_hash dm_log dm_mod CR2: 0000000000000000 Validating the control action within tcf_nat_init() proved to fix the above issue. A TDC selftest is added to verify the correct behavior. Fixes: db50514f9a9c ("net: sched: add termination action to allow goto chain") Fixes: 97763dc0f401 ("net_sched: reject unknown tcfa_action values") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-20 22:00:06 +08:00
goto_ch = tcf_action_set_ctrlact(*a, parm->action, goto_ch);
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
spin_unlock_bh(&p->tcf_lock);
net/sched: act_nat: validate the control action inside init() the following script: # tc qdisc add dev crash0 clsact # tc filter add dev crash0 egress matchall \ > action nat ingress 1.18.1.1 1.18.2.2 pass index 90 # tc actions replace action nat \ > ingress 1.18.1.1 1.18.2.2 goto chain 42 index 90 cookie c1a0c1a0 # tc actions show action nat had the following output: Error: Failed to init TC action chain. We have an error talking to the kernel total acts 1 action order 0: nat ingress 1.18.1.1/32 1.18.2.2 goto chain 42 index 90 ref 2 bind 1 cookie c1a0c1a0 Then, the first packet transmitted by crash0 made the kernel crash: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 #PF error: [normal kernel read fault] PGD 800000002d180067 P4D 800000002d180067 PUD 7cb8b067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 3 PID: 164 Comm: kworker/3:1 Not tainted 5.0.0-rc4.gotochain_crash+ #533 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 Workqueue: ipv6_addrconf addrconf_dad_work RIP: 0010:tcf_action_exec+0xb8/0x100 Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 <48> 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 RSP: 0018:ffffae4500e2fad0 EFLAGS: 00010246 RAX: 000000002000002a RBX: ffff9fa52e28c800 RCX: 0000000001011201 RDX: 0000000000000000 RSI: 0000000000000056 RDI: ffff9fa52ca12800 RBP: ffffae4500e2fb70 R08: 0000000000000022 R09: 000000000000000e R10: 00000000ffffffff R11: 0000000001011201 R12: ffff9fa52cbc9c00 R13: ffff9fa52cbc9c08 R14: 0000000000000001 R15: ffff9fa52ca12780 FS: 0000000000000000(0000) GS:ffff9fa57db80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000073f8c004 CR4: 00000000001606e0 Call Trace: tcf_classify+0x58/0x120 __dev_queue_xmit+0x40a/0x890 ? ndisc_next_option+0x50/0x50 ? ___neigh_create+0x4d5/0x680 ? ip6_finish_output2+0x1b5/0x590 ip6_finish_output2+0x1b5/0x590 ? ip6_output+0x68/0x110 ip6_output+0x68/0x110 ? nf_hook.constprop.28+0x79/0xc0 ndisc_send_skb+0x248/0x2e0 ndisc_send_ns+0xf8/0x200 ? addrconf_dad_work+0x389/0x4b0 addrconf_dad_work+0x389/0x4b0 ? __switch_to_asm+0x34/0x70 ? process_one_work+0x195/0x380 ? addrconf_dad_completed+0x370/0x370 process_one_work+0x195/0x380 worker_thread+0x30/0x390 ? process_one_work+0x380/0x380 kthread+0x113/0x130 ? kthread_park+0x90/0x90 ret_from_fork+0x35/0x40 Modules linked in: act_nat veth ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel mbcache jbd2 snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd cryptd glue_helper snd_timer snd joydev virtio_balloon pcspkr soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs qxl ata_generic pata_acpi drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm virtio_net virtio_blk net_failover failover virtio_console drm crc32c_intel floppy ata_piix libata virtio_pci virtio_ring virtio serio_raw dm_mirror dm_region_hash dm_log dm_mod CR2: 0000000000000000 Validating the control action within tcf_nat_init() proved to fix the above issue. A TDC selftest is added to verify the correct behavior. Fixes: db50514f9a9c ("net: sched: add termination action to allow goto chain") Fixes: 97763dc0f401 ("net_sched: reject unknown tcfa_action values") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-20 22:00:06 +08:00
if (goto_ch)
tcf_chain_put_by_act(goto_ch);
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
if (ret == ACT_P_CREATED)
tcf_idr_insert(tn, *a);
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
return ret;
net/sched: act_nat: validate the control action inside init() the following script: # tc qdisc add dev crash0 clsact # tc filter add dev crash0 egress matchall \ > action nat ingress 1.18.1.1 1.18.2.2 pass index 90 # tc actions replace action nat \ > ingress 1.18.1.1 1.18.2.2 goto chain 42 index 90 cookie c1a0c1a0 # tc actions show action nat had the following output: Error: Failed to init TC action chain. We have an error talking to the kernel total acts 1 action order 0: nat ingress 1.18.1.1/32 1.18.2.2 goto chain 42 index 90 ref 2 bind 1 cookie c1a0c1a0 Then, the first packet transmitted by crash0 made the kernel crash: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 #PF error: [normal kernel read fault] PGD 800000002d180067 P4D 800000002d180067 PUD 7cb8b067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 3 PID: 164 Comm: kworker/3:1 Not tainted 5.0.0-rc4.gotochain_crash+ #533 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 Workqueue: ipv6_addrconf addrconf_dad_work RIP: 0010:tcf_action_exec+0xb8/0x100 Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 <48> 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 RSP: 0018:ffffae4500e2fad0 EFLAGS: 00010246 RAX: 000000002000002a RBX: ffff9fa52e28c800 RCX: 0000000001011201 RDX: 0000000000000000 RSI: 0000000000000056 RDI: ffff9fa52ca12800 RBP: ffffae4500e2fb70 R08: 0000000000000022 R09: 000000000000000e R10: 00000000ffffffff R11: 0000000001011201 R12: ffff9fa52cbc9c00 R13: ffff9fa52cbc9c08 R14: 0000000000000001 R15: ffff9fa52ca12780 FS: 0000000000000000(0000) GS:ffff9fa57db80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000073f8c004 CR4: 00000000001606e0 Call Trace: tcf_classify+0x58/0x120 __dev_queue_xmit+0x40a/0x890 ? ndisc_next_option+0x50/0x50 ? ___neigh_create+0x4d5/0x680 ? ip6_finish_output2+0x1b5/0x590 ip6_finish_output2+0x1b5/0x590 ? ip6_output+0x68/0x110 ip6_output+0x68/0x110 ? nf_hook.constprop.28+0x79/0xc0 ndisc_send_skb+0x248/0x2e0 ndisc_send_ns+0xf8/0x200 ? addrconf_dad_work+0x389/0x4b0 addrconf_dad_work+0x389/0x4b0 ? __switch_to_asm+0x34/0x70 ? process_one_work+0x195/0x380 ? addrconf_dad_completed+0x370/0x370 process_one_work+0x195/0x380 worker_thread+0x30/0x390 ? process_one_work+0x380/0x380 kthread+0x113/0x130 ? kthread_park+0x90/0x90 ret_from_fork+0x35/0x40 Modules linked in: act_nat veth ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel mbcache jbd2 snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd cryptd glue_helper snd_timer snd joydev virtio_balloon pcspkr soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs qxl ata_generic pata_acpi drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm virtio_net virtio_blk net_failover failover virtio_console drm crc32c_intel floppy ata_piix libata virtio_pci virtio_ring virtio serio_raw dm_mirror dm_region_hash dm_log dm_mod CR2: 0000000000000000 Validating the control action within tcf_nat_init() proved to fix the above issue. A TDC selftest is added to verify the correct behavior. Fixes: db50514f9a9c ("net: sched: add termination action to allow goto chain") Fixes: 97763dc0f401 ("net_sched: reject unknown tcfa_action values") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-20 22:00:06 +08:00
release_idr:
tcf_idr_release(*a, bind);
return err;
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
}
static int tcf_nat_act(struct sk_buff *skb, const struct tc_action *a,
struct tcf_result *res)
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
{
struct tcf_nat *p = to_tcf_nat(a);
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
struct iphdr *iph;
__be32 old_addr;
__be32 new_addr;
__be32 mask;
__be32 addr;
int egress;
int action;
int ihl;
int noff;
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
spin_lock(&p->tcf_lock);
tcf_lastuse_update(&p->tcf_tm);
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
old_addr = p->old_addr;
new_addr = p->new_addr;
mask = p->mask;
egress = p->flags & TCA_NAT_FLAG_EGRESS;
action = p->tcf_action;
bstats_update(&p->tcf_bstats, skb);
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
spin_unlock(&p->tcf_lock);
if (unlikely(action == TC_ACT_SHOT))
goto drop;
noff = skb_network_offset(skb);
if (!pskb_may_pull(skb, sizeof(*iph) + noff))
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
goto drop;
iph = ip_hdr(skb);
if (egress)
addr = iph->saddr;
else
addr = iph->daddr;
if (!((old_addr ^ addr) & mask)) {
if (skb_try_make_writable(skb, sizeof(*iph) + noff))
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
goto drop;
new_addr &= mask;
new_addr |= addr & ~mask;
/* Rewrite IP header */
iph = ip_hdr(skb);
if (egress)
iph->saddr = new_addr;
else
iph->daddr = new_addr;
csum_replace4(&iph->check, addr, new_addr);
} else if ((iph->frag_off & htons(IP_OFFSET)) ||
iph->protocol != IPPROTO_ICMP) {
goto out;
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
}
ihl = iph->ihl * 4;
/* It would be nice to share code with stateful NAT. */
switch (iph->frag_off & htons(IP_OFFSET) ? 0 : iph->protocol) {
case IPPROTO_TCP:
{
struct tcphdr *tcph;
if (!pskb_may_pull(skb, ihl + sizeof(*tcph) + noff) ||
skb_try_make_writable(skb, ihl + sizeof(*tcph) + noff))
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
goto drop;
tcph = (void *)(skb_network_header(skb) + ihl);
inet_proto_csum_replace4(&tcph->check, skb, addr, new_addr,
true);
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
break;
}
case IPPROTO_UDP:
{
struct udphdr *udph;
if (!pskb_may_pull(skb, ihl + sizeof(*udph) + noff) ||
skb_try_make_writable(skb, ihl + sizeof(*udph) + noff))
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
goto drop;
udph = (void *)(skb_network_header(skb) + ihl);
if (udph->check || skb->ip_summed == CHECKSUM_PARTIAL) {
inet_proto_csum_replace4(&udph->check, skb, addr,
new_addr, true);
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
if (!udph->check)
udph->check = CSUM_MANGLED_0;
}
break;
}
case IPPROTO_ICMP:
{
struct icmphdr *icmph;
if (!pskb_may_pull(skb, ihl + sizeof(*icmph) + noff))
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
goto drop;
icmph = (void *)(skb_network_header(skb) + ihl);
if ((icmph->type != ICMP_DEST_UNREACH) &&
(icmph->type != ICMP_TIME_EXCEEDED) &&
(icmph->type != ICMP_PARAMETERPROB))
break;
if (!pskb_may_pull(skb, ihl + sizeof(*icmph) + sizeof(*iph) +
noff))
goto drop;
icmph = (void *)(skb_network_header(skb) + ihl);
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
iph = (void *)(icmph + 1);
if (egress)
addr = iph->daddr;
else
addr = iph->saddr;
if ((old_addr ^ addr) & mask)
break;
if (skb_try_make_writable(skb, ihl + sizeof(*icmph) +
sizeof(*iph) + noff))
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
goto drop;
icmph = (void *)(skb_network_header(skb) + ihl);
iph = (void *)(icmph + 1);
new_addr &= mask;
new_addr |= addr & ~mask;
/* XXX Fix up the inner checksums. */
if (egress)
iph->daddr = new_addr;
else
iph->saddr = new_addr;
inet_proto_csum_replace4(&icmph->checksum, skb, addr, new_addr,
false);
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
break;
}
default:
break;
}
out:
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
return action;
drop:
spin_lock(&p->tcf_lock);
p->tcf_qstats.drops++;
spin_unlock(&p->tcf_lock);
return TC_ACT_SHOT;
}
static int tcf_nat_dump(struct sk_buff *skb, struct tc_action *a,
int bind, int ref)
{
unsigned char *b = skb_tail_pointer(skb);
struct tcf_nat *p = to_tcf_nat(a);
struct tc_nat opt = {
.index = p->tcf_index,
.refcnt = refcount_read(&p->tcf_refcnt) - ref,
.bindcnt = atomic_read(&p->tcf_bindcnt) - bind,
};
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
struct tcf_t t;
spin_lock_bh(&p->tcf_lock);
opt.old_addr = p->old_addr;
opt.new_addr = p->new_addr;
opt.mask = p->mask;
opt.flags = p->flags;
opt.action = p->tcf_action;
if (nla_put(skb, TCA_NAT_PARMS, sizeof(opt), &opt))
goto nla_put_failure;
tcf_tm_dump(&t, &p->tcf_tm);
if (nla_put_64bit(skb, TCA_NAT_TM, sizeof(t), &t, TCA_NAT_PAD))
goto nla_put_failure;
spin_unlock_bh(&p->tcf_lock);
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
return skb->len;
nla_put_failure:
spin_unlock_bh(&p->tcf_lock);
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
nlmsg_trim(skb, b);
return -1;
}
static int tcf_nat_walker(struct net *net, struct sk_buff *skb,
struct netlink_callback *cb, int type,
const struct tc_action_ops *ops,
struct netlink_ext_ack *extack)
{
struct tc_action_net *tn = net_generic(net, nat_net_id);
return tcf_generic_walker(tn, skb, cb, type, ops, extack);
}
static int tcf_nat_search(struct net *net, struct tc_action **a, u32 index)
{
struct tc_action_net *tn = net_generic(net, nat_net_id);
return tcf_idr_search(tn, a, index);
}
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
static struct tc_action_ops act_nat_ops = {
.kind = "nat",
.id = TCA_ID_NAT,
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
.owner = THIS_MODULE,
.act = tcf_nat_act,
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
.dump = tcf_nat_dump,
.init = tcf_nat_init,
.walk = tcf_nat_walker,
.lookup = tcf_nat_search,
.size = sizeof(struct tcf_nat),
};
static __net_init int nat_init_net(struct net *net)
{
struct tc_action_net *tn = net_generic(net, nat_net_id);
return tc_action_net_init(tn, &act_nat_ops);
}
static void __net_exit nat_exit_net(struct list_head *net_list)
{
tc_action_net_exit(net_list, nat_net_id);
}
static struct pernet_operations nat_net_ops = {
.init = nat_init_net,
.exit_batch = nat_exit_net,
.id = &nat_net_id,
.size = sizeof(struct tc_action_net),
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
};
MODULE_DESCRIPTION("Stateless NAT actions");
MODULE_LICENSE("GPL");
static int __init nat_init_module(void)
{
return tcf_register_action(&act_nat_ops, &nat_net_ops);
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
}
static void __exit nat_cleanup_module(void)
{
tcf_unregister_action(&act_nat_ops, &nat_net_ops);
[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-09-28 03:48:05 +08:00
}
module_init(nat_init_module);
module_exit(nat_cleanup_module);