Commit Graph

521932 Commits

Author SHA1 Message Date
David S. Miller
ada6c1de9e Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

This a bit large (and late) patchset that contains Netfilter updates for
net-next. Most relevantly br_netfilter fixes, ipset RCU support, removal of
x_tables percpu ruleset copy and rework of the nf_tables netdev support. More
specifically, they are:

1) Warn the user when there is a better protocol conntracker available, from
   Marcelo Ricardo Leitner.

2) Fix forwarding of IPv6 fragmented traffic in br_netfilter, from Bernhard
   Thaler. This comes with several patches to prepare the change in first place.

3) Get rid of special mtu handling of PPPoE/VLAN frames for br_netfilter. This
   is not needed anymore since now we use the largest fragment size to
   refragment, from Florian Westphal.

4) Restore vlan tag when refragmenting in br_netfilter, also from Florian.

5) Get rid of the percpu ruleset copy in x_tables, from Florian. Plus another
   follow up patch to refine it from Eric Dumazet.

6) Several ipset cleanups, fixes and finally RCU support, from Jozsef Kadlecsik.

7) Get rid of parens in Netfilter Kconfig files.

8) Attach the net_device to the basechain as opposed to the initial per table
   approach in the nf_tables netdev family.

9) Subscribe to netdev events to detect the removal and registration of a
   device that is referenced by a basechain.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-15 14:30:32 -07:00
Pablo Neira Ayuso
835b803377 netfilter: nf_tables_netdev: unregister hooks on net_device removal
In case the net_device is gone, we have to unregister the hooks and put back
the reference on the net_device object. Once it comes back, register them
again. This also covers the device rename case.

This patch also adds a new flag to indicate that the basechain is disabled, so
their hooks are not registered. This flag is used by the netdev family to
handle the case where the net_device object is gone. Currently this flag is not
exposed to userspace.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-06-15 23:02:35 +02:00
Pablo Neira Ayuso
d8ee8f7c56 netfilter: nf_tables: add nft_register_basechain() and nft_unregister_basechain()
This wrapper functions take care of hook registration for basechains.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-06-15 23:02:33 +02:00
Pablo Neira Ayuso
2cbce139fc netfilter: nf_tables: attach net_device to basechain
The device is part of the hook configuration, so instead of a global
configuration per table, set it to each of the basechain that we create.

This patch reworks ebddf1a8d7 ("netfilter: nf_tables: allow to bind table to
net_device").

Note that this adds a dev_name field in the nft_base_chain structure which is
required the netdev notification subscription that follows up in a patch to
handle gone net_devices.

Suggested-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-06-15 23:02:31 +02:00
Eric Dumazet
711bdde6a8 netfilter: x_tables: remove XT_TABLE_INFO_SZ and a dereference.
After Florian patches, there is no need for XT_TABLE_INFO_SZ anymore :
Only one copy of table is kept, instead of one copy per cpu.

We also can avoid a dereference if we put table data right after
xt_table_info. It reduces register pressure and helps compiler.

Then, we attempt a kmalloc() if total size is under order-3 allocation,
to reduce TLB pressure, as in many cases, rules fit in 32 KB.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-06-15 20:19:20 +02:00
Pablo Neira Ayuso
53b8762727 Merge branch 'master' of git://blackhole.kfki.hu/nf-next
Jozsef Kadlecsik says:

====================
ipset patches for nf-next

Please consider to apply the next bunch of patches for ipset. First
comes the small changes, then the bugfixes and at the end the RCU
related patches.

* Use MSEC_PER_SEC consistently instead of the number.
* Use SET_WITH_*() helpers to test set extensions from Sergey Popovich.
* Check extensions attributes before getting extensions from Sergey Popovich.
* Permit CIDR equal to the host address CIDR in IPv6 from Sergey Popovich.
* Make sure we always return line number on batch in the case of error
  from Sergey Popovich.
* Check CIDR value only when attribute is given from Sergey Popovich.
* Fix cidr handling for hash:*net* types, reported by Jonathan Johnson.
* Fix parallel resizing and listing of the same set so that the original
  set is kept for the whole dumping.
* Make sure listing doesn't grab a set which is just being destroyed.
* Remove rbtree from ip_set_hash_netiface.c in order to introduce RCU.
* Replace rwlock_t with spinlock_t in "struct ip_set", change the locking
  in the core and simplifications in the timeout routines.
* Introduce RCU locking in bitmap:* types with a slight modification in the
  logic on how an element is added.
* Introduce RCU locking in hash:* types. This is the most complex part of
  the changes.
* Introduce RCU locking in list type where standard rculist is used.
* Fix coding styles reported by checkpatch.pl.
====================

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-06-15 18:33:09 +02:00
Vincent Cuissard
fb77ff4f43 NFC: nci: fix mistake in uart generic driver
It was not possible to register a UART driver due
to a bad condition.

Signed-off-by: Vincent Cuissard <cuissard@marvell.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-06-15 18:10:37 +02:00
Pablo Neira Ayuso
f09becc79f netfilter: Kconfig: get rid of parens around depends on
According to the reporter, they are not needed.

Reported-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-06-15 17:26:37 +02:00
Kalle Valo
d9378080a1 Merge ath-next from ath.git
Major changes:

wil6210:

* add modparam for bcast ring size
* support hidden SSID
* add per-MCS Rx stats
2015-06-15 13:25:32 +03:00
Stanislaw Gruszka
ed8e0ed53b rt2800: fix assigning same WCID for different stations
On some hardware reading WCID entries table results getting 0xff
numbers, no matter of value written there before. This cause assigning
the same WCID for different stations and makes not possible to connect
to more than one station.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2015-06-15 13:13:50 +03:00
Arend van Spriel
f7a40873d2 brcmfmac: assure p2pdev is unregistered upon driver unload
When unloading the driver with a p2pdev interface it resulted in
a warning upon calling wiphy_unregister() and subsequently a crash
in the driver. This patch assures the p2pdev is unregistered calling
unregister_wdev() before doing the wiphy_unregister().

Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Reviewed-by: Franky (Zhenhui) Lin <frankyl@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2015-06-15 12:55:25 +03:00
Arend van Spriel
55479df884 brcmfmac: move p2p attach/detach functions
Moving two functions in p2p.c as is so next change will be
easier to review.

Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Reviewed-by: Franky (Zhenhui) Lin <frankyl@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2015-06-15 12:55:24 +03:00
Arend van Spriel
f37d69a4ba brcmfmac: free ifp for non-netdev interface in p2p module
Making it more clear by freeing the ifp in same place where the
vif object is freed.

Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Reviewed-by: Franky (Zhenhui) Lin <frankyl@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2015-06-15 12:55:23 +03:00
Arend van Spriel
5768f31e4e brcmfmac: have sdio return -EIO when device communication is not possible
The bus interface functions txctl and rxctl may be used while the device
can not be accessed, eg. upon driver .remove() callback. This patch will
immediately return -EIO when this is the case which speeds up the module
unload.

Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Reviewed-by: Franky (Zhenhui) Lin <frankyl@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2015-06-15 12:55:22 +03:00
Arend van Spriel
1f0dc59a6d brcmfmac: rework .get_station() callback
The .get_station() cfg80211 callback is used in several scenarios. In
managed mode it can obtain information about the access-point and its
BSS parameters. In managed mode it can also obtain information about
TDLS peers. In AP mode it can obtain information about connected
clients.

Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Reviewed-by: Daniel (Deognyoun) Kim <dekim@broadcom.com>
Reviewed-by: Franky (Zhenhui) Lin <frankyl@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2015-06-15 12:55:21 +03:00
Pontus Fuchs
2e5f66fe95 brcmfmac: Build wiphy mode and interface combinations dynamically
Switch from using semi hard coded interface combinations. This makes
it easier to announce what the firmware actually supports. This fixes
the case where brcmfmac announces p2p but the firmware doesn't
support it.

Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Reviewed-by: Arend Van Spriel <arend@broadcom.com>
Signed-off-by: Pontus Fuchs <pontusf@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2015-06-15 12:55:20 +03:00
Pontus Fuchs
2b560d7148 brcmfmac: Check if firmware supports p2p
Add a feature flag to reflect the firmware's p2p capability.

Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Reviewed-by: Arend Van Spriel <arend@broadcom.com>
Signed-off-by: Pontus Fuchs <pontusf@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2015-06-15 12:55:19 +03:00
Jakub Kicinski
8d0123748a mt7601u: don't warn about devices without per-rate power table
We expect EEPROM per-rate power table to be filled with
s6 values and warn user if values are invalid.  However,
there appear to be devices which don't have this section
of EEPROM initialized.  In such case we should ignore
the values and leave the driver power tables set to zero.

Note that vendor driver doesn't care about this case but
mt76x2 skips 0xff per value.  We take mt76x2's approach.

Signed-off-by: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2015-06-15 12:49:35 +03:00
Stanislaw Gruszka
3bdb4a4e61 MAINTAINERS: remove rt2x00.serialmonkey.com list and web page
rt2x00.serialmonkey.com will be shutdown.

Since traffic on rt2x00 mailing list is very low, we can use only
linux-wireless list for any rt2x00 related topics.

Thanks for Luis Correia, Ivo van Doorn and Mark Wallis for maintaining
rt2x00 servers for years!

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2015-06-15 12:48:26 +03:00
Chunfan Chen
d219b7eb37 mwifiex: handle BT coex event to adjust Rx BA window size
If timeshare coexistance between bluetooth and WLAN gets enabled,
firmware will give host an event to reduce Rx AMPDU BA window size.
The event is handled in this patch.

Signed-off-by: Chunfan Chen <jeffc@marvell.com>
Signed-off-by: Cathy Luo <cluo@marvell.com>
Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2015-06-15 12:46:56 +03:00
Chun-Yeow Yeoh
f0e449627e ath9k_htc: add support of channel switch
Add the support of channel switching functionality, similar
to ath9k support.

Tested with TP-Link TL-WN722N and TL-WN821N.

Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2015-06-15 12:44:34 +03:00
Arend van Spriel
5b18ffb2d7 brcmfmac: use debugfs_create_devm_seqfile() helper function
Some time ago the function debugfs_create_devm_seqfile() was
introduced in debugfs. The caller simply needs to provide a
device pointer and read function. The function brcmf_debugfs_add_entry()
is now simply a wrapper only doing the work for CONFIG_BRCMDBG.

Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Reviewed-by: Franky (Zhenhui) Lin <frankyl@broadcom.com>
Reviewed-by: Daniel (Deognyoun) Kim <dekim@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2015-06-15 12:42:29 +03:00
Arend van Spriel
c161f29bd6 brcmfmac: remove watchdog reset from brcmf_pcie_buscoreprep()
The watchdog reset as done in brcmf_pcie_buscoreprep() is not
sufficient. It needs to modify PCIe core registers as well
which is properly done by brcmf_pcie_reset_device() after the
chip recognition is done. So the faulty watchdog reset can be
removed as it was causing driver reload to fail and hang the
system requiring a power-cycle. Instead the call to to the
brcmf_pcie_reset_device() function is done twice in the unload.

Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Reviewed-by: Daniel (Deognyoun) Kim <dekim@broadcom.com>
Reviewed-by: Franky (Zhenhui) Lin <frankyl@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2015-06-15 12:42:28 +03:00
Arend van Spriel
dbf967537d brcmfmac: remove chipinfo debugfs entry
The information provided by chipinfo is also provided by the
revinfo debugfs entry. Removing it from debugfs.

Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2015-06-15 12:42:26 +03:00
Hante Meuleman
df738c2f0c brcmfmac: Update msgbuf read pointer quicker.
On device to host data using msgbuf the read pointer gets updated
once all data is processed. Updating this pointer more frequently
allows the firmware to add more data quicker. This will result in
slightly higher and more stable throughput on CPU bounded host
processors.

Reviewed-by: Arend Van Spriel <arend@broadcom.com>
Reviewed-by: Franky (Zhenhui) Lin <frankyl@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Signed-off-by: Hante Meuleman <meuleman@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2015-06-15 12:42:25 +03:00
Taehee Yoo
e996db6983 rtlwifi: rtl8192c: Add init codes for "fw_version" and "fw_subversion".
The variable "fw_version" is used in the _ResetDigitalProcedure1().
but It is not initialized. so I add init codes for "fw_version" and
"fw_subversion".

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Acked-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2015-06-15 12:38:33 +03:00
Taehee Yoo
d92460097c rtlwifi: rtl8192cu: Fix variable isfirst_ampdu
rtl92cu_rx_query_desc set a isampdu twice.
but second code is related to isfirst_ampdu.
so i change it.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2015-06-15 12:31:42 +03:00
Taehee Yoo
8657f9c4d5 rtlwifi: rtl8192cu: debug message change "RTL8192CE" to "RTL8192CU"
In the rtlwifi/rtl8192cu, I change debug message "RTL8192CE" to
"RTL8192CU".

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2015-06-15 12:31:41 +03:00
Taehee Yoo
138055e23b rtlwifi: rtl8192cu: remove duplicated routine in _rtl92c_phy_rf6052_config_parafile
in the _rtl92c_phy_rf6052_config_parafile(), cases
RF90_PATH_A and RF90_PATH_B call the same routine.
so i remove one of these routine. also the return
routine is duplicated. so i remove it.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2015-06-15 12:31:39 +03:00
Hans Ulli Kroll
1637c1b7eb rtlwifi: fix tm_trigger usage
While working on getting my rtl8821au driver in pretty shape for
inclusion, it is dicosvered that the tm_trigger flag is used only for
the first device using this driver.
This flag handles the thermal power management in the hardware.

To change this add a entry in sttruct rtl_dm, so each device can handle
is separately.

Signed-off-by: Hans Ulli Kroll <ulli.kroll@googlemail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2015-06-15 12:30:20 +03:00
Taehee Yoo
fbcaee1c6d rtlwifi: rtl8192cu: remove INTF_PCI and INTF_USB
in the rtlwifi/rtl8192cu, INTF_PCI and INTF_USB is unnecessary.
because RTL8192CU chipset is only USB interface.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2015-06-15 12:29:18 +03:00
Taehee Yoo
1d6b2fb1bc rtlwifi: rtl8192cu: remove _InitBeaconParameters().
_InitBeaconParameters() and rtl92cu_init_beacon_parameters() is
same routine. I remove both functions. then i add
_rtl92cu_init_beacon_parameters() in the hw.c.
_rtl92cu_init_beacon_parameters() is same routine with
_InitBeaconParameters().

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2015-06-15 12:29:17 +03:00
Taehee Yoo
f5372e940c rtlwifi: rtl8192cu: remove IS_HARDWARE_TYPE_8192CE and IS_HARDWARE_TYPE_8192CU
in the rtlwifi/rtl8192cu, IS_HARDWARE_TYPE_8192CE and IS_HARDWARE_TYPE_8192CU
is unnecessary. because rtlwifi/rtl8192cu codes aren't shared.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2015-06-15 12:29:15 +03:00
Alexander Aring
789a99ecb9 fakelb: add xmit_async after stop testcase
This patch adds a suspended testcase into the xmit_async functionality.
In the hope that we can found race conditions when xmit_async is called
after an ieee802154_ops stop. This should never happen.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-06-15 01:25:06 +02:00
Alexander Aring
e6f7ed9dc1 at86rf230: add support for sleep state
This patch adds support for sleep state when between stop and start
period. In this period the transceiver isn't used by the subsystem, in
this time we disable the irq and going into the sleep state.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-06-15 01:22:19 +02:00
Alexander Aring
9ff19e6f44 at86rf230: use level high as fallback default
This patch use high level interrupt type as fallback handling when no
irq type is given.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-06-15 01:22:19 +02:00
Alexander Aring
ed2e627cb1 mac802154: iface: flush workqueue before stop
This patch flushs the workqueue which is currently used for xmit_sync
callback before calling stop driver-ops. Flush the queue will ensure all
pending tx frames are transmitted.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-06-15 01:22:19 +02:00
Alexander Aring
b4ee194441 mac802154: iface: fix hrtimer cancel on ifdown
The interframe spacing timer is a per phy definition and is part of a
ieee802154_local structure. If we have possible multiple interfaces
ifdown one interface then the timer should not be cancled. First if the
last interface is down and the receive handling is stopped we should be
sure that the interframe spacing timer isn't run anymore.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-06-15 01:22:19 +02:00
Varka Bhadram
1bc1754e82 mac802154: rx packet handle cleanup
This patch replaces !netif_running(sdata->dev) with
!ieee802154_sdata_running(sdata) and also devide the
code two separate if branches.

Signed-off-by: Varka Bhadram <varkab@cdac.in>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-06-15 01:20:17 +02:00
Kenneth Klette Jonassen
758f0d4b16 tcp: cdg: use div_u64()
Fixes cross-compile to mips.

Signed-off-by: Kenneth Klette Jonassen <kennetkl@ifi.uio.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-14 12:57:45 -07:00
Jozsef Kadlecsik
ca0f6a5cd9 netfilter: ipset: Fix coding styles reported by checkpatch.pl
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2015-06-14 10:40:18 +02:00
Jozsef Kadlecsik
00590fdd5b netfilter: ipset: Introduce RCU locking in list type
Standard rculist is used.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2015-06-14 10:40:17 +02:00
Jozsef Kadlecsik
18f84d41d3 netfilter: ipset: Introduce RCU locking in hash:* types
Three types of data need to be protected in the case of the hash types:

a. The hash buckets: standard rcu pointer operations are used.
b. The element blobs in the hash buckets are stored in an array and
   a bitmap is used for book-keeping to tell which elements in the array
   are used or free.
c. Networks per cidr values and the cidr values themselves are stored
   in fix sized arrays and need no protection. The values are modified
   in such an order that in the worst case an element testing is repeated
   once with the same cidr value.

The ipset hash approach uses arrays instead of lists and therefore is
incompatible with rhashtable.

Performance is tested by Jesper Dangaard Brouer:

Simple drop in FORWARD
~~~~~~~~~~~~~~~~~~~~~~

Dropping via simple iptables net-mask match::

 iptables -t raw -N simple || iptables -t raw -F simple
 iptables -t raw -I simple  -s 198.18.0.0/15 -j DROP
 iptables -t raw -D PREROUTING -j simple
 iptables -t raw -I PREROUTING -j simple

Drop performance in "raw": 11.3Mpps

Generator: sending 12.2Mpps (tx:12264083 pps)

Drop via original ipset in RAW table
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Create a set with lots of elements::

 sudo ./ipset destroy test
 echo "create test hash:ip hashsize 65536" > test.set
 for x in `seq 0 255`; do
    for y in `seq 0 255`; do
        echo "add test 198.18.$x.$y" >> test.set
    done
 done
 sudo ./ipset restore < test.set

Dropping via ipset::

 iptables -t raw -F
 iptables -t raw -N net198 || iptables -t raw -F net198
 iptables -t raw -I net198 -m set --match-set test src -j DROP
 iptables -t raw -I PREROUTING -j net198

Drop performance in "raw" with ipset: 8Mpps

Perf report numbers ipset drop in "raw"::

 +   24.65%  ksoftirqd/1  [ip_set]           [k] ip_set_test
 -   21.42%  ksoftirqd/1  [kernel.kallsyms]  [k] _raw_read_lock_bh
    - _raw_read_lock_bh
       + 99.88% ip_set_test
 -   19.42%  ksoftirqd/1  [kernel.kallsyms]  [k] _raw_read_unlock_bh
    - _raw_read_unlock_bh
       + 99.72% ip_set_test
 +    4.31%  ksoftirqd/1  [ip_set_hash_ip]   [k] hash_ip4_kadt
 +    2.27%  ksoftirqd/1  [ixgbe]            [k] ixgbe_fetch_rx_buffer
 +    2.18%  ksoftirqd/1  [ip_tables]        [k] ipt_do_table
 +    1.81%  ksoftirqd/1  [ip_set_hash_ip]   [k] hash_ip4_test
 +    1.61%  ksoftirqd/1  [kernel.kallsyms]  [k] __netif_receive_skb_core
 +    1.44%  ksoftirqd/1  [kernel.kallsyms]  [k] build_skb
 +    1.42%  ksoftirqd/1  [kernel.kallsyms]  [k] ip_rcv
 +    1.36%  ksoftirqd/1  [kernel.kallsyms]  [k] __local_bh_enable_ip
 +    1.16%  ksoftirqd/1  [kernel.kallsyms]  [k] dev_gro_receive
 +    1.09%  ksoftirqd/1  [kernel.kallsyms]  [k] __rcu_read_unlock
 +    0.96%  ksoftirqd/1  [ixgbe]            [k] ixgbe_clean_rx_irq
 +    0.95%  ksoftirqd/1  [kernel.kallsyms]  [k] __netdev_alloc_frag
 +    0.88%  ksoftirqd/1  [kernel.kallsyms]  [k] kmem_cache_alloc
 +    0.87%  ksoftirqd/1  [xt_set]           [k] set_match_v3
 +    0.85%  ksoftirqd/1  [kernel.kallsyms]  [k] inet_gro_receive
 +    0.83%  ksoftirqd/1  [kernel.kallsyms]  [k] nf_iterate
 +    0.76%  ksoftirqd/1  [kernel.kallsyms]  [k] put_compound_page
 +    0.75%  ksoftirqd/1  [kernel.kallsyms]  [k] __rcu_read_lock

Drop via ipset in RAW table with RCU-locking
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

With RCU locking, the RW-lock is gone.

Drop performance in "raw" with ipset with RCU-locking: 11.3Mpps

Performance-tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2015-06-14 10:40:17 +02:00
Jozsef Kadlecsik
96f51428c4 netfilter: ipset: Introduce RCU locking in bitmap:* types
There's nothing much required because the bitmap types use atomic
bit operations. However the logic of adding elements slightly changed:
first the MAC address updated (which is not atomic), then the element
activated (added). The extensions may call kfree_rcu() therefore we
call rcu_barrier() at module removal.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2015-06-14 10:40:16 +02:00
Jozsef Kadlecsik
b57b2d1fa5 netfilter: ipset: Prepare the ipset core to use RCU at set level
Replace rwlock_t with spinlock_t in "struct ip_set" and change the locking
accordingly. Convert the comment extension into an rcu-avare object. Also,
simplify the timeout routines.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2015-06-14 10:40:16 +02:00
Jozsef Kadlecsik
bd55389cc3 netfilter:ipset Remove rbtree from hash:net,iface
Remove rbtree in order to introduce RCU instead of rwlock in ipset

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2015-06-14 10:40:15 +02:00
Jozsef Kadlecsik
9c1ba5c809 netfilter: ipset: Make sure listing doesn't grab a set which is just being destroyed.
There was a small window when all sets are destroyed and a concurrent
listing of all sets could grab a set which is just being destroyed.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2015-06-14 10:40:15 +02:00
Jozsef Kadlecsik
c4c997839c netfilter: ipset: Fix parallel resizing and listing of the same set
When elements added to a hash:* type of set and resizing triggered,
parallel listing could start to list the original set (before resizing)
and "continue" with listing the new set. Fix it by references and
using the original hash table for listing. Therefore the destroying of
the original hash table may happen from the resizing or listing functions.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2015-06-14 10:40:15 +02:00
Jozsef Kadlecsik
f690cbaed9 netfilter: ipset: Fix cidr handling for hash:*net* types
Commit "Simplify cidr handling for hash:*net* types" broke the cidr
handling for the hash:*net* types when the sets were used by the SET
target: entries with invalid cidr values were added to the sets.
Reported by Jonathan Johnson.

Testsuite entry is added to verify the fix.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2015-06-14 10:40:14 +02:00
Sergey Popovich
aff227581e netfilter: ipset: Check CIDR value only when attribute is given
There is no reason to check CIDR value regardless attribute
specifying CIDR is given.

Initialize cidr array in element structure on element structure
declaration to let more freedom to the compiler to optimize
initialization right before element structure is used.

Remove local variables cidr and cidr2 for netnet and netportnet
hashes as we do not use packed cidr value for such set types and
can store value directly in e.cidr[].

Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2015-06-14 10:40:14 +02:00