NETLINK_ADD_MEMBERSHIP/NETLINK_DROP_MEMBERSHIP are used to join/leave
groups, NETLINK_PKTINFO is used to enable nl_pktinfo control messages
for received packets to get the extended destination group number.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is necessary for dynamic number of netlink groups to make sure we know
the number of possible groups before bind() is called. With this change pure
userspace communication using unused netlink protocols becomes impossible.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using the group number allows increasing the number of groups without
beeing limited by the size of the bitmask. It introduces one limitation
for netlink users: messages can't be broadcasted to multiple groups anymore,
however this feature was never used inside the kernel.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use-after-free: the struct proto_ops containing the module pointer
is freed when a socket with pid=0 is released, which besides for kernel
sockets is true for all unbound sockets.
Module refcount leak: when the kernel socket is closed before all user
sockets have been closed the proto_ops struct for this family is
replaced by the generic one and the module refcount can't be dropped.
The second problem can't be solved cleanly using module refcounting in the
generic socket code, so this patch adds explicit refcounting to
netlink_create/netlink_release.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
netlink_broadcast users must initialize NETLINK_CB(skb).dst_groups to the
destination group mask for netlink_recvmsg.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
w1 does not need to multicast its state to several groups at once,
and upcoming netlink changes will not allow bitmask for groups anyway.
Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
may be a false warning if there always is something on ccid3hcrx_hist:
net/dccp/ccids/ccid3.c: In function 'ccid3_hc_rx_packet_recv':
net/dccp/ccids/ccid3.c:1634: warning: 'tstamp.tv_usec' may be used uninitialized in this function
net/dccp/ccids/ccid3.c:1634: warning: 'tstamp.tv_sec' may be used uninitialized in this function
const on inline functions doesn't have any effect:
net/dccp/dccp.h:64: warning: type qualifiers ignored on function return type
net/dccp/dccp.h:70: warning: type qualifiers ignored on function return type
net/dccp/dccp.h:76: warning: type qualifiers ignored on function return type
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To avoid holding TIMEWAIT state for sockets in the LISTEN state.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Only available if CONFIG_DEBUG_KERNEL is enabled in the "Kernel
Hacking" Menu.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Based on discussions with Nishida-san.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As requested by Ian.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: Ian McDonald <iam4@cs.waikato.ac.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rip out cmd/sid/pid matching since its unfixable broken and stands in the
way of locking changes to tasklist_lock.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Gary Wayne Smith <gary.w.smith@primeexalia.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Increases consistency in source-address selection.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch ads a new "connbytes" match that utilizes the CONFIG_NF_CT_ACCT
per-connection byte and packet counters. Using it you can do things like
packet classification on average packet size within a connection.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
As proposed by Andi Kleen, this is required esp. for x86_64 architecture,
where 64bit code needs 8byte aligned 64bit data types, but 32bit userspace
apps will only align to 4bytes.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
With this the previous setup is back, i.e. tcp_diag can be built as a module,
as dccp_diag and both share the infrastructure available in inet_diag.
If one selects CONFIG_INET_DIAG as module CONFIG_INET_TCP_DIAG will also be
built as a module, as will CONFIG_INET_DCCP_DIAG, if CONFIG_IP_DCCP was
selected static or as a module, if CONFIG_INET_DIAG is y, being statically
linked CONFIG_INET_TCP_DIAG will follow suit and CONFIG_INET_DCCP_DIAG will be
built in the same manner as CONFIG_IP_DCCP.
Now to aim at UDP, converting it to use inet_hashinfo, so that we can use
iproute2 for UDP sockets as well.
Ah, just to show an example of this new infrastructure working for DCCP :-)
[root@qemu ~]# ./ss -dane
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 0 *:5001 *:* ino:942 sk:cfd503a0
ESTAB 0 0 127.0.0.1:5001 127.0.0.1:32770 ino:943 sk:cfd50a60
ESTAB 0 0 127.0.0.1:32770 127.0.0.1:5001 ino:947 sk:cfd50700
TIME-WAIT 0 0 127.0.0.1:32769 127.0.0.1:5001 timer:(timewait,3.430ms,0) ino:0 sk:cf209620
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Next changeset will introduce net/ipv4/tcp_diag.c, moving the code that was put
transitioanlly in inet_diag.c.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Next changeset will rename tcp_diag.[ch] to inet_diag.[ch].
I'm taking this longer route so as to easy review, making clear the changes
made all along the way.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Next changeset will rename tcp_diag to inet_diag and move the tcp_diag code out
of it and into a new tcp_diag.c, similar to the net/dccp/diag.c introduced in
this changeset, completing the transition to a generic inet_diag
infrastructure.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Doing this we allow tcp_diag to support IPV6 even if tcp_diag is compiled
statically and IPV6 is compiled as a module, removing the previous restriction
while not building any IPV6 code if it is not selected.
Now to work on the tcpdiag_register infrastructure and then to rename the whole
thing to inetdiag, reflecting its by then completely generic nature.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the same way as was done with the v4 counterparts, this will be moved
to inet6_hashtables.c.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix gcc-3.4.x warning about iplicit operator precedence in NF_QUEUE_NR()
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
# grep -r 'netif_carrier_o[nf]' linux-2.6.12 | wc -l
246
# size vmlinux.org vmlinux.carrier
text data bss dec hex filename
4339634 1054414 259296 5653344 564360 vmlinux.org
4337710 1054414 259296 5651420 563bdc vmlinux.carrier
And this ain't an allyesconfig kernel!
Signed-off-by: David S. Miller <davem@davemloft.net>
I obviously wanted to use bitwise-or, not logical or.
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch contains the following possible cleanups/fixes:
- use C99 struct initializers
- make a few arrays and structs static
- remove a few uses of literal 0 as NULL pointer
- use convenience function instead of cast+dereference in bnx2_ioctl()
- remove superfluous casts to u8 * in calls to readl/writel
Signed-off-by: Peter Hagervall <hager@cs.umu.se>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Please consider the patch below which makes use of file->private_data to
store the pointer to the socket, which avoids touching several unused
cachelines in the dentry and inode in sockfd_lookup.
Signed-off-by: Benjamin LaHaise <bcrl@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This also changes the list_for_each_entry_safe_continue behaviour to match its
kerneldoc comment, that is, to start after the pos passed.
Also adds several helper functions from previously open coded fragments, making
the code more clear.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
With ugly ifdefs, etc, but this actually:
1. keeps the existing ABI, i.e. no need to recompile the iproute2
utilities if not interested in DCCP.
2. Provides all the tcp_diag functionality in DCCP, with just a
small patch that makes iproute2 support DCCP.
Of course I'll get this cleaned-up in time, but for now I think its
OK to be this way to quickly get this functionality.
iproute2-ss050808 patch at:
http://vger.kernel.org/~acme/iproute2-ss050808.dccp.patch
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This changeset basically moves tcp_sk()->{ca_ops,ca_state,etc} to inet_csk(),
minimal renaming/moving done in this changeset to ease review.
Most of it is just changes of struct tcp_sock * to struct sock * parameters.
With this we move to a state closer to two interesting goals:
1. Generalisation of net/ipv4/tcp_diag.c, becoming inet_diag.c, being used
for any INET transport protocol that has struct inet_hashinfo and are
derived from struct inet_connection_sock. Keeps the userspace API, that will
just not display DCCP sockets, while newer versions of tools can support
DCCP.
2. INET generic transport pluggable Congestion Avoidance infrastructure, using
the current TCP CA infrastructure with DCCP.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using most of the infrastructure TCP uses, with a dccp_death_row,
etc. As per my current interpretation of the draft what we have with
this changeset seems to be all we need (or very close to it 8)).
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Also export the ones that will be used in the next changeset, when
DCCP uses this infrastructure.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
That groups all of the tables and variables associated to the TCP timewait
schedulling/recycling/killing code, that now can be isolated from the TCP
specific code and used by other transport protocols, such as DCCP.
Next changeset will move this code to net/ipv4/inet_timewait_sock.c
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>