Commit Graph

1072 Commits

Author SHA1 Message Date
Paul Moore
9bb5fd2b05 NetLabel: use cipso_v4_doi_search() for local CIPSOv4 functions
The cipso_v4_doi_search() function behaves the same as cipso_v4_doi_getdef()
but is a local, static function so use it whenever possibile in the CIPSOv4
code base.

Signed-of-by: Paul Moore <paul.moore@hp.com>

Signed-off-by: James Morris <jmorris@namei.org>
2006-12-02 21:24:12 -08:00
Paul Moore
9fade4bf8e NetLabel: return the correct error for translated CIPSOv4 tags
The CIPSOv4 translated tag #1 mapping does not always return the correct error
code if the desired mapping does not exist; instead of returning -EPERM it
returns -ENOSPC indicating that the buffer is not large enough to hold the
translated value.  This was caused by failing to check a specific error
condition.  This patch fixes this so that unknown mappings return
-EPERM which is consistent with the rest of the related CIPSOv4 code.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2006-12-02 21:24:11 -08:00
Paul Moore
91b1ed0afd NetLabel: fixup the handling of CIPSOv4 tags to allow for multiple tag types
While the original CIPSOv4 code had provisions for multiple tag types the
implementation was not as great as it could be, pushing a lot of non-tag
specific processing into the tag specific code blocks.  This patch fixes that
issue making it easier to support multiple tag types in the future.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2006-12-02 21:24:10 -08:00
Paul Moore
6ce61a7c26 NetLabel: add tag verification when adding new CIPSOv4 DOI definitions
Currently the CIPSOv4 engine does not do any sort of checking when a new DOI
definition is added.  The tags are still verified but only as a side effect of
normal NetLabel operation (packet processing, socket labeling, etc.) which
would cause application errors due to the faulty configuration.  This patch
adds tag checking when new DOI definition are added allowing us to catch these
configuration problems when they happen.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2006-12-02 21:24:09 -08:00
Paul Moore
05e00cbf50 NetLabel: check for a CIPSOv4 option before we do call into the CIPSOv4 layer
Right now the NetLabel code always jumps into the CIPSOv4 layer to determine if
a CIPSO IP option is present.  However, we can do this check directly in the
NetLabel code by making use of the CIPSO_V4_OPTEXIST() macro which should save
us a function call in the common case of not having a CIPSOv4 option present.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2006-12-02 21:24:08 -08:00
Paul Moore
701a90bad9 NetLabel: make netlbl_lsm_secattr struct easier/quicker to understand
The existing netlbl_lsm_secattr struct required the LSM to check all of the
fields to determine if any security attributes were present resulting in a lot
of work in the common case of no attributes.  This patch adds a 'flags' field
which is used to indicate which attributes are present in the structure; this
should allow the LSM to do a quick comparison to determine if the structure
holds any security attributes.

Example:

 if (netlbl_lsm_secattr->flags)
	/* security attributes present */
 else
	/* NO security attributes present */

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2006-12-02 21:24:07 -08:00
Arnaldo Carvalho de Melo
352d48008b [TCP]: Tidy up skb_entail
Heck, it even saves us some few bytes:

[acme@newtoy net-2.6.20]$ codiff -f /tmp/tcp.o.before ../OUTPUT/qemu/net-2.6.20/net/ipv4/tcp.o
/pub/scm/linux/kernel/git/acme/net-2.6.20/net/ipv4/tcp.c:
  tcp_sendpage |   -7
  tcp_sendmsg  |   -5
 2 functions changed, 12 bytes removed
[acme@newtoy net-2.6.20]$

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:24:03 -08:00
Arnaldo Carvalho de Melo
c67862403e [TCP] minisocks: Use kmemdup and LIMIT_NETDEBUG
Code diff stats:

[acme@newtoy net-2.6.20]$ codiff /tmp/tcp_minisocks.o.before /tmp/tcp_minisocks.o.after
/pub/scm/linux/kernel/git/acme/net-2.6.20/net/ipv4/tcp_minisocks.c:
  tcp_check_req |  -44
 1 function changed, 44 bytes removed
[acme@newtoy net-2.6.20]$

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:23:57 -08:00
Arnaldo Carvalho de Melo
42e5ea466c [IPV4]: Use kmemdup in net/ipv4/devinet.c
Code diff stats:

[acme@newtoy net-2.6.20]$ codiff /tmp/devinet.o.before /tmp/devinet.o.after
/pub/scm/linux/kernel/git/acme/net-2.6.20/net/ipv4/devinet.c:
  devinet_sysctl_register |  -38
 1 function changed, 38 bytes removed
[acme@newtoy net-2.6.20]$

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:23:56 -08:00
Arnaldo Carvalho de Melo
fac5d73151 [NETLABEL]: Use kmemdup in cipso_ipv4.c
Code diff stats:

[acme@newtoy net-2.6.20]$ codiff /tmp/cipso_ipv4.o.before /tmp/cipso_ipv4.o.after
/pub/scm/linux/kernel/git/acme/net-2.6.20/net/ipv4/cipso_ipv4.c:
  cipso_v4_cache_add |  -46
 1 function changed, 46 bytes removed
[acme@newtoy net-2.6.20]$

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:23:55 -08:00
Arnaldo Carvalho de Melo
f6685938f9 [TCP_IPV4]: Use kmemdup where appropriate
Also use a variable to avoid the longish tp->md5sig_info-> use
in tcp_v4_md5_do_add.

Code diff stats:

[acme@newtoy net-2.6.20]$ codiff /tmp/tcp_ipv4.o.before /tmp/tcp_ipv4.o.after
/pub/scm/linux/kernel/git/acme/net-2.6.20/net/ipv4/tcp_ipv4.c:
  tcp_v4_md5_do_add     |  -62
  tcp_v4_syn_recv_sock  |  -32
  tcp_v4_parse_md5_keys |  -86
 3 functions changed, 180 bytes removed
[acme@newtoy net-2.6.20]$

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:23:54 -08:00
Arnaldo Carvalho de Melo
7174259e6c [TCP_IPV4]: CodingStyle cleanups, no code change
Mostly related to CONFIG_TCP_MD5SIG recent merge.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:23:53 -08:00
Gerrit Renker
078250d68d [NET/IPv4]: Make udp_push_pending_frames static
udp_push_pending_frames is only referenced within
net/ipv4/udp.c and hence can remain static.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:47 -08:00
Al Viro
43bc0ca7ea [NET]: netfilter checksum annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:42 -08:00
Al Viro
f9214b2627 [NET]: ipvs checksum annotations.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:41 -08:00
Al Viro
f6ab028804 [NET]: Make mangling a checksum (0 -> 0xffff on the wire) explicit.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:39 -08:00
Al Viro
b51655b958 [NET]: Annotate __skb_checksum_complete() and friends.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:38 -08:00
Al Viro
b1550f2212 [NET]: Annotate ip_vs_checksum_complete() and callers.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:37 -08:00
Al Viro
5f92a7388a [NET]: Annotate callers of the reset of checksum.h stuff.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:34 -08:00
Al Viro
5084205faf [NET]: Annotate callers of csum_partial_copy_...() and csum_and_copy...() in net/*
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:33 -08:00
Al Viro
44bb93633f [NET]: Annotate csum_partial() callers in net/*
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:32 -08:00
Al Viro
6b11687ef0 [NET]: Annotate csum_tcpudp_magic() callers in net/*
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:29 -08:00
Al Viro
d3bc23e7ee [NET]: Annotate callers of csum_fold() in net/*
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:27 -08:00
Al Viro
75e7ce66ef [IPVS]: Annotate ..._app_hashkey().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:22:58 -08:00
Al Viro
42d224aa17 [NETFILTER]: More trivial annotations.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:22:54 -08:00
Al Viro
714e85be35 [IPV6]: Assorted trivial endianness annotations.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:22:50 -08:00
Gerrit Renker
ba4e58eca8 [NET]: Supporting UDP-Lite (RFC 3828) in Linux
This is a revision of the previously submitted patch, which alters
the way files are organized and compiled in the following manner:

	* UDP and UDP-Lite now use separate object files
	* source file dependencies resolved via header files
	  net/ipv{4,6}/udp_impl.h
	* order of inclusion files in udp.c/udplite.c adapted
	  accordingly

[NET/IPv4]: Support for the UDP-Lite protocol (RFC 3828)

This patch adds support for UDP-Lite to the IPv4 stack, provided as an
extension to the existing UDPv4 code:
        * generic routines are all located in net/ipv4/udp.c
        * UDP-Lite specific routines are in net/ipv4/udplite.c
        * MIB/statistics support in /proc/net/snmp and /proc/net/udplite
        * shared API with extensions for partial checksum coverage

[NET/IPv6]: Extension for UDP-Lite over IPv6

It extends the existing UDPv6 code base with support for UDP-Lite
in the same manner as per UDPv4. In particular,
        * UDPv6 generic and shared code is in net/ipv6/udp.c
        * UDP-Litev6 specific extensions are in net/ipv6/udplite.c
        * MIB/statistics support in /proc/net/snmp6 and /proc/net/udplite6
        * support for IPV6_ADDRFORM
        * aligned the coding style of protocol initialisation with af_inet6.c
        * made the error handling in udpv6_queue_rcv_skb consistent;
          to return `-1' on error on all error cases
        * consolidation of shared code

[NET]: UDP-Lite Documentation and basic XFRM/Netfilter support

The UDP-Lite patch further provides
        * API documentation for UDP-Lite
        * basic xfrm support
        * basic netfilter support for IPv4 and IPv6 (LOG target)

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:22:46 -08:00
David S. Miller
a928630a2f [TCP]: Fix some warning when MD5 is disabled.
Just some mis-placed ifdefs:

net/ipv4/tcp_minisocks.c: In function ‘tcp_twsk_destructor’:
net/ipv4/tcp_minisocks.c:364: warning: unused variable ‘twsk’
net/ipv6/tcp_ipv6.c:1846: warning: ‘tcp_sock_ipv6_specific’ defined but not used
net/ipv6/tcp_ipv6.c:1877: warning: ‘tcp_sock_ipv6_mapped_specific’ defined but not used

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:22:43 -08:00
YOSHIFUJI Hideaki
cfb6eeb4c8 [TCP]: MD5 Signature Option (RFC2385) support.
Based on implementation by Rick Payne.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:22:39 -08:00
Gerrit Renker
b9df3cb8cf [TCP/DCCP]: Introduce net_xmit_eval
Throughout the TCP/DCCP (and tunnelling) code, it often happens that the
return code of a transmit function needs to be tested against NET_XMIT_CN
which is a value that does not indicate a strict error condition.

This patch uses a macro for these recurring situations which is consistent
with the already existing macro net_xmit_errno, saving on duplicated code.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:22:27 -08:00
David S. Miller
2404043a66 [TCP] htcp: Better packing of struct htcp.
Based upon a patch by Joe Perches.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:22:14 -08:00
Thomas Graf
339bf98ffc [NETLINK]: Do precise netlink message allocations where possible
Account for the netlink message header size directly in nlmsg_new()
instead of relying on the caller calculate it correctly.

Replaces error handling of message construction functions when
constructing notifications with bug traps since a failure implies
a bug in calculating the size of the skb.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:22:11 -08:00
Gerrit Renker
a94f723d59 [TCP]: Remove dead code in init_sequence
This removes two redundancies:

1) The test (skb->protocol == htons(ETH_P_IPV6) in tcp_v6_init_sequence()
   is always true, due to
	* tcp_v6_conn_request() is the only function calling this one
	* tcp_v6_conn_request() redirects all skb's with ETH_P_IP protocol to
	  tcp_v4_conn_request() [ cf. top of tcp_v6_conn_request()]

2) The first argument, `struct sock *sk' of tcp_v{4,6}_init_sequence() is
   never used.

Signed-off-by: Gerrit Renker  <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:22:10 -08:00
David S. Miller
931731123a [TCP]: Don't set SKB owner in tcp_transmit_skb().
The data itself is already charged to the SKB, doing
the skb_set_owner_w() just generates a lot of noise and
extra atomics we don't really need.

Lmbench improvements on lat_tcp are minimal:

before:
TCP latency using localhost: 23.2701 microseconds
TCP latency using localhost: 23.1994 microseconds
TCP latency using localhost: 23.2257 microseconds

after:
TCP latency using localhost: 22.8380 microseconds
TCP latency using localhost: 22.9465 microseconds
TCP latency using localhost: 22.8462 microseconds

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:52 -08:00
Stephen Hemminger
35bfbc9407 [TCP]: Allow autoloading of congestion control via setsockopt.
If user has permision to load modules, then autoload then attempt
autoload of TCP congestion module.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:50 -08:00
Stephen Hemminger
ce7bc3bf15 [TCP]: Restrict congestion control choices.
Allow normal users to only choose among a restricted set of congestion
control choices.  The default is reno and what ever has been configured
as default. But the policy can be changed by administrator at any time.

For example, to allow any choice:
    cp /proc/sys/net/ipv4/tcp_available_congestion_control \
       /proc/sys/net/ipv4/tcp_allowed_congestion_control

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:49 -08:00
Stephen Hemminger
3ff825b28d [TCP]: Add tcp_available_congestion_control sysctl.
Create /proc/sys/net/ipv4/tcp_available_congestion_control
that reflects currently available TCP choices.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:48 -08:00
Eric Dumazet
72a3effaf6 [NET]: Size listen hash tables using backlog hint
We currently allocate a fixed size (TCP_SYNQ_HSIZE=512) slots hash table for
each LISTEN socket, regardless of various parameters (listen backlog for
example)

On x86_64, this means order-1 allocations (might fail), even for 'small'
sockets, expecting few connections. On the contrary, a huge server wanting a
backlog of 50000 is slowed down a bit because of this fixed limit.

This patch makes the sizing of listen hash table a dynamic parameter,
depending of :
- net.core.somaxconn tunable (default is 128)
- net.ipv4.tcp_max_syn_backlog tunable (default : 256, 1024 or 128)
- backlog value given by user application  (2nd parameter of listen())

For large allocations (bigger than PAGE_SIZE), we use vmalloc() instead of
kmalloc().

We still limit memory allocation with the two existing tunables (somaxconn &
tcp_max_syn_backlog). So for standard setups, this patch actually reduce RAM
usage.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:44 -08:00
Thomas Graf
1f6c9557e8 [NET] rules: Share common attribute validation policy
Move the attribute policy for the non-specific attributes into
net/fib_rules.h and include it in the respective protocols.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:41 -08:00
Thomas Graf
b8964ed9fa [NET] rules: Protocol independant mark selector
Move mark selector currently implemented per protocol into
the protocol independant part.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:41 -08:00
Thomas Graf
5f300893fd [IPV4] nl_fib_lookup: Rename fl_fwmark to fl_mark
For the sake of consistency.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:40 -08:00
Thomas Graf
47dcf0cb10 [NET]: Rethink mark field in struct flowi
Now that all protocols have been made aware of the mark
field it can be moved out of the union thus simplyfing
its usage.

The config options in the IPv4/IPv6/DECnet subsystems
to enable respectively disable mark based routing only
obfuscate the code with ifdefs, the cost for the
additional comparison in the flow key is insignificant,
and most distributions have all these options enabled
by default anyway. Therefore it makes sense to remove
the config options and enable mark based routing by
default.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:39 -08:00
Thomas Graf
82e91ffef6 [NET]: Turn nfmark into generic mark
nfmark is being used in various subsystems and has become
the defacto mark field for all kinds of packets. Therefore
it makes sense to rename it to `mark' and remove the
dependency on CONFIG_NETFILTER.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:38 -08:00
Venkat Yekkirala
6b877699c6 SELinux: Return correct context for SO_PEERSEC
Fix SO_PEERSEC for tcp sockets to return the security context of
the peer (as represented by the SA from the peer) as opposed to the
SA used by the local/source socket.

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: James Morris <jmorris@namei.org>
2006-12-02 21:21:33 -08:00
Al Viro
d5a0a1e310 [IPV4]: encapsulation annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:17 -08:00
Al Viro
8c689a6eae [XFRM]: misc annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:11 -08:00
Al Viro
5a874db4d9 [NET]: ipconfig and nfsroot annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:09 -08:00
Patrick McHardy
af443b6d90 [NETFILTER]: ipt_REJECT: fix memory corruption
On devices with hard_header_len > LL_MAX_HEADER ip_route_me_harder()
reallocates the skb, leading to memory corruption when using the stale
tcph pointer to update the checksum.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-11-28 20:59:38 -08:00
Yasuyuki Kozakai
2e47c264a2 [NETFILTER]: conntrack: fix refcount leak when finding expectation
All users of __{ip,nf}_conntrack_expect_find() don't expect that
it increments the reference count of expectation.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-11-28 20:59:37 -08:00
Patrick McHardy
c537b75a3b [NETFILTER]: ctnetlink: fix reference count leak
When NFA_NEST exceeds the skb size the protocol reference is leaked.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-11-28 20:59:36 -08:00