RFC 1122 specifies two threshold values R1 and R2 for connection timeouts,
which may represent a number of allowed retransmissions or a timeout value.
Currently Linux uses sysctl_tcp_retries{1,2} to specify the thresholds
as a number of allowed retransmissions.
For any desired time threshold R2, one can choose tcp_retries2
(a number of retransmissions) such that TCP will not time out
earlier than R2. This is possible because the RTO schedule follows a fixed
pattern, namely exponential backoff.
However, the RTO behaviour is no longer predictable if RTO backoffs can be
reverted, as is the case in the draft
"Make TCP more Robust to Long Connectivity Disruptions"
(http://tools.ietf.org/html/draft-zimmermann-tcp-lcd).
In the worst case, TCP would time out a connection after 3.2 seconds if the
initial RTO equaled MIN_RTO and each backoff had been reverted.
This patch introduces a function retransmits_timed_out(N),
which calculates the timeout of a TCP connection, assuming an initial
RTO of MIN_RTO and N unsuccessful, exponentially backed-off retransmissions.
Whenever timeout decisions are made by comparing the retransmission counter
to some value N, this function can be used instead.
The meaning of tcp_retries2 changes, as many more RTO retransmissions
can occur than the value indicates. However, it yields a timeout similar
to that of an unpatched, exponentially backing-off TCP in the same
scenario. As no application could rely on an RTO greater than MIN_RTO, there
should be no risk of a regression.
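As a rough illustration of the calculation described above, the following
user-space sketch sums the RTO intervals of an exponentially backed-off
connection. The 200 ms and 120 s bounds, the sketch_ names, and the convention
of counting N+1 timer intervals (the original transmission plus N
retransmissions) are assumptions made for the example; they are not taken
from the patch.

```c
#include <stdint.h>

/* Assumed bounds for the sketch: 200 ms minimum and 120 s maximum RTO. */
#define SKETCH_RTO_MIN_MS 200u
#define SKETCH_RTO_MAX_MS 120000u

/*
 * Time in ms until the timer of the n-th retransmission expires, if the
 * first RTO equals SKETCH_RTO_MIN_MS and every RTO doubles up to the cap.
 */
static uint64_t sketch_backoff_timeout_ms(unsigned int n)
{
	uint64_t total = 0;
	uint64_t rto = SKETCH_RTO_MIN_MS;
	unsigned int i;

	for (i = 0; i <= n; i++) {
		total += rto;
		rto *= 2;
		if (rto > SKETCH_RTO_MAX_MS)
			rto = SKETCH_RTO_MAX_MS;
	}
	return total;
}

/* Time-based replacement for a check of the form "retransmits >= n". */
static int sketch_retransmits_timed_out(uint64_t elapsed_ms, unsigned int n)
{
	return elapsed_ms >= sketch_backoff_timeout_ms(n);
}
```

Under these assumptions n = 15 yields 924600 ms, whereas 16 intervals of
200 ms with every backoff reverted add up to only the 3.2 seconds mentioned
above.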
Signed-off-by: Damian Lukowski <damian@tvk.rwth-aachen.de>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Here, an ICMP host/network unreachable message whose payload matches
TCP's SND.UNA is taken as an indication that the RTO retransmission was
lost not because of congestion, but because of a route failure
somewhere along the path.
With true congestion, a router won't trigger such a message and the
patched TCP will operate as standard TCP.
This patch reverts one RTO backoff if an ICMP host/network unreachable
message whose payload matches TCP's SND.UNA arrives.
Based on the new RTO, the retransmission timer is reset to reflect the
remaining time, or, if the reverted RTO has already expired, a retransmission
is sent out immediately.
Backoffs are only reverted if TCP is in RTO loss recovery, i.e. if
there have already been retransmissions and reversible backoffs.
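The revert step described above can be sketched in user space as follows.
This is only an illustration: the state structure, the sketch_ names, the
millisecond clock and the 120 s cap are hypothetical and do not mirror the
kernel's data structures.

```c
#include <stdint.h>

#define SKETCH_RTO_MAX_MS 120000u	/* assumed upper bound for the RTO */

enum sketch_action {
	SKETCH_IGNORE,		/* not in RTO loss recovery, nothing to revert */
	SKETCH_REARM_TIMER,	/* restart the timer with the remaining time */
	SKETCH_RETRANSMIT_NOW	/* the shortened RTO has already elapsed */
};

struct sketch_rto_state {
	unsigned int backoff;		/* backoffs currently applied */
	unsigned int retransmits;	/* RTO retransmissions so far */
	uint32_t base_rto_ms;		/* RTO before any backoff */
	uint32_t last_sent_ms;		/* time of the last retransmission */
};

/* RTO after applying the current number of backoffs, capped at the maximum. */
static uint32_t sketch_current_rto(const struct sketch_rto_state *s)
{
	uint64_t rto = s->base_rto_ms;
	unsigned int i;

	for (i = 0; i < s->backoff && rto < SKETCH_RTO_MAX_MS; i++)
		rto *= 2;
	return rto > SKETCH_RTO_MAX_MS ? SKETCH_RTO_MAX_MS : (uint32_t)rto;
}

/* Called when a matching ICMP unreachable message arrives at now_ms. */
static enum sketch_action sketch_revert_backoff(struct sketch_rto_state *s,
						uint32_t now_ms,
						uint32_t *remaining_ms)
{
	uint32_t rto, elapsed;

	*remaining_ms = 0;
	if (!s->retransmits || !s->backoff)
		return SKETCH_IGNORE;

	s->backoff--;				/* revert exactly one backoff */
	rto = sketch_current_rto(s);
	elapsed = now_ms - s->last_sent_ms;
	if (elapsed >= rto)
		return SKETCH_RETRANSMIT_NOW;

	*remaining_ms = rto - elapsed;
	return SKETCH_REARM_TIMER;
}
```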
Changes from v2:
1) Renaming of skb in tcp_v4_err() moved to another patch.
2) Reintroduced tcp_bound_rto() and __tcp_set_rto().
3) Fixed code comments.
Signed-off-by: Damian Lukowski <damian@tvk.rwth-aachen.de>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
This supplementary patch renames skb to icmp_skb in tcp_v4_err() in order to
disambiguate from another sk_buff variable, which will be introduced
in a separate patch.
Signed-off-by: Damian Lukowski <damian@tvk.rwth-aachen.de>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for dcbnl_rtnl_ops.setapp/getapp to set or get the current user
priority bitmap for the given application protocol. Currently, 82599 only
supports setapp/getapp for the Fibre Channel over Ethernet (FCoE) protocol.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement the dcbnl netlink setapp/getapp pair. When a setapp/getapp
request is received, dcbnl simply passes it on to dcbnl_rtnl_ops.setapp/getapp,
which are expected to be implemented by the low-level drivers.
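The pass-through described above amounts to a guarded call through a table
of driver callbacks. A minimal user-space sketch, in which all sketch_ types,
the callback signatures and the error codes are assumptions made for
illustration and not the kernel's definitions:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_netdev;

/* Hypothetical driver callbacks, loosely modelled on dcbnl_rtnl_ops. */
struct sketch_dcb_ops {
	uint8_t (*getapp)(struct sketch_netdev *dev, uint8_t idtype,
			  uint16_t id);
	uint8_t (*setapp)(struct sketch_netdev *dev, uint8_t idtype,
			  uint16_t id, uint8_t up);
};

struct sketch_netdev {
	const struct sketch_dcb_ops *dcb_ops;	/* NULL if DCB is unsupported */
	uint8_t fcoe_up;			/* bitmap kept by the example driver */
};

/* Example driver that only knows ether type 0x8906 (FCoE), idtype 0. */
static uint8_t sketch_drv_getapp(struct sketch_netdev *dev, uint8_t idtype,
				 uint16_t id)
{
	return (idtype == 0 && id == 0x8906) ? dev->fcoe_up : 0;
}

static uint8_t sketch_drv_setapp(struct sketch_netdev *dev, uint8_t idtype,
				 uint16_t id, uint8_t up)
{
	if (idtype != 0 || id != 0x8906)
		return 1;		/* non-zero signals failure here */
	dev->fcoe_up = up;
	return 0;
}

static const struct sketch_dcb_ops sketch_drv_ops = {
	.getapp = sketch_drv_getapp,
	.setapp = sketch_drv_setapp,
};

/* Generic layer: forward the request if the driver provides the callback. */
static int sketch_dcbnl_getapp(struct sketch_netdev *dev, uint8_t idtype,
			       uint16_t id, uint8_t *up)
{
	if (!dev->dcb_ops || !dev->dcb_ops->getapp)
		return -EOPNOTSUPP;
	*up = dev->dcb_ops->getapp(dev, idtype, id);
	return 0;
}

static int sketch_dcbnl_setapp(struct sketch_netdev *dev, uint8_t idtype,
			       uint16_t id, uint8_t up)
{
	if (!dev->dcb_ops || !dev->dcb_ops->setapp)
		return -EOPNOTSUPP;
	return dev->dcb_ops->setapp(dev, idtype, id, up) ? -EINVAL : 0;
}
```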
Signed-off-by: Yi Zou <yi.zou@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add defines for dcbnl netlink attributes to support netlink message passing of
setapp/getapp in dcbnl.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for dcbnl setapp/getapp to dcbnl_rtnl_ops in netdev to allow
low-level drivers (LLDs) to implement their corresponding dcbnl setapp/getapp
ops to support the IEEE 802.1Q DCBX setapp/getapp commands.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds dcbnl command definitions to support setapp/getapp
functionality from the IEEE 802.1Qaz Data Center Bridging Capability
Exchange protocol (DCBX) specification. Section 3.3 defines the
application protocol and its 802.1p user priority in DCBX, which is
implemented here as a pair of setapp/getapp commands in the kernel
dcbnl for setting and retrieving the user priority for a given
application protocol. The protocol is identified by the combination of
an id and an idtype. Currently, when idtype is 0, the corresponding
id gives the ether type of this protocol, e.g., for FCoE, it will be
0x8906; when idtype is 1, then the corresponding id gives the TCP or
UDP port number.
For more information regarding the DCBX spec, please refer to the following:
http://www.ieee802.org/1/files/public/docs2008/az-wadekar-dcbx-capability-exchange-discovery-protocol-1108-v1.01.pdf
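To make the (idtype, id) addressing concrete, here is a small user-space
sketch of a table keyed by that pair. The constant names, the table size and
the functions are hypothetical; only the meaning of idtype 0 and 1 and the
FCoE ether type 0x8906 come from the description above.

```c
#include <stddef.h>
#include <stdint.h>

enum {
	SKETCH_IDTYPE_ETHTYPE = 0,	/* id is an ether type */
	SKETCH_IDTYPE_PORTNUM = 1	/* id is a TCP or UDP port number */
};

#define SKETCH_ETH_P_FCOE 0x8906
#define SKETCH_MAX_APPS   8

struct sketch_app {
	uint8_t  idtype;
	uint16_t id;
	uint8_t  up;	/* 802.1p user priority value for this protocol */
	int      used;
};

struct sketch_app_table {
	struct sketch_app apps[SKETCH_MAX_APPS];
};

static struct sketch_app *sketch_find(struct sketch_app_table *t,
				      uint8_t idtype, uint16_t id)
{
	size_t i;

	for (i = 0; i < SKETCH_MAX_APPS; i++)
		if (t->apps[i].used && t->apps[i].idtype == idtype &&
		    t->apps[i].id == id)
			return &t->apps[i];
	return NULL;
}

/* Store or update the priority; returns 0 on success, -1 if the table is full. */
static int sketch_setapp(struct sketch_app_table *t, uint8_t idtype,
			 uint16_t id, uint8_t up)
{
	struct sketch_app *app = sketch_find(t, idtype, id);
	size_t i;

	if (!app) {
		for (i = 0; i < SKETCH_MAX_APPS && t->apps[i].used; i++)
			;
		if (i == SKETCH_MAX_APPS)
			return -1;
		app = &t->apps[i];
		app->used = 1;
		app->idtype = idtype;
		app->id = id;
	}
	app->up = up;
	return 0;
}

/* Returns the stored priority, or 0 if the protocol is unknown. */
static uint8_t sketch_getapp(struct sketch_app_table *t, uint8_t idtype,
			     uint16_t id)
{
	struct sketch_app *app = sketch_find(t, idtype, id);

	return app ? app->up : 0;
}
```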
Signed-off-by: Yi Zou <yi.zou@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds support for net_device_ops.ndo_fcoe_enable/disable to 82599. This
allows us to dynamically turn the FCoE offload feature on or off
upon incoming calls to ndo_fcoe_enable/disable. When this happens, FCoE offload
features are enabled/disabled accordingly, regardless of whether
DCB is turned on or not.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds an implementation of net_device_ops.ndo_fcoe_enable/_disable to
the VLAN driver. It checks whether the real_dev supports ndo_fcoe_enable/
ndo_fcoe_disable and, if so, passes the call on to the associated real_dev.
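The delegation to the underlying device can be pictured with the following
user-space sketch. The sketch_ structures and callbacks are hypothetical
stand-ins for net_device and net_device_ops, and returning -EINVAL when the
real device lacks the callback is an assumption of the example.

```c
#include <errno.h>
#include <stddef.h>

struct sketch_dev;

struct sketch_dev_ops {
	int (*fcoe_enable)(struct sketch_dev *dev);
	int (*fcoe_disable)(struct sketch_dev *dev);
};

struct sketch_dev {
	const struct sketch_dev_ops *ops;
	struct sketch_dev *real_dev;	/* lower device of a VLAN, else NULL */
	int fcoe_on;
};

/* Example hardware driver callbacks. */
static int sketch_hw_fcoe_enable(struct sketch_dev *dev)
{
	dev->fcoe_on = 1;
	return 0;
}

static int sketch_hw_fcoe_disable(struct sketch_dev *dev)
{
	dev->fcoe_on = 0;
	return 0;
}

static const struct sketch_dev_ops sketch_hw_ops = {
	.fcoe_enable = sketch_hw_fcoe_enable,
	.fcoe_disable = sketch_hw_fcoe_disable,
};

/* VLAN side: forward to the real device only if it implements the op. */
static int sketch_vlan_fcoe_enable(struct sketch_dev *vlan)
{
	struct sketch_dev *real = vlan->real_dev;

	if (real && real->ops && real->ops->fcoe_enable)
		return real->ops->fcoe_enable(real);
	return -EINVAL;
}

static int sketch_vlan_fcoe_disable(struct sketch_dev *vlan)
{
	struct sketch_dev *real = vlan->real_dev;

	if (real && real->ops && real->ops->fcoe_disable)
		return real->ops->fcoe_disable(real);
	return -EINVAL;
}
```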
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add ndo_fcoe_enable/_disable to net_device_ops so the corresponding
HW can initialize itself for FCoE traffic or clean up after FCoE traffic is
done. This is expected to be called by the kernel FCoE stack upon receiving
a request to create an FCoE instance on the corresponding netdev interface.
When implemented by the actual HW, the HW driver checks the op code to perform
the corresponding initialization or clean-up for FCoE. The initialization
normally includes allocating extra queues for FCoE, setting the corresponding
HW registers for FCoE, indicating FCoE offload features via netdev, etc. The
clean-up would include releasing the resources allocated for FCoE.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In a couple of cases collapse some extra code like:
int retval = NETDEV_TX_OK;
...
return retval;
into
return NETDEV_TX_OK;
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mostly just simple conversions:
* ray_cs had a bogus return of NETDEV_TX_LOCKED, but the driver
was not using NETIF_F_LLTX
* hostap and ipw2x00 had some code that returned a value
from a called function, which also had to change to return netdev_tx_t
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Get rid of some bogus return wrapping as well.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These are all drivers that don't touch real hardware.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update all the pcmcia network drivers for netdev_tx_t.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The transmit function should only return one of three possible values,
but some drivers got confused and returned errnos or other values.
This changes the definition so that such mistakes can be caught at compile
time.
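For illustration, a typed transmit return value could look like the sketch
below. The sketch_ names, the numeric values and the ring structure are
assumptions made for the example rather than the kernel's definitions, and
how strictly a mismatching handler is diagnosed depends on the compiler and
checker in use.

```c
/* Stand-ins for the three permitted transmit return codes. */
enum sketch_netdev_tx {
	SKETCH_TX_OK = 0,	/* the driver took care of the packet */
	SKETCH_TX_BUSY,		/* the transmit path was busy, retry later */
	SKETCH_TX_LOCKED = -1	/* the transmit lock was already taken */
};
typedef enum sketch_netdev_tx sketch_netdev_tx_t;

struct sketch_ring {
	unsigned int used;	/* descriptors currently in flight */
	unsigned int size;	/* total number of descriptors */
};

/* A transmit handler declares the enum type instead of a plain int. */
static sketch_netdev_tx_t sketch_start_xmit(struct sketch_ring *ring)
{
	if (ring->used == ring->size)
		return SKETCH_TX_BUSY;	/* not an errno such as -ENOMEM */

	ring->used++;
	return SKETCH_TX_OK;
}
```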
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Single line log messages should be emitted by a single call
where possible.
Converted multiple calls to DBG_PRINT to single-call form.
Removed the "s2io:" prefix from DBG_PRINTs.
The DBG_PRINT macro now emits a log level and is wrapped in
a do {...} while (0)
All s2io log output is now prefixed with KBUILD_MODNAME ": "
via pr_fmt.
The DBG_PRINT macro should probably be converted to use the
dev_<level> form eventually.
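The prefixing and the do {...} while (0) wrapper can be tried out in user
space with the sketch below. It is not the driver's macro: the module name
string, the level constants and the capture buffer are assumptions, the log
level itself is left out, and snprintf() stands in for the kernel's printing
functions.

```c
#include <stdio.h>

#define SKETCH_MODNAME "s2io"		/* stand-in for KBUILD_MODNAME */
#define pr_fmt(fmt) SKETCH_MODNAME ": " fmt

enum { SKETCH_ERR_DBG = 0, SKETCH_INFO_DBG = 2 };

static int sketch_debug_level = SKETCH_ERR_DBG;
static char sketch_last_msg[128];	/* captures the last emitted message */

/*
 * The do { } while (0) wrapper makes the macro behave like one statement,
 * so it is safe in an unbraced if/else. ##__VA_ARGS__ is a GNU C extension
 * that drops the trailing comma when no arguments follow the format.
 */
#define DBG_PRINT(dbg_level, fmt, ...)					\
do {									\
	if ((dbg_level) <= sketch_debug_level)				\
		snprintf(sketch_last_msg, sizeof(sketch_last_msg),	\
			 pr_fmt(fmt), ##__VA_ARGS__);			\
} while (0)
```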
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Sreenivasa Honnur <sreenivasa.honnur@neterion.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This conversion was missed in an earlier patch.
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Sreenivasa Honnur <sreenivasa.honnur@neterion.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Regularize the declaration and uses of
struct config_param *config = &sp->config;
struct mac_info *mac_control = &sp->mac_control;
and use
struct stat_block *stats = mac_control->stats_info;
struct swStat *swstats = &stats->sw_stat;
struct xpakStat *xstats = &stats->xpak_stat;
and convert the longish uses like
nic->mac_control.stats_info->sw_stat.<foo>
to
swstats-><foo>
etc.
This also makes the statistics code marginally smaller
and presumably faster.
Old:
$ size s2io.o
text data bss dec hex filename
114289 516 33360 148165 242c5 s2io.o
New:
$ size s2io.o
text data bss dec hex filename
114097 516 33360 147973 24205 s2io.o
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Sreenivasa Honnur <sreenivasa.honnur@neterion.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Sreenivasa Honnur <sreenivasa.honnur@neterion.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixed a trivial typo as well.
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Sreenivasa Honnur <sreenivasa.honnur@neterion.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Still has a few long lines.
checkpatch was:
total: 263 errors, 53 warnings, 8751 lines checked
is:
total: 4 errors, 35 warnings, 8767 lines checked
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Sreenivasa Honnur <sreenivasa.honnur@neterion.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Sreenivasa Honnur <sreenivasa.honnur@neterion.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use consistent style. Don't calculate the kmalloc size multiple times.
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Sreenivasa Honnur <sreenivasa.honnur@neterion.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Repeated variable use and line wrapping are hard to read.
Use temporary variables instead of direct references.
struct fifo_info *fifo = &mac_control->fifos[i];
struct ring_info *ring = &mac_control->rings[i];
struct tx_fifo_config *tx_cfg = &config->tx_cfg[i];
struct rx_ring_config *rx_cfg = &config->rx_cfg[i];
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Sreenivasa Honnur <sreenivasa.honnur@neterion.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Sreenivasa Honnur <sreenivasa.honnur@neterion.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
pfifo_fast_enqueue has this check:
if (skb_queue_len(list) < qdisc_dev(qdisc)->tx_queue_len) {
which allows each band to enqueue up to tx_queue_len skbs, for a
total of 3*tx_queue_len skbs. I am not sure whether this was the
intended limit for the qdisc.
Patch compiled, and testing with 32 simultaneous netperf sessions ran fine. Also:
# tc -s qdisc show dev eth2
qdisc pfifo_fast 0: root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
Sent 16835026752 bytes 373116 pkt (dropped 0, overlimits 0 requeues 25)
rate 0bit 0pps backlog 0b 0p requeues 25
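The difference between limiting each band and limiting the whole qdisc can
be shown with a small user-space model. The structure and both functions are
hypothetical; in particular, the whole-qdisc variant is only one plausible
way to bound the total and is not quoted from the patch.

```c
#define SKETCH_BANDS 3

struct sketch_pfifo {
	unsigned int band_len[SKETCH_BANDS];	/* packets queued per band */
	unsigned int qlen;			/* packets queued in total */
	unsigned int tx_queue_len;		/* configured device limit */
};

/* Per-band check: each band may grow to tx_queue_len on its own. */
static int sketch_enqueue_per_band(struct sketch_pfifo *q, int band)
{
	if (q->band_len[band] >= q->tx_queue_len)
		return -1;	/* drop */
	q->band_len[band]++;
	q->qlen++;
	return 0;
}

/* Whole-qdisc check: all bands together share tx_queue_len. */
static int sketch_enqueue_total(struct sketch_pfifo *q, int band)
{
	if (q->qlen >= q->tx_queue_len)
		return -1;	/* drop */
	q->band_len[band]++;
	q->qlen++;
	return 0;
}
```

With tx_queue_len set to 2 and three packets offered to every band, the
per-band variant accepts 6 packets in this model while the whole-qdisc
variant accepts 2.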
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patch compiled, and testing with 32 simultaneous netperf sessions ran fine.
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patch compiled, and testing with 32 simultaneous netperf sessions ran fine.
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dropped skbs should be reported through an appropriate return value.
Use the correct NET_RX_DROP and NET_RX_SUCCESS values for that purpose.
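A receive handler following this convention might look like the user-space
sketch below. The sketch_ names, the length check and the drop counter are
made up for the example; the numeric values are stand-ins chosen for the
sketch.

```c
#include <stddef.h>

#define SKETCH_NET_RX_SUCCESS 0	/* packet was accepted */
#define SKETCH_NET_RX_DROP    1	/* packet was dropped */

struct sketch_skb {
	size_t len;
};

/* Report a drop to the caller instead of always returning 0. */
static int sketch_rcv(const struct sketch_skb *skb, size_t expected_len,
		      unsigned long *dropped)
{
	if (skb->len != expected_len) {
		(*dropped)++;
		return SKETCH_NET_RX_DROP;
	}
	return SKETCH_NET_RX_SUCCESS;
}
```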
Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the tests that check whether Frame* bits are not set.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch implements suspend/resume and WOL support for the UCC Ethernet
driver.
We support two wake-up events: wake on PHY/link changes and wake
on magic packet.
On some CPUs (like MPC8569) QE shuts down during sleep, so magic packet
detection is unusable, and on resume we also have to fully reinitialize
the UCC structures.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes the currently unused UGETH_MAGIC_PACKET Kconfig symbol
and code, i.e. the magic_packet_detection_{enable,disable} functions.
The two functions each contain just two steps, which we'll place into the
suspend/resume code path under CONFIG_PM.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch factors out MAC initialization into the ucc_geth_init_mac()
function, which we'll use for suspend/resume.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On some CPUs (e.g. MPC8569) QE shuts down completely during sleep, and
drivers may want to know that in order to reinitialize registers and buffer
descriptors.
This patch implements the qe_alive_during_sleep() helper function. So far
it just checks whether an MPC8569-compatible power management controller is
present, which is a sign that QE turns off during sleep.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit 3e73fc9a12 ("ucc_geth: Fix IO
memory (un)mapping code") I fixed the ug_regs IO memory leak by properly
freeing the allocated memory. But the ethtool_stats() callback doesn't
check for ug_regs being NULL, and that causes the following oops if
'ethtool -S' is executed on a closed eth device:
Unable to handle kernel paging request for data at address 0x00000180
Faulting instruction address: 0xc0208228
Oops: Kernel access of bad area, sig: 11 [#1]
...
NIP [c0208228] uec_get_ethtool_stats+0x38/0x140
LR [c02559a0] ethtool_get_stats+0xf8/0x23c
Call Trace:
[ef87bcd0] [c025597c] ethtool_get_stats+0xd4/0x23c (unreliable)
[ef87bd00] [c025706c] dev_ethtool+0xfe8/0x11bc
[ef87be00] [c0252b5c] dev_ioctl+0x454/0x6a8
...
---[ end trace 77fff1162a9586b0 ]---
Segmentation fault
This patch fixes the issue.
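The kind of guard that avoids such an oops is sketched below in user space.
The structures, field names and counters are hypothetical and only
illustrate skipping the register reads while ug_regs is NULL; they are not
the driver's actual code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register block; mapped only while the device is open. */
struct sketch_regs {
	uint32_t tx_frames;
	uint32_t rx_frames;
};

struct sketch_priv {
	struct sketch_regs *ug_regs;	/* NULL while the device is closed */
};

/* Fills data[] with hardware counters; returns how many were written. */
static size_t sketch_get_ethtool_stats(const struct sketch_priv *priv,
				       uint64_t *data, size_t n)
{
	size_t j = 0;

	/* Skip the hardware counters instead of dereferencing NULL. */
	if (priv->ug_regs && n >= 2) {
		data[j++] = priv->ug_regs->tx_frames;
		data[j++] = priv->ug_regs->rx_frames;
	}
	return j;
}
```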
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>