drivers: ata: ahci_sunxi: Increased SATA/AHCI DMA TX/RX FIFOs

Increasing the SATA/AHCI DMA TX/RX FIFOs (P0DMACR.TXTS and .RXTS, ie.
TX_TRANSACTION_SIZE and RX_TRANSACTION_SIZE) from default 0x0 each
to 0x3 each, gives a write performance boost of 120 MiB/s to 132 MiB/s
from lame 36 MiB/s to 45 MiB/s previously.
Read performance is above 200 MiB/s.
[tested on SSD using dd bs=4K/8K/12K/16K/20K/24K/32K: peak-perf at 12K]

Tested on the SBCs Banana Pi R1 (aka Lamobo R1) and Banana Pi M1 which
are based on the Allwinner A20 32bit-SoC (ARMv7-a / arm-linux-gnueabihf).
These devices are RaspberryPi-like small devices.

This problem of slow SATA write-speed with these small devices lasts
for about 7 years now (beginning with the A10 SoC). Many commentators
throughout the years wrongly assumed the slow write speed was a
hardware limitation. This patch finally solves the problem, which
in fact was just a hard-to-find software problem due to lack of
SATA/AHCI documentation by the SoC-maker Allwinner Technology.

Lists of the affected sunxi and other boards and SoCs with SATA using
the ahci_sunxi driver:
  $ grep -i -e "^&ahci" arch/arm/boot/dts/sun*dts
  and http://linux-sunxi.org/SATA#Devices_with_SATA_ports
  See also http://linux-sunxi.org/Category:Devices_with_SATA_port

Tested-by: Chen-Yu Tsai <wens@csie.org>
Acked-by: Maxime Ripard <maxime.ripard@bootlin.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Uenal Mutlu <um@mutluit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This commit is contained in:
Uenal Mutlu 2019-05-13 16:24:10 +02:00 committed by Jens Axboe
parent 8756a25b07
commit 120357ea17

View File

@ -149,8 +149,51 @@ static void ahci_sunxi_start_engine(struct ata_port *ap)
void __iomem *port_mmio = ahci_port_base(ap);
struct ahci_host_priv *hpriv = ap->host->private_data;
/* Setup DMA before DMA start */
sunxi_clrsetbits(hpriv->mmio + AHCI_P0DMACR, 0x0000ff00, 0x00004400);
/* Setup DMA before DMA start
*
* NOTE: A similar SoC with SATA/AHCI by Texas Instruments documents
* this Vendor Specific Port (P0DMACR, aka PxDMACR) in its
* User's Guide document (TMS320C674x/OMAP-L1x Processor
* Serial ATA (SATA) Controller, Literature Number: SPRUGJ8C,
* March 2011, Chapter 4.33 Port DMA Control Register (P0DMACR),
* p.68, https://www.ti.com/lit/ug/sprugj8c/sprugj8c.pdf)
* as equivalent to the following struct:
*
* struct AHCI_P0DMACR_t
* {
* unsigned TXTS : 4;
* unsigned RXTS : 4;
* unsigned TXABL : 4;
* unsigned RXABL : 4;
* unsigned Reserved : 16;
* };
*
* TXTS: Transmit Transaction Size (TX_TRANSACTION_SIZE).
* This field defines the DMA transaction size in DWORDs for
* transmit (system bus read, device write) operation. [...]
*
* RXTS: Receive Transaction Size (RX_TRANSACTION_SIZE).
* This field defines the Port DMA transaction size in DWORDs
* for receive (system bus write, device read) operation. [...]
*
* TXABL: Transmit Burst Limit.
* This field allows software to limit the VBUSP master read
* burst size. [...]
*
* RXABL: Receive Burst Limit.
* Allows software to limit the VBUSP master write burst
* size. [...]
*
* Reserved: Reserved.
*
*
* NOTE: According to the above document, the following alternative
* to the code below could perhaps be a better option
* (or preparation) for possible further improvements later:
* sunxi_clrsetbits(hpriv->mmio + AHCI_P0DMACR, 0x0000ffff,
* 0x00000033);
*/
sunxi_clrsetbits(hpriv->mmio + AHCI_P0DMACR, 0x0000ffff, 0x00004433);
/* Start DMA */
sunxi_setbits(port_mmio + PORT_CMD, PORT_CMD_START);