kernel_optimize_test/fs/btrfs
Filipe Manana 75b463d2b4 btrfs: do not commit logs and transactions during link and rename operations
Since commit d4682ba03e ("Btrfs: sync log after logging new name") we
started to commit logs, and fallback to transaction commits when we failed
to log the new names or commit the logs, after link and rename operations
when the target inodes (or their parents) were previously logged in the
current transaction. This was to avoid losing directories despite an
explicit fsync on them when they are ancestors of some inode that got a
new named logged, due to a link or rename operation. However that adds the
cost of starting IO and waiting for it to complete, which can cause higher
latencies for applications.

Instead of doing that, just make sure that when we log a new name for an
inode we don't mark any of its ancestors as logged, so that if any one
does an fsync against any of them, without doing any other change on them,
the fsync commits the log. This way we only pay the cost of a log commit
(or a transaction commit if something goes wrong or a new block group was
created) if the application explicitly asks to fsync any of the parent
directories.

Using dbench, which mixes several filesystems operations including renames,
revealed some significant latency gains. The following script that uses
dbench was used to test this:

  #!/bin/bash

  DEV=/dev/nvme0n1
  MNT=/mnt/btrfs
  MOUNT_OPTIONS="-o ssd -o space_cache=v2"
  MKFS_OPTIONS="-m single -d single"
  THREADS=16

  echo "performance" | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
  mkfs.btrfs -f $MKFS_OPTIONS $DEV
  mount $MOUNT_OPTIONS $DEV $MNT

  dbench -t 300 -D $MNT $THREADS

  umount $MNT

The test was run on bare metal, no virtualization, on a box with 12 cores
(Intel i7-8700), 64Gb of RAM and using a NVMe device, with a kernel
configuration that is the default of typical distributions (debian in this
case), without debug options enabled (kasan, kmemleak, slub debug, debug
of page allocations, lock debugging, etc).

Results before this patch:

 Operation      Count    AvgLat    MaxLat
 ----------------------------------------
 NTCreateX    10750455     0.011   155.088
 Close         7896674     0.001     0.243
 Rename         455222     2.158  1101.947
 Unlink        2171189     0.067   121.638
 Deltree           256     2.425     7.816
 Mkdir             128     0.002     0.003
 Qpathinfo     9744323     0.006    21.370
 Qfileinfo     1707092     0.001     0.146
 Qfsinfo       1786756     0.001    11.228
 Sfileinfo      875612     0.003    21.263
 Find          3767281     0.025     9.617
 WriteX        5356924     0.011   211.390
 ReadX        16852694     0.003     9.442
 LockX           35008     0.002     0.119
 UnlockX         35008     0.001     0.138
 Flush          753458     4.252  1102.249

Throughput 1128.35 MB/sec  16 clients  16 procs  max_latency=1102.255 ms

Results after this patch:

16 clients, after

 Operation      Count    AvgLat    MaxLat
 ----------------------------------------
 NTCreateX    11471098     0.012   448.281
 Close         8426396     0.001     0.925
 Rename         485746     0.123   267.183
 Unlink        2316477     0.080    63.433
 Deltree           288     2.830    11.144
 Mkdir             144     0.003     0.010
 Qpathinfo    10397420     0.006    10.288
 Qfileinfo     1822039     0.001     0.169
 Qfsinfo       1906497     0.002    14.039
 Sfileinfo      934433     0.004     2.438
 Find          4019879     0.026    10.200
 WriteX        5718932     0.011   200.985
 ReadX        17981671     0.003    10.036
 LockX           37352     0.002     0.076
 UnlockX         37352     0.001     0.109
 Flush          804018     5.015   778.033

Throughput 1201.98 MB/sec  16 clients  16 procs  max_latency=778.036 ms
(+6.5% throughput, -29.4% max latency, -75.8% rename latency)

Test case generic/498 from fstests tests the scenario that the previously
mentioned commit fixed.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-10-07 12:06:56 +02:00
..
tests btrfs: make btrfs_set_extent_delalloc take btrfs_inode 2020-07-27 12:55:35 +02:00
acl.c
async-thread.c Btrfs: fix crash during unmount due to race with delayed inode workers 2020-03-23 17:01:51 +01:00
async-thread.h Btrfs: fix crash during unmount due to race with delayed inode workers 2020-03-23 17:01:51 +01:00
backref.c btrfs: remove unnecessarily shadowed variables 2020-10-07 12:06:55 +02:00
backref.h btrfs: rename BTRFS_ROOT_REF_COWS to BTRFS_ROOT_SHAREABLE 2020-05-25 11:25:35 +02:00
block-group.c btrfs: call btrfs_try_granting_tickets when reserving space 2020-10-07 12:06:51 +02:00
block-group.h btrfs: convert block group refcount to refcount_t 2020-07-27 12:55:42 +02:00
block-rsv.c btrfs: rename BTRFS_ROOT_REF_COWS to BTRFS_ROOT_SHAREABLE 2020-05-25 11:25:35 +02:00
block-rsv.h btrfs: Remove __ prefix from btrfs_block_rsv_release 2020-03-23 17:01:55 +01:00
btrfs_inode.h btrfs: reduce contention on log trees when logging checksums 2020-07-27 12:55:45 +02:00
check-integrity.c btrfs: check-integrity: remove unnecessary failure messages during memory allocation 2020-07-27 12:55:21 +02:00
check-integrity.h btrfs: remove btrfsic_submit_bh() 2020-03-23 17:01:39 +01:00
compression.c btrfs: compression: move declarations to header 2020-10-07 12:06:55 +02:00
compression.h btrfs: compression: move declarations to header 2020-10-07 12:06:55 +02:00
ctree.c btrfs: delete duplicated words + other fixes in comments 2020-10-07 12:06:50 +02:00
ctree.h btrfs: do async reclaim for data reservations 2020-10-07 12:06:54 +02:00
delalloc-space.c btrfs: add btrfs_reserve_data_bytes and use it 2020-10-07 12:06:52 +02:00
delalloc-space.h btrfs: make btrfs_delalloc_reserve_space take btrfs_inode 2020-07-27 12:55:36 +02:00
delayed-inode.c btrfs: use nofs allocations for running delayed items 2020-03-25 16:26:00 +01:00
delayed-inode.h btrfs: delayed-inode: Replace zero-length array with flexible-array member 2020-03-23 17:01:53 +01:00
delayed-ref.c btrfs: Remove __ prefix from btrfs_block_rsv_release 2020-03-23 17:01:55 +01:00
delayed-ref.h btrfs: migrate the delayed refs rsv code 2019-07-04 17:26:17 +02:00
dev-replace.c btrfs: change nr to u64 in btrfs_start_delalloc_roots 2020-10-07 12:06:50 +02:00
dev-replace.h btrfs: add __pure attribute to functions 2019-11-18 12:46:52 +01:00
dir-item.c
discard.c btrfs: discard: add missing put when grabbing block group from unused list 2020-07-07 16:06:28 +02:00
discard.h btrfs: discard: Use the correct style for SPDX License Identifier 2020-04-20 17:43:42 +02:00
disk-io.c btrfs: do async reclaim for data reservations 2020-10-07 12:06:54 +02:00
disk-io.h btrfs: preallocate anon block device at first phase of snapshot creation 2020-07-27 12:55:38 +02:00
export.c btrfs: simplify iget helpers 2020-05-25 11:25:37 +02:00
export.h btrfs: export helpers for subvolume name/id resolution 2020-03-23 17:01:42 +01:00
extent_io.c btrfs: delete duplicated words + other fixes in comments 2020-10-07 12:06:50 +02:00
extent_io.h btrfs: fix potential deadlock in the search ioctl 2020-08-27 13:56:27 +02:00
extent_map.c Btrfs: fix race between using extent maps and merging them 2020-02-12 17:16:46 +01:00
extent_map.h btrfs: remove extent_map::bdev 2019-11-18 23:43:44 +01:00
extent-io-tree.h btrfs: trim: fix underflow in trim length to prevent access beyond device boundary 2020-08-12 10:15:58 +02:00
extent-tree.c btrfs: call btrfs_try_granting_tickets when unpinning anything 2020-10-07 12:06:51 +02:00
file-item.c btrfs: make btrfs_csum_one_bio takae btrfs_inode 2020-07-27 12:55:26 +02:00
file.c btrfs: cleanup calculation of lockend in lock_and_cleanup_extent_if_need() 2020-10-07 12:06:54 +02:00
free-space-cache.c btrfs: delete duplicated words + other fixes in comments 2020-10-07 12:06:50 +02:00
free-space-cache.h btrfs: let btrfs_return_cluster_to_free_space() return void 2020-07-27 12:55:21 +02:00
free-space-tree.c btrfs: block-group: fix free-space bitmap threshold 2020-08-27 13:37:54 +02:00
free-space-tree.h btrfs: rename btrfs_block_group_cache 2019-11-18 17:51:51 +01:00
inode-item.c btrfs: Make btrfs_find_name_in_ext_backref return struct btrfs_inode_extref 2019-09-09 14:59:16 +02:00
inode-map.c btrfs: make btrfs_delalloc_reserve_space take btrfs_inode 2020-07-27 12:55:36 +02:00
inode-map.h
inode.c btrfs: do not commit logs and transactions during link and rename operations 2020-10-07 12:06:56 +02:00
ioctl.c btrfs: change nr to u64 in btrfs_start_delalloc_roots 2020-10-07 12:06:50 +02:00
Kconfig Revert "btrfs: switch to iomap_dio_rw() for dio" 2020-06-14 01:19:02 +02:00
locking.c btrfs: add missing annotation for btrfs_tree_lock() 2020-05-25 11:25:16 +02:00
locking.h btrfs: Implement DREW lock 2020-03-23 17:01:43 +01:00
lzo.c btrfs: compression: inline free_workspace 2019-11-18 12:46:59 +01:00
Makefile Btrfs: move all reflink implementation code into its own file 2020-03-23 17:01:54 +01:00
misc.h btrfs: rename tree_entry to rb_simple_node and export it 2020-05-25 11:25:19 +02:00
ordered-data.c btrfs: make btrfs_add_ordered_extent_dio take btrfs_inode 2020-07-27 12:55:34 +02:00
ordered-data.h btrfs: make btrfs_add_ordered_extent_dio take btrfs_inode 2020-07-27 12:55:34 +02:00
orphan.c
print-tree.c btrfs: require only sector size alignment for parent eb bytenr 2020-09-07 14:51:05 +02:00
print-tree.h
props.c btrfs: simplify iget helpers 2020-05-25 11:25:37 +02:00
props.h
qgroup.c btrfs: delete duplicated words + other fixes in comments 2020-10-07 12:06:50 +02:00
qgroup.h btrfs: qgroup: export qgroups in sysfs 2020-07-27 12:55:37 +02:00
raid56.c btrfs: raid56: remove out label in __raid56_parity_recover 2020-07-27 12:55:44 +02:00
raid56.h btrfs: constify map parameter for nr_parity_stripes and nr_data_stripes 2019-07-01 13:34:58 +02:00
rcu-string.h btrfs: rcu-string: Replace zero-length array with flexible-array member 2020-03-23 17:01:53 +01:00
reada.c btrfs: rename btrfs_block_group_cache 2019-11-18 17:51:51 +01:00
ref-verify.c btrfs: ref-verify: fix memory leak in add_block_entry 2020-07-27 12:55:43 +02:00
ref-verify.h
reflink.c btrfs: reduce contention on log trees when logging checksums 2020-07-27 12:55:45 +02:00
reflink.h Btrfs: move all reflink implementation code into its own file 2020-03-23 17:01:54 +01:00
relocation.c btrfs: relocation: review the call sites which can be interrupted by signal 2020-07-27 12:55:45 +02:00
root-tree.c btrfs: simplify root lookup by id 2020-05-25 11:25:36 +02:00
scrub.c btrfs: scrub: rename ratelimit state varaible to avoid shadowing 2020-10-07 12:06:55 +02:00
send.c btrfs: send: remove indirect callback parameter for changed_cb 2020-10-07 12:06:55 +02:00
send.h
space-info.c btrfs: fix possible infinite loop in data async reclaim 2020-10-07 12:06:54 +02:00
space-info.h btrfs: add btrfs_reserve_data_bytes and use it 2020-10-07 12:06:52 +02:00
struct-funcs.c btrfs: update documentation of set/get helpers 2020-05-25 11:25:35 +02:00
super.c btrfs: do async reclaim for data reservations 2020-10-07 12:06:54 +02:00
sysfs.c btrfs: remove const from btrfs_feature_set_name 2020-10-07 12:06:55 +02:00
sysfs.h btrfs: remove const from btrfs_feature_set_name 2020-10-07 12:06:55 +02:00
transaction.c btrfs: fix NULL pointer dereference after failure to create snapshot 2020-09-07 21:18:35 +02:00
transaction.h btrfs: qgroup: remove ASYNC_COMMIT mechanism in favor of reserve retry-after-EDQUOT 2020-07-27 12:55:43 +02:00
tree-checker.c btrfs: tree-checker: fix the error message for transid error 2020-08-27 14:16:05 +02:00
tree-checker.h
tree-defrag.c btrfs: remove unused btrfs_root::defrag_trans_start 2020-07-27 12:55:28 +02:00
tree-log.c btrfs: do not commit logs and transactions during link and rename operations 2020-10-07 12:06:56 +02:00
tree-log.h btrfs: do not commit logs and transactions during link and rename operations 2020-10-07 12:06:56 +02:00
ulist.c
ulist.h
uuid-tree.c btrfs: simplify root lookup by id 2020-05-25 11:25:36 +02:00
volumes.c btrfs: remove fsid argument from btrfs_sysfs_update_sprout_fsid 2020-10-07 12:06:50 +02:00
volumes.h btrfs: move btrfs_scratch_superblocks into btrfs_dev_replace_finishing 2020-09-25 16:40:22 +02:00
xattr.c Btrfs: fix failure to persist compression property xattr deletion on fsync 2019-06-17 16:37:17 +02:00
xattr.h
zlib.c btrfs: use larger zlib buffer for s390 hardware compression 2020-01-31 10:30:40 -08:00
zstd.c btrfs: compression: inline free_workspace 2019-11-18 12:46:59 +01:00