kernel_optimize_test/fs/btrfs
Filipe Manana f7fa1107f3 Btrfs: fix race between cloning range ending at eof and writeback
The recent rework that makes btrfs' remap_file_range operation use the
generic helper generic_remap_file_range_prep() introduced a race between
writeback and cloning a range that covers the eof extent of the source
file into a destination offset that is greater then the same file's size.

This happens because we now wait for writeback to complete before doing
the truncation of the eof block, while previously we did the truncation
and then waited for writeback to complete. This leads to a race between
writeback of the truncated block and cloning the file extents in the
source range, because we copy each file extent item we find in the fs
root into a buffer, then release the path and then increment the reference
count for the extent referred in that file extent item we copied, which
can no longer exist if writeback of the truncated eof block completes
after we copied the file extent item into the buffer and before we
incremented the reference count. This is illustrated by the following
diagram:

        CPU 1                                       CPU 2

  btrfs_clone_files()
    btrfs_cont_expand()
      btrfs_truncate_block()
         --> zeroes part of the
             page containg eof,
             marking it for
            delalloc

    btrfs_clone()
      --> finds extent item
          covering eof,
          points to extent
          at bytenr X
      --> copies it into a
          local buffer
      --> releases path

                                        writeback starts

                                        btrfs_finish_ordered_io()
                                          insert_reserved_file_extent()
                                            __btrfs_drop_extents()
                                              --> creates delayed
                                                  reference to drop
                                                  the extent at
                                                  bytenr X

      --> starts transaction
      --> creates delayed
          reference to
          increment extent
          at bytenr X

                    <delayed references are run, due to a transaction
                     commit for example, and the transaction is aborted
                     with -EIO because we attempt to increment reference
                     count for the extent at bytenr X after we freed it>

When this race is hit the running transaction ends up getting aborted with
an -EIO error and a trace like the following is produced:

[ 4382.553858] WARNING: CPU: 2 PID: 3648 at fs/btrfs/extent-tree.c:1552 lookup_inline_extent_backref+0x4f4/0x650 [btrfs]
(...)
[ 4382.556293] CPU: 2 PID: 3648 Comm: btrfs Tainted: G        W         4.20.0-rc6-btrfs-next-41 #1
[ 4382.556294] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014
[ 4382.556308] RIP: 0010:lookup_inline_extent_backref+0x4f4/0x650 [btrfs]
(...)
[ 4382.556310] RSP: 0018:ffffac784408f738 EFLAGS: 00010202
[ 4382.556311] RAX: 0000000000000001 RBX: ffff8980673c3a48 RCX: 0000000000000001
[ 4382.556312] RDX: 0000000000000008 RSI: 0000000000000000 RDI: 0000000000000000
[ 4382.556312] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001
[ 4382.556313] R10: 0000000000000001 R11: ffff897f40000000 R12: 0000000000001000
[ 4382.556313] R13: 00000000c224f000 R14: ffff89805de9bd40 R15: ffff8980453f4548
[ 4382.556315] FS:  00007f5e759178c0(0000) GS:ffff89807b300000(0000) knlGS:0000000000000000
[ 4382.563130] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4382.563562] CR2: 00007f2e9789fcbc CR3: 0000000120512001 CR4: 00000000003606e0
[ 4382.564005] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 4382.564451] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 4382.564887] Call Trace:
[ 4382.565343]  insert_inline_extent_backref+0x55/0xe0 [btrfs]
[ 4382.565796]  __btrfs_inc_extent_ref.isra.60+0x88/0x260 [btrfs]
[ 4382.566249]  ? __btrfs_run_delayed_refs+0x93/0x1650 [btrfs]
[ 4382.566702]  __btrfs_run_delayed_refs+0xa22/0x1650 [btrfs]
[ 4382.567162]  btrfs_run_delayed_refs+0x7e/0x1d0 [btrfs]
[ 4382.567623]  btrfs_commit_transaction+0x50/0x9c0 [btrfs]
[ 4382.568112]  ? _raw_spin_unlock+0x24/0x30
[ 4382.568557]  ? block_rsv_release_bytes+0x14e/0x410 [btrfs]
[ 4382.569006]  create_subvol+0x3c8/0x830 [btrfs]
[ 4382.569461]  ? btrfs_mksubvol+0x317/0x600 [btrfs]
[ 4382.569906]  btrfs_mksubvol+0x317/0x600 [btrfs]
[ 4382.570383]  ? rcu_sync_lockdep_assert+0xe/0x60
[ 4382.570822]  ? __sb_start_write+0xd4/0x1c0
[ 4382.571262]  ? mnt_want_write_file+0x24/0x50
[ 4382.571712]  btrfs_ioctl_snap_create_transid+0x117/0x1a0 [btrfs]
[ 4382.572155]  ? _copy_from_user+0x66/0x90
[ 4382.572602]  btrfs_ioctl_snap_create+0x66/0x80 [btrfs]
[ 4382.573052]  btrfs_ioctl+0x7c1/0x30e0 [btrfs]
[ 4382.573502]  ? mem_cgroup_commit_charge+0x8b/0x570
[ 4382.573946]  ? do_raw_spin_unlock+0x49/0xc0
[ 4382.574379]  ? _raw_spin_unlock+0x24/0x30
[ 4382.574803]  ? __handle_mm_fault+0xf29/0x12d0
[ 4382.575215]  ? do_vfs_ioctl+0xa2/0x6f0
[ 4382.575622]  ? btrfs_ioctl_get_supported_features+0x30/0x30 [btrfs]
[ 4382.576020]  do_vfs_ioctl+0xa2/0x6f0
[ 4382.576405]  ksys_ioctl+0x70/0x80
[ 4382.576776]  __x64_sys_ioctl+0x16/0x20
[ 4382.577137]  do_syscall_64+0x60/0x1b0
[ 4382.577488]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
(...)
[ 4382.578837] RSP: 002b:00007ffe04bf64c8 EFLAGS: 00000202 ORIG_RAX: 0000000000000010
[ 4382.579174] RAX: ffffffffffffffda RBX: 00005564136f3050 RCX: 00007f5e74724dd7
[ 4382.579505] RDX: 00007ffe04bf64d0 RSI: 000000005000940e RDI: 0000000000000003
[ 4382.579848] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000044
[ 4382.580164] R10: 0000000000000541 R11: 0000000000000202 R12: 00005564136f3010
[ 4382.580477] R13: 0000000000000003 R14: 00005564136f3035 R15: 00005564136f3050
[ 4382.580792] irq event stamp: 0
[ 4382.581106] hardirqs last  enabled at (0): [<0000000000000000>]           (null)
[ 4382.581441] hardirqs last disabled at (0): [<ffffffff8d085842>] copy_process.part.32+0x6e2/0x2320
[ 4382.581772] softirqs last  enabled at (0): [<ffffffff8d085842>] copy_process.part.32+0x6e2/0x2320
[ 4382.582095] softirqs last disabled at (0): [<0000000000000000>]           (null)
[ 4382.582413] ---[ end trace d3c188e3e9367382 ]---
[ 4382.623855] BTRFS: error (device sdc) in btrfs_run_delayed_refs:2981: errno=-5 IO failure
[ 4382.624295] BTRFS info (device sdc): forced readonly

Fix this by waiting for writeback to complete after truncating the eof
block.

Fixes: 34a28e3d77 ("Btrfs: use generic_remap_file_range_prep() for cloning and deduplication")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-01-09 14:52:22 +01:00
..
tests btrfs: remove always true if branch in find_delalloc_range 2018-12-17 14:51:44 +01:00
acl.c btrfs: remove unnecessary curly braces in btrfs_get_acl 2018-08-06 13:12:41 +02:00
async-thread.c
async-thread.h
backref.c btrfs: Fix typos in comments and strings 2018-12-17 14:51:50 +01:00
backref.h
btrfs_inode.h Btrfs: fix fsync of files with multiple hard links in new directories 2018-12-17 14:51:43 +01:00
check-integrity.c btrfs: Fix typos in comments and strings 2018-12-17 14:51:50 +01:00
check-integrity.h
compression.c btrfs: Fix typos in comments and strings 2018-12-17 14:51:50 +01:00
compression.h
ctree.c btrfs: Fix typos in comments and strings 2018-12-17 14:51:50 +01:00
ctree.h btrfs: catch cow on deleting snapshots 2018-12-17 14:51:48 +01:00
dedupe.h
delayed-inode.c Btrfs: kill btrfs_clear_path_blocking 2018-10-15 17:23:38 +02:00
delayed-inode.h Btrfs: delayed-inode: use rb_first_cached for ins_root and del_root 2018-10-15 17:23:33 +02:00
delayed-ref.c btrfs: introduce delayed_refs_rsv 2018-12-17 14:51:46 +01:00
delayed-ref.h btrfs: add btrfs_delete_ref_head helper 2018-12-17 14:51:46 +01:00
dev-replace.c btrfs: Fix typos in comments and strings 2018-12-17 14:51:50 +01:00
dev-replace.h btrfs: dev-replace: open code trivial locking helpers 2018-12-17 14:51:45 +01:00
dir-item.c btrfs: Remove root parameter from btrfs_insert_dir_item 2018-10-15 17:23:25 +02:00
disk-io.c btrfs: Fix typos in comments and strings 2018-12-17 14:51:50 +01:00
disk-io.h btrfs: drop extra enum initialization where using defaults 2018-12-17 14:51:43 +01:00
export.c btrfs: Remove 'objectid' member from struct btrfs_root 2018-10-15 17:23:25 +02:00
export.h
extent_io.c btrfs: Fix typos in comments and strings 2018-12-17 14:51:50 +01:00
extent_io.h btrfs: Fix typos in comments and strings 2018-12-17 14:51:50 +01:00
extent_map.c btrfs: Fix typos in comments and strings 2018-12-17 14:51:50 +01:00
extent_map.h btrfs: switch EXTENT_FLAG_* to enums 2018-12-17 14:51:43 +01:00
extent-tree.c btrfs: Fix typos in comments and strings 2018-12-17 14:51:50 +01:00
file-item.c btrfs: replace btrfs_io_bio::end_io with a simple helper 2018-12-17 14:51:40 +01:00
file.c btrfs: Fix typos in comments and strings 2018-12-17 14:51:50 +01:00
free-space-cache.c Btrfs: fix deadlock on tree root leaf when finding free extent 2018-11-06 16:42:32 +01:00
free-space-cache.h
free-space-tree.c btrfs: use EXPORT_FOR_TESTS for conditionally exported functions 2018-12-17 14:51:37 +01:00
free-space-tree.h
inode-item.c
inode-map.c btrfs: prune unused includes 2018-08-06 13:12:43 +02:00
inode-map.h
inode.c btrfs: Fix typos in comments and strings 2018-12-17 14:51:50 +01:00
ioctl.c Btrfs: fix race between cloning range ending at eof and writeback 2019-01-09 14:52:22 +01:00
Kconfig
locking.c
locking.h
lzo.c btrfs: Fix typos in comments and strings 2018-12-17 14:51:50 +01:00
Makefile
math.h
ordered-data.c Btrfs: remove no longer used stuff for tracking pending ordered extents 2018-12-17 14:51:25 +01:00
ordered-data.h btrfs: switch BTRFS_ORDERED_* to enums 2018-12-17 14:51:43 +01:00
orphan.c
print-tree.c btrfs: annotate unlikely branches after V0 extent type removal 2018-08-06 13:12:41 +02:00
print-tree.h
props.c
props.h
qgroup.c btrfs: Fix typos in comments and strings 2018-12-17 14:51:50 +01:00
qgroup.h btrfs: Fix typos in comments and strings 2018-12-17 14:51:50 +01:00
raid56.c btrfs: Fix typos in comments and strings 2018-12-17 14:51:50 +01:00
raid56.h
rcu-string.h
reada.c btrfs: dev-replace: open code trivial locking helpers 2018-12-17 14:51:45 +01:00
ref-verify.c btrfs: Fix typos in comments and strings 2018-12-17 14:51:50 +01:00
ref-verify.h
relocation.c btrfs: Fix typos in comments and strings 2018-12-17 14:51:50 +01:00
root-tree.c btrfs: Remove fs_info from btrfs_add_root_ref 2018-08-06 13:13:00 +02:00
scrub.c btrfs: Fix typos in comments and strings 2018-12-17 14:51:50 +01:00
send.c btrfs: Fix typos in comments and strings 2018-12-17 14:51:50 +01:00
send.h
struct-funcs.c btrfs: prune unused includes 2018-08-06 13:12:43 +02:00
super.c btrfs: Fix typos in comments and strings 2018-12-17 14:51:50 +01:00
sysfs.c btrfs: Add sysfs support for metadata_uuid feature 2018-12-17 14:51:37 +01:00
sysfs.h btrfs: drop extra enum initialization where using defaults 2018-12-17 14:51:43 +01:00
transaction.c btrfs: Fix typos in comments and strings 2018-12-17 14:51:50 +01:00
transaction.h btrfs: drop extra enum initialization where using defaults 2018-12-17 14:51:43 +01:00
tree-checker.c btrfs: Fix typos in comments and strings 2018-12-17 14:51:50 +01:00
tree-checker.h
tree-defrag.c
tree-log.c btrfs: Fix typos in comments and strings 2018-12-17 14:51:50 +01:00
tree-log.h Btrfs: remove no longer used io_err from btrfs_log_ctx 2018-12-17 14:51:31 +01:00
ulist.c
ulist.h
uuid-tree.c
volumes.c btrfs: Fix typos in comments and strings 2018-12-17 14:51:50 +01:00
volumes.h btrfs: remove btrfs_bio_end_io_t 2018-12-17 14:51:40 +01:00
xattr.c Btrfs: use nofs context when initializing security xattrs to avoid deadlock 2018-12-17 14:51:49 +01:00
xattr.h
zlib.c
zstd.c