kernel_optimize_test/drivers/md
Kiyoshi Ueda 3f77316de0 dm: separate device deletion from dm_put
This patch separates the device deletion code from dm_put()
to make sure the deletion happens in the process context.

By this patch, device deletion always occurs in an ioctl (process)
context and dm_put() can be called in interrupt context.
As a result, the request-based dm's bad dm_put() usage pointed out
by Mikulas below disappears.
    http://marc.info/?l=dm-devel&m=126699981019735&w=2

Without this patch, I confirmed there is a case to crash the system:
    dm_put() => dm_table_destroy() => vfree() => BUG_ON(in_interrupt())

Some more backgrounds and details:
In request-based dm, a device opener can remove a mapped_device
while the last request is still completing, because bios in the last
request complete first and then the device opener can close and remove
the mapped_device before the last request completes:
  CPU0                                          CPU1
  =================================================================
  <<INTERRUPT>>
  blk_end_request_all(clone_rq)
    blk_update_request(clone_rq)
      bio_endio(clone_bio) == end_clone_bio
        blk_update_request(orig_rq)
          bio_endio(orig_bio)
                                                <<I/O completed>>
                                                dm_blk_close()
                                                dev_remove()
                                                  dm_put(md)
                                                    <<Free md>>
   blk_finish_request(clone_rq)
     ....
     dm_end_request(clone_rq)
       free_rq_clone(clone_rq)
       blk_end_request_all(orig_rq)
       rq_completed(md)

So request-based dm used dm_get()/dm_put() to hold md for each I/O
until its request completion handling is fully done.
However, the final dm_put() can call the device deletion code which
must not be run in interrupt context and may cause kernel panic.

To solve the problem, this patch moves the device deletion code,
dm_destroy(), to predetermined places that is actually deleting
the mapped_device in ioctl (process) context, and changes dm_put()
just to decrement the reference count of the mapped_device.
By this change, dm_put() can be used in any context and the symmetric
model below is introduced:
    dm_create():  create a mapped_device
    dm_destroy(): destroy a mapped_device
    dm_get():     increment the reference count of a mapped_device
    dm_put():     decrement the reference count of a mapped_device

dm_destroy() waits for all references of the mapped_device to disappear,
then deletes the mapped_device.

dm_destroy() uses active waiting with msleep(1), since deleting
the mapped_device isn't performance-critical task.
And since at this point, nobody opens the mapped_device and no new
reference will be taken, the pending counts are just for racing
completing activity and will eventually decrease to zero.

For the unlikely case of the forced module unload, dm_destroy_immediate(),
which doesn't wait and forcibly deletes the mapped_device, is also
introduced and used in dm_hash_remove_all().  Otherwise, "rmmod -f"
may be stuck and never return.
And now, because the mapped_device is deleted at this point, subsequent
accesses to the mapped_device may cause NULL pointer references.

Cc: stable@kernel.org
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-08-12 04:13:56 +01:00
..
.gitignore
bitmap.c md/bitmap: separate out loading a bitmap from initialising the structures. 2010-07-26 13:21:34 +10:00
bitmap.h md/bitmap: separate out loading a bitmap from initialising the structures. 2010-07-26 13:21:34 +10:00
dm-bio-record.h
dm-crypt.c dm table: remove unused dm_get_device range parameters 2010-03-06 02:32:27 +00:00
dm-delay.c dm table: remove unused dm_get_device range parameters 2010-03-06 02:32:27 +00:00
dm-exception-store.c dm snapshot: test chunk size against both origin and snapshot 2010-08-12 04:13:51 +01:00
dm-exception-store.h dm snapshot: test chunk size against both origin and snapshot 2010-08-12 04:13:51 +01:00
dm-io.c block: unify flags for struct bio and struct request 2010-08-07 18:20:39 +02:00
dm-ioctl.c dm: separate device deletion from dm_put 2010-08-12 04:13:56 +01:00
dm-kcopyd.c block: unify flags for struct bio and struct request 2010-08-07 18:20:39 +02:00
dm-linear.c dm table: remove unused dm_get_device range parameters 2010-03-06 02:32:27 +00:00
dm-log-userspace-base.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
dm-log-userspace-transfer.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
dm-log-userspace-transfer.h
dm-log.c dm table: remove unused dm_get_device range parameters 2010-03-06 02:32:27 +00:00
dm-mpath.c dm mpath: fix NULL pointer dereference when path parameters missing 2010-08-12 04:13:49 +01:00
dm-mpath.h
dm-path-selector.c
dm-path-selector.h
dm-queue-length.c
dm-raid1.c block: unify flags for struct bio and struct request 2010-08-07 18:20:39 +02:00
dm-region-hash.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
dm-round-robin.c
dm-service-time.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
dm-snap-persistent.c dm snapshot: persistent annotate work_queue as on stack 2010-02-16 18:42:51 +00:00
dm-snap-transient.c dm snapshot: move cow ref from exception store to snap core 2009-12-10 23:52:12 +00:00
dm-snap.c dm snapshot: test chunk size against both origin and snapshot 2010-08-12 04:13:51 +01:00
dm-stripe.c block: unify flags for struct bio and struct request 2010-08-07 18:20:39 +02:00
dm-sysfs.c Driver core: Constify struct sysfs_ops in struct kobj_type 2010-03-07 17:04:49 -08:00
dm-table.c dm table: remove unused dm_get_device range parameters 2010-03-06 02:32:27 +00:00
dm-target.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
dm-uevent.c dm table: remove dm_get from dm_table_get_md 2010-03-06 02:29:52 +00:00
dm-uevent.h
dm-zero.c
dm.c dm: separate device deletion from dm_put 2010-08-12 04:13:56 +01:00
dm.h dm: separate device deletion from dm_put 2010-08-12 04:13:56 +01:00
faulty.c Merge commit '3ff195b011d7decf501a4d55aeed312731094796' into for-linus 2010-05-22 08:31:36 +10:00
Kconfig Merge branch 'async' of macbook:git/btrfs-unstable 2010-08-09 10:36:44 +01:00
linear.c block: unify flags for struct bio and struct request 2010-08-07 18:20:39 +02:00
linear.h
Makefile Merge branch 'async' of macbook:git/btrfs-unstable 2010-08-09 10:36:44 +01:00
md.c Merge branch 'for-linus' of git://neil.brown.name/md 2010-08-10 15:38:19 -07:00
md.h Merge branch 'for-linus' of git://neil.brown.name/md 2010-08-10 15:38:19 -07:00
multipath.c block: unify flags for struct bio and struct request 2010-08-07 18:20:39 +02:00
multipath.h
raid1.c block: unify flags for struct bio and struct request 2010-08-07 18:20:39 +02:00
raid1.h md/raid1: add takeover support for raid5->raid1 2009-12-14 12:51:41 +11:00
raid5.c Merge branch 'for-linus' of git://neil.brown.name/md 2010-08-10 15:38:19 -07:00
raid5.h md/raid5: export raid5 unplugging interface. 2010-07-26 12:53:10 +10:00
raid10.c Merge branch 'for-linus' of git://neil.brown.name/md 2010-08-10 15:38:19 -07:00
raid10.h md: fix handling of array level takeover that re-arranges devices. 2010-06-24 13:33:24 +10:00
raid0.c block: unify flags for struct bio and struct request 2010-08-07 18:20:39 +02:00
raid0.h md: fix handling of array level takeover that re-arranges devices. 2010-06-24 13:33:24 +10:00