kernel_optimize_test

Author	SHA1	Message	Date
Eric W. Biederman	b592fcfe7f	sysfs: Shadow directory support The problem. When implementing a network namespace I need to be able to have multiple network devices with the same name. Currently this is a problem for /sys/class/net/*. What I want is a separate /sys/class/net directory in sysfs for each network namespace, and I want to name each of them /sys/class/net. I looked and the VFS actually allows that. All that is needed is for /sys/class/net to implement a follow link method to redirect lookups to the real directory you want. Implementing a follow link method that is sensitive to the current network namespace turns out to be 3 lines of code so it looks like a clean approach. Modifying sysfs so it doesn't get in my was is a bit trickier. I am calling the concept of multiple directories all at the same path in the filesystem shadow directories. With the directory entry really at that location the shadow master. The following patch modifies sysfs so it can handle a directory structure slightly different from the kobject tree so I can implement the shadow directories for handling /sys/class/net/. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Maneesh Soni <maneesh@in.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-02-07 10:37:14 -08:00
Oliver Neukum	82244b169e	sysfs: error handling in sysfs, fill_read_buffer() if a driver returns an error in fill_read_buffer(), the buffer will be marked as filled. Subsequent reads will return eof. But there is no data because of an error, not because it has been read. Not marking the buffer filled is the obvious fix. Signed-off-by: Oliver Neukum <oliver@neukum.name> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-02-07 10:37:13 -08:00
Mariusz Kozlowski	f750653670	sysfs: kobject_put cleanup This patch removes redundant argument checks for kobject_put(). Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-02-07 10:37:13 -08:00
Frederik Deweerdt	d3fc373ac5	sysfs: suppress lockdep warnings Lockdep issues the following warning: [ 9.064000] ============================================= [ 9.064000] [ INFO: possible recursive locking detected ] [ 9.064000] 2.6.20-rc3-mm1 #3 [ 9.064000] --------------------------------------------- [ 9.064000] init/1 is trying to acquire lock: [ 9.064000] (&sysfs_inode_imutex_key){--..}, at: [<c03e6afc>] mutex_lock+0x1c/0x1f [ 9.064000] [ 9.064000] but task is already holding lock: [ 9.064000] (&sysfs_inode_imutex_key){--..}, at: [<c03e6afc>] mutex_lock+0x1c/0x1f [ 9.065000] [ 9.065000] other info that might help us debug this: [ 9.065000] 2 locks held by init/1: [ 9.065000] #0: (tty_mutex){--..}, at: [<c03e6afc>] mutex_lock+0x1c/0x1f [ 9.065000] #1: (&sysfs_inode_imutex_key){--..}, at: [<c03e6afc>] mutex_lock+0x1c/0x1f [ 9.065000] [ 9.065000] stack backtrace: [ 9.065000] [<c010390d>] show_trace_log_lvl+0x1a/0x30 [ 9.066000] [<c0103935>] show_trace+0x12/0x14 [ 9.066000] [<c0103a2f>] dump_stack+0x16/0x18 [ 9.066000] [<c0138cb8>] print_deadlock_bug+0xb9/0xc3 [ 9.066000] [<c0138d17>] check_deadlock+0x55/0x5a [ 9.066000] [<c013a953>] __lock_acquire+0x371/0xbf0 [ 9.066000] [<c013b7a9>] lock_acquire+0x69/0x83 [ 9.066000] [<c03e6b7e>] __mutex_lock_slowpath+0x75/0x2d1 [ 9.066000] [<c03e6afc>] mutex_lock+0x1c/0x1f [ 9.066000] [<c01b249c>] sysfs_drop_dentry+0xb1/0x133 [ 9.066000] [<c01b25d1>] sysfs_hash_and_remove+0xb3/0x142 [ 9.066000] [<c01b30ed>] sysfs_remove_file+0xd/0x10 [ 9.067000] [<c02849e0>] device_remove_file+0x23/0x2e [ 9.067000] [<c02850b2>] device_del+0x188/0x1e6 [ 9.067000] [<c028511b>] device_unregister+0xb/0x15 [ 9.067000] [<c0285318>] device_destroy+0x9c/0xa9 [ 9.067000] [<c0261431>] vcs_remove_sysfs+0x1c/0x3b [ 9.067000] [<c0267a08>] con_close+0x5e/0x6b [ 9.067000] [<c02598f2>] release_dev+0x4c4/0x6e5 [ 9.067000] [<c0259faa>] tty_release+0x12/0x1c [ 9.067000] [<c0174872>] __fput+0x177/0x1a0 [ 9.067000] [<c01746f5>] fput+0x3b/0x41 [ 9.068000] [<c0172ee1>] filp_close+0x36/0x65 [ 9.068000] [<c0172f73>] sys_close+0x63/0xa4 [ 9.068000] [<c0102a96>] sysenter_past_esp+0x5f/0x99 [ 9.068000] ======================= This is due to sysfs_hash_and_remove() holding dir->d_inode->i_mutex before calling sysfs_drop_dentry() which calls orphan_all_buffers() which in turn takes node->i_mutex. Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com> Cc: Oliver Neukum <oliver@neukum.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-02-07 10:37:13 -08:00
Oliver Neukum	94bebf4d1b	Driver core: fix race in sysfs between sysfs_remove_file() and read()/write() This patch prevents a race between IO and removing a file from sysfs. It introduces a list of sysfs_buffers associated with a file at the inode. Upon removal of a file the list is walked and the buffers marked orphaned. IO to orphaned buffers fails with -ENODEV. The driver can safely free associated data structures or be unloaded. Signed-off-by: Oliver Neukum <oliver@neukum.name> Acked-by: Maneesh Soni <maneesh@in.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-02-07 10:37:13 -08:00
Cornelia Huck	c744aeae9d	driver core: Allow device_move(dev, NULL). If we allow NULL as the new parent in device_move(), we need to make sure that the device is placed into the same place as it would if it was newly registered: - Consider the device virtual tree. In order to be able to reuse code, setup_parent() has been tweaked a bit. - kobject_move() can fall back to the kset's kobject. - sysfs_move_dir() uses the sysfs root dir as fallback. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Cc: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-02-07 10:37:11 -08:00
Linus Torvalds	5331be0905	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6: JFS: Remove incorrect kgdb define JFS: call io_schedule() instead of schedule() to avoid deadlock JFS: Add lockdep annotations JFS: Avoid BUG() on a damaged file system	2007-02-07 08:10:48 -08:00
Linus Torvalds	d3f8fd765e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (57 commits) [GFS2] make gfs2_writepages() static [GFS2] Unlock page on prepare_write try lock failure [GFS2] nfsd readdirplus assertion failure [DLM] fix softlockup in dlm_recv [DLM] zero new user lvbs [DLM/GFS2] indent help text [GFS2] Fix unlink deadlocks [GFS2] Put back semaphore to avoid umount problem [GFS2] more CURRENT_TIME_SEC [GFS2/DLM] fix GFS2 circular dependency [GFS2/DLM] use sysfs [GFS2] make lock_dlm drop_count tunable in sysfs [GFS2] increase default lock limit [GFS2] Fix list corruption in lops.c [GFS2] Fix recursive locking attempt with NFS [DLM] can miss clearing resend flag [DLM] saved dlm message can be dropped [DLM] Make sock_sem into a mutex [GFS2] Fix typo in glock.c [GFS2] use CURRENT_TIME_SEC instead of get_seconds in gfs2 ...	2007-02-07 08:09:00 -08:00
Adrian Bunk	a2cf822274	[GFS2] make gfs2_writepages() static On Mon, Jan 29, 2007 at 08:45:28PM -0800, Andrew Morton wrote: >... > Changes since 2.6.20-rc6-mm2: >... > git-gfs2-nmw.patch >... > git trees >... This patch makes the needlessly global gfs2_writepages() static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-07 10:48:48 -05:00
Steven Whitehouse	2d72e7101c	[GFS2] Unlock page on prepare_write try lock failure When the try lock of the glock failed in prepare_write we were incorrectly exiting this function with the page still locked. This was resulting in further I/O to this page hanging. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-07 10:25:59 -05:00
Linus Torvalds	dda2ac15d2	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: ocfs2: ocfs2_link() journal credits update	2007-02-06 14:59:27 -08:00
Linus Torvalds	768c242b30	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: [CIFS] Minor cleanup [CIFS] Missing free in error path [CIFS] Reduce cifs stack space usage [CIFS] lseek polling returned stale EOF	2007-02-06 14:42:20 -08:00
Steve French	a850790f6c	[CIFS] Minor cleanup Missing tab. Missing entry in changelog Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-02-06 20:43:30 +00:00
Wendy Cheng	549ae0ac3d	[GFS2] nfsd readdirplus assertion failure Glock assertion failure found in '07 NFS connectathon. One of the NFSDs is doing a "readdirplus" procedure call. It passes the logic into gfs2_readdir() where it obtains its directory inode glock. This is then followed by filehandle construction that invokes lookup code. It hits the assertion failure while trying to obtain the inode glock again inside gfs2_drevalidate(). This patch bypasses the recursive glock call if caller already holds the lock. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-06 11:36:01 -05:00
Patrick Caulfield	a34fbc6363	[DLM] fix softlockup in dlm_recv This patch stops the dlm_recv workqueue from busy-waiting when a node disconnects. This can cause soft lockup errors on debug systems and bad performance generally. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:38:27 -05:00
David Teigland	62a0f62369	[DLM] zero new user lvbs A new lvb for a userland lock wasn't being initialized to zero. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:38:24 -05:00
Randy Dunlap	9beeb9f3c5	[DLM/GFS2] indent help text Indent help text as expected. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:38:20 -05:00
Russell Cattelan	ddee76089c	[GFS2] Fix unlink deadlocks Move the glock acquisition to outside of the transactions. Lock odering must be preserved in order to prevent ABBA deadlocks. The current gfs2_change_nlink code would tries to grab the glock after having started a transaction and thus is holding the log lock. This is inconsistent with other code paths in gfs that grab the resource group glock prior to staring a tranactions. One problem with this fix is that the resource group lock is always grabbed now even if the inode still has ref count and can not be marked for unlink. Signed-off-by: Russell Cattelan <cattelan@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:38:17 -05:00
Steven Whitehouse	61be084efc	[GFS2] Put back semaphore to avoid umount problem Dave Teigland fixed this bug a while back, but I managed to mistakenly remove the semaphore during later development. It is required to avoid the list of inodes changing during an invalidate_inodes call. I have made it an rwsem since the read side will be taken frequently during normal filesystem operation. The write site will only happen during umount of the file system. Also the bug only triggers when using the DLM lock manager and only then under certain conditions as its timing related. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: David Teigland <teigland@redhat.com>	2007-02-05 13:38:14 -05:00
Eric Sandeen	bbb28ab759	[GFS2] more CURRENT_TIME_SEC Whoops, quilt user error, missed this one in the previous patch. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:38:11 -05:00
Adrian Bunk	0011727785	[GFS2/DLM] fix GFS2 circular dependency On Sun, Jan 28, 2007 at 11:08:18AM +0100, Jiri Slaby wrote: > Andrew Morton napsal(a): > >Temporarily at > > > > http://userweb.kernel.org/~akpm/2.6.20-rc6-mm1/ > > Unable to select IPV6. Menuconfig doesn't offer it when INET is selected. > When it's not it appears in the menu, but after state change it gets away. > The same behaviour in xconfig, gconfig. > > $ mkdir ../a/tst > $ make O=../a/tst menuconfig > HOSTCC scripts/basic/fixdep > [...] > HOSTLD scripts/kconfig/mconf > scripts/kconfig/mconf arch/i386/Kconfig > Warning! Found recursive dependency: INET GFS2_FS_LOCKING_DLM SYSFS > OCFS2_FS INET > > Maybe this is the problem? Yes, patch below. > regards, cu Adrian <-- snip --> This patch fixes a circular dependency by letting GFS2_FS_LOCKING_DLM and DLM depend on instead of select SYSFS. Since SYSFS depends on EMBEDDED this change shouldn't cause any problems for users. Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:38:08 -05:00
Randy Dunlap	67f55897ee	[GFS2/DLM] use sysfs With CONFIG_DLM=m, CONFIG_PROC_FS=n, and CONFIG_SYSFS=n, kernel build fails with: WARNING: "kernel_subsys" [fs/gfs2/locking/dlm/lock_dlm.ko] undefined! WARNING: "kernel_subsys" [fs/dlm/dlm.ko] undefined! WARNING: "kernel_subsys" [fs/configfs/configfs.ko] undefined! make[1]: * [__modpost] Error 1 make: * [modules] Error 2 Since fs/dlm/lockspace.c and fs/gfs2/locking/dlm/sysfs.c use kernel_subsys, they should either DEPEND on it or SELECT it. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:38:05 -05:00
David Teigland	ee32e4f3d3	[GFS2] make lock_dlm drop_count tunable in sysfs We want to be able to change or disable the default drop_count (number at which the dlm asks gfs to limit the the number of locks it's holding). Add it to the collection of sysfs tunables for an fs. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:38:01 -05:00
David Teigland	2f708649ba	[GFS2] increase default lock limit Increase the number of locks at which point the dlm begins asking gfs to reduce its lock usage. The default value is largely arbitrary, but the current value of 50,000 ends up limiting performance unnecessarily for too many users. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:59 -05:00
Steven Whitehouse	8bd9572769	[GFS2] Fix list corruption in lops.c The patch below appears to fix the list corruption that we are seeing on occasion. Although the transaction structure is private to a single thread, when the queued structures are dismantled during an in-core commit, its possible for a different thread to be trying to add the same structure to another, new, transaction at the same time. To avoid this, this patch takes the log spinlock during this operation. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:56 -05:00
Steven Whitehouse	d7c103d0bd	[GFS2] Fix recursive locking attempt with NFS In certain cases, its possible for NFS to call the lookup code while holding the glock (when doing a readdirplus operation) so we need to check for that and not try and lock the glock twice. This also fixes a typo in a previous NFS related GFS2 patch. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:53 -05:00
David Teigland	b790c3b7c3	[DLM] can miss clearing resend flag A long, complicated sequence of events, beginning with the RESEND flag not being cleared on an lkb, can result in an unlock never completing. - lkb on waiters list for remote lookup - the remote node is both the dir node and the master node, so it optimizes the lookup into a request and sends a request reply back - the request reply is saved on the requestqueue to be processed after recovery - recovery runs dlm_recover_waiters_pre() which sets RESEND flag so the lookup will be resent after recovery - end of recovery: process_requestqueue takes saved request reply which removes the lkb off the waitesr list, _without_ clearing the RESEND flag - end of recovery: dlm_recover_waiters_post() doesn't do anything with the now completed lookup lkb (would usually clear RESEND) - later, the node unmounts, unlocks this lkb that still has RESEND flag set - the lkb is on the waiters list again, now for unlock, when recovery occurs, dlm_recover_waiters_pre() shows the lkb for unlock with RESEND set, doesn't do anything since the master still exists - end of recovery: dlm_recover_waiters_post() takes this lkb off the waiters list because it has the RESEND flag set, then reports an error because unlocks are never supposed to be handled in recover_waiters_post(). - later, the unlock reply is received, doesn't find the lkb on the waiters list because recover_waiters_post() has wrongly removed it. - the unlock operation has been lost, and we're left with a stray granted lock - unmount spins waiting for the unlock to complete The visible evidence of this problem will be a node where gfs umount is spinning, the dlm waiters list will be empty, and the dlm locks list will show a granted lock. The fix is simply to clear the RESEND flag when taking an lkb off the waiters list. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:50 -05:00
David Teigland	8fd3a98f2c	[DLM] saved dlm message can be dropped dlm_receive_message() returns 0 instead of returning 'error'. What would happen is that process_requestqueue would take a saved message off the requestqueue and call receive_message on it. receive_message would then see that recovery had been aborted, set error to EINTR, and 'goto out', expecting that the error would be returned. Instead, 0 was always returned, so process_requestqueue would think that the message had been processed and delete it instead of saving it to process next time. This means the message (usually an unlock in my tests) would be lost. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:47 -05:00
Patrick Caulfield	f1f1c1ccf7	[DLM] Make sock_sem into a mutex Now that there can be multiple dlm_recv threads running we need to prevent two recvs running for the same connection - it's unlikely but it can happen and it causes message corruption. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:44 -05:00
Steven Whitehouse	d043e1900c	[GFS2] Fix typo in glock.c This is a one letter typo fix in glock.c, spotted by Rob Kenna. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:41 -05:00
Eric Sandeen	ddfe062783	[GFS2] use CURRENT_TIME_SEC instead of get_seconds in gfs2 I was looking something else up and came across this... I don't honestly have a good reason to change it other than to make it like every other Linux filesystem in this regard. ;-) It doesn't functionally change anything, but makes some lines shorter. :) I'm also curious; why does gfs2 have 64-bits of on-disk timestamps, but not in timespec_t format, and only stores second resolutions? Seems like you're halfway to sub-second resolutions already. I suppose if that gets implemented then all of the below should instead be CURRENT_TIME not CURRENT_TIME_SEC. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:38 -05:00
Steven Whitehouse	90101c3186	[GFS2] Compile fix for glock.c This one liner got missed from the previous patch. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:35 -05:00
Steven Whitehouse	12132933c4	[GFS2] Remove queue_empty() function This function is not longer required since we do not do recursive locking in the glock layer. As a result all its callers can be replaceed with list_empty() calls. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:32 -05:00
Patrick Caulfield	bd44e2b007	[DLM] fix lowcomms receiving This patch fixes a bug whereby data on a newly accepted connection would be ignored if it arrived soon after the accept. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:29 -05:00
Steven Whitehouse	b5d32bead1	[GFS2] Tidy up glops calls This patch doesn't make any changes to the ordering of the various operations related to glocking, but it does tidy up the calls to the glops.c functions to make the structure more obvious. The two functions: gfs2_glock_xmote_th() and gfs2_glock_drop_th() can be made static within glock.c since they are called by every set of glock operations. The xmote_th and drop_th glock operations are then made conditional upon those two routines existing and called from the previously mentioned functions in glock.c respectively. Also it can be seen that the go_sync operation isn't needed since it can easily be replaced by calls to xmote_bh and drop_bh respectively. This results in no longer (confusingly) calling back into routines in glock.c from glops.c and also reducing the glock operations by one member. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:26 -05:00
Patrick Caulfield	f2f5095f9e	[DLM] lowcomms tidy This patch removes some redundant fields from the connection structure and adds some lockdep annotation to remove spurious warnings. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:23 -05:00
Steven Whitehouse	1c0f4872dc	[GFS2] Remove local exclusive glock mode Here is a patch for GFS2 to remove the local exclusive flag. In the places it was used, mutex's are always held earlier in the call path, so it appears redundant in the LM_ST_SHARED case. Also, the GFS2 holders were setting local exclusive in any case where the requested lock was LM_ST_EXCLUSIVE. So the other places in the glock code where the flag was tested have been replaced with tests for the lock state being LM_ST_EXCLUSIVE in order to ensure the logic is the same as before (i.e. LM_ST_EXCLUSIVE is always locally exclusive as well as globally exclusive). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:20 -05:00
Steven Whitehouse	6bd9c8c2fb	[GFS2] Remove unused go_callback operation This is never used, so we might as well remove it. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:17 -05:00
Steven Whitehouse	e5dab552c8	[GFS2] Remove the "greedy" function from glock.[ch] The "greedy" code was an attempt to retain glocks for a minimum length of time when they relate to mmap()ed files. The current implementation of this feature is not, however, ideal in that it required allocating memory in order to do this and its overly complicated. It also misses the mark by ignoring the other I/O operations which are just as likely to suffer from the same problem. So the plan is to remove this now and then add the functionality back as part of the glock state machine at a later date (and thus take into account all the possible users of this feature) Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:14 -05:00
Steven Whitehouse	fee852e374	[GFS2] Shrink gfs2_inode memory by half Here is something I spotted (while looking for something entirely different) the other day. Rather than using a completion in each and every struct gfs2_holder, this removes it in favour of hashed wait queues, thus saving a considerable amount of memory both on the stack (where a number of gfs2_holder structures are allocated) and in particular in the gfs2_inode which has 8 gfs2_holder structures embedded within it. As a result on x86_64 the gfs2_inode shrinks from 2488 bytes to 1912 bytes, a saving of 576 bytes per inode (no thats not a typo!). In actual practice we get a much better result than that since now that a gfs2_inode is under the 2048 byte barrier, we get two per 4k slab page effectively halving the amount of memory required to store gfs2_inodes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:11 -05:00
Steven Whitehouse	330005c2b2	[GFS2] Remove max_atomic_write tunable This removes an unused sysfs tunable parameter. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:08 -05:00
Steven Whitehouse	3699e3a44b	[GFS2] Clean up/speed up readdir This removes the extra filldir callback which gfs2 was using to enclose an attempt at readahead for inodes during readdir. The code was too complicated and also hurts performance badly in the case that the getdents64/readdir call isn't being followed by stat() and it wasn't even getting it right all the time when it was. As a result, on my test box an "ls" of a directory containing 250000 files fell from about 7mins (freshly mounted, so nothing cached) to between about 15 to 25 seconds. When the directory content was cached, the time taken fell from about 3mins to about 4 or 5 seconds. Interestingly in the cached case, running "ls -l" once reduced the time taken for subsequent runs of "ls" to about 6 secs even without this patch. Now it turns out that there was a special case of glocks being used for prefetching the metadata, but because of the timeouts for these locks (set to 10 secs) the metadata was being timed out before it was being used and this the prefetch code was constantly trying to prefetch the same data over and over. Calling "ls -l" meant that the inodes were brought into memory and once the inodes are cached, the glocks are not disposed of until the inodes are pushed out of the cache, thus extending the lifetime of the glocks, and thus bringing down the time for subsequent runs of "ls" considerably. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:04 -05:00
Steven Whitehouse	a8d638e30e	[GFS2] Add writepages for "data=writeback" mounts It occurred to me that although a gfs2 specific writepages for ordered writes and journaled data would be tricky, by hooking writepages only for "data=writeback" mounts we could take advantage of not needing buffer heads (we don't use them on the read side, nor have we for some time) and create much larger I/Os for the block layer. Using blktrace both before and after, its possible to see that for large I/Os, most of the requests generated through writepages are now 1024 sectors after this patch is applied as opposed to 8 sectors before. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:37:01 -05:00
David Teigland	222d396092	[DLM] fix master recovery If master recovery happens on an rsb in one recovery sequence, then that sequence is aborted before lock recovery happens, then in the next sequence, we rely on the previous master recovery (which may now be invalid due to another node ignoring a lookup result) and go on do to the lock recovery where we get stuck due to an invalid master value. recovery cycle begins: master of rsb X has left nodes A and B send node C an rcom lookup for X to find the new master C gets lookup from B first, sets B as new master, and sends reply back to B C gets lookup from A next, and sends reply back to A saying B is master A gets lookup reply from C and sets B as the new master in the rsb recovery cycle on A, B and C is aborted to start a new recovery B gets lookup reply from C and ignores it since there's a new recovery recovery cycle begins: some other node has joined B doesn't think it's the master of X so it doesn't rebuild it in the directory C looks up the master of X, no one is master, so it becomes new master B looks up the master of X, finds it's C A believes that B is the master of X, so it sends its lock to B B sends an error back to A A resends this repeats forever, the incorrect master value on A is never corrected The fix is to do master recovery on an rsb that still has the NEW_MASTER flag set from an earlier recovery sequence, and therefore didn't complete lock recovery. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:58 -05:00
David Teigland	a1bc86e6bd	[DLM] fix user unlocking When a user process exits, we clear all the locks it holds. There is a problem, though, with locks that the process had begun unlocking before it exited. We couldn't find the lkb's that were in the process of being unlocked remotely, to flag that they are DEAD. To solve this, we move lkb's being unlocked onto a new list in the per-process structure that tracks what locks the process is holding. We can then go through this list to flag the necessary lkb's when clearing locks for a process when it exits. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:55 -05:00
Patrick Caulfield	1d6e8131cf	[DLM] Use workqueues for dlm lowcomms This patch converts the DLM TCP lowcomms to use workqueues rather than using its own daemon functions. Simultaneously removing a lot of code and making it more scalable on multi-processor machines. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:52 -05:00
Adrian Bunk	03dc6a538e	[GFS2] make gfs2_change_nlink_i() static On Thu, Jan 11, 2007 at 10:26:27PM -0800, Andrew Morton wrote: >... > Changes since 2.6.20-rc3-mm1: >... > git-gfs2-nmw.patch >... > git trees >... This patch makes the needlessly globlal gfs2_change_nlink_i() static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:49 -05:00
Robert Peterson	7083146564	[GFS2] gfs2 knows of directories which it chooses not to display This is for Red Hat bugzilla bug bz #222302: Moving a virtual IP from node to node between two NFS-over-GFS2 servers was causing one of the GFS2 servers to become confused and reference a deleted inode. The problem was due to vfs dentries that did not reference the gfs2_dops and therefore didn't call the gfs2 revalidate code to revalidate a dentry after a directory had been deleted & recreated. This patch is a crosswrite from a RHEL4 bug found in GFS1 as bz #190756 and it is against the latest -nmw git tree. Signed-off-by: Robert Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:46 -05:00
David Teigland	d200778e12	[DLM] expose dlm_config_info fields in configfs Make the dlm_config_info values readable and writeable via configfs entries. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:43 -05:00
David Teigland	99fc64874a	[DLM] add config entry to enable log_debug Add a new dlm_config_info field to enable log_debug output and change log_debug() to use it. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-02-05 13:36:40 -05:00

1 2 3 4 5 ...

4777 Commits