kernel_optimize_test

Author	SHA1	Message	Date
Jeff Layton	8570552425	cifs: fix wait_for_response to time out sleeping processes correctly cifs: fix wait_for_response to time out sleeping processes correctly The current scheme that CIFS uses to sleep and wait for a response is not quite what we want. After sending a request, wait_for_response puts the task to sleep with wait_event(). One of the conditions for wait_event is a timeout (using time_after()). The problem with this is that there is no guarantee that the process will ever be woken back up. If the server stops sending data, then cifs_demultiplex_thread will leave its response queue sleeping. I think the only thing that saves us here is the fact that cifs_dnotify_thread periodically (every 15s) wakes up sleeping processes on all response_q's that have calls in flight. This makes for unnecessary wakeups of some processes. It also means large variability in the timeouts since they're all woken up at once. Instead of this, put the tasks to sleep with wait_event_timeout. This makes them wake up on their own if they time out. With this change, cifs_dnotify_thread should no longer be needed. I've been testing this in conjunction with some other patches that I'm working on. It doesn't seem to affect performance at all with with heavy I/O. Identical iozone -ac runs complete in almost exactly the same time (<1% difference in times). Thanks to Wasrshi Nimara for initially pointing this out. Wasrshi, it would be nice to know whether this patch also helps your testcase. Signed-off-by: Jeff Layton <jlayton@redhat.com> Cc: Wasrshi Nimara <warshinimara@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:11 +00:00
Steve French	8be0ed44c2	[CIFS] Can not mount with prefixpath if root directory of share is inaccessible Windows allows you to deny access to the top of a share, but permit access to a directory lower in the path. With the prefixpath feature of cifs (ie mounting \\server\share\directory\subdirectory\etc.) this should have worked if the user specified a prefixpath which put the root of the mount at a directory to which he had access, but we still were doing a lookup on the root of the share (null path) when we should have been doing it on the prefixpath subdirectory. This fixes Samba bug # 5925 Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:11 +00:00
Steve French	61e7480158	[CIFS] various minor cleanups pointed out by checkpatch script Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:10 +00:00
Steve French	3de2091ac7	[CIFS] fix typo Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:10 +00:00
Steve French	acc18aa1e6	[CIFS] remove sparse warning Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:10 +00:00
Steve French	13a6e42af8	[CIFS] add mount option to send mandatory rather than advisory locks Some applications/subsystems require mandatory byte range locks (as is used for Windows/DOS/OS2 etc). Sending advisory (posix style) byte range lock requests (instead of mandatory byte range locks) can lead to problems for these applications (which expect that other clients be prevented from writing to portions of the file which they have locked and are updating). This mount option allows mounting cifs with the new mount option "forcemand" (or "forcemandatorylock") in order to have the cifs client use mandatory byte range locks (ie SMB/CIFS/Windows/NTFS style locks) rather than posix byte range lock requests, even if the server would support posix byte range lock requests. This has no effect if the server does not support the CIFS Unix Extensions (since posix style locks require support for the CIFS Unix Extensions), but for mounts to Samba servers this can be helpful for Wine and applications that require mandatory byte range locks. Acked-by: Jeff Layton <jlayton@redhat.com> CC: Alexander Bokovoy <ab@samba.org> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:10 +00:00
Jeff Layton	d5c5605c27	cifs: make ipv6_connect take a TCP_Server_Info arg Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:10 +00:00
Jeff Layton	bcf4b1063d	cifs: make ipv4_connect take a TCP_Server_Info arg In order to unify the smb_send routines, we need to reorganize the routines that connect the sockets. Have ipv4_connect take a TCP_Server_Info pointer and get the necessary fields from that. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:10 +00:00
Jeff Layton	7586b76585	cifs: don't declare smb_vol info on the stack struct smb_vol is fairly large, it's probably best to kzalloc it... Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:10 +00:00
Jeff Layton	63c038c297	cifs: move allocation of new TCP_Server_Info into separate function Clean up cifs_mount a bit by moving the code that creates new TCP sessions into a separate function. Have that function search for an existing socket and then create a new one if one isn't found. Also reorganize the initializion of TCP_Server_Info a bit to prepare for cleanup of the socket connection code. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:10 +00:00
Jeff Layton	8ecaf67a8e	cifs: account for IPv6 in ses->serverName and clean up netbios name handling The current code for setting the session serverName is IPv4-specific. Allow it to be an IPv6 address as well. Use NIP* macros to set the format. This also entails increasing the length of the serverName field, so declare a new macro for RFC1001 name length and use it in the appropriate places. Finally, drop the unicode_server_Name field from TCP_Server_Info since it's not used. We can add it back later if needed, but for now it just wastes memory. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:09 +00:00
Jeff Layton	954d7a1cf1	cifs: make dnotify thread experimental code Now that tasks sleeping in wait_for_response will time out on their own, we're not reliant on the dnotify thread to do this. Mark it as experimental code for now. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:09 +00:00
Jeff Layton	72ca545b2d	cifs: convert tcpSem to a mutex Mutexes are preferred for single-holder semaphores... Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:09 +00:00
Jeff Layton	0468a2cf91	cifs: take module reference when starting cifsd cifsd can outlive the last cifs mount. We need to hold a module reference until it exits to prevent someone from unplugging the module until we're ready. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:09 +00:00
Jeff Layton	80909022ce	cifs: display addr and prefixpath options in /proc/mounts Have cifs_show_options display the addr and prefixpath options in /proc/mounts. Reduce struct dereferencing by adding some local variables. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:09 +00:00
Jeff Layton	24b9b06ba7	cifs: remove unused SMB session pointer from struct mid_q_entry Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:09 +00:00
Julia Lawall	f1d9e4586e	fs/9p: change simple_strtol to simple_strtoul Since v9ses->uid is unsigned, it would seem better to use simple_strtoul that simple_strtol. A simplified version of the semantic patch that makes this change is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @r2@ long e; position p; @@ e = simple_strtol@p(...) @@ position p != r2.p; type T; T e; @@ e = - simple_strtol@p + simple_strtoul (...) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Acked-by: Eric Van Hensbergen <ericvh@gmail.com>	2008-12-19 16:50:22 -06:00
Wu Fengguang	7dd0cdc51c	9p: convert d_iname references to d_name.name d_iname is rubbish for long file names. Use d_name.name in printks instead. Signed-off-by: Wu Fengguang <wfg@linux.intel.com> Acked-by: Eric Van Hensbergen <ericvh@gmail.com>	2008-12-19 16:47:40 -06:00
Duane Griffin	6ff232070a	9p: Remove potentially bad parameter from function entry debug print. Signed-off-by: Duane Griffin <duaneg@dghda.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2008-12-19 16:45:21 -06:00
Linus Torvalds	0bc77ecbe4	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: ocfs2: Add JBD2 compat feature bit. ocfs2: Always update xattr search when creating bucket.	2008-12-17 15:01:23 -08:00
Jeff Layton	331c313510	cifs: fix buffer overrun in parse_DFS_referrals While testing a kernel with memory poisoning enabled, I saw some warnings about the redzone getting clobbered when chasing DFS referrals. The buffer allocation for the unicode converted version of the searchName is too small and needs to take null termination into account. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-12-17 14:59:55 -08:00
Joel Becker	a97721894a	ocfs2: Add JBD2 compat feature bit. Define the OCFS2_FEATURE_COMPAT_JBD2 bit in the filesystem header. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-12-16 18:26:16 -08:00
Tao Ma	83099bc647	ocfs2: Always update xattr search when creating bucket. When we create xattr bucket during the process of xattr set, we always need to update the ocfs2_xattr_search since even if the bucket size is the same as block size, the offset will change because of the removal of the ocfs2_xattr_block header. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-12-16 14:07:37 -08:00
Linus Torvalds	f4fd2c5b6f	Merge branch 'to-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland * 'to-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland: tracehook: exec double-reporting fix	2008-12-10 14:40:21 -08:00
Hugh Dickins	9c24624727	KSYM_SYMBOL_LEN fixes Miles Lane tailing /sys files hit a BUG which Pekka Enberg has tracked to my `966c8c12dc` sprint_symbol(): use less stack exposing a bug in slub's list_locations() - kallsyms_lookup() writes a 0 to namebuf[KSYM_NAME_LEN-1], but that was beyond the end of page provided. The 100 slop which list_locations() allows at end of page looks roughly enough for all the other stuff it might print after the symbol before it checks again: break out KSYM_SYMBOL_LEN earlier than before. Latencytop and ftrace and are using KSYM_NAME_LEN buffers where they need KSYM_SYMBOL_LEN buffers, and vmallocinfo a 2*KSYM_NAME_LEN buffer where it wants a KSYM_SYMBOL_LEN buffer: fix those before anyone copies them. [akpm@linux-foundation.org: ftrace.h needs module.h] Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc Miles Lane <miles.lane@gmail.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Steven Rostedt <srostedt@redhat.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-12-10 08:01:54 -08:00
Dmitri Monakhov	6ee5a399d6	inotify: fix IN_ONESHOT unmount event watcher On umount two event will be dispatched to watcher: 1: inotify_dev_queue_event(.., IN_UNMOUNT,..) 2: remove_watch(watch, dev) ->inotify_dev_queue_event(.., IN_IGNORED, ..) But if watcher has IN_ONESHOT bit set then the watcher will be released inside first event. Which result in accessing invalid object later. IMHO it is not pure regression. This bug wasn't triggered while initial inotify interface testing phase because of another bug in IN_ONESHOT handling logic :) commit `ac74c00e49` Author: Ulisses Furquim <ulissesf@gmail.com> Date: Fri Feb 8 04:18:16 2008 -0800 inotify: fix check for one-shot watches before destroying them As the IN_ONESHOT bit is never set when an event is sent we must check it in the watch's mask and not in the event's mask. TESTCASE: mkdir mnt mount -ttmpfs none mnt mkdir mnt/d ./inotify mnt/d& umount mnt ## << lockup or crash here TESTSOURCE: /* gcc -oinotify inotify.c / #include <stdio.h> #include <stdlib.h> #include <sys/inotify.h> int main(int argc, char argv) { char buf[1024]; struct inotify_event ie; char p; int i; ssize_t l; p = argv[1]; i = inotify_init(); inotify_add_watch(i, p, ~0); l = read(i, buf, sizeof(buf)); printf("read %d bytes\n", l); ie = (struct inotify_event ) buf; printf("event mask: %d\n", ie->mask); return 0; } Signed-off-by: Dmitri Monakhov <dmonakhov@openvz.org> Cc: John McCutchan <ttb@tentacle.dhs.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Robert Love <rlove@google.com> Cc: Ulisses Furquim <ulissesf@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-12-10 08:01:53 -08:00
Matt Mackall	49c50342c7	pagemap: fix 32-bit pagemap regression The large pages fix from `bcf8039ed4` broke 32-bit pagemap by pulling the pagemap entry code out into a function with the wrong return type. Pagemap entries are 64 bits on all systems and unsigned long is only 32 bits on 32-bit systems. Signed-off-by: Matt Mackall <mpm@selenic.com> Reported-by: Doug Graham <dgraham@nortel.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: <stable@kernel.org> [2.6.26.x, 2.6.27.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-12-10 08:01:53 -08:00
Andrew Morton	02d2116887	revert "percpu_counter: new function percpu_counter_sum_and_set" Revert commit `e8ced39d5e` Author: Mingming Cao <cmm@us.ibm.com> Date: Fri Jul 11 19:27:31 2008 -0400 percpu_counter: new function percpu_counter_sum_and_set As described in revert "percpu counter: clean up percpu_counter_sum_and_set()" the new percpu_counter_sum_and_set() is racy against updates to the cpu-local accumulators on other CPUs. Revert that change. This means that ext4 will be slow again. But correct. Reported-by: Eric Dumazet <dada1@cosmosbay.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mingming Cao <cmm@us.ibm.com> Cc: <linux-ext4@vger.kernel.org> Cc: <stable@kernel.org> [2.6.27.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-12-10 08:01:52 -08:00
Andrew Morton	71c5576fbd	revert "percpu counter: clean up percpu_counter_sum_and_set()" Revert commit `1f7c14c62c` Author: Mingming Cao <cmm@us.ibm.com> Date: Thu Oct 9 12:50:59 2008 -0400 percpu counter: clean up percpu_counter_sum_and_set() Before this patch we had the following: percpu_counter_sum(): return the percpu_counter's value percpu_counter_sum_and_set(): return the percpu_counter's value, copying that value into the central value and zeroing the per-cpu counters before returning. After this patch, percpu_counter_sum_and_set() has gone, and percpu_counter_sum() gets the old percpu_counter_sum_and_set() functionality. Problem is, as Eric points out, the old percpu_counter_sum_and_set() functionality was racy and wrong. It zeroes out counters on "other" cpus, without holding any locks which will prevent races agaist updates from those other CPUS. This patch reverts `1f7c14c62c`. This means that percpu_counter_sum_and_set() still has the race, but percpu_counter_sum() does not. Note that this is not a simple revert - ext4 has since started using percpu_counter_sum() for its dirty_blocks counter as well. Note that this revert patch changes percpu_counter_sum() semantics. Before the patch, a call to percpu_counter_sum() will bring the counter's central counter mostly up-to-date, so a following percpu_counter_read() will return a close value. After this patch, a call to percpu_counter_sum() will leave the counter's central accumulator unaltered, so a subsequent call to percpu_counter_read() can now return a significantly inaccurate result. If there is any code in the tree which was introduced after `e8ced39d5e` was merged, and which depends upon the new percpu_counter_sum() semantics, that code will break. Reported-by: Eric Dumazet <dada1@cosmosbay.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mingming Cao <cmm@us.ibm.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-12-10 08:01:52 -08:00
Roland McGrath	85f334666a	tracehook: exec double-reporting fix The patch `6341c39` "tracehook: exec" introduced a small regression in 2.6.27 regarding binfmt_misc exec event reporting. Since the reporting is now done in the common search_binary_handler() function, an exec of a misc binary will result in two (or possibly multiple) exec events being reported, instead of just a single one, because the misc handler contains a recursive call to search_binary_handler. To add to the confusion, if PTRACE_O_TRACEEXEC is not active, the multiple SIGTRAP signals will in fact cause only a single ptrace intercept, as the signals are not queued. However, if PTRACE_O_TRACEEXEC is on, the debugger will actually see multiple ptrace intercepts (PTRACE_EVENT_EXEC). The test program included below demonstrates the problem. This change fixes the bug by calling tracehook_report_exec() only in the outermost search_binary_handler() call (bprm->recursion_depth == 0). The additional change to restore bprm->recursion_depth after each binfmt load_binary call is actually superfluous for this bug, since we test the value saved on entry to search_binary_handler(). But it keeps the use of of the depth count to its most obvious expected meaning. Depending on what binfmt handlers do in certain cases, there could have been false-positive tests for recursion limits before this change. /* Test program using PTRACE_O_TRACEEXEC. This forks and exec's the first argument with the rest of the arguments, while ptrace'ing. It expects to see one PTRACE_EVENT_EXEC stop and then a successful exit, with no other signals or events in between. Test for kernel doing two PTRACE_EVENT_EXEC stops for a binfmt_misc exec: $ gcc -g traceexec.c -o traceexec $ sudo sh -c 'echo :test:M::foobar::/bin/cat: > /proc/sys/fs/binfmt_misc/register' $ echo 'foobar test' > ./foobar $ chmod +x ./foobar $ ./traceexec ./foobar; echo $? ==> good <== foobar test 0 $ ==> bad <== foobar test unexpected status 0x4057f != 0 3 $ / #include <stdio.h> #include <sys/types.h> #include <sys/wait.h> #include <sys/ptrace.h> #include <unistd.h> #include <signal.h> #include <stdlib.h> static void wait_for (pid_t child, int expect) { int status; pid_t p = wait (&status); if (p != child) { perror ("wait"); exit (2); } if (status != expect) { fprintf (stderr, "unexpected status %#x != %#x\n", status, expect); exit (3); } } int main (int argc, char argv) { pid_t child = fork (); if (child < 0) { perror ("fork"); return 127; } else if (child == 0) { ptrace (PTRACE_TRACEME); raise (SIGUSR1); execv (argv[1], &argv[1]); perror ("execve"); _exit (127); } wait_for (child, W_STOPCODE (SIGUSR1)); if (ptrace (PTRACE_SETOPTIONS, child, 0L, (void ) (long) PTRACE_O_TRACEEXEC) != 0) { perror ("PTRACE_SETOPTIONS"); return 4; } if (ptrace (PTRACE_CONT, child, 0L, 0L) != 0) { perror ("PTRACE_CONT"); return 5; } wait_for (child, W_STOPCODE (SIGTRAP \| (PTRACE_EVENT_EXEC << 8))); if (ptrace (PTRACE_CONT, child, 0L, 0L) != 0) { perror ("PTRACE_CONT"); return 6; } wait_for (child, W_EXITCODE (0, 0)); return 0; } Reported-by: Arnd Bergmann <arnd@arndb.de> CC: Ulrich Weigand <ulrich.weigand@de.ibm.com> Signed-off-by: Roland McGrath <roland@redhat.com>	2008-12-09 19:36:38 -08:00
J. Bruce Fields	a4f4d6df53	EXPORTFS: handle NULL returns from fh_to_dentry()/fh_to_parent() While `440037287c` "[PATCH] switch all filesystems over to d_obtain_alias" removed some cases where fh_to_dentry() and fh_to_parent() could return NULL, there are still a few NULL returns left in individual filesystems. Thus it was a mistake for that commit to remove the handling of NULL returns in the callers. Revert those parts of `440037287c` which removed the NULL handling. (We could, alternatively, modify all implementations to return -ESTALE instead of NULL, but that proves to require fixing a number of filesystems, and in some cases it's arguably more natural to return NULL.) Thanks to David for original patch and Linus, Christoph, and Hugh for review. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: David Howells <dhowells@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-12-08 19:49:32 -08:00
Jonathan Corbet	218d11a8b0	Fix a race condition in FASYNC handling Changeset `a238b790d5` (Call fasync() functions without the BKL) introduced a race which could leave file->f_flags in a state inconsistent with what the underlying driver/filesystem believes. Revert that change, and also fix the same races in ioctl_fioasync() and ioctl_fionbio(). This is a minimal, short-term fix; the real fix will not involve the BKL. Reported-by: Oleg Nesterov <oleg@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-12-05 15:35:10 -08:00
Linus Torvalds	bbeba4c35c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/bdev * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/bdev: [PATCH] fix bogus argument of blkdev_put() in pktcdvd [PATCH 2/2] documnt FMODE_ constants [PATCH 1/2] kill FMODE_NDELAY_NOW [PATCH] clean up blkdev_get a little bit [PATCH] Fix block dev compat ioctl handling [PATCH] kill obsolete temporary comment in swsusp_close()	2008-12-04 21:45:44 -08:00
Dave Chinner	576a488a27	[XFS] Fix hang after disallowed rename across directory quota domains When project quota is active and is being used for directory tree quota control, we disallow rename outside the current directory tree. This requires a check to be made after all the inodes involved in the rename are locked. We fail to unlock the inodes correctly if we disallow the rename when the target is outside the current directory tree. This results in a hang on the next access to the inodes involved in failed rename. Reported-by: Arkadiusz Miskiewicz <arekm@maven.pl> Signed-off-by: Dave Chinner <david@fromorbit.com> Tested-by: Arkadiusz Miskiewicz <arekm@maven.pl> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-12-05 15:39:13 +11:00
Christoph Hellwig	fd4ce1acd0	[PATCH 1/2] kill FMODE_NDELAY_NOW Update FMODE_NDELAY before each ioctl call so that we can kill the magic FMODE_NDELAY_NOW. It would be even better to do this directly in setfl(), but for that we'd need to have FMODE_NDELAY for all files, not just block special files. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-04 04:22:57 -05:00
Christoph Hellwig	ebbefc011e	[PATCH] clean up blkdev_get a little bit The way the bd_claim for the FMODE_EXCL case is implemented is rather confusing. Clean it up to the most logical style. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-04 04:22:56 -05:00
Linus Torvalds	2433c41789	Merge branch 'for-2.6.28' of git://linux-nfs.org/~bfields/linux * 'for-2.6.28' of git://linux-nfs.org/~bfields/linux: NLM: client-side nlm_lookup_host() should avoid matching on srcaddr nfsd: use of unitialized list head on error exit in nfs4recover.c Add a reference to sunrpc in svc_addsock nfsd: clean up grace period on early exit	2008-12-03 16:40:37 -08:00
Linus Torvalds	51eaaa6776	Merge branch 'linux-next' of git://git.infradead.org/ubifs-2.6 * 'linux-next' of git://git.infradead.org/ubifs-2.6: UBIFS: pre-allocate bulk-read buffer UBIFS: do not allocate too much UBIFS: do not print scary memory allocation warnings UBIFS: allow for gaps when dirtying the LPT UBIFS: fix compilation warnings MAINTAINERS: change UBI/UBIFS git tree URLs UBIFS: endian handling fixes and annotations UBIFS: remove printk	2008-12-02 15:56:55 -08:00
Randy Dunlap	0380155363	ntfs: don't fool kernel-doc kernel-doc handles macros now (it has for quite some time), so change the ntfs_debug() macro's kernel-doc to be just before the macro instead of before a phony function prototype. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Anton Altaparmakov <aia21@cantab.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-12-01 19:55:25 -08:00
Davide Libenzi	7ef9964e6d	epoll: introduce resource usage limits It has been thought that the per-user file descriptors limit would also limit the resources that a normal user can request via the epoll interface. Vegard Nossum reported a very simple program (a modified version attached) that can make a normal user to request a pretty large amount of kernel memory, well within the its maximum number of fds. To solve such problem, default limits are now imposed, and /proc based configuration has been introduced. A new directory has been created, named /proc/sys/fs/epoll/ and inside there, there are two configuration points: max_user_instances = Maximum number of devices - per user max_user_watches = Maximum number of "watched" fds - per user The current default for "max_user_watches" limits the memory used by epoll to store "watches", to 1/32 of the amount of the low RAM. As example, a 256MB 32bit machine, will have "max_user_watches" set to roughly 90000. That should be enough to not break existing heavy epoll users. The default value for "max_user_instances" is set to 128, that should be enough too. This also changes the userspace, because a new error code can now come out from EPOLL_CTL_ADD (-ENOSPC). The EMFILE from epoll_create() was already listed, so that should be ok. [akpm@linux-foundation.org: use get_current_user()] Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: <stable@kernel.org> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Reported-by: Vegard Nossum <vegardno@ifi.uio.no> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-12-01 19:55:24 -08:00
Mark Fasheh	d6b58f89f7	ocfs2: fix regression in ocfs2_read_blocks_sync() We're panicing in ocfs2_read_blocks_sync() if a jbd-managed buffer is seen. At first glance, this seems ok but in reality it can happen. My test case was to just run 'exorcist'. A struct inode is being pushed out of memory but is then re-read at a later time, before the buffer has been checkpointed by jbd. This causes a BUG to be hit in ocfs2_read_blocks_sync(). Reviewed-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-12-01 14:46:58 -08:00
Coly Li	07d9a3954a	ocfs2: fix return value set in init_dlmfs_fs() In init_dlmfs_fs(), if calling kmem_cache_create() failed, the code will use return value from calling bdi_init(). The correct behavior should be set status as -ENOMEM before going to "bail:". Signed-off-by: Coly Li <coyli@suse.de> Acked-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-12-01 14:46:55 -08:00
David Teigland	07f9eebcdf	ocfs2: fix wake_up in unlock_ast In ocfs2_unlock_ast(), call wake_up() on lockres before releasing the spin lock on it. As soon as the spin lock is released, the lockres can be freed. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-12-01 14:46:45 -08:00
David Teigland	66f502a416	ocfs2: initialize stack_user lvbptr The locking_state dump, ocfs2_dlm_seq_show, reads the lvb on locks where it has not yet been initialized by a lock call. Signed-off-by: David Teigland <teigland@redhat.com> Acked-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-12-01 14:46:39 -08:00
Coly Li	3b5da0189c	ocfs2: comments typo fix This patch fixes two typos in comments of ocfs2. Signed-off-by: Coly Li <coyli@suse.de> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-12-01 14:46:31 -08:00
Linus Torvalds	8e36a5d6ad	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: [CIFS] fix regression in cifs_write_begin/cifs_write_end	2008-11-30 14:04:02 -08:00
Jan Kara	52b19ac993	udf: Fix BUG_ON() in destroy_inode() udf_clear_inode() can leave behind buffers on mapping's i_private list (when we truncated preallocation). Call invalidate_inode_buffers() so that the list is properly cleaned-up before we return from udf_clear_inode(). This is ugly and suggest that we should cleanup preallocation earlier than in clear_inode() but currently there's no such call available since drop_inode() is called under inode lock and thus is unusable for disk operations. Signed-off-by: Jan Kara <jack@suse.cz>	2008-11-27 17:38:28 +01:00
Jeff Layton	a98ee8c1c7	[CIFS] fix regression in cifs_write_begin/cifs_write_end The conversion to write_begin/write_end interfaces had a bug where we were passing a bad parameter to cifs_readpage_worker. Rather than passing the page offset of the start of the write, we needed to pass the offset of the beginning of the page. This was reliably showing up as data corruption in the fsx-linux test from LTP. It also became evident that this code was occasionally doing unnecessary read calls. Optimize those away by using the PG_checked flag to indicate that the unwritten part of the page has been initialized. CC: Nick Piggin <npiggin@suse.de> Acked-by: Dave Kleikamp <shaggy@us.ibm.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-11-26 19:32:33 +00:00
Chuck Lever	a8d82d9b95	NLM: client-side nlm_lookup_host() should avoid matching on srcaddr Since commit `c98451bd`, the loop in nlm_lookup_host() unconditionally compares the host's h_srcaddr field to the incoming source address. For client-side nlm_host entries, both are always AF_UNSPEC, so this check is unnecessary. Since commit `781b61a6`, which added support for AF_INET6 addresses to nlm_cmp_addr(), nlm_cmp_addr() now returns FALSE for AF_UNSPEC addresses, which causes nlm_lookup_host() to create a fresh nlm_host entry every time it is called on the client. These extra entries will eventually expire once the server is unmounted, so the impact of this regression, introduced with lockd IPv6 support in 2.6.28, should be minor. We could fix this by adding an arm in nlm_cmp_addr() for AF_UNSPEC addresses, but really, nlm_lookup_host() shouldn't be matching on the srcaddr field for client-side nlm_host lookups. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-11-24 13:29:07 -06:00
J. Bruce Fields	e4625eb826	nfsd: use of unitialized list head on error exit in nfs4recover.c Thanks to Matthew Dodd for this bug report: A file label issue while running SELinux in MLS mode provoked the following bug, which is a result of use before init on a 'struct list_head'. In nfsd4_list_rec_dir() if the call to dentry_open() fails the 'goto out' skips INIT_LIST_HEAD() which results in the normally improbable case where list_entry() returns NULL. Trace follows. NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory SELinux: Context unconfined_t:object_r:var_lib_nfs_t:s0 is not valid (left unmapped). type=1400 audit(1227298063.609:282): avc: denied { read } for pid=1890 comm="rpc.nfsd" name="v4recovery" dev=dm-0 ino=148726 scontext=system_u:system_r:nfsd_t:s0-s15:c0.c1023 tcontext=system_u:object_r:unlabeled_t:s15:c0.c1023 tclass=dir BUG: unable to handle kernel NULL pointer dereference at 00000004 IP: [<c050894e>] list_del+0x6/0x60 pde = 0d9ce067 pte = 00000000 Oops: 0000 [#1] SMP Modules linked in: nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 sunrpc ipv6 dm_multipath scsi_dh ppdev parport_pc sg parport floppy ata_piix pata_acpi ata_generic libata pcnet32 i2c_piix4 mii pcspkr i2c_core dm_snapshot dm_zero dm_mirror dm_log dm_mod BusLogic sd_mod scsi_mod crc_t10dif ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode] Pid: 1890, comm: rpc.nfsd Not tainted (2.6.27.5-37.fc9.i686 #1) EIP: 0060:[<c050894e>] EFLAGS: 00010217 CPU: 0 EIP is at list_del+0x6/0x60 EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: cd99e480 ESI: cf9caed8 EDI: 00000000 EBP: cf9caebc ESP: cf9caeb8 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 Process rpc.nfsd (pid: 1890, ti=cf9ca000 task=cf4de580 task.ti=cf9ca000) Stack: 00000000 cf9caef0 d0a9f139 c0496d04 d0a9f217 fffffff3 00000000 00000000 00000000 00000000 cf32b220 00000000 00000008 00000801 cf9caefc d0a9f193 00000000 cf9caf08 d0a9b6ea 00000000 cf9caf1c d0a874f2 cf9c3004 00000008 Call Trace: [<d0a9f139>] ? nfsd4_list_rec_dir+0xf3/0x13a [nfsd] [<c0496d04>] ? do_path_lookup+0x12d/0x175 [<d0a9f217>] ? load_recdir+0x0/0x26 [nfsd] [<d0a9f193>] ? nfsd4_recdir_load+0x13/0x34 [nfsd] [<d0a9b6ea>] ? nfs4_state_start+0x2a/0xc5 [nfsd] [<d0a874f2>] ? nfsd_svc+0x51/0xff [nfsd] [<d0a87f2d>] ? write_svc+0x0/0x1e [nfsd] [<d0a87f48>] ? write_svc+0x1b/0x1e [nfsd] [<d0a87854>] ? nfsctl_transaction_write+0x3a/0x61 [nfsd] [<c04b6a4e>] ? sys_nfsservctl+0x116/0x154 [<c04975c1>] ? putname+0x24/0x2f [<c04975c1>] ? putname+0x24/0x2f [<c048d49f>] ? do_sys_open+0xad/0xb7 [<c048d337>] ? filp_close+0x50/0x5a [<c048d4eb>] ? sys_open+0x1e/0x26 [<c0403cca>] ? syscall_call+0x7/0xb [<c064007b>] ? init_cyrix+0x185/0x490 ======================= Code: 75 e1 8b 53 08 8d 4b 04 8d 46 04 e8 75 00 00 00 8b 53 10 8d 4b 0c 8d 46 0c e8 67 00 00 00 5b 5e 5f 5d c3 90 90 55 89 e5 53 89 c3 <8b> 40 04 8b 00 39 d8 74 16 50 53 68 3e d6 6f c0 6a 30 68 78 d6 EIP: [<c050894e>] list_del+0x6/0x60 SS:ESP 0068:cf9caeb8 ---[ end trace a89c4ad091c4ad53 ]--- Cc: Matthew N. Dodd <Matthew.Dodd@spart.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-11-24 10:36:09 -06:00

1 2 3 4 5 ...

10743 Commits