kernel_optimize_test

Author	SHA1	Message	Date
Dan Carpenter	f65a03f6ab	kexec: return -EFAULT on copy_to_user() failures copy_to/from_user() returns the number of bytes remaining to be copied. It never returns a negative value. The correct return code is -EFAULT and not -EIO. All the callers check for non-zero returns so that's Ok, but the return code is passed to the user so we should fix this. Signed-off-by: Dan Carpenter <error27@gmail.com> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Simon Kagstrom <simon.kagstrom@netinsight.net> Acked-by: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:22 -07:00
Fr?d?ric Bri?re	832ccf6f44	parport_serial: use the PCI IRQ if offered Commit `51dcdfe` ("parport: Use the PCI IRQ if offered") added IRQ support for PCI parallel port devices handled by parport_pc, but turned it off for parport_serial, despite a printk() message to the contrary. Signed-off-by: Fr?d?ric Bri?re <fbriere@fbriere.net> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:22 -07:00
Anton Blanchard	863a604920	lib/bug.c: add oops end marker to WARN implementation We are missing the oops end marker for the exception based WARN implementation in lib/bug.c. This is useful for logfile analysis tools. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:22 -07:00
Anton Blanchard	e2e7e09325	lib/bug.c: make WARN implementation match the kernel/panic.c one There are a few issues with the exception based WARN implementation in lib/bug.c: - Inconsistent printk flags. The "cut here" line is printed at KERN_EMERG, so the console and all logged in users see the single line: ------------[ cut here ]------------ for each WARN. Fix this so we print everything at KERN_WARNING to match the kernel/panic.c version. - The lib/bug.c WARN would print "Badness at". Change it to match the kernel/panic.c version which prints "WARNING: at". - Print the list of modules, similar to kernel/panic.c of modules, similar to kernel/panic.c [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Anton Blanchard <anton@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:22 -07:00
TAMUKI Shoichi	c7ff0d9c92	panic: keep blinking in spite of long spin timer mode To keep panic_timeout accuracy when running under a hypervisor, the current implementation only spins on long time (1 second) calls to mdelay. That brings a good effect, but the problem is the keyboard LEDs don't blink at all on that situation. This patch changes to call to panic_blink_enter() between every mdelay and keeps blinking in spite of long spin timer mode. The time to call to mdelay is now 100ms. Even this change will keep panic_timeout accuracy enough when running under a hypervisor. Signed-off-by: TAMUKI Shoichi <tamuki@linet.gr.jp> Cc: Ben Dooks <ben-linux@fluff.org> Cc: Russell King <linux@arm.linux.org.uk> Acked-by: Dmitry Torokhov <dtor@mail.ru> Cc: Anton Blanchard <anton@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:22 -07:00
Dan Carpenter	bebf8cfaea	afs: destroy work queue on init failure We can clean up the work queue on this error path. This function is called from afs_init(). Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:22 -07:00
FUJITA Tomonori	a35274cd10	dma-mapping: add DMA_xxBIT_MASK to feature-removal-schedule.txt DMA_xxBIT_MASK macros were marked as deprecated in June 2009. One more year is long enough, I think. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:22 -07:00
FUJITA Tomonori	175833635f	pci: add PCI DMA unamp state API to feature-removal-schedule.txt It was replaced with the DMA unamp state API (which can be used for any bus). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:22 -07:00
FUJITA Tomonori	c31e74c4c3	Documentation: DMA-API-HOWTO.txt: add multiple types of IOMMUs support Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:21 -07:00
FUJITA Tomonori	3b9c6c11f5	dma-mapping: remove dma_is_consistent API Architectures implement dma_is_consistent() in different ways (some misinterpret the definition of API in DMA-API.txt). So it hasn't been so useful for drivers. We have only one user of the API in tree. Unlikely out-of-tree drivers use the API. Even if we fix dma_is_consistent() in some architectures, it doesn't look useful at all. It was invented long ago for some old systems that can't allocate coherent memory at all. It's better to export only APIs that are definitely necessary for drivers. Let's remove this API. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:21 -07:00
FUJITA Tomonori	d80e0d96a3	scsi: 53c700: remove dma_is_consistent usage This driver is the only user of dma_is_consistent(). We plan to remove this API. The driver uses the API in the following way: BUG_ON(!dma_is_consistent(hostdata->dev, pScript) && L1_CACHE_BYTES < dma_get_cache_alignment()); The above code tries to see if L1_CACHE_BYTES is greater than dma_get_cache_alignment() on sysmtes that can not allocate coherent memory (some old systems can't). James Bottomley exmplained that this is necesary because the driver packs the set of mailboxes into a single coherent area and separates the different usages by a L1 cache stride. So it's fatal if the dma He also pointed out that we can kill this checking because we don't hit this BUG_ON on all architectures that actually use the driver. (akpm: stolen from the scsi tree because dma-mapping-remove-dma_is_consistent-api.patch needs it) Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:21 -07:00
FUJITA Tomonori	7896bfa451	dma-mapping: parisc: set ARCH_DMA_MINALIGN Architectures that handle DMA-non-coherent memory need to set ARCH_DMA_MINALIGN to make sure that kmalloc'ed buffer is DMA-safe: the buffer doesn't share a cache with the others. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Helge Deller <deller@gmx.de> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:21 -07:00
FUJITA Tomonori	4565f0170d	dma-mapping: unify dma_get_cache_alignment implementations dma_get_cache_alignment returns the minimum DMA alignment. Architectures defines it as ARCH_DMA_MINALIGN (formally ARCH_KMALLOC_MINALIGN). So we can unify dma_get_cache_alignment implementations. Note that some architectures implement dma_get_cache_alignment wrongly. dma_get_cache_alignment() should return the minimum DMA alignment. So fully-coherent architectures should return 1. This patch also fixes this issue. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:21 -07:00
FUJITA Tomonori	a6eb9fe105	dma-mapping: rename ARCH_KMALLOC_MINALIGN to ARCH_DMA_MINALIGN Now each architecture has the own dma_get_cache_alignment implementation. dma_get_cache_alignment returns the minimum DMA alignment. Architectures define it as ARCH_KMALLOC_MINALIGN (it's used to make sure that malloc'ed buffer is DMA-safe; the buffer doesn't share a cache with the others). So we can unify dma_get_cache_alignment implementations. This patch: dma_get_cache_alignment() needs to know if an architecture defines ARCH_KMALLOC_MINALIGN or not (needs to know if architecture has DMA alignment restriction). However, slab.h define ARCH_KMALLOC_MINALIGN if architectures doesn't define it. Let's rename ARCH_KMALLOC_MINALIGN to ARCH_DMA_MINALIGN. ARCH_KMALLOC_MINALIGN is used only in the internals of slab/slob/slub (except for crypto). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:21 -07:00
Anton Vorontsov	cd1542c819	edac: mpc85xx: add support for new MPCxxx/Pxxxx EDAC controllers Simply add proper IDs into the device table. Signed-off-by: Anton Vorontsov <avorontsov@mvista.com> Cc: Scott Wood <scottwood@freescale.com> Cc: Peter Tyser <ptyser@xes-inc.com> Cc: Dave Jiang <djiang@mvista.com> Cc: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:21 -07:00
Kulikov Vasiliy	b425d5c82d	edac: i5400: improve handling of pci_enable_device() return value -EIO is not the only error code that pci_enable_device() may return, also the set of errors can be enhanced in future. We should compare return code with zero, not with concrete error value. Signed-off-by: Kulikov Vasiliy <segooon@gmail.com> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Cc: Jeff Roberson <jroberson@jroberson.net> Cc: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:21 -07:00
Kulikov Vasiliy	44aa80f005	edac: i5000: improve handling of pci_enable_device() return value -EIO is not the only error code that pci_enable_device() may return, also the set of errors can be enhanced in future. We should compare return code with zero, not with concrete error value. Signed-off-by: Kulikov Vasiliy <segooon@gmail.com> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Cc: Jeff Roberson <jroberson@jroberson.net> Cc: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:21 -07:00
Christoph Egger	bd1688dcdf	edac: add wissing pieces from MPC85xx -> FSL_SOC_BOOKE In `5753c082f6` ("powerpc/85xx: Kconfig cleanup") menuconfig MPC85xx was replaced by FSL_SOC_BOOKE but some references insider the code were not adjusted accordingly. This patch adresses these missing pieces. Signed-off-by: Christoph Egger <siccegge@cs.fau.de> Cc: Doug Thompson <dougthompson@xmission.com> Cc: Peter Tyser <ptyser@xes-inc.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Scott Wood <scottwood@freescale.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:20 -07:00
Oleg Nesterov	c52b0b91ba	pids: alloc_pidmap: remove the unnecessary boundary checks alloc_pidmap() calculates max_scan so that if the initial offset != 0 we inspect the first map->page twice. This is correct, we want to find the unused bits < offset in this bitmap block. Add the comment. But it doesn't make any sense to stop the find_next_offset() loop when we are looking into this map->page for the second time. We have already already checked the bits >= offset during the first attempt, it is fine to do this again, no matter if we succeed this time or not. Remove this hard-to-understand code. It optimizes the very unlikely case when we are going to fail, but slows down the more likely case. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Salman Qazi <sqazi@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:20 -07:00
Salman	5fdee8c4a5	pids: fix a race in pid generation that causes pids to be reused immediately A program that repeatedly forks and waits is susceptible to having the same pid repeated, especially when it competes with another instance of the same program. This is really bad for bash implementation. Furthermore, many shell scripts assume that pid numbers will not be used for some length of time. Race Description: A B // pid == offset == n // pid == offset == n + 1 test_and_set_bit(offset, map->page) test_and_set_bit(offset, map->page); pid_ns->last_pid = pid; pid_ns->last_pid = pid; // pid == n + 1 is freed (wait()) // Next fork()... last = pid_ns->last_pid; // == n pid = last + 1; Code to reproduce it (Running multiple instances is more effective): #include <errno.h> #include <sys/types.h> #include <sys/wait.h> #include <unistd.h> #include <stdio.h> #include <stdlib.h> // The distance mod 32768 between two pids, where the first pid is expected // to be smaller than the second. int PidDistance(pid_t first, pid_t second) { return (second + 32768 - first) % 32768; } int main(int argc, char* argv[]) { int failed = 0; pid_t last_pid = 0; int i; printf("%d\n", sizeof(pid_t)); for (i = 0; i < 10000000; ++i) { if (i % 32786 == 0) printf("Iter: %d\n", i/32768); int child_exit_code = i % 256; pid_t pid = fork(); if (pid == -1) { fprintf(stderr, "fork failed, iteration %d, errno=%d", i, errno); exit(1); } if (pid == 0) { // Child exit(child_exit_code); } else { // Parent if (i > 0) { int distance = PidDistance(last_pid, pid); if (distance == 0 \|\| distance > 30000) { fprintf(stderr, "Unexpected pid sequence: previous fork: pid=%d, " "current fork: pid=%d for iteration=%d.\n", last_pid, pid, i); failed = 1; } } last_pid = pid; int status; int reaped = wait(&status); if (reaped != pid) { fprintf(stderr, "Wait return value: expected pid=%d, " "got %d, iteration %d\n", pid, reaped, i); failed = 1; } else if (WEXITSTATUS(status) != child_exit_code) { fprintf(stderr, "Unexpected exit status %x, iteration %d\n", WEXITSTATUS(status), i); failed = 1; } } } exit(failed); } Thanks to Ted Tso for the key ideas of this implementation. Signed-off-by: Salman Qazi <sqazi@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:20 -07:00
Alexey Dobriyan	9c867fbe06	partitions: fix sometimes unreadable partition strings Fix this garbage happening quite often: ==> sda: scsi 3:0:0:0: CD-ROM TOSHIBA ==> sda1 sda2 sda3 sda4 <sr0: scsi3-mmc drive: 24x/24x writer dvd-ram cd/rw xa/form2 cdda tray ^^^ Uniform CD-ROM driver Revision: 3.20 sr 3:0:0:0: Attached scsi CD-ROM sr0 ==> sda5 sda6 sda7 > Make "sda: sda1 ..." lines actually lines. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:20 -07:00
Jens Rottmann	ecd6269174	cs5535-mfgpt: reuse timers that have never been set up The MFGPT hardware may be set up only once, therefore cs5535_mfgpt_free_timer() didn't re-set the timer's "avail" bit. However if a timer is freed before it has actually been in use then it may be made available again. Signed-off-by: Jens Rottmann <JRottmann@LiPPERTEmbedded.de> Acked-by: Andres Salomon <dilinger@queued.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jordan Crouse <jordan@cosmicpenguin.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:20 -07:00
Julia Lawall	e73790a57a	drivers/char/n_gsm.c: add missing spin_unlock_irqrestore Add a spin_unlock_irqrestore missing on the error path. Converting the return to break leads to the spin_unlock_irqrestore at the end of the function. The semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression E1; @@ * spin_lock_irqsave(E1,...); <+... when != E1 if (...) { ... when != E1 * return ...; } ...+> * spin_unlock_irqrestore(E1,...); // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: Alan Cox <alan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:20 -07:00
Yinghai Lu	7bb671e3d0	ipmi: print info for spmi and smbios paths like acpi and pci Print out the reg spacing and size for spmi and smbios so BIOS developers can make them consistent. Also remove extra PFX on the duplicating path. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Corey Minyard <minyard@acm.org> Cc: Matthew Garrett <mjg@redhat.com> Cc: Len Brown <len.brown@intel.com> Cc: Myron Stowe <myron.stowe@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:20 -07:00
Yinghai Lu	7faefea66a	ipmi: fix memleaking for add_smi when duplicating happen Free the temporary info struct when we have duplicated ones. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Corey Minyard <minyard@acm.org> Cc: Matthew Garrett <mjg@redhat.com> Cc: Len Brown <len.brown@intel.com> Cc: Myron Stowe <myron.stowe@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:20 -07:00
Justin P. Mattock	f46c77c283	drivers/char/ipmi/ipmi_si_intf.c: fix warning: variable 'addr_space' set but not used Fix a warning message generated by GCC, and also updates a web address pointing to a pdf containing information. CC [M] drivers/char/ipmi/ipmi_si_intf.o drivers/char/ipmi/ipmi_si_intf.c: In function 'try_init_spmi': drivers/char/ipmi/ipmi_si_intf.c:2016:8: warning: variable 'addr_space' set but not used Signed-off-by: Sergey V. <sftp.mtuci@gmail.com> Signed-off-by: Justin P. Mattock <justinmattock@gmail.com> Acked-by: Corey Minyard <minyard@acm.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:20 -07:00
Robert P. J. Day	cfbef3cb16	procfs: simplify conditional processing of fs/proc.o. Since the entire fs/proc directory is conditionally included based on CONFIG_PROC_FS, it's redundant to check that same variable within that directory. Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:20 -07:00
Nathan Lynch	a2a20c412c	signalfd: fill in ssi_int for posix timers and message queues If signalfd is used to consume a signal generated by a POSIX interval timer or POSIX message queue, the ssi_int field does not reflect the data (sigevent->sigev_value) supplied to timer_create(2) or mq_notify(3). (The ssi_ptr field, however, is filled in.) This behavior differs from signalfd's treatment of sigqueue-generated signals -- see the default case in signalfd_copyinfo. It also gives results that differ from the case when a signal is handled conventionally via a sigaction-registered handler. So, set signalfd_siginfo->ssi_int in the remaining cases (__SI_TIMER, __SI_MESGQ) where ssi_ptr is set. akpm: a non-back-compatible change. Merge into -stable to minimise the number of kernels which are in the field and which miss this feature. Signed-off-by: Nathan Lynch <ntl@pobox.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:20 -07:00
Oleg Nesterov	c7e49c1488	ptrace: optimize exit_ptrace() for the likely case exit_ptrace() takes tasklist_lock unconditionally. We need this lock to avoid the race with ptrace_traceme(), it acts as a barrier. Change its caller, forget_original_parent(), to call exit_ptrace() under tasklist_lock. Change exit_ptrace() to drop and reacquire this lock if needed. This allows us to add the fastpath list_empty(ptraced) check. In the likely no-tracees case exit_ptrace() just returns and we avoid the lock() + unlock() sequence. "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> suggested to add this check, and he reports that this change adds about 11% improvement in some tests. Suggested-and-tested-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:19 -07:00
KOSAKI Motohiro	13d7e3a2db	memcg: convert to use zone_to_nid() from bare zone->zone_pgdat->node_id We have zone_to_nid(). this patch convert all existing users of zone->zone_pgdat->node_id. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Nishimura Daisuke <d-nishimura@mtf.biglobe.ne.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:19 -07:00
KOSAKI Motohiro	00918b6ab8	memcg: remove nid and zid argument from mem_cgroup_soft_limit_reclaim() mem_cgroup_soft_limit_reclaim() has zone, nid and zid argument. but nid and zid can be calculated from zone. So remove it. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Mel Gorman <mel@csn.ul.ie> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Nishimura Daisuke <d-nishimura@mtf.biglobe.ne.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:19 -07:00
KOSAKI Motohiro	14fec79680	memcg: mem_cgroup_shrink_node_zone() doesn't need sc.nodemask Currently mem_cgroup_shrink_node_zone() call shrink_zone() directly. thus it doesn't need to initialize sc.nodemask because shrink_zone() doesn't use it at all. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Mel Gorman <mel@csn.ul.ie> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Nishimura Daisuke <d-nishimura@mtf.biglobe.ne.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:19 -07:00
KOSAKI Motohiro	da280d636b	memcg: kill unnecessary initialization in mem_cgroup_shrink_node_zone() sc.nr_reclaimed and sc.nr_scanned have already been initialized few lines above "struct scan_control sc = {}" statement. So, This patch remove this unnecessary code. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Nishimura Daisuke <d-nishimura@mtf.biglobe.ne.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:19 -07:00
KOSAKI Motohiro	b8f5c5664d	memcg: sc.nr_to_reclaim should be initialized Currently, mem_cgroup_shrink_node_zone() initialize sc.nr_to_reclaim as 0. It mean shrink_zone() only scan 32 pages and immediately return even if it doesn't reclaim any pages. This patch fixes it. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Mel Gorman <mel@csn.ul.ie> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Nishimura Daisuke <d-nishimura@mtf.biglobe.ne.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:19 -07:00
KAMEZAWA Hiroyuki	f75ca96203	memcg: avoid css_get() Now, memory cgroup increments css(cgroup subsys state)'s reference count per a charged page. And the reference count is kept until the page is uncharged. But this has 2 bad effect. 1. Because css_get/put calls atomic_inc()/dec, heavy call of them on large smp will not scale well. 2. Because css's refcnt cannot be in a state as "ready-to-release", cgroup's notify_on_release handler can't work with memcg. 3. css's refcnt is atomic_t, it means smaller than 32bit. Maybe too small. This has been a problem since the 1st merge of memcg. This is a trial to remove css's refcnt per a page. Even if we remove refcnt, pre_destroy() does enough synchronization as - check res->usage == 0. - check no pages on LRU. This patch removes css's refcnt per page. Even after this patch, at the 1st look, it seems css_get() is still called in try_charge(). But the logic is. - If a memcg of mm->owner is cached one, consume_stock() will work. At success, return immediately. - If consume_stock returns false, css_get() is called and go to slow path which may be blocked. At the end of slow path, css_put() is called and restart from the start if necessary. So, in the fast path, we don't call css_get() and can avoid access to shared counter. This patch can make the most possible case fast. Here is a result of multi-threaded page fault benchmark. [Before] 25.32% multi-fault-all [kernel.kallsyms] [k] clear_page_c 9.30% multi-fault-all [kernel.kallsyms] [k] _raw_spin_lock_irqsave 8.02% multi-fault-all [kernel.kallsyms] [k] try_get_mem_cgroup_from_mm <=====(*) 7.83% multi-fault-all [kernel.kallsyms] [k] down_read_trylock 5.38% multi-fault-all [kernel.kallsyms] [k] __css_put 5.29% multi-fault-all [kernel.kallsyms] [k] __alloc_pages_nodemask 4.92% multi-fault-all [kernel.kallsyms] [k] _raw_spin_lock_irq 4.24% multi-fault-all [kernel.kallsyms] [k] up_read 3.53% multi-fault-all [kernel.kallsyms] [k] css_put 2.11% multi-fault-all [kernel.kallsyms] [k] handle_mm_fault 1.76% multi-fault-all [kernel.kallsyms] [k] __rmqueue 1.64% multi-fault-all [kernel.kallsyms] [k] __mem_cgroup_commit_charge [After] 28.41% multi-fault-all [kernel.kallsyms] [k] clear_page_c 10.08% multi-fault-all [kernel.kallsyms] [k] _raw_spin_lock_irq 9.58% multi-fault-all [kernel.kallsyms] [k] down_read_trylock 9.38% multi-fault-all [kernel.kallsyms] [k] _raw_spin_lock_irqsave 5.86% multi-fault-all [kernel.kallsyms] [k] __alloc_pages_nodemask 5.65% multi-fault-all [kernel.kallsyms] [k] up_read 2.82% multi-fault-all [kernel.kallsyms] [k] handle_mm_fault 2.64% multi-fault-all [kernel.kallsyms] [k] mem_cgroup_add_lru_list 2.48% multi-fault-all [kernel.kallsyms] [k] __mem_cgroup_commit_charge Then, 8.02% of try_get_mem_cgroup_from_mm() disappears because this patch removes css_tryget() in it. (But yes, this is an extreme case.) Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:19 -07:00
KAMEZAWA Hiroyuki	158e0a2d1b	memcg: use find_lock_task_mm() in memory cgroups oom When the OOM killer scans task, it check a task is under memcg or not when it's called via memcg's context. But, as Oleg pointed out, a thread group leader may have NULL ->mm and task_in_mem_cgroup() may do wrong decision. We have to use find_lock_task_mm() in memcg as generic OOM-Killer does. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Reviewed-by: Minchan Kim <minchan.kim@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:19 -07:00
Daisuke Nishimura	73045c47b6	memcg: remove mem from arg of charge_common mem_cgroup_charge_common() is always called with @mem = NULL, so it's meaningless. This patch removes it. Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:19 -07:00
Daisuke Nishimura	bd0d24bfe8	memcg: remove redundant code - try_get_mem_cgroup_from_mm() calls rcu_read_lock/unlock by itself, so we don't have to call them in task_in_mem_cgroup(). - *mz is not used in __mem_cgroup_uncharge_common(). - we don't have to call lookup_page_cgroup() in mem_cgroup_end_migration() after we've cleared PCG_MIGRATION of @oldpage. - remove empty comment. - remove redundant empty line in mem_cgroup_cache_charge(). Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:18 -07:00
KAMEZAWA Hiroyuki	2bd9bb206b	memcg: clean up waiting move acct Now, for checking a memcg is under task-account-moving, we do css_tryget() against mc.to and mc.from. But this is just complicating things. This patch makes the check easier. This patch adds a spinlock to move_charge_struct and guard modification of mc.to and mc.from. By this, we don't have to think about complicated races arount this not-critical path. [balbir@linux.vnet.ibm.com: don't crash on a null memcg being passed] Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:18 -07:00
KAMEZAWA Hiroyuki	4b53433468	memcg: clean up try_charge main loop mem_cgroup_try_charge() has a big loop in it and seems to be hard to read. Most of routines are for slow path. This patch moves codes out from the loop and make it clear what's done. Summary: - refactoring a function to detect a memcg is under acccount move or not. - refactoring a function to wait for the end of moving task acct. - refactoring a main loop('s slow path) as a function and make it clear why we retry or quit by return code. - add fatal_signal_pending() check for bypassing charge loops. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:18 -07:00
KAMEZAWA Hiroyuki	65e0e81166	memcg: remove experimental from swap account config It's 11 months since we changed swap_map[] to indicates SWAP_HAS_CACHE. Since that, memcg's swap accounting has been very stable and it seems it can be maintained. So, I'd like to remove EXPERIMENTAL from the config. Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:18 -07:00
Chris Wright	b7300b78d1	blkdev: cgroup whitelist permission fix The cgroup device whitelist code gets confused when trying to grant permission to a disk partition that is not currently open. Part of blkdev_open() includes __blkdev_get() on the whole disk. Basically, the only ways to reliably allow a cgroup access to a partition on a block device when using the whitelist are to 1) also give it access to the whole block device or 2) make sure the partition is already open in a different context. The patch avoids the cgroup check for the whole disk case when opening a partition. Addresses https://bugzilla.redhat.com/show_bug.cgi?id=589662 Signed-off-by: Chris Wright <chrisw@sous-sol.org> Acked-by: Serge E. Hallyn <serue@us.ibm.com> Tested-by: Serge E. Hallyn <serue@us.ibm.com> Reported-by: Vivek Goyal <vgoyal@redhat.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: "Daniel P. Berrange" <berrange@redhat.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:18 -07:00
Dan Carpenter	e400c28524	cgroups: save space for the terminator The original code didn't leave enough space for a NULL terminator. These strings are copied with strcpy() into fixed length buffers in cgroup_root_from_opts(). Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Serge E. Hallyn <serge@hallyn.com> Reviewd-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Paul Menage <menage@google.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Ben Blum <bblum@andrew.cmu.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:18 -07:00
Randy Dunlap	2b24706a79	Documentation/padata.txt: fix typos etc. Fix typos & grammar. Use CPU instead of cpu in text. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:18 -07:00
Huang Shijie	a3038ea5e2	Documentation/00-INDEX: remove reference to exception.txt The exception.txt has been removed from the Documentation directory. So update the index file for it. Signed-off-by: Huang Shijie <shijie8@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:18 -07:00
Ben Hutchings	a435600e5b	docbook: need xmldoclinks for all doc types $ rm -rf build $ mkdir build $ cp .config build $ make O=build htmldocs ... xmlto: linux-2.6/build/Documentation/DocBook/media.xml does not validate (status 3) xmlto: Fix document syntax or use --skip-validation option linux-2.6/build/Documentation/DocBook/media.xml:4: warning: failed to load external entity "linux-2.6/build/Documentation/DocBook/media-entities.tmpl" We need the xmldoclinks built for any document types built from the XML sources. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Acked-by: Andy Whitcroft <apw@canonical.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:18 -07:00
Joe Perches	79872e07ab	Documentation/networking/wavelan.txt: deleted, not in tree Commit `1d794e3b35` ("Staging: wavelan: delete the driver") removed the source, so remove the documentation as well. Signed-off-by: Joe Perches <joe@perches.com> Cc: Jean Tourrilhes <jt@hpl.hp.com> Acked-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:18 -07:00
Randy Dunlap	b6d676db35	mtd/nand_base: fix kernel-doc warnings & typos Fix mtd/nand_base.c kernel-doc warnings and typos. Warning(drivers/mtd/nand/nand_base.c:893): No description found for parameter 'mtd' Warning(drivers/mtd/nand/nand_base.c:893): No description found for parameter 'ofs' Warning(drivers/mtd/nand/nand_base.c:893): No description found for parameter 'len' Warning(drivers/mtd/nand/nand_base.c:893): No description found for parameter 'invert' Warning(drivers/mtd/nand/nand_base.c:930): No description found for parameter 'mtd' Warning(drivers/mtd/nand/nand_base.c:930): No description found for parameter 'ofs' Warning(drivers/mtd/nand/nand_base.c:930): No description found for parameter 'len' Warning(drivers/mtd/nand/nand_base.c:987): No description found for parameter 'mtd' Warning(drivers/mtd/nand/nand_base.c:987): No description found for parameter 'ofs' Warning(drivers/mtd/nand/nand_base.c:987): No description found for parameter 'len' Warning(drivers/mtd/nand/nand_base.c:2087): No description found for parameter 'len' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:18 -07:00
Randy Dunlap	71740c423c	fusion: fix kernel-doc warnings Fix (delete) empty kernel-doc lines/warnings: Warning(drivers/message/fusion/mptbase.c:6916): bad line: Warning(drivers/message/fusion/mptbase.c:7060): bad line: Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Eric Moore <Eric.Moore@lsi.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:12 -07:00
Changli Gao	b3397ad544	reiserfs: remove unused local `wait' Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:12 -07:00

... 3 4 5 6 7 ...

209300 Commits