The kernel's implementation of notifier chains is unsafe. There is no
protection against entries being added to or removed from a chain while the
chain is in use. The issues were discussed in this thread:
http://marc.theaimsgroup.com/?l=linux-kernel&m=113018709002036&w=2
We noticed that notifier chains in the kernel fall into two basic usage
classes:
"Blocking" chains are always called from a process context
and the callout routines are allowed to sleep;
"Atomic" chains can be called from an atomic context and
the callout routines are not allowed to sleep.
We decided to codify this distinction and make it part of the API. Therefore
this set of patches introduces three new, parallel APIs: one for blocking
notifiers, one for atomic notifiers, and one for "raw" notifiers (which is
really just the old API under a new name). New kinds of data structures are
used for the heads of the chains, and new routines are defined for
registration, unregistration, and calling a chain. The three APIs are
explained in include/linux/notifier.h and their implementation is in
kernel/sys.c.
With atomic and blocking chains, the implementation guarantees that the chain
links will not be corrupted and that chain callers will not get messed up by
entries being added or removed. For raw chains the implementation provides no
guarantees at all; users of this API must provide their own protections. (The
idea was that situations may come up where the assumptions of the atomic and
blocking APIs are not appropriate, so it should be possible for users to
handle these things in their own way.)
There are some limitations, which should not be too hard to live with. For
atomic/blocking chains, registration and unregistration must always be done in
a process context since the chain is protected by a mutex/rwsem. Also, a
callout routine for a non-raw chain must not try to register or unregister
entries on its own chain. (This did happen in a couple of places and the code
had to be changed to avoid it.)
Since atomic chains may be called from within an NMI handler, they cannot use
spinlocks for synchronization. Instead we use RCU. The overhead falls almost
entirely in the unregister routine, which is okay since unregistration is much
less frequent that calling a chain.
Here is the list of chains that we adjusted and their classifications. None
of them use the raw API, so for the moment it is only a placeholder.
ATOMIC CHAINS
-------------
arch/i386/kernel/traps.c: i386die_chain
arch/ia64/kernel/traps.c: ia64die_chain
arch/powerpc/kernel/traps.c: powerpc_die_chain
arch/sparc64/kernel/traps.c: sparc64die_chain
arch/x86_64/kernel/traps.c: die_chain
drivers/char/ipmi/ipmi_si_intf.c: xaction_notifier_list
kernel/panic.c: panic_notifier_list
kernel/profile.c: task_free_notifier
net/bluetooth/hci_core.c: hci_notifier
net/ipv4/netfilter/ip_conntrack_core.c: ip_conntrack_chain
net/ipv4/netfilter/ip_conntrack_core.c: ip_conntrack_expect_chain
net/ipv6/addrconf.c: inet6addr_chain
net/netfilter/nf_conntrack_core.c: nf_conntrack_chain
net/netfilter/nf_conntrack_core.c: nf_conntrack_expect_chain
net/netlink/af_netlink.c: netlink_chain
BLOCKING CHAINS
---------------
arch/powerpc/platforms/pseries/reconfig.c: pSeries_reconfig_chain
arch/s390/kernel/process.c: idle_chain
arch/x86_64/kernel/process.c idle_notifier
drivers/base/memory.c: memory_chain
drivers/cpufreq/cpufreq.c cpufreq_policy_notifier_list
drivers/cpufreq/cpufreq.c cpufreq_transition_notifier_list
drivers/macintosh/adb.c: adb_client_list
drivers/macintosh/via-pmu.c sleep_notifier_list
drivers/macintosh/via-pmu68k.c sleep_notifier_list
drivers/macintosh/windfarm_core.c wf_client_list
drivers/usb/core/notify.c usb_notifier_list
drivers/video/fbmem.c fb_notifier_list
kernel/cpu.c cpu_chain
kernel/module.c module_notify_list
kernel/profile.c munmap_notifier
kernel/profile.c task_exit_notifier
kernel/sys.c reboot_notifier_list
net/core/dev.c netdev_chain
net/decnet/dn_dev.c: dnaddr_chain
net/ipv4/devinet.c: inetaddr_chain
It's possible that some of these classifications are wrong. If they are,
please let us know or submit a patch to fix them. Note that any chain that
gets called very frequently should be atomic, because the rwsem read-locking
used for blocking chains is very likely to incur cache misses on SMP systems.
(However, if the chain's callout routines may sleep then the chain cannot be
atomic.)
The patch set was written by Alan Stern and Chandra Seetharaman, incorporating
material written by Keith Owens and suggestions from Paul McKenney and Andrew
Morton.
[jes@sgi.com: restructure the notifier chain initialization macros]
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] 3030/2: fix permission check in the obscur cmpxchg syscall
[ARM] nommu: rename compressed/head.S symbols to a new style
[ARM] select TLS_REG_EMUL and NEEDS_SYSCALL_FOR_CMPXCHG
[ARM] nommu: Move hardware page table definitions to pgtable-hwdef.h
[ARM] Move read of processor ID out of lookup_processor_type()
[ARM] Fix typo in tlbflush.h
[ARM] noMMU: removes TLB codes in nommu mode
[ARM] noMMU: block sys_fork in nommu mode
[ARM] 3399/1: Fix link problem when CONFIG_PRINTK is disabled
[ARM] 3398/1: Fix the VFP registers loading/storing base address
[ARM] 3397/1: AT91RM9200 Header update
[ARM] 3385/1: Battery support for sharp zaurus sl-5500 (collie)
[ARM] SMP: don't set cpu_*_map in smp_prepare_boot_cpu
include/linux/clk.h is betraying its ARM origins
[ARM] Move enable_irq and disable_irq to assembler.h
[ARM] 3391/1: use PLAT8250_DEV_PLATFORM{,1} for platform device id instead of 0/1
Patch from Lennert Buytenhek
Add a PLAT8250_DEV_PLATFORM2, and convert the two ixdp2x01 CPLD serial
ports to use platform serial devices with ids PLAT8250_DEV_PLATFORM[12].
(The on-chip xscale UART is PLAT8250_DEV_PLATFORM, id #0.)
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Nicolas Pitre
Quoting RMK:
|pte_write() just says that the page _may_ be writable. It doesn't say
|that the MMU is programmed to allow writes. If pte_dirty() doesn't
|return true, that means that the page is _not_ writable from userspace.
|If you write to it from kernel mode (without using put_user) you'll
|bypass the MMU read-only protection and may end up writing to a page
|owned by two separate processes.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Malcolm Parsons
Printking a backtrace requires printk, so disable backtrace code
when printk is disabled.
Without this patch, a kernel with CONFIG_PRINTK disabled does not link:
arch/arm/lib/lib.a(backtrace.o): In function `c_backtrace':
arch/arm/lib/backtrace.S:(.text+0x108): undefined reference to `printk'
arch/arm/lib/backtrace.S:(.text+0x11c): undefined reference to `printk'
arch/arm/lib/lib.a(backtrace.o):(.fixup+0x8): undefined reference to `printk'
Signed-off-by: Malcolm Parsons <malcolm.parsons@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Catalin Marinas
The current VFP code corrupts the VFP registers (including the control
ones) if more than one floating point application is executed at the same
time. This patch fixes the updating of the load/store base addresses for
the VFP registers.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Pavel Machek
This adds support for battery reading on collie. Collie slowly charges
battery even with charging disabled, so I did not yet enable fast
charge.
Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The recent addition of boot_cpu_init() implements the initialisation
of the online, present and possible cpu maps for the boot CPU, so
there is no reason to duplicate this in the architecture
smp_prepare_boot_cpu() hook.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* master.kernel.org:/pub/scm/linux/kernel/git/sam/kbuild: (46 commits)
kbuild: remove obsoleted scripts/reference_* files
kbuild: fix make help & make *pkg
kconfig: fix time ordering of writes to .kconfig.d and include/linux/autoconf.h
Kconfig: remove the CONFIG_CC_ALIGN_* options
kbuild: add -fverbose-asm to i386 Makefile
kbuild: clean-up genksyms
kbuild: Lindent genksyms.c
kbuild: fix genksyms build error
kbuild: in makefile.txt note that Makefile is preferred name for kbuild files
kbuild: replace PHONY with FORCE
kbuild: Fix bug in crc symbol generating of kernel and modules
kbuild: change kbuild to not rely on incorrect GNU make behavior
kbuild: when warning symbols exported twice now tell user this is the problem
kbuild: fix make dir/file.xx when asm symlink is missing
kbuild: in the section mismatch check try harder to find symbols
kbuild: fix section mismatch check for unwind on IA64
kbuild: kill false positives from section mismatch warnings for powerpc
kbuild: kill trailing whitespace in modpost & friends
kbuild: small update of allnoconfig description
kbuild: make namespace.pl CROSS_COMPILE happy
...
Trivial conflict in arch/ppc/boot/Makefile manually fixed up
This adds missing bits of collie (sharp sl-5500) PCMCIA support and
MFD support.
Signed-off-by: Pavel Machek <pavel@suse.cz>
Acked-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch renames symbols to a new style to prepare mpu support
code merging. e.g. __armv4_cache_on --> __armv4_mmu_cache_on
Signed-off-by: Hyok S. Choi <hyok.choi@samsung.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
5d25ac038a broke VFP builds due to
enable_irq not being defined as an assembly macro. Move it to
assembler.h so everyone can use it.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Lennert Buytenhek
This patch changes iop3xx and omap2 and to use PLAT8250_DEV_PLATFORM{,1}
as platform device id instead of just hardcoding 0/1 directly.
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Erik Hovland
I found a typo and what seems to be a run-on sentence in
arch/arm/common/dmabounce.c
This patch corrects both.
Signed-off-by: Erik Hovland <erik@hovland.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Andrew Victor
This patch includes a few changes to the clock support on the
AT91RM9200.
1. Added definitions for Ethernet, MMC, TWI, USARTs, and SPI peripheral
clocks.
2. Replaced some hard-coded hex values with the text definitions in
at91rm9200_sys.h.
3. If the USB96M bit is set for PLLB, then the rate of PLLB is not
affected but only the USB Host/Device clocks which are derived from it.
Issue reported by Sergei Sharonov.
Signed-off-by: Andrew Victor <andrew@sanpeople.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Andrew Victor
If the timer interrupt is ever significantly delayed (or after the
system was suspended), the system could spin incrementing the time for
too long.
The fix is to replace the "do {} while" with a "while {}".
Orignal patch by Savin Zlobec and Peter Menzebach.
Signed-off-by: Andrew Victor <andrew@sanpeople.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Lennert Buytenhek
Unify the five existing ixp2000 defconfigs into one defconfig.
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Lennert Buytenhek
ixp2000 used to initially mark GPIO interrupts as invalid, and not
mark them valid until set_irq_type() was called, but this doesn't
work if you want to use request_irq() with the SA_TRIGGER_* flags.
So, just mark the GPIO interrupts valid from the beginning. We
configure GPIOs as inputs when set_irq_type() is called anyway, so
this shouldn't be a problem.
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
set_page_count usage outside mm/ is limited to setting the refcount to 1.
Remove set_page_count from outside mm/, and replace those users with
init_page_count() and set_page_refcounted().
This allows more debug checking, and tighter control on how code is allowed
to play around with page->_count.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Have an explicit mm call to split higher order pages into individual pages.
Should help to avoid bugs and be more explicit about the code's intention.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Only issue a "nobody cared" warning after 99900 spurious interrupts.
This avoids the occasional spurious interrupt causing warnings, as
per x86.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We need this to be zero initialised. Since this is an array, use kcalloc
rather than kzalloc or kmalloc.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
Add Simtec Osiris to the default build, and enable the
USB-OHCI section by default.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
Fix the build of arch/arm/mach-s3c2410/mach-osiris.c
and fix the warnings from sparse.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Lennert Buytenhek
Add GPIO interrupt support for the first 16 GPIO lines (port A
and B.)
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
Add USB bus clock definition for 48MHz fed to OHCI and gadget cores
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
Add set_rate methods for the extra clocks on the S3C2440
and add the camera UPLL clock source
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
Add support for clk_set_rate and clk_round_rate to the
s3c2410 clock implementation
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Lennert Buytenhek
Move the uengine loader from arch/arm/mach-ixp2000 to arch/arm/common
so that ixp23xx can use it too.
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Lennert Buytenhek
Add ep93xx defconfig.
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Lennert Buytenhek
Add support for setting the direction of and getting/setting the
value of the 64 GPIO lines.
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Lennert Buytenhek
This patch adds support for the Cirrus ep93xx series of CPUs. The
ep93xx is an ARM920T based CPU with two VICs, PL010 based UARTs,
IrDA, MaverickCrunch floating point coprocessor, between 24 and 64
GPIOs, ethernet, OHCI USB and, depending on the model, pcmcia, raster
engine, graphics accelerator, IDE controller and a bunch of other
stuff.
This patch adds the core ep93xx support code, and support for the
Glomation GESBC-9312-sx and the Technologic Systems TS-72xx SBCs.
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Alessandro Zummo
ixp4xx_config_irq did not configure the gpio line
as an input.
As an added bonus, the irq2gpio array has been converted
from int to char.
Signed-off-by: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Peter Teichmann
Currently, if the kernels HZ value is greater than 100, delays with the udelay function are too short. This can cause trouble for instance with the zd1201 usb wlan driver.
This patch suggests a solution that keeps the overhead small and maintains (hopefully) sufficient resolution.
Signed-off-by: Peter Teichmann
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Deepak Saxena
This patch adds support for Intel's IXDP28x5 platform. This
is just and IXDP2801 with a new CPU rev but the bootloader
has been updated to reflect a new machine ID so we just build
support for it by default when we build IXDP2801.
Signed-off-by: Deepak Saxena <dsaxena@plexity.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
Add enable and set_parent calls for the dclk
and clkout clocks.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
Add clk_set_parent() call to clock code
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
Move the UPLL clock registration to the central
clock file, and add an enable method
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
Add selection for timer code for the Simtec Osiris
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>