Commit Graph

120074 Commits

Author SHA1 Message Date
Cornelia Huck
90ed2b692f [S390] cio: Dont fail probe for I/O subchannels.
If we fail the probe for an I/O subchannel, we won't be able
to unregister it again since there are no sch_event()
callbacks for unbound subchannels. Just succeed the probe in
any case and schedule unregistering the subchannel.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:08 +01:00
Cornelia Huck
5fb6b8544d [S390] cio: Only register ccw_device for registered subchannel.
There is a race between io_subchannel_register() and
io_subchannel_sch_event() which may cause a subchannel to be
unregistered because it is no longer operational before
io_subchannel_register() had run. We need to check whether the
subchannel is still registered before the ccw device can be
registered and just bail out if it is not.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:08 +01:00
Cornelia Huck
6eff208f47 [S390] cio: Fix I/O subchannel refcounting.
Subchannel refcounting was incorrect in some places, especially
a refcount was missing when ccw_device_call_sch_unregister()
was called and the refcount was not correctly switched after
moving devices.

Fix this by establishing the following rules:
- The ccw_device obtains a reference on its parent subchannel
  when dev.parent is set and gives it up in its release
  function. This is needed because we need a parent reference
  for correct refcounting even before the ccw device is (if at
  all) registered.
- When calling device_move(), obtain a reference on the new
  subchannel before moving the ccw device and give up the
  reference on the old parent after moving. This brings the
  refcount in line with the first rule.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:08 +01:00
Cornelia Huck
9cd6742197 [S390] cio: Fix reference counting for online/offline.
The current code attempts to get an extra reference count
for online devices by doing a get_device() in ccw_device_online()
and a put_device() in ccw_device_done(). However, this
- incorrectly obtains an extra reference for disconnected
  devices becoming available again (since they are already
  online)
- needs special checks for css_init_done in order to handle
  the console device
- is not obvious and
- may incorretly drop a reference count in ccw_device_done() if
  that function is called after path verification for a device
  that just became not operational.

So let's just get the reference in ccw_device_set_online() and
drop it in ccw_device_set_offline(). (Unfortunately, we still
need the special case in io_subchannel_probe().)

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:07 +01:00
Cornelia Huck
97166f52fc [S390] cio: Put referernce on correct device after moving.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:07 +01:00
Peter Oberparleiter
c619d4223e [S390] cio: fix ccwgroup online vs. ungroup race condition
Ensure atomicity of ungroup operation to prevent concurrent ungroup
and online processing which may lead to use-after-release situations.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:06 +01:00
Sebastian Ott
111e95a4ca [S390] cio: move irritating comment.
Due to former patches a comment and device id initialization were
split from the addressed function call in io_subchannel_probe.

Move it back to where it belongs.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:06 +01:00
Peter Oberparleiter
d7b604891b [S390] cio: update sac values
Values for the sac field have changed - update code accordingly.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:06 +01:00
Heiko Carstens
191fd44c11 [S390] cio: get rid of compile warning
Move cio_tpi() to the rest of the CONFIG_CCW_CONSOLE functions to
get rid of this one:

drivers/s390/cio/cio.c:115: warning: 'cio_tpi' defined but not used

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:05 +01:00
Martin Schwidefsky
547e3cec4f [S390] remove ptrace warning on 31 bit.
A kernel compile on 31 bit gives the following warnings in ptrace.c:

arch/s390/kernel/ptrace.c: In function 'peek_user':
arch/s390/kernel/ptrace.c:207: warning: unused variable 'dummy'
arch/s390/kernel/ptrace.c: In function 'poke_user':
arch/s390/kernel/ptrace.c:315: warning: unused variable 'dummy'

Getting rid of the dummy variables removes the warnings.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:05 +01:00
Martin Schwidefsky
32272a2697 [S390] __page_to_pfn warnings
For CONFIG_SPARSEMEM_VMEMMAP=y on s390 I get warnings like

init/main.c: In function 'start_kernel':
init/main.c:641: warning: format '%08lx' expects type 'long unsigned int', but argument 2 has type 'int'

The warning can be suppressed with a cast to unsigned long in the
CONFIG_SPARSEMEM_VMEMMAP=y version of __page_to_pfn.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:04 +01:00
Hendrik Brueckner
91d5d45ee0 [S390] iucv: Locking free version of iucv_message_(receive|send)
Provide a locking free version of iucv_message_receive and iucv_message_send
that do not call local_bh_enable in a spin_lock_(bh|irqsave)() context.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
2008-12-25 13:39:04 +01:00
Hendrik Brueckner
44a01d5ba8 [S390] s390/hvc_console: z/VM IUCV hypervisor console support
This patch introduces a new hypervisor console (HVC) back-end that provides
terminal access over the z/VM inter-user communication vehicle (IUCV).

The z/VM IUCV communication is independent of the regular tcp/ip network
and allows access even if there is no network connection between two
z/VM guest virtual machines.
The z/VM IUCV hypervisor console back-end helps the user to access a
z/VM guest virtual machine that lacks of network connectivity; and thus,
provides a "full-screen" terminal alternative to 3215/3270 terminal sessions.

Use the hvc_iucv=[0..8] kernel boot parameter to specify the number of
HVC terminals using a z/VM IUCV back-end.

A recent version of the s390-tools package is required to establish a
terminal connection to a z/VM IUCV hypervisor console back-end.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:03 +01:00
Heiko Carstens
5d360a75f8 [S390] ftrace: function tracer backend for s390
This implements just the basic function tracer (_mcount) backend for s390.
The dynamic variant will come later.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:03 +01:00
Kay Sievers
98df67b324 [S390] struct device - replace bus_id with dev_name(), dev_set_name()
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:03 +01:00
Stefan Haberland
0cd4bd4754 [S390] dasd: call cleanup_cqr with request_queue_lock
__dasd_cleanup_cqr should be called with request_queue_lock held and
__dasd_block_process_erp with queue_lock

Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:02 +01:00
Stefan Haberland
50afd20f8c [S390] dasd: correct sense byte condition for SIM
SIM sense data are always 32 bit sense data so sense byte 27 bit 0
has not to be set.

Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:02 +01:00
Cornelia Huck
faf16aa9b3 [S390] dasd: Use accessors instead of using driver_data directly.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:01 +01:00
Stefan Haberland
2bf373b3e3 [S390] dasd: improve dasd statistics proc interface
For a large number of I/O requests the values were shifted binary.
The shift was not transparent for the user because the shift value
was not displayed. To make this interface more human readable the
values are shifted decimal and the scale factor is displayed.

Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:01 +01:00
Christof Schmitt
bd43a42b7e [S390] zfcp: Report microcode level through service level interface
Register zfcp with the new /proc/service_level interface to report the
FCP microcode level. When the adapter goes offline or a channel path
disappears, zfcp unregisters, since the microcode version might change
and zfcp does not know about it.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:01 +01:00
Martin Schwidefsky
6bcac508fb [S390] service level interface.
Add a new proc interface /proc/service_levels that allows any code
to report a relevant service level, e.g. the microcode level of
devices, the service level of the hypervisor, etc.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:00 +01:00
Jan Glauber
7a0b4cbc7d [S390] qdio: fix error reporting for hipersockets
Hipersocket connections can encounter temporary busy conditions.
In case of the busy bit set we retry the SIGA operation immediatelly.
If the busy condition still persists after 100 ms we fail and report
the error to the upper layer. The second stage retry logic is removed.
In case of ongoing busy conditions the upper layer needs to reset the
connection.

The reporting of a SIGA error is now done synchronously to allow the
network driver to requeue the buffers. Also no error trace is created
for the temporary SIGA errors so the error message view is not flooded.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:00 +01:00
Jan Glauber
50f769df1c [S390] qdio: improve inbound buffer acknowledgement
- Use automatic acknowledgement of incoming buffers in QEBSM mode
- Move ACK for non-QEBSM mode always to the newest buffer to prevent
  a race with qdio_stop_polling
- Remove the polling spinlock, the upper layer drivers return new buffers
  in the same code path and could not run in parallel
- Don't flood the error log in case of no-target-buffer-empty
- In handle_inbound we check if we would overwrite an ACK'ed buffer, if so
  advance the pointer to the oldest ACK'ed buffer so we don't overwrite an
  empty buffer in qdio_stop_polling

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:59 +01:00
Jan Glauber
22f9934767 [S390] qdio: rework debug feature logging
- make qdio_trace a per device view
- remove s390dbf exceptions
- remove CONFIG_QDIO_DEBUG, not needed anymore if we check for the level
  before calling sprintf
- use snprintf for dbf entries
- add start markers to see if the dbf view wrapped
- add a global error view for all queues

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:59 +01:00
Jan Glauber
9a1ce28aeb [S390] qdio: fix compile warning under 31 bit
The QEBSM instructions are only available for CONFIG_64BIT, they are not
used under 31 bit. Make compiler happy about the false positive:

drivers/s390/cio/qdio_main.c: In function ?qdio_inbound_q_done?:
drivers/s390/cio/qdio_main.c:532: warning: ?state? may be used uninitialized in this function

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:58 +01:00
Jan Glauber
23589d057a [S390] qdio: add eqbs/sqbs instruction counters
Add counters for the eqbs and sqbs instructions that indicate how often
we issued the instructions and how often the instructions returned with
less buffers than specified.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:58 +01:00
Jan Glauber
bbd50e172f [S390] qdio: fix qeth port count detection
qeth needs to get the port count information before
qdio has allocated a page for the chsc operation.
Extend qdio_get_ssqd_desc() to store the data in the
specified structure.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:58 +01:00
Christian Maaser
43c207e6e5 [S390] ap: Minor code beautification.
Changed some symbol names for a better and clearer code.

Signed-off-by: Christian Maaser <cmaaser@de.ibm.com>
Signed-off-by: Felix Beck <beckf@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:57 +01:00
Felix Beck
cb17a6364a [S390] zcrypt: Use of Thin Interrupts
When the machine supports AP adapter interrupts polling will be
switched off at module initialization and the driver will work in
interrupt mode.

Signed-off-by: Felix Beck <felix.beck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:57 +01:00
Heiko Carstens
320c04c068 [S390] Move stfle to header file.
stfle will be needed by the ap_bus module to figure out wether the AP
queue adapter interruption facility is installed.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:56 +01:00
Heiko Carstens
ca9fc75a68 [S390] convert s390 to generic IPI infrastructure
Since etr/stp don't need the old smp_call_function semantics anymore
we can convert s390 to the generic IPI infrastructure.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:56 +01:00
Martin Schwidefsky
0b3016b781 [S390] serialize stp/etr work
The work function dispatched with schedule_work() can be run twice
on different cpus because run_workqueue clears the WORK_STRUCT_PENDING
bit and then executes the function. Another cpu can call schedule_work()
again and run the work function a second time before the first call
is completed. This patch serialized the etr and stp work function with
a mutex.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:56 +01:00
Heiko Carstens
750887dedc [S390] convert etr/stp to stop_machine interface
This converts the etr and stp code to the new stop_machine interface
which allows to synchronize all cpus without allocating any memory.
This way we get rid of the only reason why we haven't converted s390
to the generic IPI interface yet.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:55 +01:00
Martin Schwidefsky
b020632e40 [S390] introduce vdso on s390
Add a vdso to speed up gettimeofday and clock_getres/clock_gettime for
CLOCK_REALTIME/CLOCK_MONOTONIC.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:55 +01:00
Martin Schwidefsky
fc5243d98a [S390] arch_setup_additional_pages arguments
arch_setup_additional_pages currently gets two arguments, the binary
format descripton and an indication if the process uses an executable
stack or not. The second argument is not used by anybody, it could
be removed without replacement.

What actually does make sense is to pass an indication if the process
uses the elf interpreter or not. The glibc code will not use anything
from the vdso if the process does not use the dynamic linker, so for
statically linked binaries the architecture backend can choose not
to map the vdso.

Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:54 +01:00
Christian Borntraeger
a114a9d69d [S390] vmcp: remove BKL
The vmcp driver uses the session->mutex for concurrent access of the data
structures. Therefore, the BKL in vmcp_open does not protect against any
other function in the driver.
The BLK in vmcp_open would protect concurrent access to the module init
but all necessary steps ave finished before misc_register is called.
We can safely remove the lock_kernel from vcmp.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:54 +01:00
Heiko Carstens
f414f5f153 [S390] cpu topology: dont destroy cpu sets on topology change
Call rebuild_sched_domains instead of arch_reinit_sched_domains if
cpu topology changes. This leaves cpu sets alone which otherwise would
be destroyed.
If and how it makes sense to define cpu sets on a virtualized
architecture is another question.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:07 +01:00
Al Viro
8f2961c39e [S390] audit: get s390 ret_from_fork in sync with other architectures
On s390 we have ret_from_fork jump not to the "do all work we
normally do on return from syscall" as on x86, ppc, etc., but to the
"do all such work except audit".  Historical reasons - the codepath
triggered when we have AUDIT process flag set is separated from the
normall one and they converge at sysc_return, which is the common
part of post-syscall work.  And does not include calling audit_syscall_exit() -
that's done in the end of sysc_tracesys path, just before that path jumps
to sysc_return.

	IOW, the child returning from fork()/clone()/vfork() doesn't
call audit_syscall_exit() at all, so no matter what we do with its
audit context, we are not going to see the audit entry.

	The fix is simple: have ret_from_fork go to the point just past
the call of sys_.... in the 'we have AUDIT flag set' path.  There we
have (64bit variant; for 31bit the situation is the same):
sysc_tracenogo:
        tm      __TI_flags+7(%r9),(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT)
        jz      sysc_return
        la      %r2,SP_PTREGS(%r15)     # load pt_regs
        larl    %r14,sysc_return        # return point is sysc_return
        jg      do_syscall_trace_exit
which is precisely what we need - check the flag, bugger off to sysc_return
if not set, otherwise call do_syscall_trace_exit() and bugger off to
sysc_return.  r9 has just been properly set by ret_from_fork itself,
so we are fine.

	Tested on s390x, seems to work fine.  WARNING: it's been about
16 years since my last contact with 3X0 assembler[1], so additional
review would be very welcome.  I don't think I've managed to screw it
up, but...

[1] that *was* in another country and besides, the box is dead...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:07 +01:00
Heiko Carstens
5439050f9f [S390] cpu topology: fix cpu_core_map initialization
Common code doesn't call arch_update_cpu_topology() anymore on
cpu hotplug. But our architecture backend relied on that in order to
update the cpu_core_map. For machines without cpu topology support
this leads uninitialized cpu_core_maps for later on added cpus.

To solve this just initialize the maps with cpu_possible_map, since
that will be always valid for machines without topology support.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:07 +01:00
Linus Torvalds
4a6908a3a0 Linux 2.6.28
Happy holidays..
2008-12-24 15:26:37 -08:00
Linus Torvalds
c20137fc53 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6:
  V4L/DVB (9920): em28xx: fix NULL pointer dereference in call to VIDIOC_INT_RESET command
  V4L/DVB (9908a): MAINTAINERS: mark linux-uvc-devel as subscribers only
  V4L/DVB (9906): v4l2-compat: test for unlocked_ioctl as well.
  V4L/DVB (9885): drivers/media Kconfig's: fix bugzilla #12204
  V4L/DVB (9875): gspca - main: Fix vidioc_s_jpegcomp locking.
  V4L/DVB (9781): [PATCH] Cablestar 2 I2C retries (fix CableStar2 support)
  V4L/DVB (9780): dib0700: Stop repeating after user stops pushing button
2008-12-24 10:24:52 -08:00
Linus Torvalds
1806f82655 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: disable X86_PTRACE_BTS
2008-12-24 10:24:14 -08:00
Linus Torvalds
2523659ded Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  ALSA: hda - Add missing terminators in patch_sigmatel.c
2008-12-24 10:23:21 -08:00
Herton Ronaldo Krzesinski
574f3c4f5c ALSA: hda - Add missing terminators in patch_sigmatel.c
Signed-off-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br>
Cc: stable@kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2008-12-24 11:03:56 +01:00
Ingo Molnar
40f15ad8aa x86: disable X86_PTRACE_BTS
there's a new ptrace arch level feature in .28:

  config X86_PTRACE_BTS
  bool "Branch Trace Store"

it has broken fork() handling: the old DS area gets copied over into
a new task without clearing it.

Fixes exist but they came too late:

  c5dee61: x86, bts: memory accounting
  bf53de9: x86, bts: add fork and exit handling

and are queued up for v2.6.29. This shows that the facility is still not
tested well enough to release into a stable kernel - disable it for now and
reactivate in .29. In .29 the hardware-branch-tracer will use the DS/BTS
facilities too - hopefully resulting in better code.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-24 10:49:51 +01:00
Kyle McMartin
5289f46b9d parisc: disable UP-optimized flush_tlb_mm
flush_tlb_mm's "optimized" uniprocessor case of allocating a new
context for userspace is exposing a race where we can suddely return
to a syscall with the protection id and space id out of sync, trapping
on the next userspace access.

Debugged-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Tested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-23 17:03:21 -08:00
Linus Torvalds
8960223d59 Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
  drm/radeon: fix correctness of irq_enabled check for radeon.
2008-12-23 17:01:40 -08:00
Harry Ciao
d519c8d9cc edac: fix edac core deadlock when removing a device
When deleting an edac device, we have to wait for its edac_dev.work to be
completed before deleting the whole edac_dev structure.  Since we have no
idea which work in current edac_poller's workqueue is the work we are
conerned about, we wait for all work in the edac_poller's workqueue to be
proceseed.  This is done via flush_cpu_workqueue() which inserts a
wq_barrier into the tail of the workqueue and then sleeping on the
completion of this wq_barrier.  The edac_poller will wake up sleepers when
it is found.

EDAC core creates only one kernel worker thread, edac_poller, to run the
works of all current edac devices.  They share the same callback function
of edac_device_workq_function(), which would grab the mutex of
device_ctls_mutex first before it checks the device.  This is exactly
where edac_poller and rmmod would have a great chance to deadlock.

In below call trace of rmmod > ... >
edac_device_del_device >
edac_device_workq_teardown > flush_workqueue > flush_cpu_workqueue,

device_ctls_mutex would have already been grabbed by
edac_device_del_device().  So, on one hand rmmod would sleep on the
completion of a wq_barrier, holding device_ctls_mutex; on the other hand
edac_poller would be blocked on the same mutex when it's running any one
of works of existing edac evices(Note, this edac_dev.work is likely to be
totally irrelevant to the one that is being removed right now)and never
would have a chance to run the work of above wq_barrier to wake rmmod up.

edac_device_workq_teardown() should not be called within the critical
region of device_ctls_mutex.  Just like is done in edac_pci_del_device()
and edac_mc_del_mc(), where edac_pci_workq_teardown() and
edac_mc_workq_teardown() are called after related mutex are released.

Moreover, an edac_dev.work should check first if it is being removed.  If
this is the case, then it should bail out immediately.  Since not all of
existing edac devices are to be removed, this "shutting flag" should be
contained to edac device being removed.  The current edac_dev.op_state can
be used to serve this purpose.

The original deadlock problem and the solution have been witnessed and
tested on actual hardware.  Without the solution, rmmod an edac driver
would result in below deadlock:

root@localhost:/root> rmmod mv64x60_edac
EDAC DEBUG: mv64x60_dma_err_remove()
EDAC DEBUG: edac_device_del_device()
EDAC DEBUG: find_edac_device_by_dev()

(hang for a moment)

INFO: task edac-poller:2030 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
edac-poller   D 00000000     0  2030      2
Call Trace:
[df159dc0] [c0071e3c] free_hot_cold_page+0x17c/0x304 (unreliable)
[df159e80] [c000a024] __switch_to+0x6c/0xa0
[df159ea0] [c03587d8] schedule+0x2f4/0x4d8
[df159f00] [c03598a8] __mutex_lock_slowpath+0xa0/0x174
[df159f40] [e1030434] edac_device_workq_function+0x28/0xd8 [edac_core]
[df159f60] [c003beb4] run_workqueue+0x114/0x218
[df159f90] [c003c674] worker_thread+0x5c/0xc8
[df159fd0] [c004106c] kthread+0x5c/0xa0
[df159ff0] [c0013538] original_kernel_thread+0x44/0x60
INFO: task rmmod:2062 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
rmmod         D 0ff2c9fc     0  2062   1839
Call Trace:
[df119c00] [c0437a74] 0xc0437a74 (unreliable)
[df119cc0] [c000a024] __switch_to+0x6c/0xa0
[df119ce0] [c03587d8] schedule+0x2f4/0x4d8
[df119d40] [c03591dc] schedule_timeout+0xb0/0xf4

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-23 15:58:21 -08:00
Li Zefan
20ca9b3f4c cgroups: avoid accessing uninitialized data in failure path
If cgroup_get_rootdir() failed, free_cg_links() will be called in the
failure path, but tmp_cg_links hasn't been initialized at that time.

I introduced this bug in the 2.6.27 merge window.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-23 15:58:21 -08:00
Sharyathi Nagesh
e368d3a836 cgroups: suppress bogus warning messages
Remove spurious warning messages that are thrown onto the console during
cgroup operations.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Sharyathi Nagesh <sharyathi@in.ibm.com>
Acked-by: Serge E. Hallyn <serge@hallyn.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-23 15:58:21 -08:00