If a given CPU avoids the idle loop but also avoids starting a new
RCU grace period for a full minute, RCU can issue spurious RCU CPU
stall warnings. This commit fixes this issue by adding a check for
ongoing grace period to avoid these spurious stall warnings.
Reported-by: Becky Bruce <bgillbruce@gmail.com>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
The print_other_cpu_stall() function accesses a number of rcu_node
fields without protection from the ->lock. In theory, this is not
a problem because the fields accessed are all integers, but in
practice the compiler can get nasty. Therefore, the commit extends
the existing critical section to cover the entire loop body.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The rcu_print_detail_task_stall_rnp() function invokes
rcu_preempt_blocked_readers_cgp() to verify that there are some preempted
RCU readers blocking the current grace period outside of the protection
of the rcu_node structure's ->lock. This means that the last blocked
reader might exit its RCU read-side critical section and remove itself
from the ->blkd_tasks list before the ->lock is acquired, resulting in
a segmentation fault when the subsequent code attempts to dereference
the now-NULL gp_tasks pointer.
This commit therefore moves the test under the lock. This will not
have measurable effect on lock contention because this code is invoked
only when printing RCU CPU stall warnings, in other words, in the common
case, never.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The increment_cpu_stall_ticks() function listed each RCU flavor
explicitly, with an ifdef to handle preemptible RCU. This commit
therefore applies for_each_rcu_flavor() to save a line of code.
Because this commit switches from a code-based enumeration of the
flavors of RCU to an rcu_state-list-based enumeration, it is no longer
possible to apply __get_cpu_var() to the per-CPU rcu_data structures.
We instead use __this_cpu_var() on the rcu_state structure's ->rda field
that references the corresponding rcu_data structures.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Commit 1217ed1b (rcu: permit rcu_read_unlock() to be called while holding
runqueue locks) made rcu_initiate_boost() restore irq state when releasing
the rcu_node structure's ->lock, but failed to update the header comment
accordingly. This commit therefore brings the header comment up to date.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
The rcu_implicit_offline_qs() function implicitly assumed that execution
would progress predictably when interrupts are disabled, which is of course
not guaranteed when running on a hypervisor. Furthermore, this function
is short, and is called from one place only in a short function.
This commit therefore ensures that the timing is checked before
checking the condition, which guarantees correct behavior even given
indefinite delays. It also inlines rcu_implicit_offline_qs() into
rcu_implicit_dynticks_qs().
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
The rcu_preempt_offline_tasks() moves all tasks queued on a given leaf
rcu_node structure to the root rcu_node, which is done when the last CPU
corresponding the the leaf rcu_node structure goes offline. Now that
RCU-preempt's synchronize_rcu_expedited() implementation blocks CPU-hotplug
operations during the initialization of each rcu_node structure's
->boost_tasks pointer, rcu_preempt_offline_tasks() can do a better job
of setting the root rcu_node's ->boost_tasks pointer.
The key point is that rcu_preempt_offline_tasks() runs as part of the
CPU-hotplug process, so that a concurrent synchronize_rcu_expedited()
is guaranteed to either have not started on the one hand (in which case
there is no boosting on behalf of the expedited grace period) or to be
completely initialized on the other (in which case, in the absence of
other priority boosting, all ->boost_tasks pointers will be initialized).
Therefore, if rcu_preempt_offline_tasks() finds that the ->boost_tasks
pointer is equal to the ->exp_tasks pointer, it can be sure that it is
correctly placed.
In the case where there was boosting ongoing at the time that the
synchronize_rcu_expedited() function started, different nodes might start
boosting the tasks blocking the expedited grace period at different times.
In this mixed case, the root node will either be boosting tasks for
the expedited grace period already, or it will start as soon as it gets
done boosting for the normal grace period -- but in this latter case,
the root node's tasks needed to be boosted in any case.
This commit therefore adds a check of the ->boost_tasks pointer against
the ->exp_tasks pointer to the list that prevents updating ->boost_tasks.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
There is a need to use RCU from interrupt context, but either before
rcu_irq_enter() is called or after rcu_irq_exit() is called. If the
interrupt occurs from idle, then lockdep-RCU will complain about such
uses, as they appear to be illegal uses of RCU from the idle loop.
In other environments, RCU_NONIDLE() could be used to properly protect
the use of RCU, but RCU_NONIDLE() currently cannot be invoked except
from process context.
This commit therefore modifies RCU_NONIDLE() to permit its use more
globally.
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
When rcu_preempt_offline_tasks() clears tasks from a leaf rcu_node
structure, it does not NULL out the structure's ->boost_tasks field.
This commit therefore fixes this issue.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Because TINY_RCU's idle detection keys directly off of the nesting
level, rather than from a separate variable as in TREE_RCU, the
TINY_RCU dyntick-idle tracing on transition to idle must happen
before the change to the nesting level. This commit therefore makes
this change by passing the desired new value (rather than the old value)
of the nesting level in to rcu_idle_enter_common().
[ paulmck: Add fix for wrong-variable bug spotted by
Michael Wang <wangyun@linux.vnet.ibm.com>. ]
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
There have been some recent bugs that were triggered only when
preemptible RCU's __rcu_read_unlock() was preempted just after setting
->rcu_read_lock_nesting to INT_MIN, which is a low-probability event.
Therefore, reproducing those bugs (to say nothing of gaining confidence
in alleged fixes) was quite difficult. This commit therefore creates
a new debug-only RCU kernel config option that forces a short delay
in __rcu_read_unlock() to increase the probability of those sorts of
bugs occurring.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Each grace period is supposed to have at least one callback waiting
for that grace period to complete. However, if CONFIG_NO_HZ=n, an
extra callback-free grace period is no big problem -- it will chew up
a tiny bit of CPU time, but it will complete normally. In contrast,
CONFIG_NO_HZ=y kernels have the potential for all the CPUs to go to
sleep indefinitely, in turn indefinitely delaying completion of the
callback-free grace period. Given that nothing is waiting on this grace
period, this is also not a problem.
That is, unless RCU CPU stall warnings are also enabled, as they are
in recent kernels. In this case, if a CPU wakes up after at least one
minute of inactivity, an RCU CPU stall warning will result. The reason
that no one noticed until quite recently is that most systems have enough
OS noise that they will never remain absolutely idle for a full minute.
But there are some embedded systems with cut-down userspace configurations
that consistently get into this situation.
All this begs the question of exactly how a callback-free grace period
gets started in the first place. This can happen due to the fact that
CPUs do not necessarily agree on which grace period is in progress.
If a CPU still believes that the grace period that just completed is
still ongoing, it will believe that it has callbacks that need to wait for
another grace period, never mind the fact that the grace period that they
were waiting for just completed. This CPU can therefore erroneously
decide to start a new grace period. Note that this can happen in
TREE_RCU and TREE_PREEMPT_RCU even on a single-CPU system: Deadlock
considerations mean that the CPU that detected the end of the grace
period is not necessarily officially informed of this fact for some time.
Once this CPU notices that the earlier grace period completed, it will
invoke its callbacks. It then won't have any callbacks left. If no
other CPU has any callbacks, we now have a callback-free grace period.
This commit therefore makes CPUs check more carefully before starting a
new grace period. This new check relies on an array of tail pointers
into each CPU's list of callbacks. If the CPU is up to date on which
grace periods have completed, it checks to see if any callbacks follow
the RCU_DONE_TAIL segment, otherwise it checks to see if any callbacks
follow the RCU_WAIT_TAIL segment. The reason that this works is that
the RCU_WAIT_TAIL segment will be promoted to the RCU_DONE_TAIL segment
as soon as the CPU is officially notified that the old grace period
has ended.
This change is to cpu_needs_another_gp(), which is called in a number
of places. The only one that really matters is in rcu_start_gp(), where
the root rcu_node structure's ->lock is held, which prevents any
other CPU from starting or completing a grace period, so that the
comparison that determines whether the CPU is missing the completion
of a grace period is stable.
Reported-by: Becky Bruce <bgillbruce@gmail.com>
Reported-by: Subodh Nijsure <snijsure@grid-net.com>
Reported-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Paul Walmsley <paul@pwsan.com> # OMAP3730, OMAP4430
Cc: stable@vger.kernel.org
Tracepoints declare a static inline trace_*_rcuidle variant of the trace
function, to support safely generating trace events from the idle loop.
Module code never actually uses that variant of trace functions, because
modules don't run code that needs tracing with RCU idled. However, the
declaration of those otherwise unused functions causes the module to
reference rcu_idle_exit and rcu_idle_enter, which RCU does not export to
modules.
To avoid this, don't generate trace_*_rcuidle functions for tracepoints
declared in module code.
Link: http://lkml.kernel.org/r/20120905062306.GA14756@leaf
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Pull DMA-mapping fixes from Marek Szyprowski:
"Another set of fixes for ARM dma-mapping subsystem.
Commit e9da6e9905 replaced custom consistent buffer remapping code
with generic vmalloc areas. It however introduced some regressions
caused by limited support for allocations in atomic context. This
series contains fixes for those regressions.
For some subplatforms the default, pre-allocated pool for atomic
allocations turned out to be too small, so a function for setting its
size has been added.
Another set of patches adds support for atomic allocations to
IOMMU-aware DMA-mapping implementation.
The last part of this pull request contains two fixes for Contiguous
Memory Allocator, which relax too strict requirements."
* 'fixes-for-3.6' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
ARM: dma-mapping: IOMMU allocates pages from atomic_pool with GFP_ATOMIC
ARM: dma-mapping: Introduce __atomic_get_pages() for __iommu_get_pages()
ARM: dma-mapping: Refactor out to introduce __in_atomic_pool
ARM: dma-mapping: atomic_pool with struct page **pages
ARM: Kirkwood: increase atomic coherent pool size
ARM: DMA-Mapping: print warning when atomic coherent allocation fails
ARM: DMA-Mapping: add function for setting coherent pool size from platform code
ARM: relax conditions required for enabling Contiguous Memory Allocator
mm: cma: fix alignment requirements for contiguous regions
Pull HID updates from Jiri Kosina:
"It contains a fix for Eaton Ellipse MAX UPS from Alan Stern,
performance improvement (not processing debug data if noone is
interested), by Henrik Rydberg, and allowing tpkbd-driven devices to
work even with generic driver in a crippled mode, by Andres Freund."
* 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: tpkbd: work even if the new Lenovo Keyboard driver is not configured
HID: Only dump input if someone is listening
HID: add NOGET quirk for Eaton Ellipse MAX UPS
c1dcad2d32 added a new driver configured by
HID_LENOVO_TPKBD but made the hid_have_special_driver entry non-optional which
lead to a recognized but non-working device if the new driver wasn't
configured (which is the correct default).
Signed-off-by: Andres Freund <andres@anarazel.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* Fix for TLB flushing introduced in v3.6
* Fix Xen-SWIOTLB not using proper DMA mask - device had 64bit but
in a 32-bit kernel we need to allocate for coherent pages from a
32-bit pool.
* When trying to re-use P2M nodes we had a one-off error and triggered
a BUG_ON check with specific CONFIG_ option.
* When doing FLR in Xen-PCI-backend we would first do FLR then save the
PCI configuration space. We needed to do it the other way around.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQEcBAABAgAGBQJQSS7TAAoJEFjIrFwIi8fJOBQH/03JBKBbFvewhop8T5Jww2c9
SWmMgzDm5HkxeWj5XnM+zlLoHrYFFu7tyuRLiny4weE0LdRl4adJ1TVpStxap/b6
MSQKe+tZevslaReBOsMpbCk3z7fEWNlAcpm6wMp1xYmLoHcr0JMpOCmzigbf7dwM
F4UWULheih9ME3UeqDAU8qgvfv6ccZ9vempO4TDWKjxfxfWODCNMRx+Ny+C7NNRk
QeoInHJUqcRkg0q0OIciF/YYDmn8hIH7HgfqomuMb6rEv2LOieLnWVuniEPWM75s
lECxro7v2Z9s1Std0WWKcFp8VZpI2scnQYU8aL8TpewHcbzKHQYyipuKPu7FiEQ=
=GfLE
-----END PGP SIGNATURE-----
Merge tag 'stable/for-linus-3.6-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Pull Xen bug-fixes from Konrad Rzeszutek Wilk:
* Fix for TLB flushing introduced in v3.6
* Fix Xen-SWIOTLB not using proper DMA mask - device had 64bit but
in a 32-bit kernel we need to allocate for coherent pages from a
32-bit pool.
* When trying to re-use P2M nodes we had a one-off error and triggered
a BUG_ON check with specific CONFIG_ option.
* When doing FLR in Xen-PCI-backend we would first do FLR then save the
PCI configuration space. We needed to do it the other way around.
* tag 'stable/for-linus-3.6-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/pciback: Fix proper FLR steps.
xen: Use correct masking in xen_swiotlb_alloc_coherent.
xen: fix logical error in tlb flushing
xen/p2m: Fix one-off error in checking the P2M tree directory.
Power management
- PCI/PM: Enable D3/D3cold by default for most devices
- PCI/PM: Keep parent bridge active when probing device
- PCI/PM: Fix config reg access for D3cold and bridge suspending
- PCI/PM: Add ABI document for sysfs file d3cold_allowed
Core
- PCI: Don't print anything while decoding is disabled
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
iQIcBAABAgAGBQJQSQ9nAAoJEPGMOI97Hn6zc+EQAMWlPJUAUR0uY2mnz9KANGxQ
7yHsgoCj1lbMNMrlyD9CveE1b5PCUwL8oiSnsLvqUwsdvNr14FGhCCtTYotoDEBh
2QuSkI2rG1y+2Kr9Lrd4TrzdcCfBjQ3vUIhU1va2uPjrf8fsPQCvm0Yn8gC/SKDg
qV452021IFxkeO0PIZ9z7H4uaOIuqqb1mnqXImKZ3zUMNfNl0xLaZfrWUK5DG6qn
Mgm25qCNZ70Hj5mM/AqhKLsWH2ruMsZuA/5QvdHHJmi0pGF8OkFtvcKNv785Ajlg
Ar5YOtLaeszyVpueNAWzevH+1m86D6USW1yKnlxgVigVpxFNhHs2OHr4kU1IzwKU
AkMdDCyGCBXoEBiXwozsMSVX4gi87k91ETpe8E/WlU5gkLN8hqBO8TJlCsLBAXGV
7RIWJPeoem6UjoYYbqUe8Z4/ZPbSgWgdVkSI9PLcVsVrPQZR0Z7f3JM5Kcdgooc9
iPQleMK0x5YPtsFHVqCJLD6A4xaWR8VZaK3hHSziYlyOjPpIJ0vErh5uxh6pWBWo
wAzQDp/dfmOBhld7RzzQQrOyHHIPR7+jHghH7/zpgAhaaGhAeqMXX4Y+R2CKt9yR
iNW6QlHasaBPQHehWjstMKlwR7iXZDORMtEv+tAyg2m1ZmgWi7Aut2Mk/6e3Z3Jy
+d2tn7b5aBTc3y8dcVBP
=zFNa
-----END PGP SIGNATURE-----
Merge tag '3.6-pci-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
"Power management
- PCI/PM: Enable D3/D3cold by default for most devices
- PCI/PM: Keep parent bridge active when probing device
- PCI/PM: Fix config reg access for D3cold and bridge suspending
- PCI/PM: Add ABI document for sysfs file d3cold_allowed
Core
- PCI: Don't print anything while decoding is disabled"
* tag '3.6-pci-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: Don't print anything while decoding is disabled
PCI/PM: Add ABI document for sysfs file d3cold_allowed
PCI/PM: Fix config reg access for D3cold and bridge suspending
PCI/PM: Keep parent bridge active when probing device
PCI/PM: Enable D3/D3cold by default for most devices
Mostly Renesas and Atmel bugfixes this time, targeting boot and build
problems. A couple of patches for gemini and kirkwood as well. On a
whole nothing very controversial.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQR+5tAAoJEIwa5zzehBx37ucQAIoOkQN6RqzXDkM4o6uhgtrS
Fqt231762msmrrMpFwH/o0slgtyNIogHZfykoy2H8EXxJC93ZLYyVaqii+D2/H9O
hRn7F6JGZ2R/h0+bYWAf3skX1ecRikxOF+knPmD4aolsnkddBeP6Rb0KIDgRqwRb
pJxmlD8M68Dq7oKCMIT3EWP8pzXjJlVeS5Zli/X3XIxE5iTELQltK4St8oZWE2S1
O2uOj/GuGgERiq39lIbf1YggEQZrZVf0a3a4koVFMCQ/kmT7LE1bzVp2lqsUNKpB
2KG2GllTac2excjeg8Oyd4apy2mmGQimnIdKf1W3RfpVlnXUySw1XaPbjU1L6IcO
xgZoLsxLgpeohK1T1mUsKEBtHWPKTBOTqGvquuvf8h7CAqRVPzX0nNmIt+JcfaTN
lcdflHEks55eZfEad794DjwZdCG6/RXY5wDFeZKOzUTWK2//hlsRVlYHumpiRXIe
UaVdFB//N2bzZiz+HYVBpd4TgEDhGm6WM3REYhMNVtnie/ubg/CiMW1cLuA3Xouk
STvZ6EM/G0bwAM3PZOnBWxKlv7jAN0Zwhw0meK7W1Go1FrQsdroVmOMDL7PzPIWI
kgtDzxLa6VV77EvBVJAPs2JZqdsRTa7r/AZLaBne/VvhikUVOdo2aOROPyn/vJqy
/8Atzs6hU2BqDYStyK/z
=hMrY
-----END PGP SIGNATURE-----
Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC bug fixes from Olof Johansson:
"Mostly Renesas and Atmel bugfixes this time, targeting boot and build
problems. A couple of patches for gemini and kirkwood as well. On a
whole nothing very controversial."
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: gemini: fix the gemini build
ARM: shmobile: armadillo800eva: enable rw rootfs mount
ARM: Kirkwood: Fix 'SZ_1M' undeclared here for db88f6281-bp-setup.c
ARM: shmobile: mackerel: fixup usb module order
ARM: shmobile: armadillo800eva: fixup: sound card detection order
ARM: shmobile: marzen: fixup smsc911x id for regulator
ARM: at91/feature-removal-schedule: delay at91_mci removal
ARM: mach-shmobile: armadillo800eva: Enable power button as wakeup source
ARM: mach-shmobile: armadillo800eva: Fix GPIO buttons descriptions
ARM: at91/dts: remove partial parameter in at91sam9g25ek.dts
ARM: at91/clock: fix PLLA overclock warning
ARM: at91: fix rtc-at91sam9 irq issue due to sparse irq support
ARM: at91: fix system timer irq issue due to sparse irq support
ARM: shmobile: sh73a0: fixup RELOC_BASE of intca_irq_pins_desc
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQSCWfAAoJEMsfJm/On5mBvicP/0CmzUgasB1MVpBpxZeaLrcf
buGw5GZpH/Kh7h+4mdfY5egzvn0J9lFt9gWB58lw1xaQhHgihaus/h63K9nDQpyb
NhsuGDY628st9cJWBFU18KcnSjVKNSEdVOZLtSkqzpbtAiy6zH0pQGfNPCSZaJuo
XngjHAyIHZaAyORgwudGm9hrTqTGNUdaLPp1NbXO1N7+/Upnm5f237XUWqgboTgK
4BGVOG6Prjm6ytJqc+eXg/iUACPgdG8Fe8rQMhRm0HjIEdX58+xfjrOZ3IPqxqcX
F+ri3W615PazVi27wC6Afk9NqssvJagImEzRh7DbbMAeTesh0vAPMb8UihhhX4M6
BsWU8zE1UVBEHJmi0ZT/Q+5v3heLbBd2kbmyorSBvHZyH+zFaAgvWVSqMCWguKO+
CBptQnFhFY209Pi3S79tZFe7g5T6xMFddGW+0Wpp7pdT+BgC9EL7UBJGctCzO7Yq
ipfCtRzZmJO7HnQic9T7XQhQvmCNjZEXHIooFZgZ6uF3GJL0Ipetc3t/uHwZ2C+E
TwX4eNJH/IxgyRJszjCyKWs6iu7RaRQu7lq4a6ZpROmQJIW64pEF/ZWQ2+QULwnN
pX2j2mLPltidjUiJ6uAIyzZr8FL97uLSLDkWhc4fZTsDC+2NOLwaK7nPErO1M6Rc
C9fde4G5GOhkCaGrZX83
=eZQ4
-----END PGP SIGNATURE-----
Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull a hwmon fix from Guenter Roeck:
"One patch, fixing DIV_ROUND_CLOSEST to support negative dividends.
While the changes are not in the drivers/hwmon directory, the problem
primarily affects hwmon drivers, and it makes sense to push the patch
through the hwmon tree."
* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
linux/kernel.h: Fix DIV_ROUND_CLOSEST to support negative dividends
Pull kbuild fixes from Michal Marek:
"These are two fixes that should go into 3.6. The link-vmlinux.sh one
is obvious.
The other one fixes make firmware_install with certain configurations,
where a file in the toplevel firmware tree gets installed first, and
$(INSTALL_FW_PATH)/$$(dir <file>) results in /lib/firmware/./, which
confuses make 3.82 for some reason."
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
firmware: fix directory creation rule matching with make 3.82
link-vmlinux.sh: Fix stray "echo" in error message
When we do FLR and save PCI config we did it in the wrong order.
The end result was that if a PCI device was unbind from
its driver, then binded to xen-pciback, and then back to its
driver we would get:
> lspci -s 04:00.0
04:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
13:42:12 # 4 :~/
> echo "0000:04:00.0" > /sys/bus/pci/drivers/pciback/unbind
> modprobe e1000e
e1000e: Intel(R) PRO/1000 Network Driver - 2.0.0-k
e1000e: Copyright(c) 1999 - 2012 Intel Corporation.
e1000e 0000:04:00.0: Disabling ASPM L0s L1
e1000e 0000:04:00.0: enabling device (0000 -> 0002)
xen: registering gsi 48 triggering 0 polarity 1
Already setup the GSI :48
e1000e 0000:04:00.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
e1000e: probe of 0000:04:00.0 failed with error -2
This fixes it by first saving the PCI configuration space, then
doing the FLR.
Reported-by: Ren, Yongjie <yongjie.ren@intel.com>
Reported-and-Tested-by: Tobias Geiger <tobias.geiger@vido.info>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: stable@vger.kernel.org
- a firmware bug on several Samsung MoviNAND eMMC models causes
permanent corruption on the device when secure erase and secure trim
requests are made, so we disable those requests on these eMMC devices.
- atmel-mci: fix a hang with some SD cards by waiting for not-busy flag.
- dw_mmc: low-power mode breaks SDIO interrupts; fix PIO error handling;
fix handling of error interrupts.
- mxs-mmc: fix deadlocks; fix compile error due to dma.h arch change.
- omap: fix broken PIO mode causing memory corruption.
- sdhci-esdhc: fix card detection.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJQR/7BAAoJEHNBYZ7TNxYMT3gQANQDdD/wvYFgUvssIAgdQJdl
KORu7ke5ks2Gvx0Ef0Ch+GXIFFDoqH0zMkQMXc+c0BdhR4BF/vNBbBYZoVYyvmgK
GUHK740nznwp9edIVbGbRm+FSxDG7ZpjUlFq+SInBOehIw7tQAep3Tbv1rYZzu6M
SUBGc31Nif9eCvTKzGU195qbutAhvcGmqEvi/ALP9bUCYR7QaTt/oD3YYuUD/ZjV
O/mf3R30vrV+R1lrQRqzXIi8vCojPMzkVU8x+C8PdOTjewvYviM0huW9+Lv6WsxK
DzhCgoNvrB1Q/rYtiNZ1gyWd5cTZWns308slREEwywD5IBQJMo5T68Q/D+h2PkS9
JCvtiZ+ryycvINyUn2JxEZuygseUiT/nMS0ijSidQY7vrZlNR8JrrK3qf0vRDkWo
0mXhQG5DvxDN6Dx9K15OgTAuABek1CEdKFD0R/a3X41H/KUvKQwWPWmdTybBriAo
8jkUl64YMpFt+LZqrmeQ1oOcwc8BvMvcM4dnMbkXrHvBb926fENBIMGNeI7O10o8
597dEpLlKNZj8e8YxJjbqaCuFhOp/pHRQy+rxdjP28V4mfHASQGgsnFAO4qV7B24
1wqtSu5ZCor4K616wfDvMYgL88pMg7fVEV3nN9Cc/WpsBaBPZv7vOQbp/f40rKWa
GRb9+9NC0WU0CP28GlCq
=n1hi
-----END PGP SIGNATURE-----
Merge tag 'mmc-fixes-for-3.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc
Pull MMC fixes from Chris Ball:
- a firmware bug on several Samsung MoviNAND eMMC models causes
permanent corruption on the device when secure erase and secure trim
requests are made, so we disable those requests on these eMMC devices.
- atmel-mci: fix a hang with some SD cards by waiting for not-busy flag.
- dw_mmc: low-power mode breaks SDIO interrupts; fix PIO error handling;
fix handling of error interrupts.
- mxs-mmc: fix deadlocks; fix compile error due to dma.h arch change.
- omap: fix broken PIO mode causing memory corruption.
- sdhci-esdhc: fix card detection.
* tag 'mmc-fixes-for-3.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
mmc: omap: fix broken PIO mode
mmc: card: Skip secure erase on MoviNAND; causes unrecoverable corruption.
mmc: dw_mmc: Disable low power mode if SDIO interrupts are used
mmc: dw_mmc: fix error handling in PIO mode
mmc: dw_mmc: correct mishandling error interrupt
mmc: dw_mmc: amend using error interrupt status
mmc: atmel-mci: not busy flag has also to be used for read operations
mmc: sdhci-esdhc: break out early if clock is 0
mmc: mxs-mmc: fix deadlock caused by recursion loop
mmc: mxs-mmc: fix deadlock in SDIO IRQ case
mmc: bfin_sdh: fix dma_desc_array build error
Fix the following compile error on UML.
arch/um/os-Linux/time.c: In function 'deliver_alarm':
arch/um/os-Linux/time.c:117:3: error: too few arguments to function 'alarm_handler'
arch/um/os-Linux/internal.h:1:6: note: declared here
The error was introduced by commit d3c1cfcd ("um: pass siginfo to guest
process") in 3.6-rc1.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: Martin Pärtel <martin.partel@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Allocate a structure not a pointer to it !
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull powerpc fixes from Benjamin Herrenschmidt:
"Here are a few fixes for 3.6 that were piling up while I was away or
busy (I was mostly MIA a week or two before San Diego).
Some fixes from Anton fixing up issues with our relatively new DSCR
control feature, and a few other fixes that are either regressions or
bugs nasty enough to warrant not waiting."
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc: Don't use __put_user() in patch_instruction
powerpc: Make sure IPI handlers see data written by IPI senders
powerpc: Restore correct DSCR in context switch
powerpc: Fix DSCR inheritance in copy_thread()
powerpc: Keep thread.dscr and thread.dscr_inherit in sync
powerpc: Update DSCR on all CPUs when writing sysfs dscr_default
powerpc/powernv: Always go into nap mode when CPU is offline
powerpc: Give hypervisor decrementer interrupts their own handler
powerpc/vphn: Fix arch_update_cpu_topology() return value
- Erroneous debug message from of_get_named_gpio_flags()
- Make sure the MC9S08DZ60 GPIO driver depend on I2C being
compiled in (not module) or allmodconfig breaks.
- Check return value from irq_alloc_descs() in the Emma
Mobile GPIO driver.
- Assign the owner field for the rdc321x driver so the
module won't be removed if it has active GPIOs.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJQRuO5AAoJEEEQszewGV1zxw0QAMvCidLRwqdoHfWOIPx3YI7x
lp44BteGZagQrELvVo/16+P5/eRumOo4Pyd/KFIZjmwhKAtHjbl0j01fAijTy0XN
bAYPHoIhekwXr1PPVRFCcqqQFxa8JkY6lPVx3fRoCAiyBeoj3RXTElaE6NmqE81I
ZGSAul2WOKzHjSn2aKGc5ciwiC5sbsI/KxxnZZxv4Mrqz32l71dn8AmqazTXVSBA
Mr7hrr8N2sO1O5SIMcdH1VjLsN73XPDuwSKHJw55uAA4TqxZ5Z0T1HYERxh97Nql
p//x4wMDV5orBIkQQROFu2FjxHiPDcnEp8bFq5oU/QnC9jJSH3qiTvJhK3OkdN4r
W+hhs26iD31fJG/nK3NVWd0JIe8dJm6gh/dDo6sQXQKnaUospZayoz5M8DkFiyo8
Ba6nrSAssIdwoY+cYwo4q+VN63qAEKFC/PolehKayUoB0RWdV178DX6v4Dl9TwHs
UeGIGzR2ymiAZJsMBZYP/simRgdljNgcr7EuhAecpI9CYnV7fsuOBEQydcmOsIHq
t2YTtzRkX/RUiKcFieXOoGv0ixnrYam4VLGBHQuDRikvyyS6mlh6KDrnWlrko4al
atObhaEgOJ+NUz1izIZvg8J4t/AMybNYRdZl8xW9s3YgUFne1wBCdmxxL9Cyax/3
jHL99DkmoYVISU0uOyPG
=QDOR
-----END PGP SIGNATURE-----
Merge tag 'gpio-fixes-for-v3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
"These are some GPIO regression fixes for v3.6:
- Erroneous debug message from of_get_named_gpio_flags()
- Make sure the MC9S08DZ60 GPIO driver depend on I2C being compiled
in (not module) or allmodconfig breaks.
- Check return value from irq_alloc_descs() in the Emma Mobile GPIO
driver.
- Assign the owner field for the rdc321x driver so the module won't
be removed if it has active GPIOs."
* tag 'gpio-fixes-for-v3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: rdc321x: Prevent removal of modules exporting active GPIOs
gpio: em: Fix checking return value of irq_alloc_descs
gpio: mc9s08dz60: Fix build error if I2C=m
gpio: Fix debug message in of_get_named_gpio_flags()
There are nothing scaring, contains only small fixes for HD-audio and
USB-audio:
- EPSS regression fix and GPIO fix for HD-audio IDT codecs
- A series of USB-audio regression fixes that are found since 3.5 kernel
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQIcBAABAgAGBQJQRZeCAAoJEGwxgFQ9KSmkPikP/2ZpK3xvf1WnEJT+K+2ErsI+
KEChW70QNRcghvbOa0viUx6v3B8tkPhDuo6YgTIjiP+yde0BjVujSXYAOvznvHxy
ea8xX7hR0j0G9y0FVb0oYlj+CIZWupChz9V2uSB7EudMvYPUBUzkkDtrpsb4+t3z
0e/4VEEkWIVhVyuPEnaL8fdzYPLs7UA6/QotC/SLxnFSmbaGvUuryssEmb99vl0h
xQHr8FuIF12DTS1ZwdCYcL9VTFriXcYeIWf+ThaoY3evSeStCUuboHf69RUy1WR9
JAn0hArI358m/Xicmr6pG6y0VqIPqfFLgsjXgt8/Fe0ljTcYbXoWL09MXVSL40wA
2jFWAizPe8/OAASHpgzWNb/8WWDFL/XWoeefSwL1lOIArlZhe07b61bA69s2GG0k
/rwZn058ZgQCgqC5Dph7RD7squ0ANg6WIq8obdJevLgekw/MCkKUPNA6N2lXjboT
KaKEouLUff72A2WKiE7W6ykNATWjqteDe7+11HhDC543lAKBnq/Zyv8zjsqhzPuU
TnxA0W1pcfc4GLRBaedIH2AeEzQc/aT0RE79sGGzPSB4SG0MLTRWT3yUXMiVTeWB
vBb4ZuEvGeo6i81QovIN/y1m/kEljaSu7uSi0Kzopq663HsUEvJ1BvLAviZsJZ01
pSNAne3Ze7kWQwyWkVDQ
=AbfN
-----END PGP SIGNATURE-----
Merge tag 'sound-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"There are nothing scaring, contains only small fixes for HD-audio and
USB-audio:
- EPSS regression fix and GPIO fix for HD-audio IDT codecs
- A series of USB-audio regression fixes that are found since 3.5
kernel"
* tag 'sound-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: snd-usb: fix cross-interface streaming devices
ALSA: snd-usb: fix calls to next_packet_size
ALSA: snd-usb: restore delay information
ALSA: snd-usb: use list_for_each_safe for endpoint resources
ALSA: snd-usb: Fix URB cancellation at stream start
ALSA: hda - Don't trust codec EPSS bit for IDT 92HD83xx & co
ALSA: hda - Avoid unnecessary parameter read for EPSS
ALSA: hda - Do not set GPIOs for speakers on IDT if there are no speakers
- a fix by Paul Cercueil to prevent a possible buffer overflow
- a fix by Bruno Prémont to prevent a rare sleep in invalid context
- a fix by Julia Lawall for a double free in auo_k190x
- a fix by Dan Carpenter to prevent a division by zero in mb862xxfb
- a regression fix by Tomi Valkeinen for the SDI output in OMAP
- a fix by Grazvydas Ignotas to fix the console colors in OMAP
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.12 (GNU/Linux)
iQIcBAABAgAGBQJQRdW3AAoJECSVL5KnPj1PKpcQAJn9tS3BeTi9zcV5sk31U1Jy
kc72y318bg5jpJPDgWOfM+9CY3yhezVi2q9odiTUtuhx5WzWbOJh7/TuUGoWGKaY
dUvTqen83hFGKfTs+eujTbAgAH3+PGIGo5VmTyx1TVj5Vnb1m3wsU6jwDzalfox8
MChhoOyrCpn6wxiqSuF6eoqZ01jASsNDW1rQaxi6AdaKKxJXrG22phMpwmfZz9Mt
naIx1aI8aowkZ5gEBlOsnX6O0kjPVyQWlXqljGuZxRJZD+x9MKj/8nU1PITi0Pz2
DNAQz3ZU/apAooLCFsBF7oGtMUJ6e/v61LdIcMZZzw569tCtnmidwdU0ss2JJhAP
aDhdtWESbxFbpggOEWD36Wq6Z5ekCW5xfTR+ZASiknSMAT6rmqfX4NPJ3750jjVb
Aa9Ds37KQgfUrQIEhKCoZPyPTtCK98fQUHhxlf2I0TbuCGBvbZEAnRpuc1eelNo9
G9IJKG1vnea2x+DmsR1OEqFpHU3fn82T7RYm3SYc6ZDJwU58GxEvTzytGBs5YxLe
bTTM2mTT3m23l8BBp6UPZjESJoBv2bd+cY70MCxhVIKHXbQYdzet0DDEDA6t2UQK
isIOLXMGqxTXNZq7bM+q9UuYUuAkTUv1VmwGy5YJOD1JtOhFGQtxBe59OMmrC1H5
aU+GP6btV/biv8cTs/JG
=8eFG
-----END PGP SIGNATURE-----
Merge tag 'fbdev-fixes-for-3.6-1' of git://github.com/schandinat/linux-2.6
Pull fbdev fixes from Florian Tobias Schandinat:
- a fix by Paul Cercueil to prevent a possible buffer overflow
- a fix by Bruno Prémont to prevent a rare sleep in invalid context
- a fix by Julia Lawall for a double free in auo_k190x
- a fix by Dan Carpenter to prevent a division by zero in mb862xxfb
- a regression fix by Tomi Valkeinen for the SDI output in OMAP
- a fix by Grazvydas Ignotas to fix the console colors in OMAP
* tag 'fbdev-fixes-for-3.6-1' of git://github.com/schandinat/linux-2.6:
OMAPFB: fix framebuffer console colors
OMAPDSS: Fix SDI PLL locking
video: mb862xxfb: prevent divide by zero bug
drivers/video/auo_k190x.c: drop kfree of devm_kzalloc's data
fbcon: Fix bit_putcs() call to kmalloc(s, GFP_KERNEL)
fbcon: prevent possible buffer overflow.
'kmem_cache_alloc()' but were freeing it using 'kfree()' in some cases.
Now we fix this by using 'kmem_cache_free()' instead.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJQRaL3AAoJECmIfjd9wqK01AMP+wazg4SPSPoyj8HJFOHy/ZWu
Y18TNMtiSQDL2i9cH34rj6sxdNZYy+tHYt64VGQ+jBDTXIOJyZ9HhmPVcAuK5/CJ
ol7YzFRsKCF7dLps3MmUejgf15nnTVgQAMzXTbJ5Cv9FB+pex87XaObOK8phUk4r
k0o5ORwViZRXm9KPqIv65iviAw6fnzJDVaFy8WGGk3WX+gBUZq+pz78gB96N2LKE
rHjD2SyzAIam9Mv8kcJRvJ4TjWAJrPMxAAF9uGzxx/D9UbpiArmu233qCeGTXO+p
3v94lyLcWPsMA+ugwT3taHH0HgpXM1DFQb6ln/SRtMwiITi5mFY+O1MORQlb8llB
gyOUKKHeTBhD8myoStZH3lwN0lelyhcKO/s9WRZ6pPaV9992UVau+dWBTNCswckw
N+qKQHJNQEw47OaqOhYKY7lIjxRQIVZZTffJuTJWz3Ck+d63b2U46pke91G7pqrI
YiJ8MukunurXyK4ItwJ1XKvpjgrayABGhHAR8tzFQAlIy7a74T9J1qKlXSfIoB0T
L0DA30Txwb5wbfjX6unoGsVLv8Uo7oh5JltZJ2GCrL3eNS4iIW6nitGX5juWZFBl
GcarWW6lIeg9qYlz1jG5+Iqckb+IGCJqBrHrFNP9E6jpigw3K6pVxch81N3tBBld
2aq3PcZ0e0TAGokunElP
=Bin1
-----END PGP SIGNATURE-----
Merge tag 'upstream-3.6-rc5' of git://git.infradead.org/linux-ubi
Pull ubi fix from Artem Bityutskiy:
"A single small fix for memory deallocation: we allocated memory using
'kmem_cache_alloc()' but were freeing it using 'kfree()' in some
cases. Now we fix this by using 'kmem_cache_free()' instead."
* tag 'upstream-3.6-rc5' of git://git.infradead.org/linux-ubi:
UBI: fix a horrible memory deallocation bug
Commit 644595f896 ("compat: Handle COMPAT_USE_64BIT_TIME in
net/socket.c") introduced a bug where the helper functions to take
either a 64-bit or compat time[spec|val] got the arguments in the wrong
order, passing the kernel stack pointer off as a user pointer (and vice
versa).
Because of the user address range check, that in turn then causes an
EFAULT due to the user pointer range checking failing for the kernel
address. Incorrectly resuling in a failed system call for 32-bit
processes with a 64-bit kernel.
On odder architectures like HP-PA (with separate user/kernel address
spaces), it can be used read kernel memory.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When running 32-bit pvops-dom0 and a driver tries to allocate a coherent
DMA-memory the xen swiotlb-implementation returned memory beyond 4GB.
The underlaying reason is that if the supplied driver passes in a
DMA_BIT_MASK(64) ( hwdev->coherent_dma_mask is set to 0xffffffffffffffff)
our dma_mask will be u64 set to 0xffffffffffffffff even if we set it to
DMA_BIT_MASK(32) previously. Meaning we do not reset the upper bits.
By using the dma_alloc_coherent_mask function - it does the proper casting
and we get 0xfffffffff.
This caused not working sound on a system with 4 GB and a 64-bit
compatible sound-card with sets the DMA-mask to 64bit.
On bare-metal and the forward-ported xen-dom0 patches from OpenSuse a coherent
DMA-memory is always allocated inside the 32-bit address-range by calling
dma_alloc_coherent_mask.
This patch adds the same functionality to xen swiotlb and is a rebase of the
original patch from Ronny Hegewald which never got upstream b/c the
underlaying reason was not understood until now.
The original email with the original patch is in:
http://old-list-archives.xen.org/archives/html/xen-devel/2010-02/msg00038.html
the original thread from where the discussion started is in:
http://old-list-archives.xen.org/archives/html/xen-devel/2010-01/msg00928.html
Signed-off-by: Ronny Hegewald <ronny.hegewald@online.de>
Signed-off-by: Stefano Panella <stefano.panella@citrix.com>
Acked-By: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: stable@vger.kernel.org
While TLB_FLUSH_ALL gets passed as 'end' argument to
flush_tlb_others(), the Xen code was made to check its 'start'
parameter. That may give a incorrect op.cmd to MMUEXT_INVLPG_MULTI
instead of MMUEXT_TLB_FLUSH_MULTI. Then it causes some page can not
be flushed from TLB.
This patch fixed this issue.
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Alex Shi <alex.shi@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Yongjie Ren <yongjie.ren@intel.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* commit '4cb38750d49010ae72e718d46605ac9ba5a851b4': (6849 commits)
bcma: fix invalid PMU chip control masks
[libata] pata_cmd64x: whitespace cleanup
libata-acpi: fix up for acpi_pm_device_sleep_state API
sata_dwc_460ex: device tree may specify dma_channel
ahci, trivial: fixed coding style issues related to braces
ahci_platform: add hibernation callbacks
libata-eh.c: local functions should not be exposed globally
libata-transport.c: local functions should not be exposed globally
sata_dwc_460ex: support hardreset
ata: use module_pci_driver
drivers/ata/pata_pcmcia.c: adjust suspicious bit operation
pata_imx: Convert to clk_prepare_enable/clk_disable_unprepare
ahci: Enable SB600 64bit DMA on MSI K9AGM2 (MS-7327) v2
[libata] Prevent interface errors with Seagate FreeAgent GoFlex
drivers/acpi/glue: revert accidental license-related 6b66d95895 bits
libata-acpi: add missing inlines in libata.h
i2c-omap: Add support for I2C_M_STOP message flag
i2c: Fall back to emulated SMBus if the operation isn't supported natively
i2c: Add SCCB support
i2c-tiny-usb: Add support for the Robofuzz OSIF USB/I2C converter
...
We would traverse the full P2M top directory (from 0->MAX_DOMAIN_PAGES
inclusive) when trying to figure out whether we can re-use some of the
P2M middle leafs.
Which meant that if the kernel was compiled with MAX_DOMAIN_PAGES=512
we would try to use the 512th entry. Fortunately for us the p2m_top_index
has a check for this:
BUG_ON(pfn >= MAX_P2M_PFN);
which we hit and saw this:
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-4.1.2-OVM x86_64 debug=n Tainted: C ]----
(XEN) CPU: 0
(XEN) RIP: e033:[<ffffffff819cadeb>]
(XEN) RFLAGS: 0000000000000212 EM: 1 CONTEXT: pv guest
(XEN) rax: ffffffff81db5000 rbx: ffffffff81db4000 rcx: 0000000000000000
(XEN) rdx: 0000000000480211 rsi: 0000000000000000 rdi: ffffffff81db4000
(XEN) rbp: ffffffff81793db8 rsp: ffffffff81793d38 r8: 0000000008000000
(XEN) r9: 4000000000000000 r10: 0000000000000000 r11: ffffffff81db7000
(XEN) r12: 0000000000000ff8 r13: ffffffff81df1ff8 r14: ffffffff81db6000
(XEN) r15: 0000000000000ff8 cr0: 000000008005003b cr4: 00000000000026f0
(XEN) cr3: 0000000661795000 cr2: 0000000000000000
Fixes-Oracle-Bug: 14570662
CC: stable@vger.kernel.org # only for v3.5
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
patch_instruction() can be called very early on ppc32, when the kernel
isn't yet running at it's linked address. That can cause the !
is_kernel_addr() test in __put_user() to trip and call might_sleep()
which is very bad at that point during boot.
Use a lower level function instead for now, at least until we get to
rework ppc32 boot process to do the code patching later, like ppc64
does.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We have been observing hangs, both of KVM guest vcpu tasks and more
generally, where a process that is woken doesn't properly wake up and
continue to run, but instead sticks in TASK_WAKING state. This
happens because the update of rq->wake_list in ttwu_queue_remote()
is not ordered with the update of ipi_message in
smp_muxed_ipi_message_pass(), and the reading of rq->wake_list in
scheduler_ipi() is not ordered with the reading of ipi_message in
smp_ipi_demux(). Thus it is possible for the IPI receiver not to see
the updated rq->wake_list and therefore conclude that there is nothing
for it to do.
In order to make sure that anything done before smp_send_reschedule()
is ordered before anything done in the resulting call to scheduler_ipi(),
this adds barriers in smp_muxed_message_pass() and smp_ipi_demux().
The barrier in smp_muxed_message_pass() is a full barrier to ensure that
there is a full ordering between the smp_send_reschedule() caller and
scheduler_ipi(). In smp_ipi_demux(), we use xchg() rather than
xchg_local() because xchg() includes release and acquire barriers.
Using xchg() rather than xchg_local() makes sense given that
ipi_message is not just accessed locally.
This moves the barrier between setting the message and calling the
cause_ipi() function into the individual cause_ipi implementations.
Most of them -- those that used outb, out_8 or similar -- already had
a full barrier because out_8 etc. include a sync before the MMIO
store. This adds an explicit barrier in the two remaining cases.
These changes made no measurable difference to the speed of IPIs as
measured using a simple ping-pong latency test across two CPUs on
different cores of a POWER7 machine.
The analysis of the reason why processes were not waking up properly
is due to Milton Miller.
Cc: stable@vger.kernel.org # v3.0+
Reported-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
During a context switch we always restore the per thread DSCR value.
If we aren't doing explicit DSCR management
(ie thread.dscr_inherit == 0) and the default DSCR changed while
the process has been sleeping we end up with the wrong value.
Check thread.dscr_inherit and select the default DSCR or per thread
DSCR as required.
This was found with the following test case, when running with
more threads than CPUs (ie forcing context switching):
http://ozlabs.org/~anton/junkcode/dscr_default_test.c
With the four patches applied I can run a combination of all
test cases successfully at the same time:
http://ozlabs.org/~anton/junkcode/dscr_default_test.chttp://ozlabs.org/~anton/junkcode/dscr_explicit_test.chttp://ozlabs.org/~anton/junkcode/dscr_inherit_test.c
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
If the default DSCR is non zero we set thread.dscr_inherit in
copy_thread() meaning the new thread and all its children will ignore
future updates to the default DSCR. This is not intended and is
a change in behaviour that a number of our users have hit.
We just need to inherit thread.dscr and thread.dscr_inherit from
the parent which ends up being much simpler.
This was found with the following test case:
http://ozlabs.org/~anton/junkcode/dscr_default_test.c
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When we update the DSCR either via emulation of mtspr(DSCR) or via
a change to dscr_default in sysfs we don't update thread.dscr.
We will eventually update it at context switch time but there is
a period where thread.dscr is incorrect.
If we fork at this point we will copy the old value of thread.dscr
into the child. To avoid this, always keep thread.dscr in sync with
reality.
This issue was found with the following testcase:
http://ozlabs.org/~anton/junkcode/dscr_inherit_test.c
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Writing to dscr_default in sysfs doesn't actually change the DSCR -
we rely on a context switch on each CPU to do the work. There is no
guarantee we will get a context switch in a reasonable amount of time
so fire off an IPI to force an immediate change.
This issue was found with the following test case:
http://ozlabs.org/~anton/junkcode/dscr_explicit_test.c
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The CPU hotplug code for the powernv platform currently only puts
offline CPUs into nap mode if the powersave_nap variable is set.
However, HV-style KVM on this platform requires secondary CPU threads
to be offline and in nap mode. Since we know nap mode works just
fine on all POWER7 machines, and the only machines that support the
powernv platform are POWER7 machines, this changes the code to
always put offline CPUs into nap mode, regardless of powersave_nap.
Powersave_nap still controls whether or not CPUs go into nap mode
when idle, as before.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
At the moment the handler for hypervisor decrementer interrupts is
the same as for decrementer interrupts, i.e. timer_interrupt().
This is bogus; if we ever do get a hypervisor decrementer interrupt
it won't have anything to do with the next timer event. In fact
the only time we get hypervisor decrementer interrupts is when one
is left pending on exit from a KVM guest.
When we get a hypervisor decrementer interrupt we don't need to do
anything special to clear it, since they are edge-triggered on the
transition of HDEC from 0 to -1. Thus this adds an empty handler
function for them. We don't need to have them masked when interrupts
are soft-disabled, so we use STD_EXCEPTION_HV instead of
MASKABLE_EXCEPTION_HV.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
arch_update_cpu_topology() should only return 1 when the topology has
actually changed, and should return 0 otherwise.
This patch fixes a potential bug where rebuild_sched_domains() would
reinitialize the sched domains even when the topology hasn't changed.
Signed-off-by: Jesse Larrew <jlarrew@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Test-compiling obscure machines I notice that the gemini (which
by the way lacks a defconfig) is broken since some time back.
Adding a simple missing include makes it build again.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
Two regression fixes and one boot-loader compatibility fix from Simon Horman.
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
ARM: shmobile: armadillo800eva: enable rw rootfs mount
ARM: shmobile: mackerel: fixup usb module order
ARM: shmobile: armadillo800eva: fixup: sound card detection order
After commit 26b88520b8 ("mmc:
omap_hsmmc: remove private DMA API implementation"), the Nokia N800
here stopped booting:
[ 2.086181] Waiting for root device /dev/mmcblk0p1...
[ 2.324066] Unhandled fault: imprecise external abort (0x406) at 0x00000000
[ 2.331451] Internal error: : 406 [#1] ARM
[ 2.335784] Modules linked in:
[ 2.339050] CPU: 0 Not tainted (3.6.0-rc3 #60)
[ 2.344146] PC is at default_idle+0x28/0x30
[ 2.348602] LR is at trace_hardirqs_on_caller+0x15c/0x1b0
...
This turned out to be due to memory corruption caused by long-broken
PIO code in drivers/mmc/host/omap.c. (Previously, this driver had
been using DMA; but the above commit caused the MMC driver to fall
back to PIO mode with an unmodified Kconfig.)
The PIO code, added with the rest of the driver in commit
730c9b7e66 ("[MMC] Add OMAP MMC host
driver"), confused bytes with 16-bit words. This bug caused memory
located after the PIO transfer buffer to be corrupted with transfers
larger than 32 bytes. The driver also did not increment the buffer
pointer after the transfer occurred. This bug resulted in data
corruption during any transfer larger than 64 bytes.
Signed-off-by: Paul Walmsley <paul@pwsan.com>
Reviewed-by: Felipe Balbi <balbi@ti.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Chris Ball <cjb@laptop.org>