OK, that's probably the easiest way to do that, as much as I don't like it...
Since iget() et.al. will not accept I_FREEING (will wait to go away
and restart), and since we'd better have serialization between new/free
on fs data structures anyway, we can afford simply skipping I_FREEING
et.al. in insert_inode_locked().
We do that from new_inode, so it won't race with free_inode in any interesting
ways and it won't race with iget (of any origin; nfsd or in case of fs
corruption a lookup) since both still will wait for I_LOCK.
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: Jan Kara <jack@suse.cz>
Tested-by: David Watson <dbwatson@ukfsn.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The nobh_truncate_page() function is used by ext2, exofs, and jfs. Of
these three, only ext2 and jfs's get_block() function pays attention
to bh->b_size --- which is normally always the filesystem blocksize
except when the get_block() function is called by either
mpage_readpage(), mpage_readpages(), or the direct I/O routines in
fs/direct_io.c.
Unfortunately, nobh_truncate_page() does not initialize map_bh before
calling the filesystem-supplied get_block() function. So ext2 and jfs
will try to calculate the number of blocks to map by taking stack
garbage and shifting it left by inode->i_blkbits. This should be
*mostly* harmless (except the filesystem will do some unnneeded work)
unless the stack garbage is less than filesystem's blocksize, in which
case maxblocks will be zero, and the attempt to find out whether or
not the filesystem has a hole at a given logical block will fail, and
the page cache entry might not get zero'ed out.
Also if the stack garbage in in map_bh->state happens to have the
BH_Mapped bit set, there could be an attempt to call readpage() on a
non-existent page, which could cause nobh_truncate_page() to return an
error when it should not.
Fix this by initializing map_bh->state and map_bh->size.
Fortunately, it's probably fairly unlikely that ext2 and jfs users
mount with nobh these days.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
Btrfs: Fix oops and use after free during space balancing
Btrfs: set device->total_disk_bytes when adding new device
DaVinci clock support has been updated in mainline.
Update clock names accordingly.
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
ata_piix: Add HP Compaq nc6000 to the broken poweroff list
ahci: add warning messages for hp laptops with broken suspend
pata_efar: fix PIO2 underclocking
pata_legacy: wait for async probing
HP Compaq nc6000 suffers from the double disk spindown issue.
Add it to the broken poweroff DMI list.
Signed-off-by: Ville Syrjala <syrjala@sci.fi>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Harddisks on HP dv[4-6] and HDX18 fail to come online after resume on
earlier BIOSen. Fortunately, HP recently released BIOS updates for
all machines to fix the issue. Detect old BIOSen, warn the user to
update BIOS on boot and suspend attempts and fail suspend.
Kudos to all the bug reporters.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: kernel.org@epperson.homelinux.net
Cc: emisca@gmail.com
Cc: Gadi Cohen <dragon@wastelands.net>
Cc: Paul Swanson <paul@procursa.com>
Cc: s@ourada.org
Cc: Trevor Davenport <trevor.davenport@gmail.com>
Cc: corruptor1972 <steven_tierney@yahoo.co.uk>
Cc: Victoria Wilson <mail@vwilson.co.uk>
Cc: khiraly <khiraly.list@gmail.com>
Cc: Sean <wollombi@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Fix the PIO mode 2 using mode 0 timings -- this driver should enable the
fast timing bank starting with PIO2, just like the PIIX/ICH drivers do.
Also, fix/rephrase some comments while at it.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
The basic problem here that pata_legacy attaches the host, sees if it found
any devices and detaches it if none were found. With async probing, it's not
waiting until discovery is finished before deciding it has no devices and
trying the detach leading to this warning:
ata1: PATA max PIO4 cmd 0x1f0 ctl 0x3f6 irq 14
------------[ cut here ]------------
WARNING: at drivers/ata/libata-core.c:6222 ata_host_detach+0x75/0x90()
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.30-rc7 #1
Call Trace:
[<c01fbb05>] ? ata_host_detach+0x75/0x90
[<c01fbb05>] ? ata_host_detach+0x75/0x90
[<c01139b5>] ? warn_slowpath_common+0x45/0x80
[<c01139fa>] ? warn_slowpath_null+0xa/0x10
[<c01fbb05>] ? ata_host_detach+0x75/0x90
[<c02f40e0>] ? legacy_init+0x44e/0x87f
[<c02f3c92>] ? legacy_init+0x0/0x87f
[<c0101021>] ? _stext+0x21/0x140
[<c01890ff>] ? proc_register+0x2f/0x190
[<c018938c>] ? create_proc_entry+0x5c/0xc0
[<c0135ebe>] ? register_irq_proc+0x6e/0x90
[<c02e6484>] ? kernel_init+0x6e/0xbf
[<c02e6416>] ? kernel_init+0x0/0xbf
[<c01031d7>] ? kernel_thread_helper+0x7/0x10
---[ end trace ef1ee36e873ae3a0 ]---
Because it detaches before the probe is complete.
One way to fix it would be to put an async_synchronize_full() before looking
for devices, which this patch does. A better way might be to separate libata
into its own domain and only wait for that.
Reported-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
The powernow-k8 driver checks to see that the Performance Control/Status
Registers are declared as FFH (functional fixed hardware) by the BIOS.
However, this check got broken in the commit:
0e64a0c982
[CPUFREQ] checkpatch cleanups for powernow-k8
Fix based on an original patch from Naga Chumbalkar.
Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Cc: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Dave Jones <davej@redhat.com>
This reverts commit 6c51d1cfa0, which
apparently causes DRI initialization failures on Radeons.
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Requested-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The ivtv stream buffers may be for receive or for send but the attached
sg handle is always destined cpu->device. We flush it correctly but the
allocation is wrongly done with the same type as the buffers.
See bug: http://bugzilla.kernel.org/show_bug.cgi?id=13385
(Note this doesn't close the bug - it fixes the ivtv part and in turn
the logging next shows up some rather alarming DMA sg list warnings in
libata)
Signed-off-by: Alan Cox <alan@linux.intel.com>
Acked-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 95a3540da9 ("ptrace_detach: the wrong
wakeup breaks the ERESTARTxxx logic") removed the "extra"
wake_up_process() from ptrace_detach(), but as Jan pointed out this breaks
the compatibility.
I believe the changelog is right and this wake_up() is wrong in many
ways, but GDB assumes that ptrace(PTRACE_DETACH, child, 0, 0) always
wakes up the tracee.
Despite the fact this breaks SIGNAL_STOP_STOPPED/group_stop_count logic,
and despite the fact this wake_up_process() can break another
assumption: PTRACE_DETACH with SIGSTOP should leave the tracee in
TASK_STOPPED case. Because the untraced child can dequeue SIGSTOP and
call do_signal_stop() before ptrace_detach() calls wake_up_process().
Revert this change for now. We need some fixes even if we we want to keep
the current behaviour, but these fixes are not for 2.6.30.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The checking of CONFIG_FRAME_WARN in the top level Makefile forgot to
actually derefence the variable thus leading to an always true check.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Andi Kleen <ak@suse.de>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The "trace || CLONE_PTRACE" check in tracehook_report_clone() is not right,
- If the untraced task does clone(CLONE_PTRACE) the new child is not traced,
we must not queue SIGSTOP.
- If we forked the traced task, but the tracer exits and untraces both the
forking task and the new child (after copy_process() drops tasklist_lock),
we should not queue SIGSTOP too.
Change the code to check task_ptrace() != 0 instead. This is still racy, but
the race is harmless.
We can race with another tracer attaching to this child, or the tracer can
exit and detach in parallel. But giwen that we didn't do wake_up_new_task()
yet, the child must have the pending SIGSTOP anyway.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm: ignore EDID with really tiny modes.
drm: don't associate _DRM_DRIVER maps with a master
drm/i915: intel_lvds.c fix section mismatch
drm: Hook up DPMS property handling in drm_crtc.c. Add drm_helper_connector_dpms.
drm: set permissions on edid file to 0444
drm: add newlines to text sysfs files
drm/radeon: fix ring free alignment calculations
drm: fix irq naming for kms drivers.
While running 20 parallel instances of dd as follows:
#!/bin/bash
for i in `seq 1 20`; do
dd if=/dev/zero of=/export/hda3/dd_$i bs=1073741824 count=1 &
done
wait
on a 16G machine, we noticed that rather than just killing the processes,
the entire kernel went down. Stracing dd reveals that it first does an
mmap2, which makes 1GB worth of zero page mappings. Then it performs a
read on those pages from /dev/zero, and finally it performs a write.
The machine died during the reads. Looking at the code, it was noticed
that /dev/zero's read operation had been changed by
557ed1fa26 ("remove ZERO_PAGE") from giving
zero page mappings to actually zeroing the page.
The zeroing of the pages causes physical pages to be allocated to the
process. But, when the process exhausts all the memory that it can, the
kernel cannot kill it, as it is still in the kernel mode allocating more
memory. Consequently, the kernel eventually crashes.
To fix this, I propose that when a fatal signal is pending during
/dev/zero read operation, we simply return and let the user process die.
Signed-off-by: Salman Qazi <sqazi@google.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ Modified error return and comment trivially. - Linus]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The btrfs allocator uses list_for_each to walk the available block
groups when searching for free blocks. It starts off with a hint
to help find the best block group for a given allocation.
The hint is resolved into a block group, but we don't properly check
to make sure the block group we find isn't in the middle of being
freed due to filesystem shrinking or balancing. If it is being
freed, the list pointers in it are bogus and can't be trusted. But,
the code happily goes along and uses them in the list_for_each loop,
leading to all kinds of fun.
The fix used here is to check to make sure the block group we find really
is on the list before we use it. list_del_init is used when removing
it from the list, so we can do a proper check.
The allocation clustering code has a similar bug where it will trust
the block group in the current free space cluster. If our allocation
flags have changed (going from single spindle dup to raid1 for example)
because the drives in the FS have changed, we're not allowed to use
the old block group any more.
The fix used here is to check the current cluster against the
current allocation flags.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
We don't set up the canary; let's disable stack protector on boot.c so
we can get into lguest_init, then set it up. As a side effect,
switch_to_new_gdt() sets up %fs for us properly too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This could be triggered by a gtt mapping fault on 965 that decides to
remove the fence from another object that happens to be active currently.
Since the other object doesn't rely on the fence reg for its execution, we
don't wait for it to finish. We'll soon be not waiting on 915 most of the
time as well, so just drop the BUG_ON.
Signed-off-by: Eric Anholt <eric@anholt.net>
Some EDIDs lie and report tiny modes that aren't possible. Ignore
these modes.
Signed-off-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
A driver will use the _DRM_DRIVER map flag to indicate that it wants
to be responsible for removing the map itself, bypassing the DRM's
automagic cleanup code.
Since the multi-master changes this has been broken, resulting in some
drivers having their registers unmapped before it's finished with them.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
intel_no_lvds[] does not require __initdata as it is used only by
void intel_lvds_init(struct drm_device *dev).
Signed-off-by: Jaswinder Singh Rajput <jaswinder@kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Making the drm_crtc.c code recognize the DPMS property and invoke the
connector->dpms function doesn't remove any capability from the driver while
reducing code duplication.
That just highlighted the problem with the existing DPMS functions which
could turn off the connector, but failed to turn off any relevant crtcs. The
new drm_helper_connector_dpms function manages all of that, using the
drm_helper-specific crtc and encoder dpms functions, automatically computing
the appropriate DPMS level for each object in the system.
This fixes the current troubles in the i915 driver which left PLLs, pipes
and planes running while in DPMS_OFF mode or even while they were unused.
Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Without initializing the sysfs attributes for the edid file,
it was created with mode 0, making it difficult for applications to use.
Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The contents of various simple text files in sysfs should end with
a newline to make them easier to read from the console.
Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
fd.o bz#21849
We were aligning to +16 dwords, instead of to the next 16dword
boundary in the ring. Fix the calculation to go to the next 16dword
boundary when space checking.
Signed-off-by: Dave Airlie <airlied@redhat.com>
allocating devname in the i915 driver was a hack originally and I
forgot to figure out how to do this properly back then.
So this is the cleaner version that just picks devname or driver name
in the irq code.
It removes the devname allocs from the i915 driver.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Ideally we should have a directory of drivers and a link to the 'active'
driver. For now just show the first device which is effectively the existing
semantics without a warning.
This is an update on the original buggy patch that I then forgot to
resubmit. Confusingly it was proposed by Red Hat, written by Etched Pixels
fixed and submitted by Intel ...
Resolves-Bug: http://bugzilla.kernel.org/show_bug.cgi?id=9749
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This matches Bartlomiej's patch for ide_pci_generic:
c339dfdd65
In the libata case netcell has its own mini driver. I suspect this fix is
actually only needed for some firmware revs but it does no harm either way.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
xfs: prevent deadlock in xfs_qm_shake()
xfs: fix overflow in xfs_growfs_data_private
xfs: fix double unlock in xfs_swap_extents()
This patch fixes a bug which unconfigured struct tcf_proto keeps
chaining in tc_ctl_tfilter(), and avoids kernel panic in
cls_cgroup_classify() when we use cls_cgroup.
When we execute 'tc filter add', tcf_proto is allocated, initialized
by classifier's init(), and chained. After it's chained,
tc_ctl_tfilter() calls classifier's change(). When classifier's
change() fails, tc_ctl_tfilter() does not free and keeps tcf_proto.
In addition, cls_cgroup is initialized in change() not in init(). It
accesses unconfigured struct tcf_proto which is chained before
change(), then hits Oops.
Signed-off-by: Minoru Usui <usui@mxm.nes.nec.co.jp>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Tested-by: Minoru Usui <usui@mxm.nes.nec.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patch to fix bad length checking in e1000. E1000 by default does two
things:
1) Spans rx descriptors for packets that don't fit into 1 skb on recieve
2) Strips the crc from a frame by subtracting 4 bytes from the length prior to
doing an skb_put
Since the e1000 driver isn't written to support receiving packets that span
multiple rx buffers, it checks the End of Packet bit of every frame, and
discards it if its not set. This places us in a situation where, if we have a
spanning packet, the first part is discarded, but the second part is not (since
it is the end of packet, and it passes the EOP bit test). If the second part of
the frame is small (4 bytes or less), we subtract 4 from it to remove its crc,
underflow the length, and wind up in skb_over_panic, when we try to skb_put a
huge number of bytes into the skb. This amounts to a remote DOS attack through
careful selection of frame size in relation to interface MTU. The fix for this
is already in the e1000e driver, as well as the e1000 sourceforge driver, but no
one ever pushed it to e1000. This is lifted straight from e1000e, and prevents
small frames from causing the underflow described above
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Tested-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a phy_power_down parameter to forcedeth: set to 1 to power down the
phy and disable the link when an interface goes down; set to 0 to always
leave the phy powered up.
The phy power state persists across reboots; Windows, some BIOSes, and
older versions of Linux don't bother to power up the phy again, forcing
users to remove all power to get the interface working (see
http://bugzilla.kernel.org/show_bug.cgi?id=13072). Leaving the phy
powered on is the safest default behavior. Users accustomed to seeing
the link state reflect the interface state and/or wanting to minimize
power consumption can set phy_power_down=1 if compatibility with other
OSes is not an issue.
Signed-off-by: Ed Swierk <eswierk@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It's possible to recurse into filesystem from the memory
allocation, which deadlocks in xfs_qm_shake(). Add check
for __GFP_FS, and bail out if it is not set.
Signed-off-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Hedi Berriche <hedi@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
In the case where growing a filesystem would leave the last AG
too small, the fixup code has an overflow in the calculation
of the new size with one fewer ag, because "nagcount" is a 32
bit number. If the new filesystem has > 2^32 blocks in it
this causes a problem resulting in an EINVAL return from growfs:
# xfs_io -f -c "truncate 19998630180864" fsfile
# mkfs.xfs -f -bsize=4096 -dagsize=76288719b,size=3905982455b fsfile
# mount -o loop fsfile /mnt
# xfs_growfs /mnt
meta-data=/dev/loop0 isize=256 agcount=52,
agsize=76288719 blks
= sectsz=512 attr=2
data = bsize=4096 blocks=3905982455, imaxpct=5
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal bsize=4096 blocks=32768, version=2
= sectsz=512 sunit=0 blks, lazy-count=0
realtime =none extsz=4096 blocks=0, rtextents=0
xfs_growfs: XFS_IOC_FSGROWFSDATA xfsctl failed: Invalid argument
Reported-by: richard.ems@cape-horn-eng.com
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
Regreesion from commit ef8f7fc, which rearranged the code in
xfs_swap_extents() leading to double unlock of xfs inode ilock.
That resulted in xfs_fsr deadlocking itself on platforms, which
don't handle double unlock of rw_semaphore nicely. It caused the
count go negative, which represents the write holder, without
really having one. ia64 is one of the platforms where deadlock
was easily reproduced and the fix was tested.
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Felix Blyakher <felixb@sgi.com>