Commit Graph

264122 Commits

Author SHA1 Message Date
Sasha Levin
743eeb0b01 KVM: Intelligent device lookup on I/O bus
Currently the method of dealing with an IO operation on a bus (PIO/MMIO)
is to call the read or write callback for each device registered
on the bus until we find a device which handles it.

Since the number of devices on a bus can be significant due to ioeventfds
and coalesced MMIO zones, this leads to a lot of overhead on each IO
operation.

Instead of registering devices, we now register ranges which points to
a device. Lookup is done using an efficient bsearch instead of a linear
search.

Performance test was conducted by comparing exit count per second with
200 ioeventfds created on one byte and the guest is trying to access a
different byte continuously (triggering usermode exits).
Before the patch the guest has achieved 259k exits per second, after the
patch the guest does 274k exits per second.

Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-09-25 19:17:59 +03:00
Stefan Hajnoczi
0d460ffc09 KVM: Use __print_symbolic() for vmexit tracepoints
The vmexit tracepoints format the exit_reason to make it human-readable.
Since the exit_reason depends on the instruction set (vmx or svm),
formatting is handled with ftrace_print_symbols_seq() by referring to
the appropriate exit reason table.

However, the ftrace_print_symbols_seq() function is not meant to be used
directly in tracepoints since it does not export the formatting table
which userspace tools like trace-cmd and perf use to format traces.

In practice perf dies when formatting vmexit-related events and
trace-cmd falls back to printing the numeric value (with extra
formatting code in the kvm plugin to paper over this limitation).  Other
userspace consumers of vmexit-related tracepoints would be in similar
trouble.

To avoid significant changes to the kvm_exit tracepoint, this patch
moves the vmx and svm exit reason tables into arch/x86/kvm/trace.h and
selects the right table with __print_symbolic() depending on the
instruction set.  Note that __print_symbolic() is designed for exporting
the formatting table to userspace and allows trace-cmd and perf to work.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-09-25 19:17:59 +03:00
Stefan Hajnoczi
e097e5ffd6 KVM: Record instruction set in all vmexit tracepoints
The kvm_exit tracepoint recently added the isa argument to aid decoding
exit_reason.  The semantics of exit_reason depend on the instruction set
(vmx or svm) and the isa argument allows traces to be analyzed on other
machines.

Add the isa argument to kvm_nested_vmexit and kvm_nested_vmexit_inject
so these tracepoints can also be self-describing.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-09-25 19:17:58 +03:00
Mike Waychison
d1613ad5d0 KVM: Really fix HV_X64_MSR_APIC_ASSIST_PAGE
Commit 0945d4b228 tried to fix the get_msr path for the
HV_X64_MSR_APIC_ASSIST_PAGE msr, but was poorly tested.  We should be
returning 0 if the read succeeded, and passing the value back to the
caller via the pdata out argument, not returning the value directly.

Signed-off-by: Mike Waychison <mikew@google.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-09-25 19:17:58 +03:00
Mike Waychison
14fa67ee95 KVM: x86: get_msr support for HV_X64_MSR_APIC_ASSIST_PAGE
"get" support for the HV_X64_MSR_APIC_ASSIST_PAGE msr was missing, even
though it is explicitly enumerated as something the vmm should save in
msrs_to_save and reported to userland via the KVM_GET_MSR_INDEX_LIST
ioctl.

Add "get" support for HV_X64_MSR_APIC_ASSIST_PAGE.  We simply return the
guest visible value of this register, which seems to be correct as a set
on the register is validated for us already.

Signed-off-by: Mike Waychison <mikew@google.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-09-25 19:17:58 +03:00
Sasha Levin
2b3c246a68 KVM: Make coalesced mmio use a device per zone
This patch changes coalesced mmio to create one mmio device per
zone instead of handling all zones in one device.

Doing so enables us to take advantage of existing locking and prevents
a race condition between coalesced mmio registration/unregistration
and lookups.

Suggested-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-09-25 19:17:57 +03:00
Sasha Levin
8c3ba334f8 KVM: x86: Raise the hard VCPU count limit
The patch raises the hard limit of VCPU count to 254.

This will allow developers to easily work on scalability
and will allow users to test high VCPU setups easily without
patching the kernel.

To prevent possible issues with current setups, KVM_CAP_NR_VCPUS
now returns the recommended VCPU limit (which is still 64) - this
should be a safe value for everybody, while a new KVM_CAP_MAX_VCPUS
returns the hard limit which is now 254.

Cc: Avi Kivity <avi@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Pekka Enberg <penberg@kernel.org>
Suggested-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-09-25 19:17:57 +03:00
Sasha Levin
c298125f4b KVM: MMIO: Lock coalesced device when checking for available entry
Move the check whether there are available entries to within the spinlock.
This allows working with larger amount of VCPUs and reduces premature
exits when using a large number of VCPUs.

Cc: Avi Kivity <avi@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-09-25 19:17:18 +03:00
Xiao Guangrong
22388a3c8c KVM: x86: cleanup the code of read/write emulation
Using the read/write operation to remove the same code

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-09-25 19:17:17 +03:00
Xiao Guangrong
77d197b2ca KVM: x86: abstract the operation for read/write emulation
The operations of read emulation and write emulation are very similar, so we
can abstract the operation of them, in larter patch, it is used to cleanup the
same code

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-09-25 19:17:17 +03:00
Xiao Guangrong
ca7d58f375 KVM: x86: fix broken read emulation spans a page boundary
If the range spans a page boundary, the mmio access can be broke, fix it as
write emulation.

And we already get the guest physical address, so use it to read guest data
directly to avoid walking guest page table again

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-09-25 19:17:17 +03:00
Avi Kivity
9be3be1f15 KVM: x86 emulator: fix Src2CL decode
Src2CL decode (used for double width shifts) erronously decodes only bit 3
of %rcx, instead of bits 7:0.

Fix by decoding %cl in its entirety.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-09-25 19:14:58 +03:00
Zhao Jin
41bc3186b3 KVM: MMU: fix incorrect return of spte
__update_clear_spte_slow should return original spte while the
current code returns low half of original spte combined with high
half of new spte.

Signed-off-by: Zhao Jin <cronozhj@gmail.com>
Reviewed-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-09-25 19:13:25 +03:00
Linus Torvalds
b172e38e43 Merge branch 'spi/merge' of git://git.secretlab.ca/git/linux-2.6
* 'spi/merge' of git://git.secretlab.ca/git/linux-2.6:
  spi: Fix WARN when removing spi-fsl-spi module
  spi/imx: Fix spi-imx when the hardware SPI chipselects are used
2011-09-23 16:53:16 -07:00
Jeff Harris
387719c2ec spi: Fix WARN when removing spi-fsl-spi module
If CPM mode is not used, the fsl_dummy_rx variable is never allocated.  When
the cleanup attempts to free it, the reference count is zero and a WARN is
generated.  The same CPM mode check used in the initialize is applied to the
free as well.

Tested on 2.6.33 with the previous spi_mpc8xxx driver.  The renamed
spi-fsl-spi driver looks to have the same problem.

Signed-off-by: Jeff Harris <jeff_harris@kentrox.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-09-23 17:28:29 -06:00
Randy Dunlap
8ec9c7fb15 scsi: fix qla2xxx printk format warning
sector_t can be different types, so cast it to its largest possible
type.

  drivers/scsi/qla2xxx/qla_isr.c:1509:5: warning: format '%lx' expects type 'long unsigned int', but argument 5 has type 'sector_t'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-09-23 16:02:33 -07:00
Randy Dunlap
2b7fe39bab scsi: SCSI_ISCI needs to select SCSI_SAS_HOST_SMP, fixes build error
SCSI_ISCI needs to select SCSI_SAS_HOST_SMP to ensure that all
needed symbols are available to it.

Fixes this build error:

  ERROR: "try_test_sas_gpio_gp_bit" [drivers/scsi/isci/isci.ko] undefined!

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-09-23 16:02:33 -07:00
Linus Torvalds
78bbd284e8 Merge branch 'perf-tools-for-linus' of git://github.com/acmel/linux
* 'perf-tools-for-linus' of git://github.com/acmel/linux:
  perf python: Add missing perf_event__parse_sample 'swapped' parm
2011-09-23 15:17:02 -07:00
Linus Torvalds
eab8bcb67a Merge branch 'perf-tools-for-linus' of git://github.com/acmel/linux
* 'perf-tools-for-linus' of git://github.com/acmel/linux:
  perf tools: Add support for disabling -Werror via WERROR=0
  perf top: Fix userspace sample addr map offset
  perf symbols: Fix issue with binaries using 16-bytes buildids (v2)
  perf tool: Fix endianness handling of u32 data in samples
  perf sort: Fix symbol sort output by separating unresolved samples by type
  perf symbols: Synthesize anonymous mmap events
  perf record: Create events initially disabled and enable after init
  perf symbols: Add some heuristics for choosing the best duplicate symbol
  perf symbols: Preserve symbol scope when parsing /proc/kallsyms
  perf symbols: /proc/kallsyms does not sort module symbols
  perf symbols: Fix ppc64 SEGV in dso__load_sym with debuginfo files
  perf probe: Fix regression of variable finder
2011-09-23 13:59:37 -07:00
Linus Torvalds
f35f3dc485 Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/radeon/kms: fix DDIA enable on some rs690 systems
  Revert "drm/radeon/kms: fix typo in r100_blit_copy"
2011-09-23 12:05:53 -07:00
Linus Torvalds
0acf043e89 Merge branch 'for-linus' of git://github.com/tiwai/sound
* 'for-linus' of git://github.com/tiwai/sound:
  ALSA: usb-audio - clear chip->probing on error exit
  ALSA: fm801: Gracefully handle failure of tuner auto-detect
  ALSA: fm801: Fix double free in case of error in tuner detection
  ASoC: Ensure we generate a driver name
  ASoC: Remove bitrotted wm8962_resume()
  ASoC: bf5xx-ad73311: Fix prototype for bf5xx_probe
2011-09-23 12:04:32 -07:00
Arnaldo Carvalho de Melo
2b022a82a0 perf python: Add missing perf_event__parse_sample 'swapped' parm
Problem introduced in 936be50, that missed one perf_event__parse_sample
user, the python binding.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ja4phms9618ggi657plyuch2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-23 15:38:53 -03:00
Darren Hart
9e59e0995a perf tools: Add support for disabling -Werror via WERROR=0
GCC often introduces new warnings with lots of false positives -
breaking -Werror builds. WERROR=0 allows one to build perf without much
fuss - while still encouraging people to send patches to avoid the fuss
of having to type WERROR=0.

Bisecting back to commits that produce a (mostly harmless) warning on
some compilers is more difficult. With WERROR=0 one could bisect without
worrying about harmless warnings.

Cc: Ingo Molnar <mingo@elte.hu>
Link: http://lkml.kernel.org/r/eac06c7cc4920e5d4830417d466161fb26c7359c.1315514559.git.dvhart@linux.intel.com
Signed-off-by: Darren Hart <dvhart@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-23 14:38:07 -03:00
Arnaldo Carvalho de Melo
af52aafad2 perf top: Fix userspace sample addr map offset
The 'perf top' tool came from the kernel where we had each DSO (vmlinux,
modules) loaded just once at a time.

But userspace may have DSOs loaded in multiple addresses (shared
libraries), requiring that we use the just resolved map instead of the
first one found.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ag53wz0yllpgers0n2w7hchp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-23 14:37:54 -03:00
Stephane Eranian
be96ea8ffa perf symbols: Fix issue with binaries using 16-bytes buildids (v2)
Buildid can vary in size. According to the man page of ld, buildid can
be 160 bits (sha1) or 128 bits (md5, uuid). Perf assumes buildid size of
20 bytes (160 bits) regardless. When dealing with md5 buildids, it would
thus read more than needed and that would cause mismatches and samples
without symbols.

This patch fixes this by taking into account the actual buildid size as
encoded int he section header. The leftover bytes are also cleared.

This second version fixes a minor issue with the memset() base position.

Cc: David S. Miller <davem@davemloft.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@gmail.com>
Link: http://lkml.kernel.org/r/4cc1af3c.8ee7d80a.5a28.ffff868e@mx.google.com
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-23 14:37:41 -03:00
David Ahern
936be50306 perf tool: Fix endianness handling of u32 data in samples
Currently, analyzing PPC data files on x86 the cpu field is always 0 and
the tid and pid are backwards. For example, analyzing a PPC file on PPC
the pid/tid fields show:

        rsyslogd  1210/1212

and analyzing the same PPC file using an x86 perf binary shows:

        rsyslogd  1212/1210

The problem is that the swap_op method for samples is
perf_event__all64_swap which assumes all elements in the sample_data
struct are u64s. cpu, tid and pid are u32s and need to be handled
individually. Given that the swap is done before the sample is parsed,
the simplest solution is to undo the 64-bit swap of those elements when
the sample is parsed and do the proper swap.

The RAW data field is generic and perf cannot have programmatic knowledge
of how to treat that data. Instead a warning is given to the user.

Thanks to Anton Blanchard for providing a data file for a mult-CPU
PPC system so I could verify the fix for the CPU fields.

v3 -> v4:
- fixed use of WARN_ONCE

v2 -> v3:
- used WARN_ONCE for message regarding raw data
- removed struct wrapper around union
- fixed whitespace issues

v1 -> v2:
- added a union for undoing the byte-swap on u64 and redoing swap on
  u32's to address compiler errors (see git commit 65014ab3)

Cc: Anton Blanchard <anton@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1315321946-16993-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-23 14:37:27 -03:00
Anton Blanchard
6bb8f311a8 perf sort: Fix symbol sort output by separating unresolved samples by type
I took a profile that suggested 60% of total CPU time was in the
hypervisor:

...
    60.20%  [H] 0x33d43c
     4.43%  [k] ._spin_lock_irqsave
     1.07%  [k] ._spin_lock

Using perf stat to get the user/kernel/hypervisor breakdown contradicted
this.

The problem is we merge all unresolved samples into the one unknown
bucket. If add a comparison by sample type to sort__sym_cmp we get the
real picture:

...
    57.11%  [.] 0x80fbf63c
     4.43%  [k] ._spin_lock_irqsave
     1.07%  [k] ._spin_lock
     0.65%  [H] 0x33d43c

So it was almost all userspace, not hypervisor as the initial profile
suggested.

I found another issue while adding this. Symbol sorting sometimes shows
multiple entries for the unknown bucket:

...
    16.65%  [.] 0x6cd3a8
     7.25%  [.] 0x422460
     5.37%  [.] yylex
     4.79%  [.] malloc
     4.78%  [.] _int_malloc
     4.03%  [.] _int_free
     3.95%  [.] hash_source_code_string
     2.82%  [.] 0x532908
     2.64%  [.] 0x36b538
     0.94%  [H] 0x8000000000e132a4
     0.82%  [H] 0x800000000000e8b0

This happens because we aren't consistent with our sorting. On
one hand we check to see if both symbols match and for two unresolved
samples sym is NULL so we match:

        if (left->ms.sym == right->ms.sym)
                return 0;

On the other hand we use sample IP for unresolved samples when
comparing against a symbol:

       ip_l = left->ms.sym ? left->ms.sym->start : left->ip;
       ip_r = right->ms.sym ? right->ms.sym->start : right->ip;

This means unresolved samples end up spread across the rbtree and we
can't merge them all.

If we use cmp_null all unresolved samples will end up in the one bucket
and the output makes more sense:

...
    39.12%  [.] 0x36b538
     5.37%  [.] yylex
     4.79%  [.] malloc
     4.78%  [.] _int_malloc
     4.03%  [.] _int_free
     3.95%  [.] hash_source_code_string
     2.26%  [H] 0x800000000000e8b0

Acked-by: Eric B Munson <emunson@mgebm.net>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Link: http://lkml.kernel.org/r/20110831115145.4f598ab2@kryten
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-23 14:37:17 -03:00
Anton Blanchard
6a0e55d85b perf symbols: Synthesize anonymous mmap events
perf_event__synthesize_mmap_events does not create anonymous mmap events
even though the kernel does. As a result an already running application
with dynamically created code will not get profiled - all samples end up
in the unknown bucket.

This patch skips any entries with '[' in the name to avoid adding events
for special regions (eg the vsyscall page). All other executable mmaps
are assumed to be anonymous and an event is synthesized.

Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Link: http://lkml.kernel.org/r/20110830091506.60b51fe8@kryten
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-23 14:37:06 -03:00
David Ahern
764e16a30a perf record: Create events initially disabled and enable after init
perf-record currently creates events enabled. When doing a system wide
collection (-a arg) this causes data collection for perf's
initialization activities -- eg., perf_event__synthesize_threads().

For some events (e.g., context switch S/W event or tracepoints like
syscalls) perf's initialization causes a lot of events to be captured
frequently generating "Check IO/CPU overload!" warnings on larger
systems (e.g., 2 socket, quad core, hyperthreading).

perf's initialization phase can be skipped by creating events
disabled and then enabling them once the initialization is done.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1314289075-14706-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-23 14:36:53 -03:00
Anton Blanchard
694bf407b0 perf symbols: Add some heuristics for choosing the best duplicate symbol
Try and pick the best symbol based on a few heuristics:

-  Prefer a non weak symbol over a weak one
-  Prefer a global symbol over a non global one
-  Prefer a symbol with less underscores (idea taken from kallsyms.c)
-  If all else fails, choose the symbol with the longest name

Cc: Eric B Munson <emunson@mgebm.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110824065243.161953371@samba.org
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-23 14:36:36 -03:00
Anton Blanchard
3187790860 perf symbols: Preserve symbol scope when parsing /proc/kallsyms
kallsyms__parse capitalises the symbol type, so every symbol is marked
global. Remove this and fix symbol_type__is_a to handle both local and
global symbols.

Cc: Eric B Munson <emunson@mgebm.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110824065243.077125989@samba.org
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-23 14:36:25 -03:00
Anton Blanchard
3f5a42722b perf symbols: /proc/kallsyms does not sort module symbols
kallsyms__parse assumes that /proc/kallsyms is sorted and sets the end
of the previous symbol to the start of the current one.

Unfortunately module symbols are not sorted, eg:

ffffffffa0081f30 t e1000_clean_rx_irq   [e1000e]
ffffffffa00817a0 t e1000_alloc_rx_buffers       [e1000e]

Some symbols end up with a negative length and others have a length
larger than they should. This results in confusing perf output.

We already have a function to fixup the end of zero length symbols so
use that instead.

Cc: Eric B Munson <emunson@mgebm.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110824065242.969681349@samba.org
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-23 14:36:12 -03:00
Anton Blanchard
adb0918463 perf symbols: Fix ppc64 SEGV in dso__load_sym with debuginfo files
64bit PowerPC debuginfo files have an empty function descriptor section.
I hit a SEGV when perf tried to use this section for symbol resolution.

To fix this we need to check the section is valid and we can do this by
checking for type SHT_PROGBITS.

Cc: <stable@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Eric B Munson <emunson@mgebm.net>
Link: http://lkml.kernel.org/r/20110824065242.895239970@samba.org
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-23 14:35:57 -03:00
Masami Hiramatsu
f66fedcb72 perf probe: Fix regression of variable finder
Fix to call convert_variable() if previous call does not fail.

To call convert_variable, it ensures "ret" is 0. However, since
"ret" has the return value of synthesize_perf_probe_arg() which
always returns positive value if it succeeded, perf probe doesn't
call convert_variable(). This will cause a SEGV when we add an
event with arguments.

This has to be fixed as it ensures "ret" is greater than 0
(or not negative).

This regression has been introduced by my previous patch, f182e3e1.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/20110820053922.3286.65805.stgit@fedora15
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-23 14:33:19 -03:00
Takashi Iwai
3127b6aa75 Merge branch 'fix/asoc' into for-linus 2011-09-23 15:26:37 +02:00
Thomas Pfaff
362e4e49ab ALSA: usb-audio - clear chip->probing on error exit
The Terratec Aureon 5.1 USB sound card support is broken since kernel
2.6.39.
2.6.39 introduced power management support for USB sound cards that added
a probing flag in struct snd_usb_audio.

During the probe of the card it gives following error message :

usb 7-2: new full speed USB device number 2 using uhci_hcd
cannot find UAC_HEADER
snd-usb-audio: probe of 7-2:1.3 failed with error -5
input: USB Audio as
/devices/pci0000:00/0000:00:1d.1/usb7/7-2/7-2:1.3/input/input6
generic-usb 0003:0CCD:0028.0001: input: USB HID v1.00 Device [USB Audio]
on usb-0000:00:1d.1-2/input3

I can not comment about that "cannot find UAC_HEADER" error, but until
2.6.38 the card worked anyway.
With 2.6.39 chip->probing remains 1 on error exit, and any later ioctl
stops in snd_usb_autoresume with -ENODEV.

Signed-off-by: Thomas Pfaff <tpfaff@gmx.net>
Cc: <stable@kernel.org> [2.6.39+]
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2011-09-23 15:26:06 +02:00
Alex Deucher
fdfc61594e drm/radeon/kms: fix DDIA enable on some rs690 systems
DVOOutputControl checks the value of of bios scratch reg 3
on some tables and assumes the encoder is already enabled
if the DFP2_ACTIVE bit is set.  Clear that bit so the table
sets the DDIA enable bit properly.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-09-23 14:02:18 +01:00
Dave Airlie
d9ad77ebfd Revert "drm/radeon/kms: fix typo in r100_blit_copy"
This reverts commit 18b4fada27.

This code was correct, apologies to anyone who noticed things broke.

revert contents are different due to another commit in between.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-09-23 14:01:44 +01:00
Linus Torvalds
d942e43b58 Merge branch 'for-linus' of git://git.selinuxproject.org/~jmorris/linux-security
* 'for-linus' of git://git.selinuxproject.org/~jmorris/linux-security:
  TPM: Zero buffer after copying to userspace
  TPM: Call tpm_transmit with correct size
  TPM: tpm_nsc: Fix a double free of pdev in cleanup_nsc
  TPM: TCG_ATMEL should depend on HAS_IOPORT
2011-09-22 19:57:27 -07:00
Peter Huewe
3321c07ae5 TPM: Zero buffer after copying to userspace
Since the buffer might contain security related data it might be a good idea to
zero the buffer after we have copied it to userspace.

This got assigned CVE-2011-1162.

Signed-off-by: Rajiv Andrade <srajiv@linux.vnet.ibm.com>
Cc: Stable Kernel <stable@kernel.org>
Signed-off-by: James Morris <jmorris@namei.org>
2011-09-23 09:46:41 +10:00
Peter Huewe
6b07d30aca TPM: Call tpm_transmit with correct size
This patch changes the call of tpm_transmit by supplying the size of the
userspace buffer instead of TPM_BUFSIZE.

This got assigned CVE-2011-1161.

[The first hunk didn't make sense given one could expect
 way less data than TPM_BUFSIZE, so added tpm_transmit boundary
 check over bufsiz instead
 The last parameter of tpm_transmit() reflects the amount
 of data expected from the device, and not the buffer size
 being supplied to it. It isn't ideal to parse it directly,
 so we just set it to the maximum the input buffer can handle
 and let the userspace API to do such job.]

Signed-off-by: Rajiv Andrade <srajiv@linux.vnet.ibm.com>
Cc: Stable Kernel <stable@kernel.org>
Signed-off-by: James Morris <jmorris@namei.org>
2011-09-23 09:46:29 +10:00
Axel Lin
de69113ec1 TPM: tpm_nsc: Fix a double free of pdev in cleanup_nsc
platform_device_unregister() will release all resources
and remove it from the subsystem, then drop reference count by
calling platform_device_put().

We should not call kfree(pdev) after platform_device_unregister(pdev).

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Rajiv Andrade <srajiv@linux.vnet.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2011-09-23 09:46:17 +10:00
Geert Uytterhoeven
5ce5ed3593 TPM: TCG_ATMEL should depend on HAS_IOPORT
On m68k, I get:

drivers/char/tpm/tpm_atmel.h: In function ‘atmel_get_base_addr’:
drivers/char/tpm/tpm_atmel.h:129: error: implicit declaration of function ‘ioport_map’
drivers/char/tpm/tpm_atmel.h:129: warning: return makes pointer from integer without a cast

The code in tpm_atmel.h supports PPC64 (using the device tree and ioremap())
and "anything else" (using ioport_map()). However, ioportmap() is only
available on platforms that set HAS_IOPORT.

Although PC64 seems to have HAS_IOPORT, a "depends on HAS_IOPORT" should work,
but I think it's better to expose the special PPC64 handling explicit using
"depends on PPC64 || HAS_IOPORT".

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Rajiv Andrade <srajiv@linux.vnet.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2011-09-23 09:45:57 +10:00
David Rientjes
e369fde1af thp: fix khugepaged defrag tunable documentation
Commit e27e6151b1 ("mm/thp: use conventional format for boolean
attributes") changed

  /sys/kernel/mm/transparent_hugepage/khugepaged/defrag

to be tuned by using 1 (enabled) or 0 (disabled) instead of "yes" and
"no", respectively.

Update the documentation.

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-09-22 14:27:14 -07:00
Geert Uytterhoeven
a7f4d00a82 zorro: Defer device_register() until all devices have been identified
As the Amiga Zorro II address space is limited to 8.5 MiB and Zorro
devices can contain only one BAR, several Amiga Zorro II expansion
boards (mainly graphics cards) contain multiple Zorro devices: a small
one for the control registers and one (or more) for the graphics memory.

The conversion of cirrusfb to the new driver framework introduced a
regression: the driver contains a zorro_driver for the first Zorro
device, and uses the (old) zorro_find_device() call to find the second
Zorro device.

However, as the Zorro core calls device_register() as soon as a Zorro
device is identified, it may not have identified the second Zorro device
belonging to the same physical Zorro expansion card.  Hence cirrusfb
could no longer find the second part of the Picasso II graphics card,
causing a NULL pointer dereference.

Defer the registration of Zorro devices with the driver framework until
all Zorro devices have been identified to fix this.

Note that the alternative solution (modifying cirrusfb to register a
zorro_driver for all Zorro devices belonging to a graphics card, instead
of only for the first one, and adding a synchronization mechanism to
defer initialization until all have been found), is not an option, as on
some cards one device may be optional (e.g.  the second bank of 2 MiB of
graphics memory on the Picasso IV in Zorro II mode).

Reported-by: Ingo Jürgensmann <ij@2011.bluespice.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-09-22 12:59:35 -07:00
Linus Torvalds
fae3f6f2ee Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
  [S390] kvm: extension capability for new address space layout
  [S390] kvm: fix address mode switching
2011-09-22 09:32:21 -07:00
Ben Hutchings
c37279b92a ALSA: fm801: Gracefully handle failure of tuner auto-detect
Commit 9676001559
("ALSA: fm801: add error handling if auto-detect fails") seems to
break systems that were previously working without a tuner.

As a bonus, this should fix init and cleanup for the case where the
tuner is explicitly disabled.

Reported-and-tested-by: Hor Jiun Shyong <jiunshyong@gmail.com>
References: http://bugs.debian.org/641946
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: stable@kernel.org [v3.0+]
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2011-09-22 15:52:52 +02:00
Ben Hutchings
2ba34e43ba ALSA: fm801: Fix double free in case of error in tuner detection
Commit 9676001559
("ALSA: fm801: add error handling if auto-detect fails") added
incorrect error handling.

Once we have successfully called snd_device_new(), the cleanup
function fm801_free() will automatically be called by snd_card_free()
and we must *not* also call fm801_free() directly.

Reported-by: Hor Jiun Shyong <jiunshyong@gmail.com>
References: http://bugs.debian.org/641946
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: stable@kernel.org [v3.0+]
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2011-09-22 15:51:46 +02:00
Linus Torvalds
d93dc5c447 Linux 3.1-rc7 2011-09-21 16:58:15 -07:00
Lasse Collin
9c1f8594df XZ: Fix incorrect XZ_BUF_ERROR
xz_dec_run() could incorrectly return XZ_BUF_ERROR if all of the
following was true:

 - The caller knows how many bytes of output to expect and only provides
   that much output space.

 - When the last output bytes are decoded, the caller-provided input
   buffer ends right before the LZMA2 end of payload marker.  So LZMA2
   won't provide more output anymore, but it won't know it yet and thus
   won't return XZ_STREAM_END yet.

 - A BCJ filter is in use and it hasn't left any unfiltered bytes in the
   temp buffer.  This can happen with any BCJ filter, but in practice
   it's more likely with filters other than the x86 BCJ.

This fixes <https://bugzilla.redhat.com/show_bug.cgi?id=735408> where
Squashfs thinks that a valid file system is corrupt.

This also fixes a similar bug in single-call mode where the uncompressed
size of a block using BCJ + LZMA2 was 0 bytes and caller provided no
output space.  Many empty .xz files don't contain any blocks and thus
don't trigger this bug.

This also tweaks a closely related detail: xz_dec_bcj_run() could call
xz_dec_lzma2_run() to decode into temp buffer when it was known to be
useless.  This was harmless although it wasted a minuscule number of CPU
cycles.

Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Cc: stable <stable@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-09-21 13:39:59 -07:00