* pci/hotplug:
PCI: cpqphp: Fix possible null pointer dereference
NVMe: Implement PCIe reset notification callback
PCI: Notify driver before and after device reset
* pci/pci_is_bridge:
pcmcia: Use pci_is_bridge() to simplify code
PCI: pciehp: Use pci_is_bridge() to simplify code
PCI: acpiphp: Use pci_is_bridge() to simplify code
PCI: cpcihp: Use pci_is_bridge() to simplify code
PCI: shpchp: Use pci_is_bridge() to simplify code
PCI: rpaphp: Use pci_is_bridge() to simplify code
sparc/PCI: Use pci_is_bridge() to simplify code
powerpc/PCI: Use pci_is_bridge() to simplify code
ia64/PCI: Use pci_is_bridge() to simplify code
x86/PCI: Use pci_is_bridge() to simplify code
PCI: Use pci_is_bridge() to simplify code
PCI: Add new pci_is_bridge() interface
PCI: Rename pci_is_bridge() to pci_has_subordinate()
* pci/virtualization:
PCI: Introduce new device binding path using pci_dev.driver_override
Conflicts:
drivers/pci/pci-sysfs.c
The driver_override field allows us to specify the driver for a device
rather than relying on the driver to provide a positive match of the
device. This shortcuts the existing process of looking up the vendor and
device ID, adding them to the driver new_id, binding the device, then
removing the ID, but it also provides a couple advantages.
First, the above existing process allows the driver to bind to any device
matching the new_id for the window where it's enabled. This is often not
desired, such as the case of trying to bind a single device to a meta
driver like pci-stub or vfio-pci. Using driver_override we can do this
deterministically using:
echo pci-stub > /sys/bus/pci/devices/0000:03:00.0/driver_override
echo 0000:03:00.0 > /sys/bus/pci/devices/0000:03:00.0/driver/unbind
echo 0000:03:00.0 > /sys/bus/pci/drivers_probe
Previously we could not invoke drivers_probe after adding a device to
new_id for a driver as we get non-deterministic behavior whether the driver
we intend or the standard driver will claim the device. Now it becomes a
deterministic process, only the driver matching driver_override will probe
the device.
To return the device to the standard driver, we simply clear the
driver_override and reprobe the device:
echo > /sys/bus/pci/devices/0000:03:00.0/driver_override
echo 0000:03:00.0 > /sys/bus/pci/devices/0000:03:00.0/driver/unbind
echo 0000:03:00.0 > /sys/bus/pci/drivers_probe
Another advantage to this approach is that we can specify a driver override
to force a specific binding or prevent any binding. For instance when an
IOMMU group is exposed to userspace through VFIO we require that all
devices within that group are owned by VFIO. However, devices can be
hot-added into an IOMMU group, in which case we want to prevent the device
from binding to any driver (override driver = "none") or perhaps have it
automatically bind to vfio-pci. With driver_override it's a simple matter
for this field to be set internally when the device is first discovered to
prevent driver matches.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Remove pcibios_add_platform_entries(). Architecture-specific attributes
can be achieved by setting pdev->dev.groups.
Link: https://lkml.kernel.org/r/alpine.LFD.2.11.1404141101500.1529@denkbrett
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Move the devspec OF attribute to PCI common code's set of device attributes
since it's not architecture dependent. As a side effect microblaze and
powerpc no longer need to use pcibios_add_platform_entries().
[bhelgaas: fold in #include for compile error]
Link: https://lkml.kernel.org/r/alpine.LFD.2.11.1404141101500.1529@denkbrett
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
driver-core now supports synchrnous self-deletion of attributes and
the asynchrnous removal mechanism is scheduled for removal. Use it
instead of device_schedule_callback(). This makes "remove" behave
synchronously.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are multiple PCI device addition and removal code paths that may be
run concurrently with the generic PCI bus rescan and device removal that
can be triggered via sysfs. If that happens, it may lead to multiple
different, potentially dangerous race conditions.
The most straightforward way to address those problems is to run
the code in question under the same lock that is used by the
generic rescan/remove code in pci-sysfs.c. To prepare for those
changes, move the definition of the global PCI remove/rescan lock
to probe.c and provide global wrappers, pci_lock_rescan_remove()
and pci_unlock_rescan_remove(), allowing drivers to manipulate
that lock. Also provide pci_stop_and_remove_bus_device_locked()
for the callers of pci_stop_and_remove_bus_device() who only need
to hold the rescan/remove lock around it.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Fix whitespace, capitalization, and spelling errors. No functional change.
I know "busses" is not an error, but "buses" was more common, so I used it
consistently.
Signed-off-by: Marta Rybczynska <rybczynska@gmail.com> (pci_reset_bridge_secondary_bus())
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
local_cpus_show() and local_cpulist_show() are almost the same.
This adds a new helper function, pci_dev_show_local_cpu(), to simplify
code.
The same strategy is already used by cpuaffinity_show() and
cpulistaffinity_show().
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Local variables used only in this file are made static.
[bhelgaas: also make pci_dev_attrs[] static (from Fengguang)]
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The dev_attrs field of struct bus_type is going away soon, dev_groups
should be used instead. This converts the PCI bus code to use the
correct field.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The bus_attrs field of struct bus_type is going away soon, dev_groups
should be used instead. This converts the PCI bus code to use the
correct field.
Cc: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The dev_attrs field of struct class is going away soon, dev_groups
should be used instead. This converts the PCI class code to use the
correct field.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
PCI devices for SR-IOV virtual functions should only be created/
destroyed by pci_enable_sriov()/pci_disable_sriov() because special
data structures are associated with SR-IOV virtual functions.
So hide hotplug related sysfs interfaces "remove" and "rescan" for
SR-IOV virtual functions, otherwise it may cause memory leakage
and other issues.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Donald Dutile <ddutile@redhat.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Ram Pai <linuxram@us.ibm.com>
The usage of strict_strtoul() is not preferred, because
strict_strtoul() is obsolete. Thus, kstrtoul() should be
used.
[bhelgaas: "#define strict_strtoul kstrtoul", so no functional change]
Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
(*->vm_end - *->vm_start) >> PAGE_SHIFT operation is implemented
as an inline funcion vma_pages() in linux/mm.h, so use it.
Signed-off-by: Libin <huawei.libin@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
If we request "num_vfs" and the driver's sriov_configure() method enables
exactly that number ("num_vfs_enabled"), we complain "Invalid value for
number of VFs to enable" and return an error. We should silently return
success instead.
Also, use kstrtou16() since numVFs is defined to be a 16-bit field and
rework to simplify control flow.
Reported-by: Greg Rose <gregory.v.rose@intel.com>
Reference: http://lkml.kernel.org/r/20121214101911.00002f59@unknown
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Donald Dutile <ddutile@redhat.com>
Host bridge hotplug:
- Untangle _PRT from struct pci_bus (Bjorn Helgaas)
- Request _OSC control before scanning root bus (Taku Izumi)
- Assign resources when adding host bridge (Yinghai Lu)
- Remove root bus when removing host bridge (Yinghai Lu)
- Remove _PRT during hot remove (Yinghai Lu)
SRIOV
- Add sysfs knobs to control numVFs (Don Dutile)
Power management
- Notify devices when power resource turned on (Huang Ying)
Bug fixes
- Work around broken _SEG on HP xw9300 (Bjorn Helgaas)
- Keep runtime PM enabled for unbound PCI devices (Huang Ying)
- Fix Optimus dual-GPU runtime D3 suspend issue (Dave Airlie)
- Fix xen frontend shutdown issue (David Vrabel)
- Work around PLX PCI 9050 BAR alignment erratum (Ian Abbott)
Miscellaneous
- Add GPL license for drivers/pci/ioapic (Andrew Cooks)
- Add standard PCI-X, PCIe ASPM register #defines (Bjorn Helgaas)
- NumaChip remote PCI support (Daniel Blueman)
- Fix PCIe Link Capabilities Supported Link Speed definition (Jingoo Han)
- Convert dev_printk() to dev_info(), etc (Joe Perches)
- Add support for non PCI BAR ROM data (Matthew Garrett)
- Add x86 support for host bridge translation offset (Mike Yoknis)
- Report success only when every driver supports AER (Vijay Pandarathil)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
iQIcBAABAgAGBQJQyKwSAAoJEPGMOI97Hn6zScgQAJZK2VDfCv74mKrgSDNokIzH
5nVDrc9AHKJm7CUODs6keJK5d4TD/za3Zao68zrYHsJJKes2ni2Z3W34HP2RXKK2
eOmePXOHYPPZMlimP9r9cVxNu1ZJCyp/yWSBcsPF4zUgWhBWLRaSj85I049gQ0sz
+05nZYfLjVd3HNiaXsG4CQyMrNF46XEsLhF9vs+Nr2GHPwrpzhfScgYv63oDS86C
3ICKsjmiRUZcNelxIFYmyxa5u89QdW5XHjzc9eHGQuus24Vxw+TZzsdfc17sUJEE
HTyXY+RjDpOVhdtwwUjrCEOiyZYvy3g9+3sKxoxgt/76ghdUaR7fxITwB97qVMFD
T0ESlKjSV/Qv5QYdyy5uP4zwNs/PXCWXkTg/L1m71F30BxKWDa7tgiA6uK7Z7fl5
1aokKBdk3mtJJJIDJG1YkxPXx/JItTGCNYrx7CcFj49rSjrUWLQdmrYahersRIsB
3wiD2xTi9e4dXeP/+VGzGOWB/sHk+73jvrvZe/REa1FCnMINDz4+9V9WaGROMqyq
MQ8kX0KfYcNVNxy1GOXjU5wLpMN/t/QbvI7gwzRP1DAUCJPoOgFy7AjvSTVG3zuy
8CtdOFttVkUn5dqsbQR0gVbyQVTS3PGSKz5XC/s8kVDWhja0xZTBYwrskM/4zdSD
Xf48OyYV5EjpC3FYUSiU
=OE3Q
-----END PGP SIGNATURE-----
Merge tag 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI update from Bjorn Helgaas:
"Host bridge hotplug:
- Untangle _PRT from struct pci_bus (Bjorn Helgaas)
- Request _OSC control before scanning root bus (Taku Izumi)
- Assign resources when adding host bridge (Yinghai Lu)
- Remove root bus when removing host bridge (Yinghai Lu)
- Remove _PRT during hot remove (Yinghai Lu)
SRIOV
- Add sysfs knobs to control numVFs (Don Dutile)
Power management
- Notify devices when power resource turned on (Huang Ying)
Bug fixes
- Work around broken _SEG on HP xw9300 (Bjorn Helgaas)
- Keep runtime PM enabled for unbound PCI devices (Huang Ying)
- Fix Optimus dual-GPU runtime D3 suspend issue (Dave Airlie)
- Fix xen frontend shutdown issue (David Vrabel)
- Work around PLX PCI 9050 BAR alignment erratum (Ian Abbott)
Miscellaneous
- Add GPL license for drivers/pci/ioapic (Andrew Cooks)
- Add standard PCI-X, PCIe ASPM register #defines (Bjorn Helgaas)
- NumaChip remote PCI support (Daniel Blueman)
- Fix PCIe Link Capabilities Supported Link Speed definition (Jingoo
Han)
- Convert dev_printk() to dev_info(), etc (Joe Perches)
- Add support for non PCI BAR ROM data (Matthew Garrett)
- Add x86 support for host bridge translation offset (Mike Yoknis)
- Report success only when every driver supports AER (Vijay
Pandarathil)"
Fix up trivial conflicts.
* tag 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (48 commits)
PCI: Use phys_addr_t for physical ROM address
x86/PCI: Add NumaChip remote PCI support
ath9k: Use standard #defines for PCIe Capability ASPM fields
iwlwifi: Use standard #defines for PCIe Capability ASPM fields
iwlwifi: collapse wrapper for pcie_capability_read_word()
iwlegacy: Use standard #defines for PCIe Capability ASPM fields
iwlegacy: collapse wrapper for pcie_capability_read_word()
cxgb3: Use standard #defines for PCIe Capability ASPM fields
PCI: Add standard PCIe Capability Link ASPM field names
PCI/portdrv: Use PCI Express Capability accessors
PCI: Use standard PCIe Capability Link register field names
x86: Use PCI setup data
PCI: Add support for non-BAR ROMs
PCI: Add pcibios_add_device
EFI: Stash ROMs if they're not in the PCI BAR
PCI: Add and use standard PCI-X Capability register names
PCI/PM: Keep runtime PM enabled for unbound PCI devices
xen-pcifront: Handle backend CLOSED without CLOSING
PCI: SRIOV control and status via sysfs (documentation)
PCI/AER: Report success only when every device has AER-aware driver
...
Remove conditional code based on CONFIG_HOTPLUG being false. It's
always on now in preparation of it going away as an option.
Signed-off-by: Bill Pemberton <wfp5p@virginia.edu>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some implementations of SRIOV provide a capability structure
value of TotalVFs that is greater than what the software can support.
Provide a method to reduce the capability structure reported value
to the value the driver can support.
This ensures sysfs reports the current capability of the system,
hardware and software.
Example for its use: igb & ixgbe -- report 8 & 64 as TotalVFs,
but drivers only support 7 & 63 maximum.
Signed-off-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Provide files under sysfs to determine the maximum number of VFs
an SR-IOV-capable PCIe device supports, and methods to enable and
disable the VFs on a per-device basis.
Currently, VF enablement by SR-IOV-capable PCIe devices is done
via driver-specific module parameters. If not setup in modprobe files,
it requires admin to unload & reload PF drivers with number of desired
VFs to enable. Additionally, the enablement is system wide: all
devices controlled by the same driver have the same number of VFs
enabled. Although the latter is probably desired, there are PCI
configurations setup by system BIOS that may not enable that to occur.
Two files are created for the PF of PCIe devices with SR-IOV support:
sriov_totalvfs Contains the maximum number of VFs the device
could support as reported by the TotalVFs register
in the SR-IOV extended capability.
sriov_numvfs Contains the number of VFs currently enabled on
this device as reported by the NumVFs register in
the SR-IOV extended capability.
Writing zero to this file disables all VFs.
Writing a positive number to this file enables that
number of VFs.
These files are readable for all SR-IOV PF devices. Writes to the
sriov_numvfs file are effective only if a driver that supports the
sriov_configure() method is attached.
Signed-off-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Should make pci_create_sysfs_dev_files() simpler. Also fix possible
memleak in remove path.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Need type filled in device structure so it can be used for visible
attribute control in sysfs for pci_dev.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
In https://bugzilla.kernel.org/show_bug.cgi?id=48981
Peter reported that /proc/bus/pci/??/??.? does not work for 3.6.
This is because the device configuration space registers are
not accessible if the corresponding parent bridge is suspended or
the device is put into D3cold state.
This is the same as /sys/bus/pci/devices/0000:??:??.?/config access
issue. So the function used to solve sysfs issue is used to solve
this issue.
This patch moves pci_config_pm_runtime_get()/_put() from pci/pci-sysfs.c
to pci/pci.c and makes them extern so they can be used by both the
sysfs and proc paths.
[bhelgaas: changelog, references, reporters]
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=48981
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=49031
Reported-by: Forrest Loomis <cybercyst@gmail.com>
Reported-by: Peter <lekensteyn@gmail.com>
Reported-by: Micael Dias <kam1kaz3@gmail.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CC: stable@vger.kernel.org # v3.6+
This patch fixes the following bug:
http://marc.info/?l=linux-pci&m=134338059022620&w=2
Where lspci does not work properly if a device and the corresponding
parent bridge (such as PCIe port) is suspended. This is because the
device configuration space registers will be not accessible if the
corresponding parent bridge is suspended or the device is put into
D3cold state.
To solve the issue, the bridge/PCIe port connected to the device is
put into active state before read/write configuration space registers.
If the device is in D3cold state, it will be put into active state
too.
To avoid resume/suspend PCIe port for each configuration register
read/write, a small delay is added before the PCIe port to go
suspended.
Reported-by: Bjorn Mork <bjorn@mork.no>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>
* topic/huang-d3cold-v7:
PCI/PM: add PCIe runtime D3cold support
PCI: do not call pci_set_power_state with PCI_D3cold
PCI/PM: add runtime PM support to PCIe port
ACPI/PM: specify lowest allowed state for device sleep state
This patch adds runtime D3cold support and corresponding ACPI platform
support. This patch only enables runtime D3cold support; it does not
enable D3cold support during system suspend/hibernate.
D3cold is the deepest power saving state for a PCIe device, where its main
power is removed. While it is in D3cold, you can't access the device at
all, not even its configuration space (which is still accessible in D3hot).
Therefore the PCI PM registers can not be used to transition into/out of
the D3cold state; that must be done by platform logic such as ACPI _PR3.
To support wakeup from D3cold, a system may provide auxiliary power, which
allows a device to request wakeup using a Beacon or the sideband WAKE#
signal. WAKE# is usually connected to platform logic such as ACPI GPE.
This is quite different from other power saving states, where devices
request wakeup via a PME message on the PCIe link.
Some devices, such as those in plug-in slots, have no direct platform
logic. For example, there is usually no ACPI _PR3 for them. D3cold
support for these devices can be done via the PCIe Downstream Port leading
to the device. When the PCIe port is powered on/off, the device is powered
on/off too. Wakeup events from the device will be notified to the
corresponding PCIe port.
For more information about PCIe D3cold and corresponding ACPI support,
please refer to:
- PCI Express Base Specification Revision 2.0
- Advanced Configuration and Power Interface Specification Revision 5.0
[bhelgaas: changelog]
Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>
Originally-by: Zheng Yan <zheng.z.yan@intel.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The default VGA device is a somewhat fluid concept on platforms with
multiple GPUs. Add support for setting it so switching code can update
things appropriately, and make sure that the sysfs code returns the right
device if it's changed.
v2: Updated to fix builds when __ARCH_HAS_VGA_DEFAULT_DEVICE is false.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Acked-by: benh@kernel.crashing.org
Cc: airlied@redhat.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
The old pci_remove_bus_device actually did stop and remove.
Make the name reflect that to reduce confusion.
This patch is done by sed scripts and changes back some incorrect
__pci_remove_bus_device changes.
Suggested-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Current rescan will not touch bridge MMIO and IO.
Try to reuse pci_assign_unassigned_bridge_resources(bridge) to update bridge
resources, if child devices need more resources.
Only do that for bridges whose children are all removed already; i.e. don't
release resources that could already be in use by drivers on child devices.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'for-linus' of git://selinuxproject.org/~jmorris/linux-security:
capabilities: remove __cap_full_set definition
security: remove the security_netlink_recv hook as it is equivalent to capable()
ptrace: do not audit capability check when outputing /proc/pid/stat
capabilities: remove task_ns_* functions
capabitlies: ns_capable can use the cap helpers rather than lsm call
capabilities: style only - move capable below ns_capable
capabilites: introduce new has_ns_capabilities_noaudit
capabilities: call has_ns_capability from has_capability
capabilities: remove all _real_ interfaces
capabilities: introduce security_capable_noaudit
capabilities: reverse arguments to security_capable
capabilities: remove the task from capable LSM hook entirely
selinux: sparse fix: fix several warnings in the security server cod
selinux: sparse fix: fix warnings in netlink code
selinux: sparse fix: eliminate warnings for selinuxfs
selinux: sparse fix: declare selinux_disable() in security.h
selinux: sparse fix: move selinux_complete_init
selinux: sparse fix: make selinux_secmark_refcount static
SELinux: Fix RCU deref check warning in sel_netport_insert()
Manually fix up a semantic mis-merge wrt security_netlink_recv():
- the interface was removed in commit fd77846152 ("security: remove
the security_netlink_recv hook as it is equivalent to capable()")
- a new user of it appeared in commit a38f7907b9 ("crypto: Add
userspace configuration API")
causing no automatic merge conflict, but Eric Paris pointed out the
issue.
security_capable takes ns, cred, cap. But the LSM capable() hook takes
cred, ns, cap. The capability helper functions also take cred, ns, cap.
Rather than flip argument order just to flip it back, leave them alone.
Heck, this should be a little faster since argument will be in the right
place!
Signed-off-by: Eric Paris <eparis@redhat.com>
They were implicitly getting it from device.h --> module.h but
we want to clean that up. So add the minimal header for these
macros.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Requested by Greg KH to fix a race condition in the creating of PCI bus
cpuaffinity files.
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
After remove the device from /sys, we have to rescan all or
find out the bridge and access /sys../device/rescan there.
this patch add /sys/.../pci_bus/.../rescan. So user can rescan more easy.
that is more clean and easy to understand.
like after remove 0000:c4:00.0, you can rescan 0000:c4 directly.
-v2: According to Jesse, use function instead of exposing attr, so could hide
#ifdef in header file.
also add code to remove rescan file in remove path.
-v3: GregKH pointed out that we should use dev_attrs to avoid racing.
So add pcibus_attrs and make it to be member of pcibus_attrs.
-v4: Change name to pcibus_dev_attrs according to GregKH
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
- Introduce ns_capable to test for a capability in a non-default
user namespace.
- Teach cap_capable to handle capabilities in a non-default
user namespace.
The motivation is to get to the unprivileged creation of new
namespaces. It looks like this gets us 90% of the way there, with
only potential uid confusion issues left.
I still need to handle getting all caps after creation but otherwise I
think I have a good starter patch that achieves all of your goals.
Changelog:
11/05/2010: [serge] add apparmor
12/14/2010: [serge] fix capabilities to created user namespaces
Without this, if user serge creates a user_ns, he won't have
capabilities to the user_ns he created. THis is because we
were first checking whether his effective caps had the caps
he needed and returning -EPERM if not, and THEN checking whether
he was the creator. Reverse those checks.
12/16/2010: [serge] security_real_capable needs ns argument in !security case
01/11/2011: [serge] add task_ns_capable helper
01/11/2011: [serge] add nsown_capable() helper per Bastian Blank suggestion
02/16/2011: [serge] fix a logic bug: the root user is always creator of
init_user_ns, but should not always have capabilities to
it! Fix the check in cap_capable().
02/21/2011: Add the required user_ns parameter to security_capable,
fixing a compile failure.
02/23/2011: Convert some macros to functions as per akpm comments. Some
couldn't be converted because we can't easily forward-declare
them (they are inline if !SECURITY, extern if SECURITY). Add
a current_user_ns function so we can use it in capability.h
without #including cred.h. Move all forward declarations
together to the top of the #ifdef __KERNEL__ section, and use
kernel-doc format.
02/23/2011: Per dhowells, clean up comment in cap_capable().
02/23/2011: Per akpm, remove unreachable 'return -EPERM' in cap_capable.
(Original written and signed off by Eric; latest, modified version
acked by him)
[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: export current_user_ns() for ecryptfs]
[serge.hallyn@canonical.com: remove unneeded extra argument in selinux's task_has_capability]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: label: remove #include of ACPI header to avoid warnings
PCI: label: Fix compilation error when CONFIG_ACPI is unset
PCI: pre-allocate additional resources to devices only after successful allocation of essential resources.
PCI: introduce reset_resource()
PCI: data structure agnostic free list function
PCI: refactor io size calculation code
PCI: do not create quirk I/O regions below PCIBIOS_MIN_IO for ICH
PCI hotplug: acpiphp: set current_state to D0 in register_slot
PCI: Export ACPI _DSM provided firmware instance number and string name to sysfs
PCI: add more checking to ICH region quirks
PCI: aer-inject: Override PCIe AER Mask Registers
PCI: fix tlan build when CONFIG_PCI is not enabled
PCI: remove quirk for pre-production systems
PCI: Avoid potential NULL pointer dereference in pci_scan_bridge
PCI/lpc: irq and pci_ids patch for Intel DH89xxCC DeviceIDs
PCI: sysfs: Fix failure path for addition of "vpd" attribute
This reintroduces commit 47970b1b which was subsequently reverted
as f00eaeea. The original change was broken and caused X startup
failures and generally made privileged processes incapable of reading
device dependent config space. The normal capable() interface returns
true on success, but the LSM interface returns 0 on success. This thinko
is now fixed in this patch, and has been confirmed to work properly.
So, once again...Eric Paris noted that commit de139a3 ("pci: check caps
from sysfs file open to read device dependent config space") caused the
capability check to bypass security modules and potentially auditing.
Rectify this by calling security_capable() when checking the open file's
capabilities for config space reads.
Reported-by: Eric Paris <eparis@redhat.com>
Tested-by: Dave Young <hidave.darkstar@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Alex Riesen <raa.lkml@gmail.com>
Cc: Sedat Dilek <sedat.dilek@googlemail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: James Morris <jmorris@namei.org>
This reverts commit 47970b1b2a.
It turns out it breaks several distributions. Looks like the stricter
selinux checks fail due to selinux policies not being set to allow the
access - breaking X, but also lspci.
So while the change was clearly the RightThing(tm) to do in theory, in
practice we have backwards compatibility issues making it not work.
Reported-by: Dave Young <hidave.darkstar@gmail.com>
Acked-by: David Airlie <airlied@linux.ie>
Acked-by: Alex Riesen <raa.lkml@gmail.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Paris noted that commit de139a3 ("pci: check caps from sysfs file
open to read device dependent config space") caused the capability check
to bypass security modules and potentially auditing. Rectify this by
calling security_capable() when checking the open file's capabilities
for config space reads.
Reported-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: James Morris <jmorris@namei.org>
Commit 280c73d ("PCI: centralize the capabilities code in
pci-sysfs.c") changed the initialisation of the "rom" and "vpd"
attributes, and made the failure path for the "vpd" attribute
incorrect. We must free the new attribute structure (attr), but
instead we currently free dev->vpd->attr. That will normally be NULL,
resulting in a memory leak, but it might be a stale pointer, resulting
in a double-free.
Found by inspection; compile-tested only.
Cc: stable@kernel.org
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The PCI sysfs ROM interface requires an enabling write to access the ROM
image, but the default file mode is 0400. The original proposed patch
adding sysfs ROM support was a true read-only interface, with the
enabling bit coming in as a feature request. I suspect it was simply an
oversight that the file mode didn't get updated to match the API.
Acked-by: Chris Wright <chrisw@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
I just loaded 2.6.37-rc2 on my machines, and I noticed that X no longer starts.
Running an strace of the X server shows that it's doing this:
open("/sys/bus/pci/devices/0000:07:00.0/resource0", O_RDWR) = 10
mmap(NULL, 16777216, PROT_READ|PROT_WRITE, MAP_SHARED, 10, 0) = -1 EINVAL (Invalid argument)
This code seems to be asking for a shared read/write mapping of 16MB worth of
BAR0 starting at file offset 0, and letting the kernel assign a starting
address. Unfortunately, this -EINVAL causes X not to start. Looking into
dmesg, there's a complaint like so:
process "Xorg" tried to map 0x01000000 bytes at page 0x00000000 on 0000:07:00.0 BAR 0 (start 0x 96000000, size 0x 1000000)
...with the following code in pci_mmap_fits:
pci_start = (mmap_api == PCI_MMAP_SYSFS) ?
pci_resource_start(pdev, resno) >> PAGE_SHIFT : 0;
if (start >= pci_start && start < pci_start + size &&
start + nr <= pci_start + size)
It looks like the logic here is set up such that when the mmap call comes via
sysfs, the check in pci_mmap_fits wants vma->vm_pgoff to be between the
resource's start and end address, and the end of the vma to be no farther than
the end. However, the sysfs PCI resource files always start at offset zero,
which means that this test always fails for programs that mmap the sysfs files.
Given the comment in the original commit
3b519e4ea6, I _think_ the old procfs files
require that the file offset be equal to the resource's base address when
mmapping.
I think what we want here is for pci_start to be 0 when mmap_api ==
PCI_MMAP_PROCFS. The following patch makes that change, after which the Matrox
and Mach64 X drivers work again.
Acked-by: Martin Wilck <martin.wilck@ts.fujitsu.com>
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cast pci_resource_start() and pci_resource_len() to u64 for printk.
drivers/pci/pci-sysfs.c:753: warning: format '%16Lx' expects type 'long long unsigned int', but argument 9 has type 'resource_size_t'
drivers/pci/pci-sysfs.c:753: warning: format '%16Lx' expects type 'long long unsigned int', but argument 10 has type 'resource_size_t'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The checks for valid mmaps of PCI resources made through /proc/bus/pci files
that were introduced in 9eff02e204 have several
problems:
1. mmap() calls on /proc/bus/pci files are made with real file offsets > 0,
whereas under /sys/bus/pci/devices, the start of the resource corresponds
to offset 0. This may lead to false negatives in pci_mmap_fits(), which
implicitly assumes the /sys/bus/pci/devices layout.
2. The loop in proc_bus_pci_mmap doesn't skip empty resouces. This leads
to false positives, because pci_mmap_fits() doesn't treat empty resources
correctly (the calculated size is 1 << (8*sizeof(resource_size_t)-PAGE_SHIFT)
in this case!).
3. If a user maps resources with BAR > 0, pci_mmap_fits will emit bogus
WARNINGS for the first resources that don't fit until the correct one is found.
On many controllers the first 2-4 BARs are used, and the others are empty.
In this case, an mmap attempt will first fail on the non-empty BARs
(including the "right" BAR because of 1.) and emit bogus WARNINGS because
of 3., and finally succeed on the first empty BAR because of 2.
This is certainly not the intended behaviour.
This patch addresses all 3 issues.
Updated with an enum type for the additional parameter for pci_mmap_fits().
Cc: stable@kernel.org
Signed-off-by: Martin Wilck <martin.wilck@ts.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>